From tobias.hartmann at oracle.com Tue May 2 08:42:08 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 2 May 2017 10:42:08 +0200 Subject: [9 or 10] RFR(S): 8179070: nashorn+octane's box2d causes c2 to crash with "Bad graph detected in compute_lca_of_uses" In-Reply-To: References: Message-ID: Hi Roland, On 25.04.2017 10:38, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/8179070/webrev.00/ The fix looks good to me, thanks for the detailed explanation! > The fix seems safe enough that I would say this would better be fixed in 9. Yes, I agree. I targeted this for JDK 9, please request approval according to http://openjdk.java.net/projects/jdk9/fix-request-process Thanks, Tobias From stefan.anzinger at oracle.com Tue May 2 15:24:25 2017 From: stefan.anzinger at oracle.com (Stefan Anzinger) Date: Tue, 2 May 2017 15:24:25 +0000 Subject: JDK10/RFR(L): 8172231: SPARC ISA/CPU feature detection is broken/insufficient (on Solaris). In-Reply-To: <6801aebd-eb60-6711-2859-0096818007eb@oracle.com> References: <6801aebd-eb60-6711-2859-0096818007eb@oracle.com> Message-ID: <28f59894-01f4-9008-9de9-00a7f2c43c8b@oracle.com> Hi Patric, I had a look and tested this change with jvmci/graal on Solaris. It compiles and runs without any error. Looks good to me. Stefan On 04/28/2017 01:48 PM, Patric Hedlin wrote: > Dear all, > > I would like to ask for help to review the following change/update: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8172231 > > Webrev: http://cr.openjdk.java.net/~neliasso/8172231/ > > > > 8172231: SPARC ISA/CPU feature detection is broken/insufficient (on > Solaris). > > Updating SPARC feature/capability detection (incorporating changes > from Martin Walsh). > More complete set of features as provided by 'getisax(2)' interface, > propagated via JVMCI. > More robust hardware probing for additional features (up to Core S4). > Removing support for old, pre Niagara, hardware. > Removing support for old, pre 11.1, Solaris. 
> > Changed behaviour: > Changing SPARC setup for AllocatePrefetchLines and > AllocateInstancePrefetchLines > such that they will (still) be doubled when cache-line size is small > (32 bytes), > but more moderately increased on new/contemporary hardware (inc >= 50%). > Changing to default instruction fetch alignment based on derived > caps. instead > of relying on default/configuration values. > > The above changes also subsumes: > 8035146: assert(is_T_family(features) == is_niagara(features), > "Niagara should be T series") is incorrect > 8054979: Remove unnecessary defines in SPARC's > VM_Version::platform_features > > > Rationale: > > Current hardware detection on Solaris/SPARC is not up to date with > the "latest" (here, > meaning commercially available server solutions, i.e. T7/M7). To > facilitate improved > use of the new hardware features provided (by Core S3&S4) these > capabilities need to > be recognised by the JVM. > > NOTE: This update is limited to Core S3&S4, i.e. not including Core > S5. Proper Core S5 > support will be added when regular testing and benchmarking > resources are available, > i.e. regular testing need to include M8 hardware. > > > Caveat: > > This update will introduce some redundancies into the code base, > features and definitions > currently not used, as well as a (small) number of FIXMEs, addressed > by subsequent bug or > feature updates/patches. Fujitsu HW is treated very conservatively. > > > Testing: > > Mostly tested on JDK9 (RBT/hotspot/comp). Only local testing on > JDK10 (jtreg/hotspot). > > > Benchmarking: > > Benchmark reports from a limited set of runs can be found at: > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.jbb05 > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.jvm08 > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.octane.plus > > > (Limited availability of M7 hardware prevents complete suites/runs.) 
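The "Changed behaviour" item on AllocatePrefetchLines above can be read as a small piece of arithmetic. The following is only a sketch of that reading, not the code from the webrev: the helper name `adjusted_prefetch_lines`, the 32-byte threshold test and the round-up rule for the 50% increment are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical helper (not taken from the webrev): derive the adjusted number
// of prefetch lines from the default value and the cache-line size in bytes.
inline int adjusted_prefetch_lines(int default_lines, int cache_line_size) {
  // Assumption: "small" means a 32-byte (or smaller) cache line, in which
  // case the value is doubled, as described in the mail.
  // Otherwise add at least 50% of the default, rounding the increment up.
  const int increment = (cache_line_size <= 32) ? default_lines
                                                : (default_lines + 1) / 2;
  return default_lines + increment;
}
```

With a default of 3 lines, this sketch gives 6 lines for a 32-byte cache line and 5 lines for a 64-byte cache line, so the increase on newer hardware stays at or above 50%.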
> > > Best regards, > Patric > From vladimir.kozlov at oracle.com Tue May 2 17:20:15 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 2 May 2017 10:20:15 -0700 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> Message-ID: Looks good to me. I will submit RBT testing, lets see how it goes. Thanks, Vladimir On 4/28/17 1:46 AM, Roland Westrelin wrote: > >> In addition to Andrew's test i have a simple JMH microbenchmark i can >> send you, which i used to measure performance and identify issues with >> code generation, misalignment, and unrolled loops for byte buffer >> access (off/on heap) and VarHandle views. specjvm2008 is likely too >> coarse grained. > > I found some regressions with Paul's tests but I can recover lost > performance using profiling data and some tweaks to the compiler. I'll > take care of all that as part of a follow up change. > > Here is a new webrev: > > http://cr.openjdk.java.net/~roland/8176506/webrev.03/ > > I found I introduced a bug in LibraryCallKit::make_unsafe_address(): in > case of a raw memory access classify_unsafe_addr() changes base but we > don't want that change to propagate to > LibraryCallKit::inline_unsafe_access() otherwise it won't see a null > base for a raw memory access. > > If C2 compiles a method that makes incorrect use of Unsafe (by trying to > load a field from a null object for instance), the C2 HaltNode can be > executed. Its current implementation on several platforms result in a > call to os::breakpoint() which is an empty method. So execution would > happily return to the compiled method and continue and fail in some > mysterious way.
That's why in my previous webrevs, I had implemented the > HaltNode on x86 and aarch64 as a call to a method that call > ShouldNotReachHere(). This needs to be taken care of on all > platforms. The webrev now includes an implementation on arm64 (that I > tested), arm32 and sparc (that I can't build or test). I would need > someone to verify them. I suppose PPC would need something similar as > well but I have limited PPC skills so I would need code for that > too. The easy way to test the halt implementation is to modify the > TestMaybeNullUnsafeAccess test from the patch and add a: > > test2(null); > > call after the loop of the main method. This should result in the Halt > node being executed. > > Roland. > From vladimir.kozlov at oracle.com Tue May 2 17:43:01 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 2 May 2017 10:43:01 -0700 Subject: [10] RFR(XS): 8169697: aarch64: vectorized MLA instruction not generated for some test cases In-Reply-To: References: <15b3b2dc-481f-98fd-c7db-f72f58d970e1@oracle.com> Message-ID: <78d865ac-c85f-988d-f63e-7f0f65ff43f9@oracle.com> Changes look fine to me. I will run RBT testing but it only covers platforms supported by Oracle. You have to test it on other platforms. thanks, Vladimir On 4/28/17 3:58 AM, Roland Westrelin wrote: > >> The patch has been updated in >> http://cr.openjdk.java.net/~njian/8169697/webrev.01/. > > That patch looks good to me. Can we have a sponsor for it (in jdk 10)? > > Roland. 
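The concern Roland raises about the Halt node can be illustrated outside of HotSpot. The sketch below is a toy model and not the implementation from the webrev: `empty_breakpoint`, `halt_should_not_reach_here` and `load_field_or_halt` are hypothetical names, and the non-returning halt throws an exception only so that the behaviour can be observed in a test (HotSpot itself would report a fatal error via ShouldNotReachHere() instead).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a halt implemented as an empty breakpoint hook:
// it simply returns, so the caller keeps running.
inline void empty_breakpoint() {}

// Hypothetical stand-in for a halt that never returns to its caller.
// Assumption: modelled with an exception purely for illustration.
[[noreturn]] inline void halt_should_not_reach_here() {
  throw std::logic_error("ShouldNotReachHere()");
}

// Toy version of compiled code that "knows" base is never null and places a
// halt on the path it believes to be unreachable.
template <typename HaltFn>
int load_field_or_halt(const int* base, HaltFn halt) {
  if (base == nullptr) {
    halt();
    // Only reached when the halt returned: execution continues with a
    // meaningless value, which is the "mysterious" failure mode.
    return -1;
  }
  return *base;
}
```

With `empty_breakpoint` a null base silently yields the bogus value, while with `halt_should_not_reach_here` control never comes back to the caller, which is the property the new Halt implementation is after.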
> From vladimir.kozlov at oracle.com Tue May 2 17:59:17 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 2 May 2017 10:59:17 -0700 Subject: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A63CDA33F@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A63CD479B@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63CDA33F@ORSMSX106.amr.corp.intel.com> Message-ID: <2fbdf68a-9ceb-3183-a12a-bbef83b77c44@oracle.com> Looks good. One nit - move next code into separate function instead of duplicating 3 times: + if( is_intel() ) { // Intel cpus specific settings + if ((cpu_family() == 0x06) && + ((extended_cpu_model() == 0x57) || // Xeon Phi 3200/5200/7200 + (extended_cpu_model() == 0x85))) { // Future Xeon Phi + _features &= ~CPU_VZEROUPPER; + } + } I will run testing for current changes and let you know results. Thanks, Vladimir On 4/20/17 1:54 PM, Deshpande, Vivek R wrote: > HI Vladimir > > We added almost all the vzeroupper instructions by analyzing SPECjbb2015. > > We need vzeroupper after we execute avx2/avx-512 and before transition to SSE to avoid penalty of saving and restoring higher bits in YMM/ZMM registers. Since JVM is SSE compiled, mixing of AVX and SSE code happens in C1 and interpreter with usage of intrinsics, so I have added vzerouppers at these transitions in the stubs. > > I think max_vector_size() > 16 would be with auto vectorization, but we need to have vzeroupper with stubs and intrinsics. > > I have made all the changes which you suggested. Please take a look at the patch and let me know your thoughts. > > The updated Webrev is here: > http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.01/ > > Thank you. 
> Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Friday, April 14, 2017 12:48 PM > To: Deshpande, Vivek R; hotspot compiler > Cc: Viswanathan, Sandhya > Subject: Re: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 > > Hi Vivek, > > Did you pinpoint particular change which helps SPECjbb2015? > > From what I remember it is mostly during call to runtime or JNI calls. > > // AVX instruction which is used to clear upper 128 bits of YMM registers and > // to avoid transaction penalty between AVX and SSE states. There is no > // penalty if legacy SSE instructions are encoded using VEX prefix because > // they always clear upper 128 bits. It should be used before calling > // runtime code and native libraries. > void vzeroupper(); > > So why you added vzeroupper() into arraycopy and other stubs? If AVX is supported all instructions should be encoded with VEX prefix. Or the statement in the comment is incorrect? > > About UseVzeroupper flag. We should avoid adding new product flags if possible. > IT only make sense if you want to do experiments to check performance with and without it. But we already know that it is needed on most CPUs. > > I would suggest to add VM_Version::supports_vzeroupper() { return (_features & CPU_VZEROUPPER) != 0; } Set CPU_VZEROUPPE if AVX is supported and clear it for Knights CPU. The add check inside assembler instruction: > > void Assembler::vzeroupper() { > - assert(VM_Version::supports_avx(), ""); > + if (VM_Version::supports_vzeroupper()) { > > It will allow to avoid supports_avx() checks on each vzeroupper() call. > > You missed check in MachEpilogNode::emit() in x86_64.ad > > Note, I used (C->max_vector_size() > 16) checks in .ad files because of the same reasons as above: > > // There is no penalty if legacy SSE instructions are encoded using VEX prefix because > // they always clear upper 128 bits. 
> > I thought if vectors are small and we use VEX prefix then upper bits will be 0 anyway so you don't need vzeroupper(). > > Thanks, > Vladimir > > On 4/14/17 11:17 AM, Deshpande, Vivek R wrote: >> Hi All >> >> >> >> This fix minimizes the AVX to SSE and SSE to AVX transition penalty >> through generation of vzeroupper instruction. With this patch we see >> zero transitions with penalty per SPECjbb2015 jOPS on BDW and a significant reduction on SKX CPU event vector width mismatch from 65 to 0.01 per SPECjbb2015 jOPS. We have also implemented an enhancement to disable vzeroupper generation for Knights family where the instruction has high penalty and is not recommended. The option UseVzeroupper is used to control generation of vzeroupper instruction and gets set to false on the Knights family. >> We observed ~3% gain on SPECJvm2008 composite result on Skylake. >> >> Webrev: >> >> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.00/ >> >> I have also updated the JBS entry. >> >> https://bugs.openjdk.java.net/browse/JDK-8178811 >> >> Would you please review and sponsor it. 
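The two review suggestions in this thread (factor the Knights check into one function, and guard the instruction with a feature bit instead of a product flag) can be sketched together as follows. This is a self-contained illustration and not HotSpot code: `CpuInfo`, `is_knights_family`, `compute_features`, the bit position chosen for `CPU_VZEROUPPER` and the counting `Assembler` are assumptions made for the example. Only the family and model values (0x06, 0x57, 0x85) and the `(_features & CPU_VZEROUPPER) != 0` test come from the mails.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the CPUID data kept by VM_Version.
struct CpuInfo {
  bool     intel;           // vendor is Intel
  uint32_t family;          // CPU family
  uint32_t extended_model;  // extended CPU model
  bool     avx;             // AVX is supported
};

// Illustrative bit position only; the real constant lives in HotSpot.
constexpr uint64_t CPU_VZEROUPPER = 1ULL << 0;

// Single place for the check that the review asks not to duplicate 3 times.
inline bool is_knights_family(const CpuInfo& cpu) {
  return cpu.intel && cpu.family == 0x06 &&
         (cpu.extended_model == 0x57 ||   // Xeon Phi 3200/5200/7200
          cpu.extended_model == 0x85);    // future Xeon Phi
}

// Set the bit when AVX is supported, then clear it again on Knights CPUs.
inline uint64_t compute_features(const CpuInfo& cpu) {
  uint64_t features = 0;
  if (cpu.avx) features |= CPU_VZEROUPPER;
  if (is_knights_family(cpu)) features &= ~CPU_VZEROUPPER;
  return features;
}

inline bool supports_vzeroupper(uint64_t features) {
  return (features & CPU_VZEROUPPER) != 0;
}

// Toy assembler: counts emitted instructions instead of writing machine code.
struct Assembler {
  uint64_t features;
  int emitted;
  explicit Assembler(uint64_t f) : features(f), emitted(0) {}
  // The guard lives inside the instruction, so call sites need no checks.
  void vzeroupper() {
    if (supports_vzeroupper(features)) ++emitted;
  }
};
```

Keeping the check inside `vzeroupper()` means stubs and intrinsics can call it unconditionally, and the Knights exception is expressed once where the feature bits are computed.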
>> >> >> >> Regards, >> >> Vivek >> >> >> From vladimir.kozlov at oracle.com Tue May 2 18:19:35 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 2 May 2017 11:19:35 -0700 Subject: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 In-Reply-To: <2fbdf68a-9ceb-3183-a12a-bbef83b77c44@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A63CD479B@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63CDA33F@ORSMSX106.amr.corp.intel.com> <2fbdf68a-9ceb-3183-a12a-bbef83b77c44@oracle.com> Message-ID: <31e0a9e5-c63d-b859-9bd4-9f66e093aed6@oracle.com> The build failed: # Internal Error (/scratch/opt/jprt/T/P1/175733.vkozlov/s/hotspot/src/cpu/x86/vm/vm_version_x86.hpp:640), pid=7347, tid=7348 # assert(_cpuid_info.std_cpuid1_eax.bits.family != 0) failed: VM_Version not initialized V [libjvm.so+0x93a6a0] report_vm_error(char const*, int, char const*, char const*, ...)+0x60 V [libjvm.so+0x14cb1f0] VM_Version_StubGenerator::generate_get_cpu_info()+0x2f20 V [libjvm.so+0x14c7f53] VM_Version::initialize()+0x103 I think it fails at check is_intel() in set_avx_cpuFeatures() or set_evex_cpuFeatures(). Please, build fastdebug JVM for testing. Vladimir On 5/2/17 10:59 AM, Vladimir Kozlov wrote: > Looks good. One nit - move next code into separate function instead of duplicating 3 times: > > + if( is_intel() ) { // Intel cpus specific settings > + if ((cpu_family() == 0x06) && > + ((extended_cpu_model() == 0x57) || // Xeon Phi 3200/5200/7200 > + (extended_cpu_model() == 0x85))) { // Future Xeon Phi > + _features &= ~CPU_VZEROUPPER; > + } > + } > > I will run testing for current changes and let you know results. > > Thanks, > Vladimir > > On 4/20/17 1:54 PM, Deshpande, Vivek R wrote: >> HI Vladimir >> >> We added almost all the vzeroupper instructions by analyzing SPECjbb2015. >> >> We need vzeroupper after we execute avx2/avx-512 and before transition to SSE to avoid penalty of saving and restoring higher bits in YMM/ZMM registers. 
Since JVM is SSE compiled, mixing of AVX and >> SSE code happens in C1 and interpreter with usage of intrinsics, so I have added vzerouppers at these transitions in the stubs. >> >> I think max_vector_size() > 16 would be with auto vectorization, but we need to have vzeroupper with stubs and intrinsics. >> >> I have made all the changes which you suggested. Please take a look at the patch and let me know your thoughts. >> >> The updated Webrev is here: >> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.01/ >> >> Thank you. >> Regards, >> Vivek >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Friday, April 14, 2017 12:48 PM >> To: Deshpande, Vivek R; hotspot compiler >> Cc: Viswanathan, Sandhya >> Subject: Re: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 >> >> Hi Vivek, >> >> Did you pinpoint particular change which helps SPECjbb2015? >> >> From what I remember it is mostly during call to runtime or JNI calls. >> >> // AVX instruction which is used to clear upper 128 bits of YMM registers and >> // to avoid transaction penalty between AVX and SSE states. There is no >> // penalty if legacy SSE instructions are encoded using VEX prefix because >> // they always clear upper 128 bits. It should be used before calling >> // runtime code and native libraries. >> void vzeroupper(); >> >> So why you added vzeroupper() into arraycopy and other stubs? If AVX is supported all instructions should be encoded with VEX prefix. Or the statement in the comment is incorrect? >> >> About UseVzeroupper flag. We should avoid adding new product flags if possible. >> IT only make sense if you want to do experiments to check performance with and without it. But we already know that it is needed on most CPUs. >> >> I would suggest to add VM_Version::supports_vzeroupper() { return (_features & CPU_VZEROUPPER) != 0; } Set CPU_VZEROUPPE if AVX is supported and clear it for Knights CPU. 
The add check inside >> assembler instruction: >> >> void Assembler::vzeroupper() { >> - assert(VM_Version::supports_avx(), ""); >> + if (VM_Version::supports_vzeroupper()) { >> >> It will allow to avoid supports_avx() checks on each vzeroupper() call. >> >> You missed check in MachEpilogNode::emit() in x86_64.ad >> >> Note, I used (C->max_vector_size() > 16) checks in .ad files because of the same reasons as above: >> >> // There is no penalty if legacy SSE instructions are encoded using VEX prefix because >> // they always clear upper 128 bits. >> >> I thought if vectors are small and we use VEX prefix then upper bits will be 0 anyway so you don't need vzeroupper(). >> >> Thanks, >> Vladimir >> >> On 4/14/17 11:17 AM, Deshpande, Vivek R wrote: >>> Hi All >>> >>> >>> >>> This fix minimizes the AVX to SSE and SSE to AVX transition penalty >>> through generation of vzeroupper instruction. With this patch we see >>> zero transitions with penalty per SPECjbb2015 jOPS on BDW and a significant reduction on SKX CPU event vector width mismatch from 65 to 0.01 per SPECjbb2015 jOPS. We have also implemented an >>> enhancement to disable vzeroupper generation for Knights family where the instruction has high penalty and is not recommended. The option UseVzeroupper is used to control generation of vzeroupper >>> instruction and gets set to false on the Knights family. >>> We observed ~3% gain on SPECJvm2008 composite result on Skylake. >>> >>> Webrev: >>> >>> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.00/ >>> >>> I have also updated the JBS entry. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8178811 >>> >>> Would you please review and sponsor it. 
>>> >>> >>> >>> Regards, >>> >>> Vivek >>> >>> >>> From vivek.r.deshpande at intel.com Tue May 2 19:26:55 2017 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Tue, 2 May 2017 19:26:55 +0000 Subject: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 In-Reply-To: <31e0a9e5-c63d-b859-9bd4-9f66e093aed6@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A63CD479B@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63CDA33F@ORSMSX106.amr.corp.intel.com> <2fbdf68a-9ceb-3183-a12a-bbef83b77c44@oracle.com> <31e0a9e5-c63d-b859-9bd4-9f66e093aed6@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A63CE6B5B@ORSMSX106.amr.corp.intel.com> Hi Vladimir I will check and get back to you soon. Thanks. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Tuesday, May 2, 2017 11:20 AM To: Deshpande, Vivek R; hotspot compiler Cc: Viswanathan, Sandhya Subject: Re: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 The build failed: # Internal Error (/scratch/opt/jprt/T/P1/175733.vkozlov/s/hotspot/src/cpu/x86/vm/vm_version_x86.hpp:640), pid=7347, tid=7348 # assert(_cpuid_info.std_cpuid1_eax.bits.family != 0) failed: VM_Version not initialized V [libjvm.so+0x93a6a0] report_vm_error(char const*, int, char const*, char const*, ...)+0x60 V [libjvm.so+0x14cb1f0] VM_Version_StubGenerator::generate_get_cpu_info()+0x2f20 V [libjvm.so+0x14c7f53] VM_Version::initialize()+0x103 I think it fails at check is_intel() in set_avx_cpuFeatures() or set_evex_cpuFeatures(). Please, build fastdebug JVM for testing. Vladimir On 5/2/17 10:59 AM, Vladimir Kozlov wrote: > Looks good. 
One nit - move next code into separate function instead of duplicating 3 times: > > + if( is_intel() ) { // Intel cpus specific settings > + if ((cpu_family() == 0x06) && > + ((extended_cpu_model() == 0x57) || // Xeon Phi 3200/5200/7200 > + (extended_cpu_model() == 0x85))) { // Future Xeon Phi > + _features &= ~CPU_VZEROUPPER; > + } > + } > > I will run testing for current changes and let you know results. > > Thanks, > Vladimir > > On 4/20/17 1:54 PM, Deshpande, Vivek R wrote: >> HI Vladimir >> >> We added almost all the vzeroupper instructions by analyzing SPECjbb2015. >> >> We need vzeroupper after we execute avx2/avx-512 and before >> transition to SSE to avoid penalty of saving and restoring higher bits in YMM/ZMM registers. Since JVM is SSE compiled, mixing of AVX and SSE code happens in C1 and interpreter with usage of intrinsics, so I have added vzerouppers at these transitions in the stubs. >> >> I think max_vector_size() > 16 would be with auto vectorization, but we need to have vzeroupper with stubs and intrinsics. >> >> I have made all the changes which you suggested. Please take a look at the patch and let me know your thoughts. >> >> The updated Webrev is here: >> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.01/ >> >> Thank you. >> Regards, >> Vivek >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Friday, April 14, 2017 12:48 PM >> To: Deshpande, Vivek R; hotspot compiler >> Cc: Viswanathan, Sandhya >> Subject: Re: RFR(S): 8178811: Minimize the AVX <-> SSE transition >> penalty on x86 >> >> Hi Vivek, >> >> Did you pinpoint particular change which helps SPECjbb2015? >> >> From what I remember it is mostly during call to runtime or JNI calls. >> >> // AVX instruction which is used to clear upper 128 bits of YMM registers and >> // to avoid transaction penalty between AVX and SSE states. 
There is no >> // penalty if legacy SSE instructions are encoded using VEX prefix because >> // they always clear upper 128 bits. It should be used before calling >> // runtime code and native libraries. >> void vzeroupper(); >> >> So why you added vzeroupper() into arraycopy and other stubs? If AVX is supported all instructions should be encoded with VEX prefix. Or the statement in the comment is incorrect? >> >> About UseVzeroupper flag. We should avoid adding new product flags if possible. >> IT only make sense if you want to do experiments to check performance with and without it. But we already know that it is needed on most CPUs. >> >> I would suggest to add VM_Version::supports_vzeroupper() { return >> (_features & CPU_VZEROUPPER) != 0; } Set CPU_VZEROUPPE if AVX is supported and clear it for Knights CPU. The add check inside assembler instruction: >> >> void Assembler::vzeroupper() { >> - assert(VM_Version::supports_avx(), ""); >> + if (VM_Version::supports_vzeroupper()) { >> >> It will allow to avoid supports_avx() checks on each vzeroupper() call. >> >> You missed check in MachEpilogNode::emit() in x86_64.ad >> >> Note, I used (C->max_vector_size() > 16) checks in .ad files because of the same reasons as above: >> >> // There is no penalty if legacy SSE instructions are encoded using VEX prefix because >> // they always clear upper 128 bits. >> >> I thought if vectors are small and we use VEX prefix then upper bits will be 0 anyway so you don't need vzeroupper(). >> >> Thanks, >> Vladimir >> >> On 4/14/17 11:17 AM, Deshpande, Vivek R wrote: >>> Hi All >>> >>> >>> >>> This fix minimizes the AVX to SSE and SSE to AVX transition penalty >>> through generation of vzeroupper instruction. With this patch we see >>> zero transitions with penalty per SPECjbb2015 jOPS on BDW and a >>> significant reduction on SKX CPU event vector width mismatch from 65 >>> to 0.01 per SPECjbb2015 jOPS. 
We have also implemented an enhancement to disable vzeroupper generation for Knights family where the instruction has high penalty and is not recommended. The option UseVzeroupper is used to control generation of vzeroupper instruction and gets set to false on the Knights family. >>> We observed ~3% gain on SPECJvm2008 composite result on Skylake. >>> >>> Webrev: >>> >>> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.00/ >>> >>> I have also updated the JBS entry. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8178811 >>> >>> Would you please review and sponsor it. >>> >>> >>> >>> Regards, >>> >>> Vivek >>> >>> >>> From rwestrel at redhat.com Wed May 3 07:58:39 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 03 May 2017 09:58:39 +0200 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> Message-ID: > Looks good to me. I will submit RBT testing, lets see how it goes. Thanks, Vladimir. Note that usual testing shouldn't cause the Halt to be executed so wouldn't test its new implementation. Roland. From rwestrel at redhat.com Wed May 3 08:05:38 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 03 May 2017 10:05:38 +0200 Subject: [10] RFR(XS): 8169697: aarch64: vectorized MLA instruction not generated for some test cases In-Reply-To: <78d865ac-c85f-988d-f63e-7f0f65ff43f9@oracle.com> References: <15b3b2dc-481f-98fd-c7db-f72f58d970e1@oracle.com> <78d865ac-c85f-988d-f63e-7f0f65ff43f9@oracle.com> Message-ID: > Changes look fine to me. > > I will run RBT testing but it only covers platforms supported by > Oracle. You have to test it on other platforms. Sure. Thanks, Vladimir! Roland. 
From rwestrel at redhat.com Wed May 3 08:07:44 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 03 May 2017 10:07:44 +0200 Subject: [9 or 10] RFR(S): 8179070: nashorn+octane's box2d causes c2 to crash with "Bad graph detected in compute_lca_of_uses" In-Reply-To: References: Message-ID: >> http://cr.openjdk.java.net/~roland/8179070/webrev.00/ > > The fix looks good to me, thanks for the detailed explanation! Thanks for the review, Tobias! >> The fix seems safe enough that I would say this would better be fixed in 9. > > Yes, I agree. I targeted this for JDK 9, please request approval according to > http://openjdk.java.net/projects/jdk9/fix-request-process Will do. Roland. From volker.simonis at gmail.com Wed May 3 16:30:46 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 3 May 2017 18:30:46 +0200 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> Message-ID: Hi Roland, Vladimir, I'm currently implementing the ppc64/s390x pieces for this change. Can you please give me a day or two before submitting it? Thanks, Volker On Tue, May 2, 2017 at 7:20 PM, Vladimir Kozlov wrote: > Looks good to me. I will submit RBT testing, lets see how it goes. > > Thanks, > Vladimir > > > On 4/28/17 1:46 AM, Roland Westrelin wrote: >> >> >>> In addition to Andrew's test i have a simple JMH microbenchmark i can >>> send you, which i used to measure performance and identify issues with >>> code generation, misalignment, and unrolled loops for byte buffer >>> access (off/on heap) and VarHandle views. specjvm2008 is likely too >>> coarse grained.
>> >> >> I found some regressions with Paul's tests but I can recover lost >> performance using profiling data and some tweaks to the compiler. I'll >> take care of all that as part of a follow up change. >> >> Here is a new webrev: >> >> http://cr.openjdk.java.net/~roland/8176506/webrev.03/ >> >> I found I introduced a bug in LibraryCallKit::make_unsafe_address(): in >> case of a raw memory access classify_unsafe_addr() changes base but we >> don't want that change to propagate to >> LibraryCallKit::inline_unsafe_access() otherwise it won't see a null >> base for a raw memory access. >> >> If C2 compiles a method that makes incorrect use of Unsafe (by trying to >> load a field from a null object for instance), the C2 HaltNode can be >> executed. Its current implementation on several platforms result in a >> call to os::breakpoint() which is an empty method. So execution would >> happily return to the compiled method and continue and fail in some >> mysterious way. That's why in my previous webrevs, I had implemented the >> HaltNode on x86 and aarch64 as a call to a method that call >> ShouldNotReachHere(). This needs to be taken care of on all >> platforms. The webrev now includes an implementation on arm64 (that I >> tested), arm32 and sparc (that I can't build or test). I would need >> someone to verify them. I suppose PPC would need something similar as >> well but I have limited PPC skills so I would need code for that >> too. The easy way to test the halt implementation is to modify the >> TestMaybeNullUnsafeAccess test from the patch and add a: >> >> test2(null); >> >> call after the loop of the main method. This should result in the Halt >> node being executed. >> >> Roland. 
>> > From vladimir.kozlov at oracle.com Wed May 3 19:05:33 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 3 May 2017 12:05:33 -0700 Subject: [10] RFR(XS): 8169697: aarch64: vectorized MLA instruction not generated for some test cases In-Reply-To: <78d865ac-c85f-988d-f63e-7f0f65ff43f9@oracle.com> References: <15b3b2dc-481f-98fd-c7db-f72f58d970e1@oracle.com> <78d865ac-c85f-988d-f63e-7f0f65ff43f9@oracle.com> Message-ID: <65c075cd-6837-984d-db82-2d884b68052b@oracle.com> Testing passed. I will sponsor it after latest jdk10/jdk10 integrated into jdk10/hs today. Vladimir On 5/2/17 10:43 AM, Vladimir Kozlov wrote: > Changes look fine to me. > > I will run RBT testing but it only covers platforms supported by Oracle. You have to test it on other platforms. > > thanks, > Vladimir > > On 4/28/17 3:58 AM, Roland Westrelin wrote: >> >>> The patch has been updated in >>> http://cr.openjdk.java.net/~njian/8169697/webrev.01/. >> >> That patch looks good to me. Can we have a sponsor for it (in jdk 10)? >> >> Roland. >> From vladimir.kozlov at oracle.com Wed May 3 19:09:10 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 3 May 2017 12:09:10 -0700 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> Message-ID: <6045dd5f-71f2-c892-79ec-0a68809a2fd2@oracle.com> Testing passed. On 5/3/17 9:30 AM, Volker Simonis wrote: > Hi Roland, Vladimir, > > I'm currently implementing the ppc64/s390x pieces for this change. > Can you please give me a day or two before submitting it? Okay. No problem. Thanks, Vladimir > > Thanks, > Volker > > > On Tue, May 2, 2017 at 7:20 PM, Vladimir Kozlov > wrote: >> Looks good to me. 
I will submit RBT testing, lets see how it goes. >> >> Thanks, >> Vladimir >> >> >> On 4/28/17 1:46 AM, Roland Westrelin wrote: >>> >>> >>>> In addition to Andrew's test i have a simple JMH microbenchmark i can >>>> send you, which i used to measure performance and identify issues with >>>> code generation, misalignment, and unrolled loops for byte buffer >>>> access (off/on heap) and VarHandle views. specjvm2008 is likely too >>>> coarse grained. >>> >>> >>> I found some regressions with Paul's tests but I can recover lost >>> performance using profiling data and some tweaks to the compiler. I'll >>> take care of all that as part of a follow up change. >>> >>> Here is a new webrev: >>> >>> http://cr.openjdk.java.net/~roland/8176506/webrev.03/ >>> >>> I found I introduced a bug in LibraryCallKit::make_unsafe_address(): in >>> case of a raw memory access classify_unsafe_addr() changes base but we >>> don't want that change to propagate to >>> LibraryCallKit::inline_unsafe_access() otherwise it won't see a null >>> base for a raw memory access. >>> >>> If C2 compiles a method that makes incorrect use of Unsafe (by trying to >>> load a field from a null object for instance), the C2 HaltNode can be >>> executed. Its current implementation on several platforms result in a >>> call to os::breakpoint() which is an empty method. So execution would >>> happily return to the compiled method and continue and fail in some >>> mysterious way. That's why in my previous webrevs, I had implemented the >>> HaltNode on x86 and aarch64 as a call to a method that call >>> ShouldNotReachHere(). This needs to be taken care of on all >>> platforms. The webrev now includes an implementation on arm64 (that I >>> tested), arm32 and sparc (that I can't build or test). I would need >>> someone to verify them. I suppose PPC would need something similar as >>> well but I have limited PPC skills so I would need code for that >>> too.
The easy way to test the halt implementation is to modify the >>> TestMaybeNullUnsafeAccess test from the patch and add a: >>> >>> test2(null); >>> >>> call after the loop of the main method. This should result in the Halt >>> node being executed. >>> >>> Roland. >>> >> From volker.simonis at gmail.com Wed May 3 19:29:34 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 3 May 2017 21:29:34 +0200 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> Message-ID: Hi Roland, first of all, I think you still have a small error in sparc.ad. I think the "size(8)" is not right because I see the following error at VM startup: # Internal Error (ad_sparc.cpp:15486), pid=11655, tid=73 # assert(VerifyOops || MachNode::size(ra_) <= 8) failed: bad fixed size Current CompileTask: C2: 2241 16 4 java.lang.StringLatin1::hashCode (42 bytes) Stack: [0xffffffff45200000,0xffffffff45300000], sp=0xffffffff452fadd0, free space=1003k Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x1c7ce4c] void VMError::report_and_die(int,const char*,const char*,void*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned long)+0x894 V [libjvm.so+0x1c7c4fc] void VMError::report_and_die(Thread*,const char*,int,const char*,const char*,void*)+0x6c V [libjvm.so+0xfed1dc] void report_vm_error(const char*,int,const char*,const char*,...)+0xc4 V [libjvm.so+0x96b118] unsigned ShouldNotReachHereNode::size(PhaseRegAlloc*)const+0xd8 V [libjvm.so+0x1923ac0] void Compile::shorten_branches(unsigned*,int&,int&,int&)+0x6c0 V [libjvm.so+0x19279d4] CodeBuffer*Compile::init_buffer(unsigned*)+0x3e4 V [libjvm.so+0x1922d08] void 
Compile::Output()+0x730 V [libjvm.so+0xf247d0] void Compile::Code_Gen()+0x610 V [libjvm.so+0xf1a624] Compile::Compile(ciEnv*,C2Compiler*,ciMethod*,int,bool,bool,bool,DirectiveSet*)+0x1894 ... Removing the "size()" instruction altogether, fixes the problem (but there's probably a better solution). I've also looked at your change and I think it works fine on ppc64/s390x because we've always implemented the HaltNode by issuing a trap instructions. So with your change (and your modified test) we will just crash in the generates C2 method which I think is fine: # SIGTRAP (0x5) at pc=0x00003fff47b12bd8, pid=19883, tid=19964 J 215 c2 TestMaybeNullUnsafeAccess.test2(Ljava/lang/Object;)I (34 bytes) @ 0x00003fff47b12bd8 [0x00003fff47b12b80+0x0000000000000058] j TestMaybeNullUnsafeAccess.main([Ljava/lang/String;)V+49 v ~StubRoutines::call_stub I'm not sure if your change to the implementation of the HaltNode (i.e. calling ShouldNotReachHere()) is really the best solution. With that change, a crash due to an Unsafe NULL access looks as follows on SPARC: # Internal Error (/usr/work/d046063/OpenJDK/jdk10-hs/hotspot/src/cpu/sparc/vm/macroAssembler_sparc.cpp:1378), pid=29415, tid=87 # Error: ShouldNotReachHere() V [libjvm.so+0x1c7cd0c] void VMError::report_and_die(int,const char*,const char*,void*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned long)+0x894 V [libjvm.so+0x1c7c3bc] void VMError::report_and_die(Thread*,const char*,int,const char*,const char*,void*)+0x6c V [libjvm.so+0xfed034] void report_vm_error(const char*,int,const char*,const char*,...)+0xc4 V [libjvm.so+0xfecf4c] void report_vm_error(const char*,int,const char*)+0x64 V [libjvm.so+0xfed3a0] void report_should_not_reach_here(const char*,int)+0x48 V [libjvm.so+0x1754c64] void halt()+0x34 J 216 c2 TestMaybeNullUnsafeAccess.test2(Ljava/lang/Object;)I (34 bytes) @ 0xffffffff53412884 [0xffffffff53412820+0x0000000000000064] j TestMaybeNullUnsafeAccess.main([Ljava/lang/String;)V+49 I think this may be quite 
confusing to users/support because you can not easily see that it is related to an illegal Unsafe access. It looks more like an error in macroAssembler_sparc.cpp:1378 (and ShouldNotReachHere() is usually used for errors in the VM and not to signal user errors). I obviously prefer the first crash, so maybe you can at least only change the platforms which called "breakpoint()" until now? Not sure if this is easily possible, but I think an ideal solution would be to really execute the actual, offending load instead of inserting a HaltNode. That would give you a nice hs_err file on all platforms where the registers would even contain the offending address (NULL) and the offset. I think this could simplify troubleshooting in the case of a crash. What do you think? Please find some other minor comments/suggestions below: src/share/vm/opto/graphKit.cpp +// add a check for null for which one branch can't be taken. It uses +// and Opaque4 node that will cause the check to be removed after loop - should probably read "// an Opaque4 node.." in the second line src/share/vm/opto/opaquenode.hpp +// know implicitly is always true of false but the compiler has no way +// to prove. If during optimizations, that check becomes true of - s/of/or/ test/compiler/unsafe/TestMaybeNullUnsafeAccess.java - wouldn't it be safer to run the test with -Xbatch and -XX:-UseOnStackReplacement for any case and to make it evident that the test relays on the fact that test1() and test2() should be compiled but not inlined ? - you should also update the copyright on most files you've touched. Regards, Volker On Wed, May 3, 2017 at 9:58 AM, Roland Westrelin wrote: > >> Looks good to me. I will submit RBT testing, lets see how it goes. > > Thanks, Vladimir. > > Note that usual testing shouldn't cause the Halt to be executed so > wouldn't test its new implementation. > > Roland. 
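
The difference between a halt that returns (an empty os::breakpoint()) and a halt that traps can be illustrated with a toy model. This is only a sketch under stated assumptions, not HotSpot code: compiled_load, halt_noop, halt_trap and HaltReached are hypothetical names, and a C++ exception stands in for a hardware trap or a ShouldNotReachHere() call.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Toy model only: an exception stands in for a trap that never returns.
struct HaltReached : std::runtime_error {
    HaltReached() : std::runtime_error("halt reached") {}
};

// Models a halt lowered to an empty breakpoint routine: nothing happens.
inline void halt_noop() {}

// Models a halt lowered to a trap: control never returns to the caller.
inline void halt_trap() { throw HaltReached(); }

// Hypothetical "compiled method": the null path was assumed dead by the
// compiler, so it ends in a halt instead of a real load.
inline int compiled_load(const int* base, const std::function<void()>& halt) {
    if (base == nullptr) {
        halt();
        return -1;  // reached only if the halt returns: a bogus value escapes
    }
    return *base;
}

// Reports whether execution continued past the halt on the null path.
inline bool falls_through(const std::function<void()>& halt) {
    try {
        compiled_load(nullptr, halt);
        return true;
    } catch (const HaltReached&) {
        return false;
    }
}
```

With halt_noop the null path silently returns a bogus value and execution goes on, which is the "fail in some mysterious way" scenario; with halt_trap the failure is reported at the point where the assumption was violated.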
From ningsheng.jian at linaro.org Thu May 4 02:23:08 2017
From: ningsheng.jian at linaro.org (Ningsheng Jian)
Date: Thu, 4 May 2017 10:23:08 +0800
Subject: [aarch64-port-dev ] [10] RFR(XS): 8169697: aarch64: vectorized MLA instruction not generated for some test cases
In-Reply-To: <65c075cd-6837-984d-db82-2d884b68052b@oracle.com>
References: <15b3b2dc-481f-98fd-c7db-f72f58d970e1@oracle.com> <78d865ac-c85f-988d-f63e-7f0f65ff43f9@oracle.com> <65c075cd-6837-984d-db82-2d884b68052b@oracle.com>
Message-ID:

Thanks, Vladimir and Roland!

This patch has been contributed by yang.zhang at linaro.org. I updated this information in the commit message.

Regards,
Ningsheng

On 4 May 2017 at 03:05, Vladimir Kozlov wrote:
> Testing passed. I will sponsor it after latest jdk10/jdk10 integrated into
> jdk10/hs today.
>
> Vladimir
>
>
> On 5/2/17 10:43 AM, Vladimir Kozlov wrote:
>>
>> Changes look fine to me.
>>
>> I will run RBT testing but it only covers platforms supported by Oracle.
>> You have to test it on other platforms.
>>
>> thanks,
>> Vladimir
>>
>> On 4/28/17 3:58 AM, Roland Westrelin wrote:
>>>
>>>
>>>> The patch has been updated in
>>>> http://cr.openjdk.java.net/~njian/8169697/webrev.01/.
>>>
>>>
>>> That patch looks good to me. Can we have a sponsor for it (in jdk 10)?
>>>
>>> Roland.
>>>
>

From robbin.ehn at oracle.com Thu May 4 09:13:05 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Thu, 4 May 2017 11:13:05 +0200
Subject: Low-Overhead Heap Profiling
In-Reply-To:
References:
Message-ID: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com>

Hi,

To me the compiler changes look like what is expected.
It would be good if someone from compiler could take a look at that.
Added compiler to mail thread.

Also adding Serguei, it would be good to have his view as well.

My initial take on it, read through most of the code and took it for a ride.

##############################
- Regarding the compiler changes: I think we need the 'TLAB end' trickery (mentioned by Tony P)
instead of a separate check for sampling in fast path for the final version.

##############################
- This patch I had to apply to get it to compile on JDK 10:

diff -r ac3ded340b35 src/share/vm/gc/shared/collectedHeap.inline.hpp
--- a/src/share/vm/gc/shared/collectedHeap.inline.hpp Fri Apr 28 14:31:38 2017 +0200
+++ b/src/share/vm/gc/shared/collectedHeap.inline.hpp Thu May 04 10:22:56 2017 +0200
@@ -87,3 +87,3 @@
   // support for object alloc event (no-op most of the time)
-  if (klass() != NULL && klass()->name() != NULL) {
+  if (klass != NULL && klass->name() != NULL) {
     Thread *base_thread = Thread::current();
diff -r ac3ded340b35 src/share/vm/runtime/heapMonitoring.cpp
--- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 14:31:38 2017 +0200
+++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:22:56 2017 +0200
@@ -316,3 +316,3 @@
   JavaThread *thread = reinterpret_cast(Thread::current());
-  assert(o->size() << LogHeapWordSize == byte_size,
+  assert(o->size() << LogHeapWordSize == (long)byte_size,
          "Object size is incorrect.");

##############################
- This patch I had to apply to get it not asserting during slowdebug:

--- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 15:15:16 2017 +0200
+++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:24:25 2017 +0200
@@ -32,3 +32,3 @@
 // TODO(jcbeyler): should we make this into a JVMTI structure?
-struct StackTraceData {
+struct StackTraceData : CHeapObj {
   ASGCT_CallTrace *trace;
@@ -143,3 +143,2 @@
 StackTraceStorage::StackTraceStorage() :
-    _allocated_traces(new StackTraceData*[MaxHeapTraces]),
     _allocated_traces_size(MaxHeapTraces),
@@ -147,2 +146,3 @@
     _allocated_count(0) {
+  _allocated_traces = NEW_C_HEAP_ARRAY(StackTraceData*, MaxHeapTraces, mtInternal);
   memset(_allocated_traces, 0, sizeof(*_allocated_traces) * MaxHeapTraces);
@@ -152,3 +152,3 @@
 StackTraceStorage::~StackTraceStorage() {
-  delete[] _allocated_traces;
+  FREE_C_HEAP_ARRAY(StackTraceData*, _allocated_traces);
 }

- Classes should extend the correct base class for the type of memory used for them, e.g.: CHeapObj or StackObj or AllStatic
- The style in heapMonitoring.cpp is a bit different from normal vm-style, e.g. using C++ casts instead of C. You mix NEW_C_HEAP_ARRAY, os::malloc and new.
- In jvmtiHeapTransition.hpp you use C cast instead.

##############################
- This patch I had to apply to get traces without setting an "unrelated" capability
- Should this not be a new capability?

diff -r c02a5d8785bf src/share/vm/prims/forte.cpp
--- a/src/share/vm/prims/forte.cpp Fri Apr 28 15:15:16 2017 +0200
+++ b/src/share/vm/prims/forte.cpp Thu May 04 10:24:25 2017 +0200
@@ -530,6 +530,6 @@
-  if (!JvmtiExport::should_post_class_load()) {
+/*  if (!JvmtiExport::should_post_class_load()) {
     trace->num_frames = ticks_no_class_load; // -1
     return;
-  }
+  }*/

##############################
- forte.cpp: (I know this is not part of your changes but)
find_jmethod_id_or_null gives me NULL for my test.
It looks like we actually want the regular jmethod_id() ?

Since we are the thread we are talking about (and in same ucontext) and thread is in vm and has a last java frame,
I think most of the checks done in AsyncGetCallTrace are irrelevant, so you should be able to call forte_fill_call_trace_given_top directly.
But since we might need jmethod_id() if possible to avoid getting method id NULL,
we need some fixes in forte code, or just do the vframeStream loop inside heapMonitoring.cpp and not use forte.cpp.

Something like:

  if (jthread->has_last_Java_frame()) { // just to be safe
    vframeStream vfst(jthread);
    while (!vfst.at_end()) {
      Method* m = vfst.method();
      m->jmethod_id();
      m->line_number_from_bci(vfst.bci());
      vfst.next();
    }
  }

- This is a bit confusing in forte.cpp, trace->frames[count].lineno = bci.
Line number should be m->line_number_from_bci(bci);
Is heapMonitoring supposed to trace with bci or line number?
I would say bci, meaning we should either rename ASGCT_CallFrame.lineno or use another data structure which says bci.

##############################
- // TODO(jcbeyler): remove this extra code handling the extra trace for
Please fix all these TODO's :)

##############################
- heapMonitoring.hpp:
// TODO(jcbeyler): is this algorithm acceptable in open source?

Why is this comment here? What is the implication?
Have you tested any simpler algorithm?

##############################
- Create a sanity jtreg test. (./hotspot/make/test/JtregNative.gmk for building the agent)

##############################
- monitoring_period vs HeapMonitorRate, pick rate or period.

##############################
- globals.hpp
Why is MaxHeapTraces not settable/overridable from jvmti interface? That would be handy.

##############################
- jvmtiStackTraceData + ASGCT_CallFrame memory
Is the agent supposed to loop through and free all ASGCT_CallFrame?
Wouldn't it be better with some kinda protocol, like:
(*jvmti)->GetLiveTraces(jvmti, &stack_traces, &num_traces);
(*jvmti)->ReleaseTraces(jvmti, stack_traces, num_traces);

Also using another data structure that has num_traces inside it simplifies things.
So I'm not convinced using the async structure is the best way forward.

I have more questions, but I think it's better if you respond and update the code first.

Thanks!
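
The get/release protocol suggested above, with the trace count carried inside the returned structure, could look roughly like the following. This is a minimal sketch under assumptions, not the webrev's or JVMTI's API: SampledFrame, SampledTrace, TraceSet, get_live_traces and release_traces are hypothetical names, and plain malloc/free stands in for whatever allocator the VM would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical types for illustration only.
struct SampledFrame { int bci; const void* method_id; };
struct SampledTrace { SampledFrame* frames; int num_frames; long byte_size; };

// The count travels with the array, so the caller never tracks it separately.
struct TraceSet { SampledTrace* traces; int num_traces; };

// Frees everything owned by the set and leaves it empty (safe to call twice).
inline void release_traces(TraceSet* set) {
    for (int i = 0; i < set->num_traces; i++) {
        std::free(set->traces[i].frames);
    }
    std::free(set->traces);
    set->traces = nullptr;
    set->num_traces = 0;
}

// Deep-copies `count` live traces into `out`; the caller owns the copy and
// hands it back with release_traces(). Returns false on allocation failure.
inline bool get_live_traces(const SampledTrace* live, int count, TraceSet* out) {
    out->traces = nullptr;
    out->num_traces = 0;
    if (count <= 0) return true;
    SampledTrace* copy =
        static_cast<SampledTrace*>(std::calloc(count, sizeof(SampledTrace)));
    if (copy == nullptr) return false;
    out->traces = copy;
    out->num_traces = count;  // set early so a partial copy can be released
    for (int i = 0; i < count; i++) {
        copy[i].num_frames = live[i].num_frames;
        copy[i].byte_size = live[i].byte_size;
        if (live[i].num_frames > 0) {
            std::size_t bytes = sizeof(SampledFrame) * live[i].num_frames;
            copy[i].frames = static_cast<SampledFrame*>(std::malloc(bytes));
            if (copy[i].frames == nullptr) {
                release_traces(out);
                return false;
            }
            std::memcpy(copy[i].frames, live[i].frames, bytes);
        }
    }
    return true;
}
```

The point of the pairing is that the agent never needs to know how frames are laid out or allocated: it gets one handle and returns one handle.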
/Robbin

On 04/21/2017 11:34 PM, JC Beyler wrote:
> Hi all,
>
> I've added size information to the allocation sampling system. This allows the callback to remember the size of each sampled allocation.
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/
>
> The new webrev.01 also adds the actual heap monitoring sampling system in files:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.cpp.patch
> and
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.hpp.patch
>
> My next step is to add the GC part to the webrev, which will allow users to determine what objects are live and what are garbage.
>
> Thanks for your attention and let me know if there are any questions!
>
> Have a wonderful Friday!
> Jc
>
> On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler wrote:
>
> Hi all,
>
> I worked on getting a few numbers for overhead and accuracy for my feature. I'm unsure if here is the right place to provide the full data, so I am just summarizing
> here for now.
>
> - Overhead of the feature
>
> Using the Dacapo benchmark (http://dacapobench.org/). My initial results are that sampling provides 2.4% with a 512k sampling, 512k being our default setting.
>
> - Note: this was without the tradesoap, tradebeans and tomcat benchmarks since they did not work with my JDK9 (issue between Dacapo and JDK9 it seems)
> - I want to rerun next week to ensure number stability
>
> - Accuracy of the feature
>
> I wrote a small microbenchmark that allocates from two different stacktraces at a given ratio. For example, 10% of stacktrace S1 and 90% from stacktrace S2. The
> microbenchmark was run 20 times, I averaged the results and looked for accuracy. It seems that statistically it is sound since if I allocated 10% S1 and 90% S2, with a
> sampling rate of 512k, I obtained 9.61% S1 and 90.49% S2.
>
> Let me know if there are any questions on the numbers and if you'd like to see some more data.
>
> Note: this was done using our internal JDK8 implementation since the webrev provided by http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html
> does not yet contain the whole implementation and therefore would have been misleading.
>
> Thanks,
> Jc
>
>
> On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler wrote:
>
> Hi all,
>
> To move the discussion forward, with Chuck Rasbold's help to make a webrev, we pushed this:
> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html
> 415 lines changed: 399 ins; 13 del; 3 mod; 51122 unchg
>
> This is not a final change that does the whole proposition from the JBS entry: https://bugs.openjdk.java.net/browse/JDK-8177374;
> what it does show is parts of the implementation that is proposed and hopefully can start the conversation going
> as I work through the details.
>
> For example, the changes to C2 are done here for the allocations: http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch
>
>
> Hopefully this all makes sense and thank you for all your future comments!
> Jc
>
>
> On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler wrote:
>
> Hello all,
>
> This is a follow-up from Jeremy's initial email from last year:
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2015-June/017543.html
>
> I've gone ahead and started working on preparing this and Jeremy and I went down the route of actually writing it up in JEP form:
> https://bugs.openjdk.java.net/browse/JDK-8171119
>
> I think original conversation that happened last year in that thread still holds true:
>
> - We have a patch at Google that we think others might be interested in
> - It provides a means to understand where the allocation hotspots are at a very low overhead
> - Since it is at a low overhead, we can leave it on by default
>
> So I come to the mailing list with Jeremy's initial question:
> "I thought I would ask if there is any interest / if I should write a JEP / if I should just forget it."
>
> A year ago, it seemed some thought it was a good idea, is this still true?
>
> Thanks,
> Jc
>
>
>
>

From goetz.lindenmaier at sap.com Thu May 4 10:57:24 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 4 May 2017 10:57:24 +0000
Subject: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags
Message-ID:

Hi,

This change fixes range handling of a few flags of C2.
This should go to jdk10, and later be downported to some update of jdk9.

Please review this change. I please need a sponsor.
http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.01/

Class WarmCallInfo limits its values to 1.0e10, but the flags used
to set it's fields (HotCallCountThreshold etc.) are limited by max_intx.
Using values over 1.0e10 causes assertions in the debug build.

OptoLoopAlignment must be a multiple of nop size, else it's not
possible to generate the instructions that go into the pad.
On x86 NOP size is 1, so it's no problem.
For SPARC, OptoLoopAlignmentConstraintFunc implements a special
case for bigger NOPs. This is also needed for s390 and ppc.
I just removed the #define, as the code works also on platforms
where NOPsize == 1. Actually, it should be optimized by the C
compiler in these cases.

Best regards,
Goetz.

From rwestrel at redhat.com Thu May 4 16:04:03 2017
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 04 May 2017 18:04:03 +0200
Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash
In-Reply-To:
References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com>
Message-ID:

Hi Volker,

Thanks for looking at this and trying it.

> I obviously prefer the first crash, so maybe you can at least only
> change the platforms which called "breakpoint()" until now?

I thought making the implementation consistent across platforms was
worthwhile but I don't have a strong opinion on this. I won't have to
troubleshoot failures on sparc so whatever works for those who do is
fine with me. What do you think, Vladimir?

> Not sure if this is easily possible, but I think an ideal solution
> would be to really execute the actual, offending load instead of
> inserting a HaltNode. That would give you a nice hs_err file on all
> platforms where the registers would even contain the offending address
> (NULL) and the offset. I think this could simplify troubleshooting in
> the case of a crash. What do you think?

It's not possible with the current implementation. We would need to turn
the Halt into an uncommon trap. My reasoning was that if we were about
to crash we could as well crash right away. It's true it could make
troubleshooting harder. Note that trying to read an object field off
heap is another bad scenario that would trigger a Halt but it might not
crash if run interpreted. Let's say we uncommon trap instead of crash in
that case, what do we do if we hit that unc several times?

Roland.

From vladimir.kozlov at oracle.com Thu May 4 16:27:08 2017
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 4 May 2017 09:27:08 -0700
Subject: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags
In-Reply-To:
References:
Message-ID: <0d21d8f1-6de3-c93c-07e3-407de3b6a7da@oracle.com>

Looks good to me.

Thanks,
Vladimir

On 5/4/17 3:57 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> This change fixes range handling of a few flags of C2.
> This should go to jdk10, and later be downported to some
> update of jdk9.
>
> Please review this change. I please need a sponsor.
> http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.01/
>
> Class WarmCallInfo limits its values to 1.0e10, but the flags used
> to set it's fields (HotCallCountThreshold etc.) are limited by max_intx.
> Using values over 1.0e10 causes assertions in the debug build.
>
> OptoLoopAlignment must be a multiple of nop size, else it's not
> possible to generate the instructions that go into the pad.
> On x86 NOP size is 1, so it's no problem.
> For SPARC, OptoLoopAlignmentConstraintFunc implements a special
> case for bigger NOPs. This is also needed for s390 and ppc.
> I just removed the #define, as the code works also on platforms
> where NOPsize == 1. Actually, it should be optimized by the C
> compiler in these cases.
>
> Best regards,
> Goetz.
>

From vladimir.kozlov at oracle.com Thu May 4 17:03:26 2017
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 4 May 2017 10:03:26 -0700
Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash
In-Reply-To:
References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com>
Message-ID: <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com>

Can we use SIGILL (illegal instruction) on all platforms? It should be correctly processed on all platforms and generate an hs_err file.

Vladimir

On 5/4/17 9:04 AM, Roland Westrelin wrote:
>
> Hi Volker,
>
> Thanks for looking at this and trying it.
>
>> I obviously prefer the first crash, so maybe you can at least only
>> change the platforms which called "breakpoint()" until now?
>
> I thought making the implementation consistent across platforms was
> worthwhile but I don't have a strong opinion on this. I won't have to
> troubleshoot failures on sparc so whatever works for those who do is
> fine with me. What do you think, Vladimir?
>
>> Not sure if this is easily possible, but I think an ideal solution
>> would be to really execute the actual, offending load instead of
>> inserting a HaltNode. That would give you a nice hs_err file on all
>> platforms where the registers would even contain the offending address
>> (NULL) and the offset. I think this could simplify troubleshooting in
>> the case of a crash. What do you think?
>
> It's not possible with the current implementation. We would need to turn
> the Halt into an uncommon trap. My reasoning was that if we were about
> to crash we could as well crash right away. It's true it could make
> troubleshooting harder. Note that trying to read an object field off
> heap is another bad scenario that would trigger a Halt but it might not
> crash if run interpreted. Let's say we uncommon trap instead of crash in
> that case, what do we do if we hit that unc several times?
>
> Roland.
>

From vivek.r.deshpande at intel.com Fri May 5 00:25:04 2017
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Fri, 5 May 2017 00:25:04 +0000
Subject: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86
In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A63CE6B5B@ORSMSX106.amr.corp.intel.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A63CD479B@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63CDA33F@ORSMSX106.amr.corp.intel.com> <2fbdf68a-9ceb-3183-a12a-bbef83b77c44@oracle.com> <31e0a9e5-c63d-b859-9bd4-9f66e093aed6@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63CE6B5B@ORSMSX106.amr.corp.intel.com>
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A63CE9436@ORSMSX106.amr.corp.intel.com>

Hi Vladimir

I have the updated patch at this location:
http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.02/
This resolves the problem which comes with FASTDEBUG build.
Thanks for reviewing and taking a look at it.
Regards, Vivek -----Original Message----- From: Deshpande, Vivek R Sent: Tuesday, May 02, 2017 12:27 PM To: Vladimir Kozlov; hotspot compiler Cc: Viswanathan, Sandhya Subject: RE: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 Hi Vladimir I will check and get back to you soon. Thanks. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Tuesday, May 2, 2017 11:20 AM To: Deshpande, Vivek R; hotspot compiler Cc: Viswanathan, Sandhya Subject: Re: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 The build failed: # Internal Error (/scratch/opt/jprt/T/P1/175733.vkozlov/s/hotspot/src/cpu/x86/vm/vm_version_x86.hpp:640), pid=7347, tid=7348 # assert(_cpuid_info.std_cpuid1_eax.bits.family != 0) failed: VM_Version not initialized V [libjvm.so+0x93a6a0] report_vm_error(char const*, int, char const*, char const*, ...)+0x60 V [libjvm.so+0x14cb1f0] VM_Version_StubGenerator::generate_get_cpu_info()+0x2f20 V [libjvm.so+0x14c7f53] VM_Version::initialize()+0x103 I think it fails at check is_intel() in set_avx_cpuFeatures() or set_evex_cpuFeatures(). Please, build fastdebug JVM for testing. Vladimir On 5/2/17 10:59 AM, Vladimir Kozlov wrote: > Looks good. One nit - move next code into separate function instead of duplicating 3 times: > > + if( is_intel() ) { // Intel cpus specific settings > + if ((cpu_family() == 0x06) && > + ((extended_cpu_model() == 0x57) || // Xeon Phi 3200/5200/7200 > + (extended_cpu_model() == 0x85))) { // Future Xeon Phi > + _features &= ~CPU_VZEROUPPER; > + } > + } > > I will run testing for current changes and let you know results. > > Thanks, > Vladimir > > On 4/20/17 1:54 PM, Deshpande, Vivek R wrote: >> HI Vladimir >> >> We added almost all the vzeroupper instructions by analyzing SPECjbb2015. >> >> We need vzeroupper after we execute avx2/avx-512 and before >> transition to SSE to avoid penalty of saving and restoring higher bits in YMM/ZMM registers. 
Since JVM is SSE compiled, mixing of AVX and SSE code happens in C1 and interpreter with usage of intrinsics, so I have added vzerouppers at these transitions in the stubs. >> >> I think max_vector_size() > 16 would be with auto vectorization, but we need to have vzeroupper with stubs and intrinsics. >> >> I have made all the changes which you suggested. Please take a look at the patch and let me know your thoughts. >> >> The updated Webrev is here: >> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.01/ >> >> Thank you. >> Regards, >> Vivek >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Friday, April 14, 2017 12:48 PM >> To: Deshpande, Vivek R; hotspot compiler >> Cc: Viswanathan, Sandhya >> Subject: Re: RFR(S): 8178811: Minimize the AVX <-> SSE transition >> penalty on x86 >> >> Hi Vivek, >> >> Did you pinpoint particular change which helps SPECjbb2015? >> >> From what I remember it is mostly during call to runtime or JNI calls. >> >> // AVX instruction which is used to clear upper 128 bits of YMM registers and >> // to avoid transaction penalty between AVX and SSE states. There is no >> // penalty if legacy SSE instructions are encoded using VEX prefix because >> // they always clear upper 128 bits. It should be used before calling >> // runtime code and native libraries. >> void vzeroupper(); >> >> So why you added vzeroupper() into arraycopy and other stubs? If AVX is supported all instructions should be encoded with VEX prefix. Or the statement in the comment is incorrect? >> >> About UseVzeroupper flag. We should avoid adding new product flags if possible. >> IT only make sense if you want to do experiments to check performance with and without it. But we already know that it is needed on most CPUs. >> >> I would suggest to add VM_Version::supports_vzeroupper() { return >> (_features & CPU_VZEROUPPER) != 0; } Set CPU_VZEROUPPE if AVX is supported and clear it for Knights CPU. 
The add check inside assembler instruction: >> >> void Assembler::vzeroupper() { >> - assert(VM_Version::supports_avx(), ""); >> + if (VM_Version::supports_vzeroupper()) { >> >> It will allow to avoid supports_avx() checks on each vzeroupper() call. >> >> You missed check in MachEpilogNode::emit() in x86_64.ad >> >> Note, I used (C->max_vector_size() > 16) checks in .ad files because of the same reasons as above: >> >> // There is no penalty if legacy SSE instructions are encoded using VEX prefix because >> // they always clear upper 128 bits. >> >> I thought if vectors are small and we use VEX prefix then upper bits will be 0 anyway so you don't need vzeroupper(). >> >> Thanks, >> Vladimir >> >> On 4/14/17 11:17 AM, Deshpande, Vivek R wrote: >>> Hi All >>> >>> >>> >>> This fix minimizes the AVX to SSE and SSE to AVX transition penalty >>> through generation of vzeroupper instruction. With this patch we see >>> zero transitions with penalty per SPECjbb2015 jOPS on BDW and a >>> significant reduction on SKX CPU event vector width mismatch from 65 >>> to 0.01 per SPECjbb2015 jOPS. We have also implemented an enhancement to disable vzeroupper generation for Knights family where the instruction has high penalty and is not recommended. The option UseVzeroupper is used to control generation of vzeroupper instruction and gets set to false on the Knights family. >>> We observed ~3% gain on SPECJvm2008 composite result on Skylake. >>> >>> Webrev: >>> >>> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.00/ >>> >>> I have also updated the JBS entry. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8178811 >>> >>> Would you please review and sponsor it. 
>>> >>> >>> >>> Regards, >>> >>> Vivek >>> >>> >>> From vladimir.kozlov at oracle.com Fri May 5 01:04:48 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 4 May 2017 18:04:48 -0700 Subject: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A63CE9436@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A63CD479B@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63CDA33F@ORSMSX106.amr.corp.intel.com> <2fbdf68a-9ceb-3183-a12a-bbef83b77c44@oracle.com> <31e0a9e5-c63d-b859-9bd4-9f66e093aed6@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63CE6B5B@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63CE9436@ORSMSX106.amr.corp.intel.com> Message-ID: Good. Build passed. Let see how testing will go. Thanks, Vladimir On 5/4/17 5:25 PM, Deshpande, Vivek R wrote: > HI Vladimir > > I have the updated patch at this location: > http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.02/ > This resolves the problem which comes with FASTDEBUG build. > Thanks for reviewing and taking a look at it. > > Regards, > Vivek > > -----Original Message----- > From: Deshpande, Vivek R > Sent: Tuesday, May 02, 2017 12:27 PM > To: Vladimir Kozlov; hotspot compiler > Cc: Viswanathan, Sandhya > Subject: RE: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 > > Hi Vladimir > > I will check and get back to you soon. > Thanks. 
> > Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Tuesday, May 2, 2017 11:20 AM > To: Deshpande, Vivek R; hotspot compiler > Cc: Viswanathan, Sandhya > Subject: Re: RFR(S): 8178811: Minimize the AVX <-> SSE transition penalty on x86 > > The build failed: > > # Internal Error (/scratch/opt/jprt/T/P1/175733.vkozlov/s/hotspot/src/cpu/x86/vm/vm_version_x86.hpp:640), pid=7347, tid=7348 # assert(_cpuid_info.std_cpuid1_eax.bits.family != 0) failed: VM_Version not initialized > > V [libjvm.so+0x93a6a0] report_vm_error(char const*, int, char const*, char const*, ...)+0x60 V [libjvm.so+0x14cb1f0] VM_Version_StubGenerator::generate_get_cpu_info()+0x2f20 > V [libjvm.so+0x14c7f53] VM_Version::initialize()+0x103 > > I think it fails at check is_intel() in set_avx_cpuFeatures() or set_evex_cpuFeatures(). > Please, build fastdebug JVM for testing. > > Vladimir > > On 5/2/17 10:59 AM, Vladimir Kozlov wrote: >> Looks good. One nit - move next code into separate function instead of duplicating 3 times: >> >> + if( is_intel() ) { // Intel cpus specific settings >> + if ((cpu_family() == 0x06) && >> + ((extended_cpu_model() == 0x57) || // Xeon Phi 3200/5200/7200 >> + (extended_cpu_model() == 0x85))) { // Future Xeon Phi >> + _features &= ~CPU_VZEROUPPER; >> + } >> + } >> >> I will run testing for current changes and let you know results. >> >> Thanks, >> Vladimir >> >> On 4/20/17 1:54 PM, Deshpande, Vivek R wrote: >>> HI Vladimir >>> >>> We added almost all the vzeroupper instructions by analyzing SPECjbb2015. >>> >>> We need vzeroupper after we execute avx2/avx-512 and before >>> transition to SSE to avoid penalty of saving and restoring higher bits in YMM/ZMM registers. Since JVM is SSE compiled, mixing of AVX and SSE code happens in C1 and interpreter with usage of intrinsics, so I have added vzerouppers at these transitions in the stubs. 
>>> >>> I think max_vector_size() > 16 would be with auto vectorization, but we need to have vzeroupper with stubs and intrinsics. >>> >>> I have made all the changes which you suggested. Please take a look at the patch and let me know your thoughts. >>> >>> The updated Webrev is here: >>> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.01/ >>> >>> Thank you. >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>> Sent: Friday, April 14, 2017 12:48 PM >>> To: Deshpande, Vivek R; hotspot compiler >>> Cc: Viswanathan, Sandhya >>> Subject: Re: RFR(S): 8178811: Minimize the AVX <-> SSE transition >>> penalty on x86 >>> >>> Hi Vivek, >>> >>> Did you pinpoint the particular change which helps SPECjbb2015? >>> >>> From what I remember it is mostly during call to runtime or JNI calls. >>> >>> // AVX instruction which is used to clear upper 128 bits of YMM registers and >>> // to avoid transaction penalty between AVX and SSE states. There is no >>> // penalty if legacy SSE instructions are encoded using VEX prefix because >>> // they always clear upper 128 bits. It should be used before calling >>> // runtime code and native libraries. >>> void vzeroupper(); >>> >>> So why did you add vzeroupper() into arraycopy and other stubs? If AVX is supported all instructions should be encoded with VEX prefix. Or is the statement in the comment incorrect? >>> >>> About UseVzeroupper flag. We should avoid adding new product flags if possible. >>> It only makes sense if you want to do experiments to check performance with and without it. But we already know that it is needed on most CPUs. >>> >>> I would suggest to add VM_Version::supports_vzeroupper() { return >>> (_features & CPU_VZEROUPPER) != 0; } Set CPU_VZEROUPPER if AVX is supported and clear it for Knights CPU.
Then add the check inside the assembler instruction: >>> >>> void Assembler::vzeroupper() { >>> - assert(VM_Version::supports_avx(), ""); >>> + if (VM_Version::supports_vzeroupper()) { >>> >>> It will allow to avoid supports_avx() checks on each vzeroupper() call. >>> >>> You missed the check in MachEpilogNode::emit() in x86_64.ad >>> >>> Note, I used (C->max_vector_size() > 16) checks in .ad files because of the same reasons as above: >>> >>> // There is no penalty if legacy SSE instructions are encoded using VEX prefix because >>> // they always clear upper 128 bits. >>> >>> I thought if vectors are small and we use VEX prefix then upper bits will be 0 anyway so you don't need vzeroupper(). >>> >>> Thanks, >>> Vladimir >>> >>> On 4/14/17 11:17 AM, Deshpande, Vivek R wrote: >>>> Hi All >>>> >>>> >>>> >>>> This fix minimizes the AVX to SSE and SSE to AVX transition penalty >>>> through generation of vzeroupper instruction. With this patch we see >>>> zero transitions with penalty per SPECjbb2015 jOPS on BDW and a >>>> significant reduction on SKX CPU event vector width mismatch from 65 >>>> to 0.01 per SPECjbb2015 jOPS. We have also implemented an enhancement to disable vzeroupper generation for Knights family where the instruction has high penalty and is not recommended. The option UseVzeroupper is used to control generation of vzeroupper instruction and gets set to false on the Knights family. >>>> We observed ~3% gain on SPECjvm2008 composite result on Skylake. >>>> >>>> Webrev: >>>> >>>> http://cr.openjdk.java.net/~vdeshpande/8178811/webrev.00/ >>>> >>>> I have also updated the JBS entry. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8178811 >>>> >>>> Would you please review and sponsor it.
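A minimal sketch of the feature-bit gating suggested in the review quoted above, assuming a simplified standalone model rather than HotSpot's actual VM_Version and Assembler classes. The bit values, the CpuInfo struct and the helpers is_knights_family, compute_features and emit_vzeroupper are hypothetical; only the names CPU_VZEROUPPER and supports_vzeroupper and the family/model constants (0x06, 0x57, 0x85) come from the thread.

```cpp
#include <cstdint>

// Hypothetical feature bits; HotSpot's real values differ.
constexpr uint64_t CPU_AVX        = 1ULL << 0;
constexpr uint64_t CPU_VZEROUPPER = 1ULL << 1;

// Simplified stand-in for the CPUID data that VM_Version reads.
struct CpuInfo {
  bool     is_intel;
  bool     has_avx;
  uint32_t family;
  uint32_t extended_model;
};

// The Knights check from the review, kept in one function instead of
// being duplicated at three call sites.
inline bool is_knights_family(const CpuInfo& cpu) {
  return cpu.is_intel && cpu.family == 0x06 &&
         (cpu.extended_model == 0x57 ||   // Xeon Phi 3200/5200/7200
          cpu.extended_model == 0x85);    // future Xeon Phi
}

// Set CPU_VZEROUPPER whenever AVX is available, then clear it on Knights.
inline uint64_t compute_features(const CpuInfo& cpu) {
  uint64_t features = 0;
  if (cpu.has_avx) {
    features |= CPU_AVX | CPU_VZEROUPPER;
  }
  if (is_knights_family(cpu)) {
    features &= ~CPU_VZEROUPPER;
  }
  return features;
}

inline bool supports_vzeroupper(uint64_t features) {
  return (features & CPU_VZEROUPPER) != 0;
}

// Emission is gated on the feature bit, so call sites need no separate
// supports_avx() check. Returns the number of bytes written
// (vzeroupper encodes as C5 F8 77).
inline int emit_vzeroupper(uint64_t features, uint8_t* out) {
  if (!supports_vzeroupper(features)) {
    return 0;
  }
  out[0] = 0xC5;
  out[1] = 0xF8;
  out[2] = 0x77;
  return 3;
}
```

With this shape a stub can call emit_vzeroupper unconditionally: on a CPU without AVX, or on the Knights family, the call writes nothing.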
>>>> >>>> >>>> >>>> Regards, >>>> >>>> Vivek >>>> >>>> >>>> From serguei.spitsyn at oracle.com Fri May 5 06:21:55 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 4 May 2017 23:21:55 -0700 Subject: Low-Overhead Heap Profiling In-Reply-To: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> Message-ID: <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com> Robbin, Thank you for forwarding! I will review it. Thanks, Serguei On 5/4/17 02:13, Robbin Ehn wrote: > Hi, > > To me the compiler changes looks what is expected. > It would be good if someone from compiler could take a look at that. > Added compiler to mail thread. > > Also adding Serguei, It would be good with his view also. > > My initial take on it, read through most of the code and took it for a > ride. > > ############################## > - Regarding the compiler changes: I think we need the 'TLAB end' > trickery (mentioned by Tony P) > instead of a separate check for sampling in fast path for the final > version. 
> > ############################## > - This patch I had to apply to get it compile on JDK 10: > > diff -r ac3ded340b35 src/share/vm/gc/shared/collectedHeap.inline.hpp > --- a/src/share/vm/gc/shared/collectedHeap.inline.hpp Fri Apr 28 > 14:31:38 2017 +0200 > +++ b/src/share/vm/gc/shared/collectedHeap.inline.hpp Thu May 04 > 10:22:56 2017 +0200 > @@ -87,3 +87,3 @@ > // support for object alloc event (no-op most of the time) > - if (klass() != NULL && klass()->name() != NULL) { > + if (klass != NULL && klass->name() != NULL) { > Thread *base_thread = Thread::current(); > diff -r ac3ded340b35 src/share/vm/runtime/heapMonitoring.cpp > --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 14:31:38 > 2017 +0200 > +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:22:56 > 2017 +0200 > @@ -316,3 +316,3 @@ > JavaThread *thread = reinterpret_cast *>(Thread::current()); > - assert(o->size() << LogHeapWordSize == byte_size, > + assert(o->size() << LogHeapWordSize == (long)byte_size, > "Object size is incorrect."); > > ############################## > - This patch I had to apply to get it not asserting during slowdebug: > > --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 15:15:16 > 2017 +0200 > +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:24:25 > 2017 +0200 > @@ -32,3 +32,3 @@ > // TODO(jcbeyler): should we make this into a JVMTI structure? 
> -struct StackTraceData { > +struct StackTraceData : CHeapObj { > ASGCT_CallTrace *trace; > @@ -143,3 +143,2 @@ > StackTraceStorage::StackTraceStorage() : > - _allocated_traces(new StackTraceData*[MaxHeapTraces]), > _allocated_traces_size(MaxHeapTraces), > @@ -147,2 +146,3 @@ > _allocated_count(0) { > + _allocated_traces = NEW_C_HEAP_ARRAY(StackTraceData*, > MaxHeapTraces, mtInternal); > memset(_allocated_traces, 0, sizeof(*_allocated_traces) * > MaxHeapTraces); > @@ -152,3 +152,3 @@ > StackTraceStorage::~StackTraceStorage() { > - delete[] _allocated_traces; > + FREE_C_HEAP_ARRAY(StackTraceData*, _allocated_traces); > } > > - Classes should extend the correct base class for the type of memory > used for them, e.g.: CHeapObj or StackObj or AllStatic > - The style in heapMonitoring.cpp is a bit different from normal > vm-style, e.g. using C++ casts instead of C. You mix NEW_C_HEAP_ARRAY, > os::malloc and new. > - In jvmtiHeapTransition.hpp you use C cast instead. > > ############################## > - This patch I had to apply to get traces without setting an "unrelated" > capability > - Should this not be a new capability? > > diff -r c02a5d8785bf src/share/vm/prims/forte.cpp > --- a/src/share/vm/prims/forte.cpp Fri Apr 28 15:15:16 2017 +0200 > +++ b/src/share/vm/prims/forte.cpp Thu May 04 10:24:25 2017 +0200 > @@ -530,6 +530,6 @@ > > - if (!JvmtiExport::should_post_class_load()) { > +/* if (!JvmtiExport::should_post_class_load()) { > trace->num_frames = ticks_no_class_load; // -1 > return; > - } > + }*/ > > ############################## > - forte.cpp: (I know this is not part of your changes but) > find_jmethod_id_or_null gives me NULL for my test. > It looks like we actually want the regular jmethod_id() ? > > Since we are the thread we are talking about (and in same ucontext) > and thread is in vm and has a last java frame, > I think most of the checks done in AsyncGetCallTrace are irrelevant, so > you should be able to call forte_fill_call_trace_given_top directly.
> But since we might need jmethod_id() if possible to avoid getting > method id NULL, > we need some fixes in forte code, or just do the vframStream loop > inside heapMonitoring.cpp and not use forte.cpp. > > Something like: > > if (jthread->has_last_Java_frame()) { // just to be safe > vframeStream vfst(jthread); > while (!vfst.at_end()) { > Method* m = vfst.method(); > m->jmethod_id(); > m->line_number_from_bci(vfst.bci()); > vfst.next(); > } > > - This is a bit confusing in forte.cpp, trace->frames[count].lineno = > bci. > Line number should be m->line_number_from_bci(bci); > Do the heapMonitoring suppose to trace with bci or line number? > I would say bci, meaning we should either rename > ASGCT_CallFrame?lineno or use another data structure which says bci. > > ############################## > - // TODO(jcbeyler): remove this extra code handling the extra trace for > Please fix all these TODO's :) > > ############################## > - heapMonitoring.hpp: > // TODO(jcbeyler): is this algorithm acceptable in open source? > > Why is this comment here? What is the implication? > Have you tested any simpler algorithm? > > ############################## > - Create a sanity jtreg test. (./hotspot/make/test/JtregNative.gmk for > building the agent) > > ############################## > - monitoring_period vs HeapMonitorRate, pick rate or period. > > ############################## > - globals.hpp > Why is MaxHeapTraces not settable/overridable from jvmti interface? > That would be handy. > > ############################## > - jvmtiStackTraceData + ASGCT_CallFrame memory > Are the agent suppose to loop through and free all ASGCT_CallFrame? > Wouldn't it be better with some kinda protocol, like: > (*jvmti)->GetLiveTraces(jvmti, &stack_traces, &num_traces); > (*jvmti)->ReleaseTraces(jvmti, stack_traces, num_traces); > > Also using another data structure that have num_traces inside it > simplifies things. 
> So I'm not convinced using the async structure is the best way forward. > > > I have more questions, but I think it's better if you respond and > update the code first. > > Thanks! > > /Robbin > > > On 04/21/2017 11:34 PM, JC Beyler wrote: >> Hi all, >> >> I've added size information to the allocation sampling system. This >> allows the callback to remember the size of each sampled allocation. >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ >> >> The new webrev.01 also adds the actual heap monitoring sampling >> system in files: >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.cpp.patch >> >> and >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.hpp.patch >> >> >> My next step is to add the GC part to the webrev, which will allow >> users to determine what objects are live and what are garbage. >> >> Thanks for your attention and let me know if there are any questions! >> >> Have a wonderful Friday! >> Jc >> >> On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler > > wrote: >> >> Hi all, >> >> I worked on getting a few numbers for overhead and accuracy for >> my feature. I'm unsure if here is the right place to provide the full >> data, so I am just summarizing >> here for now. >> >> - Overhead of the feature >> >> Using the Dacapo benchmark (http://dacapobench.org/). My initial >> results are that sampling provides 2.4% with a 512k sampling, 512k >> being our default setting. >> >> - Note: this was without the tradesoap, tradebeans and tomcat >> benchmarks since they did not work with my JDK9 (issue between Dacapo >> and JDK9 it seems) >> - I want to rerun next week to ensure number stability >> >> - Accuracy of the feature >> >> I wrote a small microbenchmark that allocates from two different >> stacktraces at a given ratio. For example, 10% of stacktrace S1 and >> 90% from stacktrace S2. 
The >> microbenchmark was run 20 times, I averaged the results and >> looked for accuracy. It seems that statistically it is sound since if >> I allocated10% S1 and 90% S2, with a >> sampling rate of 512k, I obtained 9.61% S1 and 90.49% S2. >> >> Let me know if there are any questions on the numbers and if >> you'd like to see some more data. >> >> Note: this was done using our internal JDK8 implementation since >> the webrev provided by >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html >> does >> not yet contain the whole implementation and therefore would have >> been misleading. >> >> Thanks, >> Jc >> >> >> On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler > > wrote: >> >> Hi all, >> >> To move the discussion forward, with Chuck Rasbold's help to >> make a webrev, we pushed this: >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html >> >> 415 lines changed: 399 ins; 13 del; 3 mod; 51122 unchg >> >> This is not a final change that does the whole proposition >> from the JBS entry: https://bugs.openjdk.java.net/browse/JDK-8177374 >> ; what it does show >> is parts of the implementation that is proposed and hopefully can >> start the conversation going >> as I work through the details. >> >> For example, the changes to C2 are done here for the >> allocations: >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch >> >> >> Hopefully this all makes sense and thank you for all your >> future comments! 
>> Jc >> >> >> On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler >> > wrote: >> >> Hello all, >> >> This is a follow-up from Jeremy's initial email from last >> year: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2015-June/017543.html >> >> >> >> I've gone ahead and started working on preparing this and >> Jeremy and I went down the route of actually writing it up in JEP form: >> https://bugs.openjdk.java.net/browse/JDK-8171119 >> >> I think original conversation that happened last year in >> that thread still holds true: >> >> - We have a patch at Google that we think others might >> be interested in >> - It provides a means to understand where the >> allocation hotspots are at a very low overhead >> - Since it is at a low overhead, we can leave it on >> by default >> >> So I come to the mailing list with Jeremy's initial >> question: >> "I thought I would ask if there is any interest / if I >> should write a JEP / if I should just forget it." >> >> A year ago, it seemed some thought it was a good idea, is >> this still true? >> >> Thanks, >> Jc >> >> >> >> >> From rwestrel at redhat.com Fri May 5 14:17:45 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Fri, 05 May 2017 16:17:45 +0200 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> Message-ID: > Can use SIGILL (illegal instruction) on all platforms? It should be > correctly processed on all platforms and generate hs_err file. Ok. Do you have a instruction sequence to recommend to trigger SIGILL on x86? Roland. 
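A minimal sketch for the question above, assuming GCC or Clang on a POSIX system. On x86 the two-byte UD2 opcode (0x0F, 0x0B) is defined to raise an invalid-opcode exception, which the operating system delivers as SIGILL. The helper name raises_sigill and the raise(SIGILL) fallback for other architectures are illustrative only and are not HotSpot code.

```cpp
#include <setjmp.h>
#include <signal.h>

// Jump target used by the handler to get back out of the faulting code.
static sigjmp_buf g_sigill_jmp;

static void on_sigill(int) {
  siglongjmp(g_sigill_jmp, 1);
}

// Executes an invalid instruction and reports whether SIGILL was delivered.
bool raises_sigill() {
  struct sigaction sa = {};
  struct sigaction old_sa = {};
  sa.sa_handler = on_sigill;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGILL, &sa, &old_sa) != 0) {
    return false;
  }

  volatile bool caught = false;
  // Second argument 1: save the signal mask so siglongjmp restores it.
  if (sigsetjmp(g_sigill_jmp, 1) == 0) {
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile(".byte 0x0f, 0x0b");  // UD2
#else
    raise(SIGILL);  // fallback so the sketch still runs on non-x86 hosts
#endif
  } else {
    caught = true;  // reached via siglongjmp from the handler
  }

  sigaction(SIGILL, &old_sa, nullptr);  // restore the previous handler
  return caught;
}
```

Without the handler installed, the same two bytes would terminate the process with SIGILL, which is the behaviour being asked for when the goal is to end up with an hs_err file.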
From vladimir.kozlov at oracle.com Fri May 5 18:07:12 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 5 May 2017 11:07:12 -0700 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <6ce634a9-cd51-d98b-8e46-fbe6cb1f3f43@oracle.com> <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> Message-ID: <6cea2fa0-1bcf-3c86-1494-a51f66430577@oracle.com> NativeIllegalInstruction from nativeInst_.hpp x86: 0x0B0F, // Real byte order is: 0x0F, 0x0B But I CCing to Intel for verification. Vivek, is it still true that these 2 bytes sequence will cause SIGILL? Or you have other sequence? X86 Manual says: "Use the 0F0B opcode (UD2 instruction) or the 0FB9H opcode when deliberately trying to generate an invalid opcode exception (#UD)." Vladimir On 5/5/17 7:17 AM, Roland Westrelin wrote: > >> Can use SIGILL (illegal instruction) on all platforms? It should be >> correctly processed on all platforms and generate hs_err file. > > Ok. Do you have a instruction sequence to recommend to trigger SIGILL on > x86? > > Roland. > From vladimir.kozlov at oracle.com Sat May 6 02:08:50 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 5 May 2017 19:08:50 -0700 Subject: [10] RFFR(M) 8179656: [AOT] Add AOT manual test scripts Message-ID: http://cr.openjdk.java.net/~kvn/8179656/webrev/ https://bugs.openjdk.java.net/browse/JDK-8179656 We have set of manual scripts we used during AOT development. They mostly are testing 'jaotc' tool by compiling different modules. They are also good examples how to use 'jaotc'. Tested on Linux and OS X. 
Thanks, Vladimir From igor.veresov at oracle.com Sat May 6 04:11:47 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 5 May 2017 21:11:47 -0700 Subject: [10] RFFR(M) 8179656: [AOT] Add AOT manual test scripts In-Reply-To: References: Message-ID: Looks good. igor > On May 5, 2017, at 7:08 PM, Vladimir Kozlov wrote: > > http://cr.openjdk.java.net/~kvn/8179656/webrev/ > https://bugs.openjdk.java.net/browse/JDK-8179656 > > We have set of manual scripts we used during AOT development. > They mostly are testing 'jaotc' tool by compiling different modules. > They are also good examples how to use 'jaotc'. > > Tested on Linux and OS X. > > Thanks, > Vladimir From vladimir.kozlov at oracle.com Sat May 6 04:31:30 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 5 May 2017 21:31:30 -0700 Subject: [10] RFFR(M) 8179656: [AOT] Add AOT manual test scripts In-Reply-To: References: Message-ID: <372cd8f0-4c3f-0079-c166-e51041668225@oracle.com> Thank you, Igor Vladimir On 5/5/17 9:11 PM, Igor Veresov wrote: > Looks good. > > igor > >> On May 5, 2017, at 7:08 PM, Vladimir Kozlov wrote: >> >> http://cr.openjdk.java.net/~kvn/8179656/webrev/ >> https://bugs.openjdk.java.net/browse/JDK-8179656 >> >> We have set of manual scripts we used during AOT development. >> They mostly are testing 'jaotc' tool by compiling different modules. >> They are also good examples how to use 'jaotc'. >> >> Tested on Linux and OS X. >> >> Thanks, >> Vladimir > From david.holmes at oracle.com Mon May 8 06:27:39 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 May 2017 16:27:39 +1000 Subject: JDK10/RFR(L): 8172231: SPARC ISA/CPU feature detection is broken/insufficient (on Solaris). In-Reply-To: <6801aebd-eb60-6711-2859-0096818007eb@oracle.com> References: <6801aebd-eb60-6711-2859-0096818007eb@oracle.com> Message-ID: <2fb17c2c-684a-267f-5e3b-ae7defeed874@oracle.com> Hi Patric, I have read the below and looked through the proposed changes. 
While I can't validate the details (as I am not familiar with chip capabilities) the overall approach looks good and I prefer the capability-based tests to the "family" based tests. As you note there are a few fixme's and follow ups to do, but one suggestion I have is to remove the UseV8InstrsOnly flag. It doesn't make sense to me to keep this if we will abort on V8 anyway. The nature of the flag, in an unsupported environment, precludes it from following our more usual deprecation process, and it is a develop flag anyway. I do have concerns about how this may work on Fujitsu, but hopefully there is plenty of bake-time in JDK 10 to shake out any issue. Thanks, David On 28/04/2017 11:48 PM, Patric Hedlin wrote: > Dear all, > > I would like to ask for help to review the following change/update: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8172231 > > Webrev: http://cr.openjdk.java.net/~neliasso/8172231/ > > > > 8172231: SPARC ISA/CPU feature detection is broken/insufficient (on > Solaris). > > Updating SPARC feature/capability detection (incorporating changes > from Martin Walsh). > More complete set of features as provided by 'getisax(2)' interface, > propagated via JVMCI. > More robust hardware probing for additional features (up to Core S4). > Removing support for old, pre Niagara, hardware. > Removing support for old, pre 11.1, Solaris. > > Changed behaviour: > Changing SPARC setup for AllocatePrefetchLines and > AllocateInstancePrefetchLines > such that they will (still) be doubled when cache-line size is small > (32 bytes), > but more moderately increased on new/contemporary hardware (inc >= > 50%). > Changing to default instruction fetch alignment based on derived > caps. instead > of relying on default/configuration values. 
> > The above changes also subsumes: > 8035146: assert(is_T_family(features) == is_niagara(features), > "Niagara should be T series") is incorrect > 8054979: Remove unnecessary defines in SPARC's > VM_Version::platform_features > > > Rationale: > > Current hardware detection on Solaris/SPARC is not up to date with > the "latest" (here, > meaning commercially available server solutions, i.e. T7/M7). To > facilitate improved > use of the new hardware features provided (by Core S3&S4) these > capabilities need to > be recognised by the JVM. > > NOTE: This update is limited to Core S3&S4, i.e. not including Core > S5. Proper Core S5 > support will be added when regular testing and benchmarking > resources are available, > i.e. regular testing need to include M8 hardware. > > > Caveat: > > This update will introduce some redundancies into the code base, > features and definitions > currently not used, as well as a (small) number of FIXMEs, addressed > by subsequent bug or > feature updates/patches. Fujitsu HW is treated very conservatively. > > > Testing: > > Mostly tested on JDK9 (RBT/hotspot/comp). Only local testing on > JDK10 (jtreg/hotspot). > > > Benchmarking: > > Benchmark reports from a limited set of runs can be found at: > > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.jbb05 > > > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.jvm08 > > > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.octane.plus > > > > (Limited availability of M7 hardware prevents complete suites/runs.) > > > Best regards, > Patric > From cthalinger at twitter.com Mon May 8 22:32:25 2017 From: cthalinger at twitter.com (Christian Thalinger) Date: Mon, 8 May 2017 12:32:25 -1000 Subject: [10] RFFR(M) 8179656: [AOT] Add AOT manual test scripts In-Reply-To: References: Message-ID: Quick question: are we still seeing this? 
33 # assert(referenceMask != -1) failed: must not be a derived reference type 34 exclude com.sun.crypto.provider.AESWrapCipher.engineUnwrap([BLjava/lang/String;I)Ljava/security/Key; 35 exclude sun.security.ssl.* 36 exclude sun.net.RegisteredDomain.()V > On May 5, 2017, at 4:08 PM, Vladimir Kozlov wrote: > > http://cr.openjdk.java.net/~kvn/8179656/webrev/ > https://bugs.openjdk.java.net/browse/JDK-8179656 > > We have set of manual scripts we used during AOT development. > They mostly are testing 'jaotc' tool by compiling different modules. > They are also good examples how to use 'jaotc'. > > Tested on Linux and OS X. > > Thanks, > Vladimir From thomas.stuefe at gmail.com Tue May 9 12:58:36 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 9 May 2017 14:58:36 +0200 Subject: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags In-Reply-To: References: Message-ID: Hi Goetz, c2_globals.hpp: - range(0, max_intx) \ + range(0, ((intx)MIN2((int64_t)max_intx,(int64_t)(+1.0e10)))) \ 32bit: I would have expected a build warning for the cast. Is it okay that we can never reach the max value on 32bit? You could probably lose some brackets (around +1.0e10 and around the whole MIN2 expression). commandLineFlagConstraintsCompiler.cpp: CommandLineError::print(verbose, "OptoLoopAlignment (" INTX_FORMAT ") must be " "multiple of NOP size\n"); There is an error here, the print parameter is missing. Would have expected the compiler to complain, actually - at least the gcc. Again, curious. Kind Regards, Thomas On Thu, May 4, 2017 at 12:57 PM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > Hi, > > > > This change fixes range handling of a few flags of C2. > > This should go to jdk10, and later be downported to some > > update of jdk9. > > > > Please review this change. I please need a sponsor.
> > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.01/ > > > > Class WarmCallInfo limits its values to 1.0e10, but the flags used > > to set its fields (HotCallCountThreshold etc.) are limited by max_intx. > > Using values over 1.0e10 causes assertions in the debug build. > > > > OptoLoopAlignment must be a multiple of nop size, else it's not > > possible to generate the instructions that go into the pad. > > On x86 NOP size is 1, so it's no problem. > > For SPARC, OptoLoopAlignmentConstraintFunc implements a special > > case for bigger NOPs. This is also needed for s390 and ppc. > > I just removed the #define, as the code works also on platforms > > where NOPsize == 1. Actually, it should be optimized by the C > > compiler in these cases. > > > > Best regards, > > Goetz. > From goetz.lindenmaier at sap.com Tue May 9 14:17:35 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 9 May 2017 14:17:35 +0000 Subject: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags In-Reply-To: References: Message-ID: <91d36776f94b4a4c91a7be90013a9f21@sap.com> Hi Thomas, thanks for looking at my change. New webrev: http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.02/ > c2_globals.hpp: > - range(0, max_intx) \ > + range(0, ((intx)MIN2((int64_t)max_intx,(int64_t)(+1.0e10)))) \ > 32bit: I would have expected a build warning for the cast. Is it okay that we can never reach the max value on 32bit? I double checked that there is no warning in our nightly builds and on linuxintel. > commandLineFlagConstraintsCompiler.cpp: > CommandLineError::print(verbose, > "OptoLoopAlignment (" INTX_FORMAT ") must be " > "multiple of NOP size\n"); > There is an error here, the print parameter is missing. Would have expected the compiler to complain, actually - at least the gcc. Again, curious.
Thanks, good catch! The error was there before, but I fixed it anyway. I also added the NOP size. Best regards, Goetz. Kind Regards, Thomas On Thu, May 4, 2017 at 12:57 PM, Lindenmaier, Goetz wrote: Hi, This change fixes range handling of a few flags of C2. This should go to jdk10, and later be downported to some update of jdk9. Please review this change. I please need a sponsor. http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.01/ Class WarmCallInfo limits its values to 1.0e10, but the flags used to set its fields (HotCallCountThreshold etc.) are limited by max_intx. Using values over 1.0e10 causes assertions in the debug build. OptoLoopAlignment must be a multiple of nop size, else it's not possible to generate the instructions that go into the pad. On x86 NOP size is 1, so it's no problem. For SPARC, OptoLoopAlignmentConstraintFunc implements a special case for bigger NOPs. This is also needed for s390 and ppc. I just removed the #define, as the code works also on platforms where NOPsize == 1. Actually, it should be optimized by the C compiler in these cases. Best regards, Goetz. From vladimir.kozlov at oracle.com Tue May 9 15:57:56 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 9 May 2017 08:57:56 -0700 Subject: [10] RFFR(M) 8179656: [AOT] Add AOT manual test scripts In-Reply-To: References: Message-ID: <4a3dad91-9a88-0a17-86e9-fcec3c0f24fa@oracle.com> Thank you for pointing this out, Chris. It passes now. I can exclude it from the list. Vladimir On 5/8/17 3:32 PM, Christian Thalinger wrote: > Quick question: are we still seeing this?
> > 33 # assert(referenceMask != -1) failed: must not be a derived reference type > 34 exclude com.sun.crypto.provider.AESWrapCipher.engineUnwrap([BLjava/lang/String;I)Ljava/security/Key; > 35 exclude sun.security.ssl.* > 36 exclude sun.net.RegisteredDomain.()V > > >> On May 5, 2017, at 4:08 PM, Vladimir Kozlov wrote: >> >> http://cr.openjdk.java.net/~kvn/8179656/webrev/ >> https://bugs.openjdk.java.net/browse/JDK-8179656 >> >> We have set of manual scripts we used during AOT development. >> They mostly are testing 'jaotc' tool by compiling different modules. >> They are also good examples how to use 'jaotc'. >> >> Tested on Linux and OS X. >> >> Thanks, >> Vladimir > From thomas.stuefe at gmail.com Tue May 9 17:53:44 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 9 May 2017 19:53:44 +0200 Subject: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags In-Reply-To: <91d36776f94b4a4c91a7be90013a9f21@sap.com> References: <91d36776f94b4a4c91a7be90013a9f21@sap.com> Message-ID: Hi Goetz, On Tue, May 9, 2017 at 4:17 PM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote: > Hi Thomas, > > thanks for looking at my change. > New webrev: > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.02/ > > > c2_globals.hpp: > > - range(0, max_intx) \ > > + range(0, ((intx)MIN2((int64_t)max_intx,(int64_t)(+1.0e10)))) \ > > 32bit: I would have expected a build warning for the cast. Is it okay that we can never reach the max value on 32bit? > > I double checked that there is no warning in our nightly builds and on linuxintel. > > > commandLineFlagConstraintsCompiler.cpp: > > CommandLineError::print(verbose, > > "OptoLoopAlignment (" INTX_FORMAT ") must be " > > "multiple of NOP size\n"); > > There is an error here, the print parameter is missing. Would have expected the compiler to complain, actually - at least the gcc. Again, curious. > > Thanks, good catch!
The error was there before, but fixed anyways. I also > added the NOP size. > > + // Relevant on ppc, s390, sparc. Will be optimized where + // addr_unit() == 1. if (OptoLoopAlignment % relocInfo::addr_unit() != 0) { CommandLineError::print(verbose, "OptoLoopAlignment (" INTX_FORMAT ") must be " - "multiple of NOP size\n"); + "multiple of NOP size (" INTX_FORMAT ")\n", + value, relocInfo::addr_unit()); We are getting there... addr_unit() returns int, so use %d, not INTX_FORMAT. Apart from that all is fine. No need for a new webrev. ..Thomas > > Best regards, > Goetz. > > > > Kind Regards, Thomas > > On Thu, May 4, 2017 at 12:57 PM, Lindenmaier, Goetz goetz.lindenmaier at sap.com> wrote: > Hi, > > This change fixes range handling of a few flags of C2. > This should go to jdk10, and later be downported to some > update of jdk9. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.01/ > > Class WarmCallInfo limits its values to 1.0e10, but the flags used > to set it's fields (HotCallCountThreshold etc.) are limited by max_intx. > Using values over 1.0e10 causes assertions in the debug build. > > OptoLoopAlignment must be a multiple of nop size, else it's not > possible to generate the instructions that go into the pad. > On x86 NOP size is 1, so it's no problem. > For SPARC, OptoLoopAlignmentConstraintFunc implements a special > case for bigger NOPs. This is also needed for s390 and ppc. > I just removed the #define, as the code works also on platforms > where NOPsize == 1. Actually, it should be optimized by the C > compiler in these cases. > > Best regards, > Goetz. > > -------------- next part -------------- An HTML attachment was scrubbed... 
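The format-specifier point from the review above can be sketched as follows, assuming a plain snprintf in place of HotSpot's CommandLineError::print. The intx typedef and the INTX_FORMAT macro below are local stand-ins that only mirror the HotSpot names, and alignment_error is a hypothetical helper. The idea is that every conversion in the format string needs a matching argument of the right type: the pointer-sized value takes INTX_FORMAT, while the int returned by addr_unit() takes %d.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Local stand-ins, assumed to mirror HotSpot's intx and INTX_FORMAT.
typedef intptr_t intx;
#define INTX_FORMAT "%" PRIdPTR

// Hypothetical helper that builds the constraint error message.
// 'value' is pointer-sized, 'addr_unit' is a plain int, so each gets
// its own conversion specifier and its own argument.
std::string alignment_error(intx value, int addr_unit) {
  char buf[128];
  std::snprintf(buf, sizeof(buf),
                "OptoLoopAlignment (" INTX_FORMAT ") must be "
                "multiple of NOP size (%d)\n",
                value, addr_unit);
  return std::string(buf);
}
```

GCC and Clang check such format strings only for functions they know to be printf-like, which may be why the missing argument went unnoticed in a custom print routine; that explanation is a guess, not something the thread confirms.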
URL: From michael.c.berg at intel.com Wed May 10 19:58:31 2017 From: michael.c.berg at intel.com (Berg, Michael C) Date: Wed, 10 May 2017 19:58:31 +0000 Subject: CR for RFR 8178800 Message-ID: Hi Folks, Some support was added to enable a pass state of all validation tests for KNL for JDK9. This support enables upper bank register usage for auto code generation for a small number of instructions (3) which needed some additional coverage under cases where servers do not enable vector length but have EVEX support. This code was tested as follows: All internal validation tests at both Intel and Oracle. This change needs approval for JDK9 addition. Bug-id: https://bugs.openjdk.java.net/browse/JDK-8178800 webrev: http://cr.openjdk.java.net/~mcberg/8178800/webrev Regards, Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.c.berg at intel.com Wed May 10 23:53:30 2017 From: michael.c.berg at intel.com (Berg, Michael C) Date: Wed, 10 May 2017 23:53:30 +0000 Subject: CR for RFR 8178800 In-Reply-To: References: Message-ID: Some additional notes, the vabsss and vabssd instructions utilize a test on src2 vs dst, this is required as the live range of src2 can end making it available for reuse, in such a case, we manage and use xmm0 so that we have a scratch register to utilize which will facilitate VEX encoding. Regards, Michael From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Berg, Michael C Sent: Wednesday, May 10, 2017 12:59 PM To: 'hotspot-compiler-dev at openjdk.java.net' Subject: CR for RFR 8178800 Hi Folks, Some support was added to enable a pass state of all validation tests for KNL for JDK9. This support enables upper bank register usage for auto code generation for a small number of instructions (3) which needed some additional coverage under cases where servers do not enable vector length but have EVEX support. 
This code was tested as follows: All internal validation tests at both Intel and Oracle. This change needs approval for JDK9 addition. Bug-id: https://bugs.openjdk.java.net/browse/JDK-8178800 webrev: http://cr.openjdk.java.net/~mcberg/8178800/webrev Regards, Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Wed May 10 23:54:48 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 10 May 2017 16:54:48 -0700 Subject: CR for RFR 8178800 In-Reply-To: References: Message-ID: <2175ef0a-7d72-6f23-95c1-e4c37024379d@oracle.com> Hi Michael, Why do you need pshufd()? Why to save xmm0 in register you can use movdqu() but for stack you use evmovdqul()? Why you can use movss(src, nds) and movsd(src, nds) when other instructions use evmovdqul() for that? Why you removed (nds_enc < 16) block? Thanks, Vladimir On 5/10/17 12:58 PM, Berg, Michael C wrote: > Hi Folks, > > Some support was added to enable a pass state of all validation tests > for KNL for JDK9. This support enables upper bank register usage for > auto code generation for a small number of instructions (3) which needed > some additional coverage under cases where servers do not enable vector > length but have EVEX support. > > > > This code was tested as follows: All internal validation tests at both > Intel and Oracle. This change needs approval for JDK9 addition. > > > Bug-id: https://bugs.openjdk.java.net/browse/JDK-8178800 > > > webrev: http://cr.openjdk.java.net/~mcberg/8178800/webrev > > > > Regards, > > Michael > From michael.c.berg at intel.com Thu May 11 00:05:46 2017 From: michael.c.berg at intel.com (Berg, Michael C) Date: Thu, 11 May 2017 00:05:46 +0000 Subject: CR for RFR 8178800 In-Reply-To: <2175ef0a-7d72-6f23-95c1-e4c37024379d@oracle.com> References: <2175ef0a-7d72-6f23-95c1-e4c37024379d@oracle.com> Message-ID: Hi Vladimir, See below for explanations... 
Regards, Michael -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Wednesday, May 10, 2017 4:55 PM To: Berg, Michael C ; 'hotspot-compiler-dev at openjdk.java.net' Subject: Re: CR for RFR 8178800 Hi Michael, Why do you need pshufd()? I found it possible for auto code generation to emit this pattern in memory form, ergo I added the support so that if the dst register is in the upper bank on vector length constrained instructions (128bit, 256bit, etc), that we could encode with VEX on KNL. Why to save xmm0 in register you can use movdqu() but for stack you use evmovdqul()? I will make that consistent too, and change to the full register version. Why you can use movss(src, nds) and movsd(src, nds) when other instructions use evmovdqul() for that? For consistency sake I will make the change, semantically it does not mater though. Why you removed (nds_enc < 16) block? Because the code I found I needed was exactly the same for that case and the final case, ergo I removed some code. Would these changes suffice? They cosmetic in nature, so I will limit my testing. Thanks, Vladimir On 5/10/17 12:58 PM, Berg, Michael C wrote: > Hi Folks, > > Some support was added to enable a pass state of all validation tests > for KNL for JDK9. This support enables upper bank register usage for > auto code generation for a small number of instructions (3) which > needed some additional coverage under cases where servers do not > enable vector length but have EVEX support. > > > > This code was tested as follows: All internal validation tests at both > Intel and Oracle. This change needs approval for JDK9 addition. 
> > > Bug-id: https://bugs.openjdk.java.net/browse/JDK-8178800 > > > webrev: http://cr.openjdk.java.net/~mcberg/8178800/webrev > > > > Regards, > > Michael > From rwestrel at redhat.com Thu May 11 08:33:46 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Thu, 11 May 2017 10:33:46 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash Message-ID: http://cr.openjdk.java.net/~roland/8179678/webrev.00/ When possible: System.arraycopy(src, spos, dst, dpos, l); v = dst[i]; is transformed to: System.arraycopy(src, spos, dst, dpos, l); v = src[i + (spos - dpos)]; So the arraycopy has a chance to be eliminated. This breaks if src and dst are the same arrays and src[i + (spos - dpos)] is written to by the arraycopy. We need to validate that either src[i + (spos - dpos)] is not modified by the arraycopy or src and dst are not the same. Roland. From vladimir.kozlov at oracle.com Thu May 11 16:57:02 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 11 May 2017 09:57:02 -0700 Subject: CR for RFR 8178800 In-Reply-To: References: <2175ef0a-7d72-6f23-95c1-e4c37024379d@oracle.com> Message-ID: Updated webrev: http://cr.openjdk.java.net/~mcberg/8178800/webrev.02/ On 5/10/17 5:05 PM, Berg, Michael C wrote: > Hi Vladimir, > > See below for explanations... > > Regards, > Michael > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Wednesday, May 10, 2017 4:55 PM > To: Berg, Michael C ; 'hotspot-compiler-dev at openjdk.java.net' > Subject: Re: CR for RFR 8178800 > > Hi Michael, > > Why do you need pshufd()? > I found it possible for auto code generation to emit this pattern in memory form, ergo I added the support so that if the dst register is in the upper bank on vector length constrained instructions (128bit, 256bit, etc), that we could encode with VEX on KNL. Okay, I see now. 
You replaced Assembler::pshufd() calls with memory parameter with MacroAssembler::pshufd() to use vex encoding. > > Why to save xmm0 in register you can use movdqu() but for stack you use evmovdqul()? > I will make that consistent too, and change to the full register version. Ok. > > Why you can use movss(src, nds) and movsd(src, nds) when other instructions use evmovdqul() for that? > For consistency sake I will make the change, semantically it does not mater though. Ok. > > Why you removed (nds_enc < 16) block? > Because the code I found I needed was exactly the same for that case and the final case, ergo I removed some code. Ok. > > Would these changes suffice? They cosmetic in nature, so I will limit my testing. Yes. But we will run our RBT testing again. Thanks, Vladimir > > Thanks, > Vladimir > > > On 5/10/17 12:58 PM, Berg, Michael C wrote: >> Hi Folks, >> >> Some support was added to enable a pass state of all validation tests >> for KNL for JDK9. This support enables upper bank register usage for >> auto code generation for a small number of instructions (3) which >> needed some additional coverage under cases where servers do not >> enable vector length but have EVEX support. >> >> >> >> This code was tested as follows: All internal validation tests at both >> Intel and Oracle. This change needs approval for JDK9 addition. >> >> >> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8178800 >> >> >> webrev: http://cr.openjdk.java.net/~mcberg/8178800/webrev >> >> >> >> Regards, >> >> Michael >> From tobias.hartmann at oracle.com Fri May 12 06:14:39 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 12 May 2017 08:14:39 +0200 Subject: CR for RFR 8178800 In-Reply-To: References: <2175ef0a-7d72-6f23-95c1-e4c37024379d@oracle.com> Message-ID: <1575ed4f-4824-cd15-13a9-f8addc8d0e5c@oracle.com> Hi, On 11.05.2017 18:57, Vladimir Kozlov wrote: > http://cr.openjdk.java.net/~mcberg/8178800/webrev.02/ This looks good to me. 
Best regards, Tobias > On 5/10/17 5:05 PM, Berg, Michael C wrote: >> Hi Vladimir, >> >> See below for explanations... >> >> Regards, >> Michael >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Wednesday, May 10, 2017 4:55 PM >> To: Berg, Michael C ; 'hotspot-compiler-dev at openjdk.java.net' >> Subject: Re: CR for RFR 8178800 >> >> Hi Michael, >> >> Why do you need pshufd()? >> I found it possible for auto code generation to emit this pattern in memory form, ergo I added the support so that if the dst register is in the upper bank on vector length constrained instructions (128bit, 256bit, etc), that we could encode with VEX on KNL. > > Okay, I see now. You replaced Assembler::pshufd() calls with memory parameter with MacroAssembler::pshufd() to use vex encoding. > >> >> Why to save xmm0 in register you can use movdqu() but for stack you use evmovdqul()? >> I will make that consistent too, and change to the full register version. > > Ok. > >> >> Why you can use movss(src, nds) and movsd(src, nds) when other instructions use evmovdqul() for that? >> For consistency sake I will make the change, semantically it does not mater though. > > Ok. > >> >> Why you removed (nds_enc < 16) block? >> Because the code I found I needed was exactly the same for that case and the final case, ergo I removed some code. > > Ok. > >> >> Would these changes suffice? They cosmetic in nature, so I will limit my testing. > > Yes. But we will run our RBT testing again. > > Thanks, > Vladimir > >> >> Thanks, >> Vladimir >> >> >> On 5/10/17 12:58 PM, Berg, Michael C wrote: >>> Hi Folks, >>> >>> Some support was added to enable a pass state of all validation tests >>> for KNL for JDK9. This support enables upper bank register usage for >>> auto code generation for a small number of instructions (3) which >>> needed some additional coverage under cases where servers do not >>> enable vector length but have EVEX support. 
>>>
>>>
>>>
>>> This code was tested as follows: All internal validation tests at both
>>> Intel and Oracle. This change needs approval for JDK9 addition.
>>>
>>>
>>> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8178800
>>>
>>>
>>> webrev: http://cr.openjdk.java.net/~mcberg/8178800/webrev
>>>
>>>
>>>
>>> Regards,
>>>
>>> Michael
>>>

From goetz.lindenmaier at sap.com Fri May 12 07:10:11 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 12 May 2017 07:10:11 +0000
Subject: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags
In-Reply-To:
References: <91d36776f94b4a4c91a7be90013a9f21@sap.com>
Message-ID: <75df45a5ae8c4f4f9e06a1ae35e1fda7@sap.com>

Hi,

could someone please sponsor? Thanks!

I fixed the print statement. New webrev anyways:
http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/

Best regards,
  Goetz.

From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
Sent: Tuesday, May 09, 2017 7:54 PM
To: Lindenmaier, Goetz
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags

Hi Goetz,

On Tue, May 9, 2017 at 4:17 PM, Lindenmaier, Goetz > wrote:
Hi Thomas,

thanks for looking at my change.
New webrev:
http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.02/

> c2_globals.hpp:
> - range(0, max_intx) \
> + range(0, ((intx)MIN2((int64_t)max_intx,(int64_t)(+1.0e10)))) \
> 32bit: I would have expected a build warning for the cast. Is it okay that we can never reach the max value on 32bit?
I double checked that there is no warning in our night builds and on linuxintel.

> commandLineFlagConstraintsCompiler.cpp:
> CommandLineError::print(verbose,
> "OptoLoopAlignment (" INTX_FORMAT ") must be "
> "multiple of NOP size\n");
> There is an error here, the print parameter is missing. Would have expected the compiler to complain, actually - at least the gcc. Again, curious.
Thanks, good catch! The error was there before, but fixed anyways.
I also added the NOP size.

+  // Relevant on ppc, s390, sparc. Will be optimized where
+  // addr_unit() == 1.
   if (OptoLoopAlignment % relocInfo::addr_unit() != 0) {
     CommandLineError::print(verbose,
                             "OptoLoopAlignment (" INTX_FORMAT ") must be "
-                            "multiple of NOP size\n");
+                            "multiple of NOP size (" INTX_FORMAT ")\n",
+                            value, relocInfo::addr_unit());

We are getting there... addr_unit() returns int, so use %d, not INTX_FORMAT.

Apart from that all is fine. No need for a new webrev.

..Thomas

Best regards,
  Goetz.

Kind Regards, Thomas

On Thu, May 4, 2017 at 12:57 PM, Lindenmaier, Goetz > wrote:
Hi,

This change fixes range handling of a few flags of C2.
This should go to jdk10, and later be downported to some
update of jdk9.

Please review this change. I please need a sponsor.
http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.01/

Class WarmCallInfo limits its values to 1.0e10, but the flags used
to set its fields (HotCallCountThreshold etc.) are limited by max_intx.
Using values over 1.0e10 causes assertions in the debug build.

OptoLoopAlignment must be a multiple of nop size, else it's not
possible to generate the instructions that go into the pad.
On x86 NOP size is 1, so it's no problem.
For SPARC, OptoLoopAlignmentConstraintFunc implements a special
case for bigger NOPs. This is also needed for s390 and ppc.
I just removed the #define, as the code works also on platforms
where NOPsize == 1. Actually, it should be optimized by the C
compiler in these cases.

Best regards,
  Goetz.
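
For readers following the format-string discussion in this thread, here is a minimal standalone sketch of the check, not the HotSpot source. The names check_loop_alignment and intx_t are hypothetical stand-ins for the constraint function and for HotSpot's intx; the addr_unit parameter stands in for relocInfo::addr_unit(); PRIdPTR plays the role that INTX_FORMAT plays in the quoted diff. It only illustrates the point raised in review: the pointer-sized flag value and the plain int unit need different conversions.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for HotSpot's intx (a pointer-sized signed integer).
using intx_t = std::intptr_t;

// Hypothetical stand-in for the constraint check discussed in the thread.
// addr_unit models relocInfo::addr_unit(), i.e. the NOP size in bytes.
// Returns an empty string if the value is acceptable, else the error text.
std::string check_loop_alignment(intx_t value, int addr_unit) {
  if (value % addr_unit != 0) {
    char buf[128];
    // value is pointer-sized and needs a pointer-sized conversion (PRIdPTR);
    // addr_unit is a plain int, so it is printed with %d.
    std::snprintf(buf, sizeof(buf),
                  "OptoLoopAlignment (%" PRIdPTR ") must be "
                  "multiple of NOP size (%d)\n",
                  value, addr_unit);
    return std::string(buf);
  }
  return std::string();
}
```

With a unit of 1, as on x86, the modulo test can never fail, which matches the remark that the check can be optimized away on such platforms.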

From rwestrel at redhat.com Fri May 12 07:55:17 2017
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 12 May 2017 09:55:17 +0200
Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash
In-Reply-To: <6cea2fa0-1bcf-3c86-1494-a51f66430577@oracle.com>
References: <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> <6cea2fa0-1bcf-3c86-1494-a51f66430577@oracle.com>
Message-ID:

> X86 Manual says:
>
> "Use the 0F0B opcode (UD2 instruction) or the 0FB9H opcode when
> deliberately trying to generate an invalid opcode exception (#UD)."

Thanks, Vladimir.

Here is a new webrev. Halt now should trigger a SIGILL on all
platforms. I tested x86, aarch64, arm64. The code for arm32 needs to be
tested (I can't even build on arm32).

http://cr.openjdk.java.net/~roland/8176506/webrev.04/

Roland.

From george.triantafillou at oracle.com Fri May 12 14:34:39 2017
From: george.triantafillou at oracle.com (George Triantafillou)
Date: Fri, 12 May 2017 10:34:39 -0400
Subject: RFR 8179903: Clean up SPARC 32-bit support
Message-ID:

Please review this fix to clean up SPARC 32-bit support.

JBS: https://bugs.openjdk.java.net/browse/JDK-8179903
webrev: http://cr.openjdk.java.net/~gtriantafill/8179903-webrev/webrev/index.html

This is a followup RFE to "JDK-8150388 Remove SPARC 32-bit support". The work includes addressing formatting, 32-bit comments, and other issues that Kim raised after JDK-8150388 was reviewed and checked in.

Built and tested on solaris-sparcv9-debug, solaris-x64-debug with the nsk.jvmti, nsk.jdwp, and nsk.jdi testlists.

Thanks.
-George From frederic.parain at oracle.com Fri May 12 15:47:33 2017 From: frederic.parain at oracle.com (Frederic Parain) Date: Fri, 12 May 2017 11:47:33 -0400 Subject: RFR 8179903: Clean up SPARC 32-bit support In-Reply-To: References: Message-ID: <7F106736-E4B7-4683-B396-9703559C0790@oracle.com> Looks good to me. Thank you for doing this clean up. Fred > On May 12, 2017, at 10:34, George Triantafillou wrote: > > Please review this fix to clean up SPARC 32-bit support. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8179903 > webrev: http://cr.openjdk.java.net/~gtriantafill/8179903-webrev/webrev/index.html > > This is a followup RFE to "JDK-8150388 Remove SPARC 32-bit support". The work includes addressing formatting, 32-bit comments, and other issues that Kim raised after JDK-8150388 was reviewed and checked in. > > Built and tested on solaris-sparcv9-debug, solaris-x64-debug with the nsk.jvmti, nsk.jdwp, and nsk.jdi testlists. > > Thanks. > > -George > From george.triantafillou at oracle.com Fri May 12 16:29:57 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Fri, 12 May 2017 12:29:57 -0400 Subject: RFR 8179903: Clean up SPARC 32-bit support In-Reply-To: <7F106736-E4B7-4683-B396-9703559C0790@oracle.com> References: <7F106736-E4B7-4683-B396-9703559C0790@oracle.com> Message-ID: <437c965d-176d-cde8-d175-a403d914b49b@oracle.com> Thanks Fred! -George On 5/12/2017 11:47 AM, Frederic Parain wrote: > Looks good to me. > Thank you for doing this clean up. > > Fred > > >> On May 12, 2017, at 10:34, George Triantafillou wrote: >> >> Please review this fix to clean up SPARC 32-bit support. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8179903 >> webrev: http://cr.openjdk.java.net/~gtriantafill/8179903-webrev/webrev/index.html >> >> This is a followup RFE to "JDK-8150388 Remove SPARC 32-bit support". The work includes addressing formatting, 32-bit comments, and other issues that Kim raised after JDK-8150388 was reviewed and checked in. 
>> >> Built and tested on solaris-sparcv9-debug, solaris-x64-debug with the nsk.jvmti, nsk.jdwp, and nsk.jdi testlists. >> >> Thanks. >> >> -George >> From george.triantafillou at oracle.com Fri May 12 16:30:18 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Fri, 12 May 2017 12:30:18 -0400 Subject: RFR 8179903: Clean up SPARC 32-bit support In-Reply-To: <3A83A992-62FF-4575-9906-91DEB4154113@oracle.com> References: <3A83A992-62FF-4575-9906-91DEB4154113@oracle.com> Message-ID: <6f6b82c0-d159-c459-56d4-5fbfbeb84004@oracle.com> Thanks Jerry! -George On 5/12/2017 12:10 PM, Gerald Thornbrugh wrote: > Hi George, > > Your changes look good to me. > > Jerry >> On May 12, 2017, at 8:34 AM, George Triantafillou wrote: >> >> Please review this fix to clean up SPARC 32-bit support. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8179903 >> webrev: http://cr.openjdk.java.net/~gtriantafill/8179903-webrev/webrev/index.html >> >> This is a followup RFE to "JDK-8150388 Remove SPARC 32-bit support". The work includes addressing formatting, 32-bit comments, and other issues that Kim raised after JDK-8150388 was reviewed and checked in. >> >> Built and tested on solaris-sparcv9-debug, solaris-x64-debug with the nsk.jvmti, nsk.jdwp, and nsk.jdi testlists. >> >> Thanks. >> >> -George >> From vladimir.kozlov at oracle.com Fri May 12 17:45:02 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 12 May 2017 10:45:02 -0700 Subject: [10] RFR(XXL) 8180267: Update Graal Message-ID: <15a1c8d7-1d14-ffaa-d062-db3ee38f5967@oracle.com> This will be pushed together with merge from jdk10/jdk10 to jdk10/hs because new Graal code is required after merge of JDK-8177845: "Need a mechanism to load Graal". Webrev: http://cr.openjdk.java.net/~kvn/8180267/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8180267 Sync latest Graal sources into JDK10/hs. 
Additional changes in make file and AOT were needed because WordBase class was moved to new org.graalvm.api.word package. Lists of Graal changes are in the JBS report. Note, there are 2 lists. One list was created by looking only on ./compiler directory in https://github.com/graalvm/graal where Graal compiler was moved after truffle+graal-core+visualizer+graalvm consolidation. Second list contains changes before the consolidation and after previous JDK Graal's snapshot: JDK-8178864. It is generated by looking on changes after JDK-8178864 snapshot in previous Graal-core repository: https://github.com/graalvm/graal-core Tested by building JDK and running jtreg on local machine. Thanks, Vladimir From rwestrel at redhat.com Mon May 15 13:00:55 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 15 May 2017 15:00:55 +0200 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> <6cea2fa0-1bcf-3c86-1494-a51f66430577@oracle.com> Message-ID: > Here is a new webrev. Halt now should trigger a SIGILL on all > platforms. I tested x86, aarch64, arm64. The code for arm32 needs to be > tested (I can't even build on arm32). > > http://cr.openjdk.java.net/~roland/8176506/webrev.04/ Tobias verified the change on arm32 (thanks Tobias!). So AFAICT, it's good to go. Anyone to sponsor it? Roland. 
From vladimir.kozlov at oracle.com Tue May 16 04:45:20 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 15 May 2017 21:45:20 -0700 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> <6cea2fa0-1bcf-3c86-1494-a51f66430577@oracle.com> Message-ID: <8de1ea64-79cd-e7ec-adb4-1f1f276612bf@oracle.com> Looks good to me. Volker should look too. Thanks, Vladimir On 5/12/17 12:55 AM, Roland Westrelin wrote: > >> X86 Manual says: >> >> "Use the 0F0B opcode (UD2 instruction) or the 0FB9H opcode when >> deliberately trying to generate an invalid opcode exception (#UD)." > > Thanks, Vladimir. > > Here is a new webrev. Halt now should trigger a SIGILL on all > platforms. I tested x86, aarch64, arm64. The code for arm32 needs to be > tested (I can't even build on arm32). > > http://cr.openjdk.java.net/~roland/8176506/webrev.04/ > > Roland. 
>

From jcbeyler at google.com Mon May 15 16:48:40 2017
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 15 May 2017 09:48:40 -0700
Subject: Low-Overhead Heap Profiling
In-Reply-To: <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com>
References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com>
Message-ID:

Dear all,

I've updated the webrev to:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/

Robbin, I believe I have addressed most of your items with webrev 02:
  - I added a JTreg test to show how it works:
    http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_files/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c
  - I've modified the code to use its own data structures both internally and externally. This will make it easier to move out of AsyncGetCallTrace as we move forward; that is still on my TODOs
  - I cleaned up the JVMTI API by passing a structure that handles the num_traces and put in a ReleaseTraces as well
  - I cleaned up other issues as well.

However, I have three questions, which are probably because I'm new in this community:

1) My previous webrevs were based off of JDK9 by mistake. When I took JDK10 via:
     hg clone http://hg.openjdk.java.net/jdk10/jdk10 jdk10
   - I don't see code compatible with what you were showing (i.e. your patches don't make sense for that code base; e.g. klass is still accessed via klass() in collectedHeap.inline.hpp)
   - Would you know what is the right hg clone command so we are working on the same code base?

2) You mentioned I was using os::malloc, new, NEW_C_HEAP_ARRAY; I cleaned out the os::malloc, but which of new vs NEW_C_HEAP_ARRAY should I use? It might be that I don't understand when one uses one or the other, but I see both used around the code base.
   - Is it that new is to be used for anything internal and NEW_C_HEAP_ARRAY for anything provided to the JVMTI users outside of the JVM?

3) Casts: same kind of question: which should I use?
The code was using a bit of everything, I'll refactor it entirely but I was not clear if I should go to C casts or C++ casts as I see both in the codebase. What is the convention I should use? Final notes on this webrev: - I am still missing: - Putting a TLAB implementation so that we can compare both webrevs - Have not tried to circumvent AsyncGetCallTrace - Putting in the handling of GC'd objects - Fix a stack walker issue I have seen, I think I know the problem and will test that theory out for the next webrev I will work on integrating those items for the next webrev! Thanks for your help, Jc Ps: I tested this on a new repo: hg clone http://hg.openjdk.java.net/jdk10/jdk10 jdk10 ... building it cd test jtreg -nativepath:/build/linux-x86_64-normal-server-release/support/test/hotspot/jtreg/native/lib/ -jdk /linux-x86_64-normal-server-release/images/jdk ../hotspot/test/serviceability/jvmti/HeapMonitor/ On Thu, May 4, 2017 at 11:21 PM, serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Robbin, > > Thank you for forwarding! > I will review it. > > Thanks, > Serguei > > > > On 5/4/17 02:13, Robbin Ehn wrote: > >> Hi, >> >> To me the compiler changes looks what is expected. >> It would be good if someone from compiler could take a look at that. >> Added compiler to mail thread. >> >> Also adding Serguei, It would be good with his view also. >> >> My initial take on it, read through most of the code and took it for a >> ride. >> >> ############################## >> - Regarding the compiler changes: I think we need the 'TLAB end' trickery >> (mentioned by Tony P) >> instead of a separate check for sampling in fast path for the final >> version. 
>>
>> ##############################
>> - This patch I had to apply to get it to compile on JDK 10:
>>
>> diff -r ac3ded340b35 src/share/vm/gc/shared/collectedHeap.inline.hpp
>> --- a/src/share/vm/gc/shared/collectedHeap.inline.hpp Fri Apr 28 14:31:38 2017 +0200
>> +++ b/src/share/vm/gc/shared/collectedHeap.inline.hpp Thu May 04 10:22:56 2017 +0200
>> @@ -87,3 +87,3 @@
>>    // support for object alloc event (no-op most of the time)
>> -  if (klass() != NULL && klass()->name() != NULL) {
>> +  if (klass != NULL && klass->name() != NULL) {
>>      Thread *base_thread = Thread::current();
>> diff -r ac3ded340b35 src/share/vm/runtime/heapMonitoring.cpp
>> --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 14:31:38 2017 +0200
>> +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:22:56 2017 +0200
>> @@ -316,3 +316,3 @@
>>    JavaThread *thread = reinterpret_cast<JavaThread *>(Thread::current());
>> -  assert(o->size() << LogHeapWordSize == byte_size,
>> +  assert(o->size() << LogHeapWordSize == (long)byte_size,
>>           "Object size is incorrect.");
>>
>> ##############################
>> - This patch I had to apply to get it not asserting during slowdebug:
>>
>> --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 15:15:16 2017 +0200
>> +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:24:25 2017 +0200
>> @@ -32,3 +32,3 @@
>>  // TODO(jcbeyler): should we make this into a JVMTI structure?
>> -struct StackTraceData {
>> +struct StackTraceData : CHeapObj {
>>    ASGCT_CallTrace *trace;
>> @@ -143,3 +143,2 @@
>>  StackTraceStorage::StackTraceStorage() :
>> -    _allocated_traces(new StackTraceData*[MaxHeapTraces]),
>>      _allocated_traces_size(MaxHeapTraces),
>> @@ -147,2 +146,3 @@
>>      _allocated_count(0) {
>> +  _allocated_traces = NEW_C_HEAP_ARRAY(StackTraceData*, MaxHeapTraces, mtInternal);
>>    memset(_allocated_traces, 0, sizeof(*_allocated_traces) * MaxHeapTraces);
>> @@ -152,3 +152,3 @@
>>  StackTraceStorage::~StackTraceStorage() {
>> -  delete[] _allocated_traces;
>> +  FREE_C_HEAP_ARRAY(StackTraceData*, _allocated_traces);
>>  }
>>
>> - Classes should extend the correct base class for the type of memory
>> used for them, e.g.: CHeapObj or StackObj or AllStatic
>> - The style in heapMonitoring.cpp is a bit different from normal
>> vm-style, e.g. using C++ casts instead of C. You mix NEW_C_HEAP_ARRAY,
>> os::malloc and new.
>> - In jvmtiHeapTransition.hpp you use C cast instead.
>>
>> ##############################
>> - This patch I had to apply to get traces without setting an "unrelated"
>> capability
>> - Should this not be a new capability?
>>
>> diff -r c02a5d8785bf src/share/vm/prims/forte.cpp
>> --- a/src/share/vm/prims/forte.cpp Fri Apr 28 15:15:16 2017 +0200
>> +++ b/src/share/vm/prims/forte.cpp Thu May 04 10:24:25 2017 +0200
>> @@ -530,6 +530,6 @@
>>
>> -  if (!JvmtiExport::should_post_class_load()) {
>> +/*  if (!JvmtiExport::should_post_class_load()) {
>>      trace->num_frames = ticks_no_class_load; // -1
>>      return;
>> -  }
>> +  }*/
>>
>> ##############################
>> - forte.cpp: (I know this is not part of your changes but)
>> find_jmethod_id_or_null gives me NULL for my test.
>> It looks like we actually want the regular jmethod_id() ?
>> >> Since we are the thread we are talking about (and in same ucontext) and >> thread is in vm and have a last java frame, >> I think most of the checks done in AsyncGetCallTrace is irrelevant, so >> you should be-able to call forte_fill_call_trace_given_top directly. >> But since we might need jmethod_id() if possible to avoid getting method >> id NULL, >> we need some fixes in forte code, or just do the vframStream loop inside >> heapMonitoring.cpp and not use forte.cpp. >> >> Something like: >> >> if (jthread->has_last_Java_frame()) { // just to be safe >> vframeStream vfst(jthread); >> while (!vfst.at_end()) { >> Method* m = vfst.method(); >> m->jmethod_id(); >> m->line_number_from_bci(vfst.bci()); >> vfst.next(); >> } >> >> - This is a bit confusing in forte.cpp, trace->frames[count].lineno = bci. >> Line number should be m->line_number_from_bci(bci); >> Do the heapMonitoring suppose to trace with bci or line number? >> I would say bci, meaning we should either rename ASGCT_CallFrame?lineno >> or use another data structure which says bci. >> >> ############################## >> - // TODO(jcbeyler): remove this extra code handling the extra trace for >> Please fix all these TODO's :) >> >> ############################## >> - heapMonitoring.hpp: >> // TODO(jcbeyler): is this algorithm acceptable in open source? >> >> Why is this comment here? What is the implication? >> Have you tested any simpler algorithm? >> >> ############################## >> - Create a sanity jtreg test. (./hotspot/make/test/JtregNative.gmk for >> building the agent) >> >> ############################## >> - monitoring_period vs HeapMonitorRate, pick rate or period. >> >> ############################## >> - globals.hpp >> Why is MaxHeapTraces not settable/overridable from jvmti interface? That >> would be handy. >> >> ############################## >> - jvmtiStackTraceData + ASGCT_CallFrame memory >> Are the agent suppose to loop through and free all ASGCT_CallFrame? 
>> Wouldn't it be better with some kinda protocol, like: >> (*jvmti)->GetLiveTraces(jvmti, &stack_traces, &num_traces); >> (*jvmti)->ReleaseTraces(jvmti, stack_traces, num_traces); >> >> Also using another data structure that have num_traces inside it >> simplifies things. >> So I'm not convinced using the async structure is the best way forward. >> >> >> I have more questions, but I think it's better if you respond and update >> the code first. >> >> Thanks! >> >> /Robbin >> >> >> On 04/21/2017 11:34 PM, JC Beyler wrote: >> >>> Hi all, >>> >>> I've added size information to the allocation sampling system. This >>> allows the callback to remember the size of each sampled allocation. >>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ >>> >>> The new webrev.01 also adds the actual heap monitoring sampling system >>> in files: >>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh >>> are/vm/runtime/heapMonitoring.cpp.patch >>> and >>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh >>> are/vm/runtime/heapMonitoring.hpp.patch >>> >>> My next step is to add the GC part to the webrev, which will allow users >>> to determine what objects are live and what are garbage. >>> >>> Thanks for your attention and let me know if there are any questions! >>> >>> Have a wonderful Friday! >>> Jc >>> >>> On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler >> > wrote: >>> >>> Hi all, >>> >>> I worked on getting a few numbers for overhead and accuracy for my >>> feature. I'm unsure if here is the right place to provide the full data, so >>> I am just summarizing >>> here for now. >>> >>> - Overhead of the feature >>> >>> Using the Dacapo benchmark (http://dacapobench.org/). My initial >>> results are that sampling provides 2.4% with a 512k sampling, 512k being >>> our default setting. 
>>> >>> - Note: this was without the tradesoap, tradebeans and tomcat >>> benchmarks since they did not work with my JDK9 (issue between Dacapo and >>> JDK9 it seems) >>> - I want to rerun next week to ensure number stability >>> >>> - Accuracy of the feature >>> >>> I wrote a small microbenchmark that allocates from two different >>> stacktraces at a given ratio. For example, 10% of stacktrace S1 and 90% >>> from stacktrace S2. The >>> microbenchmark was run 20 times, I averaged the results and looked >>> for accuracy. It seems that statistically it is sound since if I >>> allocated 10% S1 and 90% S2, with a >>> sampling rate of 512k, I obtained 9.61% S1 and 90.49% S2. >>> >>> Let me know if there are any questions on the numbers and if you'd >>> like to see some more data. >>> >>> Note: this was done using our internal JDK8 implementation since the >>> webrev provided by >>> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html >>> does >>> not yet contain the whole implementation and therefore would have been >>> misleading. >>> >>> Thanks, >>> Jc >>> >>> >>> On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler wrote: >>> >>> Hi all, >>> >>> To move the discussion forward, with Chuck Rasbold's help to >>> make a webrev, we pushed this: >>> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html >>> 415 lines changed: 399 ins; 13 del; 3 mod; 51122 unchg >>> >>> This is not a final change that does the whole proposition from >>> the JBS entry: https://bugs.openjdk.java.net/browse/JDK-8177374 >>> ; what it does show >>> is parts of the implementation that is proposed and hopefully can start the >>> conversation going >>> as I work through the details.
>>> >>> For example, the changes to C2 are done here for the >>> allocations: >>> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch >>> >>> Hopefully this all makes sense and thank you for all your future >>> comments! >>> Jc >>> >>> >>> On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler wrote: >>> >>> Hello all, >>> >>> This is a follow-up from Jeremy's initial email from last >>> year: >>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2015-June/017543.html >>> >>> I've gone ahead and started working on preparing this and >>> Jeremy and I went down the route of actually writing it up in JEP form: >>> https://bugs.openjdk.java.net/browse/JDK-8171119 >>> >>> I think the original conversation that happened last year in >>> that thread still holds true: >>> >>> - We have a patch at Google that we think others might be >>> interested in >>> - It provides a means to understand where the >>> allocation hotspots are at a very low overhead >>> - Since it is at a low overhead, we can leave it on by >>> default >>> >>> So I come to the mailing list with Jeremy's initial question: >>> "I thought I would ask if there is any interest / if I >>> should write a JEP / if I should just forget it." >>> >>> A year ago, it seemed some thought it was a good idea, is >>> this still true? >>> >>> Thanks, >>> Jc >>> From yasuenag at gmail.com Tue May 16 06:27:48 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 16 May 2017 15:27:48 +0900 Subject: HotSpotResolvedJavaMethod#setNotInlineable() in JVMCI Message-ID: Hi all, I've tried to use the JVMCI implementation.
HotSpotResolvedJavaMethod#setNotInlineable() is documented as "Manually adds a DontInline annotation to this method" in the comment, however this method seems to disable C1/C2 compilation in CompilerToVM [1][2]. Is this behavior correct? Should we fix the comment or method name (and / or function name) or behavior in jvmciCompilerToVM.cpp ? I will file it to JBS if it is a bug. Thanks, Yasumasa (ysuenaga) [1] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/d6d7e5caf497/src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java#l320 [2] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/d6d7e5caf497/src/share/vm/jvmci/jvmciCompilerToVM.cpp#l1004 From tobias.hartmann at oracle.com Tue May 16 09:32:50 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 16 May 2017 11:32:50 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: Message-ID: Hi Roland, On 11.05.2017 10:33, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/8179678/webrev.00/ > > When possible: > > System.arraycopy(src, spos, dst, dpos, l); > v = dst[i]; > > is transformed to: > > System.arraycopy(src, spos, dst, dpos, l); > v = src[i + (spos - dpos)]; > > So the arraycopy has a chance to be eliminated. This breaks if src and > dst are the same arrays and src[i + (spos - dpos)] is written to by the > arraycopy. We need to validate that either src[i + (spos - dpos)] is not > modified by the arraycopy or src and dst are not the same. But in ArrayCopyNode::can_replace_dest_load_with_src_load() you return false if src == dst. Why is that? And in line 733, shouldn't we pass must_modify = false to detect the case where the array copy _may_ modify the source we would load?
Best regards, Tobias From goetz.lindenmaier at sap.com Tue May 16 10:08:11 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 16 May 2017 10:08:11 +0000 Subject: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags In-Reply-To: <75df45a5ae8c4f4f9e06a1ae35e1fda7@sap.com> References: <91d36776f94b4a4c91a7be90013a9f21@sap.com> <75df45a5ae8c4f4f9e06a1ae35e1fda7@sap.com> Message-ID: <387a7e8d602647c6b49de4c487c40141@sap.com> Hi, could someone please sponsor this change? Final webrev: http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ Thanks, Goetz > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Friday, May 12, 2017 09:10 > To: 'Thomas Stüfe' > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining > flags > > Hi, > > > > could someone please sponsor? Thanks! > > > > I fixed the print statement. New webrev anyways: > > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ > > > > Best regards, > > Goetz. > > > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] > Sent: Tuesday, May 09, 2017 7:54 PM > To: Lindenmaier, Goetz > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining > flags > > > > Hi Goetz, > > > > On Tue, May 9, 2017 at 4:17 PM, Lindenmaier, Goetz wrote: > > Hi Thomas, > > thanks for looking at my change. > New webrev: > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.02/ > > > c2_globals.hpp: > > - range(0, max_intx) \ > > + range(0, ((intx)MIN2((int64_t)max_intx,(int64_t)(+1.0e10)))) \ > > 32bit: I would have expected a build warning for the cast. Is it okay > that we can never reach the max value on 32bit? > > I double checked that there is no warning in our nightly builds and on > linuxintel.
> > > commandLineFlagConstraintsCompiler.cpp: > > CommandLineError::print(verbose, > > "OptoLoopAlignment (" INTX_FORMAT ") must be " > > "multiple of NOP size\n"); > > There is an error here, the print parameter is missing. Would have > expected the compiler to complain, actually - at least the gcc. Again, curious. > > Thanks, good catch! The error was there before, but fixed anyways. I > also > added the NOP size. > > > + // Relevant on ppc, s390, sparc. Will be optimized where > + // addr_unit() == 1. > if (OptoLoopAlignment % relocInfo::addr_unit() != 0) { > CommandLineError::print(verbose, > "OptoLoopAlignment (" INTX_FORMAT ") must be " > - "multiple of NOP size\n"); > + "multiple of NOP size (" INTX_FORMAT ")\n", > + value, relocInfo::addr_unit()); > > We are getting there... > > > > addr_unit() returns int, so use %d, not INTX_FORMAT. > > > > Apart from that all is fine. No need for a new webrev. > > > > ..Thomas > > > > > Best regards, > Goetz. > > > > Kind Regards, Thomas > > > On Thu, May 4, 2017 at 12:57 PM, Lindenmaier, Goetz > > > wrote: > Hi, > > This change fixes range handling of a few flags of C2. > This should go to jdk10, and later be downported to some > update of jdk9. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8179618- > FlagRanges/webrev.01/ > > Class WarmCallInfo limits its values to 1.0e10, but the flags used > to set it's fields (HotCallCountThreshold etc.) are limited by max_intx. > Using values over 1.0e10 causes assertions in the debug build. > > OptoLoopAlignment must be a multiple of nop size, else it's not > possible to generate the instructions that go into the pad. > On x86 NOP size is 1, so it's no problem. > For SPARC, OptoLoopAlignmentConstraintFunc implements a special > case for bigger NOPs. This is also needed for s390 and ppc. > I just removed the #define, as the code works also on platforms > where NOPsize == 1. 
Actually, it should be optimized by the C > compiler in these cases. > > Best regards, > Goetz. > > From robbin.ehn at oracle.com Tue May 16 12:20:33 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 May 2017 14:20:33 +0200 Subject: Low-Overhead Heap Profiling In-Reply-To: References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com> Message-ID: Just a few answers, On 05/15/2017 06:48 PM, JC Beyler wrote: > Dear all, > > I've updated the webrev to: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/ I'll look at this later, thanks! > > Robbin, > I believe I have addressed most of your items with webrev 02: > - I added a JTreg test to show how it works: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_files/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c > > - I've modified the code to use its own data structures both internally and externally, this will make it easier to move out of AsyncGetCallTrace as we move forward, that > is still on my TODOs > - I cleaned up the JVMTI API by passing a structure that handles the num_traces and put in a ReleaseTraces as well > - I cleaned up other issues as well. > > However, I have three questions, which are probably because I'm new in this community: > 1) My previous webrevs were based off of JDK9 by mistake. When I took JDK10 via : hg clone http://hg.openjdk.java.net/jdk10/jdk10 > jdk10 > - I don't see code compatible with what you were showing (ie your patches don't make sense for that code base; ex: klass is still accessed via klass() for example in > collectedHeap.inline.hpp) > - Would you know what is the right hg clone command so we are working on the same code base? We use jdk10-hs, e.g. hg tclone http://hg.openjdk.java.net/jdk10/hs 10-hs There is sporadic big merges going from jdk9->jdk10->jdk10-hs and jdk10-hs->jdk10, so 10 is moving... 
> > 2) You mentioned I was using os::malloc, new, NEW_C_HEAP_ARRAY; I cleaned out the os::malloc but which of the new vs NEW_C_HEAP_ARRAY should I use. It might be that I > don't understand when one uses one or the other but I see both used around the code base? > - Is it that new is to be used for anything internal and NEW_C_HEAP_ARRAY anything provided to the JVMTI users outside of the JVM? We overload new operator when you extend correct base class, e.g. CHeapObj so use 'new' But for arrays you will need the macro NEW_C_HEAP_ARRAY. > > 3) Casts: same kind question: which should I use. The code was using a bit of everything, I'll refactor it entirely but I was not clear if I should go to C casts or C++ > casts as I see both in the codebase. What is the convention I should use? Just be consist, use what suites you, C++ casts might be preferable, if we are moving towards C++11. And use 'right' cast, e.g. going from Thread* to JavaThread* you should use C cast or static_cast, not reinterpret_cast I would say. > > Final notes on this webrev: > - I am still missing: > - Putting a TLAB implementation so that we can compare both webrevs > - Have not tried to circumvent AsyncGetCallTrace > - Putting in the handling of GC'd objects > - Fix a stack walker issue I have seen, I think I know the problem and will test that theory out for the next webrev > > I will work on integrating those items for the next webrev! Thanks! > > Thanks for your help, > Jc > > Ps: I tested this on a new repo: > > hg clone http://hg.openjdk.java.net/jdk10/jdk10 jdk10 > ... building it > cd test > jtreg -nativepath:/build/linux-x86_64-normal-server-release/support/test/hotspot/jtreg/native/lib/ -jdk > /linux-x86_64-normal-server-release/images/jdk ../hotspot/test/serviceability/jvmti/HeapMonitor/ > I'll test it out! /Robbin > > > On Thu, May 4, 2017 at 11:21 PM, serguei.spitsyn at oracle.com > wrote: > > Robbin, > > Thank you for forwarding! > I will review it. 
> > Thanks, > Serguei > > > > On 5/4/17 02:13, Robbin Ehn wrote: > > Hi, > > To me the compiler changes looks what is expected. > It would be good if someone from compiler could take a look at that. > Added compiler to mail thread. > > Also adding Serguei, It would be good with his view also. > > My initial take on it, read through most of the code and took it for a ride. > > ############################## > - Regarding the compiler changes: I think we need the 'TLAB end' trickery (mentioned by Tony P) > instead of a separate check for sampling in fast path for the final version. > > ############################## > - This patch I had to apply to get it compile on JDK 10: > > diff -r ac3ded340b35 src/share/vm/gc/shared/collectedHeap.inline.hpp > --- a/src/share/vm/gc/shared/collectedHeap.inline.hpp Fri Apr 28 14:31:38 2017 +0200 > +++ b/src/share/vm/gc/shared/collectedHeap.inline.hpp Thu May 04 10:22:56 2017 +0200 > @@ -87,3 +87,3 @@ > // support for object alloc event (no-op most of the time) > - if (klass() != NULL && klass()->name() != NULL) { > + if (klass != NULL && klass->name() != NULL) { > Thread *base_thread = Thread::current(); > diff -r ac3ded340b35 src/share/vm/runtime/heapMonitoring.cpp > --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 14:31:38 2017 +0200 > +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:22:56 2017 +0200 > @@ -316,3 +316,3 @@ > JavaThread *thread = reinterpret_cast(Thread::current()); > - assert(o->size() << LogHeapWordSize == byte_size, > + assert(o->size() << LogHeapWordSize == (long)byte_size, > "Object size is incorrect."); > > ############################## > - This patch I had to apply to get it not asserting during slowdebug: > > --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 15:15:16 2017 +0200 > +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:24:25 2017 +0200 > @@ -32,3 +32,3 @@ > // TODO(jcbeyler): should we make this into a JVMTI structure? 
> -struct StackTraceData { > +struct StackTraceData : CHeapObj { > ASGCT_CallTrace *trace; > @@ -143,3 +143,2 @@ > StackTraceStorage::StackTraceStorage() : > - _allocated_traces(new StackTraceData*[MaxHeapTraces]), > _allocated_traces_size(MaxHeapTraces), > @@ -147,2 +146,3 @@ > _allocated_count(0) { > + _allocated_traces = NEW_C_HEAP_ARRAY(StackTraceData*, MaxHeapTraces, mtInternal); > memset(_allocated_traces, 0, sizeof(*_allocated_traces) * MaxHeapTraces); > @@ -152,3 +152,3 @@ > StackTraceStorage::~StackTraceStorage() { > - delete[] _allocated_traces; > + FREE_C_HEAP_ARRAY(StackTraceData*, _allocated_traces); > } > > - Classes should extend the correct base class for the type of memory they use, e.g. CHeapObj, StackObj or AllStatic > - The style in heapMonitoring.cpp is a bit different from normal vm-style, e.g. using C++ casts instead of C. You mix NEW_C_HEAP_ARRAY, os::malloc and new. > - In jvmtiHeapTransition.hpp you use C cast instead. > > ############################## > - This patch I had to apply to get traces without setting an 'unrelated' capability > - Should this not be a new capability? > > diff -r c02a5d8785bf src/share/vm/prims/forte.cpp > --- a/src/share/vm/prims/forte.cpp Fri Apr 28 15:15:16 2017 +0200 > +++ b/src/share/vm/prims/forte.cpp Thu May 04 10:24:25 2017 +0200 > @@ -530,6 +530,6 @@ > > - if (!JvmtiExport::should_post_class_load()) { > +/* if (!JvmtiExport::should_post_class_load()) { > trace->num_frames = ticks_no_class_load; // -1 > return; > - } > + }*/ > > ############################## > - forte.cpp: (I know this is not part of your changes but) > find_jmethod_id_or_null gives me NULL for my test. > It looks like we actually want the regular jmethod_id() ? > > Since we are the thread we are talking about (and in same ucontext) and thread is in vm and has a last Java frame, > I think most of the checks done in AsyncGetCallTrace are irrelevant, so you should be able to call forte_fill_call_trace_given_top directly.
> But since we might need jmethod_id() if possible to avoid getting method id NULL, > we need some fixes in forte code, or just do the vframeStream loop inside heapMonitoring.cpp and not use forte.cpp. > > Something like: > > if (jthread->has_last_Java_frame()) { // just to be safe > vframeStream vfst(jthread); > while (!vfst.at_end()) { > Method* m = vfst.method(); > m->jmethod_id(); > m->line_number_from_bci(vfst.bci()); > vfst.next(); > } > } > > - This is a bit confusing in forte.cpp, trace->frames[count].lineno = bci. > Line number should be m->line_number_from_bci(bci); > Is heapMonitoring supposed to trace with bci or line number? > I would say bci, meaning we should either rename ASGCT_CallFrame::lineno or use another data structure which says bci. > > ############################## > - // TODO(jcbeyler): remove this extra code handling the extra trace for > Please fix all these TODO's :) > > ############################## > - heapMonitoring.hpp: > // TODO(jcbeyler): is this algorithm acceptable in open source? > > Why is this comment here? What is the implication? > Have you tested any simpler algorithm? > > ############################## > - Create a sanity jtreg test. (./hotspot/make/test/JtregNative.gmk for building the agent) > > ############################## > - monitoring_period vs HeapMonitorRate, pick rate or period. > > ############################## > - globals.hpp > Why is MaxHeapTraces not settable/overridable from jvmti interface? That would be handy. > > ############################## > - jvmtiStackTraceData + ASGCT_CallFrame memory > Is the agent supposed to loop through and free all ASGCT_CallFrame? > Wouldn't it be better with some kinda protocol, like: > (*jvmti)->GetLiveTraces(jvmti, &stack_traces, &num_traces); > (*jvmti)->ReleaseTraces(jvmti, stack_traces, num_traces); > > Also using another data structure that has num_traces inside it simplifies things. > So I'm not convinced using the async structure is the best way forward.
> > > I have more questions, but I think it's better if you respond and update the code first. > > Thanks! > > /Robbin > > > On 04/21/2017 11:34 PM, JC Beyler wrote: > > Hi all, > > I've added size information to the allocation sampling system. This allows the callback to remember the size of each sampled allocation. > http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ > > The new webrev.01 also adds the actual heap monitoring sampling system in files: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.cpp.patch > > and > http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.hpp.patch > > > My next step is to add the GC part to the webrev, which will allow users to determine what objects are live and what are garbage. > > Thanks for your attention and let me know if there are any questions! > > Have a wonderful Friday! > Jc > > On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler >> wrote: > > Hi all, > > I worked on getting a few numbers for overhead and accuracy for my feature. I'm unsure if here is the right place to provide the full data, so I am just > summarizing > here for now. > > - Overhead of the feature > > Using the Dacapo benchmark (http://dacapobench.org/). My initial results are that sampling provides 2.4% with a 512k sampling, 512k being our default setting. > > - Note: this was without the tradesoap, tradebeans and tomcat benchmarks since they did not work with my JDK9 (issue between Dacapo and JDK9 it seems) > - I want to rerun next week to ensure number stability > > - Accuracy of the feature > > I wrote a small microbenchmark that allocates from two different stacktraces at a given ratio. For example, 10% of stacktrace S1 and 90% from stacktrace > S2. The > microbenchmark was run 20 times, I averaged the results and looked for accuracy. 
It seems that statistically it is sound since if I allocated 10% S1 and 90% > S2, with a > sampling rate of 512k, I obtained 9.61% S1 and 90.49% S2. > > Let me know if there are any questions on the numbers and if you'd like to see some more data. > > Note: this was done using our internal JDK8 implementation since the webrev provided by http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html > does not yet contain the whole > implementation and therefore would have been misleading. > > Thanks, > Jc > > > On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler wrote: > > Hi all, > > To move the discussion forward, with Chuck Rasbold's help to make a webrev, we pushed this: > http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html > 415 lines changed: 399 ins; 13 del; 3 mod; 51122 unchg > > This is not a final change that does the whole proposition from the JBS entry: https://bugs.openjdk.java.net/browse/JDK-8177374 > ; what it does show is parts of the implementation that is > proposed and hopefully can start the conversation going > as I work through the details. > > For example, the changes to C2 are done here for the allocations: http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch > > Hopefully this all makes sense and thank you for all your future comments!
> Jc > > > On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler >> > wrote: > > Hello all, > > This is a follow-up from Jeremy's initial email from last year: > http://mail.openjdk.java.net/pipermail/serviceability-dev/2015-June/017543.html > > > > > I've gone ahead and started working on preparing this and Jeremy and I went down the route of actually writing it up in JEP form: > https://bugs.openjdk.java.net/browse/JDK-8171119 > > I think original conversation that happened last year in that thread still holds true: > > - We have a patch at Google that we think others might be interested in > - It provides a means to understand where the allocation hotspots are at a very low overhead > - Since it is at a low overhead, we can leave it on by default > > So I come to the mailing list with Jeremy's initial question: > "I thought I would ask if there is any interest / if I should write a JEP / if I should just forget it." > > A year ago, it seemed some thought it was a good idea, is this still true? > > Thanks, > Jc > > > > > > > From rwestrel at redhat.com Tue May 16 12:22:47 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 16 May 2017 14:22:47 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: Message-ID: Thanks for looking a this, Tobias. > But in ArrayCopyNode::can_replace_dest_load_with_src_load() you return > false, if src == dst. Why is that? See test2(): src[0] is the destination of the copy, it is replaced by a read of the source: src[0] which is the destination of the copy... and the compiler is sent into an infinite loop. This said, this test is too conservative. I've reworked it. > And in line 733, shouldn't we pass must_modify = false to detect the > case we the array copy _may_ modify the source we would load? Yes, you're right. Thanks for spotting that. New webrev: http://cr.openjdk.java.net/~roland/8179678/webrev.01/ Roland. 
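[Editorial illustration, not part of the original message.] The aliasing hazard discussed in this thread can be reproduced at the Java level. The following is a minimal sketch, assuming only the standard semantics of System.arraycopy; the class and method names (ArrayCopyAliasing, loadFromDest, loadFromSource, fresh) are hypothetical and are not taken from the webrev or from C2. It compares the load as written, dst[i], with the forwarded load, src[i + (spos - dpos)], once for distinct arrays and once for src == dst with overlapping ranges.

```java
public class ArrayCopyAliasing {
    // The load as written in the program: read dst[i] after the copy.
    static int loadFromDest(int[] src, int spos, int[] dst, int dpos, int len, int i) {
        System.arraycopy(src, spos, dst, dpos, len);
        return dst[i];
    }

    // The forwarded load: read the matching source element after the copy.
    // This is only equivalent if the copy did not overwrite that element.
    static int loadFromSource(int[] src, int spos, int[] dst, int dpos, int len, int i) {
        System.arraycopy(src, spos, dst, dpos, len);
        return src[i + (spos - dpos)];
    }

    // Hypothetical helper: a fresh array with recognizable contents.
    static int[] fresh() {
        return new int[] {10, 11, 12, 13, 14, 15, 16, 17};
    }

    public static void main(String[] args) {
        // Distinct arrays: both loads agree.
        System.out.println(loadFromDest(fresh(), 0, new int[8], 2, 4, 4));
        System.out.println(loadFromSource(fresh(), 0, new int[8], 2, 4, 4));

        // Same array, overlapping ranges: the copy writes a[2..5], so a[2]
        // (the element the forwarded load reads) has been overwritten.
        int[] a = fresh();
        System.out.println(loadFromDest(a, 0, a, 2, 4, 4));
        int[] b = fresh();
        System.out.println(loadFromSource(b, 0, b, 2, 4, 4));
    }
}
```

With distinct arrays the two loads return the same value; with the same array and these offsets they differ, which is the case the reworked check has to rule out before replacing the load.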
From claes.redestad at oracle.com Tue May 16 15:27:19 2017 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 16 May 2017 17:27:19 +0200 Subject: RFR [10]: 8180423: Remove flag UseRelocIndex Message-ID: <764d0654-5b2e-eece-117e-e28cec51e2cb@oracle.com> Hi, the develop flag UseRelocIndex has been both disabled and broken for quite some time and I think it should be removed (if the now dead optimization is worthwhile it needs to be re-designed anyhow). Bug: https://bugs.openjdk.java.net/browse/JDK-8180423 Webrev: http://cr.openjdk.java.net/~redestad/8180423/hotspot.00/ While some of this does show up in startup profiles, e.g., RelocIterator::create_index is called 242 times in one test, the cost is definitely infinitesimal and this should be seen primarily as cleaning out dead, broken and untested code. Testing: RBT hs-tier2,hs-tier3-comp Thanks! /Claes From tobias.hartmann at oracle.com Tue May 16 15:49:53 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 16 May 2017 17:49:53 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: Message-ID: Hi Roland, On 16.05.2017 14:22, Roland Westrelin wrote: >> But in ArrayCopyNode::can_replace_dest_load_with_src_load() you return >> false, if src == dst. Why is that? > > See test2(): src[0] is the destination of the copy, it is replaced by a > read of the source: src[0] which is the destination of the copy... and > the compiler is sent into an infinite loop. Yes but my point was that even if src == dst, it's not necessary the case that the arraycopy affects the offset we are reading from src. Is the arraycopy still removed in the test2 case? > This said, this test is too conservative. I've reworked it. Okay, looks good now. >> And in line 733, shouldn't we pass must_modify = false to detect the >> case we the array copy _may_ modify the source we would load? > > Yes, you're right. Thanks for spotting that. 
> > New webrev: > > http://cr.openjdk.java.net/~roland/8179678/webrev.01/ Looks good to me! Best regards, Tobias From vladimir.kozlov at oracle.com Tue May 16 16:07:01 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 16 May 2017 09:07:01 -0700 Subject: RFR [10]: 8180423: Remove flag UseRelocIndex In-Reply-To: <764d0654-5b2e-eece-117e-e28cec51e2cb@oracle.com> References: <764d0654-5b2e-eece-117e-e28cec51e2cb@oracle.com> Message-ID: <56d057b4-9122-8d12-8dcc-72af35619439@oracle.com> Nice cleanup. Thanks, Vladimir On 5/16/17 8:27 AM, Claes Redestad wrote: > Hi, > > the develop flag UseRelocIndex has been both disabled and broken > for quite some time and I think it should be removed (if the now > dead optimization is worthwhile it needs to be re-designed anyhow). > > Bug: https://bugs.openjdk.java.net/browse/JDK-8180423 > Webrev: http://cr.openjdk.java.net/~redestad/8180423/hotspot.00/ > > While some of this does show up in startup profiles, e.g., > RelocIterator::create_index is called 242 times in one test, > the cost is definitely infinitesimal and this should be seen > primarily as cleaning out dead, broken and untested code. > > Testing: RBT hs-tier2,hs-tier3-comp > > Thanks! > > /Claes From claes.redestad at oracle.com Tue May 16 16:38:18 2017 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 16 May 2017 18:38:18 +0200 Subject: RFR [10]: 8180423: Remove flag UseRelocIndex In-Reply-To: <56d057b4-9122-8d12-8dcc-72af35619439@oracle.com> References: <764d0654-5b2e-eece-117e-e28cec51e2cb@oracle.com> <56d057b4-9122-8d12-8dcc-72af35619439@oracle.com> Message-ID: <23ebd65a-bc0e-a2f7-d618-dd50b12abaf4@oracle.com> Thanks, Vladimir! /Claes On 2017-05-16 18:07, Vladimir Kozlov wrote: > Nice cleanup. 
> > Thanks, > Vladimir > > On 5/16/17 8:27 AM, Claes Redestad wrote: >> Hi, >> >> the develop flag UseRelocIndex has been both disabled and broken >> for quite some time and I think it should be removed (if the now >> dead optimization is worthwhile it needs to be re-designed anyhow). >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8180423 >> Webrev: http://cr.openjdk.java.net/~redestad/8180423/hotspot.00/ >> >> While some of this does show up in startup profiles, e.g., >> RelocIterator::create_index is called 242 times in one test, >> the cost is definitely infinitesimal and this should be seen >> primarily as cleaning out dead, broken and untested code. >> >> Testing: RBT hs-tier2,hs-tier3-comp >> >> Thanks! >> >> /Claes From vladimir.kozlov at oracle.com Tue May 16 18:45:25 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 16 May 2017 11:45:25 -0700 Subject: JDK10/RFR(L): 8172231: SPARC ISA/CPU feature detection is broken/insufficient (on Solaris). In-Reply-To: <6801aebd-eb60-6711-2859-0096818007eb@oracle.com> References: <6801aebd-eb60-6711-2859-0096818007eb@oracle.com> Message-ID: <6a35e1cc-96cb-fdb1-ee89-bea4b87eb9f7@oracle.com> Hi Patric Very nice work. Did you measure a performance improvement when you increase allocation prefetch values on new S4? Since you removed call to kstat you can remove kstat.h include and link to libkstat in makefile: http://hg.openjdk.java.net/jdk10/hs/file/35273d1dff83/common/autoconf/flags.m4#l1290 Thanks, Vladimir On 4/28/17 6:48 AM, Patric Hedlin wrote: > Dear all, > > I would like to ask for help to review the following change/update: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8172231 > > Webrev: http://cr.openjdk.java.net/~neliasso/8172231/ > > > > 8172231: SPARC ISA/CPU feature detection is broken/insufficient (on Solaris). > > Updating SPARC feature/capability detection (incorporating changes from Martin Walsh). 
> More complete set of features as provided by 'getisax(2)' interface, propagated via JVMCI. > More robust hardware probing for additional features (up to Core S4). > Removing support for old, pre Niagara, hardware. > Removing support for old, pre 11.1, Solaris. > > Changed behaviour: > Changing SPARC setup for AllocatePrefetchLines and AllocateInstancePrefetchLines > such that they will (still) be doubled when cache-line size is small (32 bytes), > but more moderately increased on new/contemporary hardware (inc >= 50%). > Changing to default instruction fetch alignment based on derived caps. instead > of relying on default/configuration values. > > The above changes also subsumes: > 8035146: assert(is_T_family(features) == is_niagara(features), "Niagara should be T series") is incorrect > 8054979: Remove unnecessary defines in SPARC's VM_Version::platform_features > > > Rationale: > > Current hardware detection on Solaris/SPARC is not up to date with the "latest" (here, > meaning commercially available server solutions, i.e. T7/M7). To facilitate improved > use of the new hardware features provided (by Core S3&S4) these capabilities need to > be recognised by the JVM. > > NOTE: This update is limited to Core S3&S4, i.e. not including Core S5. Proper Core S5 > support will be added when regular testing and benchmarking resources are available, > i.e. regular testing need to include M8 hardware. > > > Caveat: > > This update will introduce some redundancies into the code base, features and definitions > currently not used, as well as a (small) number of FIXMEs, addressed by subsequent bug or > feature updates/patches. Fujitsu HW is treated very conservatively. > > > Testing: > > Mostly tested on JDK9 (RBT/hotspot/comp). Only local testing on JDK10 (jtreg/hotspot). 
> > > Benchmarking: > > Benchmark reports from a limited set of runs can be found at: > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.jbb05 > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.jvm08 > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.octane.plus > > (Limited availability of M7 hardware prevents complete suites/runs.) > > > Best regards, > Patric > From vladimir.kozlov at oracle.com Tue May 16 18:52:03 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 16 May 2017 11:52:03 -0700 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: Message-ID: Looks good. Roland, can you move detect_ptr_independence() after instance_id() check? detect_ptr_independence() calls all_controls_dominate() which is expensive. Thanks, Vladimir On 5/16/17 8:49 AM, Tobias Hartmann wrote: > Hi Roland, > > On 16.05.2017 14:22, Roland Westrelin wrote: >>> But in ArrayCopyNode::can_replace_dest_load_with_src_load() you return >>> false, if src == dst. Why is that? >> >> See test2(): src[0] is the destination of the copy, it is replaced by a >> read of the source: src[0] which is the destination of the copy... and >> the compiler is sent into an infinite loop. > > Yes but my point was that even if src == dst, it's not necessary the case that the arraycopy affects the offset we are reading from src. > > Is the arraycopy still removed in the test2 case? > >> This said, this test is too conservative. I've reworked it. > > Okay, looks good now. > >>> And in line 733, shouldn't we pass must_modify = false to detect the >>> case we the array copy _may_ modify the source we would load? >> >> Yes, you're right. Thanks for spotting that. >> >> New webrev: >> >> http://cr.openjdk.java.net/~roland/8179678/webrev.01/ > > Looks good to me! 
>
> Best regards,
> Tobias
>

From cthalinger at twitter.com Tue May 16 22:50:35 2017
From: cthalinger at twitter.com (Christian Thalinger)
Date: Tue, 16 May 2017 12:50:35 -1000
Subject: HotSpotResolvedJavaMethod#setNotInlineable() in JVMCI
In-Reply-To:
References:
Message-ID: <8578DBFB-6523-436D-A830-08E4240A6F3A@twitter.com>

> On May 15, 2017, at 8:27 PM, Yasumasa Suenaga wrote:
>
> Hi all,
>
> I've tried to use JVMCI implementation.
>
> HotSpotResolvedJavaMethod#setNotInlineable() is explained that "Manually adds a DontInline annotation to this method" in the comment, however this method seems to disable C1/C2 compile in CompilerToVM [1][2].
>
> Is this behavior correct?
> Should we fix the comment or method name (and / or function name) or behavior in jvmciCompilerToVM.cpp ?

Yes, that's a bit confusing. I think HotSpotResolvedJavaMethodImpl.setNotInlineable should be renamed and the documentation updated.

>
>
> I will file it to JBS if it is a bug.
>
>
> Thanks,
>
> Yasumasa (ysuenaga)
>
>
> [1] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/d6d7e5caf497/src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java#l320
> [2] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/d6d7e5caf497/src/share/vm/jvmci/jvmciCompilerToVM.cpp#l1004
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From jcbeyler at google.com Wed May 17 03:47:06 2017 From: jcbeyler at google.com (JC Beyler) Date: Tue, 16 May 2017 20:47:06 -0700 Subject: Low-Overhead Heap Profiling In-Reply-To: References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com> Message-ID: Dear Robbin, Thank you for the answers, much appreciated :) Jc On Tue, May 16, 2017 at 5:20 AM, Robbin Ehn wrote: > Just a few answers, > > On 05/15/2017 06:48 PM, JC Beyler wrote: > >> Dear all, >> >> I've updated the webrev to: >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/ < >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/> >> > > I'll look at this later, thanks! > > >> Robbin, >> I believe I have addressed most of your items with webrev 02: >> - I added a JTreg test to show how it works: >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_fi >> les/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c < >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_f >> iles/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c> >> - I've modified the code to use its own data structures both internally >> and externally, this will make it easier to move out of AsyncGetCallTrace >> as we move forward, that is still on my TODOs >> - I cleaned up the JVMTI API by passing a structure that handles the >> num_traces and put in a ReleaseTraces as well >> - I cleaned up other issues as well. >> >> However, I have three questions, which are probably because I'm new in >> this community: >> 1) My previous webrevs were based off of JDK9 by mistake. 
When I took >> JDK10 via : hg clone http://hg.openjdk.java.net/jdk10/jdk10 < >> http://hg.openjdk.java.net/jdk10/jdk10> jdk10 >> - I don't see code compatible with what you were showing (ie your >> patches don't make sense for that code base; ex: klass is still accessed >> via klass() for example in collectedHeap.inline.hpp) >> - Would you know what is the right hg clone command so we are >> working on the same code base? >> > > We use jdk10-hs, e.g. > hg tclone http://hg.openjdk.java.net/jdk10/hs 10-hs > > There is sporadic big merges going from jdk9->jdk10->jdk10-hs and > jdk10-hs->jdk10, so 10 is moving... > > >> 2) You mentioned I was using os::malloc, new, NEW_C_HEAP_ARRAY; I >> cleaned out the os::malloc but which of the new vs NEW_C_HEAP_ARRAY should >> I use. It might be that I don't understand when one uses one or the other >> but I see both used around the code base? >> - Is it that new is to be used for anything internal and >> NEW_C_HEAP_ARRAY anything provided to the JVMTI users outside of the JVM? >> > > We overload new operator when you extend correct base class, e.g. > CHeapObj so use 'new' > But for arrays you will need the macro NEW_C_HEAP_ARRAY. > > >> 3) Casts: same kind question: which should I use. The code was using a >> bit of everything, I'll refactor it entirely but I was not clear if I >> should go to C casts or C++ casts as I see both in the codebase. What is >> the convention I should use? >> > > Just be consist, use what suites you, C++ casts might be preferable, if we > are moving towards C++11. > And use 'right' cast, e.g. going from Thread* to JavaThread* you should > use C cast or static_cast, not reinterpret_cast I would say. 
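The cast advice above can be illustrated with a small standalone sketch. The types below (Tracked, Thread, JavaThread) are hypothetical stand-ins written only for this illustration, not HotSpot's real classes, and the second base class is an assumption added so that the pointer adjustment becomes visible. The point is that static_cast uses the class layout when going from a base pointer to a derived pointer, while reinterpret_cast reuses the address unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; NOT HotSpot's real classes.
struct Tracked { long tracking_id = 7; virtual ~Tracked() {} };
struct Thread  { int  id = 1;          virtual ~Thread() {} };

// Thread is the second base here, so on common ABIs its subobject
// does not start at the same address as the JavaThread object itself.
struct JavaThread : Tracked, Thread { int java_state = 42; };

// static_cast knows the class layout and adjusts the pointer if needed,
// so the result points at the start of the enclosing JavaThread.
JavaThread* as_java_thread(Thread* t) {
  return static_cast<JavaThread*>(t);
}

// Reports whether the Thread subobject sits at a different address than
// the JavaThread object. When it does, reinterpret_cast<JavaThread*>(t)
// would yield a wrong pointer, because it performs no adjustment.
bool thread_base_is_offset(JavaThread* jt) {
  Thread* base = jt;  // implicit upcast, adjusted by the compiler
  return reinterpret_cast<std::uintptr_t>(base) !=
         reinterpret_cast<std::uintptr_t>(jt);
}
```

With single inheritance the two casts usually produce the same address, which is why the misuse often goes unnoticed; static_cast (or a C cast, which behaves like static_cast in this case) stays correct if the hierarchy changes later.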
> > >> Final notes on this webrev: >> - I am still missing: >> - Putting a TLAB implementation so that we can compare both webrevs >> - Have not tried to circumvent AsyncGetCallTrace >> - Putting in the handling of GC'd objects >> - Fix a stack walker issue I have seen, I think I know the problem >> and will test that theory out for the next webrev >> >> I will work on integrating those items for the next webrev! >> > > Thanks! > > >> Thanks for your help, >> Jc >> >> Ps: I tested this on a new repo: >> >> hg clone http://hg.openjdk.java.net/jdk10/jdk10 < >> http://hg.openjdk.java.net/jdk10/jdk10> jdk10 >> ... building it >> cd test >> jtreg -nativepath:/build/linux-x86_64-normal-server >> -release/support/test/hotspot/jtreg/native/lib/ -jdk >> /linux-x86_64-normal-server-release/images/jdk >> ../hotspot/test/serviceability/jvmti/HeapMonitor/ >> >> > I'll test it out! > > /Robbin > > >> >> On Thu, May 4, 2017 at 11:21 PM, serguei.spitsyn at oracle.com > serguei.spitsyn at oracle.com> > serguei.spitsyn at oracle.com>> wrote: >> >> Robbin, >> >> Thank you for forwarding! >> I will review it. >> >> Thanks, >> Serguei >> >> >> >> On 5/4/17 02:13, Robbin Ehn wrote: >> >> Hi, >> >> To me the compiler changes looks what is expected. >> It would be good if someone from compiler could take a look at >> that. >> Added compiler to mail thread. >> >> Also adding Serguei, It would be good with his view also. >> >> My initial take on it, read through most of the code and took it >> for a ride. >> >> ############################## >> - Regarding the compiler changes: I think we need the 'TLAB end' >> trickery (mentioned by Tony P) >> instead of a separate check for sampling in fast path for the >> final version. 
>>
>> ##############################
>> - This patch I had to apply to get it compile on JDK 10:
>>
>> diff -r ac3ded340b35 src/share/vm/gc/shared/collectedHeap.inline.hpp
>> --- a/src/share/vm/gc/shared/collectedHeap.inline.hpp Fri Apr 28 14:31:38 2017 +0200
>> +++ b/src/share/vm/gc/shared/collectedHeap.inline.hpp Thu May 04 10:22:56 2017 +0200
>> @@ -87,3 +87,3 @@
>>    // support for object alloc event (no-op most of the time)
>> -  if (klass() != NULL && klass()->name() != NULL) {
>> +  if (klass != NULL && klass->name() != NULL) {
>>      Thread *base_thread = Thread::current();
>> diff -r ac3ded340b35 src/share/vm/runtime/heapMonitoring.cpp
>> --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 14:31:38 2017 +0200
>> +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:22:56 2017 +0200
>> @@ -316,3 +316,3 @@
>>    JavaThread *thread = reinterpret_cast<JavaThread *>(Thread::current());
>> -  assert(o->size() << LogHeapWordSize == byte_size,
>> +  assert(o->size() << LogHeapWordSize == (long)byte_size,
>>           "Object size is incorrect.");
>>
>> ##############################
>> - This patch I had to apply to get it not asserting during slowdebug:
>>
>> --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 15:15:16 2017 +0200
>> +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:24:25 2017 +0200
>> @@ -32,3 +32,3 @@
>>  // TODO(jcbeyler): should we make this into a JVMTI structure?
>> -struct StackTraceData {
>> +struct StackTraceData : CHeapObj {
>>    ASGCT_CallTrace *trace;
>> @@ -143,3 +143,2 @@
>>  StackTraceStorage::StackTraceStorage() :
>> -  _allocated_traces(new StackTraceData*[MaxHeapTraces]),
>>    _allocated_traces_size(MaxHeapTraces),
>> @@ -147,2 +146,3 @@
>>    _allocated_count(0) {
>> +  _allocated_traces = NEW_C_HEAP_ARRAY(StackTraceData*, MaxHeapTraces, mtInternal);
>>    memset(_allocated_traces, 0, sizeof(*_allocated_traces) * MaxHeapTraces);
>> @@ -152,3 +152,3 @@
>>  StackTraceStorage::~StackTraceStorage() {
>> -  delete[] _allocated_traces;
>> +  FREE_C_HEAP_ARRAY(StackTraceData*, _allocated_traces);
>>  }
>>
>> - Classes should extend correct base class for which type of memory is used for it e.g.: CHeapObj or StackObj or AllStatic
>> - The style in heapMonitoring.cpp is a bit different from normal vm-style, e.g. using C++ casts instead of C. You mix NEW_C_HEAP_ARRAY, os::malloc and new.
>> - In jvmtiHeapTransition.hpp you use C cast instead.
>>
>> ##############################
>> - This patch I had to apply to get traces without setting an "unrelated" capability
>> - Should this not be a new capability?
>>
>> diff -r c02a5d8785bf src/share/vm/prims/forte.cpp
>> --- a/src/share/vm/prims/forte.cpp Fri Apr 28 15:15:16 2017 +0200
>> +++ b/src/share/vm/prims/forte.cpp Thu May 04 10:24:25 2017 +0200
>> @@ -530,6 +530,6 @@
>>
>> -  if (!JvmtiExport::should_post_class_load()) {
>> +/* if (!JvmtiExport::should_post_class_load()) {
>>      trace->num_frames = ticks_no_class_load; // -1
>>      return;
>> -  }
>> +  }*/
>>
>> ##############################
>> - forte.cpp: (I know this is not part of your changes but) find_jmethod_id_or_null give me NULL for my test.
>> It looks like we actually want the regular jmethod_id() ?
>> >> Since we are the thread we are talking about (and in same >> ucontext) and thread is in vm and have a last java frame, >> I think most of the checks done in AsyncGetCallTrace is >> irrelevant, so you should be-able to call forte_fill_call_trace_given_top >> directly. >> But since we might need jmethod_id() if possible to avoid getting >> method id NULL, >> we need some fixes in forte code, or just do the vframStream loop >> inside heapMonitoring.cpp and not use forte.cpp. >> >> Something like: >> >> if (jthread->has_last_Java_frame()) { // just to be safe >> vframeStream vfst(jthread); >> while (!vfst.at_end()) { >> Method* m = vfst.method(); >> m->jmethod_id(); >> m->line_number_from_bci(vfst.bci()); >> vfst.next(); >> } >> >> - This is a bit confusing in forte.cpp, >> trace->frames[count].lineno = bci. >> Line number should be m->line_number_from_bci(bci); >> Do the heapMonitoring suppose to trace with bci or line number? >> I would say bci, meaning we should either rename >> ASGCT_CallFrame?lineno or use another data structure which says bci. >> >> ############################## >> - // TODO(jcbeyler): remove this extra code handling the extra >> trace for >> Please fix all these TODO's :) >> >> ############################## >> - heapMonitoring.hpp: >> // TODO(jcbeyler): is this algorithm acceptable in open source? >> >> Why is this comment here? What is the implication? >> Have you tested any simpler algorithm? >> >> ############################## >> - Create a sanity jtreg test. (./hotspot/make/test/JtregNative.gmk >> for building the agent) >> >> ############################## >> - monitoring_period vs HeapMonitorRate, pick rate or period. >> >> ############################## >> - globals.hpp >> Why is MaxHeapTraces not settable/overridable from jvmti >> interface? That would be handy. >> >> ############################## >> - jvmtiStackTraceData + ASGCT_CallFrame memory >> Are the agent suppose to loop through and free all >> ASGCT_CallFrame? 
>> Wouldn't it be better with some kinda protocol, like: >> (*jvmti)->GetLiveTraces(jvmti, &stack_traces, &num_traces); >> (*jvmti)->ReleaseTraces(jvmti, stack_traces, num_traces); >> >> Also using another data structure that have num_traces inside it >> simplifies things. >> So I'm not convinced using the async structure is the best way >> forward. >> >> >> I have more questions, but I think it's better if you respond and >> update the code first. >> >> Thanks! >> >> /Robbin >> >> >> On 04/21/2017 11:34 PM, JC Beyler wrote: >> >> Hi all, >> >> I've added size information to the allocation sampling >> system. This allows the callback to remember the size of each sampled >> allocation. >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ < >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/> >> >> The new webrev.01 also adds the actual heap monitoring >> sampling system in files: >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh >> are/vm/runtime/heapMonitoring.cpp.patch >> > hare/vm/runtime/heapMonitoring.cpp.patch> >> and >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh >> are/vm/runtime/heapMonitoring.hpp.patch >> > hare/vm/runtime/heapMonitoring.hpp.patch> >> >> My next step is to add the GC part to the webrev, which will >> allow users to determine what objects are live and what are garbage. >> >> Thanks for your attention and let me know if there are any >> questions! >> >> Have a wonderful Friday! >> Jc >> >> On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler < >> jcbeyler at google.com > jcbeyler at google.com >> wrote: >> >> Hi all, >> >> I worked on getting a few numbers for overhead and >> accuracy for my feature. I'm unsure if here is the right place to provide >> the full data, so I am just >> summarizing >> here for now. >> >> - Overhead of the feature >> >> Using the Dacapo benchmark (http://dacapobench.org/). >> My initial results are that sampling provides 2.4% with a 512k sampling, >> 512k being our default setting. 
>> >> - Note: this was without the tradesoap, tradebeans and >> tomcat benchmarks since they did not work with my JDK9 (issue between >> Dacapo and JDK9 it seems) >> - I want to rerun next week to ensure number stability >> >> - Accuracy of the feature >> >> I wrote a small microbenchmark that allocates from two >> different stacktraces at a given ratio. For example, 10% of stacktrace S1 >> and 90% from stacktrace >> S2. The >> microbenchmark was run 20 times, I averaged the results >> and looked for accuracy. It seems that statistically it is sound since if I >> allocated10% S1 and 90% >> S2, with a >> sampling rate of 512k, I obtained 9.61% S1 and 90.49% S2. >> >> Let me know if there are any questions on the numbers >> and if you'd like to see some more data. >> >> Note: this was done using our internal JDK8 >> implementation since the webrev provided by >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html >> > tml> >> > tml > >> does not yet contain the whole >> implementation and therefore would have been misleading. >> >> Thanks, >> Jc >> >> >> On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler < >> jcbeyler at google.com > jcbeyler at google.com >> wrote: >> >> Hi all, >> >> To move the discussion forward, with Chuck Rasbold's >> help to make a webrev, we pushed this: >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.ht >> ml >> > tml > >> 415 lines changed: 399 ins; 13 del; 3 mod; 51122 >> unchg >> >> This is not a final change that does the whole >> proposition from the JBS entry: https://bugs.openjdk.java.net/ >> browse/JDK-8177374 >> >> > https://bugs.openjdk.java.net/browse/JDK-8177374>>; what it does show is >> parts of the implementation that is >> proposed and hopefully can start the conversation going >> as I work through the details. 
>> >> For example, the changes to C2 are done here for the >> allocations: http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/shar >> e/vm/opto/macro.cpp.patch >> > re/vm/opto/macro.cpp.patch> >> > re/vm/opto/macro.cpp.patch >> > re/vm/opto/macro.cpp.patch>> >> >> Hopefully this all makes sense and thank you for all >> your future comments! >> Jc >> >> >> On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler < >> jcbeyler at google.com > jcbeyler at google.com >> >> wrote: >> >> Hello all, >> >> This is a follow-up from Jeremy's initial email >> from last year: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/20 >> 15-June/017543.html >> > 015-June/017543.html> >> > 015-June/017543.html > pipermail/serviceability-dev/2015-June/017543.html>> >> >> I've gone ahead and started working on preparing >> this and Jeremy and I went down the route of actually writing it up in JEP >> form: >> https://bugs.openjdk.java.net/browse/JDK-8171119 < >> https://bugs.openjdk.java.net/browse/JDK-8171119> >> >> I think original conversation that happened last >> year in that thread still holds true: >> >> - We have a patch at Google that we think >> others might be interested in >> - It provides a means to understand where >> the allocation hotspots are at a very low overhead >> - Since it is at a low overhead, we can >> leave it on by default >> >> So I come to the mailing list with Jeremy's >> initial question: >> "I thought I would ask if there is any interest >> / if I should write a JEP / if I should just forget it." >> >> A year ago, it seemed some thought it was a good >> idea, is this still true? >> >> Thanks, >> Jc >> >> >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tobias.hartmann at oracle.com Wed May 17 05:55:08 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 17 May 2017 07:55:08 +0200 Subject: RFR [10]: 8180423: Remove flag UseRelocIndex In-Reply-To: <764d0654-5b2e-eece-117e-e28cec51e2cb@oracle.com> References: <764d0654-5b2e-eece-117e-e28cec51e2cb@oracle.com> Message-ID: <5ae9441f-e8a4-d4e9-d942-16bb047498df@oracle.com> Hi Claes, On 16.05.2017 17:27, Claes Redestad wrote: > Webrev: http://cr.openjdk.java.net/~redestad/8180423/hotspot.00/ Looks good to me! Thanks, Tobias From yasuenag at gmail.com Wed May 17 07:37:30 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 17 May 2017 16:37:30 +0900 Subject: hsdis output from JVMCI Message-ID: <2394d1e7-9144-cc35-74f6-86ea9e9a5617@gmail.com> Hi all, I tried to disassemble JVMCI installed code via hsdis. It works but I got duplicate output from hsdis. How to reproduce: 1. Clone reproducer from GitHub https://github.com/YaSuenag/jdt-2017-examples 2. Copy some files from hotspot testcase Please read README.md in this repository. 3. Edit Makefile to use hsdis - code-injection/Makefile - Enable UnlockDiagnosticVMOptions and CompilerDirectivesFile 4. Deploy hsdis to JDK 9 EA b169 5. Run reproducer $ make JAVA_HOME=/path/to/jdk9 syscall hsdis is called from JVMCIEnv::register_method() and CodeInstaller::install(). So we get same output from hsdis twice. 
I think we should fix it as following:

-------------------------
diff -r d6d7e5caf497 src/share/vm/jvmci/jvmciCodeInstaller.cpp
--- a/src/share/vm/jvmci/jvmciCodeInstaller.cpp Mon May 15 12:20:15 2017 +0200
+++ b/src/share/vm/jvmci/jvmciCodeInstaller.cpp Tue May 17 16:21:08 2017 +0900
@@ -623,7 +623,7 @@
     if (nm != NULL && env == NULL) {
       DirectiveSet* directive = DirectivesStack::getMatchingDirective(method, compiler);
       bool printnmethods = directive->PrintAssemblyOption || directive->PrintNMethodsOption;
-      if (printnmethods || PrintDebugInfo || PrintRelocations || PrintDependencies || PrintExceptionHandlers) {
+      if (!printnmethods && (PrintDebugInfo || PrintRelocations || PrintDependencies || PrintExceptionHandlers)) {
         nm->print_nmethod(printnmethods);
       }
       DirectivesStack::release(directive);
-------------------------

Is this bug?
If so, I will file it to JBS and will upload webrev.


Thanks,

Yasumasa

From claes.redestad at oracle.com Wed May 17 10:16:10 2017
From: claes.redestad at oracle.com (Claes Redestad)
Date: Wed, 17 May 2017 12:16:10 +0200
Subject: RFR: 8180479: [TESTBUG] Some hotspot tests broken after internal Unsafe name changes
Message-ID: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com>

Hi,

JDK-8159995 changed method names in jdk.internal.misc.Unsafe and was recently
integrated into jdk10-hs, causing some trivially fixed test failures:

Webrev: http://cr.openjdk.java.net/~redestad/8180479/hotspot.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8180479

Thanks!

/Claes

From tobias.hartmann at oracle.com Wed May 17 10:26:47 2017
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 17 May 2017 12:26:47 +0200
Subject: RFR: 8180479: [TESTBUG] Some hotspot tests broken after internal Unsafe name changes
In-Reply-To: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com>
References: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com>
Message-ID: <2de0a4a0-4791-6fe1-81bc-af6ba736e306@oracle.com>

Hi Claes,

looks good to me, thanks for fixing!
Best regards,
Tobias

On 17.05.2017 12:16, Claes Redestad wrote:
> Hi,
>
> JDK-8159995 changed method names in jdk.internal.misc.Unsafe and was recently
> integrated into jdk10-hs, causing some trivially fixed test failures:
>
> Webrev: http://cr.openjdk.java.net/~redestad/8180479/hotspot.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8180479
>
> Thanks!
>
> /Claes
>

From forax at univ-mlv.fr Wed May 17 10:28:07 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 17 May 2017 12:28:07 +0200 (CEST)
Subject: hsdis output from JVMCI
In-Reply-To: <2394d1e7-9144-cc35-74f6-86ea9e9a5617@gmail.com>
References: <2394d1e7-9144-cc35-74f6-86ea9e9a5617@gmail.com>
Message-ID: <630977854.1744201.1495016887589.JavaMail.zimbra@u-pem.fr>

Hi Yasumasa,
i've read the slides referenced on your github project,

slide 19: i believe that instead of printing the first tiers, you either want the last one or better all the tiers,
  IntStream.range(0, 4).filter(...).boxed().collect(Collectors.joining(", "))

otherwise the whole deck is quite fun, thank you !

Rémi

----- Original Message -----
> From: "Yasumasa Suenaga"
> To: hotspot-compiler-dev at openjdk.java.net
> Sent: Wednesday, 17 May 2017 09:37:30
> Subject: hsdis output from JVMCI

> Hi all,
>
> I tried to disassemble JVMCI installed code via hsdis.
> It works but I got duplicate output from hsdis.
>
> How to reproduce:
>
> 1. Clone reproducer from GitHub
> https://github.com/YaSuenag/jdt-2017-examples
>
> 2. Copy some files from hotspot testcase
> Please read README.md in this repository.
>
> 3. Edit Makefile to use hsdis
> - code-injection/Makefile
> - Enable UnlockDiagnosticVMOptions and CompilerDirectivesFile
>
> 4. Deploy hsdis to JDK 9 EA b169
>
> 5. Run reproducer
> $ make JAVA_HOME=/path/to/jdk9 syscall
>
>
> hsdis is called from JVMCIEnv::register_method() and CodeInstaller::install().
> So we get same output from hsdis twice.
> I think we should fix it as following:
>
> -------------------------
> diff -r d6d7e5caf497 src/share/vm/jvmci/jvmciCodeInstaller.cpp
> --- a/src/share/vm/jvmci/jvmciCodeInstaller.cpp Mon May 15 12:20:15 2017 +0200
> +++ b/src/share/vm/jvmci/jvmciCodeInstaller.cpp Tue May 17 16:21:08 2017 +0900
> @@ -623,7 +623,7 @@
>      if (nm != NULL && env == NULL) {
>        DirectiveSet* directive = DirectivesStack::getMatchingDirective(method, compiler);
>        bool printnmethods = directive->PrintAssemblyOption || directive->PrintNMethodsOption;
> -      if (printnmethods || PrintDebugInfo || PrintRelocations || PrintDependencies || PrintExceptionHandlers) {
> +      if (!printnmethods && (PrintDebugInfo || PrintRelocations || PrintDependencies || PrintExceptionHandlers)) {
>          nm->print_nmethod(printnmethods);
>        }
>        DirectivesStack::release(directive);
> -------------------------
>
> Is this bug?
> If so, I will file it to JBS and will upload webrev.
>
>
> Thanks,
>
> Yasumasa

From claes.redestad at oracle.com Wed May 17 10:34:36 2017
From: claes.redestad at oracle.com (Claes Redestad)
Date: Wed, 17 May 2017 12:34:36 +0200
Subject: RFR [10]: 8180423: Remove flag UseRelocIndex
In-Reply-To: <5ae9441f-e8a4-d4e9-d942-16bb047498df@oracle.com>
References: <764d0654-5b2e-eece-117e-e28cec51e2cb@oracle.com> <5ae9441f-e8a4-d4e9-d942-16bb047498df@oracle.com>
Message-ID: <5575f7e8-cab1-51fe-a753-a17d896839bd@oracle.com>

Tobias,

On 2017-05-17 07:55, Tobias Hartmann wrote:
> Hi Claes,
>
> On 16.05.2017 17:27, Claes Redestad wrote:
>> Webrev: http://cr.openjdk.java.net/~redestad/8180423/hotspot.00/
> Looks good to me!

thanks! Pushing the patch now.

/Claes

>
> Thanks,
> Tobias

From zoltan.majo at oracle.com Wed May 17 10:52:10 2017
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Wed, 17 May 2017 12:52:10 +0200
Subject: [10] RFR(XS): 8180473: Use proper deallocation for FileBuff::_bigbuf
Message-ID: <48b9ffd4-14ff-37ea-8dbe-9250ea56e73b@oracle.com>

Hi,


please review the following fix for 8180473.
https://bugs.openjdk.java.net/browse/JDK-8180473
http://cr.openjdk.java.net/~zmajo/8180473/webrev.00/

It is a good coding practice to properly deallocate dynamically allocated resources. In particular, dynamically allocated memory for arrays using the array-specific new[] operator is supposed to be deallocated using the array-specific delete[] operator. The deallocation of FileBuff::_bigbuf does not follow this recommendation and should therefore be changed. Tested with JPRT.

Thank you!

Best regards,


Zoltan

From claes.redestad at oracle.com Wed May 17 11:18:45 2017
From: claes.redestad at oracle.com (Claes Redestad)
Date: Wed, 17 May 2017 13:18:45 +0200
Subject: RFR: 8180479: [TESTBUG] Some hotspot tests broken after internal Unsafe name changes
In-Reply-To: <2de0a4a0-4791-6fe1-81bc-af6ba736e306@oracle.com>
References: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com> <2de0a4a0-4791-6fe1-81bc-af6ba736e306@oracle.com>
Message-ID: <9237c9ac-62ed-8013-8886-d1a72f04f6d3@oracle.com>

Hi Tobias,

On 2017-05-17 12:26, Tobias Hartmann wrote:
> Hi Claes,
>
> looks good to me, thanks for fixing!

thanks for reviewing!

/Claes

>
> Best regards,
> Tobias
>
> On 17.05.2017 12:16, Claes Redestad wrote:
>> Hi,
>>
>> JDK-8159995 changed method names in jdk.internal.misc.Unsafe and was recently
>> integrated into jdk10-hs, causing some trivially fixed test failures:
>>
>> Webrev: http://cr.openjdk.java.net/~redestad/8180479/hotspot.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180479
>>
>> Thanks!
>>
>> /Claes
>>

From doug.simon at oracle.com Wed May 17 11:52:37 2017
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 17 May 2017 13:52:37 +0200
Subject: HotSpotResolvedJavaMethod#setNotInlineable() in JVMCI
In-Reply-To: <8578DBFB-6523-436D-A830-08E4240A6F3A@twitter.com>
References: <8578DBFB-6523-436D-A830-08E4240A6F3A@twitter.com>
Message-ID: <242E7457-64C6-442B-8927-34E4A84B871A@oracle.com>

> On 17 May 2017, at 00:50, Christian Thalinger wrote:
>
>
>> On May 15, 2017, at 8:27 PM, Yasumasa Suenaga wrote:
>>
>> Hi all,
>>
>> I've tried to use JVMCI implementation.
>>
>> HotSpotResolvedJavaMethod#setNotInlineable() is explained that "Manually adds a DontInline annotation to this method" in the comment, however this method seems to disable C1/C2 compile in CompilerToVM [1][2].
>>
>> Is this behavior correct?
>> Should we fix the comment or method name (and / or function name) or behavior in jvmciCompilerToVM.cpp ?
>
> Yes, that's a bit confusing. I think HotSpotResolvedJavaMethodImpl.setNotInlineable should be renamed and the documentation updated.

I agree. I can confirm that the expectation of the single user of this API[1] is for the method to never be inlined or compiled by HotSpot. The method name should be changed to setNotInlineableOrCompileable. As part of this change, the documentation for CompilerToVM.doNotInlineOrCompile[2] should also be fixed to reflect that it is a setter, not a getter.

-Doug

[1] https://github.com/graalvm/graal/blob/ddebb13523f54112c552f38ff87e23c6034ff725/compiler/src/org.graalvm.compiler.truffle.hotspot/src/org/graalvm/compiler/truffle/hotspot/HotSpotTruffleRuntime.java#L228
[2] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/507f8a7678b4/src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/CompilerToVM.java#l462

>
>>
>>
>> I will file it to JBS if it is a bug.
>> >> >> Thanks, >> >> Yasumasa (ysuenaga) >> >> >> [1] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/d6d7e5caf497/src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java#l320 >> [2] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/d6d7e5caf497/src/share/vm/jvmci/jvmciCompilerToVM.cpp#l1004 > From rwestrel at redhat.com Wed May 17 12:15:06 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 17 May 2017 14:15:06 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: Message-ID: Thanks for the review, Vladimir. > Roland, can you move detect_ptr_independence() after instance_id() > check? detect_ptr_independence() calls all_controls_dominate() which > is expensive. Right. I found another bug with this. Another webrev is coming. Roland. From vladimir.kozlov at oracle.com Wed May 17 15:20:43 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 17 May 2017 08:20:43 -0700 Subject: RFR: 8180479: [TESTBUG] Some hotspot tests broken after internal Unsafe name changes In-Reply-To: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com> References: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com> Message-ID: Hi Claes, Do you have bug to fix C1 and C2 intrinsics too after methods renaming?: Thanks, Vladimir On 5/17/17 3:16 AM, Claes Redestad wrote: > Hi, > > JDK-8159995 changed method names in jdk.internal.misc.Unsafe and was recently > integrated into jdk10-hs, causing some trivially fixed test failures: > > Webrev: http://cr.openjdk.java.net/~redestad/8180479/hotspot.00/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8180479 > > Thanks! 
> > /Claes > From claes.redestad at oracle.com Wed May 17 16:35:52 2017 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 17 May 2017 18:35:52 +0200 Subject: RFR: 8180479: [TESTBUG] Some hotspot tests broken after internal Unsafe name changes In-Reply-To: References: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com> Message-ID: Hi Vladimir, I don't know if that's planned - Paul, Ron? Thanks! /Claes On 2017-05-17 17:20, Vladimir Kozlov wrote: > Hi Claes, > > Do you have bug to fix C1 and C2 intrinsics too after methods renaming?: > > Thanks, > Vladimir > > On 5/17/17 3:16 AM, Claes Redestad wrote: >> Hi, >> >> JDK-8159995 changed method names in jdk.internal.misc.Unsafe and was >> recently >> integrated into jdk10-hs, causing some trivially fixed test failures: >> >> Webrev: http://cr.openjdk.java.net/~redestad/8180479/hotspot.00/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8180479 >> >> Thanks! >> >> /Claes >> From vladimir.kozlov at oracle.com Wed May 17 16:59:06 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 17 May 2017 09:59:06 -0700 Subject: RFR: 8180479: [TESTBUG] Some hotspot tests broken after internal Unsafe name changes In-Reply-To: References: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com> Message-ID: Sorry for noise - I should have looked more Original rename changes fixed intrinsics - everything is fine Thanks Vladimir > On May 17, 2017, at 9:35 AM, Claes Redestad wrote: > > Hi Vladimir, I don't know if that's planned - Paul, Ron? > > Thanks! 
> > /Claes > >> On 2017-05-17 17:20, Vladimir Kozlov wrote: >> Hi Claes, >> >> Do you have bug to fix C1 and C2 intrinsics too after methods renaming?: >> >> Thanks, >> Vladimir >> >>> On 5/17/17 3:16 AM, Claes Redestad wrote: >>> Hi, >>> >>> JDK-8159995 changed method names in jdk.internal.misc.Unsafe and was recently >>> integrated into jdk10-hs, causing some trivially fixed test failures: >>> >>> Webrev: http://cr.openjdk.java.net/~redestad/8180479/hotspot.00/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180479 >>> >>> Thanks! >>> >>> /Claes >>> > From paul.sandoz at oracle.com Wed May 17 17:21:13 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 17 May 2017 10:21:13 -0700 Subject: RFR: 8180479: [TESTBUG] Some hotspot tests broken after internal Unsafe name changes In-Reply-To: References: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com> Message-ID: > On 17 May 2017, at 09:59, Vladimir Kozlov wrote: > > Sorry for noise - I should have looked more > Original rename changes fixed intrinsics - everything is fine > Thanks. We avoided changing the name of C2 nodes, which could be a follow up fix. I now realize i pushed the original fix to jdk10/jdk10 rather than jdk10/hs, sorry about that. I did run everything through JPRT both independently and for the push, so i am unsure why we missed updating those tests. Paul. > Thanks > Vladimir > >> On May 17, 2017, at 9:35 AM, Claes Redestad wrote: >> >> Hi Vladimir, I don't know if that's planned - Paul, Ron? >> >> Thanks! >> >> /Claes >> -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From vladimir.kozlov at oracle.com Wed May 17 23:46:47 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 17 May 2017 16:46:47 -0700 Subject: [10] RFR(XS): 8180473: Use proper deallocation for FileBuff::_bigbuf In-Reply-To: <48b9ffd4-14ff-37ea-8dbe-9250ea56e73b@oracle.com> References: <48b9ffd4-14ff-37ea-8dbe-9250ea56e73b@oracle.com> Message-ID: <3c53363c-6f55-9126-83a6-57596713c6c2@oracle.com> Good. Thanks, Vladimir On 5/17/17 3:52 AM, Zoltán Majó wrote: > Hi, > > > please review the following fix for 8180473. > https://bugs.openjdk.java.net/browse/JDK-8180473 > http://cr.openjdk.java.net/~zmajo/8180473/webrev.00/ > > It is a good coding practice to properly deallocate dynamically allocated resources. In particular, dynamically allocated memory for arrays using the array-specific new[] operator is supposed to be > deallocated using the array-specific delete[] operator. The deallocation of FileBuff::_bigbuf does not follow this recommendation and should therefore be changed. Tested with JPRT. > > Thank you! > > Best regards, > > > Zoltan > From yasuenag at gmail.com Thu May 18 03:02:55 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 18 May 2017 12:02:55 +0900 Subject: hsdis output from JVMCI In-Reply-To: <630977854.1744201.1495016887589.JavaMail.zimbra@u-pem.fr> References: <2394d1e7-9144-cc35-74f6-86ea9e9a5617@gmail.com> <630977854.1744201.1495016887589.JavaMail.zimbra@u-pem.fr> Message-ID: Hi Remi, Thank you for reading my slides in spite of Japanese :-) > slide 19: i believe that instead of printing the first tiers, you either want the last one or better all the tiers, > IntStream.range(0, 4).filter(...).boxed().collect(Collectors.joining(", ")) It is correct. HotSpotResolvedJavaMethod#hasCompiledCodeAtLevel() collects data from nmethod object [1]. 
It is single value [2] because one nmethod has one compiled code. Anyway, I'm waiting for the response about hsdis. Thanks, Yasumasa [1] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/507f8a7678b4/src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java#l430 [2] http://hg.openjdk.java.net/jdk9/dev/hotspot/file/507f8a7678b4/src/share/vm/code/nmethod.hpp#l105 2017-05-17 19:28 GMT+09:00 Remi Forax : > Hi Yasumasa, > i've read the slides referenced on your github project, > > slide 19: i believe that instead of printing the first tiers, you either > want the last one or better all the tiers, > IntStream.range(0, 4).filter(...).boxed().collect(Collectors.joining(", > ")) > > otherwise the whole deck is quite fun, thank you ! > > Rémi > > ----- Original Message ----- > > From: "Yasumasa Suenaga" > > To: hotspot-compiler-dev at openjdk.java.net > > Sent: Wednesday 17 May 2017 09:37:30 > > Subject: hsdis output from JVMCI > > > Hi all, > > > > I tried to disassemble JVMCI installed code via hsdis. > > It works but I got duplicate output from hsdis. > > > > How to reproduce: > > > > 1. Clone reproducer from GitHub > > https://github.com/YaSuenag/jdt-2017-examples > > > > 2. Copy some files from hotspot testcase > > Please read README.md in this repository. > > > > 3. Edit Makefile to use hsdis > > - code-injection/Makefile > > - Enable UnlockDiagnosticVMOptions and CompilerDirectivesFile > > > > 4. Deploy hsdis to JDK 9 EA b169 > > > > 5. Run reproducer > > $ make JAVA_HOME=/path/to/jdk9 syscall > > > > > > hsdis is called from JVMCIEnv::register_method() and > CodeInstaller::install(). > > So we get same output from hsdis twice. 
> > I think we should fix it as following: > > > > ------------------------- > > diff -r d6d7e5caf497 src/share/vm/jvmci/jvmciCodeInstaller.cpp > > --- a/src/share/vm/jvmci/jvmciCodeInstaller.cpp Mon May 15 > 12:20:15 2017 +0200 > > +++ b/src/share/vm/jvmci/jvmciCodeInstaller.cpp Tue May 17 > 16:21:08 2017 +0900 > > @@ -623,7 +623,7 @@ > > if (nm != NULL && env == NULL) { > > DirectiveSet* directive = DirectivesStack:: > getMatchingDirective(method, > > compiler); > > bool printnmethods = directive->PrintAssemblyOption || > > directive->PrintNMethodsOption; > > - if (printnmethods || PrintDebugInfo || PrintRelocations || > > PrintDependencies || PrintExceptionHandlers) { > > + if (!printnmethods && (PrintDebugInfo || PrintRelocations || > > PrintDependencies || PrintExceptionHandlers)) { > > nm->print_nmethod(printnmethods); > > } > > DirectivesStack::release(directive); > > ------------------------- > > > > Is this bug? > > If so, I will file it to JBS and will upload webrev. > > > > > > Thanks, > > > > Yasumasa > From yasuenag at gmail.com Thu May 18 03:18:28 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 18 May 2017 12:18:28 +0900 Subject: RFR: JDK-8180487: HotSpotResolvedJavaMethod#setNotInlineable() should be renamed to represent actual behavior Message-ID: Hi all, This review request is related to [1]. JBS: https://bugs.openjdk.java.net/browse/JDK-8180487 Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.00/ Could you review it? I cannot access JPRT. So I need a sponsor. 
Thanks, Yasumasa [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026218.html From tobias.hartmann at oracle.com Thu May 18 06:57:33 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 18 May 2017 08:57:33 +0200 Subject: RFR: 8180479: [TESTBUG] Some hotspot tests broken after internal Unsafe name changes In-Reply-To: References: <5461ddec-450a-a2fb-6708-a8d43f6cfecb@oracle.com> Message-ID: <89605609-0d7b-1571-25d4-f8e99ba95887@oracle.com> Hi Paul, On 17.05.2017 19:21, Paul Sandoz wrote: > I now realize i pushed the original fix to jdk10/jdk10 rather than jdk10/hs, sorry about that. I did run everything through JPRT both independently and for the push, so i am unsure why we missed updating those tests. We do not execute all the HotSpot (compiler) tests on JPRT due to runtime constraints. For example, the two failing tests are not executed. To avoid such failures in the future, I would suggest to run all hotspot tests on RBT before pushing. Thanks, Tobias >>> On May 17, 2017, at 9:35 AM, Claes Redestad wrote: >>> >>> Hi Vladimir, I don't know if that's planned - Paul, Ron? >>> >>> Thanks! 
>>> >>> /Claes >>> > From forax at univ-mlv.fr Thu May 18 08:05:12 2017 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Thu, 18 May 2017 10:05:12 +0200 (CEST) Subject: hsdis output from JVMCI In-Reply-To: References: <2394d1e7-9144-cc35-74f6-86ea9e9a5617@gmail.com> <630977854.1744201.1495016887589.JavaMail.zimbra@u-pem.fr> Message-ID: <307202872.2328551.1495094712887.JavaMail.zimbra@u-pem.fr> > From: "Yasumasa Suenaga" > To: "Remi Forax" > Cc: hotspot-compiler-dev at openjdk.java.net > Sent: Thursday 18 May 2017 05:02:55 > Subject: Re: hsdis output from JVMCI > Hi Remi, > Thank you for reading my slides in spite of Japanese :-) >> slide 19: i believe that instead of printing the first tiers, you either want > > the last one or better all the tiers, > > IntStream.range(0, 4).filter(...).boxed().collect(Collectors.joining(", ")) > It is correct. > HotSpotResolvedJavaMethod#hasCompiledCodeAtLevel() collects data from nmethod > object [1]. > It is single value [2] because one nmethod has one compiled code. ok, good to know, hence the 'resolved' in HotSpotResolvedJavaMethod i suppose. > Anyway, I'm waiting for the response about hsdis. i can't help you on that :) > Thanks, > Yasumasa cheers, Rémi > [1] > http://hg.openjdk.java.net/jdk9/dev/hotspot/file/507f8a7678b4/src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java#l430 > [2] > http://hg.openjdk.java.net/jdk9/dev/hotspot/file/507f8a7678b4/src/share/vm/code/nmethod.hpp#l105 > 2017-05-17 19:28 GMT+09:00 Remi Forax < forax at univ-mlv.fr > : >> Hi Yasumasa, >> i've read the slides referenced on your github project, >> slide 19: i believe that instead of printing the first tiers, you either want >> the last one or better all the tiers, >> IntStream.range(0, 4).filter(...).boxed().collect(Collectors.joining(", ")) >> otherwise the whole deck is quite fun, thank you ! 
>> Rémi >> ----- Original Message ----- >> > From: "Yasumasa Suenaga" < yasuenag at gmail.com > >> > To: hotspot-compiler-dev at openjdk.java.net >> > Sent: Wednesday 17 May 2017 09:37:30 >> > Subject: hsdis output from JVMCI >> > Hi all, >> > I tried to disassemble JVMCI installed code via hsdis. >> > It works but I got duplicate output from hsdis. >> > How to reproduce: >> > 1. Clone reproducer from GitHub >> > https://github.com/YaSuenag/jdt-2017-examples >> > 2. Copy some files from hotspot testcase >> > Please read README.md in this repository. >> > 3. Edit Makefile to use hsdis >> > - code-injection/Makefile >> > - Enable UnlockDiagnosticVMOptions and CompilerDirectivesFile >> > 4. Deploy hsdis to JDK 9 EA b169 >> > 5. Run reproducer >> > $ make JAVA_HOME=/path/to/jdk9 syscall >> > hsdis is called from JVMCIEnv::register_method() and CodeInstaller::install(). >> > So we get same output from hsdis twice. >> > I think we should fix it as following: >> > ------------------------- >> > diff -r d6d7e5caf497 src/share/vm/jvmci/jvmciCodeInstaller.cpp >> > --- a/src/share/vm/jvmci/jvmciCodeInstaller.cpp Mon May 15 12:20:15 2017 +0200 >> > +++ b/src/share/vm/jvmci/jvmciCodeInstaller.cpp Tue May 17 16:21:08 2017 +0900 >> > @@ -623,7 +623,7 @@ >> > if (nm != NULL && env == NULL) { >> > DirectiveSet* directive = DirectivesStack::getMatchingDirective(method, >> > compiler); >> > bool printnmethods = directive->PrintAssemblyOption || >> > directive->PrintNMethodsOption; >> > - if (printnmethods || PrintDebugInfo || PrintRelocations || >> > PrintDependencies || PrintExceptionHandlers) { >> > + if (!printnmethods && (PrintDebugInfo || PrintRelocations || >> > PrintDependencies || PrintExceptionHandlers)) { >> > nm->print_nmethod(printnmethods); >> > } >> > DirectivesStack::release(directive); >> > ------------------------- >> > Is this bug? >> > If so, I will file it to JBS and will upload webrev. 
>> > Thanks, >> > Yasumasa From doug.simon at oracle.com Thu May 18 09:11:10 2017 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 18 May 2017 11:11:10 +0200 Subject: RFR: JDK-8180487: HotSpotResolvedJavaMethod#setNotInlineable() should be renamed to represent actual behavior In-Reply-To: References: Message-ID: The code changes look good. However, the javadoc still describes the functionality as a query: /** * Determines if {@code method} should not be inlined or compiled. */ whereas it's really a setter. That is, the comment should be: /** * Sets flags on {@code method} indicating that it should never be inlined or compiled by the VM. */ -Doug > On 18 May 2017, at 05:18, Yasumasa Suenaga wrote: > > Hi all, > > This review request is related to [1]. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8180487 > Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.00/ > > Could you review it? > > I cannot access JPRT. > So I need a sponsor. > > > Thanks, > > Yasumasa > > > [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026218.html From doug.simon at oracle.com Thu May 18 09:58:09 2017 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 18 May 2017 11:58:09 +0200 Subject: hsdis output from JVMCI In-Reply-To: <2394d1e7-9144-cc35-74f6-86ea9e9a5617@gmail.com> References: <2394d1e7-9144-cc35-74f6-86ea9e9a5617@gmail.com> Message-ID: <62D11F1C-7F02-4C42-9EF4-36DD2D078C2E@oracle.com> > On 17 May 2017, at 09:37, Yasumasa Suenaga wrote: > > Hi all, > > I tried to disassemble JVMCI installed code via hsdis. > It works but I got duplicate output from hsdis. > > How to reproduce: > > 1. Clone reproducer from GitHub > https://github.com/YaSuenag/jdt-2017-examples > > 2. Copy some files from hotspot testcase > Please read README.md in this repository. > > 3. 
Edit Makefile to use hsdis > - code-injection/Makefile > - Enable UnlockDiagnosticVMOptions and CompilerDirectivesFile > > 4. Deploy hsdis to JDK 9 EA b169 > > 5. Run reproducer > $ make JAVA_HOME=/path/to/jdk9 syscall > > > hsdis is called from JVMCIEnv::register_method() and CodeInstaller::install(). > So we get same output from hsdis twice. > I think we should fix it as following: > > ------------------------- > diff -r d6d7e5caf497 src/share/vm/jvmci/jvmciCodeInstaller.cpp > --- a/src/share/vm/jvmci/jvmciCodeInstaller.cpp Mon May 15 12:20:15 2017 +0200 > +++ b/src/share/vm/jvmci/jvmciCodeInstaller.cpp Tue May 17 16:21:08 2017 +0900 > @@ -623,7 +623,7 @@ > if (nm != NULL && env == NULL) { > DirectiveSet* directive = DirectivesStack::getMatchingDirective(method, compiler); > bool printnmethods = directive->PrintAssemblyOption || directive->PrintNMethodsOption; > - if (printnmethods || PrintDebugInfo || PrintRelocations || PrintDependencies || PrintExceptionHandlers) { > + if (!printnmethods && (PrintDebugInfo || PrintRelocations || PrintDependencies || PrintExceptionHandlers)) { > nm->print_nmethod(printnmethods); > } > DirectivesStack::release(directive); > ------------------------- That looks like a sensible change to me. -Doug From yasuenag at gmail.com Thu May 18 12:12:26 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 18 May 2017 21:12:26 +0900 Subject: RFR: JDK-8180487: HotSpotResolvedJavaMethod#setNotInlineable() should be renamed to represent actual behavior In-Reply-To: References: Message-ID: <2318d91f-bbaa-11f3-3729-decd094e2ace@gmail.com> Hi Doug, Thank you for your comment. I uploaded new webrev. Could you check again? http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.01/ Yasumasa On 2017/05/18 18:11, Doug Simon wrote: > The code changes look good. However, the javadoc still describes the functionality as a query: > > /** > * Determines if {@code method} should not be inlined or compiled. > */ > > where as it's really a setter. 
That is, the comment should be: > > /** > * Sets flags on {@code method} indicating that it should never be inlined or compiled by the VM. > */ > > -Doug > >> On 18 May 2017, at 05:18, Yasumasa Suenaga wrote: >> >> Hi all, >> >> This review request is related to [1]. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8180487 >> Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.00/ >> >> Could you review it? >> >> I cannot access JPRT. >> So I need a sponsor. >> >> >> Thanks, >> >> Yasumasa >> >> >> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026218.html > From doug.simon at oracle.com Thu May 18 12:27:34 2017 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 18 May 2017 14:27:34 +0200 Subject: RFR: JDK-8180487: HotSpotResolvedJavaMethod#setNotInlineable() should be renamed to represent actual behavior In-Reply-To: <2318d91f-bbaa-11f3-3729-decd094e2ace@gmail.com> References: <2318d91f-bbaa-11f3-3729-decd094e2ace@gmail.com> Message-ID: <13B58304-46AD-4F25-A882-A112B9C55AA3@oracle.com> Thanks for the update - the changes look good to me. -Doug > On 18 May 2017, at 14:12, Yasumasa Suenaga wrote: > > Hi Doug, > > Thank you for your comment. > > I uploaded new webrev. > Could you check again? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.01/ > > > Yasumasa > > > On 2017/05/18 18:11, Doug Simon wrote: >> The code changes look good. However, the javadoc still describes the functionality as a query: >> >> /** >> * Determines if {@code method} should not be inlined or compiled. >> */ >> >> where as it's really a setter. That is, the comment should be: >> >> /** >> * Sets flags on {@code method} indicating that it should never be inlined or compiled by the VM. >> */ >> >> -Doug >> >>> On 18 May 2017, at 05:18, Yasumasa Suenaga wrote: >>> >>> Hi all, >>> >>> This review request is related to [1]. 
>>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8180487 >>> Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.00/ >>> >>> Could you review it? >>> >>> I cannot access JPRT. >>> So I need a sponsor. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026218.html >> From yasuenag at gmail.com Thu May 18 12:28:45 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 18 May 2017 21:28:45 +0900 Subject: RFR: JDK-8180487: HotSpotResolvedJavaMethod#setNotInlineable() should be renamed to represent actual behavior In-Reply-To: <13B58304-46AD-4F25-A882-A112B9C55AA3@oracle.com> References: <2318d91f-bbaa-11f3-3729-decd094e2ace@gmail.com> <13B58304-46AD-4F25-A882-A112B9C55AA3@oracle.com> Message-ID: Thanks Doug! BTW, could you be a sponsor? Yasumasa On 2017/05/18 21:27, Doug Simon wrote: > Thanks for the update - the changes look good to me. > > -Doug > >> On 18 May 2017, at 14:12, Yasumasa Suenaga wrote: >> >> Hi Doug, >> >> Thank you for your comment. >> >> I uploaded new webrev. >> Could you check again? >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.01/ >> >> >> Yasumasa >> >> >> On 2017/05/18 18:11, Doug Simon wrote: >>> The code changes look good. However, the javadoc still describes the functionality as a query: >>> >>> /** >>> * Determines if {@code method} should not be inlined or compiled. >>> */ >>> >>> where as it's really a setter. That is, the comment should be: >>> >>> /** >>> * Sets flags on {@code method} indicating that it should never be inlined or compiled by the VM. >>> */ >>> >>> -Doug >>> >>>> On 18 May 2017, at 05:18, Yasumasa Suenaga wrote: >>>> >>>> Hi all, >>>> >>>> This review request is related to [1]. >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8180487 >>>> Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.00/ >>>> >>>> Could you review it? >>>> >>>> I cannot access JPRT. >>>> So I need a sponsor. 
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026218.html >>> > From yasuenag at gmail.com Thu May 18 12:38:39 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 18 May 2017 21:38:39 +0900 Subject: RFR: JDK-8180601: hsdis generates duplicate output for JVMCI installed code Message-ID: Hi all, This review request is related to [1]. JBS: https://bugs.openjdk.java.net/browse/JDK-8180601 Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180601/webrev.00/ Could you review it? I cannot access JPRT. So I need a sponsor. Thanks, Yasumasa [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026232.html From doug.simon at oracle.com Thu May 18 12:41:13 2017 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 18 May 2017 14:41:13 +0200 Subject: RFR: JDK-8180487: HotSpotResolvedJavaMethod#setNotInlineable() should be renamed to represent actual behavior In-Reply-To: References: <2318d91f-bbaa-11f3-3729-decd094e2ace@gmail.com> <13B58304-46AD-4F25-A882-A112B9C55AA3@oracle.com> Message-ID: <14289636-87F0-4F2C-B827-BDD68043FA45@oracle.com> > On 18 May 2017, at 14:28, Yasumasa Suenaga wrote: > > Thanks Doug! > > BTW, could you be a sponsor? I'd prefer it if one of the compiler devs could do it. Vladimir, can you take care of this? -Doug > On 2017/05/18 21:27, Doug Simon wrote: >> Thanks for the update - the changes look good to me. >> >> -Doug >> >>> On 18 May 2017, at 14:12, Yasumasa Suenaga wrote: >>> >>> Hi Doug, >>> >>> Thank you for your comment. >>> >>> I uploaded new webrev. >>> Could you check again? >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.01/ >>> >>> >>> Yasumasa >>> >>> >>> On 2017/05/18 18:11, Doug Simon wrote: >>>> The code changes look good. However, the javadoc still describes the functionality as a query: >>>> >>>> /** >>>> * Determines if {@code method} should not be inlined or compiled. 
>>>> */ >>>> >>>> where as it's really a setter. That is, the comment should be: >>>> >>>> /** >>>> * Sets flags on {@code method} indicating that it should never be inlined or compiled by the VM. >>>> */ >>>> >>>> -Doug >>>> >>>>> On 18 May 2017, at 05:18, Yasumasa Suenaga wrote: >>>>> >>>>> Hi all, >>>>> >>>>> This review request is related to [1]. >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8180487 >>>>> Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.00/ >>>>> >>>>> Could you review it? >>>>> >>>>> I cannot access JPRT. >>>>> So I need a sponsor. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026218.html >>>> >> From chihiro.ito at oracle.com Thu May 18 13:10:24 2017 From: chihiro.ito at oracle.com (chihiro ito) Date: Thu, 18 May 2017 22:10:24 +0900 Subject: RFR: Apply UL to PrintCodeCacheOnCompilation Message-ID: <591D9D40.5050400@oracle.com> Hi all, I apply Unified JVM Logging to log of PrintCodeCacheOnCompilation option. Logs which applied this is following. Could you possibly review for this following small change? If review is ok, please commit this as cito. 
Sample Log: [1.370s][debug][compilation,codecache] CodeHeap 'non-profiled nmethods': size=120036Kb used=13Kb max_used=13Kb free=120022Kb [1.372s][debug][compilation,codecache] CodeHeap 'profiled nmethods': size=120032Kb used=85Kb max_used=85Kb free=119946Kb [1.372s][debug][compilation,codecache] CodeHeap 'non-nmethods': size=5692Kb used=2648Kb max_used=2655Kb free=3043Kb Source: diff --git a/src/share/vm/compiler/compileBroker.cpp b/src/share/vm/compiler/compileBroker.cpp --- a/src/share/vm/compiler/compileBroker.cpp +++ b/src/share/vm/compiler/compileBroker.cpp @@ -1726,6 +1726,22 @@ tty->print("%s", s.as_string()); } +// wrapper for CodeCache::print_summary() using outputStream +static void codecache_print(outputStream* out, bool detailed) { + ResourceMark rm; + stringStream s; + + // Dump code cache into a buffer + { + MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag); + CodeCache::print_summary(&s, detailed); + } + + for( char *pos, *line = strtok_r(s.as_string(), "\n", &pos) ; line != NULL ; line = strtok_r(NULL, "\n", &pos) ) { + out->print_cr("%s", line); + } +} + void CompileBroker::post_compile(CompilerThread* thread, CompileTask* task, EventCompilation& event, bool success, ciEnv* ci_env) { if (success) { @@ -1939,6 +1955,10 @@ tty->print_cr("time: %d inlined: %d bytes", (int)time.milliseconds(), task->num_inlined_bytecodes()); } + Log(compilation, codecache) log; + if (log.is_debug()) + codecache_print(log.debug_stream(), /* detailed= */ false); + if (PrintCodeCacheOnCompilation) codecache_print(/* detailed= */ false); Regards, Chihiro -- Chihiro Ito | Principal Consultant | +81.90.6148.8815 Oracle Consultant ORACLE Japan | Akasaka Center Bldg. 
| Motoakasaka 1-3-13 | 1070051 Minato-ku, Tokyo, JAPAN Oracle is committed to developing practices and products that help protect the environment From tobias.hartmann at oracle.com Thu May 18 14:08:16 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 18 May 2017 16:08:16 +0200 Subject: [9] RFR(XS): 8180511: Null pointer dereference in Matcher::ReduceInst() Message-ID: <28049658-c5d4-1889-9e38-d8992a60dbcb@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8180511 http://cr.openjdk.java.net/~thartmann/8180511/webrev.00/ Fixed a missing null check on the return value of MachNodeGenerator() found by Parfait. Thanks, Tobias From tobias.hartmann at oracle.com Thu May 18 14:08:22 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 18 May 2017 16:08:22 +0200 Subject: [9] RFR(XS): 8180576: Null pointer dereference in Matcher::xform() Message-ID: Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8180576 http://cr.openjdk.java.net/~thartmann/8180576/webrev.00/ Fixed a missing null check on n->in(0) found by Parfait. Thanks, Tobias From tobias.hartmann at oracle.com Thu May 18 14:08:30 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 18 May 2017 16:08:30 +0200 Subject: [9] RFR(XS): 8180575: Null pointer dereference in LoadNode::Identity() Message-ID: Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8180575 http://cr.openjdk.java.net/~thartmann/8180575/webrev.00/ Fixed a missing null check on the return value of AddPNode::Ideal_base_and_offset() found by Parfait. 
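The null-check fixes requested above share one pattern, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the HotSpot change itself: `Node`, `find_base` and `value_or` are hypothetical names invented for this example, with `find_base` standing in for any helper that may return NULL.

```cpp
#include <cstddef>

// Hypothetical stand-in type; this is NOT a HotSpot class.
struct Node {
    int value;
};

// Hypothetical lookup that may fail and return nullptr. It plays the role
// of a helper whose result must be checked before it is dereferenced.
static Node* find_base(Node* nodes, std::size_t count, int wanted) {
    for (std::size_t i = 0; i < count; ++i) {
        if (nodes[i].value == wanted) {
            return &nodes[i];
        }
    }
    return nullptr;
}

// Returns the value of the matching node, or `fallback` if none matches.
static int value_or(Node* nodes, std::size_t count, int wanted, int fallback) {
    Node* base = find_base(nodes, count, wanted);
    if (base == nullptr) {  // guard added before the dereference below
        return fallback;
    }
    return base->value;
}
```

Without the guard, a failed lookup would be dereferenced directly, which is the kind of defect a static analysis tool reports as a possible null pointer dereference.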
Thanks, Tobias From tobias.hartmann at oracle.com Thu May 18 14:08:36 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 18 May 2017 16:08:36 +0200 Subject: [9] RFR(XS): 8180565: Null pointer dereferences of ConstMethod::method() Message-ID: <30304de2-7e50-4a0b-f74f-8a264da926cf@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8180565 http://cr.openjdk.java.net/~thartmann/8180565/webrev.00/ ConstMethod::method() returns _constants->pool_holder()->method_with_idnum(_method_idnum) which may be NULL. We need to check for NULL before dereferencing. Thanks, Tobias From zoltan.majo at oracle.com Thu May 18 15:18:28 2017 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Thu, 18 May 2017 17:18:28 +0200 Subject: [10] RFR(XS): 8180473: Use proper deallocation for FileBuff::_bigbuf In-Reply-To: <3c53363c-6f55-9126-83a6-57596713c6c2@oracle.com> References: <48b9ffd4-14ff-37ea-8dbe-9250ea56e73b@oracle.com> <3c53363c-6f55-9126-83a6-57596713c6c2@oracle.com> Message-ID: <1de9afb7-9d53-139e-d3b7-7e51b776d17c@oracle.com> Thank you, Vladimir, for the review! Best regards, Zoltan On 05/18/2017 01:46 AM, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 5/17/17 3:52 AM, Zoltán Majó wrote: >> Hi, >> >> >> please review the following fix for 8180473. >> https://bugs.openjdk.java.net/browse/JDK-8180473 >> http://cr.openjdk.java.net/~zmajo/8180473/webrev.00/ >> >> It is a good coding practice to properly deallocate dynamically >> allocated resources. In particular, dynamically allocated memory for >> arrays using the array-specific new[] operator is supposed to be >> deallocated using the array-specific delete[] operator. The >> deallocation of FileBuff::_bigbuf does not follow this recommendation >> and should therefore be changed. Tested with JPRT. >> >> Thank you! 
>> >> Best regards, >> >> >> Zoltan >> From vladimir.kozlov at oracle.com Thu May 18 17:07:41 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 May 2017 10:07:41 -0700 Subject: [9] RFR(XS): 8180511: Null pointer dereference in Matcher::ReduceInst() In-Reply-To: <28049658-c5d4-1889-9e38-d8992a60dbcb@oracle.com> References: <28049658-c5d4-1889-9e38-d8992a60dbcb@oracle.com> Message-ID: Good. Vladimir On 5/18/17 7:08 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8180511 > http://cr.openjdk.java.net/~thartmann/8180511/webrev.00/ > > Fixed a missing null check on the return value of MachNodeGenerator() found by Parfait. > > Thanks, > Tobias > From vladimir.kozlov at oracle.com Thu May 18 17:08:09 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 May 2017 10:08:09 -0700 Subject: [9] RFR(XS): 8180576: Null pointer dereference in Matcher::xform() In-Reply-To: References: Message-ID: <5dc3ad10-5a8d-bdb4-4ef2-0e3e9a75252b@oracle.com> Good. Vladimir On 5/18/17 7:08 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8180576 > http://cr.openjdk.java.net/~thartmann/8180576/webrev.00/ > > Fixed a missing null check on n->in(0) found by Parfait. > > Thanks, > Tobias > From vladimir.kozlov at oracle.com Thu May 18 17:09:06 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 May 2017 10:09:06 -0700 Subject: [9] RFR(XS): 8180575: Null pointer dereference in LoadNode::Identity() In-Reply-To: References: Message-ID: <7a717adb-a1b4-1673-1b1f-17af28266893@oracle.com> Good. Vladimir On 5/18/17 7:08 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8180575 > http://cr.openjdk.java.net/~thartmann/8180575/webrev.00/ > > Fixed a missing null check on the return value of AddPNode::Ideal_base_and_offset() found by Parfait. 
> > Thanks, > Tobias > From vladimir.kozlov at oracle.com Thu May 18 17:10:14 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 May 2017 10:10:14 -0700 Subject: [9] RFR(XS): 8180565: Null pointer dereferences of ConstMethod::method() In-Reply-To: <30304de2-7e50-4a0b-f74f-8a264da926cf@oracle.com> References: <30304de2-7e50-4a0b-f74f-8a264da926cf@oracle.com> Message-ID: <809ceefb-11e3-75cf-718d-b883ca88efbd@oracle.com> Good. Vladimir On 5/18/17 7:08 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8180565 > http://cr.openjdk.java.net/~thartmann/8180565/webrev.00/ > > ConstMethod::method() returns _constants->pool_holder()->method_with_idnum(_method_idnum) which may be NULL. We need to check for NULL before dereferencing. > > Thanks, > Tobias > From vladimir.kozlov at oracle.com Thu May 18 17:51:00 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 May 2017 10:51:00 -0700 Subject: RFR: Apply UL to PrintCodeCacheOnCompilation In-Reply-To: <591D9D40.5050400@oracle.com> References: <591D9D40.5050400@oracle.com> Message-ID: <6a279f07-c135-30a0-5558-5c64b0b7374f@oracle.com> Hi Chihiro, Changes looks fine. Please, file Enhancement in JBS. Then we can sponsor it. Thanks, Vladimir On 5/18/17 6:10 AM, chihiro ito wrote: > Hi all, > > I apply Unified JVM Logging to log of PrintCodeCacheOnCompilation option. Logs which applied this is following. > Could you possibly review for this following small change? If review is ok, please commit this as cito. 
> > Sample Log: > [1.370s][debug][compilation,codecache] CodeHeap 'non-profiled nmethods': size=120036Kb used=13Kb max_used=13Kb free=120022Kb > [1.372s][debug][compilation,codecache] CodeHeap 'profiled nmethods': size=120032Kb used=85Kb max_used=85Kb free=119946Kb > [1.372s][debug][compilation,codecache] CodeHeap 'non-nmethods': size=5692Kb used=2648Kb max_used=2655Kb free=3043Kb > > Source: > diff --git a/src/share/vm/compiler/compileBroker.cpp b/src/share/vm/compiler/compileBroker.cpp > --- a/src/share/vm/compiler/compileBroker.cpp > +++ b/src/share/vm/compiler/compileBroker.cpp > @@ -1726,6 +1726,22 @@ > tty->print("%s", s.as_string()); > } > > +// wrapper for CodeCache::print_summary() using outputStream > +static void codecache_print(outputStream* out, bool detailed) { > + ResourceMark rm; > + stringStream s; > + > + // Dump code cache into a buffer > + { > + MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag); > + CodeCache::print_summary(&s, detailed); > + } > + > + for( char *pos, *line = strtok_r(s.as_string(), "\n", &pos) ; line != NULL ; line = strtok_r(NULL, "\n", &pos) ) { > + out->print_cr("%s", line); > + } > +} > + > void CompileBroker::post_compile(CompilerThread* thread, CompileTask* task, EventCompilation& event, bool success, ciEnv* ci_env) { > > if (success) { > @@ -1939,6 +1955,10 @@ > tty->print_cr("time: %d inlined: %d bytes", (int)time.milliseconds(), task->num_inlined_bytecodes()); > } > > + Log(compilation, codecache) log; > + if (log.is_debug()) > + codecache_print(log.debug_stream(), /* detailed= */ false); > + > if (PrintCodeCacheOnCompilation) > codecache_print(/* detailed= */ false); > > > > Regards, > Chihiro > > From vladimir.kozlov at oracle.com Thu May 18 18:15:34 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 May 2017 11:15:34 -0700 Subject: RFR: JDK-8180601: hsdis generates duplicate output for JVMCI installed code In-Reply-To: References: Message-ID: 
<9c1181e4-1d5b-6c54-cdee-79924ff12a34@oracle.com> Hi Yasumasa Fix looks good. And I see Doug reviewed it too. I will sponsor it for JDK 10. Thanks, Vladimir On 5/18/17 5:38 AM, Yasumasa Suenaga wrote: > Hi all, > > This review request is related to [1]. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8180601 > Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180601/webrev.00/ > > Could you review it? > > I cannot access JPRT. > So I need a sponsor. > > > Thanks, > > Yasumasa > > > [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026232.html From thomas.stuefe at gmail.com Thu May 18 18:47:39 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Thu, 18 May 2017 20:47:39 +0200 Subject: [10] RFR(XS): 8180473: Use proper deallocation for FileBuff::_bigbuf In-Reply-To: <48b9ffd4-14ff-37ea-8dbe-9250ea56e73b@oracle.com> References: <48b9ffd4-14ff-37ea-8dbe-9250ea56e73b@oracle.com> Message-ID: Hi Zoltan, looks good. Old code may have worked but relied on undefined behaviour. Kind Regards, Thomas On Wed, May 17, 2017 at 12:52 PM, Zoltán Majó wrote: > Hi, > > > please review the following fix for 8180473. > https://bugs.openjdk.java.net/browse/JDK-8180473 > http://cr.openjdk.java.net/~zmajo/8180473/webrev.00/ > > It is a good coding practice to properly deallocate dynamically allocated > resources. In particular, dynamically allocated memory for arrays using the > array-specific new[] operator is supposed to be deallocated using the > array-specific delete[] operator. The deallocation of FileBuff::_bigbuf > does not follow this recommendation and should therefore be changed. Tested > with JPRT. > > Thank you! > > Best regards, > > > Zoltan > >
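For illustration, the new[]/delete[] pairing discussed in the 8180473 review can be sketched as follows. This is a minimal sketch and not the actual FileBuff code: the class name Buffer and its members are hypothetical, and it uses C++11 syntax that the real sources may not use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical buffer class (NOT the real FileBuff): memory obtained with the
// array form new[] is released with the matching array form delete[].
class Buffer {
 public:
  explicit Buffer(std::size_t n) : _size(n), _bigbuf(new char[n]) {
    assert(n > 0);
    std::memset(_bigbuf, 0, n);  // start from a zeroed buffer
  }
  // delete[] matches new[]; a plain "delete _bigbuf" here would be undefined behaviour.
  ~Buffer() { delete[] _bigbuf; }

  // Copying is disabled so that two objects never free the same array.
  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;

  std::size_t size() const { return _size; }
  char* data() { return _bigbuf; }

 private:
  std::size_t _size;
  char* _bigbuf;
};
```

Mixing the scalar and array forms often appears to work for trivially destructible element types such as char, which is presumably why the old code did not fail visibly, but the language gives no such guarantee.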
From vladimir.kozlov at oracle.com Thu May 18 23:39:02 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 May 2017 16:39:02 -0700 Subject: RFR: JDK-8180487: HotSpotResolvedJavaMethod#setNotInlineable() should be renamed to represent actual behavior In-Reply-To: <14289636-87F0-4F2C-B827-BDD68043FA45@oracle.com> References: <2318d91f-bbaa-11f3-3729-decd094e2ace@gmail.com> <13B58304-46AD-4F25-A882-A112B9C55AA3@oracle.com> <14289636-87F0-4F2C-B827-BDD68043FA45@oracle.com> Message-ID: <73edf931-d27f-52c6-d760-df1dab015992@oracle.com> In JPRT queue. Vladimir On 5/18/17 5:41 AM, Doug Simon wrote: > >> On 18 May 2017, at 14:28, Yasumasa Suenaga wrote: >> >> Thanks Doug! >> >> BTW, could you be a sponsor? > > > I'd prefer it if one of the compiler devs could do it. Vladimir, can you take care of this? > > -Doug > >> On 2017/05/18 21:27, Doug Simon wrote: >>> Thanks for the update - the changes look good to me. >>> >>> -Doug >>> >>>> On 18 May 2017, at 14:12, Yasumasa Suenaga wrote: >>>> >>>> Hi Doug, >>>> >>>> Thank you for your comment. >>>> >>>> I uploaded new webrev. >>>> Could you check again? >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.01/ >>>> >>>> >>>> Yasumasa >>>> >>>> >>>> On 2017/05/18 18:11, Doug Simon wrote: >>>>> The code changes look good. However, the javadoc still describes the functionality as a query: >>>>> >>>>> /** >>>>> * Determines if {@code method} should not be inlined or compiled. >>>>> */ >>>>> >>>>> where as it's really a setter. That is, the comment should be: >>>>> >>>>> /** >>>>> * Sets flags on {@code method} indicating that it should never be inlined or compiled by the VM. >>>>> */ >>>>> >>>>> -Doug >>>>> >>>>>> On 18 May 2017, at 05:18, Yasumasa Suenaga wrote: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> This review request is related to [1].
>>>>>> >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8180487 >>>>>> Patch: http://cr.openjdk.java.net/~ysuenaga/JDK-8180487/webrev.00/ >>>>>> >>>>>> Could you review it? >>>>>> >>>>>> I cannot access JPRT. >>>>>> So I need a sponsor. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026218.html >>>>> >>> > From tobias.hartmann at oracle.com Fri May 19 06:16:44 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 19 May 2017 08:16:44 +0200 Subject: [9] RFR(XS): 8180565: Null pointer dereferences of ConstMethod::method() In-Reply-To: <809ceefb-11e3-75cf-718d-b883ca88efbd@oracle.com> References: <30304de2-7e50-4a0b-f74f-8a264da926cf@oracle.com> <809ceefb-11e3-75cf-718d-b883ca88efbd@oracle.com> Message-ID: <42fc8336-2bf9-94a6-56a8-3574c67ac456@oracle.com> Hi Vladimir, thanks for the review! Best regards, Tobias On 18.05.2017 19:10, Vladimir Kozlov wrote: > Good. > > Vladimir > > > On 5/18/17 7:08 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180565 >> http://cr.openjdk.java.net/~thartmann/8180565/webrev.00/ >> >> ConstMethod::method() returns _constants->pool_holder()->method_with_idnum(_method_idnum) which may be NULL. We need to check for NULL before dereferencing. >> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Fri May 19 06:16:50 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 19 May 2017 08:16:50 +0200 Subject: [9] RFR(XS): 8180575: Null pointer dereference in LoadNode::Identity() In-Reply-To: <7a717adb-a1b4-1673-1b1f-17af28266893@oracle.com> References: <7a717adb-a1b4-1673-1b1f-17af28266893@oracle.com> Message-ID: <54ee2486-7fb5-57bc-9e37-d216335487d6@oracle.com> Hi Vladimir, thanks for the review! Best regards, Tobias On 18.05.2017 19:09, Vladimir Kozlov wrote: > Good. 
> > Vladimir > > > On 5/18/17 7:08 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180575 >> http://cr.openjdk.java.net/~thartmann/8180575/webrev.00/ >> >> Fixed a missing null check on the return value of AddPNode::Ideal_base_and_offset() found by Parfait. >> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Fri May 19 06:16:56 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 19 May 2017 08:16:56 +0200 Subject: [9] RFR(XS): 8180576: Null pointer dereference in Matcher::xform() In-Reply-To: <5dc3ad10-5a8d-bdb4-4ef2-0e3e9a75252b@oracle.com> References: <5dc3ad10-5a8d-bdb4-4ef2-0e3e9a75252b@oracle.com> Message-ID: Hi Vladimir, thanks for the review! Best regards, Tobias On 18.05.2017 19:08, Vladimir Kozlov wrote: > Good. > > Vladimir > > > On 5/18/17 7:08 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180576 >> http://cr.openjdk.java.net/~thartmann/8180576/webrev.00/ >> >> Fixed a missing null check on n->in(0) found by Parfait. >> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Fri May 19 06:17:02 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 19 May 2017 08:17:02 +0200 Subject: [9] RFR(XS): 8180511: Null pointer dereference in Matcher::ReduceInst() In-Reply-To: References: <28049658-c5d4-1889-9e38-d8992a60dbcb@oracle.com> Message-ID: <5b399c67-f9c8-99b0-0656-d0ffcc5a8af7@oracle.com> Hi Vladimir, thanks for the review! Best regards, Tobias On 18.05.2017 19:07, Vladimir Kozlov wrote: > Good. > > Vladimir > > On 5/18/17 7:08 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180511 >> http://cr.openjdk.java.net/~thartmann/8180511/webrev.00/ >> >> Fixed a missing null check on the return value of MachNodeGenerator() found by Parfait. 
>> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Fri May 19 07:27:31 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 19 May 2017 09:27:31 +0200 Subject: [9] RFR(XS): 8180617: Null pointer dereference in InitializeNode::complete_stores Message-ID: <44cb6da4-136e-2a01-7f04-c3e54e660e49@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8180617 http://cr.openjdk.java.net/~thartmann/8180617/webrev.00/ Fixed a missing null check on the return value of InitializeNode::allocation() found by Parfait. Thanks, Tobias From zoltan.majo at oracle.com Fri May 19 08:30:11 2017 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Fri, 19 May 2017 10:30:11 +0200 Subject: [9] RFR(XS): 8180617: Null pointer dereference in InitializeNode::complete_stores In-Reply-To: <44cb6da4-136e-2a01-7f04-c3e54e660e49@oracle.com> References: <44cb6da4-136e-2a01-7f04-c3e54e660e49@oracle.com> Message-ID: <3da5e879-6049-ab3f-6e7d-8899cc6afa5b@oracle.com> Hi Tobias, the fix looks good to me. Thank you! Best regards, Zoltan On 05/19/2017 09:27 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8180617 > http://cr.openjdk.java.net/~thartmann/8180617/webrev.00/ > > Fixed a missing null check on the return value of InitializeNode::allocation() found by Parfait. > > Thanks, > Tobias From zoltan.majo at oracle.com Fri May 19 08:41:53 2017 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Fri, 19 May 2017 10:41:53 +0200 Subject: [10] RFR(XS): 8180473: Use proper deallocation for FileBuff::_bigbuf In-Reply-To: References: <48b9ffd4-14ff-37ea-8dbe-9250ea56e73b@oracle.com> Message-ID: <314fc8f4-43ab-0afd-da65-33ddc064b222@oracle.com> Hi Thomas, On 05/18/2017 08:47 PM, Thomas Stüfe wrote: > Hi Zoltan, > > looks good. Old code may have worked but relied on undefined behaviour. thank you for the review.
Unfortunately, the fix was already in the repo when your mail arrived, so I am not able to mention you as a reviewer. Sorry for that. Best regards, Zoltan > > Kind Regards, Thomas > > On Wed, May 17, 2017 at 12:52 PM, Zoltán Majó > wrote: > > Hi, > > > please review the following fix for 8180473. > https://bugs.openjdk.java.net/browse/JDK-8180473 > > http://cr.openjdk.java.net/~zmajo/8180473/webrev.00/ > > > It is a good coding practice to properly deallocate dynamically > allocated resources. In particular, dynamically allocated memory > for arrays using the array-specific new[] operator is supposed to > be deallocated using the array-specific delete[] operator. The > deallocation of FileBuff::_bigbuf does not follow this > recommendation and should therefore be changed. Tested with JPRT. > > Thank you! > > Best regards, > > > Zoltan > > From tobias.hartmann at oracle.com Fri May 19 08:43:46 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 19 May 2017 10:43:46 +0200 Subject: [9] RFR(XS): 8180617: Null pointer dereference in InitializeNode::complete_stores In-Reply-To: <3da5e879-6049-ab3f-6e7d-8899cc6afa5b@oracle.com> References: <44cb6da4-136e-2a01-7f04-c3e54e660e49@oracle.com> <3da5e879-6049-ab3f-6e7d-8899cc6afa5b@oracle.com> Message-ID: <0c26e712-d19c-f0e5-5efc-565732c0e3c8@oracle.com> Hi Zoltan, thanks for the review! Best regards, Tobias On 19.05.2017 10:30, Zoltán Majó wrote: > Hi Tobias, > > the fix looks good to me. Thank you! > > Best regards, > > Zoltan > > On 05/19/2017 09:27 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180617 >> http://cr.openjdk.java.net/~thartmann/8180617/webrev.00/ >> >> Fixed a missing null check on the return value of InitializeNode::allocation() found by Parfait.
>> >> Thanks, >> Tobias > From chihiro.ito at oracle.com Fri May 19 10:03:28 2017 From: chihiro.ito at oracle.com (chihiro ito) Date: Fri, 19 May 2017 19:03:28 +0900 Subject: RFR: Apply UL to PrintCodeCacheOnCompilation In-Reply-To: <6a279f07-c135-30a0-5558-5c64b0b7374f@oracle.com> References: <591D9D40.5050400@oracle.com> <6a279f07-c135-30a0-5558-5c64b0b7374f@oracle.com> Message-ID: <591EC2F0.8010109@oracle.com> Hi Vladimir, Thank you for reviewing and advice. I created a enhancement in JBS as JDK-8180654. Could you possibly check it and commit this to jdk10/hs as cito. Regards, Chihiro On 2017/05/19 2:51, Vladimir Kozlov wrote: > Hi Chihiro, > > Changes looks fine. > Please, file Enhancement in JBS. Then we can sponsor it. > > Thanks, > Vladimir > > On 5/18/17 6:10 AM, chihiro ito wrote: >> Hi all, >> >> I apply Unified JVM Logging to log of PrintCodeCacheOnCompilation >> option. Logs which applied this is following. >> Could you possibly review for this following small change? If review >> is ok, please commit this as cito. 
>> >> Sample Log: >> [1.370s][debug][compilation,codecache] CodeHeap 'non-profiled >> nmethods': size=120036Kb used=13Kb max_used=13Kb free=120022Kb >> [1.372s][debug][compilation,codecache] CodeHeap 'profiled nmethods': >> size=120032Kb used=85Kb max_used=85Kb free=119946Kb >> [1.372s][debug][compilation,codecache] CodeHeap 'non-nmethods': >> size=5692Kb used=2648Kb max_used=2655Kb free=3043Kb >> >> Source: >> diff --git a/src/share/vm/compiler/compileBroker.cpp >> b/src/share/vm/compiler/compileBroker.cpp >> --- a/src/share/vm/compiler/compileBroker.cpp >> +++ b/src/share/vm/compiler/compileBroker.cpp >> @@ -1726,6 +1726,22 @@ >> tty->print("%s", s.as_string()); >> } >> >> +// wrapper for CodeCache::print_summary() using outputStream >> +static void codecache_print(outputStream* out, bool detailed) { >> + ResourceMark rm; >> + stringStream s; >> + >> + // Dump code cache into a buffer >> + { >> + MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag); >> + CodeCache::print_summary(&s, detailed); >> + } >> + >> + for( char *pos, *line = strtok_r(s.as_string(), "\n", &pos) ; line >> != NULL ; line = strtok_r(NULL, "\n", &pos) ) { >> + out->print_cr("%s", line); >> + } >> +} >> + >> void CompileBroker::post_compile(CompilerThread* thread, >> CompileTask* task, EventCompilation& event, bool success, ciEnv* >> ci_env) { >> >> if (success) { >> @@ -1939,6 +1955,10 @@ >> tty->print_cr("time: %d inlined: %d bytes", >> (int)time.milliseconds(), task->num_inlined_bytecodes()); >> } >> >> + Log(compilation, codecache) log; >> + if (log.is_debug()) >> + codecache_print(log.debug_stream(), /* detailed= */ false); >> + >> if (PrintCodeCacheOnCompilation) >> codecache_print(/* detailed= */ false); >> >> >> >> Regards, >> Chihiro >> >> -- Chihiro Ito | Principal Consultant | +81.90.6148.8815 Oracle Consultant ORACLE Japan | Akasaka Center Bldg. 
| Motoakasaka 1-3-13 | 1070051 Minato-ku, Tokyo, JAPAN Oracle is committed to developing practices and products that help protect the environment From rwestrel at redhat.com Fri May 19 12:14:22 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Fri, 19 May 2017 14:14:22 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: Message-ID: I found 2 more bugs with this. Here is a new webrev: http://cr.openjdk.java.net/~roland/8179678/webrev.02/ In the case of test3(), C2 now correctly finds that it can't move the src[0] load above the arraycopy. Once the ArrayCopyNode is expanded, C2 tries to move the src[0] load above what's now the subgraph of the expanded arraycopy. ArrayCopyNode::modifies() is how C2 knows whether it can step over an arraycopy. ArrayCopyNode::modifies() covers the expanded arraycopy case by looking for arraycopy stub calls along memory edges from the MemBar that's at the end of the arraycopy subgraph. The problem here is that ArrayCopyNode::modifies() looks for the stub calls on the raw memory slice but in this particular case, the stubs are on the slice of the array that's input to arraycopy because of this code in PhaseMacroExpand::expand_arraycopy_node(): // This is where the memory effects are placed: const TypePtr* adr_type = TypeAryPtr::get_array_body_type(dest_elem); if (ac->_dest_type != TypeOopPtr::BOTTOM) { adr_type = ac->_dest_type->add_offset(Type::OffsetBot)->is_ptr(); } if (ac->_src_type != ac->_dest_type) { adr_type = TypeRawPtr::BOTTOM; } C2 then sets the load memory edge to the MemBar memory input that causes an unschedulable graph. I fixed this by changing ArrayCopyNode::modifies() so it looks for arraycopy stubs along control edges. The last bug, is with test4().
In that case, it's legal for the src[0] load to move above the arraycopy and the arraycopy is eliminated. In the process, the code in PhaseMacroExpand::process_users_of_allocation() sets the dest input of the ArrayCopy node to top and because src == dest, src to top as well but the logic there doesn't expect src to be top. Roland. From lutz.schmidt at sap.com Fri May 19 12:45:34 2017 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Fri, 19 May 2017 12:45:34 +0000 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range Message-ID: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> Hi all, May I kindly request reviews for this small fix? A voluntary sponsor would be great as well! Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ The RTM code generation on ppc relied on RTM-related cmdline parameters to provide "well-behaved" values only. At least one jtreg test breaks this assumption. The fix makes code generation adapt to actual parameter values. Thanks, Lutz From vladimir.kozlov at oracle.com Fri May 19 16:35:22 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 19 May 2017 09:35:22 -0700 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> Message-ID: Hi Lutz, I can sponsor it but someone familiar with PPC have to review the fix. Thanks, Vladimir On 5/19/17 5:45 AM, Schmidt, Lutz wrote: > Hi all, > > May I kindly request reviews for this small fix? A voluntary sponsor would be great as well! > > Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ > > The RTM code generation on ppc relied on RTM-related cmdline parameters to provide "well-behaved" values only. At least one jtreg test breaks this assumption.
The fix makes code generation adapt to actual parameter values. > > Thanks, > Lutz > > > From vladimir.kozlov at oracle.com Fri May 19 17:12:00 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 19 May 2017 10:12:00 -0700 Subject: RFR: Apply UL to PrintCodeCacheOnCompilation In-Reply-To: <591EC2F0.8010109@oracle.com> References: <591D9D40.5050400@oracle.com> <6a279f07-c135-30a0-5558-5c64b0b7374f@oracle.com> <591EC2F0.8010109@oracle.com> Message-ID: Unfortunately build failed on Windows: compileBroker.cpp(1740) : error C3861: 'strtok_r': identifier not found Vladimir On 5/19/17 3:03 AM, chihiro ito wrote: > Hi Vladimir, > > Thank you for reviewing and advice. I created a enhancement in JBS as JDK-8180654. Could you possibly check it and > commit this to jdk10/hs as cito. > > Regards, > Chihiro > > On 2017/05/19 2:51, Vladimir Kozlov wrote: >> Hi Chihiro, >> >> Changes looks fine. >> Please, file Enhancement in JBS. Then we can sponsor it. >> >> Thanks, >> Vladimir >> >> On 5/18/17 6:10 AM, chihiro ito wrote: >>> Hi all, >>> >>> I apply Unified JVM Logging to log of PrintCodeCacheOnCompilation option. Logs which applied this is following. >>> Could you possibly review for this following small change? If review is ok, please commit this as cito. 
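Regarding the Windows build failure reported above ('strtok_r': identifier not found): strtok_r is a POSIX function and is not declared by the Microsoft C runtime, which offers strtok_s with a comparable signature instead. One way to avoid the platform difference entirely is to split the buffer with strchr, which is standard C. The following is only a sketch of that idea, not the change that was eventually pushed; the helper name split_lines is hypothetical, and it collects lines into a std::vector instead of printing to an outputStream so that it stays self-contained.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: split a mutable, NUL-terminated buffer into lines using
// only std::strchr. Like strtok_r with "\n" as delimiter, empty lines are skipped.
// The buffer is modified in place (each '\n' is overwritten with '\0').
std::vector<std::string> split_lines(char* buf) {
  assert(buf != nullptr);
  std::vector<std::string> lines;
  char* line = buf;
  while (line != nullptr && *line != '\0') {
    char* nl = std::strchr(line, '\n');
    if (nl != nullptr) {
      *nl = '\0';                       // terminate the current line
    }
    if (*line != '\0') {
      lines.push_back(line);            // keep non-empty lines only
    }
    line = (nl != nullptr) ? nl + 1 : nullptr;  // advance past the newline
  }
  return lines;
}
```

In the patch under review the loop body would call out->print_cr("%s", line) for each non-empty line instead of storing it.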
>>> >>> Sample Log: >>> [1.370s][debug][compilation,codecache] CodeHeap 'non-profiled nmethods': size=120036Kb used=13Kb max_used=13Kb >>> free=120022Kb >>> [1.372s][debug][compilation,codecache] CodeHeap 'profiled nmethods': size=120032Kb used=85Kb max_used=85Kb free=119946Kb >>> [1.372s][debug][compilation,codecache] CodeHeap 'non-nmethods': size=5692Kb used=2648Kb max_used=2655Kb free=3043Kb >>> >>> Source: >>> diff --git a/src/share/vm/compiler/compileBroker.cpp b/src/share/vm/compiler/compileBroker.cpp >>> --- a/src/share/vm/compiler/compileBroker.cpp >>> +++ b/src/share/vm/compiler/compileBroker.cpp >>> @@ -1726,6 +1726,22 @@ >>> tty->print("%s", s.as_string()); >>> } >>> >>> +// wrapper for CodeCache::print_summary() using outputStream >>> +static void codecache_print(outputStream* out, bool detailed) { >>> + ResourceMark rm; >>> + stringStream s; >>> + >>> + // Dump code cache into a buffer >>> + { >>> + MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag); >>> + CodeCache::print_summary(&s, detailed); >>> + } >>> + >>> + for( char *pos, *line = strtok_r(s.as_string(), "\n", &pos) ; line != NULL ; line = strtok_r(NULL, "\n", &pos) ) { >>> + out->print_cr("%s", line); >>> + } >>> +} >>> + >>> void CompileBroker::post_compile(CompilerThread* thread, CompileTask* task, EventCompilation& event, bool success, >>> ciEnv* ci_env) { >>> >>> if (success) { >>> @@ -1939,6 +1955,10 @@ >>> tty->print_cr("time: %d inlined: %d bytes", (int)time.milliseconds(), task->num_inlined_bytecodes()); >>> } >>> >>> + Log(compilation, codecache) log; >>> + if (log.is_debug()) >>> + codecache_print(log.debug_stream(), /* detailed= */ false); >>> + >>> if (PrintCodeCacheOnCompilation) >>> codecache_print(/* detailed= */ false); >>> >>> >>> >>> Regards, >>> Chihiro >>> >>> > > -- > > Chihiro Ito | Principal Consultant | +81.90.6148.8815 > Oracle Consultant > ORACLE Japan | Akasaka Center Bldg. 
| Motoakasaka 1-3-13 | 1070051 Minato-ku, Tokyo, JAPAN > > Oracle is committed to developing practices and products that help protect the environment > From volker.simonis at gmail.com Fri May 19 19:02:42 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 19 May 2017 21:02:42 +0200 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> Message-ID: Hi Lutz, Vladimir, @Lutz: thanks for fixing this. I think your change looks good. @Vladimir: thanks, but I think we can push this ourselves because it is ppc only. I've also realized that amd64 uses cmpptr() which takes the result of "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can be wrong if the result of the division is greater than 32 bit. I'm not sure how relevant that is, but maybe we could either change the types of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix the compare on amd64 to compare against a full 64 bit value. What do you think Vladimir - maybe do that as a follow up change or do you want to include it here (in which case you'd have to sponsor :) ? Thank you and best regards, Volker On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov wrote: > Hi Lutz, > > I can sponsor it but someone familiar with PPC have to review the fix. > > Thanks, > Vladimir > > > On 5/19/17 5:45 AM, Schmidt, Lutz wrote: >> >> Hi all, >> >> May I kindly request reviews for this small fix? A voluntary sponsor would >> be great as well! >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 >> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ >> >> The RTM code generation on ppc relied on RTM-related cmdline parameters to >> provide "well-behaved" values only. At least one jtreg test breaks this >> assumption. The fix makes code generation adapt to actual parameter values.
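The int32_t concern raised in this thread can be illustrated with a small range check. This is a sketch under assumptions: it treats both flag values as 64-bit signed integers, and the helper name ratio_fits_in_int32 is hypothetical and does not exist in HotSpot.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true if threshold / rate can be encoded as a
// signed 32-bit immediate without truncation. Both inputs are assumed to be
// 64-bit flag values; rate must be positive to avoid division by zero.
bool ratio_fits_in_int32(int64_t threshold, int64_t rate) {
  assert(rate > 0);
  const int64_t quotient = threshold / rate;
  return quotient >= std::numeric_limits<int32_t>::min() &&
         quotient <= std::numeric_limits<int32_t>::max();
}
```

Code generation could use such a check to pick between a 32-bit immediate compare and loading the full 64-bit value into a register first. Restricting the flags to a 32-bit type with a range() constraint makes the check unnecessary.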
>> >> Thanks, >> Lutz >> >> >> > From vladimir.kozlov at oracle.com Fri May 19 19:16:32 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 19 May 2017 12:16:32 -0700 Subject: RFR(S): 8175096: Use Subword Analysis for set vector size In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A63C7DE02@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A63C6F5E1@ORSMSX106.amr.corp.intel.com> <37b3da72-c082-7b1b-bf2d-18209b25faa2@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63C7DE02@ORSMSX106.amr.corp.intel.com> Message-ID: <1220b12e-d7f2-ddce-c1bc-b225c40ab73f@oracle.com> I have time to look on this again. And I don't see that patch helps. See my comments in the bug report. Regards, Vladimir On 2/24/17 10:56 AM, Deshpande, Vivek R wrote: > Hi Vladimir > > Subwords get converted to int by broadening for the operations on them and max_vector gets set according to int, even if there are no mixed type operations, leading to short vector width instructions. > > Regards, > Vivek > > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Friday, February 24, 2017 10:07 AM > To: Berg, Michael C; Deshpande, Vivek R; hotspot-compiler-dev at openjdk.java.net > Cc: Viswanathan, Sandhya > Subject: Re: RFR(S): 8175096: Use Subword Analysis for set vector size > > On 2/23/17 9:03 AM, Berg, Michael C wrote: >> I think using the current approach is the best way, it find the consistent shared vectorsize that is optimal for the loop for unrolling to act to. This additional approach augments that by some testing of constraints as the subword types are often occluded by sign and zero extension artifacts that would otherwise have caused the next common size to surface. These subword types are only in the integer subtype domain. > > "optimal" has wide meaning :) > > Currently it means small number of unrolls in presence of wide (long, > double) values. 
> > And if it is optimal why we need this additional fix (8175096)? > > I still did not get answer about why "Currently subword types cannot use entire vector width using SLP". In what cases this happen? > > Thanks, > Vladimir > >> >> Regards, >> Michael >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Wednesday, February 22, 2017 12:09 PM >> To: Deshpande, Vivek R ; >> hotspot-compiler-dev at openjdk.java.net >> Cc: Viswanathan, Sandhya ; Berg, >> Michael C >> Subject: Re: RFR(S): 8175096: Use Subword Analysis for set vector size >> >> Hi Vivek, >> >> This should go into jdk 10 since it is enhancement. We will consider to backport it into jdk 9 update release later. >> >> First, please explain why "Currently subword types cannot use entire vector width using SLP". >> >> The only explanation I have is that if loop has mixing types operations then we narrow unroll factor for biggest type: >> >> if (cur_max_vector < max_vector) { >> max_vector = cur_max_vector; >> >> And we start with smallest type: >> >> int max_vector = Matcher::max_vector_size(T_BYTE); >> >> Should we do opposite and start from long T_LONG (small unroll factor) and widen it to smallest type (big unroll factor)?: >> >> if (cur_max_vector > max_vector) { >> max_vector = cur_max_vector; >> >> Note, max_vector_size() returns number of elements and not size of vector in bytes. >> >> Michael, you are author of this code. What do you think? >> >> Thanks, >> Vladimir >> >> On 2/16/17 11:45 AM, Deshpande, Vivek R wrote: >>> Hi >>> >>> >>> >>> Currently subword types cannot use entire vector width using SLP. >>> >>> This fix analyzes the subword in the loop for possibility of >>> narrowing and sets the vector size accordingly. >>> >>> >>> >>> Webrev: >>> >>> http://cr.openjdk.java.net/~vdeshpande/8175096/webrev.00/ >>> >>> I have also updated the JBS entry. 
>>> >>> https://bugs.openjdk.java.net/browse/JDK-8175096 >>> >>> Would you please review and sponsor it. >>> >>> >>> >>> Regards, >>> >>> Vivek >>> >>> >>> From vladimir.kozlov at oracle.com Fri May 19 19:33:32 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 19 May 2017 12:33:32 -0700 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> Message-ID: <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> Thank you, Volker I think all RTM tuning flags should be uint (unsigned 32bit int). We did not have int/uint types when RTM was implemented. They were added 2 years ago: http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b Lets change type of RTM flags in all places. I will review and sponsor. thanks, Vladimir On 5/19/17 12:02 PM, Volker Simonis wrote: > Hi Lutz, Vladimir, > > @Lutz: thanks for fixing this. I think your change looks good. > > @Vladimir: thanks, but I think we can push this ourselves because it > is ppc only. > > I've also realized that amd64 uses cmpptr() which takes the result of > "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can > be wrong if the result of the division is greater than 32 bit. I'm not > sure how relevant that is, but maybe we could either change the types > of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix > the compare on amd64 to compare against a full 64 bit value. > > What do you think Vladimir - maybe do that as a follow up change or do > you want to include it here (in which case you'd have to sponsor :) ? > > Thank you and best regards, > Volker > > On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov > wrote: >> Hi Lutz, >> >> I can sponsor it but someone familiar with PPC have to review the fix. >> >> Thanks, >> Vladimir >> >> >> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: >>> >>> Hi all, >>> >>> May I kindly request reviews for this small fix? 
A voluntary sponsor would >>> be great as well! >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 >>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ >>> >>> The RTM code generation on ppc relied on RTM-related cmdline parameters to >>> provide "well-behaved" values only. At least one jtreg test breaks this >>> assumption. The fix makes code generation adapt to actual parameter values. >>> >>> Thanks, >>> Lutz >>> >>> >>> >> From vladimir.kozlov at oracle.com Fri May 19 19:40:50 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 19 May 2017 12:40:50 -0700 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> Message-ID: <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only. experimental(int, RTMTotalCountIncrRate, 64, \ "Increment total RTM attempted lock count once every n times") \ range(0, max_jint) \ Vladimir On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > Thank you, Volker > > I think all RTM tuning flags should be uint (unsigned 32bit int). > We did not have int/uint types when RTM was implemented. They were added 2 years ago: > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > Lets change type of RTM flags in all places. I will review and sponsor. > > thanks, > Vladimir > > On 5/19/17 12:02 PM, Volker Simonis wrote: >> Hi Lutz, Vladimir, >> >> @Lutz: thanks for fixing this. I think your change looks good. >> >> @Vladimir: thanks, but I think we can push this ourselves because it >> is ppc only. >> >> I've also realized that amd64 uses cmpptr() which takes the result of >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t.
This can >> be wrong if the result of the division is greater than 32 bit. I'm not >> sure how relevant that is, but maybe we could either change the types >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix >> the compare on amd64 to compare against a full 64 bit value. >> >> What do you think Vladimir - maybe do that as a follow up change or do >> you want to include it here (in which case you'd have to sponsor :) ? >> >> Thank you and best regards, >> Volker >> >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov >> wrote: >>> Hi Lutz, >>> >>> I can sponsor it but someone familiar with PPC have to review the fix. >>> >>> Thanks, >>> Vladimir >>> >>> >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: >>>> >>>> Hi all, >>>> >>>> May I kindly request reviews for this small fix? A voluntary sponsor would >>>> be great as well! >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ >>>> >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to >>>> provide "well-behaved" values only. At least one jtreg test breaks this >>>> assumption. The fix makes code generation adapt to actual parameter values. >>>> >>>> Thanks, >>>> Lutz >>>> >>>> >>>> >>> From lutz.schmidt at sap.com Fri May 19 19:55:49 2017 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Fri, 19 May 2017 19:55:49 +0000 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> Message-ID: Hi Vladimir, Volker, Thanks for looking at my fix and your willingness to sponsor it. I had mentioned that possible x86 issue in the bug description. Maybe that was the wrong place. Anyway, Volker wrote it down here so everyone knows. Regards, Lutz On 19.05.2017, 21:02, "Volker Simonis" wrote: Hi Lutz, Vladimir, @Lutz: thanks for fixing this. I think your change looks good.
@Vladimir: thanks, but I think we can push this ourselves because it is ppc only. I've also realized that amd64 uses cmpptr() which takes the result of "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can be wrong if the result of the division is greater than 32 bit. I'm not sure how relevant that is, but maybe we could either change the types of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix the compare on amd64 to compare against a full 64 bit value. What do you think Vladimir - maybe do that as a follow up change or do you want to include it here (in which case you'd have to sponsor :) ? Thank you and best regards, Volker On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov wrote: > Hi Lutz, > > I can sponsor it but someone familiar with PPC has to review the fix. > > Thanks, > Vladimir > > > On 5/19/17 5:45 AM, Schmidt, Lutz wrote: >> >> Hi all, >> >> May I kindly request reviews for this small fix? A voluntary sponsor would >> be great as well! >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 >> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ >> >> The RTM code generation on ppc relied on RTM-related cmdline parameters to >> provide "well-behaved" values only. At least one jtreg test breaks this >> assumption. The fix makes code generation adapt to actual parameter values.
>> >> Thanks, >> Lutz >> >> >> > From vivek.r.deshpande at intel.com Fri May 19 20:27:52 2017 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Fri, 19 May 2017 20:27:52 +0000 Subject: RFR(S): 8175096: Use Subword Analysis for set vector size In-Reply-To: <1220b12e-d7f2-ddce-c1bc-b225c40ab73f@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A63C6F5E1@ORSMSX106.amr.corp.intel.com> <37b3da72-c082-7b1b-bf2d-18209b25faa2@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A63C7DE02@ORSMSX106.amr.corp.intel.com> <1220b12e-d7f2-ddce-c1bc-b225c40ab73f@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A63CF904F@ORSMSX106.amr.corp.intel.com> Hi Vladimir Thanks for taking a look at the patch. I will take a look at your example and work on improving the patch. Also work on the suggestion of different vector elements counts for different types. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Friday, May 19, 2017 12:17 PM To: Deshpande, Vivek R; Berg, Michael C; hotspot-compiler-dev at openjdk.java.net Cc: Viswanathan, Sandhya Subject: Re: RFR(S): 8175096: Use Subword Analysis for set vector size I have time to look on this again. And I don't see that patch helps. See my comments in the bug report. Regards, Vladimir On 2/24/17 10:56 AM, Deshpande, Vivek R wrote: > Hi Vladimir > > Subwords get converted to int by broadening for the operations on them and max_vector gets set according to int, even if there are no mixed type operations, leading to short vector width instructions. 
> > Regards, > Vivek > > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Friday, February 24, 2017 10:07 AM > To: Berg, Michael C; Deshpande, Vivek R; > hotspot-compiler-dev at openjdk.java.net > Cc: Viswanathan, Sandhya > Subject: Re: RFR(S): 8175096: Use Subword Analysis for set vector size > > On 2/23/17 9:03 AM, Berg, Michael C wrote: >> I think using the current approach is the best way, it find the consistent shared vectorsize that is optimal for the loop for unrolling to act to. This additional approach augments that by some testing of constraints as the subword types are often occluded by sign and zero extension artifacts that would otherwise have caused the next common size to surface. These subword types are only in the integer subtype domain. > > "optimal" has wide meaning :) > > Currently it means small number of unrolls in presence of wide (long, > double) values. > > And if it is optimal why we need this additional fix (8175096)? > > I still did not get answer about why "Currently subword types cannot use entire vector width using SLP". In what cases this happen? > > Thanks, > Vladimir > >> >> Regards, >> Michael >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Wednesday, February 22, 2017 12:09 PM >> To: Deshpande, Vivek R ; >> hotspot-compiler-dev at openjdk.java.net >> Cc: Viswanathan, Sandhya ; Berg, >> Michael C >> Subject: Re: RFR(S): 8175096: Use Subword Analysis for set vector >> size >> >> Hi Vivek, >> >> This should go into jdk 10 since it is enhancement. We will consider to backport it into jdk 9 update release later. >> >> First, please explain why "Currently subword types cannot use entire vector width using SLP". 
>> >> The only explanation I have is that if loop has mixing types operations then we narrow unroll factor for biggest type: >> >> if (cur_max_vector < max_vector) { >> max_vector = cur_max_vector; >> >> And we start with smallest type: >> >> int max_vector = Matcher::max_vector_size(T_BYTE); >> >> Should we do opposite and start from long T_LONG (small unroll factor) and widen it to smallest type (big unroll factor)?: >> >> if (cur_max_vector > max_vector) { >> max_vector = cur_max_vector; >> >> Note, max_vector_size() returns number of elements and not size of vector in bytes. >> >> Michael, you are author of this code. What do you think? >> >> Thanks, >> Vladimir >> >> On 2/16/17 11:45 AM, Deshpande, Vivek R wrote: >>> Hi >>> >>> >>> >>> Currently subword types cannot use entire vector width using SLP. >>> >>> This fix analyzes the subword in the loop for possibility of >>> narrowing and sets the vector size accordingly. >>> >>> >>> >>> Webrev: >>> >>> http://cr.openjdk.java.net/~vdeshpande/8175096/webrev.00/ >>> >>> I have also updated the JBS entry. >>> >>>c >>> >>> Would you please review and sponsor it. >>> >>> >>> >>> Regards, >>> >>> Vivek >>> >>> >>> From igor.ignatyev at oracle.com Fri May 19 21:10:08 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 19 May 2017 14:10:08 -0700 Subject: [9] RFR(XS): 8180565: Null pointer dereferences of ConstMethod::method() In-Reply-To: <42fc8336-2bf9-94a6-56a8-3574c67ac456@oracle.com> References: <30304de2-7e50-4a0b-f74f-8a264da926cf@oracle.com> <809ceefb-11e3-75cf-718d-b883ca88efbd@oracle.com> <42fc8336-2bf9-94a6-56a8-3574c67ac456@oracle.com> Message-ID: <619D4C02-C294-4F78-A58F-9BBD54A9949A@oracle.com> Hi Tobias, in ConstMethod::print_on, don't we need st->cr() when 'm' is null? Thanks, -- Igor > On May 18, 2017, at 11:16 PM, Tobias Hartmann wrote: > > Hi Vladimir, > > thanks for the review! > > Best regards, > Tobias > > On 18.05.2017 19:10, Vladimir Kozlov wrote: >> Good. 
>> >> Vladimir >> >> >> On 5/18/17 7:08 AM, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch: >>> https://bugs.openjdk.java.net/browse/JDK-8180565 >>> http://cr.openjdk.java.net/~thartmann/8180565/webrev.00/ >>> >>> ConstMethod::method() returns _constants->pool_holder()->method_with_idnum(_method_idnum) which may be NULL. We need to check for NULL before dereferencing. >>> >>> Thanks, >>> Tobias >>> From vladimir.kozlov at oracle.com Sat May 20 01:02:41 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 19 May 2017 18:02:41 -0700 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: Message-ID: <43893ed5-cbf5-4cf9-a727-8733850260de@oracle.com> Call nodes have complicated graph of following control projections. You don't check for them. Why? I think that was the reason that we scan memory edges instead of control when searching for calls. Thanks, Vladimir On 5/19/17 5:14 AM, Roland Westrelin wrote: > > I found 2 more bugs with this. Here is a new webrev: > > http://cr.openjdk.java.net/~roland/8179678/webrev.02/ > > In the case of test3(), C2 now correctly finds that it can't move the > src[0] load above the arraycopy. Once the ArrayCopyNode is expanded, C2 > tries to move the src[0] load above what's now the subgraph of the > expanded arraycopy. ArrayCopyNode::modifies() is how C2 knows whether it > can step over an arraycopy. ArrayCopyNode::modifies() covers the > expanded arraycopy case by looking for arraycopy stub calls along memory > edges from the MemBar that's at the end of the arraycopy subgraph. 
The > problem here is that ArrayCopyNode::modifies() looks for the stub calls > on the raw memory slice but in this particular case, the stubs are on > the slice of the array that's input to arraycopy because of this code in > PhaseMacroExpand::expand_arraycopy_node(): > > // This is where the memory effects are placed: > const TypePtr* adr_type = TypeAryPtr::get_array_body_type(dest_elem); > if (ac->_dest_type != TypeOopPtr::BOTTOM) { > adr_type = ac->_dest_type->add_offset(Type::OffsetBot)->is_ptr(); > } > if (ac->_src_type != ac->_dest_type) { > adr_type = TypeRawPtr::BOTTOM; > } > > C2 then sets the load memory edge to the MemBar memory input that causes > an unschedulable graph. I fixed this by changing > ArrayCopyNode::modifies() so it looks for arraycopy stubs along control > edges. > > The last bug, is with test4(). In that case, it's legal for the src[0] > load to move above the arraycopy and the arraycopy is eliminated. In the > process, the code in PhaseMacroExpand::process_users_of_allocation() > sets the dest input of the ArrayCopy node to top and because src == > dest, src to top as well but the logic there doesn't expect src to be > top. > > Roland. > From chihiro.ito at oracle.com Sat May 20 09:25:50 2017 From: chihiro.ito at oracle.com (chihiro ito) Date: Sat, 20 May 2017 18:25:50 +0900 Subject: RFR: Apply UL to PrintCodeCacheOnCompilation In-Reply-To: References: <591D9D40.5050400@oracle.com> <6a279f07-c135-30a0-5558-5c64b0b7374f@oracle.com> <591EC2F0.8010109@oracle.com> Message-ID: <59200B9E.20106@oracle.com> Hi Vladimir, Thank you for build. For Windows, I found that I use strtok_s instead of strtok_r. On Windows, should I use strtok_s? I thought that a common code is good (not depend on the OS) . It was fixed as follows. 
diff --git a/src/share/vm/compiler/compileBroker.cpp b/src/share/vm/compiler/compileBroker.cpp
--- a/src/share/vm/compiler/compileBroker.cpp
+++ b/src/share/vm/compiler/compileBroker.cpp
@@ -1726,6 +1726,33 @@
   tty->print("%s", s.as_string());
 }

+// wrapper for CodeCache::print_summary() using outputStream
+static void codecache_print(outputStream* out, bool detailed) {
+  ResourceMark rm;
+  stringStream s;
+
+  // Dump code cache into a buffer
+  {
+    MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag);
+    CodeCache::print_summary(&s, detailed);
+  }
+
+  char* remaining_log = s.as_string();
+  char* eol;
+
+  while( *remaining_log != '\0' ){
+    eol = strchr(remaining_log, '\n');
+    if( eol == NULL ) {
+      out->print_cr("%s", remaining_log);
+      remaining_log = remaining_log + strlen(remaining_log);
+    } else {
+      *eol = '\0';
+      out->print_cr("%s", remaining_log);
+      remaining_log = eol + 1;
+    }
+  }
+}
+
 void CompileBroker::post_compile(CompilerThread* thread, CompileTask* task, EventCompilation& event, bool success, ciEnv* ci_env) {

   if (success) {
@@ -1939,6 +1966,10 @@
     tty->print_cr("time: %d inlined: %d bytes", (int)time.milliseconds(), task->num_inlined_bytecodes());
   }

+  Log(compilation, codecache) log;
+  if (log.is_debug())
+    codecache_print(log.debug_stream(), /* detailed= */ false);
+
   if (PrintCodeCacheOnCompilation)
     codecache_print(/* detailed= */ false);

Regards,
Chihiro

On 2017/05/20 2:12, Vladimir Kozlov wrote:
> Unfortunately build failed on Windows:
>
> compileBroker.cpp(1740) : error C3861: 'strtok_r': identifier not found
>
> Vladimir
>
> On 5/19/17 3:03 AM, chihiro ito wrote:
>> Hi Vladimir,
>>
>> Thank you for reviewing and advice. I created a enhancement in JBS as
>> JDK-8180654. Could you possibly check it and
>> commit this to jdk10/hs as cito.
>>
>> Regards,
>> Chihiro
>>
>> On 2017/05/19 2:51, Vladimir Kozlov wrote:
>>> Hi Chihiro,
>>>
>>> Changes looks fine.
>>> Please, file Enhancement in JBS. Then we can sponsor it.
>>> >>> Thanks, >>> Vladimir >>> >>> On 5/18/17 6:10 AM, chihiro ito wrote: >>>> Hi all, >>>> >>>> I apply Unified JVM Logging to log of PrintCodeCacheOnCompilation >>>> option. Logs which applied this is following. >>>> Could you possibly review for this following small change? If >>>> review is ok, please commit this as cito. >>>> >>>> Sample Log: >>>> [1.370s][debug][compilation,codecache] CodeHeap 'non-profiled >>>> nmethods': size=120036Kb used=13Kb max_used=13Kb >>>> free=120022Kb >>>> [1.372s][debug][compilation,codecache] CodeHeap 'profiled >>>> nmethods': size=120032Kb used=85Kb max_used=85Kb free=119946Kb >>>> [1.372s][debug][compilation,codecache] CodeHeap 'non-nmethods': >>>> size=5692Kb used=2648Kb max_used=2655Kb free=3043Kb >>>> >>>> Source: >>>> diff --git a/src/share/vm/compiler/compileBroker.cpp >>>> b/src/share/vm/compiler/compileBroker.cpp >>>> --- a/src/share/vm/compiler/compileBroker.cpp >>>> +++ b/src/share/vm/compiler/compileBroker.cpp >>>> @@ -1726,6 +1726,22 @@ >>>> tty->print("%s", s.as_string()); >>>> } >>>> >>>> +// wrapper for CodeCache::print_summary() using outputStream >>>> +static void codecache_print(outputStream* out, bool detailed) { >>>> + ResourceMark rm; >>>> + stringStream s; >>>> + >>>> + // Dump code cache into a buffer >>>> + { >>>> + MutexLockerEx mu(CodeCache_lock, >>>> Mutex::_no_safepoint_check_flag); >>>> + CodeCache::print_summary(&s, detailed); >>>> + } >>>> + >>>> + for( char *pos, *line = strtok_r(s.as_string(), "\n", &pos) ; >>>> line != NULL ; line = strtok_r(NULL, "\n", &pos) ) { >>>> + out->print_cr("%s", line); >>>> + } >>>> +} >>>> + >>>> void CompileBroker::post_compile(CompilerThread* thread, >>>> CompileTask* task, EventCompilation& event, bool success, >>>> ciEnv* ci_env) { >>>> >>>> if (success) { >>>> @@ -1939,6 +1955,10 @@ >>>> tty->print_cr("time: %d inlined: %d bytes", >>>> (int)time.milliseconds(), task->num_inlined_bytecodes()); >>>> } >>>> >>>> + Log(compilation, codecache) log; >>>> + if 
(log.is_debug()) >>>> + codecache_print(log.debug_stream(), /* detailed= */ false); >>>> + >>>> if (PrintCodeCacheOnCompilation) >>>> codecache_print(/* detailed= */ false); >>>> >>>> >>>> >>>> Regards, >>>> Chihiro >>>> >>>> >> >> -- >> >> Chihiro Ito | Principal Consultant | +81.90.6148.8815 >> Oracle Consultant >> ORACLE Japan | Akasaka Center Bldg. | Motoakasaka 1-3-13 | 1070051 >> Minato-ku, Tokyo, JAPAN >> >> Oracle is committed to developing practices and products that help >> protect the environment >> -- Chihiro Ito | Principal Consultant | +81.90.6148.8815 Oracle Consultant ORACLE Japan | Akasaka Center Bldg. | Motoakasaka 1-3-13 | 1070051 Minato-ku, Tokyo, JAPAN Oracle is committed to developing practices and products that help protect the environment -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.hartmann at oracle.com Mon May 22 07:11:35 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 22 May 2017 09:11:35 +0200 Subject: [9] RFR(XS): 8180565: Null pointer dereferences of ConstMethod::method() In-Reply-To: <619D4C02-C294-4F78-A58F-9BBD54A9949A@oracle.com> References: <30304de2-7e50-4a0b-f74f-8a264da926cf@oracle.com> <809ceefb-11e3-75cf-718d-b883ca88efbd@oracle.com> <42fc8336-2bf9-94a6-56a8-3574c67ac456@oracle.com> <619D4C02-C294-4F78-A58F-9BBD54A9949A@oracle.com> Message-ID: <09356c52-47b2-a686-1eb0-aa3c69c368b8@oracle.com> Hi Igor, On 19.05.2017 23:10, Igor Ignatyev wrote: > in ConstMethod::print_on, don't we need st->cr() when 'm' is null? Right, thanks for catching this! I'll fix it before pushing. Best regards, Tobias >> On May 18, 2017, at 11:16 PM, Tobias Hartmann wrote: >> >> Hi Vladimir, >> >> thanks for the review! >> >> Best regards, >> Tobias >> >> On 18.05.2017 19:10, Vladimir Kozlov wrote: >>> Good. 
>>> >>> Vladimir >>> >>> >>> On 5/18/17 7:08 AM, Tobias Hartmann wrote: >>>> Hi, >>>> >>>> please review the following patch: >>>> https://bugs.openjdk.java.net/browse/JDK-8180565 >>>> http://cr.openjdk.java.net/~thartmann/8180565/webrev.00/ >>>> >>>> ConstMethod::method() returns _constants->pool_holder()->method_with_idnum(_method_idnum) which may be NULL. We need to check for NULL before dereferencing. >>>> >>>> Thanks, >>>> Tobias >>>> > From thomas.schatzl at oracle.com Mon May 22 15:39:53 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 22 May 2017 17:39:53 +0200 Subject: RFR (S): 8180755: Remove use of bitMap.inline.hpp include from instanceKlass.hpp and c1_ValueSet.hpp Message-ID: <1495467593.2573.82.camel@oracle.com> Hi all, ? can I have reviews for this change that removes the use of bitMap.inline.hpp from instanceKlass.hpp (which does not use bitMaps at all) and?c1_ValueSet.hpp as per guidelines? This is only seemingly a large change the reason is the required move of methods that use the BitMap class in .hpp files into .inline.hpp files (GC and compiler only). Also, this touches files from GC, runtime and compiler, so I sent this RFR to all corresponding lists to get reviews from every group for their respective files. It's mostly GC changes though. CR: https://bugs.openjdk.java.net/browse/JDK-8180755 Webrev: http://cr.openjdk.java.net/~tschatzl/8180755/webrev/ Testing: jprt Thanks, ? Thomas From rwestrel at redhat.com Mon May 22 16:26:34 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 22 May 2017 18:26:34 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: <43893ed5-cbf5-4cf9-a727-8733850260de@oracle.com> References: <43893ed5-cbf5-4cf9-a727-8733850260de@oracle.com> Message-ID: > Call nodes have complicated graph of following control > projections. You don't check for them. Why? 
I think that was the
> reason that we scan memory edges instead of control when searching for
> calls.

Catch and CatchProj? The logic I changed handles an array copy, once it
is expanded, whose destination is a non-escaping array. Escape analysis
only considers the destination of "validated" array copies as
non-escaping:

    PointsToNode::EscapeState es = PointsToNode::ArgEscape;
    if (call->is_ArrayCopy()) {
      ArrayCopyNode* ac = call->as_ArrayCopy();
      if (ac->is_clonebasic() ||
          ac->is_arraycopy_validated() ||
          ac->is_copyof_validated() ||
          ac->is_copyofrange_validated()) {
        es = PointsToNode::NoEscape;
      }
    }

An array copy is "validated" when LibraryCallKit::inline_arraycopy()
inserted guards to validate that this arraycopy is a "good" arraycopy.

In PhaseMacroExpand::generate_arraycopy() the stub calls that are
inserted are all connected through a single ProjNode to the result
Region except:

1- the slow arraycopy call, which has Catch and CatchProj nodes. But with
   validated arraycopies, there shouldn't be a slow arraycopy alone but
   at least one stub call on one of the control paths.

2- a checkcast arraycopy, for which the return of the call needs to be
   checked: checkcast arraycopies are explicitly skipped for validated
   arraycopies.

3- if the copy size is zero, PhaseMacroExpand::generate_arraycopy()
   would generate no call, but that special case should be handled by
   ArrayCopyNode::Ideal.

4- a generic arraycopy (if C2 can't tell whether inputs to arraycopy are
   arrays).

I found that case 4 can happen even though, given that the arraycopy is
validated, it could be avoided. So here is a new webrev which sets
src_elem = dest_elem in PhaseMacroExpand::expand_arraycopy_node() when
dest_elem is known and the arraycopy is validated.

http://cr.openjdk.java.net/~roland/8179678/webrev.03/

Roland.
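
The escape-state rule quoted in the message above can be modelled in isolation, which may help when reasoning about which array copies reach the expanded-arraycopy logic. What follows is only a minimal sketch under stated assumptions: EscapeState, ArrayCopyKind and dest_escape_state are hypothetical stand-ins invented for illustration, not the HotSpot classes, and the four flags simply mirror the ArrayCopyNode predicates named in the snippet.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are NOT the HotSpot
// PointsToNode / ArrayCopyNode classes.
enum class EscapeState { NoEscape, ArgEscape };

// One flag per predicate named in the quoted snippet
// (is_clonebasic, is_arraycopy_validated, is_copyof_validated,
// is_copyofrange_validated).
struct ArrayCopyKind {
  bool clonebasic = false;
  bool arraycopy_validated = false;
  bool copyof_validated = false;
  bool copyofrange_validated = false;
};

// Escape state given to the destination of a call: ArgEscape by default,
// NoEscape only for clone-basic or "validated" array copies.
EscapeState dest_escape_state(bool is_array_copy, const ArrayCopyKind& ac) {
  EscapeState es = EscapeState::ArgEscape;
  if (is_array_copy &&
      (ac.clonebasic || ac.arraycopy_validated ||
       ac.copyof_validated || ac.copyofrange_validated)) {
    es = EscapeState::NoEscape;
  }
  return es;
}
```

In this model a plain, unvalidated copy keeps its destination at ArgEscape, so only validated (or clone-basic) copies can have a non-escaping destination, which is the premise of the four-case analysis above.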
From jcbeyler at google.com Mon May 22 18:47:35 2017
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 22 May 2017 11:47:35 -0700
Subject: Low-Overhead Heap Profiling
In-Reply-To:
References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com>
Message-ID:

Dear all,

I have a new webrev up:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.03/

This webrev has, I hope, fixed a lot of the comments from Robbin:
  - The casts normally are all C++ style
  - Moved this to jdk10-hs
  - I have not tested slowdebug yet, hopefully it does not break there
  - Added the garbage collection system:
     - Now live sampled allocations are tracked throughout their lifetime
     - When GC happens, it moves the sampled allocation information to
       two lists: recent and frequent GC lists
     - Those lists use the array system that the live objects were using
       before but have different re-use strategies
  - Added the JVMTI API for them via a GetFrequentGarbageTraces and
    GetGarbageTraces
     - Both use the same JVMTI structures
  - Added the calls to them for the test, though I've kept that test
    simple for now:
    http://cr.openjdk.java.net/~rasbold/8171119/webrev.03/raw_files/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c
     - As I write this, I notice my webrev is missing a final change I
       made to the test that calls the same ReleaseTraces to each
       live/garbage/frequent structure. This is updated in my local repo
       and will get in the next webrev.

Next steps for this work are:
  - Putting the TLAB implementation (yes not forgotten ;-))
  - Adding more testing and separate the current test system to check
    things a bit more thoroughly
  - Have not tried to circumvent AsyncGetCallTrace yet
  - Still have to double check the stack walker a bit more

Happy webrev perusal!
Jc On Tue, May 16, 2017 at 5:20 AM, Robbin Ehn wrote: > Just a few answers, > > On 05/15/2017 06:48 PM, JC Beyler wrote: > >> Dear all, >> >> I've updated the webrev to: >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/ < >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/> >> > > I'll look at this later, thanks! > > >> Robbin, >> I believe I have addressed most of your items with webrev 02: >> - I added a JTreg test to show how it works: >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_fi >> les/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c < >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_f >> iles/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c> >> - I've modified the code to use its own data structures both internally >> and externally, this will make it easier to move out of AsyncGetCallTrace >> as we move forward, that is still on my TODOs >> - I cleaned up the JVMTI API by passing a structure that handles the >> num_traces and put in a ReleaseTraces as well >> - I cleaned up other issues as well. >> >> However, I have three questions, which are probably because I'm new in >> this community: >> 1) My previous webrevs were based off of JDK9 by mistake. When I took >> JDK10 via : hg clone http://hg.openjdk.java.net/jdk10/jdk10 < >> http://hg.openjdk.java.net/jdk10/jdk10> jdk10 >> - I don't see code compatible with what you were showing (ie your >> patches don't make sense for that code base; ex: klass is still accessed >> via klass() for example in collectedHeap.inline.hpp) >> - Would you know what is the right hg clone command so we are >> working on the same code base? >> > > We use jdk10-hs, e.g. > hg tclone http://hg.openjdk.java.net/jdk10/hs 10-hs > > There is sporadic big merges going from jdk9->jdk10->jdk10-hs and > jdk10-hs->jdk10, so 10 is moving... 
> > >> 2) You mentioned I was using os::malloc, new, NEW_C_HEAP_ARRAY; I >> cleaned out the os::malloc but which of the new vs NEW_C_HEAP_ARRAY should >> I use. It might be that I don't understand when one uses one or the other >> but I see both used around the code base? >> - Is it that new is to be used for anything internal and >> NEW_C_HEAP_ARRAY anything provided to the JVMTI users outside of the JVM? >> > > We overload new operator when you extend correct base class, e.g. > CHeapObj so use 'new' > But for arrays you will need the macro NEW_C_HEAP_ARRAY. > > >> 3) Casts: same kind question: which should I use. The code was using a >> bit of everything, I'll refactor it entirely but I was not clear if I >> should go to C casts or C++ casts as I see both in the codebase. What is >> the convention I should use? >> > > Just be consist, use what suites you, C++ casts might be preferable, if we > are moving towards C++11. > And use 'right' cast, e.g. going from Thread* to JavaThread* you should > use C cast or static_cast, not reinterpret_cast I would say. > > >> Final notes on this webrev: >> - I am still missing: >> - Putting a TLAB implementation so that we can compare both webrevs >> - Have not tried to circumvent AsyncGetCallTrace >> - Putting in the handling of GC'd objects >> - Fix a stack walker issue I have seen, I think I know the problem >> and will test that theory out for the next webrev >> >> I will work on integrating those items for the next webrev! >> > > Thanks! > > >> Thanks for your help, >> Jc >> >> Ps: I tested this on a new repo: >> >> hg clone http://hg.openjdk.java.net/jdk10/jdk10 < >> http://hg.openjdk.java.net/jdk10/jdk10> jdk10 >> ... building it >> cd test >> jtreg -nativepath:/build/linux-x86_64-normal-server >> -release/support/test/hotspot/jtreg/native/lib/ -jdk >> /linux-x86_64-normal-server-release/images/jdk >> ../hotspot/test/serviceability/jvmti/HeapMonitor/ >> >> > I'll test it out! 
> > /Robbin > > >> >> On Thu, May 4, 2017 at 11:21 PM, serguei.spitsyn at oracle.com > serguei.spitsyn at oracle.com> > serguei.spitsyn at oracle.com>> wrote: >> >> Robbin, >> >> Thank you for forwarding! >> I will review it. >> >> Thanks, >> Serguei >> >> >> >> On 5/4/17 02:13, Robbin Ehn wrote: >> >> Hi, >> >> To me the compiler changes looks what is expected. >> It would be good if someone from compiler could take a look at >> that. >> Added compiler to mail thread. >> >> Also adding Serguei, It would be good with his view also. >> >> My initial take on it, read through most of the code and took it >> for a ride. >> >> ############################## >> - Regarding the compiler changes: I think we need the 'TLAB end' >> trickery (mentioned by Tony P) >> instead of a separate check for sampling in fast path for the >> final version. >> >> ############################## >> - This patch I had to apply to get it compile on JDK 10: >> >> diff -r ac3ded340b35 src/share/vm/gc/shared/collect >> edHeap.inline.hpp >> --- a/src/share/vm/gc/shared/collectedHeap.inline.hpp Fri Apr >> 28 14:31:38 2017 +0200 >> +++ b/src/share/vm/gc/shared/collectedHeap.inline.hpp Thu May >> 04 10:22:56 2017 +0200 >> @@ -87,3 +87,3 @@ >> // support for object alloc event (no-op most of the time) >> - if (klass() != NULL && klass()->name() != NULL) { >> + if (klass != NULL && klass->name() != NULL) { >> Thread *base_thread = Thread::current(); >> diff -r ac3ded340b35 src/share/vm/runtime/heapMonitoring.cpp >> --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 >> 14:31:38 2017 +0200 >> +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 >> 10:22:56 2017 +0200 >> @@ -316,3 +316,3 @@ >> JavaThread *thread = reinterpret_cast> *>(Thread::current()); >> - assert(o->size() << LogHeapWordSize == byte_size, >> + assert(o->size() << LogHeapWordSize == (long)byte_size, >> "Object size is incorrect."); >> >> ############################## >> - This patch I had to apply to get it not asserting 
during >> slowdebug: >> >> --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 >> 15:15:16 2017 +0200 >> +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 >> 10:24:25 2017 +0200 >> @@ -32,3 +32,3 @@ >> // TODO(jcbeyler): should we make this into a JVMTI structure? >> -struct StackTraceData { >> +struct StackTraceData : CHeapObj { >> ASGCT_CallTrace *trace; >> @@ -143,3 +143,2 @@ >> StackTraceStorage::StackTraceStorage() : >> - _allocated_traces(new StackTraceData*[MaxHeapTraces]), >> _allocated_traces_size(MaxHeapTraces), >> @@ -147,2 +146,3 @@ >> _allocated_count(0) { >> + _allocated_traces = NEW_C_HEAP_ARRAY(StackTraceData*, >> MaxHeapTraces, mtInternal); >> memset(_allocated_traces, 0, sizeof(*_allocated_traces) * >> MaxHeapTraces); >> @@ -152,3 +152,3 @@ >> StackTraceStorage::~StackTraceStorage() { >> - delete[] _allocated_traces; >> + FREE_C_HEAP_ARRAY(StackTraceData*, _allocated_traces); >> } >> >> - Classes should extend correct base class for which type of >> memory is used for it e.g.: CHeapObj or StackObj or AllStatic >> - The style in heapMonitoring.cpp is a bit different from normal >> vm-style, e.g. using C++ casts instead of C. You mix NEW_C_HEAP_ARRAY, >> os::malloc and new. >> - In jvmtiHeapTransition.hpp you use C cast instead. >> >> ############################## >> - This patch I had apply to get traces without setting an >> ?unrelated? capability >> - Should this not be a new capability? 
>> >> diff -r c02a5d8785bf src/share/vm/prims/forte.cpp >> --- a/src/share/vm/prims/forte.cpp Fri Apr 28 15:15:16 2017 >> +0200 >> +++ b/src/share/vm/prims/forte.cpp Thu May 04 10:24:25 2017 >> +0200 >> @@ -530,6 +530,6 @@ >> >> - if (!JvmtiExport::should_post_class_load()) { >> +/* if (!JvmtiExport::should_post_class_load()) { >> trace->num_frames = ticks_no_class_load; // -1 >> return; >> - } >> + }*/ >> >> ############################## >> - forte.cpp: (I know this is not part of your changes but) >> find_jmethod_id_or_null give me NULL for my test. >> It looks like we actually want the regular jmethod_id() ? >> >> Since we are the thread we are talking about (and in same >> ucontext) and thread is in vm and have a last java frame, >> I think most of the checks done in AsyncGetCallTrace is >> irrelevant, so you should be-able to call forte_fill_call_trace_given_top >> directly. >> But since we might need jmethod_id() if possible to avoid getting >> method id NULL, >> we need some fixes in forte code, or just do the vframStream loop >> inside heapMonitoring.cpp and not use forte.cpp. >> >> Something like: >> >> if (jthread->has_last_Java_frame()) { // just to be safe >> vframeStream vfst(jthread); >> while (!vfst.at_end()) { >> Method* m = vfst.method(); >> m->jmethod_id(); >> m->line_number_from_bci(vfst.bci()); >> vfst.next(); >> } >> >> - This is a bit confusing in forte.cpp, >> trace->frames[count].lineno = bci. >> Line number should be m->line_number_from_bci(bci); >> Do the heapMonitoring suppose to trace with bci or line number? >> I would say bci, meaning we should either rename >> ASGCT_CallFrame?lineno or use another data structure which says bci. >> >> ############################## >> - // TODO(jcbeyler): remove this extra code handling the extra >> trace for >> Please fix all these TODO's :) >> >> ############################## >> - heapMonitoring.hpp: >> // TODO(jcbeyler): is this algorithm acceptable in open source? >> >> Why is this comment here? 
What is the implication? >> Have you tested any simpler algorithm? >> >> ############################## >> - Create a sanity jtreg test. (./hotspot/make/test/JtregNative.gmk >> for building the agent) >> >> ############################## >> - monitoring_period vs HeapMonitorRate, pick rate or period. >> >> ############################## >> - globals.hpp >> Why is MaxHeapTraces not settable/overridable from jvmti >> interface? That would be handy. >> >> ############################## >> - jvmtiStackTraceData + ASGCT_CallFrame memory >> Are the agent suppose to loop through and free all >> ASGCT_CallFrame? >> Wouldn't it be better with some kinda protocol, like: >> (*jvmti)->GetLiveTraces(jvmti, &stack_traces, &num_traces); >> (*jvmti)->ReleaseTraces(jvmti, stack_traces, num_traces); >> >> Also using another data structure that have num_traces inside it >> simplifies things. >> So I'm not convinced using the async structure is the best way >> forward. >> >> >> I have more questions, but I think it's better if you respond and >> update the code first. >> >> Thanks! >> >> /Robbin >> >> >> On 04/21/2017 11:34 PM, JC Beyler wrote: >> >> Hi all, >> >> I've added size information to the allocation sampling >> system. This allows the callback to remember the size of each sampled >> allocation. >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ < >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/> >> >> The new webrev.01 also adds the actual heap monitoring >> sampling system in files: >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh >> are/vm/runtime/heapMonitoring.cpp.patch >> > hare/vm/runtime/heapMonitoring.cpp.patch> >> and >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh >> are/vm/runtime/heapMonitoring.hpp.patch >> > hare/vm/runtime/heapMonitoring.hpp.patch> >> >> My next step is to add the GC part to the webrev, which will >> allow users to determine what objects are live and what are garbage. 
>> >> Thanks for your attention and let me know if there are any >> questions! >> >> Have a wonderful Friday! >> Jc >> >> On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler <jcbeyler at google.com> wrote: >> >> Hi all, >> >> I worked on getting a few numbers for overhead and >> accuracy for my feature. I'm unsure whether this is the right place to provide >> the full data, so I am just summarizing >> here for now. >> >> - Overhead of the feature >> >> Using the Dacapo benchmark (http://dacapobench.org/). >> My initial results are that sampling adds 2.4% overhead with 512k sampling, >> 512k being our default setting. >> >> - Note: this was without the tradesoap, tradebeans and >> tomcat benchmarks since they did not work with my JDK9 (issue between >> Dacapo and JDK9 it seems) >> - I want to rerun next week to ensure number stability >> >> - Accuracy of the feature >> >> I wrote a small microbenchmark that allocates from two >> different stacktraces at a given ratio. For example, 10% from stacktrace S1 >> and 90% from stacktrace S2. The >> microbenchmark was run 20 times; I averaged the results >> and looked for accuracy. It seems that statistically it is sound since if I >> allocated 10% S1 and 90% S2, with a >> sampling rate of 512k, I obtained 9.61% S1 and 90.49% S2. >> >> Let me know if there are any questions on the numbers >> and if you'd like to see some more data. >> >> Note: this was done using our internal JDK8 >> implementation since the webrev provided by >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html >> does not yet contain the whole >> implementation and therefore would have been misleading.
>> >> Thanks, >> Jc >> >> >> On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler <jcbeyler at google.com> wrote: >> >> Hi all, >> >> To move the discussion forward, with Chuck Rasbold's >> help to make a webrev, we pushed this: >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html >> 415 lines changed: 399 ins; 13 del; 3 mod; 51122 >> unchg >> >> This is not a final change that does the whole >> proposition from the JBS entry: https://bugs.openjdk.java.net/browse/JDK-8177374; what it does show is >> parts of the implementation that is >> proposed and hopefully can start the conversation going >> as I work through the details. >> >> For example, the changes to C2 are done here for the >> allocations: http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch >> >> Hopefully this all makes sense and thank you for all >> your future comments!
>> Jc >> >> >> On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler <jcbeyler at google.com> >> wrote: >> >> Hello all, >> >> This is a follow-up from Jeremy's initial email >> from last year: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2015-June/017543.html >> >> I've gone ahead and started working on preparing >> this and Jeremy and I went down the route of actually writing it up in JEP >> form: >> https://bugs.openjdk.java.net/browse/JDK-8171119 >> >> I think the original conversation that happened last >> year in that thread still holds true: >> >> - We have a patch at Google that we think >> others might be interested in >> - It provides a means to understand where >> the allocation hotspots are at a very low overhead >> - Since it is at a low overhead, we can >> leave it on by default >> >> So I come to the mailing list with Jeremy's >> initial question: >> "I thought I would ask if there is any interest >> / if I should write a JEP / if I should just forget it." >> >> A year ago, it seemed some thought it was a good >> idea, is this still true? >> >> Thanks, >> Jc >> >> >> >> >> >> >> >> From vladimir.kozlov at oracle.com Mon May 22 19:41:13 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 22 May 2017 12:41:13 -0700 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: <43893ed5-cbf5-4cf9-a727-8733850260de@oracle.com> Message-ID: Okay, this seems fine. I will run testing on this change and sponsor it if it passes. Thanks, Vladimir On 5/22/17 9:26 AM, Roland Westrelin wrote: > >> Call nodes have complicated graph of following control >> projections. You don't check for them. Why?
I think that was the >> reason that we scan memory edges instead of control when searching for >> calls. > > Catch and CatchProj? > > The logic I changed handles an array copy, once it is expanded, whose > destination is a non escaping array. > > Escape analysis only considers the destination of "validated" array > copies as non escaping: > > PointsToNode::EscapeState es = PointsToNode::ArgEscape; > if (call->is_ArrayCopy()) { > ArrayCopyNode* ac = call->as_ArrayCopy(); > if (ac->is_clonebasic() || > ac->is_arraycopy_validated() || > ac->is_copyof_validated() || > ac->is_copyofrange_validated()) { > es = PointsToNode::NoEscape; > } > } > > An array copy is "validated" when LibraryCallKit::inline_arraycopy() > inserted guards to validate that this arraycopy is a "good" arraycopy. > > In PhaseMacroExpand::generate_arraycopy() the stub calls that are > inserted are all connected through a single ProjNode to the result > Region except: > > 1- the slow arraycopy call which has Catch and CatchProj nodes. But with > validated arraycopies, there shouldn't be a slow arraycopy alone but at > least one stub call on one of the control path. > > 2- a checkcast arraycopy for which the return of the call needs to be > checked: checkcast arraycopy are explicitly skipped for validated > arraycopies. > > 3- if the copy size is null, PhaseMacroExpand::generate_arraycopy() > would generate no call but that special case should be handled by > ArrayCopyNode::Ideal. > > 4- A generic arraycopy (if C2 can't tell whether inputs to arraycopy are > arrays). > > I found that 4 can happen eventhough given the arraycopy is validated, > it could be avoided. So here is a new webrev which sets src_elem = > dest_elem in PhaseMacroExpand::expand_arraycopy_node() when dest_elem is > known and the arraycopy is validated. > > http://cr.openjdk.java.net/~roland/8179678/webrev.03/ > > Roland. 
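The escape-analysis rule quoted in Roland's reply above can be modelled outside of HotSpot as a small predicate. The following is only a sketch under stated assumptions: CallInfo, EscapeState and destination_escape_state are hypothetical stand-ins for the real CallNode/ArrayCopyNode queries and PointsToNode::EscapeState, and the struct fields simply mirror the four is_*() checks in the quoted snippet.

```cpp
#include <cassert>

// Hypothetical stand-in for PointsToNode::EscapeState (only the two
// states that appear in the quoted snippet).
enum class EscapeState { NoEscape, ArgEscape };

// Hypothetical stand-in for the node queries used in the quoted code.
struct CallInfo {
  bool is_array_copy;          // call->is_ArrayCopy()
  bool clonebasic;             // ac->is_clonebasic()
  bool arraycopy_validated;    // ac->is_arraycopy_validated()
  bool copyof_validated;       // ac->is_copyof_validated()
  bool copyofrange_validated;  // ac->is_copyofrange_validated()
};

// The destination is treated as non-escaping only for basic clones and
// "validated" array copies; every other call keeps ArgEscape.
inline EscapeState destination_escape_state(const CallInfo& call) {
  EscapeState es = EscapeState::ArgEscape;
  if (call.is_array_copy &&
      (call.clonebasic || call.arraycopy_validated ||
       call.copyof_validated || call.copyofrange_validated)) {
    es = EscapeState::NoEscape;
  }
  return es;
}
```

With this model, an unvalidated array copy (all four flags false) stays ArgEscape, which matches the point made in the reply: only the validated forms need to be handled by the changed logic.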
> From dean.long at oracle.com Mon May 22 22:41:02 2017 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 22 May 2017 15:41:02 -0700 Subject: Some JVMCI/Graal questions related to AOT Message-ID: 1) I'm working on "8132547: [AOT] support invokedynamic instructions" and I've hacked up jdk.vm.ci.hotspot.HotSpotConstantPool.java to handle things like the invokedynamic appendix differently. However, since this will only be used by AOT, I'm thinking I need to put my changes in an AOTHotSpotConstantPool subclass. My question is, where is a good place to put such as class (which hopefully won't require messing with modules)? 2) How can I tell if a ResolvedJavaType corresponds to a VM anonymous class (Klass::is_anonymous())? I can't rely on getFingerprint() returning 0, because I want fingerprints for anonymous classes. Is there something existing, or do I need to add something to JVMCI? dl From david.holmes at oracle.com Tue May 23 01:38:14 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 May 2017 11:38:14 +1000 Subject: RFR (S): 8180755: Remove use of bitMap.inline.hpp include from instanceKlass.hpp and c1_ValueSet.hpp In-Reply-To: <1495467593.2573.82.camel@oracle.com> References: <1495467593.2573.82.camel@oracle.com> Message-ID: <722e9800-5d1b-bc9e-7bff-16a228d367c2@oracle.com> Hi Thomas, This looks okay to me. A couple of comments: src/share/vm/oops/generateOopMap.cpp inline void GenerateOopMap::set_bbmark_bit(int bci) { _bb_hdr_bits.at_put(bci, true); } Does "inline" serve any purpose here? --- src/share/vm/oops/generateOopMap.hpp - void set_bbmark_bit (int bci) { - _bb_hdr_bits.at_put(bci, true); - } + inline void set_bbmark_bit (int bci); void clear_bbmark_bit (int bci) { _bb_hdr_bits.at_put(bci, false); } I don't understand why set_bbmark_bit had to be moved out but clear_bbmark_bit remains ?? 
Thanks, David ----- On 23/05/2017 1:39 AM, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that removes the use of > bitMap.inline.hpp from instanceKlass.hpp (which does not use bitMaps at > all) and c1_ValueSet.hpp as per guidelines? > > This is only seemingly a large change the reason is the required move > of methods that use the BitMap class in .hpp files into .inline.hpp > files (GC and compiler only). > > Also, this touches files from GC, runtime and compiler, so I sent this > RFR to all corresponding lists to get reviews from every group for > their respective files. It's mostly GC changes though. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8180755 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8180755/webrev/ > Testing: > jprt > > Thanks, > Thomas > From tobias.hartmann at oracle.com Tue May 23 06:09:08 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 23 May 2017 08:09:08 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: <43893ed5-cbf5-4cf9-a727-8733850260de@oracle.com> Message-ID: <31848120-9bfa-505a-60dd-fa8b1c0b7efa@oracle.com> Hi Roland, with your fix, compiler/arraycopy/TestEliminatedArrayCopyDeopt fails on all platforms with: ----------messages:(4/454)---------- command: main -XX:-BackgroundCompilation -XX:-UseOnStackReplacement -XX:+IgnoreUnrecognizedVMOptions -XX:-ReduceInitialCardMarks compiler.arraycopy.TestEliminatedArrayCopyDeopt reason: User specified action: run main/othervm -XX:-BackgroundCompilation -XX:-UseOnStackReplacement -XX:+IgnoreUnrecognizedVMOptions -XX:-ReduceInitialCardMarks compiler.arraycopy.TestEliminatedArrayCopyDeopt Mode: othervm [/othervm specified] elapsed time (seconds): 0.538 ----------configuration:(0/0)---------- ----------System.out:(1/10)---------- m1 failed ----------System.err:(13/825)---------- java.lang.RuntimeException: Test failed at 
compiler.arraycopy.TestEliminatedArrayCopyDeopt.main(TestEliminatedArrayCopyDeopt.java:206) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:563) at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115) at java.base/java.lang.Thread.run(Thread.java:844) Best regards, Tobias On 22.05.2017 21:41, Vladimir Kozlov wrote: > Okay, this seems fine. > > I will run testing on this change and sponsor if it pass. > > Thanks, > Vladimir > > On 5/22/17 9:26 AM, Roland Westrelin wrote: >> >>> Call nodes have complicated graph of following control >>> projections. You don't check for them. Why? I think that was the >>> reason that we scan memory edges instead of control when searching for >>> calls. >> >> Catch and CatchProj? >> >> The logic I changed handles an array copy, once it is expanded, whose >> destination is a non escaping array. >> >> Escape analysis only considers the destination of "validated" array >> copies as non escaping: >> >> PointsToNode::EscapeState es = PointsToNode::ArgEscape; >> if (call->is_ArrayCopy()) { >> ArrayCopyNode* ac = call->as_ArrayCopy(); >> if (ac->is_clonebasic() || >> ac->is_arraycopy_validated() || >> ac->is_copyof_validated() || >> ac->is_copyofrange_validated()) { >> es = PointsToNode::NoEscape; >> } >> } >> >> An array copy is "validated" when LibraryCallKit::inline_arraycopy() >> inserted guards to validate that this arraycopy is a "good" arraycopy. >> >> In PhaseMacroExpand::generate_arraycopy() the stub calls that are >> inserted are all connected through a single ProjNode to the result >> Region except: >> >> 1- the slow arraycopy call which has Catch and CatchProj nodes. 
But with >> validated arraycopies, there shouldn't be a slow arraycopy alone but at >> least one stub call on one of the control path. >> >> 2- a checkcast arraycopy for which the return of the call needs to be >> checked: checkcast arraycopy are explicitly skipped for validated >> arraycopies. >> >> 3- if the copy size is null, PhaseMacroExpand::generate_arraycopy() >> would generate no call but that special case should be handled by >> ArrayCopyNode::Ideal. >> >> 4- A generic arraycopy (if C2 can't tell whether inputs to arraycopy are >> arrays). >> >> I found that 4 can happen eventhough given the arraycopy is validated, >> it could be avoided. So here is a new webrev which sets src_elem = >> dest_elem in PhaseMacroExpand::expand_arraycopy_node() when dest_elem is >> known and the arraycopy is validated. >> >> http://cr.openjdk.java.net/~roland/8179678/webrev.03/ >> >> Roland. >> From goetz.lindenmaier at sap.com Tue May 23 06:29:38 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 23 May 2017 06:29:38 +0000 Subject: [ping] RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags Message-ID: <15526d8eef034bd48e793c086bf9b1ca@sap.com> Hi, could someone please sponsor this change? Final webrev: http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ Thanks, Goetz > -----Original Message----- > From: hotspot-compiler-dev [mailto:hotspot-compiler-dev- > bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz > Sent: Dienstag, 16. Mai 2017 12:08 > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining > flags > > Hi, > > could someone please sponsor this change? > Final webrev: > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ > > Thanks, > Goetz > > > -----Original Message----- > > From: Lindenmaier, Goetz > > Sent: Freitag, 12. 
Mai 2017 09:10 > > To: 'Thomas St?fe' > > Cc: hotspot-compiler-dev at openjdk.java.net > > Subject: RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and > Inlining > > flags > > > > Hi, > > > > > > > > could someone please sponsor? Thanks! > > > > > > > > I fixed the print statement. New webrev anyways: > > > > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ > > > > > > > > Best regards, > > > > Goetz. > > > > > > > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] > > Sent: Tuesday, May 09, 2017 7:54 PM > > To: Lindenmaier, Goetz > > Cc: hotspot-compiler-dev at openjdk.java.net > > Subject: Re: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and > Inlining > > flags > > > > > > > > Hi Goetz, > > > > > > > > On Tue, May 9, 2017 at 4:17 PM, Lindenmaier, Goetz > > > wrote: > > > > Hi Thomas, > > > > thanks for looking at my change. > > New webrev: > > http://cr.openjdk.java.net/~goetz/wr17/8179618- > > FlagRanges/webrev.02/ > > > > > c2_globals.hpp: > > > - range(0, max_intx) \ > > > + range(0, ((intx)MIN2((int64_t)max_intx,(int64_t)(+1.0e10)))) \ > > > 32bit: I would have expected a build warning for the cast. Is it okay > > that we can never reach the max value on 32bit? > > > > I double checked that there is no warning in our night builds and on > > linuxintel. > > > > > commandLineFlagConstraintsCompiler.cpp: > > > CommandLineError::print(verbose, > > > "OptoLoopAlignment (" INTX_FORMAT ") must be " > > > "multiple of NOP size\n"); > > > There is an error here, the print parameter is missing. Would have > > expected the compiler to complain, actually - at least the gcc. Again, curious. > > > > Thanks, good catch! The error was there before, but fixed anyways. I > > also > > added the NOP size. > > > > > > + // Relevant on ppc, s390, sparc. Will be optimized where > > + // addr_unit() == 1. 
> > if (OptoLoopAlignment % relocInfo::addr_unit() != 0) { > > CommandLineError::print(verbose, > > "OptoLoopAlignment (" INTX_FORMAT ") must be " > > - "multiple of NOP size\n"); > > + "multiple of NOP size (" INTX_FORMAT ")\n", > > + value, relocInfo::addr_unit()); > > > > We are getting there... > > > > > > > > addr_unit() returns int, so use %d, not INTX_FORMAT. > > > > > > > > Apart from that all is fine. No need for a new webrev. > > > > > > > > ..Thomas > > > > > > > > > > Best regards, > > Goetz. > > > > > > > > Kind Regards, Thomas > > > > > > On Thu, May 4, 2017 at 12:57 PM, Lindenmaier, Goetz > > > > > wrote: > > Hi, > > > > This change fixes range handling of a few flags of C2. > > This should go to jdk10, and later be downported to some > > update of jdk9. > > > > Please review this change. I please need a sponsor. > > http://cr.openjdk.java.net/~goetz/wr17/8179618- > > FlagRanges/webrev.01/ > > > > Class WarmCallInfo limits its values to 1.0e10, but the flags used > > to set it's fields (HotCallCountThreshold etc.) are limited by max_intx. > > Using values over 1.0e10 causes assertions in the debug build. > > > > OptoLoopAlignment must be a multiple of nop size, else it's not > > possible to generate the instructions that go into the pad. > > On x86 NOP size is 1, so it's no problem. > > For SPARC, OptoLoopAlignmentConstraintFunc implements a special > > case for bigger NOPs. This is also needed for s390 and ppc. > > I just removed the #define, as the code works also on platforms > > where NOPsize == 1. Actually, it should be optimized by the C > > compiler in these cases. > > > > Best regards, > > Goetz. 
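The two checks described in the original request above (capping the flag range at 1.0e10 and requiring OptoLoopAlignment to be a multiple of the NOP size) can be sketched in standalone C++. This is an illustration under assumptions, not the HotSpot code: intx_t, warm_call_flag_max and loop_alignment_ok are hypothetical names, intx_t stands in for HotSpot's intx, and the nop_size argument stands in for relocInfo::addr_unit().

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for HotSpot's intx (pointer-sized signed integer).
using intx_t = std::intptr_t;

// Upper bound for flags that feed WarmCallInfo: the smaller of the largest
// intx value and 1.0e10, mirroring the MIN2(...) expression in the thread.
// On a 32-bit build the result is simply the largest intx value.
inline intx_t warm_call_flag_max() {
  const std::int64_t max_intx = std::numeric_limits<intx_t>::max();
  const std::int64_t cap = static_cast<std::int64_t>(1.0e10);
  return static_cast<intx_t>(std::min(max_intx, cap));
}

// Padding before a loop head is filled with NOPs, so the alignment must be
// a whole number of NOPs. With nop_size == 1 (as on x86) every alignment
// passes, which is why the check does not need a platform #define.
inline bool loop_alignment_ok(intx_t alignment, int nop_size) {
  return nop_size > 0 && alignment % nop_size == 0;
}
```

For example, loop_alignment_ok(16, 4) holds while loop_alignment_ok(6, 4) does not, and any value passes when the NOP size is 1.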
> > > > From tobias.hartmann at oracle.com Tue May 23 06:41:33 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 23 May 2017 08:41:33 +0200 Subject: [ping] RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags In-Reply-To: <15526d8eef034bd48e793c086bf9b1ca@sap.com> References: <15526d8eef034bd48e793c086bf9b1ca@sap.com> Message-ID: Hi Goetz, On 23.05.2017 08:29, Lindenmaier, Goetz wrote: > could someone please sponsor this change? > Final webrev: > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ Sure, I'll sponsor it! Best regards, Tobias >> -----Original Message----- >> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev- >> bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz >> Sent: Dienstag, 16. Mai 2017 12:08 >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining >> flags >> >> Hi, >> >> could someone please sponsor this change? >> Final webrev: >> http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ >> >> Thanks, >> Goetz >> >>> -----Original Message----- >>> From: Lindenmaier, Goetz >>> Sent: Freitag, 12. Mai 2017 09:10 >>> To: 'Thomas St?fe' >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and >> Inlining >>> flags >>> >>> Hi, >>> >>> >>> >>> could someone please sponsor? Thanks! >>> >>> >>> >>> I fixed the print statement. New webrev anyways: >>> >>> http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ >>> >>> >>> >>> Best regards, >>> >>> Goetz. 
>>> >>> >>> >>> From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] >>> Sent: Tuesday, May 09, 2017 7:54 PM >>> To: Lindenmaier, Goetz >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and >> Inlining >>> flags >>> >>> >>> >>> Hi Goetz, >>> >>> >>> >>> On Tue, May 9, 2017 at 4:17 PM, Lindenmaier, Goetz >>> > wrote: >>> >>> Hi Thomas, >>> >>> thanks for looking at my change. >>> New webrev: >>> http://cr.openjdk.java.net/~goetz/wr17/8179618- >>> FlagRanges/webrev.02/ >>> >>> > c2_globals.hpp: >>> > - range(0, max_intx) \ >>> > + range(0, ((intx)MIN2((int64_t)max_intx,(int64_t)(+1.0e10)))) \ >>> > 32bit: I would have expected a build warning for the cast. Is it okay >>> that we can never reach the max value on 32bit? >>> >>> I double checked that there is no warning in our night builds and on >>> linuxintel. >>> >>> > commandLineFlagConstraintsCompiler.cpp: >>> > CommandLineError::print(verbose, >>> > "OptoLoopAlignment (" INTX_FORMAT ") must be " >>> > "multiple of NOP size\n"); >>> > There is an error here, the print parameter is missing. Would have >>> expected the compiler to complain, actually - at least the gcc. Again, curious. >>> >>> Thanks, good catch! The error was there before, but fixed anyways. I >>> also >>> added the NOP size. >>> >>> >>> + // Relevant on ppc, s390, sparc. Will be optimized where >>> + // addr_unit() == 1. >>> if (OptoLoopAlignment % relocInfo::addr_unit() != 0) { >>> CommandLineError::print(verbose, >>> "OptoLoopAlignment (" INTX_FORMAT ") must be " >>> - "multiple of NOP size\n"); >>> + "multiple of NOP size (" INTX_FORMAT ")\n", >>> + value, relocInfo::addr_unit()); >>> >>> We are getting there... >>> >>> >>> >>> addr_unit() returns int, so use %d, not INTX_FORMAT. >>> >>> >>> >>> Apart from that all is fine. No need for a new webrev. >>> >>> >>> >>> ..Thomas >>> >>> >>> >>> >>> Best regards, >>> Goetz. 
>>> >>> >>> >>> Kind Regards, Thomas >>> >>> >>> On Thu, May 4, 2017 at 12:57 PM, Lindenmaier, Goetz >>> > >>> wrote: >>> Hi, >>> >>> This change fixes range handling of a few flags of C2. >>> This should go to jdk10, and later be downported to some >>> update of jdk9. >>> >>> Please review this change. I please need a sponsor. >>> http://cr.openjdk.java.net/~goetz/wr17/8179618- >>> FlagRanges/webrev.01/ >>> >>> Class WarmCallInfo limits its values to 1.0e10, but the flags used >>> to set it's fields (HotCallCountThreshold etc.) are limited by max_intx. >>> Using values over 1.0e10 causes assertions in the debug build. >>> >>> OptoLoopAlignment must be a multiple of nop size, else it's not >>> possible to generate the instructions that go into the pad. >>> On x86 NOP size is 1, so it's no problem. >>> For SPARC, OptoLoopAlignmentConstraintFunc implements a special >>> case for bigger NOPs. This is also needed for s390 and ppc. >>> I just removed the #define, as the code works also on platforms >>> where NOPsize == 1. Actually, it should be optimized by the C >>> compiler in these cases. >>> >>> Best regards, >>> Goetz. >>> >>> > From lutz.schmidt at sap.com Tue May 23 07:29:03 2017 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Tue, 23 May 2017 07:29:03 +0000 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> Message-ID: <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> Vladimir, Volker, Triggered by your suggestions, I have read through the RTM code with extended diligence. What I came up with is this updated/extended webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.01/ for bug: https://bugs.openjdk.java.net/browse/JDK-8180612 For both x86 and ppc, I have added ranges to all numeric RTM flags. 
Their type is now "int". Could you please have a look and let me know what you don't like? Thanks and best regards, Lutz On 19.05.2017, 21:40, "Vladimir Kozlov" wrote: Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only. experimental(int, RTMTotalCountIncrRate, 64, \ "Increment total RTM attempted lock count once every n times") \ range(0, max_jint) \ Vladimir On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > Thank you, Volker > > I think all RTM tuning flags should be uint (unsigned 32bit int). > We did not have int/uint types when RTM was implemented. They were added 2 years ago: > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > Lets change type of RTM flags in all places. I will review and sponsor. > > thanks, > Vladimir > > On 5/19/17 12:02 PM, Volker Simonis wrote: >> Hi Lutz, Vladimir, >> >> @Lutz: thanks for fixing this. I think your change looks good. >> >> @Vladimir: thanks, but I think we can push this ourselves because it >> is ppc only. >> >> I've also realized that amd64 uses cmpptr() which takes the result of >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can >> be wrong if the result of the division is greater than 32 bit. I'm not >> sure how relevant that is, but maybe we could either change the types >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix >> the compare on amd64 to compare against a full 64 bit value. >> >> What do you think Vladimir - maybe do that as a follow up change or do >> you want to include it here (in which case you'd have to sponsor :) ? >> >> Thank you and best regards, >> Volker >> >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov >> wrote: >>> Hi Lutz, >>> >>> I can sponsor it but someone familiar with PPC have to review the fix. >>> >>> Thanks, >>> Vladimir >>> >>> >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: >>>> >>>> Hi all, >>>> >>>> May I kindly request reviews for this small fix?
A voluntary sponsor would >>>> be great as well! >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ >>>> >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to >>>> provide "well-behaved" values only. At least one jtreg test breaks this >>>> assumption. The fix makes code generation adapt to actual parameter values. >>>> >>>> Thanks, >>>> Lutz >>>> >>>> >>>> >>> From doug.simon at oracle.com Tue May 23 07:29:30 2017 From: doug.simon at oracle.com (Doug Simon) Date: Tue, 23 May 2017 09:29:30 +0200 Subject: Some JVMCI/Graal questions related to AOT In-Reply-To: References: Message-ID: > On 23 May 2017, at 00:41, dean.long at oracle.com wrote: > > 1) I'm working on "8132547: [AOT] support invokedynamic instructions" and I've hacked up jdk.vm.ci.hotspot.HotSpotConstantPool.java to handle things like the invokedynamic appendix differently. However, since this will only be used by AOT, I'm thinking I need to put my changes in an AOTHotSpotConstantPool subclass. My question is, where is a good place to put such as class (which hopefully won't require messing with modules)? Depending on the nature of the changes, I suspect they can simply be added to HotSpotConstantPool, guarded by a VM flag exposed by HotSpotVMConfig if necessary. HotSpotConstantPool is currently final and I don't see a natural place for an AOT specific subclass. > 2) How can I tell if a ResolvedJavaType corresponds to a VM anonymous class (Klass::is_anonymous())? I can't rely on getFingerprint() returning 0, because I want fingerprints for anonymous classes. Is there something existing, or do I need to add something to JVMCI? You'd need to add something to JVMCI by exposing the required flags and fields in HotSpotVMConfig.
-Doug From goetz.lindenmaier at sap.com Tue May 23 07:57:45 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 23 May 2017 07:57:45 +0000 Subject: [ping] RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and Inlining flags In-Reply-To: References: <15526d8eef034bd48e793c086bf9b1ca@sap.com> Message-ID: <10930ac381644f8a91273dd0e6e82cc5@sap.com> Hi Tobias, that was quick, thanks! Best regards, Goetz. > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Dienstag, 23. Mai 2017 08:42 > To: Lindenmaier, Goetz ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: [ping] RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment > and Inlining flags > > Hi Goetz, > > On 23.05.2017 08:29, Lindenmaier, Goetz wrote: > > could someone please sponsor this change? > > Final webrev: > > http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ > > Sure, I'll sponsor it! > > Best regards, > Tobias > > >> -----Original Message----- > >> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev- > >> bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz > >> Sent: Dienstag, 16. Mai 2017 12:08 > >> Cc: hotspot-compiler-dev at openjdk.java.net > >> Subject: RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and > Inlining > >> flags > >> > >> Hi, > >> > >> could someone please sponsor this change? > >> Final webrev: > >> http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ > >> > >> Thanks, > >> Goetz > >> > >>> -----Original Message----- > >>> From: Lindenmaier, Goetz > >>> Sent: Freitag, 12. Mai 2017 09:10 > >>> To: 'Thomas St?fe' > >>> Cc: hotspot-compiler-dev at openjdk.java.net > >>> Subject: RE: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and > >> Inlining > >>> flags > >>> > >>> Hi, > >>> > >>> > >>> > >>> could someone please sponsor? Thanks! > >>> > >>> > >>> > >>> I fixed the print statement. 
New webrev anyways: > >>> > >>> http://cr.openjdk.java.net/~goetz/wr17/8179618-FlagRanges/webrev.03/ > >>> > >>> > >>> > >>> Best regards, > >>> > >>> Goetz. > >>> > >>> > >>> > >>> From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] > >>> Sent: Tuesday, May 09, 2017 7:54 PM > >>> To: Lindenmaier, Goetz > >>> Cc: hotspot-compiler-dev at openjdk.java.net > >>> Subject: Re: RFR(S): 8179618: Fixes for range of OptoLoopAlignment and > >> Inlining > >>> flags > >>> > >>> > >>> > >>> Hi Goetz, > >>> > >>> > >>> > >>> On Tue, May 9, 2017 at 4:17 PM, Lindenmaier, Goetz > >>> > > wrote: > >>> > >>> Hi Thomas, > >>> > >>> thanks for looking at my change. > >>> New webrev: > >>> http://cr.openjdk.java.net/~goetz/wr17/8179618- > >>> FlagRanges/webrev.02/ > >>> > >>> > c2_globals.hpp: > >>> > - range(0, max_intx) \ > >>> > + range(0, ((intx)MIN2((int64_t)max_intx,(int64_t)(+1.0e10)))) \ > >>> > 32bit: I would have expected a build warning for the cast. Is it okay > >>> that we can never reach the max value on 32bit? > >>> > >>> I double checked that there is no warning in our night builds and on > >>> linuxintel. > >>> > >>> > commandLineFlagConstraintsCompiler.cpp: > >>> > CommandLineError::print(verbose, > >>> > "OptoLoopAlignment (" INTX_FORMAT ") must be " > >>> > "multiple of NOP size\n"); > >>> > There is an error here, the print parameter is missing. Would have > >>> expected the compiler to complain, actually - at least the gcc. Again, > curious. > >>> > >>> Thanks, good catch! The error was there before, but fixed anyways. I > >>> also > >>> added the NOP size. > >>> > >>> > >>> + // Relevant on ppc, s390, sparc. Will be optimized where > >>> + // addr_unit() == 1. 
> >>> if (OptoLoopAlignment % relocInfo::addr_unit() != 0) { > >>> CommandLineError::print(verbose, > >>> "OptoLoopAlignment (" INTX_FORMAT ") must be " > >>> - "multiple of NOP size\n"); > >>> + "multiple of NOP size (" INTX_FORMAT ")\n", > >>> + value, relocInfo::addr_unit()); > >>> > >>> We are getting there... > >>> > >>> > >>> > >>> addr_unit() returns int, so use %d, not INTX_FORMAT. > >>> > >>> > >>> > >>> Apart from that all is fine. No need for a new webrev. > >>> > >>> > >>> > >>> ..Thomas > >>> > >>> > >>> > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> > >>> > >>> Kind Regards, Thomas > >>> > >>> > >>> On Thu, May 4, 2017 at 12:57 PM, Lindenmaier, Goetz > >>> > > > >>> wrote: > >>> Hi, > >>> > >>> This change fixes range handling of a few flags of C2. > >>> This should go to jdk10, and later be downported to some > >>> update of jdk9. > >>> > >>> Please review this change. I please need a sponsor. > >>> http://cr.openjdk.java.net/~goetz/wr17/8179618- > >>> FlagRanges/webrev.01/ > >>> > >>> Class WarmCallInfo limits its values to 1.0e10, but the flags used > >>> to set it's fields (HotCallCountThreshold etc.) are limited by max_intx. > >>> Using values over 1.0e10 causes assertions in the debug build. > >>> > >>> OptoLoopAlignment must be a multiple of nop size, else it's not > >>> possible to generate the instructions that go into the pad. > >>> On x86 NOP size is 1, so it's no problem. > >>> For SPARC, OptoLoopAlignmentConstraintFunc implements a special > >>> case for bigger NOPs. This is also needed for s390 and ppc. > >>> I just removed the #define, as the code works also on platforms > >>> where NOPsize == 1. Actually, it should be optimized by the C > >>> compiler in these cases. > >>> > >>> Best regards, > >>> Goetz. 
> >>> > >>> > > From yasuenag at gmail.com Tue May 23 08:27:45 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 23 May 2017 17:27:45 +0900 Subject: [10] RFR: JDK-8160354: uninitialized value warning and VM crash are occurred with GCC 6 Message-ID: Hi all, I've posted review request [1]. I want to resume to work about it. This issue is marked as P3 bug [2]. I uploaded the webrev for jdk 10. Could you review it? http://cr.openjdk.java.net/~ysuenaga/JDK-8160354/webrev.02/ I cannot access JPRT. So I need a sponsor. Thanks, Yasumasa [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-June/023658.html [2] https://bugs.openjdk.java.net/browse/JDK-8160354 From tobias.hartmann at oracle.com Tue May 23 08:55:35 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 23 May 2017 10:55:35 +0200 Subject: [9] RFR(XS): 8180813: Null pointer dereference of CodeCache::find_blob() result Message-ID: Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8180813 http://cr.openjdk.java.net/~thartmann/8180813/webrev.00/ Fixed missing null checks on the result of CodeCache::find_blob() found by Parfait. Thanks, Tobias From shade at redhat.com Tue May 23 09:02:21 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 23 May 2017 11:02:21 +0200 Subject: [9] RFR(XS): 8180813: Null pointer dereference of CodeCache::find_blob() result In-Reply-To: References: Message-ID: <74f9c094-774e-9ad5-fc55-ef0a515cb43f@redhat.com> On 05/23/2017 10:55 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8180813 > http://cr.openjdk.java.net/~thartmann/8180813/webrev.00/ Looks okay to me. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From tobias.hartmann at oracle.com Tue May 23 09:05:45 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 23 May 2017 11:05:45 +0200 Subject: [9] RFR(XS): 8180813: Null pointer dereference of CodeCache::find_blob() result In-Reply-To: <74f9c094-774e-9ad5-fc55-ef0a515cb43f@redhat.com> References: <74f9c094-774e-9ad5-fc55-ef0a515cb43f@redhat.com> Message-ID: <5f485ac1-0f9e-58b6-e380-eeca284b3346@oracle.com> Hi Aleksey, On 23.05.2017 11:02, Aleksey Shipilev wrote: > On 05/23/2017 10:55 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180813 >> http://cr.openjdk.java.net/~thartmann/8180813/webrev.00/ > > Looks okay to me. Thanks for the review! Best regards, Tobias From kim.barrett at oracle.com Tue May 23 09:47:51 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 23 May 2017 05:47:51 -0400 Subject: [10] RFR: JDK-8160354: uninitialized value warning and VM crash are occurred with GCC 6 In-Reply-To: References: Message-ID: <8E5A6618-71DF-4B0C-A652-B6F8EADC74B3@oracle.com> > On May 23, 2017, at 4:27 AM, Yasumasa Suenaga wrote: > > Hi all, > > I've posted review request [1]. > I want to resume to work about it. > > This issue is marked as P3 bug [2]. > I uploaded the webrev for jdk 10. Could you review it? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8160354/webrev.02/ > > I cannot access JPRT. > So I need a sponsor. > > > Thanks, > > Yasumasa > > > [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-June/023658.html > [2] https://bugs.openjdk.java.net/browse/JDK-8160354 As discussed in the CR, I think this is closely related to JDK-8160404. I think the change to the RelocationHolder constructor being suggested here may be just papering over a symptom of that other bug. 
And I suspect that if JDK-8160404 were fixed then there wouldn't be any warnings in make_raw to be addressed in what looks to me like a kludgy workaround. That all makes me not a fan of the changes being proposed here. From rahul.v.raghavan at oracle.com Tue May 23 09:58:16 2017 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Tue, 23 May 2017 02:58:16 -0700 (PDT) Subject: [8u Backport] RFR: 8175345: possible null pointer dereference defects In-Reply-To: References: <38de2381-67bc-44d0-b0a8-a14dcd914a0e@default> <20170516162442.GC3102@vimes> <20170516170821.GE3102@vimes> Message-ID: <1678f805-ed3b-4237-99b5-95f089780121@default> Hi, I am sorry that the first webrev submitted for 8175345 backport was wrong. Missed to note that difference in code base and to run jprt test before submitting for review, approval. Patch from 9 does not apply cleanly. Apologies for the confusion. Request for review approval for the following revised webrev for jdk8u. : http://cr.openjdk.java.net/~rraghavan/8177429/webrev.01/ : https://bugs.openjdk.java.net/browse/JDK-8175345 The changes done in jdk9 were reviewed in open and pushed. 
: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-March/025798.html : http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/55e3f1f3d0a7 Above for jdk8u is same as jdk9 changes, except following changes in 'src/share/vm/opto/ifnode.cpp', due the difference in 'ProjNode::is_uncommon_trap_proj()' return type (returns 'bool' in jdk8u and 'CallStaticJavaNode*' in jdk9) So change done for jdk9 was - - if (unc_proj->is_uncommon_trap_proj(Deoptimization::Reason_predicate) != NULL) - prev_dom = idom; + if ((unc_proj != NULL) && (unc_proj->is_uncommon_trap_proj(Deoptimization::Reason_predicate) != NULL)) { + prev_dom = idom; + } And now the proposed change for jdk8u in above webrev.01 is - - if (unc_proj->is_uncommon_trap_proj(Deoptimization::Reason_predicate)) - prev_dom = idom; + if ((unc_proj != NULL) && (unc_proj->is_uncommon_trap_proj(Deoptimization::Reason_predicate))) { + prev_dom = idom; + } Thanks, Rahul > -----Original Message----- > From: Rahul Raghavan > Sent: Wednesday, May 17, 2017 10:14 AM > To: Robert Mckenna > Cc: jdk8u-dev at openjdk.java.net > Subject: RE: [8u Backport] RFR: 8175345: possible null pointer dereference defects > > > > -----Original Message----- > > From: Rob McKenna > Sent: Tuesday, May 16, 2017 10:38 PM > > > > ...also please add a suitable noreg label to the main bug. > > Done, thank you. > -Rahul > > > > > -Rob > > > > On 16/05/17 05:24, Rob McKenna wrote: > > > Updated subject line to reflect the correct bug id. > > > > > > Rahul, for future requests, please only refer to the main bug id. 
> > > > > > Approved > > > > > > -Rob > > > > > > On 15/05/17 11:50, Rahul Raghavan wrote: > > > > Hi, > > > > > > > > Request for approval - > > > > - http://cr.openjdk.java.net/~rraghavan/8175345/webrev.01/ > > > > > > > > - '49 Null pointer dereference defect groups in 21 files' - > > > > https://bugs.openjdk.java.net/browse/JDK-8177429 > > > > Backport of - > > > > https://bugs.openjdk.java.net/browse/JDK-8175345 > > > > > > > > With 8175345, the changes done in jdk9 were reviewed in open and committed. > > > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-March/025798.html > > > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/55e3f1f3d0a7 > > > > > > > > This backport fix is same as in jdk9. > > > > > > > > Thanks, > > > > Rahul From tobias.hartmann at oracle.com Tue May 23 10:19:49 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 23 May 2017 12:19:49 +0200 Subject: [8u Backport] RFR: 8175345: possible null pointer dereference defects In-Reply-To: <1678f805-ed3b-4237-99b5-95f089780121@default> References: <38de2381-67bc-44d0-b0a8-a14dcd914a0e@default> <20170516162442.GC3102@vimes> <20170516170821.GE3102@vimes> <1678f805-ed3b-4237-99b5-95f089780121@default> Message-ID: Hi Rahul, On 23.05.2017 11:58, Rahul Raghavan wrote: > Request for review approval for the following revised webrev for jdk8u. > : http://cr.openjdk.java.net/~rraghavan/8177429/webrev.01/ Looks good. Best regards, Tobias From thomas.schatzl at oracle.com Tue May 23 10:36:51 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 23 May 2017 12:36:51 +0200 Subject: RFR (S): 8180755: Remove use of bitMap.inline.hpp include from instanceKlass.hpp and c1_ValueSet.hpp In-Reply-To: <722e9800-5d1b-bc9e-7bff-16a228d367c2@oracle.com> References: <1495467593.2573.82.camel@oracle.com> <722e9800-5d1b-bc9e-7bff-16a228d367c2@oracle.com> Message-ID: <1495535811.2781.1.camel@oracle.com> Hi David, thanks for your review.
On Tue, 2017-05-23 at 11:38 +1000, David Holmes wrote: > Hi Thomas, > > This looks okay to me. A couple of comments: > > src/share/vm/oops/generateOopMap.cpp > > inline void GenerateOopMap::set_bbmark_bit(int bci) { > _bb_hdr_bits.at_put(bci, true); > } > > Does "inline" serve any purpose here? Removed. > > --- > > src/share/vm/oops/generateOopMap.hpp > > - void set_bbmark_bit (int bci) { > - _bb_hdr_bits.at_put(bci, true); > - } > + inline void set_bbmark_bit (int bci); > void clear_bbmark_bit (int bci) { > _bb_hdr_bits.at_put(bci, false); > } > > I don't understand why set_bbmark_bit had to be moved out but > clear_bbmark_bit remains ?? > It is never referenced. Removed the method. http://cr.openjdk.java.net/~tschatzl/8180755/webrev.0_to_1 (diff) http://cr.openjdk.java.net/~tschatzl/8180755/webrev.1 (full) Thanks, Thomas From lutz.schmidt at sap.com Tue May 23 10:47:19 2017 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Tue, 23 May 2017 10:47:19 +0000 Subject: [10] [s390] RFR(XS): micro-optimization in resize_frame_absolute() Message-ID: <26853C9C-4CBE-4551-ACAB-20509AB82478@sap.com> Dear all, I would like to request reviews for this tiny, s390-only enhancement: Bug: https://bugs.openjdk.java.net/browse/JDK-8180659 Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180659.00/ Thank you! Lutz -------------- next part -------------- An HTML attachment was scrubbed...
URL: From david.holmes at oracle.com Tue May 23 12:31:35 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 May 2017 22:31:35 +1000 Subject: RFR (S): 8180755: Remove use of bitMap.inline.hpp include from instanceKlass.hpp and c1_ValueSet.hpp In-Reply-To: <1495535811.2781.1.camel@oracle.com> References: <1495467593.2573.82.camel@oracle.com> <722e9800-5d1b-bc9e-7bff-16a228d367c2@oracle.com> <1495535811.2781.1.camel@oracle.com> Message-ID: <1ac8dcac-b750-9e60-addc-0f80b2644943@oracle.com> On 23/05/2017 8:36 PM, Thomas Schatzl wrote: > Hi David, > > thanks for your review. > > On Tue, 2017-05-23 at 11:38 +1000, David Holmes wrote: >> Hi Thomas, >> >> This looks okay to me. A couple of comments: >> >> src/share/vm/oops/generateOopMap.cpp >> >> inline void GenerateOopMap::set_bbmark_bit(int bci) { >> _bb_hdr_bits.at_put(bci, true); >> } >> >> Does "inline" serve any purpose here? > > Removed. It is still on the declaration in the header file: inline void set_bbmark_bit (int bci); >> >> --- >> >> src/share/vm/oops/generateOopMap.hpp >> >> - void set_bbmark_bit (int bci) { >> - _bb_hdr_bits.at_put(bci, true); >> - } >> + inline void set_bbmark_bit (int bci); >> void clear_bbmark_bit (int bci) { >> _bb_hdr_bits.at_put(bci, false); >> } >> >> I don't understand why set_bbmark_bit had to be moved out but >> clear_bbmark_bit remains ?? >> > > It is never referenced. Removed the method. Ok. 
Thanks, David > http://cr.openjdk.java.net/~tschatzl/8180755/webrev.0_to_1 (diff) > http://cr.openjdk.java.net/~tschatzl/8180755/webrev.1 (full) > > Thanks, > Thomas > From yasuenag at gmail.com Tue May 23 12:38:30 2017 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 23 May 2017 21:38:30 +0900 Subject: [10] RFR: JDK-8160354: uninitialized value warning and VM crash are occurred with GCC 6 In-Reply-To: <8E5A6618-71DF-4B0C-A652-B6F8EADC74B3@oracle.com> References: <8E5A6618-71DF-4B0C-A652-B6F8EADC74B3@oracle.com> Message-ID: <15aff59f-d062-4fcb-8af5-809405269119@gmail.com> Hi Kim, On 2017/05/23 18:47, Kim Barrett wrote: >> On May 23, 2017, at 4:27 AM, Yasumasa Suenaga wrote: >> >> Hi all, >> >> I've posted review request [1]. >> I want to resume to work about it. >> >> This issue is marked as P3 bug [2]. >> I uploaded the webrev for jdk 10. Could you review it? >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8160354/webrev.02/ >> >> I cannot access JPRT. >> So I need a sponsor. >> >> >> Thanks, >> >> Yasumasa >> >> >> [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-June/023658.html >> [2] https://bugs.openjdk.java.net/browse/JDK-8160354 > > As discussed in the CR, I think this is closely related to > JDK-8160404. I think the change to the RelocationHolder constructor > being suggested here may be just papering over a symptom of that other > bug. Will the change of JDK-8160404 initialize _relocbuf in RelocationHolder constructor? IMHO all C++ class member should initialize before reading, and c'tor is useful for it. > And I suspect that if JDK-8160404 were fixed then there wouldn't be > any warnings in make_raw to be addressed in what looks to me like a > kludgy workaround. I don't know how to fix it in JDK-8160404. Should I wait to fix JDK-8160404? Thanks, Yasumasa > That all makes me not a fan of the changes being proposed here. 
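The question in the message above is whether a raw buffer member should be initialized in the constructor. A minimal sketch of that option, using a hypothetical `Holder` class (not HotSpot's `RelocationHolder`, whose layout is not shown in this thread): value-initializing the array in the member initializer list zero-fills every element, so a later read of an element that was never written still sees a defined value.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical class for illustration only; not HotSpot's RelocationHolder.
class Holder {
 public:
  static const std::size_t kSize = 8;

  // Value-initializes every element of _buf to zero, so any later read
  // sees a defined value even if only a prefix has been written.
  Holder() : _buf() {}

  void set(std::size_t i, std::uintptr_t v) { _buf[i] = v; }
  std::uintptr_t get(std::size_t i) const { return _buf[i]; }

 private:
  std::uintptr_t _buf[kSize];
};
```

Whether this is the right fix for the actual bug is exactly what the thread leaves open; the sketch only shows the mechanism being proposed.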
> From thomas.schatzl at oracle.com Tue May 23 13:01:53 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 23 May 2017 15:01:53 +0200 Subject: RFR (S): 8180755: Remove use of bitMap.inline.hpp include from instanceKlass.hpp and c1_ValueSet.hpp In-Reply-To: <1ac8dcac-b750-9e60-addc-0f80b2644943@oracle.com> References: <1495467593.2573.82.camel@oracle.com> <722e9800-5d1b-bc9e-7bff-16a228d367c2@oracle.com> <1495535811.2781.1.camel@oracle.com> <1ac8dcac-b750-9e60-addc-0f80b2644943@oracle.com> Message-ID: <1495544513.2781.44.camel@oracle.com> Hi David, On Tue, 2017-05-23 at 22:31 +1000, David Holmes wrote: > On 23/05/2017 8:36 PM, Thomas Schatzl wrote: > > > > Hi David, > > > > thanks for your review. > > > > On Tue, 2017-05-23 at 11:38 +1000, David Holmes wrote: > > > > > > Hi Thomas, > > > > > > This looks okay to me. A couple of comments: > > > > > > src/share/vm/oops/generateOopMap.cpp > > > > > > inline void GenerateOopMap::set_bbmark_bit(int bci) { > > > _bb_hdr_bits.at_put(bci, true); > > > } > > > > > > Does "inline" serve any purpose here? > > Removed. > It is still on the declaration in the header file: > > inline void set_bbmark_bit (int bci); I updated the webrev. > > http://cr.openjdk.java.net/~tschatzl/8180755/webrev.0_to_1 (diff) > > http://cr.openjdk.java.net/~tschatzl/8180755/webrev.1 (full) > > Thanks,
Thomas From vladimir.kozlov at oracle.com Tue May 23 15:13:42 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 23 May 2017 08:13:42 -0700 Subject: RFR (S): 8180755: Remove use of bitMap.inline.hpp include from instanceKlass.hpp and c1_ValueSet.hpp In-Reply-To: <1495544513.2781.44.camel@oracle.com> References: <1495467593.2573.82.camel@oracle.com> <722e9800-5d1b-bc9e-7bff-16a228d367c2@oracle.com> <1495535811.2781.1.camel@oracle.com> <1ac8dcac-b750-9e60-addc-0f80b2644943@oracle.com> <1495544513.2781.44.camel@oracle.com> Message-ID: <1bb9fc26-8d99-0660-6ef8-5b45226a7b0a@oracle.com> C1 changes are fine. Thanks, Vladimir On 5/23/17 6:01 AM, Thomas Schatzl wrote: > Hi David, > > On Tue, 2017-05-23 at 22:31 +1000, David Holmes wrote: >> On 23/05/2017 8:36 PM, Thomas Schatzl wrote: >>> >>> Hi David, >>> >>> thanks for your review. >>> >>> On Tue, 2017-05-23 at 11:38 +1000, David Holmes wrote: >>>> >>>> Hi Thomas, >>>> >>>> This looks okay to me. A couple of comments: >>>> >>>> src/share/vm/oops/generateOopMap.cpp >>>> >>>> inline void GenerateOopMap::set_bbmark_bit(int bci) { >>>> _bb_hdr_bits.at_put(bci, true); >>>> } >>>> >>>> Does "inline" serve any purpose here? >>> Removed. >> It is still on the declaration in the header file: >> >> inline void set_bbmark_bit (int bci); > > I updated the webrev. > >>> http://cr.openjdk.java.net/~tschatzl/8180755/webrev.0_to_1 (diff) >>> http://cr.openjdk.java.net/~tschatzl/8180755/webrev.1 (full) >>> > > Thanks, > Thomas > From vladimir.kozlov at oracle.com Tue May 23 15:19:54 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 23 May 2017 08:19:54 -0700 Subject: [9] RFR(XS): 8180813: Null pointer dereference of CodeCache::find_blob() result In-Reply-To: References: Message-ID: <5e1fac94-8578-7864-d3d8-6a21e625b57f@oracle.com> Good. 
Vladimir On 5/23/17 1:55 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8180813 > http://cr.openjdk.java.net/~thartmann/8180813/webrev.00/ > > Fixed missing null checks on the result of CodeCache::find_blob() found by Parfait. > > Thanks, > Tobias > From volker.simonis at gmail.com Tue May 23 15:25:09 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 23 May 2017 17:25:09 +0200 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> Message-ID: Hi Lutz, in general the change looks good. I think in globals_ppc.hpp the minimal value for RTMTotalCountIncrRate should be 1 (as on x86) to avoid division by zero errors. What is the rational behind restricting some parameters to 16-bit immediates on ppc while handling bigger immediate values in the code generation for some other parameters? Wouldn't it be easier to restrict all parameters to 16 bit on ppc? Thank you and best regards, Volker On Tue, May 23, 2017 at 9:29 AM, Schmidt, Lutz wrote: > Vladimir, Volker, > > Triggered by your suggestions, I have read through the RTM code with extended diligence. What I came up with is this updated/extended > webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.01/ > for bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > For both x86 and ppc, I have added ranges to all numeric RTM flags. Their type is now "int". Could you please have a look and let me know what you don't like? > > Thanks and best regards, > Lutz > > > > On 19.05.2017, 21:40, "Vladimir Kozlov" wrote: > > Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only.
> > experimental(int, RTMTotalCountIncrRate, 64, \ > "Increment total RTM attempted lock count once every n times") \ > range(0, max_jint) \ > > Vladimir > > On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > > Thank you, Volker > > > > I think all RTM tuning flags should be uint (unsigned 32bit int). > > We did not have int/uint types when RTM was implemented. They were added 2 years ago: > > > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > > > Lets change type of RTM flags in all places. I will review and sponsor. > > > > thanks, > > Vladimir > > > > On 5/19/17 12:02 PM, Volker Simonis wrote: > >> Hi Lutz, Vladimir, > >> > >> @Lutz: thanks for fixing this. I think your change looks good. > >> > >> @Vladimir: thanks, but I think we can push this ourselves because it > >> is ppc only. > >> > >> I've also realized that amd64 uses cmpptr() which takes the result of > >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can > >> be wrong if the result of the division is greater than 32 bit. I'm not > >> sure how relevant that is, but maybe we could either change the types > >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix > >> the compare on amd64 to compare against a full 64 bit value. > >> > >> What do you think Vladimir - maybe do that as a follow up change or do > >> you want to include it here (in which case you'd have to sponsor :) ? > >> > >> Thank you and best regards, > >> Volker > >> > >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov > >> wrote: > >>> Hi Lutz, > >>> > >>> I can sponsor it but someone familiar with PPC have to review the fix. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> > >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: > >>>> > >>>> Hi all, > >>>> > >>>> May I kindly request reviews for this small fix? A voluntary sponsor would > >>>> be great as well! 
> >>>> > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ > >>>> > >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to > >>>> provide "well-behaved" values only. At least one jtreg test breaks this > >>>> assumption. The fix makes code generation adapt to actual parameter values. > >>>> > >>>> Thanks, > >>>> Lutz > >>>> > >>>> > >>>> > >>> > > > > > > From vladimir.kozlov at oracle.com Tue May 23 15:29:35 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 23 May 2017 08:29:35 -0700 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> Message-ID: <47e3a28a-8e38-2f2b-ca93-94f321a7fdfe@oracle.com> x86 changes looks good. Thanks, Vladimir On 5/23/17 12:29 AM, Schmidt, Lutz wrote: > Vladimir, Volker, > > Triggered by your suggestions, I have read through the RTM code with extended diligence. What I came up with is this updated/extended > webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.01/ > for bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > For both x86 and ppc, I have added ranges to all numeric RTM flags. Their type is now "int". Could you please have a look and let me know what you don't like? > > Thanks and best regards, > Lutz > > > > On 19.05.2017, 21:40, "Vladimir Kozlov" wrote: > > Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only.
> > experimental(int, RTMTotalCountIncrRate, 64, \ > "Increment total RTM attempted lock count once every n times") \ > range(0, max_jint) \ > > Vladimir > > On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > > Thank you, Volker > > > > I think all RTM tuning flags should be uint (unsigned 32bit int). > > We did not have int/uint types when RTM was implemented. They were added 2 years ago: > > > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > > > Lets change type of RTM flags in all places. I will review and sponsor. > > > > thanks, > > Vladimir > > > > On 5/19/17 12:02 PM, Volker Simonis wrote: > >> Hi Lutz, Vladimir, > >> > >> @Lutz: thanks for fixing this. I think your change looks good. > >> > >> @Vladimir: thanks, but I think we can push this ourselves because it > >> is ppc only. > >> > >> I've also realized that amd64 uses cmpptr() which takes the result of > >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can > >> be wrong if the result of the division is greater than 32 bit. I'm not > >> sure how relevant that is, but maybe we could either change the types > >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix > >> the compare on amd64 to compare against a full 64 bit value. > >> > >> What do you think Vladimir - maybe do that as a follow up change or do > >> you want to include it here (in which case you'd have to sponsor :) ? > >> > >> Thank you and best regards, > >> Volker > >> > >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov > >> wrote: > >>> Hi Lutz, > >>> > >>> I can sponsor it but someone familiar with PPC have to review the fix. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> > >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: > >>>> > >>>> Hi all, > >>>> > >>>> May I kindly request reviews for this small fix? A voluntary sponsor would > >>>> be great as well! 
> >>>> > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ > >>>> > >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to > >>>> provide "well-behaved" values only. At least one jtreg test breaks this > >>>> assumption. The fix makes code generation adapt to actual parameter values. > >>>> > >>>> Thanks, > >>>> Lutz > >>>> > >>>> > >>>> > >>> > > > > > > From thomas.schatzl at oracle.com Tue May 23 15:32:55 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 23 May 2017 17:32:55 +0200 Subject: RFR (S): 8180755: Remove use of bitMap.inline.hpp include from instanceKlass.hpp and c1_ValueSet.hpp In-Reply-To: <1bb9fc26-8d99-0660-6ef8-5b45226a7b0a@oracle.com> References: <1495467593.2573.82.camel@oracle.com> <722e9800-5d1b-bc9e-7bff-16a228d367c2@oracle.com> <1495535811.2781.1.camel@oracle.com> <1ac8dcac-b750-9e60-addc-0f80b2644943@oracle.com> <1495544513.2781.44.camel@oracle.com> <1bb9fc26-8d99-0660-6ef8-5b45226a7b0a@oracle.com> Message-ID: <1495553575.2781.49.camel@oracle.com> Hi, On Tue, 2017-05-23 at 08:13 -0700, Vladimir Kozlov wrote: > C1 changes are fine. > > Thanks, > Vladimir > thanks for your review. Thanks, Thomas From dean.long at oracle.com Tue May 23 19:11:37 2017 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 23 May 2017 12:11:37 -0700 Subject: Some JVMCI/Graal questions related to AOT In-Reply-To: References: Message-ID: <22330f24-9bbe-3055-41e4-35b1f6854fa4@oracle.com> Thanks, I'll try that. dl On 5/23/17 12:29 AM, Doug Simon wrote: >> On 23 May 2017, at 00:41, dean.long at oracle.com wrote: >> >> 1) I'm working on "8132547: [AOT] support invokedynamic instructions" and I've hacked up jdk.vm.ci.hotspot.HotSpotConstantPool.java to handle things like the invokedynamic appendix differently. However, since this will only be used by AOT, I'm thinking I need to put my changes in an AOTHotSpotConstantPool subclass.
My question is, where is a good place to put such as class (which hopefully won't require messing with modules)? > Depending on the nature of the changes, I suspect they can simply be added to HotSpotConstantPool, guarded by a VM flag exposed by HotSpotVMConfig if necessary. HotSpotConstantPool is currently final and I don't see a natural place for an AOT specific subclass > >> 2) How can I tell if a ResolvedJavaType corresponds to a VM anonymous class (Klass::is_anonymous())? I can't rely on getFingerprint() returning 0, because I want fingerprints for anonymous classes. Is there something existing, or do I need to add something to JVMCI? > You'd need to add something to JVMCI by exposing the required flags and fields in HotSpotVMConfig. > > -Doug From kim.barrett at oracle.com Tue May 23 20:38:56 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 23 May 2017 16:38:56 -0400 Subject: [10] RFR: JDK-8160354: uninitialized value warning and VM crash are occurred with GCC 6 In-Reply-To: <15aff59f-d062-4fcb-8af5-809405269119@gmail.com> References: <8E5A6618-71DF-4B0C-A652-B6F8EADC74B3@oracle.com> <15aff59f-d062-4fcb-8af5-809405269119@gmail.com> Message-ID: <7DE24441-3E26-48CF-8C4E-674274A8A941@oracle.com> > On May 23, 2017, at 8:38 AM, Yasumasa Suenaga wrote: > > Hi Kim, > > On 2017/05/23 18:47, Kim Barrett wrote: >>> On May 23, 2017, at 4:27 AM, Yasumasa Suenaga wrote: >>> >>> Hi all, >>> >>> I've posted review request [1]. >>> I want to resume to work about it. >>> >>> This issue is marked as P3 bug [2]. >>> I uploaded the webrev for jdk 10. Could you review it? >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8160354/webrev.02/ >>> >>> I cannot access JPRT. >>> So I need a sponsor. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-June/023658.html >>> [2] https://bugs.openjdk.java.net/browse/JDK-8160354 >> >> As discussed in the CR, I think this is closely related to >> JDK-8160404. 
I think the change to the RelocationHolder constructor >> being suggested here may be just papering over a symptom of that other >> bug. > > Will the change of JDK-8160404 initialize _relocbuf in RelocationHolder constructor? > IMHO all C++ class member should initialize before reading, and c'tor is useful for it. I don't know exactly how JDK-8160404 will be fixed. I think it will involve initializing at least some of the elements of _relocbuf in a more obvious way that should be visible to the compiler. I'm not sure the trailing elements need to be initialized, and it would be wasteful to unconditionally fill with zeros and then overwrite the prefix. Let's see what comes out of work on JDK-8160404. >> And I suspect that if JDK-8160404 were fixed then there wouldn't be >> any warnings in make_raw to be addressed in what looks to me like a >> kludgy workaround. > > I don't know how to fix it in JDK-8160404. > Should I wait to fix JDK-8160404? Someone else is working on JDK-8160404. I think a fix for it ought to solve this one too. From tobias.hartmann at oracle.com Wed May 24 05:03:38 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 24 May 2017 07:03:38 +0200 Subject: [9] RFR(XS): 8180813: Null pointer dereference of CodeCache::find_blob() result In-Reply-To: <5e1fac94-8578-7864-d3d8-6a21e625b57f@oracle.com> References: <5e1fac94-8578-7864-d3d8-6a21e625b57f@oracle.com> Message-ID: <57eacbf0-db7c-afed-20de-14a9f68f8aee@oracle.com> Hi Vladimir, thanks for the review! Best regards, Tobias On 23.05.2017 17:19, Vladimir Kozlov wrote: > Good. > > Vladimir > > On 5/23/17 1:55 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180813 >> http://cr.openjdk.java.net/~thartmann/8180813/webrev.00/ >> >> Fixed missing null checks on the result of CodeCache::find_blob() found by Parfait.
>> >> Thanks, >> Tobias >> From rwestrel at redhat.com Wed May 24 07:55:24 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 24 May 2017 09:55:24 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: <31848120-9bfa-505a-60dd-fa8b1c0b7efa@oracle.com> References: <43893ed5-cbf5-4cf9-a727-8733850260de@oracle.com> <31848120-9bfa-505a-60dd-fa8b1c0b7efa@oracle.com> Message-ID: > with your fix, compiler/arraycopy/TestEliminatedArrayCopyDeopt fails > on all platforms with: Thanks. The problem here is that with -XX:-ReduceInitialCardMarks, C2 adds a g1 post barrier between the arraycopy node and the following membar that ArrayCopyNode::may_modify() doesn't expect. Here is a new webrev: http://cr.openjdk.java.net/~roland/8179678/webrev.04/ with a fixed ArrayCopyNode::may_modify(). I also added verification code that calls ArrayCopyNode::may_modify() after an array copy node is expanded to verify that ArrayCopyNode::may_modify() is consistent with the just expanded subgraph. Roland. From lutz.schmidt at sap.com Wed May 24 09:13:41 2017 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 24 May 2017 09:13:41 +0000 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> Message-ID: <181C18A7-D259-48EB-9553-BC47980350E6@sap.com> Hi Volker, thanks for looking at the change and for your findings. To set RTMTotalCountIncrRate=1 was my intention. Obviously, it escaped my attention on ppc. Fixed. Webrev updated in-place. I tried to keep the allowed parameter range as wide as possible. Unfortunately, some of the parameters are used as immediate operands at places where an additional temp register is not readily available. 
RTMLockingThreshold, for example, is only used after being divided by RTMTotalCountIncrRate. Except for RTMTotalCountIncrRate=1, the quotient will be significantly smaller than the original parameter value. Limiting the range of RTMLockingThreshold to int16 seems too restrictive to me. In addition, at that place in the code there is a register available to smoothly handle the "overflow" case. I would like to keep the ranges as they are. Regards, Lutz On 23.05.2017, 17:25, "Volker Simonis" wrote: Hi Lutz, in general the change looks good. I think in globals_ppc.hpp the minimal value for RTMTotalCountIncrRate should be 1 (as on x86) to avoid division by zero errors. What is the rational behind restricting some parameters to 16-bit immediates on ppc while handling bigger immediate values in the code generation for some other parameters? Wouldn't it be easier to restrict all parameters to 16 bit on ppc? Thank you and best regards, Volker On Tue, May 23, 2017 at 9:29 AM, Schmidt, Lutz wrote: > Vladimir, Volker, > > Triggered by your suggestions, I have read through the RTM code with extended diligence. What I came up with is this updated/extended > webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.01/ > for bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > For both x86 and ppc, I have added ranges to all numeric RTM flags. Their type is now "int". Could you please have a look and let me know what you don't like? > > Thanks and best regards, > Lutz > > > > On 19.05.2017, 21:40, "Vladimir Kozlov" wrote: > > Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only. > > experimental(int, RTMTotalCountIncrRate, 64, \ > "Increment total RTM attempted lock count once every n times") \ > range(0, max_jint) \ > > Vladimir > > On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > > Thank you, Volker > > > > I think all RTM tuning flags should be uint (unsigned 32bit int).
> > We did not have int/uint types when RTM was implemented. They were added 2 years ago: > > > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > > > Lets change type of RTM flags in all places. I will review and sponsor. > > > > thanks, > > Vladimir > > > > On 5/19/17 12:02 PM, Volker Simonis wrote: > >> Hi Lutz, Vladimir, > >> > >> @Lutz: thanks for fixing this. I think your change looks good. > >> > >> @Vladimir: thanks, but I think we can push this ourselves because it > >> is ppc only. > >> > >> I've also realized that amd64 uses cmpptr() which takes the result of > >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can > >> be wrong if the result of the division is greater than 32 bit. I'm not > >> sure how relevant that is, but maybe we could either change the types > >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix > >> the compare on amd64 to compare against a full 64 bit value. > >> > >> What do you think Vladimir - maybe do that as a follow up change or do > >> you want to include it here (in which case you'd have to sponsor :) ? > >> > >> Thank you and best regards, > >> Volker > >> > >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov > >> wrote: > >>> Hi Lutz, > >>> > >>> I can sponsor it but someone familiar with PPC have to review the fix. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> > >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: > >>>> > >>>> Hi all, > >>>> > >>>> May I kindly request reviews for this small fix? A voluntary sponsor would > >>>> be great as well! > >>>> > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ > >>>> > >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to > >>>> provide ?well-behaved? values only. At least one jtreg test breaks this > >>>> assumption. The fix makes code generation adapt to actual parameter values. 
> >>>> > >>>> Thanks, > >>>> Lutz > >>>> > >>>> > >>>> > >>> > > > > > > From lutz.schmidt at sap.com Wed May 24 09:19:28 2017 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 24 May 2017 09:19:28 +0000 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: <47e3a28a-8e38-2f2b-ca93-94f321a7fdfe@oracle.com> References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> <47e3a28a-8e38-2f2b-ca93-94f321a7fdfe@oracle.com> Message-ID: <1F449344-A37E-4117-8E1F-CDA279CA26AE@sap.com> Thank you, Vladimir! Regards, Lutz Dr. Lutz Schmidt | SAP JVM | PI SAP CP Core | T: +49 (6227) 7-42834 On 23.05.2017, 17:29, "Vladimir Kozlov" wrote: x86 changes looks good. Thanks, Vladimir On 5/23/17 12:29 AM, Schmidt, Lutz wrote: > Vladimir, Volker, > > Triggered by your suggestions, I have read through the RTM code with extended diligence. What I came up with is this updated/extended > webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.01/ > for bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > For both x86 and ppc, I have added ranges to all numeric RTM flags. Their type is now ?int?. Could you please have a look and let me know what you don?t like? > > Thanks and best regards, > Lutz > > > > On 19.05.2017, 21:40, "Vladimir Kozlov" wrote: > > Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only. > > experimental(int, RTMTotalCountIncrRate, 64, \ > "Increment total RTM attempted lock count once every n times") \ > range(0, max_jint) \ > > Vladimir > > On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > > Thank you, Volker > > > > I think all RTM tuning flags should be uint (unsigned 32bit int). > > We did not have int/uint types when RTM was implemented. 
They were added 2 years ago: > > > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > > > Lets change type of RTM flags in all places. I will review and sponsor. > > > > thanks, > > Vladimir > > > > On 5/19/17 12:02 PM, Volker Simonis wrote: > >> Hi Lutz, Vladimir, > >> > >> @Lutz: thanks for fixing this. I think your change looks good. > >> > >> @Vladimir: thanks, but I think we can push this ourselves because it > >> is ppc only. > >> > >> I've also realized that amd64 uses cmpptr() which takes the result of > >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can > >> be wrong if the result of the division is greater than 32 bit. I'm not > >> sure how relevant that is, but maybe we could either change the types > >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix > >> the compare on amd64 to compare against a full 64 bit value. > >> > >> What do you think Vladimir - maybe do that as a follow up change or do > >> you want to include it here (in which case you'd have to sponsor :) ? > >> > >> Thank you and best regards, > >> Volker > >> > >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov > >> wrote: > >>> Hi Lutz, > >>> > >>> I can sponsor it but someone familiar with PPC have to review the fix. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> > >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: > >>>> > >>>> Hi all, > >>>> > >>>> May I kindly request reviews for this small fix? A voluntary sponsor would > >>>> be great as well! > >>>> > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ > >>>> > >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to > >>>> provide ?well-behaved? values only. At least one jtreg test breaks this > >>>> assumption. The fix makes code generation adapt to actual parameter values. 
> >>>> > >>>> Thanks, > >>>> Lutz > >>>> > >>>> > >>>> > >>> > > > > > > From volker.simonis at gmail.com Wed May 24 12:11:08 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 24 May 2017 14:11:08 +0200 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: <181C18A7-D259-48EB-9553-BC47980350E6@sap.com> References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> <181C18A7-D259-48EB-9553-BC47980350E6@sap.com> Message-ID: OK, looks good now. Vladimir, can you please push this change now? Thank you and best regards, Volker On Wed, May 24, 2017 at 11:13 AM, Schmidt, Lutz wrote: > Hi Volker, > > thanks for looking at the change and for your findings. > > To set RTMTotalCountIncrRate=1 was my intention. Obviously, it escaped my attention on ppc. Fixed. Webrev updated in-place. > > I tried to keep the allowed parameter range as wide as possible. Unfortunately, some of the parameters are used as immediate operands at places where an additional temp register is not readily available. > > RTMLockingThreshold, for example, is only used after being divided by RTMTotalCountIncrRate. Except for RTMTotalCountIncrRate=1, the quotient will be significantly smaller than the original parameter value. Limiting the range of RTMLockingThreshold to int16 seems too restrictive to me. In addition, at that place in the code there is a register available to smoothly handle the ?overflow? case. > > I would like to keep the ranges as they are. > > Regards, > Lutz > > > On 23.05.2017, 17:25, "Volker Simonis" wrote: > > Hi Lutz, > > in general the change looks good. > > I think in globals_ppc.hpp the minimal value for RTMTotalCountIncrRate > should be 1 (as on x86) to avoid division by zero errors. 
> > What is the rational behind restricting some parameters to 16-bit > immediates on ppc while handling bigger immediate values in the code > generation for some other parameters? Wouldn't it be easier to > restrict all parameters to 16 bit on ppc? > > Thank you and best regards, > Volker > > > > On Tue, May 23, 2017 at 9:29 AM, Schmidt, Lutz wrote: > > Vladimir, Volker, > > > > Triggered by your suggestions, I have read through the RTM code with extended diligence. What I came up with is this updated/extended > > webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.01/ > > for bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > > > For both x86 and ppc, I have added ranges to all numeric RTM flags. Their type is now ?int?. Could you please have a look and let me know what you don?t like? > > > > Thanks and best regards, > > Lutz > > > > > > > > On 19.05.2017, 21:40, "Vladimir Kozlov" wrote: > > > > Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only. > > > > experimental(int, RTMTotalCountIncrRate, 64, \ > > "Increment total RTM attempted lock count once every n times") \ > > range(0, max_jint) \ > > > > Vladimir > > > > On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > > > Thank you, Volker > > > > > > I think all RTM tuning flags should be uint (unsigned 32bit int). > > > We did not have int/uint types when RTM was implemented. They were added 2 years ago: > > > > > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > > > > > Lets change type of RTM flags in all places. I will review and sponsor. > > > > > > thanks, > > > Vladimir > > > > > > On 5/19/17 12:02 PM, Volker Simonis wrote: > > >> Hi Lutz, Vladimir, > > >> > > >> @Lutz: thanks for fixing this. I think your change looks good. > > >> > > >> @Vladimir: thanks, but I think we can push this ourselves because it > > >> is ppc only. 
> > >> > > >> I've also realized that amd64 uses cmpptr() which takes the result of > > >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can > > >> be wrong if the result of the division is greater than 32 bit. I'm not > > >> sure how relevant that is, but maybe we could either change the types > > >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix > > >> the compare on amd64 to compare against a full 64 bit value. > > >> > > >> What do you think Vladimir - maybe do that as a follow up change or do > > >> you want to include it here (in which case you'd have to sponsor :) ? > > >> > > >> Thank you and best regards, > > >> Volker > > >> > > >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov > > >> wrote: > > >>> Hi Lutz, > > >>> > > >>> I can sponsor it but someone familiar with PPC have to review the fix. > > >>> > > >>> Thanks, > > >>> Vladimir > > >>> > > >>> > > >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: > > >>>> > > >>>> Hi all, > > >>>> > > >>>> May I kindly request reviews for this small fix? A voluntary sponsor would > > >>>> be great as well! > > >>>> > > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ > > >>>> > > >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to > > >>>> provide ?well-behaved? values only. At least one jtreg test breaks this > > >>>> assumption. The fix makes code generation adapt to actual parameter values. > > >>>> > > >>>> Thanks, > > >>>> Lutz > > >>>> > > >>>> > > >>>> > > >>> > > > > > > > > > > > > > > From jcbeyler at google.com Fri May 5 06:03:59 2017 From: jcbeyler at google.com (JC Beyler) Date: Fri, 05 May 2017 06:03:59 -0000 Subject: Low-Overhead Heap Profiling In-Reply-To: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> Message-ID: So first off: thank you so much for your diligent look at the patch. 
A lot of the comments seem to stem either from my using the wrong baseline (I was using JDK9) or from my first trying to keep things simple, so some things are still in flight, which is why they might seem unclear. I will try to rectify most of those in the next webrev.

I will say again that I really do appreciate the time you took to give all those comments. I was expecting a high level review at first and not an in-depth one :)

I'll work on the assumption that my code base is wrong and figure out how to get the right one tomorrow. I've inlined my comments, which mostly are questions to help me fix the code.

On Thu, May 4, 2017 at 2:13 AM, Robbin Ehn wrote:

> Hi,
>
> To me the compiler changes looks what is expected.
> It would be good if someone from compiler could take a look at that.
> Added compiler to mail thread.
>
> Also adding Serguei, It would be good with his view also.
>
> My initial take on it, read through most of the code and took it for a
> ride.
>
> ##############################
> - Regarding the compiler changes: I think we need the 'TLAB end' trickery
> (mentioned by Tony P)
> instead of a separate check for sampling in fast path for the final
> version.
>
>

Agreed, I am still working on assessing it and the final version will most likely be TLAB-based if I can prove to myself and to you that the overhead is acceptable. My initial data looks fine and the patch is simple enough that it seems like it should work. I might send out a webrev with it just to show it so people can play with it and comment on it.

- For simplicity, I've removed your actual patches; I believe most of them were needed because the JDK I was basing my work on was not the same as the one you are using. If you have a second and can provide me a link or process on how you get the right mercurial repositories set up, I'll set myself up the same way to align with you.

- I'll also double check slowdebug to ensure no problems.
> - Classes should extend correct base class for which type of memory is used
> for it e.g.: CHeapObj or StackObj or AllStatic
> - The style in heapMonitoring.cpp is a bit different from normal vm-style,
> e.g. using C++ casts instead of C. You mix NEW_C_HEAP_ARRAY, os::malloc and
> new.
> - In jvmtiHeapTransition.hpp you use C cast instead.
>
>

Noted, I thought I had cleaned up most of them but seemingly not. I'll go and clean it up and align myself with what the surrounding code is doing, both for the runtime code and for the jvmti code.

> ##############################
> - This patch I had to apply to get traces without setting an "unrelated"
> capability
> - Should this not be a new capability?
>
> diff -r c02a5d8785bf src/share/vm/prims/forte.cpp
> --- a/src/share/vm/prims/forte.cpp Fri Apr 28 15:15:16 2017 +0200
> +++ b/src/share/vm/prims/forte.cpp Thu May 04 10:24:25 2017 +0200
> @@ -530,6 +530,6 @@
>
> - if (!JvmtiExport::should_post_class_load()) {
> +/* if (!JvmtiExport::should_post_class_load()) {
>      trace->num_frames = ticks_no_class_load; // -1
>      return;
> - }
> + }*/
>
>

So the way I've been testing all of this for now was by creating a Java agent that sets everything up. I was waiting to have this conversation until we were a bit further into the discussion. Since you have brought it up:

    caps.can_get_line_numbers = 1;
    caps.can_get_source_file_name = 1;

And I've added some callbacks too to help set things up:

    ernum = global_jvmti->SetEventCallbacks(&callbacks, sizeof(jvmtiEventCallbacks));
    ernum = global_jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_VM_INIT, NULL);
    ernum = global_jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_CLASS_LOAD, NULL);

The class loading callback allows me to force the creation of the jmethod ids for the class, which most likely solves your problem here.

Quick question: Does it make sense to put my agent code somewhere in the webrev for all to see how I've been calling into this?
I was waiting for things to settle a bit first but I could do that easily. Is there a spot where that would make sense?

> ##############################
> - forte.cpp: (I know this is not part of your changes but)
> find_jmethod_id_or_null gives me NULL for my test.
> It looks like we actually want the regular jmethod_id() ?
>
> Since we are the thread we are talking about (and in same ucontext) and
> thread is in vm and has a last java frame,
> I think most of the checks done in AsyncGetCallTrace are irrelevant, so you
> should be able to call forte_fill_call_trace_given_top directly.
> But since we might need jmethod_id() if possible to avoid getting method
> id NULL,
> we need some fixes in forte code, or just do the vframeStream loop inside
> heapMonitoring.cpp and not use forte.cpp.
>
> Something like:
>
> if (jthread->has_last_Java_frame()) { // just to be safe
>   vframeStream vfst(jthread);
>   while (!vfst.at_end()) {
>     Method* m = vfst.method();
>     m->jmethod_id();
>     m->line_number_from_bci(vfst.bci());
>     vfst.next();
>   }
> }
>

I have no idea about this and will have to try :). I am open to any solution that will work and this seems to go around forte.cpp. We have a few changes in forte.cpp that I have not put here that help get more frames than the current implementation allows. Let me test this and see if it works :)

>
> - This is a bit confusing in forte.cpp, trace->frames[count].lineno = bci.
> Line number should be m->line_number_from_bci(bci);
> Is heapMonitoring supposed to trace with bci or line number?
> I would say bci, meaning we should either rename ASGCT_CallFrame::lineno or
> use another data structure which says bci.
>

Agreed, I just did not want to create another structure straight away because I thought it was going to be confusing. Plus, it seems that it is not automatically necessary.
As for using the bci instead of the line number: it is possible that, because this is code that has been ported over and over, we never cleaned it up or never noticed that a method was now available. Let me test it out and see how it works.

>
> ##############################
> - // TODO(jcbeyler): remove this extra code handling the extra trace for
> Please fix all these TODO's :)
>
>

Will do, they were my trackers of TODOs, I prefetched a bit when I sent out the webrev. I will try to clean out most of them for the next webrev :)

> ##############################
> - heapMonitoring.hpp:
> // TODO(jcbeyler): is this algorithm acceptable in open source?
>
> Why is this comment here? What is the implication?
> Have you tested any simpler algorithm?
>

Yes sorry, this is me being paranoid and trying to figure out if the licenses are compatible. This implementation exists in the open source world but I wanted to double check via this TODO that the original code and the JDK are compatible in licenses before bringing this in.

From what I've seen here, this is a known fast log implementation/algorithm and the license seems fine, so I believe it is fine, but I still need to do the due diligence. I will double check and ensure it is fine :)

>
> ##############################
> - Create a sanity jtreg test. (./hotspot/make/test/JtregNative.gmk for
> building the agent)
>

Sounds good, I will look at how to do that :)

>
> ##############################
> - monitoring_period vs HeapMonitorRate, pick rate or period.
>

Agreed.

>
> ##############################
> - globals.hpp
> Why is MaxHeapTraces not settable/overridable from jvmti interface? That
> would be handy.
>

Just to be clear and complete: MaxHeapTraces will actually disappear in the future webrev:

1) This was here only to show how the whole system will look, especially the general jvmti interface, compiler changes, and hooks
2) In practice, we keep all live sampled objects and we free up the list via the GC events of sampled objects
3) In our implementation and the final one that would be shown here:
   - MaxHeapTraces disappears but we would have the analogous MaxGarbageHeapTraces
   - That one could use a flag to be settable/overridable

>
> ##############################
> - jvmtiStackTraceData + ASGCT_CallFrame memory
> Is the agent supposed to loop through and free all ASGCT_CallFrame?
> Wouldn't it be better with some kinda protocol, like:
> (*jvmti)->GetLiveTraces(jvmti, &stack_traces, &num_traces);
> (*jvmti)->ReleaseTraces(jvmti, stack_traces, num_traces);
>

I will change it like you suggest. Our implementation did do the free in the agent but in the general case, your solution is more logical.

>
> Also using another data structure that has num_traces inside it
> simplifies things.
> So I'm not convinced using the async structure is the best way forward.
>

I'm not yet sure about leaving the async structure entirely, I will double check on that before committing to it. I will let you know.

>
>
> I have more questions, but I think it's better if you respond and update
> the code first.
>

I will work on the code changes and adapt to all your suggestions here. Hopefully, I've at least partially answered most of them, and if you have more, please feel free to ask them now :) Or, as you said, wait until I send my next version out if you think that we should first iterate on this before opening other questions and conversations!

Thank you again for taking the time to do such a thorough review and I hope I have answered the initial questions well!
Jc

>
> Thanks!
> > /Robbin > > > On 04/21/2017 11:34 PM, JC Beyler wrote: > >> Hi all, >> >> I've added size information to the allocation sampling system. This >> allows the callback to remember the size of each sampled allocation. >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ >> >> The new webrev.01 also adds the actual heap monitoring sampling system in >> files: >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh >> are/vm/runtime/heapMonitoring.cpp.patch >> and >> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh >> are/vm/runtime/heapMonitoring.hpp.patch >> >> My next step is to add the GC part to the webrev, which will allow users >> to determine what objects are live and what are garbage. >> >> Thanks for your attention and let me know if there are any questions! >> >> Have a wonderful Friday! >> Jc >> >> On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler > jcbeyler at google.com>> wrote: >> >> Hi all, >> >> I worked on getting a few numbers for overhead and accuracy for my >> feature. I'm unsure if here is the right place to provide the full data, so >> I am just summarizing >> here for now. >> >> - Overhead of the feature >> >> Using the Dacapo benchmark (http://dacapobench.org/). My initial >> results are that sampling provides 2.4% with a 512k sampling, 512k being >> our default setting. >> >> - Note: this was without the tradesoap, tradebeans and tomcat >> benchmarks since they did not work with my JDK9 (issue between Dacapo and >> JDK9 it seems) >> - I want to rerun next week to ensure number stability >> >> - Accuracy of the feature >> >> I wrote a small microbenchmark that allocates from two different >> stacktraces at a given ratio. For example, 10% of stacktrace S1 and 90% >> from stacktrace S2. The >> microbenchmark was run 20 times, I averaged the results and looked >> for accuracy. It seems that statistically it is sound since if I >> allocated10% S1 and 90% S2, with a >> sampling rate of 512k, I obtained 9.61% S1 and 90.49% S2. 
>> >> Let me know if there are any questions on the numbers and if you'd >> like to see some more data. >> >> Note: this was done using our internal JDK8 implementation since the >> webrev provided by http://cr.openjdk.java.net/~ra >> sbold/heapz/webrev.00/index.html >> >> does not yet contain the whole implementation and therefore would have been >> misleading. >> >> Thanks, >> Jc >> >> >> On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler > > wrote: >> >> Hi all, >> >> To move the discussion forward, with Chuck Rasbold's help to make >> a webrev, we pushed this: >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html < >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html> >> 415 lines changed: 399 ins; 13 del; 3 mod; 51122 unchg >> >> This is not a final change that does the whole proposition from >> the JBS entry: https://bugs.openjdk.java.net/browse/JDK-8177374 >> ; what it does >> show is parts of the implementation that is proposed and hopefully can >> start the conversation going >> as I work through the details. >> >> For example, the changes to C2 are done here for the allocations: >> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/shar >> e/vm/opto/macro.cpp.patch >> > re/vm/opto/macro.cpp.patch> >> >> Hopefully this all makes sense and thank you for all your future >> comments! 
>> Jc >> >> >> On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler > > wrote: >> >> Hello all, >> >> This is a follow-up from Jeremy's initial email from last >> year: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/20 >> 15-June/017543.html > pipermail/serviceability-dev/2015-June/017543.html> >> >> I've gone ahead and started working on preparing this and >> Jeremy and I went down the route of actually writing it up in JEP form: >> https://bugs.openjdk.java.net/browse/JDK-8171119 >> >> I think original conversation that happened last year in that >> thread still holds true: >> >> - We have a patch at Google that we think others might be >> interested in >> - It provides a means to understand where the allocation >> hotspots are at a very low overhead >> - Since it is at a low overhead, we can leave it on by >> default >> >> So I come to the mailing list with Jeremy's initial question: >> "I thought I would ask if there is any interest / if I should >> write a JEP / if I should just forget it." >> >> A year ago, it seemed some thought it was a good idea, is >> this still true? >> >> Thanks, >> Jc >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gerald.thornbrugh at oracle.com Fri May 12 16:10:27 2017 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Fri, 12 May 2017 10:10:27 -0600 Subject: RFR 8179903: Clean up SPARC 32-bit support In-Reply-To: References: Message-ID: <3A83A992-62FF-4575-9906-91DEB4154113@oracle.com> Hi George, Your changes look good to me. Jerry > On May 12, 2017, at 8:34 AM, George Triantafillou wrote: > > Please review this fix to clean up SPARC 32-bit support. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8179903 > webrev: http://cr.openjdk.java.net/~gtriantafill/8179903-webrev/webrev/index.html > > This is a followup RFE to "JDK-8150388 Remove SPARC 32-bit support". 
The work includes addressing formatting, 32-bit comments, and other issues that Kim raised after JDK-8150388 was reviewed and checked in. > > Built and tested on solaris-sparcv9-debug, solaris-x64-debug with the nsk.jvmti, nsk.jdwp, and nsk.jdi testlists. > > Thanks. > > -George > From volker.simonis at gmail.com Wed May 24 16:05:27 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 24 May 2017 18:05:27 +0200 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: <8de1ea64-79cd-e7ec-adb4-1f1f276612bf@oracle.com> References: <1aef21ad-3676-2ffd-069a-c74ef36a668a@oracle.com> <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> <6cea2fa0-1bcf-3c86-1494-a51f66430577@oracle.com> <8de1ea64-79cd-e7ec-adb4-1f1f276612bf@oracle.com> Message-ID: Sorry, I somehow missed the fact that you're still waiting for my OK. I think it's good that you now use SIGILL on all platforms. The change looks good except the following minor nits I've already mentioned in my first review. I'll leave it up to you if you want to fix them and in the case you do there's no need for a new webrev. src/share/vm/opto/graphKit.cpp // add a check for null for which one branch can't be taken. It uses // and Opaque4 node that will cause the check to be removed after loop - should probably read "// an Opaque4 node.." in the second line src/share/vm/opto/opaquenode.hpp // know implicitly is always true of false but the compiler has no way // to prove. 
If during optimizations, that check becomes true of // false, the Opaque4 node is replaced by that constant true or - s/of/or/ test/compiler/unsafe/TestMaybeNullUnsafeAccess.java - wouldn't it be safer to run the test with -Xbatch and -XX:-UseOnStackReplacement for any case and to make it evident that the test relays on the fact that test1() and test2() are both compiled but not inlined ? - you should also update the copyright on most files you've touched. Thank you and best regards, Volker On Tue, May 16, 2017 at 6:45 AM, Vladimir Kozlov wrote: > Looks good to me. > > Volker should look too. > > Thanks, > Vladimir > > > On 5/12/17 12:55 AM, Roland Westrelin wrote: >> >> >>> X86 Manual says: >>> >>> "Use the 0F0B opcode (UD2 instruction) or the 0FB9H opcode when >>> deliberately trying to generate an invalid opcode exception (#UD)." >> >> >> Thanks, Vladimir. >> >> Here is a new webrev. Halt now should trigger a SIGILL on all >> platforms. I tested x86, aarch64, arm64. The code for arm32 needs to be >> tested (I can't even build on arm32). >> >> http://cr.openjdk.java.net/~roland/8176506/webrev.04/ >> >> Roland. >> > From vladimir.x.ivanov at oracle.com Wed May 24 16:18:41 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 24 May 2017 19:18:41 +0300 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass Message-ID: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> http://cr.openjdk.java.net/~vlivanov/8179882/webrev.00 https://bugs.openjdk.java.net/browse/JDK-8179882 There's cast elimination logic in PhaseIdealLoop::split_if_with_blocks_pre() which was introduced by JDK-8139771. The problem with it is that ConstraintCastNode::dominating_cast() relies on immediate control info (in(0)) which can get out of sync with what is cached in PhaseIdealLoop (get_ctrl()). 
The fix is to catch the case when dom_cast doesn't dominate n based on info from PhaseIdealLoop and update control info accordingly. Testing: manual (replayed problematic compilation & eyeballed the IR), JPRT, RBT (hs-tier0-comp, in progress). Best regards, Vladimir Ivanov PS: thanks to Roland for helping with the fix. From vladimir.kozlov at oracle.com Wed May 24 22:36:52 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 24 May 2017 15:36:52 -0700 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> Message-ID: <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> Yes, calling IGVN during loop opts is dangerous - we always have to remember adjust loops info. On 5/24/17 9:18 AM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8179882/webrev.00 > https://bugs.openjdk.java.net/browse/JDK-8179882 > > There's cast elimination logic in > PhaseIdealLoop::split_if_with_blocks_pre() which was introduced by > JDK-8139771. > > The problem with it is that ConstraintCastNode::dominating_cast() relies > on immediate control info (in(0)) which can get out of sync with what is > cached in PhaseIdealLoop (get_ctrl()). Should we fix this instead? It means somewhere we forgot to update loop info. Thanks, Vladimir > > The fix is to catch the case when dom_cast doesn't dominate n based on > info from PhaseIdealLoop and update control info accordingly. > > Testing: manual (replayed problematic compilation & eyeballed the IR), > JPRT, RBT (hs-tier0-comp, in progress). > > Best regards, > Vladimir Ivanov > > PS: thanks to Roland for helping with the fix. 
From vladimir.x.ivanov at oracle.com Wed May 24 22:50:26 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 25 May 2017 01:50:26 +0300 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> Message-ID: <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> Thanks for the review, Vladimir. > Yes, calling IGVN during loop opts is dangerous - we always have to > remember adjust loops info. > > On 5/24/17 9:18 AM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8179882/webrev.00 >> https://bugs.openjdk.java.net/browse/JDK-8179882 >> >> There's cast elimination logic in >> PhaseIdealLoop::split_if_with_blocks_pre() which was introduced by >> JDK-8139771. >> >> The problem with it is that ConstraintCastNode::dominating_cast() >> relies on immediate control info (in(0)) which can get out of sync >> with what is cached in PhaseIdealLoop (get_ctrl()). > > Should we fix this instead? It means somewhere we forgot to update loop > info. Can you elaborate your point, please? The fix being proposed does exactly that - detect the case when dom_cast should be moved and updates its control & loop info to dominate all users (dom_lca(dom_cast_ctrl, n_ctrl)). Or do you suggest to rewrite ConstraintCastNode::dominating_cast() to use get_ctrl() instead of in(0)? Best regards, Vladimir Ivanov >> >> The fix is to catch the case when dom_cast doesn't dominate n based on >> info from PhaseIdealLoop and update control info accordingly. >> >> Testing: manual (replayed problematic compilation & eyeballed the IR), >> JPRT, RBT (hs-tier0-comp, in progress). >> >> Best regards, >> Vladimir Ivanov >> >> PS: thanks to Roland for helping with the fix. 
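As a reading aid for the `dom_lca(dom_cast_ctrl, n_ctrl)` expression mentioned above, here is a minimal, standalone C++ sketch of a lowest-common-ancestor query on a dominator tree. It is not the HotSpot implementation: `DomTree`, `idom`, `depth`, `lca` and `dominates` are hypothetical names chosen for illustration, nodes are plain integer ids rather than C2 control nodes, and the tree is assumed to be well formed (a single root that is its own immediate dominator).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical dominator tree (illustration only, not HotSpot code).
// idom[n] is the immediate dominator of node n; the root satisfies
// idom[root] == root. depth[n] is the distance of n from the root.
struct DomTree {
  std::vector<int> idom;
  std::vector<int> depth;

  explicit DomTree(std::vector<int> idom_in)
      : idom(std::move(idom_in)), depth(idom.size(), -1) {
    for (std::size_t n = 0; n < idom.size(); ++n) {
      compute_depth(static_cast<int>(n));
    }
  }

  // Memoized depth computation; assumes idom describes a valid tree.
  int compute_depth(int n) {
    assert(n >= 0 && static_cast<std::size_t>(n) < idom.size());
    if (depth[n] >= 0) return depth[n];
    depth[n] = (idom[n] == n) ? 0 : compute_depth(idom[n]) + 1;
    return depth[n];
  }

  // Lowest common ancestor of a and b in the dominator tree, i.e. the
  // deepest node that dominates both. Repeatedly lifts the deeper node.
  int lca(int a, int b) const {
    while (a != b) {
      if (depth[a] < depth[b]) std::swap(a, b);  // keep a at least as deep as b
      a = idom[a];
    }
    return a;
  }

  // d dominates n exactly when their common ancestor is d itself.
  bool dominates(int d, int n) const { return lca(d, n) == d; }
};
```

In terms of the thread: if the cached control of the dominating cast does not dominate the control of its new user, moving the cast to the common ancestor of the two controls gives a placement that dominates both. With the sketch above that corresponds to checking `dominates(dom_cast_ctrl, n_ctrl)` and, when it fails, using `lca(dom_cast_ctrl, n_ctrl)` as the new control.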
From vladimir.kozlov at oracle.com Wed May 24 22:55:45 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 24 May 2017 15:55:45 -0700 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> Message-ID: On 5/24/17 3:50 PM, Vladimir Ivanov wrote: > Thanks for the review, Vladimir. > >> Yes, calling IGVN during loop opts is dangerous - we always have to >> remember adjust loops info. >> >> On 5/24/17 9:18 AM, Vladimir Ivanov wrote: >>> http://cr.openjdk.java.net/~vlivanov/8179882/webrev.00 >>> https://bugs.openjdk.java.net/browse/JDK-8179882 >>> >>> There's cast elimination logic in >>> PhaseIdealLoop::split_if_with_blocks_pre() which was introduced by >>> JDK-8139771. >>> >>> The problem with it is that ConstraintCastNode::dominating_cast() >>> relies on immediate control info (in(0)) which can get out of sync >>> with what is cached in PhaseIdealLoop (get_ctrl()). >> >> Should we fix this instead? It means somewhere we forgot to update loop >> info. > > Can you elaborate your point, please? The fix being proposed does > exactly that - detect the case when dom_cast should be moved and updates > its control & loop info to dominate all users (dom_lca(dom_cast_ctrl, > n_ctrl)). Could be misunderstanding. You said get_ctrl() does not point to in(0) at time when dominating_cast() is called. My question is why? During loop construction it should point to it. At which time it is changed? If there was a new node created during loopopts then PhaseIdealLoop::register_new_node() should be called for it already. Vladimir K > > Or do you suggest to rewrite ConstraintCastNode::dominating_cast() to > use get_ctrl() instead of in(0)? 
> > Best regards, > Vladimir Ivanov > >>> >>> The fix is to catch the case when dom_cast doesn't dominate n based on >>> info from PhaseIdealLoop and update control info accordingly. >>> >>> Testing: manual (replayed problematic compilation & eyeballed the IR), >>> JPRT, RBT (hs-tier0-comp, in progress). >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> PS: thanks to Roland for helping with the fix. From vladimir.x.ivanov at oracle.com Wed May 24 23:38:52 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 25 May 2017 02:38:52 +0300 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> Message-ID: >>>> The problem with it is that ConstraintCastNode::dominating_cast() >>>> relies on immediate control info (in(0)) which can get out of sync >>>> with what is cached in PhaseIdealLoop (get_ctrl()). >>> >>> Should we fix this instead? It means somewhere we forgot to update loop >>> info. >> >> Can you elaborate your point, please? The fix being proposed does >> exactly that - detect the case when dom_cast should be moved and >> updates its control & loop info to dominate all users >> (dom_lca(dom_cast_ctrl, n_ctrl)). > > Could be misunderstanding. You said get_ctrl() does not point to in(0) > at time when dominating_cast() is called. My question is why? During > loop construction it should point to it. At which time it is changed? Step-by-step description of applied transformations follows (details [1]): #1: If: 1012 => 1257 - same condition (1820 Bool), 1257 dominates 1012 #2: CastPP: 5846 => 733 - 733 dominates 5846 #3: CheckCastPP: 773 => 574 - 773 & 574 have the same control node after #1 (1260/1257), but get_ctrl(dom_cast) points to 1581 (before #2: 574 -1-> 5846 -0-> 1581). 
The problem is that 773 was part of N4442 and 574 doesn't dominate the loop body, but #3 doesn't place 574 there when replacing 773. So, subsequent cloning of loop body during loop peeling of N4442 produces unschedulable graph. The fix moves 574 up the dominator tree and puts it inside N4442, so it dominates all users of 773 & 574 after #3. get_ctrl() & in(0) point to different CFG nodes after #1, but the loop info is correct until #3 happens which requires 574 to be moved up. Best regards, Vladimir Ivanov [1] === new loop opt pass #0: Initial state Loop: N4442/N1838 body={ 1257 773 1260 } Loop: N4445/N2197 body={ 1012 574 1075 } 574 CheckCastPP === 1075 5846 Oop:java/lang/Integer:NotNull:exact * 1075 IfFalse === 1012 1012 If === 1575 1820 1820 Bool === ... 5846 CastPP === 1581 677 1581 Proj === 2137 #0 Type:control 2137 Unlock === ... 677 Phi === ... 773 CheckCastPP === 1260 733 Oop:java/lang/Integer:NotNull:exact * 1260 IfFalse === 1257 1257 If === 1819 1820 1820 Bool === ... 733 CastPP === 1230 677 1230 IfTrue === 1225 1225 If === ... 677 Phi === ... #1: 1012 => 1257 (same condition, 1257 dominates 1012) and 574 isn't part of N4445 anymore Loop: N4442/N1838 body={ 1257 773 1260 } Loop: N4445/N2197 body={ } 574 CheckCastPP === 1260 5846 1260 IfFalse === 1257 1257 If === 1575 1820 1820 Bool === ... 5846 CastPP === 1581 677 1581 Proj === 2137 #0 Type:control 2137 Unlock === ... 677 Phi === ... 773 CheckCastPP === 1260 733 1260 IfFalse === 1257 1257 If === 1819 1820 1820 Bool === ... 733 CastPP === 1230 677 1230 IfTrue === 1225 1225 If === ... 677 Phi === ... === next loop opts pass Loop: N4442/N1838 body={ 1257 773 1260 } Loop: N4445/N2197 body={ } #2: 5846 => 733 574 CheckCastPP === 1260 733 773 CheckCastPP === 1260 733 1260 IfFalse === 1257 1257 If === 1819 1820 1820 Bool === ... 733 CastPP === 1230 677 1230 IfTrue === 1225 1225 If === ... 677 Phi === ... 
Loop: N4442/N1838 body={ 1257 773 1260 } Loop: N4445/N2197 body={ } #3: 773 => 574 Loop: N4442/N1838 body={ 1257 1260 } // NB! missing 574 Loop: N4445/N2197 body={ } #4: Peel N4442 (NB! same loop opt iteration, so loop tree hasn't been rebuilt yet.) - 574 isn't cloned since it's not part of the loop body and it leads to unscheduleable IR. === Crash > If there was a new node created during loopopts then > PhaseIdealLoop::register_new_node() should be called for it already. > > Vladimir K > >> >> Or do you suggest to rewrite ConstraintCastNode::dominating_cast() to >> use get_ctrl() instead of in(0)? >> >> Best regards, >> Vladimir Ivanov >> >>>> >>>> The fix is to catch the case when dom_cast doesn't dominate n based on >>>> info from PhaseIdealLoop and update control info accordingly. >>>> >>>> Testing: manual (replayed problematic compilation & eyeballed the IR), >>>> JPRT, RBT (hs-tier0-comp, in progress). >>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>> PS: thanks to Roland for helping with the fix. From vladimir.kozlov at oracle.com Thu May 25 00:24:12 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 24 May 2017 17:24:12 -0700 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> Message-ID: On 5/24/17 4:38 PM, Vladimir Ivanov wrote: >>>>> The problem with it is that ConstraintCastNode::dominating_cast() >>>>> relies on immediate control info (in(0)) which can get out of sync >>>>> with what is cached in PhaseIdealLoop (get_ctrl()). >>>> >>>> Should we fix this instead? It means somewhere we forgot to update loop >>>> info. >>> >>> Can you elaborate your point, please? 
The fix being proposed does >>> exactly that - detect the case when dom_cast should be moved and >>> updates its control & loop info to dominate all users >>> (dom_lca(dom_cast_ctrl, n_ctrl)). >> >> Could be misunderstanding. You said get_ctrl() does not point to in(0) >> at time when dominating_cast() is called. My question is why? During >> loop construction it should point to it. At which time it is changed? > > Step-by-step description of applied transformations follows (details [1]): > #1: If: 1012 => 1257 > - same condition (1820 Bool), 1257 dominates 1012 Which transformation replace 1012 with 1257? We should exit loop opts after such transformation is done and recalculate new loop info again on next loop opts iteration. Based on code it look like we are doing split_if_with_blocks when this happens. major_progress should be set and exit this iteration of loop opts. So why "loop cloning" happens? Vladimir > > #2: CastPP: 5846 => 733 > - 733 dominates 5846 > > #3: CheckCastPP: 773 => 574 > - 773 & 574 have the same control node after #1 (1260/1257), but > get_ctrl(dom_cast) points to 1581 (before #2: 574 -1-> 5846 -0-> 1581). > > The problem is that 773 was part of N4442 and 574 doesn't dominate the > loop body, but #3 doesn't place 574 there when replacing 773. So, > subsequent cloning of loop body during loop peeling of N4442 produces > unschedulable graph. > > The fix moves 574 up the dominator tree and puts it inside N4442, so it > dominates all users of 773 & 574 after #3. > > get_ctrl() & in(0) point to different CFG nodes after #1, but the loop > info is correct until #3 happens which requires 574 to be moved up. > > Best regards, > Vladimir Ivanov > > [1] > === new loop opt pass > > #0: Initial state > Loop: N4442/N1838 body={ 1257 773 1260 } > Loop: N4445/N2197 body={ 1012 574 1075 } > > 574 CheckCastPP === 1075 5846 > Oop:java/lang/Integer:NotNull:exact * > 1075 IfFalse === 1012 > 1012 If === 1575 1820 > 1820 Bool === ... 
> 5846 CastPP === 1581 677 > 1581 Proj === 2137 #0 Type:control > 2137 Unlock === ... > 677 Phi === ... > > 773 CheckCastPP === 1260 733 > Oop:java/lang/Integer:NotNull:exact * > 1260 IfFalse === 1257 > 1257 If === 1819 1820 > 1820 Bool === ... > 733 CastPP === 1230 677 > 1230 IfTrue === 1225 > 1225 If === ... > 677 Phi === ... > > > #1: 1012 => 1257 (same condition, 1257 dominates 1012) and 574 isn't > part of N4445 anymore > > Loop: N4442/N1838 body={ 1257 773 1260 } > Loop: N4445/N2197 body={ } > > 574 CheckCastPP === 1260 5846 > 1260 IfFalse === 1257 > 1257 If === 1575 1820 > 1820 Bool === ... > 5846 CastPP === 1581 677 > 1581 Proj === 2137 #0 Type:control > 2137 Unlock === ... > 677 Phi === ... > > 773 CheckCastPP === 1260 733 > 1260 IfFalse === 1257 > 1257 If === 1819 1820 > 1820 Bool === ... > 733 CastPP === 1230 677 > 1230 IfTrue === 1225 > 1225 If === ... > 677 Phi === ... > > === next loop opts pass > > Loop: N4442/N1838 body={ 1257 773 1260 } > Loop: N4445/N2197 body={ } > > #2: 5846 => 733 > > 574 CheckCastPP === 1260 733 > 773 CheckCastPP === 1260 733 > > 1260 IfFalse === 1257 > 1257 If === 1819 1820 > 1820 Bool === ... > > 733 CastPP === 1230 677 > 1230 IfTrue === 1225 > 1225 If === ... > 677 Phi === ... > > Loop: N4442/N1838 body={ 1257 773 1260 } > Loop: N4445/N2197 body={ } > > #3: 773 => 574 > > Loop: N4442/N1838 body={ 1257 1260 } // NB! missing 574 > Loop: N4445/N2197 body={ } > > #4: Peel N4442 (NB! same loop opt iteration, so loop tree hasn't been > rebuilt yet.) > - 574 isn't cloned since it's not part of the loop body and it leads > to unscheduleable IR. > > === Crash > >> If there was a new node created during loopopts then >> PhaseIdealLoop::register_new_node() should be called for it already. >> >> Vladimir K >> >>> >>> Or do you suggest to rewrite ConstraintCastNode::dominating_cast() to >>> use get_ctrl() instead of in(0)? 
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>>>> >>>>> The fix is to catch the case when dom_cast doesn't dominate n based on >>>>> info from PhaseIdealLoop and update control info accordingly. >>>>> >>>>> Testing: manual (replayed problematic compilation & eyeballed the IR), >>>>> JPRT, RBT (hs-tier0-comp, in progress). >>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>> PS: thanks to Roland for helping with the fix. From vladimir.x.ivanov at oracle.com Thu May 25 00:50:12 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 25 May 2017 03:50:12 +0300 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> Message-ID: <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> On 5/25/17 3:24 AM, Vladimir Kozlov wrote: > On 5/24/17 4:38 PM, Vladimir Ivanov wrote: >>>>>> The problem with it is that ConstraintCastNode::dominating_cast() >>>>>> relies on immediate control info (in(0)) which can get out of sync >>>>>> with what is cached in PhaseIdealLoop (get_ctrl()). >>>>> >>>>> Should we fix this instead? It means somewhere we forgot to update >>>>> loop >>>>> info. >>>> >>>> Can you elaborate your point, please? The fix being proposed does >>>> exactly that - detect the case when dom_cast should be moved and >>>> updates its control & loop info to dominate all users >>>> (dom_lca(dom_cast_ctrl, n_ctrl)). >>> >>> Could be misunderstanding. You said get_ctrl() does not point to in(0) >>> at time when dominating_cast() is called. My question is why? During >>> loop construction it should point to it. At which time it is changed? 
>> >> Step-by-step description of applied transformations follows (details >> [1]): >> #1: If: 1012 => 1257 >> - same condition (1820 Bool), 1257 dominates 1012 > > Which transformation replace 1012 with 1257? We should exit loop opts > after such transformation is done and recalculate new loop info again on > next loop opts iteration. Sorry for the confusion. #1 & #2-#3 happen in different (consequent) loop opts iterations. Loop info is rebuilt after #2 and it goes out of sync during #2-#3. > Based on code it look like we are doing split_if_with_blocks when this > happens. major_progress should be set and exit this iteration of loop > opts. So why "loop cloning" happens? Cast node elimination in split_if_with_blocks_pre doesn't bump _major_progress, so after split-ifs are over (#2-#3 are part of them), the compiler proceeds to loop opts. Calling set_major_progress() when dom_cast doesn't dominate n in split_if_with_blocks_pre() is another way to fix the bug, but it is much more expensive (introduces additional loop opts pass). Best regards, Vladimir Ivanov >> >> #2: CastPP: 5846 => 733 >> - 733 dominates 5846 >> >> #3: CheckCastPP: 773 => 574 >> - 773 & 574 have the same control node after #1 (1260/1257), but >> get_ctrl(dom_cast) points to 1581 (before #2: 574 -1-> 5846 -0-> 1581). >> >> The problem is that 773 was part of N4442 and 574 doesn't dominate the >> loop body, but #3 doesn't place 574 there when replacing 773. So, >> subsequent cloning of loop body during loop peeling of N4442 produces >> unschedulable graph. >> >> The fix moves 574 up the dominator tree and puts it inside N4442, so >> it dominates all users of 773 & 574 after #3. >> >> get_ctrl() & in(0) point to different CFG nodes after #1, but the loop >> info is correct until #3 happens which requires 574 to be moved up. 
>> >> Best regards, >> Vladimir Ivanov >> >> [1] >> === new loop opt pass >> >> #0: Initial state >> Loop: N4442/N1838 body={ 1257 773 1260 } >> Loop: N4445/N2197 body={ 1012 574 1075 } >> >> 574 CheckCastPP === 1075 5846 >> Oop:java/lang/Integer:NotNull:exact * >> 1075 IfFalse === 1012 >> 1012 If === 1575 1820 >> 1820 Bool === ... >> 5846 CastPP === 1581 677 >> 1581 Proj === 2137 #0 Type:control >> 2137 Unlock === ... >> 677 Phi === ... >> >> 773 CheckCastPP === 1260 733 >> Oop:java/lang/Integer:NotNull:exact * >> 1260 IfFalse === 1257 >> 1257 If === 1819 1820 >> 1820 Bool === ... >> 733 CastPP === 1230 677 >> 1230 IfTrue === 1225 >> 1225 If === ... >> 677 Phi === ... >> >> >> #1: 1012 => 1257 (same condition, 1257 dominates 1012) and 574 isn't >> part of N4445 anymore >> >> Loop: N4442/N1838 body={ 1257 773 1260 } >> Loop: N4445/N2197 body={ } >> >> 574 CheckCastPP === 1260 5846 >> 1260 IfFalse === 1257 >> 1257 If === 1575 1820 >> 1820 Bool === ... >> 5846 CastPP === 1581 677 >> 1581 Proj === 2137 #0 Type:control >> 2137 Unlock === ... >> 677 Phi === ... >> >> 773 CheckCastPP === 1260 733 >> 1260 IfFalse === 1257 >> 1257 If === 1819 1820 >> 1820 Bool === ... >> 733 CastPP === 1230 677 >> 1230 IfTrue === 1225 >> 1225 If === ... >> 677 Phi === ... >> >> === next loop opts pass >> >> Loop: N4442/N1838 body={ 1257 773 1260 } >> Loop: N4445/N2197 body={ } >> >> #2: 5846 => 733 >> >> 574 CheckCastPP === 1260 733 >> 773 CheckCastPP === 1260 733 >> >> 1260 IfFalse === 1257 >> 1257 If === 1819 1820 >> 1820 Bool === ... >> >> 733 CastPP === 1230 677 >> 1230 IfTrue === 1225 >> 1225 If === ... >> 677 Phi === ... >> >> Loop: N4442/N1838 body={ 1257 773 1260 } >> Loop: N4445/N2197 body={ } >> >> #3: 773 => 574 >> >> Loop: N4442/N1838 body={ 1257 1260 } // NB! missing 574 >> Loop: N4445/N2197 body={ } >> >> #4: Peel N4442 (NB! same loop opt iteration, so loop tree hasn't been >> rebuilt yet.) 
>> - 574 isn't cloned since it's not part of the loop body and it >> leads to unscheduleable IR. >> >> === Crash >> >>> If there was a new node created during loopopts then >>> PhaseIdealLoop::register_new_node() should be called for it already. >>> >>> Vladimir K >>> >>>> >>>> Or do you suggest to rewrite ConstraintCastNode::dominating_cast() to >>>> use get_ctrl() instead of in(0)? >>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>>>> >>>>>> The fix is to catch the case when dom_cast doesn't dominate n >>>>>> based on >>>>>> info from PhaseIdealLoop and update control info accordingly. >>>>>> >>>>>> Testing: manual (replayed problematic compilation & eyeballed the >>>>>> IR), >>>>>> JPRT, RBT (hs-tier0-comp, in progress). >>>>>> >>>>>> Best regards, >>>>>> Vladimir Ivanov >>>>>> >>>>>> PS: thanks to Roland for helping with the fix. From vladimir.kozlov at oracle.com Thu May 25 01:48:11 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 24 May 2017 18:48:11 -0700 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> Message-ID: <37ed70d5-60be-de34-5143-938c2cb34576@oracle.com> On 5/24/17 5:50 PM, Vladimir Ivanov wrote: > > > On 5/25/17 3:24 AM, Vladimir Kozlov wrote: >> On 5/24/17 4:38 PM, Vladimir Ivanov wrote: >>>>>>> The problem with it is that ConstraintCastNode::dominating_cast() >>>>>>> relies on immediate control info (in(0)) which can get out of sync >>>>>>> with what is cached in PhaseIdealLoop (get_ctrl()). >>>>>> >>>>>> Should we fix this instead? It means somewhere we forgot to update >>>>>> loop >>>>>> info. >>>>> >>>>> Can you elaborate your point, please? 
The fix being proposed does >>>>> exactly that - detect the case when dom_cast should be moved and >>>>> updates its control & loop info to dominate all users >>>>> (dom_lca(dom_cast_ctrl, n_ctrl)). >>>> >>>> Could be misunderstanding. You said get_ctrl() does not point to in(0) >>>> at time when dominating_cast() is called. My question is why? During >>>> loop construction it should point to it. At which time it is changed? >>> >>> Step-by-step description of applied transformations follows (details >>> [1]): >>> #1: If: 1012 => 1257 >>> - same condition (1820 Bool), 1257 dominates 1012 >> >> Which transformation replace 1012 with 1257? We should exit loop opts >> after such transformation is done and recalculate new loop info again on >> next loop opts iteration. > > Sorry for the confusion. #1 & #2-#3 happen in different (consequent) > loop opts iterations. Loop info is rebuilt after #2 and it goes out of > sync during #2-#3. You mean: rebuild after #1. Right? > >> Based on code it look like we are doing split_if_with_blocks when this >> happens. major_progress should be set and exit this iteration of loop >> opts. So why "loop cloning" happens? > > Cast node elimination in split_if_with_blocks_pre doesn't bump > _major_progress, so after split-ifs are over (#2-#3 are part of them), > the compiler proceeds to loop opts. Rereading this part: >>> The problem is that 773 was part of N4442 and 574 doesn't dominate the >>> loop body, but #3 doesn't place 574 there when replacing 773. So, Would it help if 574 CheckCastPP is replaced with 773? Vladimir > > Calling set_major_progress() when dom_cast doesn't dominate n in > split_if_with_blocks_pre() is another way to fix the bug, but it is much > more expensive (introduces additional loop opts pass). 
> > Best regards, > Vladimir Ivanov > >>> >>> #2: CastPP: 5846 => 733 >>> - 733 dominates 5846 >>> >>> #3: CheckCastPP: 773 => 574 >>> - 773 & 574 have the same control node after #1 (1260/1257), but >>> get_ctrl(dom_cast) points to 1581 (before #2: 574 -1-> 5846 -0-> 1581). >>> >>> The problem is that 773 was part of N4442 and 574 doesn't dominate the >>> loop body, but #3 doesn't place 574 there when replacing 773. So, >>> subsequent cloning of loop body during loop peeling of N4442 produces >>> unschedulable graph. >>> >>> The fix moves 574 up the dominator tree and puts it inside N4442, so >>> it dominates all users of 773 & 574 after #3. >>> >>> get_ctrl() & in(0) point to different CFG nodes after #1, but the loop >>> info is correct until #3 happens which requires 574 to be moved up. >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> [1] >>> === new loop opt pass >>> >>> #0: Initial state >>> Loop: N4442/N1838 body={ 1257 773 1260 } >>> Loop: N4445/N2197 body={ 1012 574 1075 } >>> >>> 574 CheckCastPP === 1075 5846 >>> Oop:java/lang/Integer:NotNull:exact * >>> 1075 IfFalse === 1012 >>> 1012 If === 1575 1820 >>> 1820 Bool === ... >>> 5846 CastPP === 1581 677 >>> 1581 Proj === 2137 #0 Type:control >>> 2137 Unlock === ... >>> 677 Phi === ... >>> >>> 773 CheckCastPP === 1260 733 >>> Oop:java/lang/Integer:NotNull:exact * >>> 1260 IfFalse === 1257 >>> 1257 If === 1819 1820 >>> 1820 Bool === ... >>> 733 CastPP === 1230 677 >>> 1230 IfTrue === 1225 >>> 1225 If === ... >>> 677 Phi === ... >>> >>> >>> #1: 1012 => 1257 (same condition, 1257 dominates 1012) and 574 isn't >>> part of N4445 anymore >>> >>> Loop: N4442/N1838 body={ 1257 773 1260 } >>> Loop: N4445/N2197 body={ } >>> >>> 574 CheckCastPP === 1260 5846 >>> 1260 IfFalse === 1257 >>> 1257 If === 1575 1820 >>> 1820 Bool === ... >>> 5846 CastPP === 1581 677 >>> 1581 Proj === 2137 #0 Type:control >>> 2137 Unlock === ... >>> 677 Phi === ... 
>>> >>> 773 CheckCastPP === 1260 733 >>> 1260 IfFalse === 1257 >>> 1257 If === 1819 1820 >>> 1820 Bool === ... >>> 733 CastPP === 1230 677 >>> 1230 IfTrue === 1225 >>> 1225 If === ... >>> 677 Phi === ... >>> >>> === next loop opts pass >>> >>> Loop: N4442/N1838 body={ 1257 773 1260 } >>> Loop: N4445/N2197 body={ } >>> >>> #2: 5846 => 733 >>> >>> 574 CheckCastPP === 1260 733 >>> 773 CheckCastPP === 1260 733 >>> >>> 1260 IfFalse === 1257 >>> 1257 If === 1819 1820 >>> 1820 Bool === ... >>> >>> 733 CastPP === 1230 677 >>> 1230 IfTrue === 1225 >>> 1225 If === ... >>> 677 Phi === ... >>> >>> Loop: N4442/N1838 body={ 1257 773 1260 } >>> Loop: N4445/N2197 body={ } >>> >>> #3: 773 => 574 >>> >>> Loop: N4442/N1838 body={ 1257 1260 } // NB! missing 574 >>> Loop: N4445/N2197 body={ } >>> >>> #4: Peel N4442 (NB! same loop opt iteration, so loop tree hasn't been >>> rebuilt yet.) >>> - 574 isn't cloned since it's not part of the loop body and it >>> leads to unscheduleable IR. >>> >>> === Crash >>> >>>> If there was a new node created during loopopts then >>>> PhaseIdealLoop::register_new_node() should be called for it already. >>>> >>>> Vladimir K >>>> >>>>> >>>>> Or do you suggest to rewrite ConstraintCastNode::dominating_cast() to >>>>> use get_ctrl() instead of in(0)? >>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>>>> >>>>>>> The fix is to catch the case when dom_cast doesn't dominate n >>>>>>> based on >>>>>>> info from PhaseIdealLoop and update control info accordingly. >>>>>>> >>>>>>> Testing: manual (replayed problematic compilation & eyeballed the >>>>>>> IR), >>>>>>> JPRT, RBT (hs-tier0-comp, in progress). >>>>>>> >>>>>>> Best regards, >>>>>>> Vladimir Ivanov >>>>>>> >>>>>>> PS: thanks to Roland for helping with the fix. 
From lutz.schmidt at sap.com Thu May 25 08:38:29 2017 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 25 May 2017 08:38:29 +0000 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> <181C18A7-D259-48EB-9553-BC47980350E6@sap.com> Message-ID: Thank you, Volker! And thank you, Vladimir, for pushing the change - I concluded this fact from a JBS notification. Your original mail was probably classified as junk. I will be able to recover it by tomorrow. Regards, Lutz On 24.05.2017, 14:11, "Volker Simonis" wrote: OK, looks good now. Vladimir, can you please push this change now? Thank you and best regards, Volker On Wed, May 24, 2017 at 11:13 AM, Schmidt, Lutz wrote: > Hi Volker, > > thanks for looking at the change and for your findings. > > To set RTMTotalCountIncrRate=1 was my intention. Obviously, it escaped my attention on ppc. Fixed. Webrev updated in-place. > > I tried to keep the allowed parameter range as wide as possible. Unfortunately, some of the parameters are used as immediate operands at places where an additional temp register is not readily available. > > RTMLockingThreshold, for example, is only used after being divided by RTMTotalCountIncrRate. Except for RTMTotalCountIncrRate=1, the quotient will be significantly smaller than the original parameter value. Limiting the range of RTMLockingThreshold to int16 seems too restrictive to me. In addition, at that place in the code there is a register available to smoothly handle the "overflow" case. > > I would like to keep the ranges as they are. > > Regards, > Lutz > > > On 23.05.2017, 17:25, "Volker Simonis" wrote: > > Hi Lutz, > > in general the change looks good.
> > I think in globals_ppc.hpp the minimal value for RTMTotalCountIncrRate > should be 1 (as on x86) to avoid division by zero errors. > > What is the rationale behind restricting some parameters to 16-bit > immediates on ppc while handling bigger immediate values in the code > generation for some other parameters? Wouldn't it be easier to > restrict all parameters to 16 bit on ppc? > > Thank you and best regards, > Volker > > > > On Tue, May 23, 2017 at 9:29 AM, Schmidt, Lutz wrote: > > Vladimir, Volker, > > > > Triggered by your suggestions, I have read through the RTM code with extended diligence. What I came up with is this updated/extended > > webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.01/ > > for bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > > > For both x86 and ppc, I have added ranges to all numeric RTM flags. Their type is now "int". Could you please have a look and let me know what you don't like? > > > > Thanks and best regards, > > Lutz > > > > > > > > On 19.05.2017, 21:40, "Vladimir Kozlov" wrote: > > > > Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only. > > > > experimental(int, RTMTotalCountIncrRate, 64, \ > > "Increment total RTM attempted lock count once every n times") \ > > range(0, max_jint) \ > > > > Vladimir > > > > On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > > > Thank you, Volker > > > > > > I think all RTM tuning flags should be uint (unsigned 32bit int). > > > We did not have int/uint types when RTM was implemented. They were added 2 years ago: > > > > > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > > > > > Let's change the type of RTM flags in all places. I will review and sponsor. > > > > > > thanks, > > > Vladimir > > > > > > On 5/19/17 12:02 PM, Volker Simonis wrote: > > >> Hi Lutz, Vladimir, > > >> > > >> @Lutz: thanks for fixing this. I think your change looks good.
> > >> > > >> @Vladimir: thanks, but I think we can push this ourselves because it > > >> is ppc only. > > >> > > >> I've also realized that amd64 uses cmpptr() which takes the result of > > >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can > > >> be wrong if the result of the division is greater than 32 bit. I'm not > > >> sure how relevant that is, but maybe we could either change the types > > >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix > > >> the compare on amd64 to compare against a full 64 bit value. > > >> > > >> What do you think Vladimir - maybe do that as a follow up change or do > > >> you want to include it here (in which case you'd have to sponsor :) ? > > >> > > >> Thank you and best regards, > > >> Volker > > >> > > >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov > > >> wrote: > > >>> Hi Lutz, > > >>> > > >>> I can sponsor it but someone familiar with PPC has to review the fix. > > >>> > > >>> Thanks, > > >>> Vladimir > > >>> > > >>> > > >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote: > > >>>> > > >>>> Hi all, > > >>>> > > >>>> May I kindly request reviews for this small fix? A voluntary sponsor would > > >>>> be great as well! > > >>>> > > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/ > > >>>> > > >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to > > >>>> provide "well-behaved" values only. At least one jtreg test breaks this > > >>>> assumption. The fix makes code generation adapt to actual parameter values.
> > >>>> > > >>>> Thanks, > > >>>> Lutz > > >>>> > > >>>> > > >>>> > > >>> > > > > > > > > > > > > > > From vladimir.x.ivanov at oracle.com Thu May 25 14:28:20 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 25 May 2017 17:28:20 +0300 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: <37ed70d5-60be-de34-5143-938c2cb34576@oracle.com> References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> <37ed70d5-60be-de34-5143-938c2cb34576@oracle.com> Message-ID: >> Sorry for the confusion. #1 & #2-#3 happen in different (consequent) >> loop opts iterations. Loop info is rebuilt after #2 and it goes out of >> sync during #2-#3. > > You mean: rebuild after #1. Right? Yes, typo. >>> Based on code it look like we are doing split_if_with_blocks when this >>> happens. major_progress should be set and exit this iteration of loop >>> opts. So why "loop cloning" happens? >> >> Cast node elimination in split_if_with_blocks_pre doesn't bump >> _major_progress, so after split-ifs are over (#2-#3 are part of them), >> the compiler proceeds to loop opts. > > Rereading this part: > >>>> The problem is that 773 was part of N4442 and 574 doesn't dominate the >>>> loop body, but #3 doesn't place 574 there when replacing 773. So, > > Would it help if 574 CheckCastPP is replaced with 773? Yes, that's another way to fix it [1]. Either one or the other cast node dominates. If we are unlucky, compiler encounters the "wrong one" (dominator) first and tries to replace it with a dominated one. If the compiler skips it, it will eventually find dominated node and will attempt to replace it with the dominator (which will succeed). 
BTW it's equivalent to using get_ctrl() instead of in(0) in ConstraintCastNode::dominating_cast() when called with PhaseIdealLoop.

Let me know which approach you prefer for fixing the bug.

Best regards,
Vladimir Ivanov

[1]
diff --git a/src/share/vm/opto/loopopts.cpp b/src/share/vm/opto/loopopts.cpp
--- a/src/share/vm/opto/loopopts.cpp
+++ b/src/share/vm/opto/loopopts.cpp
@@ -913,7 +913,7 @@
   if (n->is_ConstraintCast()) {
     Node* dom_cast = n->as_ConstraintCast()->dominating_cast(this);
-    if (dom_cast != NULL) {
+    if (dom_cast != NULL && is_dominator(get_ctrl(dom_cast), get_ctrl(n))) {
       _igvn.replace_node(n, dom_cast);
       return dom_cast;
     }

>> Calling set_major_progress() when dom_cast doesn't dominate n in >> split_if_with_blocks_pre() is another way to fix the bug, but it is >> much more expensive (introduces additional loop opts pass). >> >> Best regards, >> Vladimir Ivanov >> >>>> >>>> #2: CastPP: 5846 => 733 >>>> - 733 dominates 5846 >>>> >>>> #3: CheckCastPP: 773 => 574 >>>> - 773 & 574 have the same control node after #1 (1260/1257), but >>>> get_ctrl(dom_cast) points to 1581 (before #2: 574 -1-> 5846 -0-> 1581). >>>> >>>> The problem is that 773 was part of N4442 and 574 doesn't dominate the >>>> loop body, but #3 doesn't place 574 there when replacing 773. So, >>>> subsequent cloning of loop body during loop peeling of N4442 produces >>>> unschedulable graph. >>>> >>>> The fix moves 574 up the dominator tree and puts it inside N4442, so >>>> it dominates all users of 773 & 574 after #3. >>>> >>>> get_ctrl() & in(0) point to different CFG nodes after #1, but the loop >>>> info is correct until #3 happens which requires 574 to be moved up.
>>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>> [1] >>>> === new loop opt pass >>>> >>>> #0: Initial state >>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>> Loop: N4445/N2197 body={ 1012 574 1075 } >>>> >>>> 574 CheckCastPP === 1075 5846 >>>> Oop:java/lang/Integer:NotNull:exact * >>>> 1075 IfFalse === 1012 >>>> 1012 If === 1575 1820 >>>> 1820 Bool === ... >>>> 5846 CastPP === 1581 677 >>>> 1581 Proj === 2137 #0 Type:control >>>> 2137 Unlock === ... >>>> 677 Phi === ... >>>> >>>> 773 CheckCastPP === 1260 733 >>>> Oop:java/lang/Integer:NotNull:exact * >>>> 1260 IfFalse === 1257 >>>> 1257 If === 1819 1820 >>>> 1820 Bool === ... >>>> 733 CastPP === 1230 677 >>>> 1230 IfTrue === 1225 >>>> 1225 If === ... >>>> 677 Phi === ... >>>> >>>> >>>> #1: 1012 => 1257 (same condition, 1257 dominates 1012) and 574 isn't >>>> part of N4445 anymore >>>> >>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>> Loop: N4445/N2197 body={ } >>>> >>>> 574 CheckCastPP === 1260 5846 >>>> 1260 IfFalse === 1257 >>>> 1257 If === 1575 1820 >>>> 1820 Bool === ... >>>> 5846 CastPP === 1581 677 >>>> 1581 Proj === 2137 #0 Type:control >>>> 2137 Unlock === ... >>>> 677 Phi === ... >>>> >>>> 773 CheckCastPP === 1260 733 >>>> 1260 IfFalse === 1257 >>>> 1257 If === 1819 1820 >>>> 1820 Bool === ... >>>> 733 CastPP === 1230 677 >>>> 1230 IfTrue === 1225 >>>> 1225 If === ... >>>> 677 Phi === ... >>>> >>>> === next loop opts pass >>>> >>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>> Loop: N4445/N2197 body={ } >>>> >>>> #2: 5846 => 733 >>>> >>>> 574 CheckCastPP === 1260 733 >>>> 773 CheckCastPP === 1260 733 >>>> >>>> 1260 IfFalse === 1257 >>>> 1257 If === 1819 1820 >>>> 1820 Bool === ... >>>> >>>> 733 CastPP === 1230 677 >>>> 1230 IfTrue === 1225 >>>> 1225 If === ... >>>> 677 Phi === ... >>>> >>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>> Loop: N4445/N2197 body={ } >>>> >>>> #3: 773 => 574 >>>> >>>> Loop: N4442/N1838 body={ 1257 1260 } // NB! 
missing 574
>>>> Loop: N4445/N2197 body={ }
>>>>
>>>> #4: Peel N4442 (NB! same loop opt iteration, so loop tree hasn't been
>>>> rebuilt yet.)
>>>> - 574 isn't cloned since it's not part of the loop body and it
>>>> leads to unscheduleable IR.
>>>>
>>>> === Crash
>>>>
>>>>> If there was a new node created during loopopts then
>>>>> PhaseIdealLoop::register_new_node() should be called for it already.
>>>>>
>>>>> Vladimir K
>>>>>
>>>>>>
>>>>>> Or do you suggest to rewrite ConstraintCastNode::dominating_cast() to
>>>>>> use get_ctrl() instead of in(0)?
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>>>>
>>>>>>>> The fix is to catch the case when dom_cast doesn't dominate n
>>>>>>>> based on
>>>>>>>> info from PhaseIdealLoop and update control info accordingly.
>>>>>>>>
>>>>>>>> Testing: manual (replayed problematic compilation & eyeballed the
>>>>>>>> IR),
>>>>>>>> JPRT, RBT (hs-tier0-comp, in progress).
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Vladimir Ivanov
>>>>>>>>
>>>>>>>> PS: thanks to Roland for helping with the fix.

From attila at hontvari.net Thu May 25 16:16:57 2017
From: attila at hontvari.net (Hontvári Attila)
Date: Thu, 25 May 2017 18:16:57 +0200
Subject: escape analysis issue with nested objects
Message-ID: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net>

When a non-escaping array is created and a newly created, non-escaping
object is stored in it, EA works: there are no heap allocations.

    private static void single() {
        Object x = new Object();
        Object[] array = new Object[]{x};
        Object a = array[0];
    }

But if we do the same with two or more objects, the array is allocated
on the heap and not eliminated.

    private static void multi() {
        Object x = new Object();
        Object y = new Object();
        Object[] array = new Object[]{x, y};
        Object a = array[0];
        Object b = array[1];
    }

Is there a reason why it only works in the first case?
This would be useful for example, MethodHandle::invokeWithArguments, when the primitive types are boxed, and put into a varargs array, see my older email [1]. A complete test source code is in [2], if we run it with -verbose:gc, we can see there are many GCs in the second case, but there are no GCs in the first case. [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017-January/010933.html [2] https://gist.github.com/anonymous/bd46075ef1ebd858dae49fe6cfe39da8 From vladimir.kozlov at oracle.com Thu May 25 17:11:04 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 25 May 2017 10:11:04 -0700 Subject: [10] [ppc] RFR(XS): 8180612: assert failure due to immediate value out of range In-Reply-To: References: <3B5A2D91-D4F5-44B9-9129-4876D992C077@sap.com> <6d7bd7fd-52a8-9542-10ac-9b13c470afe8@oracle.com> <0072b1f1-99d3-25eb-e7a2-871c697c3fef@oracle.com> <7A3AD54D-9DC2-4B83-8B4D-BB2E442CC68A@sap.com> <181C18A7-D259-48EB-9553-BC47980350E6@sap.com> Message-ID: Yes, changes are pushed into jdk10/hs. Regards, Vladimir On 5/25/17 1:38 AM, Schmidt, Lutz wrote: > Thank you, Volker! > And thank you, Vladimir, for pushing the change ? I concluded this fact from a JBS notification. Your original mail was probably classified as junk. I will be able to recover it by tomorrow. > > Regards, Lutz > > > > On 24.05.2017, 14:11, "Volker Simonis" wrote: > > OK, looks good now. > > Vladimir, can you please push this change now? > > Thank you and best regards, > Volker > > > On Wed, May 24, 2017 at 11:13 AM, Schmidt, Lutz wrote: > > Hi Volker, > > > > thanks for looking at the change and for your findings. > > > > To set RTMTotalCountIncrRate=1 was my intention. Obviously, it escaped my attention on ppc. Fixed. Webrev updated in-place. > > > > I tried to keep the allowed parameter range as wide as possible. Unfortunately, some of the parameters are used as immediate operands at places where an additional temp register is not readily available. 
> > > > RTMLockingThreshold, for example, is only used after being divided by RTMTotalCountIncrRate. Except for RTMTotalCountIncrRate=1, the quotient will be significantly smaller than the original parameter value. Limiting the range of RTMLockingThreshold to int16 seems too restrictive to me. In addition, at that place in the code there is a register available to smoothly handle the ?overflow? case. > > > > I would like to keep the ranges as they are. > > > > Regards, > > Lutz > > > > > > On 23.05.2017, 17:25, "Volker Simonis" wrote: > > > > Hi Lutz, > > > > in general the change looks good. > > > > I think in globals_ppc.hpp the minimal value for RTMTotalCountIncrRate > > should be 1 (as on x86) to avoid division by zero errors. > > > > What is the rational behind restricting some parameters to 16-bit > > immediates on ppc while handling bigger immediate values in the code > > generation for some other parameters? Wouldn't it be easier to > > restrict all parameters to 16 bit on ppc? > > > > Thank you and best regards, > > Volker > > > > > > > > On Tue, May 23, 2017 at 9:29 AM, Schmidt, Lutz wrote: > > > Vladimir, Volker, > > > > > > Triggered by your suggestions, I have read through the RTM code with extended diligence. What I came up with is this updated/extended > > > webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.01/ > > > for bug: https://bugs.openjdk.java.net/browse/JDK-8180612 > > > > > > For both x86 and ppc, I have added ranges to all numeric RTM flags. Their type is now ?int?. Could you please have a look and let me know what you don?t like? > > > > > > Thanks and best regards, > > > Lutz > > > > > > > > > > > > On 19.05.2017, 21:40, "Vladimir Kozlov" wrote: > > > > > > Actually we need to use 'int' because we do signed arithmetic on them. And put range() restriction for positive values only. 
> > > > > > experimental(int, RTMTotalCountIncrRate, 64, \ > > > "Increment total RTM attempted lock count once every n times") \ > > > range(0, max_jint) \ > > > > > > Vladimir > > > > > > On 5/19/17 12:33 PM, Vladimir Kozlov wrote: > > > > Thank you, Volker > > > > > > > > I think all RTM tuning flags should be uint (unsigned 32bit int). > > > > We did not have int/uint types when RTM was implemented. They were added 2 years ago: > > > > > > > > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/8597e296c18b > > > > > > > > Lets change type of RTM flags in all places. I will review and sponsor. > > > > > > > > thanks, > > > > Vladimir > > > > > > > > On 5/19/17 12:02 PM, Volker Simonis wrote: > > > >> Hi Lutz, Vladimir, > > > >> > > > >> @Lutz: thanks for fixing this. I think your change looks good. > > > >> > > > >> @Vladimir: thanks, but I think we can push this ourselves because it > > > >> is ppc only. > > > >> > > > >> I've also realized that amd64 uses cmpptr() which takes the result of > > > >> "RTMLockingThreshold / RTMTotalCountIncrRate" as an int32_t. This can > > > >> be wrong if the result of the division is greater than 32 bit. I'm not > > > >> sure how relevant that is, but maybe we could either change the types > > > >> of RTMLockingThreshold and RTMTotalCountIncrRate to int or else fix > > > >> the compare on amd64 to compare against a full 64 bit value. > > > >> > > > >> What do you think Vladimir - maybe do that as a follow up change or do > > > >> you want to include it here (in which case you'd have to sponsor :) ? > > > >> > > > >> Thank you and best regards, > > > >> Volker > > > >> > > > >> On Fri, May 19, 2017 at 6:35 PM, Vladimir Kozlov > > > >> wrote: > > > >>> Hi Lutz, > > > >>> > > > >>> I can sponsor it but someone familiar with PPC have to review the fix. 
> > > >>>
> > > >>> Thanks,
> > > >>> Vladimir
> > > >>>
> > > >>>
> > > >>> On 5/19/17 5:45 AM, Schmidt, Lutz wrote:
> > > >>>>
> > > >>>> Hi all,
> > > >>>>
> > > >>>> May I kindly request reviews for this small fix? A voluntary sponsor would
> > > >>>> be great as well!
> > > >>>>
> > > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180612
> > > >>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180612.00/
> > > >>>>
> > > >>>> The RTM code generation on ppc relied on RTM-related cmdline parameters to
> > > >>>> provide "well-behaved" values only. At least one jtreg test breaks this
> > > >>>> assumption. The fix makes code generation adapt to actual parameter values.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Lutz
> > > >>>>

From vladimir.kozlov at oracle.com Thu May 25 17:23:12 2017
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 25 May 2017 10:23:12 -0700
Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass
In-Reply-To:
References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> <37ed70d5-60be-de34-5143-938c2cb34576@oracle.com>
Message-ID:

Okay, now I understand the problem.

Another thing I do not like about the original change is that it could
place dom_cast above its inputs, which could be incorrect:

    set_ctrl_and_loop(dom_cast, dom_lca(dom_cast_ctrl, n_ctrl));

I like approach [1], which does the replacement only when domination is
clear. It would also be nice if you added a comment in this code
explaining it.

Thanks,
Vladimir

On 5/25/17 7:28 AM, Vladimir Ivanov wrote:
>>> Sorry for the confusion. #1 & #2-#3 happen in different (consequent)
>>> loop opts iterations. Loop info is rebuilt after #2 and it goes out of
>>> sync during #2-#3.
>> >> You mean: rebuild after #1. Right? > > Yes, typo. > >>>> Based on code it look like we are doing split_if_with_blocks when this >>>> happens. major_progress should be set and exit this iteration of loop >>>> opts. So why "loop cloning" happens? >>> >>> Cast node elimination in split_if_with_blocks_pre doesn't bump >>> _major_progress, so after split-ifs are over (#2-#3 are part of them), >>> the compiler proceeds to loop opts. >> >> Rereading this part: >> >>>>> The problem is that 773 was part of N4442 and 574 doesn't dominate the >>>>> loop body, but #3 doesn't place 574 there when replacing 773. So, >> >> Would it help if 574 CheckCastPP is replaced with 773? > > Yes, that's another way to fix it [1]. Either one or the other cast node > dominates. If we are unlucky, compiler encounters the "wrong one" > (dominator) first and tries to replace it with a dominated one. If the > compiler skips it, it will eventually find dominated node and will > attempt to replace it with the dominator (which will succeed). > > BTW it's equivalent to using get_ctrl() instead of in(0) in > ConstraintCastNode::dominating_cast() when called with PhaseIdealLoop. > > Let me know what approach to fix the bug you prefer. > > Best regards, > Vladimir Ivanov > > [1] > diff --git a/src/share/vm/opto/loopopts.cpp > b/src/share/vm/opto/loopopts.cpp > --- a/src/share/vm/opto/loopopts.cpp > +++ b/src/share/vm/opto/loopopts.cpp > @@ -913,7 +913,7 @@ > > if (n->is_ConstraintCast()) { > Node* dom_cast = n->as_ConstraintCast()->dominating_cast(this); > - if (dom_cast != NULL) { > + if (dom_cast != NULL && is_dominator(get_ctrl(dom_cast), > get_ctrl(n)) { > _igvn.replace_node(n, dom_cast); > return dom_cast; > } > >>> Calling set_major_progress() when dom_cast doesn't dominate n in >>> split_if_with_blocks_pre() is another way to fix the bug, but it is >>> much more expensive (introduces additional loop opts pass). 
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>>>> >>>>> #2: CastPP: 5846 => 733 >>>>> - 733 dominates 5846 >>>>> >>>>> #3: CheckCastPP: 773 => 574 >>>>> - 773 & 574 have the same control node after #1 (1260/1257), but >>>>> get_ctrl(dom_cast) points to 1581 (before #2: 574 -1-> 5846 -0-> >>>>> 1581). >>>>> >>>>> The problem is that 773 was part of N4442 and 574 doesn't dominate the >>>>> loop body, but #3 doesn't place 574 there when replacing 773. So, >>>>> subsequent cloning of loop body during loop peeling of N4442 produces >>>>> unschedulable graph. >>>>> >>>>> The fix moves 574 up the dominator tree and puts it inside N4442, so >>>>> it dominates all users of 773 & 574 after #3. >>>>> >>>>> get_ctrl() & in(0) point to different CFG nodes after #1, but the loop >>>>> info is correct until #3 happens which requires 574 to be moved up. >>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>> [1] >>>>> === new loop opt pass >>>>> >>>>> #0: Initial state >>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>> Loop: N4445/N2197 body={ 1012 574 1075 } >>>>> >>>>> 574 CheckCastPP === 1075 5846 >>>>> Oop:java/lang/Integer:NotNull:exact * >>>>> 1075 IfFalse === 1012 >>>>> 1012 If === 1575 1820 >>>>> 1820 Bool === ... >>>>> 5846 CastPP === 1581 677 >>>>> 1581 Proj === 2137 #0 Type:control >>>>> 2137 Unlock === ... >>>>> 677 Phi === ... >>>>> >>>>> 773 CheckCastPP === 1260 733 >>>>> Oop:java/lang/Integer:NotNull:exact * >>>>> 1260 IfFalse === 1257 >>>>> 1257 If === 1819 1820 >>>>> 1820 Bool === ... >>>>> 733 CastPP === 1230 677 >>>>> 1230 IfTrue === 1225 >>>>> 1225 If === ... >>>>> 677 Phi === ... >>>>> >>>>> >>>>> #1: 1012 => 1257 (same condition, 1257 dominates 1012) and 574 isn't >>>>> part of N4445 anymore >>>>> >>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>> Loop: N4445/N2197 body={ } >>>>> >>>>> 574 CheckCastPP === 1260 5846 >>>>> 1260 IfFalse === 1257 >>>>> 1257 If === 1575 1820 >>>>> 1820 Bool === ... 
>>>>> 5846 CastPP === 1581 677 >>>>> 1581 Proj === 2137 #0 Type:control >>>>> 2137 Unlock === ... >>>>> 677 Phi === ... >>>>> >>>>> 773 CheckCastPP === 1260 733 >>>>> 1260 IfFalse === 1257 >>>>> 1257 If === 1819 1820 >>>>> 1820 Bool === ... >>>>> 733 CastPP === 1230 677 >>>>> 1230 IfTrue === 1225 >>>>> 1225 If === ... >>>>> 677 Phi === ... >>>>> >>>>> === next loop opts pass >>>>> >>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>> Loop: N4445/N2197 body={ } >>>>> >>>>> #2: 5846 => 733 >>>>> >>>>> 574 CheckCastPP === 1260 733 >>>>> 773 CheckCastPP === 1260 733 >>>>> >>>>> 1260 IfFalse === 1257 >>>>> 1257 If === 1819 1820 >>>>> 1820 Bool === ... >>>>> >>>>> 733 CastPP === 1230 677 >>>>> 1230 IfTrue === 1225 >>>>> 1225 If === ... >>>>> 677 Phi === ... >>>>> >>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>> Loop: N4445/N2197 body={ } >>>>> >>>>> #3: 773 => 574 >>>>> >>>>> Loop: N4442/N1838 body={ 1257 1260 } // NB! missing 574 >>>>> Loop: N4445/N2197 body={ } >>>>> >>>>> #4: Peel N4442 (NB! same loop opt iteration, so loop tree hasn't been >>>>> rebuilt yet.) >>>>> - 574 isn't cloned since it's not part of the loop body and it >>>>> leads to unscheduleable IR. >>>>> >>>>> === Crash >>>>> >>>>>> If there was a new node created during loopopts then >>>>>> PhaseIdealLoop::register_new_node() should be called for it already. >>>>>> >>>>>> Vladimir K >>>>>> >>>>>>> >>>>>>> Or do you suggest to rewrite >>>>>>> ConstraintCastNode::dominating_cast() to >>>>>>> use get_ctrl() instead of in(0)? >>>>>>> >>>>>>> Best regards, >>>>>>> Vladimir Ivanov >>>>>>> >>>>>>>>> >>>>>>>>> The fix is to catch the case when dom_cast doesn't dominate n >>>>>>>>> based on >>>>>>>>> info from PhaseIdealLoop and update control info accordingly. >>>>>>>>> >>>>>>>>> Testing: manual (replayed problematic compilation & eyeballed the >>>>>>>>> IR), >>>>>>>>> JPRT, RBT (hs-tier0-comp, in progress). 
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vladimir Ivanov
>>>>>>>>>
>>>>>>>>> PS: thanks to Roland for helping with the fix.

From kirk at kodewerk.com Thu May 25 18:40:42 2017
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Thu, 25 May 2017 20:40:42 +0200
Subject: escape analysis issue with nested objects
In-Reply-To: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net>
References: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net>
Message-ID: <086B2E14-DA04-4F02-82B7-2C39B65B3A8D@kodewerk.com>

Hi,

No bug in HotSpot. The benchmark is buggy. multi() is not inlined because it's too big, whereas single() is inlined. There are 11 allocation-eliminated events that include single(), so it's been DCE'ed.

Kind regards,
Kirk

> On May 25, 2017, at 6:16 PM, Hontvári Attila wrote:
>
> When creating a non-escaping array and putting a newly created, non-escaping object in it, the EA works, there are no heap allocations.
>
> private static void single() {
>     Object x = new Object();
>     Object[] array = new Object[]{x};
>     Object a = array[0];
> }
>
> But if we do the same with two or more objects, the array will be allocated on the heap, and not eliminated.
>
> private static void multi() {
>     Object x = new Object();
>     Object y = new Object();
>     Object[] array = new Object[]{x, y};
>     Object a = array[0];
>     Object b = array[1];
> }
>
> Is there a reason why it is only working in the first case?
>
> This would be useful for example, MethodHandle::invokeWithArguments, when the primitive types are boxed, and put into a varargs array, see my older email [1].
>
> A complete test source code is in [2], if we run it with -verbose:gc, we can see there are many GCs in the second case, but there are no GCs in the first case.
> > [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017-January/010933.html > > [2] https://gist.github.com/anonymous/bd46075ef1ebd858dae49fe6cfe39da8 > From vladimir.kozlov at oracle.com Thu May 25 19:03:57 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 25 May 2017 12:03:57 -0700 Subject: escape analysis issue with nested objects In-Reply-To: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net> References: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net> Message-ID: Hi Attila, What Java version you are using? I tried jdk 7, 8, 9. What I see is that in both cases only arrays are eliminated (-XX:+PrintEliminateAllocations with debug version of VM): BEGIN single Scalar 73 AllocateArray === 51 46 47 8 1 ( 71 59 21 58 41 1 1 ) [[ 74 75 76 83 84 85 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int ) NewMain::single @ bci:9 !jvms: NewMain::single @ bci:9 ++++ Eliminated: 73 AllocateArray END single BEGIN multi Scalar 116 AllocateArray === 91 86 87 8 1 ( 114 103 21 102 41 76 1 1 1 ) [[ 117 118 119 126 127 128 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int ) NewMain::multi @ bci:17 !jvms: NewMain::multi @ bci:17 ++++ Eliminated: 116 AllocateArray DONE But based on code non of objects should be allocated - both methods should have only 'return' generated. Looks like EA missing such case when objects are stored into small array. Thanks, Vladimir On 5/25/17 9:16 AM, Hontv?ri Attila wrote: > When creating a non-escaping array and putting a newly created, > non-escaping object in it, the EA works, there are no heap allocations. > > private static void single() { > Object x = new Object(); > Object[] array = new Object[]{x}; > Object a = array[0]; > } > > But if we do the same with two or more objects, the array will be > allocated on the heap, and not eliminated. 
>
> private static void multi() {
>     Object x = new Object();
>     Object y = new Object();
>     Object[] array = new Object[]{x, y};
>     Object a = array[0];
>     Object b = array[1];
> }
>
> Is there a reason why it is only working in the first case?
>
> This would be useful for example, MethodHandle::invokeWithArguments,
> when the primitive types are boxed, and put into a varargs array, see my
> older email [1].
>
> A complete test source code is in [2], if we run it with -verbose:gc, we
> can see there are many GCs in the second case, but there are no GCs in
> the first case.
>
> [1]
> http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017-January/010933.html
>
> [2] https://gist.github.com/anonymous/bd46075ef1ebd858dae49fe6cfe39da8
>

From kirk.pepperdine at gmail.com Thu May 25 19:12:45 2017
From: kirk.pepperdine at gmail.com (Kirk Pepperdine)
Date: Thu, 25 May 2017 21:12:45 +0200
Subject: escape analysis issue with nested objects
In-Reply-To:
References: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net>
Message-ID:

I can see the arrays being DCE'ed in 8u121. EA seems to be doing its job.

Kind regards,
Kirk

> On May 25, 2017, at 9:03 PM, Vladimir Kozlov wrote:
>
> Hi Attila,
>
> What Java version you are using?
>
> I tried jdk 7, 8, 9.
> > What I see is that in both cases only arrays are eliminated (-XX:+PrintEliminateAllocations with debug version of VM): > > BEGIN single > Scalar 73 AllocateArray === 51 46 47 8 1 ( 71 59 21 58 41 1 1 ) [[ 74 75 76 83 84 85 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int ) NewMain::single @ bci:9 !jvms: NewMain::single @ bci:9 > ++++ Eliminated: 73 AllocateArray > END single > BEGIN multi > Scalar 116 AllocateArray === 91 86 87 8 1 ( 114 103 21 102 41 76 1 1 1 ) [[ 117 118 119 126 127 128 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int ) NewMain::multi @ bci:17 !jvms: NewMain::multi @ bci:17 > ++++ Eliminated: 116 AllocateArray > DONE > > But based on code non of objects should be allocated - both methods should have only 'return' generated. > > Looks like EA missing such case when objects are stored into small array. > > Thanks, > Vladimir > > On 5/25/17 9:16 AM, Hontv?ri Attila wrote: >> When creating a non-escaping array and putting a newly created, non-escaping object in it, the EA works, there are no heap allocations. >> private static void single() { >> Object x = new Object(); >> Object[] array = new Object[]{x}; >> Object a = array[0]; >> } >> But if we do the same with two or more objects, the array will be allocated on the heap, and not eliminated. >> private static void multi() { >> Object x = new Object(); >> Object y = new Object(); >> Object[] array = new Object[]{x, y}; >> Object a = array[0]; >> Object b = array[1]; >> } >> Is there a reason why it is only working in the first case? >> This would be useful for example, MethodHandle::invokeWithArguments, when the primitive types are boxed, and put into a varargs array, see my older email [1]. >> A complete test source code is in [2], if we run it with -verbose:gc, we can see there are many GCs in the second case, but there are no GCs in the first case. 
>> [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017-January/010933.html >> [2] https://gist.github.com/anonymous/bd46075ef1ebd858dae49fe6cfe39da8 From vitalyd at gmail.com Thu May 25 19:50:05 2017 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 25 May 2017 15:50:05 -0400 Subject: escape analysis issue with nested objects In-Reply-To: References: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net> Message-ID: Looking at C2 generated assembly (8u121), both cases are eliminated - the single() and multi() are dead code. To alleviate the truly dead code scenario here, I had these methods return something from the allocated object (I created a dummy class that holds an int) and accessed those objects via array indexing - no allocations are generated (not the array, and not the objects). On Thu, May 25, 2017 at 3:12 PM, Kirk Pepperdine wrote: > I can see the arrays being DVE?ed.. in 8_121. EA seems to be doing it?s > job. > > Kind regards, > Kirk > > > On May 25, 2017, at 9:03 PM, Vladimir Kozlov > wrote: > > > > Hi Attila, > > > > What Java version you are using? > > > > I tried jdk 7, 8, 9. > > > > What I see is that in both cases only arrays are eliminated (-XX:+PrintEliminateAllocations > with debug version of VM): > > > > BEGIN single > > Scalar 73 AllocateArray === 51 46 47 8 1 ( 71 59 21 58 > 41 1 1 ) [[ 74 75 76 83 84 85 ]] rawptr:NotNull ( int:>=0, > java/lang/Object:NotNull *, bool, int ) NewMain::single @ bci:9 !jvms: > NewMain::single @ bci:9 > > ++++ Eliminated: 73 AllocateArray > > END single > > BEGIN multi > > Scalar 116 AllocateArray === 91 86 87 8 1 ( 114 103 21 102 > 41 76 1 1 1 ) [[ 117 118 119 126 127 128 ]] rawptr:NotNull ( > int:>=0, java/lang/Object:NotNull *, bool, int ) NewMain::multi @ bci:17 > !jvms: NewMain::multi @ bci:17 > > ++++ Eliminated: 116 AllocateArray > > DONE > > > > But based on code non of objects should be allocated - both methods > should have only 'return' generated. 
> > > > Looks like EA missing such case when objects are stored into small array. > > > > Thanks, > > Vladimir > > > > On 5/25/17 9:16 AM, Hontv?ri Attila wrote: > >> When creating a non-escaping array and putting a newly created, > non-escaping object in it, the EA works, there are no heap allocations. > >> private static void single() { > >> Object x = new Object(); > >> Object[] array = new Object[]{x}; > >> Object a = array[0]; > >> } > >> But if we do the same with two or more objects, the array will be > allocated on the heap, and not eliminated. > >> private static void multi() { > >> Object x = new Object(); > >> Object y = new Object(); > >> Object[] array = new Object[]{x, y}; > >> Object a = array[0]; > >> Object b = array[1]; > >> } > >> Is there a reason why it is only working in the first case? > >> This would be useful for example, MethodHandle::invokeWithArguments, > when the primitive types are boxed, and put into a varargs array, see my > older email [1]. > >> A complete test source code is in [2], if we run it with -verbose:gc, > we can see there are many GCs in the second case, but there are no GCs in > the first case. > >> [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017- > January/010933.html > >> [2] https://gist.github.com/anonymous/bd46075ef1ebd858dae49fe6cfe39da8 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From igor.veresov at oracle.com Thu May 25 21:54:51 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 25 May 2017 14:54:51 -0700 Subject: RFR(XXS) 8181115: Update suite.py after JDK-8180267 Message-ID: <3C715894-5A72-4D9F-8543-E4DD0E276D8B@oracle.com> Webrev: http://cr.openjdk.java.net/~iveresov/8181115/webrev.00/ Thanks! igor -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dean.long at oracle.com Thu May 25 22:02:26 2017 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 25 May 2017 15:02:26 -0700 Subject: Some JVMCI/Graal questions related to AOT In-Reply-To: <22330f24-9bbe-3055-41e4-35b1f6854fa4@oracle.com> References: <22330f24-9bbe-3055-41e4-35b1f6854fa4@oracle.com> Message-ID: <1f1b91ba-2c0f-ddd2-d822-64ca81c554c5@oracle.com> I have something mostly working, but I noticed that checkcasts on the appendix object are getting folded away. Eventually I want to support that, by adding the type as a dependency that AOT can check at runtime, but for this initial phase I'll probably just wrap the object constant in a LoadConstantIndirectlyNode and return that to the parser. I played around with guarding my changes with (EnableJVMCI && !UseJVMCICompiler) via HotSpotVMConfig, but I want more fine-level control over how invokedynamic constant pool slots are resolved and the adapter and appendix types exposed to the parser, so I'm thinking about adding more flags to GraphBuilderConfiguration or even better introducing a new StaticCompilationPlugin. Does a new plugin sound reasonable? Then I can move the hooks out of HotSpotConstantPool. dl PS - adding new flag fields to GraphBuilderConfiguration looks painful, with all the flags getting passed to the constructor and needing to change all the other withFlag methods. I'm tempted to rewrite it to use setFlag methods. I don't see why we need to create intermediate objects just to set fields. On 5/23/17 12:11 PM, dean.long at oracle.com wrote: > Thanks, I'll try that. > > dl > > > On 5/23/17 12:29 AM, Doug Simon wrote: > >>> On 23 May 2017, at 00:41, dean.long at oracle.com wrote: >>> >>> 1) I'm working on "8132547: [AOT] support invokedynamic >>> instructions" and I've hacked up >>> jdk.vm.ci.hotspot.HotSpotConstantPool.java to handle things like the >>> invokedynamic appendix differently. 
However, since this will only >>> be used by AOT, I'm thinking I need to put my changes in an >>> AOTHotSpotConstantPool subclass. My question is, where is a good >>> place to put such as class (which hopefully won't require messing >>> with modules)? >> Depending on the nature of the changes, I suspect they can simply be >> added to HotSpotConstantPool, guarded by a VM flag exposed by >> HotSpotVMConfig if necessary. HotSpotConstantPool is currently final >> and I don't see a natural place for an AOT specific subclass >> >>> 2) How can I tell if a ResolvedJavaType corresponds to a VM >>> anonymous class (Klass::is_anonymous())? I can't rely on >>> getFingerprint() returning 0, because I want fingerprints for >>> anonymous classes. Is there something existing, or do I need to add >>> something to JVMCI? >> You'd need to add something to JVMCI by exposing the required flags >> and fields in HotSpotVMConfig. >> >> -Doug > From vladimir.kozlov at oracle.com Thu May 25 22:06:28 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 25 May 2017 15:06:28 -0700 Subject: RFR(XXS) 8181115: Update suite.py after JDK-8180267 In-Reply-To: <3C715894-5A72-4D9F-8543-E4DD0E276D8B@oracle.com> References: <3C715894-5A72-4D9F-8543-E4DD0E276D8B@oracle.com> Message-ID: Good Thanks, Vladimir On 5/25/17 2:54 PM, Igor Veresov wrote: > Webrev: http://cr.openjdk.java.net/~iveresov/8181115/webrev.00/ > > Thanks! > igor From ekaterina.pavlova at oracle.com Thu May 25 23:07:18 2017 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Thu, 25 May 2017 16:07:18 -0700 Subject: RFR (S): 8145728 compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java fails in some configurations Message-ID: Hi all, Please review the fix for compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java. The testcase 'testUseAES' fails in case the test is ran in server configuration where -XX:TieredStopAtLevel is set to 1,2 or 3. Added TieredStopAtLevel check. 
Also got rid of AESSupportPredicate and use "@requires" instead. This is more efficient as in this case the tests will be skipped if pre-conditions are not valid. Such tests should be also reported as "not run" instead of "passed" which is more accurate. bug: https://bugs.openjdk.java.net/browse/JDK-8145728 webrev: http://cr.openjdk.java.net/~epavlova//8145728/webrev.00/ Tested by running jprt and manually running compiler/cpuflags tests with different -XX:TieredStopAtLevel values. thanks, -katya p.s. Igor Ignatyev volunteered to sponsor this change. From vladimir.kozlov at oracle.com Fri May 26 00:18:41 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 25 May 2017 17:18:41 -0700 Subject: RFR (S): 8145728 compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java fails in some configurations In-Reply-To: References: Message-ID: <8c27c0f5-02be-0845-05d8-5541bf38fa05@oracle.com> Looks good to me. Thank you for fixing it. One question: why CPUInfo needs to be imported? Vladimir On 5/25/17 4:07 PM, Ekaterina Pavlova wrote: > Hi all, > > Please review the fix for compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java. > The testcase 'testUseAES' fails in case the test is ran in server configuration where -XX:TieredStopAtLevel is set to 1,2 or 3. > Added TieredStopAtLevel check. > > Also got rid of AESSupportPredicate and use "@requires" instead. > This is more efficient as in this case the tests will be skipped if pre-conditions are not valid. > Such tests should be also reported as "not run" instead of "passed" which is more accurate. > > bug: https://bugs.openjdk.java.net/browse/JDK-8145728 > webrev: http://cr.openjdk.java.net/~epavlova//8145728/webrev.00/ > > Tested by running jprt and manually running compiler/cpuflags tests with different -XX:TieredStopAtLevel values. > > thanks, > -katya > > p.s. > Igor Ignatyev volunteered to sponsor this change. 
> From ekaterina.pavlova at oracle.com Fri May 26 00:49:45 2017 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Thu, 25 May 2017 17:49:45 -0700 Subject: RFR (S): 8145728 compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java fails in some configurations In-Reply-To: <8c27c0f5-02be-0845-05d8-5541bf38fa05@oracle.com> References: <8c27c0f5-02be-0845-05d8-5541bf38fa05@oracle.com> Message-ID: It is not needed, removed. Thanks for catching this. I have uploaded new webrev. -katya On 5/25/17 5:18 PM, Vladimir Kozlov wrote: > Looks good to me. Thank you for fixing it. > One question: why CPUInfo needs to be imported? > > Vladimir > > On 5/25/17 4:07 PM, Ekaterina Pavlova wrote: >> Hi all, >> >> Please review the fix for compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java. >> The testcase 'testUseAES' fails in case the test is ran in server configuration where -XX:TieredStopAtLevel is set to 1,2 or 3. >> Added TieredStopAtLevel check. >> >> Also got rid of AESSupportPredicate and use "@requires" instead. >> This is more efficient as in this case the tests will be skipped if pre-conditions are not valid. >> Such tests should be also reported as "not run" instead of "passed" which is more accurate. >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8145728 >> webrev: http://cr.openjdk.java.net/~epavlova//8145728/webrev.00/ >> >> Tested by running jprt and manually running compiler/cpuflags tests with different -XX:TieredStopAtLevel values. >> >> thanks, >> -katya >> >> p.s. >> Igor Ignatyev volunteered to sponsor this change. >> From vladimir.kozlov at oracle.com Fri May 26 01:44:59 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 25 May 2017 18:44:59 -0700 Subject: RFR (S): 8145728 compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java fails in some configurations In-Reply-To: References: <8c27c0f5-02be-0845-05d8-5541bf38fa05@oracle.com> Message-ID: <10d9f54f-89c8-a5e6-cc61-807010f5d80b@oracle.com> Good. 
Thanks, Vladimir On 5/25/17 5:49 PM, Ekaterina Pavlova wrote: > It is not needed, removed. > Thanks for catching this. I have uploaded new webrev. > > -katya > > On 5/25/17 5:18 PM, Vladimir Kozlov wrote: >> Looks good to me. Thank you for fixing it. >> One question: why CPUInfo needs to be imported? >> >> Vladimir >> >> On 5/25/17 4:07 PM, Ekaterina Pavlova wrote: >>> Hi all, >>> >>> Please review the fix for compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java. >>> The testcase 'testUseAES' fails in case the test is ran in server configuration where -XX:TieredStopAtLevel is set to 1,2 or 3. >>> Added TieredStopAtLevel check. >>> >>> Also got rid of AESSupportPredicate and use "@requires" instead. >>> This is more efficient as in this case the tests will be skipped if pre-conditions are not valid. >>> Such tests should be also reported as "not run" instead of "passed" which is more accurate. >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8145728 >>> webrev: http://cr.openjdk.java.net/~epavlova//8145728/webrev.00/ >>> >>> Tested by running jprt and manually running compiler/cpuflags tests with different -XX:TieredStopAtLevel values. >>> >>> thanks, >>> -katya >>> >>> p.s. >>> Igor Ignatyev volunteered to sponsor this change. >>> > From attila at hontvari.net Fri May 26 11:34:04 2017 From: attila at hontvari.net (Hontvári Attila) Date: Fri, 26 May 2017 13:34:04 +0200 Subject: escape analysis issue with nested objects In-Reply-To: References: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net> Message-ID: They are eliminated only if multi() is not inlined into an enclosing loop (8u101). If I disable its inlining with CompileCommand, C2 compiles it to a no-op method. But when inlining is enabled, the wrapper loop compiles into lengthy code: http://pastebin.com/KAq5v2XE Maybe this issue is related to JDK-8155769.
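The observation above (allocations disappear when multi() is compiled standalone but not when it is inlined into an enclosing loop) can be probed with a harness along the following lines. This is a minimal sketch under stated assumptions, not the test from the gist referenced in the thread: the class name `EaProbe`, the loop wrappers and the `measure` helper are hypothetical, and it counts allocated bytes through `com.sun.management.ThreadMXBean`, which is HotSpot-specific. Only `single()` and `multi()` are taken from the original report.

```java
import java.lang.management.ManagementFactory;

// Hypothetical harness for illustration; results depend on JDK version and JIT state.
final class EaProbe {

    // Same shape as single() in the original report.
    static void single() {
        Object x = new Object();
        Object[] array = new Object[]{x};
        Object a = array[0];
    }

    // Same shape as multi() in the original report.
    static void multi() {
        Object x = new Object();
        Object y = new Object();
        Object[] array = new Object[]{x, y};
        Object a = array[0];
        Object b = array[1];
    }

    // Enclosing loops: the case where inlining reportedly defeats elimination.
    static void loopSingle(int n) { for (int i = 0; i < n; i++) single(); }
    static void loopMulti(int n)  { for (int i = 0; i < n; i++) multi(); }

    /** Bytes allocated so far by the current thread (-1 if unsupported). */
    static long allocatedBytes() {
        com.sun.management.ThreadMXBean bean =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        return bean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    /** Bytes allocated while running n iterations of the chosen loop. */
    static long measure(boolean useMulti, int n) {
        long before = allocatedBytes();
        if (useMulti) loopMulti(n); else loopSingle(n);
        return allocatedBytes() - before;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Several rounds, so later rounds run C2-compiled code.
        for (int round = 0; round < 20; round++) {
            long s = measure(false, n);
            long m = measure(true, n);
            System.out.println("round " + round + ": single=" + s + " B, multi=" + m + " B");
        }
    }
}
```

To compare the inlined and non-inlined cases, one could run it once as is and once with `-XX:CompileCommand=dontinline,EaProbe::multi` (an assumption about how inlining was disabled; the message only says "CompileCommand"). Adding `-verbose:gc`, as in the original report, gives a second, coarser signal.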
On 2017-05-25 21:50, Vitaly Davidovich wrote: > Looking at C2 generated assembly (8u121), both cases are eliminated - > the single() and multi() are dead code. To alleviate the truly dead > code scenario here, I had these methods return something from the > allocated object (I created a dummy class that holds an int) and > accessed those objects via array indexing - no allocations are > generated (not the array, and not the objects). > > On Thu, May 25, 2017 at 3:12 PM, Kirk Pepperdine wrote: > > I can see the arrays being DVE'ed.. in 8_121. EA seems to be doing > it's job. > > Kind regards, > Kirk > > > On May 25, 2017, at 9:03 PM, Vladimir Kozlov wrote: > > > > Hi Attila, > > > > What Java version you are using? > > > > I tried jdk 7, 8, 9. > > > > What I see is that in both cases only arrays are eliminated > (-XX:+PrintEliminateAllocations with debug version of VM): > > > > BEGIN single > > Scalar 73 AllocateArray === 51 46 47 8 1 ( 71 59 > 21 58 41 1 1 ) [[ 74 75 76 83 84 85 ]] rawptr:NotNull ( > int:>=0, java/lang/Object:NotNull *, bool, int ) NewMain::single @ > bci:9 !jvms: NewMain::single @ bci:9 > > ++++ Eliminated: 73 AllocateArray > > END single > > BEGIN multi > > Scalar 116 AllocateArray === 91 86 87 8 1 ( 114 103 > 21 102 41 76 1 1 1 ) [[ 117 118 119 126 127 128 ]] > rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int ) > NewMain::multi @ bci:17 !jvms: NewMain::multi @ bci:17 > > ++++ Eliminated: 116 AllocateArray > > DONE > > > > But based on code non of objects should be allocated - both > methods should have only 'return' generated. > > > > Looks like EA missing such case when objects are stored into > small array. > > > > Thanks, > > Vladimir > > > > On 5/25/17 9:16 AM, Hontvári Attila wrote: > >> When creating a non-escaping array and putting a newly created, > non-escaping object in it, the EA works, there are no heap > allocations.
> >> private static void single() { > >> Object x = new Object(); > >> Object[] array = new Object[]{x}; > >> Object a = array[0]; > >> } > >> But if we do the same with two or more objects, the array will > be allocated on the heap, and not eliminated. > >> private static void multi() { > >> Object x = new Object(); > >> Object y = new Object(); > >> Object[] array = new Object[]{x, y}; > >> Object a = array[0]; > >> Object b = array[1]; > >> } > >> Is there a reason why it is only working in the first case? > >> This would be useful for example, > MethodHandle::invokeWithArguments, when the primitive types are > boxed, and put into a varargs array, see my older email [1]. > >> A complete test source code is in [2], if we run it with > -verbose:gc, we can see there are many GCs in the second case, but > there are no GCs in the first case. > >> [1] > http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017-January/010933.html > > >> [2] > https://gist.github.com/anonymous/bd46075ef1ebd858dae49fe6cfe39da8 > > > From zoltan.majo at oracle.com Fri May 26 11:36:18 2017 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Fri, 26 May 2017 13:36:18 +0200 Subject: [9] RFR(XS): 8180855: Null pointer dereference in OopMapSet::all_do of oopMap.cpp:394 Message-ID: Hi, please review the following fix for 8180855. https://bugs.openjdk.java.net/browse/JDK-8180855 https://bugs.openjdk.java.net/browse/JDK-8180855 We check whether the contents of the memory location returned by oopmapreg_to_location() are NULL, but we do not check whether the memory location itself is NULL. The fix adds the missing check. JPRT testing passes; RBT pre-PIT testing is in progress. Thank you!
Best regards, Zoltan From vitalyd at gmail.com Fri May 26 12:18:42 2017 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Fri, 26 May 2017 08:18:42 -0400 Subject: escape analysis issue with nested objects In-Reply-To: References: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net> Message-ID: Indeed, you're right - I see the same (unsurprisingly) on 8u121. I wasn't looking at the loop compilation before, but just the single/multi methods standalone (analogous to them not being inlined). What's interesting also is that even single(), with just a slight re-arrangement, doesn't get eliminated. For instance: static void single() { final Object[] array = new Object[] {new Object()}; final Object a = array[0]; } On Fri, May 26, 2017 at 7:34 AM, Hontvári Attila wrote: > They are eliminated only if multi() is not inlined into an enclosing loop. > (8u101) If I disable inlining it with CompileCommand, C2 compiles it to an > noop method. But when inlining is enabled, the wrapper loop compiles into a > long code: http://pastebin.com/KAq5v2XE > > Maybe this issue is related to JDK-8155769. > On 2017-05-25 21:50, Vitaly Davidovich wrote: > > Looking at C2 generated assembly (8u121), both cases are eliminated - the > single() and multi() are dead code. To alleviate the truly dead code > scenario here, I had these methods return something from the allocated > object (I created a dummy class that holds an int) and accessed those > objects via array indexing - no allocations are generated (not the array, > and not the objects). > > On Thu, May 25, 2017 at 3:12 PM, Kirk Pepperdine < > kirk.pepperdine at gmail.com> wrote: > >> I can see the arrays being DVE'ed.. in 8_121. EA seems to be doing it's >> job. >> >> Kind regards, >> Kirk >> >> > On May 25, 2017, at 9:03 PM, Vladimir Kozlov < >> vladimir.kozlov at oracle.com> wrote: >> > >> > Hi Attila, >> > >> > What Java version you are using? >> > >> > I tried jdk 7, 8, 9.
>> > >> > What I see is that in both cases only arrays are eliminated >> (-XX:+PrintEliminateAllocations with debug version of VM): >> > >> > BEGIN single >> > Scalar 73 AllocateArray === 51 46 47 8 1 ( 71 59 21 58 >> 41 1 1 ) [[ 74 75 76 83 84 85 ]] rawptr:NotNull ( int:>=0, >> java/lang/Object:NotNull *, bool, int ) NewMain::single @ bci:9 !jvms: >> NewMain::single @ bci:9 >> > ++++ Eliminated: 73 AllocateArray >> > END single >> > BEGIN multi >> > Scalar 116 AllocateArray === 91 86 87 8 1 ( 114 103 21 >> 102 41 76 1 1 1 ) [[ 117 118 119 126 127 128 ]] rawptr:NotNull ( >> int:>=0, java/lang/Object:NotNull *, bool, int ) NewMain::multi @ bci:17 >> !jvms: NewMain::multi @ bci:17 >> > ++++ Eliminated: 116 AllocateArray >> > DONE >> > >> > But based on code non of objects should be allocated - both methods >> should have only 'return' generated. >> > >> > Looks like EA missing such case when objects are stored into small >> array. >> > >> > Thanks, >> > Vladimir >> > >> > On 5/25/17 9:16 AM, Hontvári Attila wrote: >> >> When creating a non-escaping array and putting a newly created, >> non-escaping object in it, the EA works, there are no heap allocations. >> >> private static void single() { >> >> Object x = new Object(); >> >> Object[] array = new Object[]{x}; >> >> Object a = array[0]; >> >> } >> >> But if we do the same with two or more objects, the array will be >> allocated on the heap, and not eliminated. >> >> private static void multi() { >> >> Object x = new Object(); >> >> Object y = new Object(); >> >> Object[] array = new Object[]{x, y}; >> >> Object a = array[0]; >> >> Object b = array[1]; >> >> } >> >> Is there a reason why it is only working in the first case? >> >> This would be useful for example, MethodHandle::invokeWithArguments, >> when the primitive types are boxed, and put into a varargs array, see my >> older email [1].
>> >> A complete test source code is in [2], if we run it with -verbose:gc, >> we can see there are many GCs in the second case, but there are no GCs in >> the first case. >> >> [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017-January/010933.html >> >> [2] https://gist.github.com/anonymous/bd46075ef1ebd858dae49fe6cfe39da8 >> >> > > From vladimir.x.ivanov at oracle.com Fri May 26 13:09:25 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 26 May 2017 16:09:25 +0300 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> <37ed70d5-60be-de34-5143-938c2cb34576@oracle.com> Message-ID: <3964efc3-00d8-35da-bbec-df035b694598@oracle.com> How about the following version? http://cr.openjdk.java.net/~vlivanov/8179882/webrev.01/ Best regards, Vladimir Ivanov On 5/25/17 8:23 PM, Vladimir Kozlov wrote: > Okay, now I understand the problem. > > An other thing I do not like about original change is that it could > place dom_cast above its inputs which could be incorrect: > > set_ctrl_and_loop(dom_cast, dom_lca(dom_cast_ctrl, n_ctrl)); > > I like [1] approach which do replacement only with clear domination. > > Also would be nice if you add comment in this code explaining it. > > Thanks, > Vladimir > > On 5/25/17 7:28 AM, Vladimir Ivanov wrote: >>>> Sorry for the confusion. #1 & #2-#3 happen in different (consequent) >>>> loop opts iterations. Loop info is rebuilt after #2 and it goes out of >>>> sync during #2-#3. >>> >>> You mean: rebuild after #1. Right? >> >> Yes, typo. >> >>>>> Based on code it look like we are doing split_if_with_blocks when this >>>>> happens.
major_progress should be set and exit this iteration of loop >>>>> opts. So why "loop cloning" happens? >>>> >>>> Cast node elimination in split_if_with_blocks_pre doesn't bump >>>> _major_progress, so after split-ifs are over (#2-#3 are part of them), >>>> the compiler proceeds to loop opts. >>> >>> Rereading this part: >>> >>>>>> The problem is that 773 was part of N4442 and 574 doesn't dominate >>>>>> the >>>>>> loop body, but #3 doesn't place 574 there when replacing 773. So, >>> >>> Would it help if 574 CheckCastPP is replaced with 773? >> >> Yes, that's another way to fix it [1]. Either one or the other cast >> node dominates. If we are unlucky, compiler encounters the "wrong one" >> (dominator) first and tries to replace it with a dominated one. If the >> compiler skips it, it will eventually find dominated node and will >> attempt to replace it with the dominator (which will succeed). >> >> BTW it's equivalent to using get_ctrl() instead of in(0) in >> ConstraintCastNode::dominating_cast() when called with PhaseIdealLoop. >> >> Let me know what approach to fix the bug you prefer. >> >> Best regards, >> Vladimir Ivanov >> >> [1] >> diff --git a/src/share/vm/opto/loopopts.cpp >> b/src/share/vm/opto/loopopts.cpp >> --- a/src/share/vm/opto/loopopts.cpp >> +++ b/src/share/vm/opto/loopopts.cpp >> @@ -913,7 +913,7 @@ >> >> if (n->is_ConstraintCast()) { >> Node* dom_cast = n->as_ConstraintCast()->dominating_cast(this); >> - if (dom_cast != NULL) { >> + if (dom_cast != NULL && is_dominator(get_ctrl(dom_cast), >> get_ctrl(n)) { >> _igvn.replace_node(n, dom_cast); >> return dom_cast; >> } >> >>>> Calling set_major_progress() when dom_cast doesn't dominate n in >>>> split_if_with_blocks_pre() is another way to fix the bug, but it is >>>> much more expensive (introduces additional loop opts pass). 
>>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>>>> >>>>>> #2: CastPP: 5846 => 733 >>>>>> - 733 dominates 5846 >>>>>> >>>>>> #3: CheckCastPP: 773 => 574 >>>>>> - 773 & 574 have the same control node after #1 (1260/1257), >>>>>> but >>>>>> get_ctrl(dom_cast) points to 1581 (before #2: 574 -1-> 5846 -0-> >>>>>> 1581). >>>>>> >>>>>> The problem is that 773 was part of N4442 and 574 doesn't dominate >>>>>> the >>>>>> loop body, but #3 doesn't place 574 there when replacing 773. So, >>>>>> subsequent cloning of loop body during loop peeling of N4442 produces >>>>>> unschedulable graph. >>>>>> >>>>>> The fix moves 574 up the dominator tree and puts it inside N4442, so >>>>>> it dominates all users of 773 & 574 after #3. >>>>>> >>>>>> get_ctrl() & in(0) point to different CFG nodes after #1, but the >>>>>> loop >>>>>> info is correct until #3 happens which requires 574 to be moved up. >>>>>> >>>>>> Best regards, >>>>>> Vladimir Ivanov >>>>>> >>>>>> [1] >>>>>> === new loop opt pass >>>>>> >>>>>> #0: Initial state >>>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>>> Loop: N4445/N2197 body={ 1012 574 1075 } >>>>>> >>>>>> 574 CheckCastPP === 1075 5846 >>>>>> Oop:java/lang/Integer:NotNull:exact * >>>>>> 1075 IfFalse === 1012 >>>>>> 1012 If === 1575 1820 >>>>>> 1820 Bool === ... >>>>>> 5846 CastPP === 1581 677 >>>>>> 1581 Proj === 2137 #0 Type:control >>>>>> 2137 Unlock === ... >>>>>> 677 Phi === ... >>>>>> >>>>>> 773 CheckCastPP === 1260 733 >>>>>> Oop:java/lang/Integer:NotNull:exact * >>>>>> 1260 IfFalse === 1257 >>>>>> 1257 If === 1819 1820 >>>>>> 1820 Bool === ... >>>>>> 733 CastPP === 1230 677 >>>>>> 1230 IfTrue === 1225 >>>>>> 1225 If === ... >>>>>> 677 Phi === ... 
>>>>>> >>>>>> >>>>>> #1: 1012 => 1257 (same condition, 1257 dominates 1012) and 574 isn't >>>>>> part of N4445 anymore >>>>>> >>>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>>> Loop: N4445/N2197 body={ } >>>>>> >>>>>> 574 CheckCastPP === 1260 5846 >>>>>> 1260 IfFalse === 1257 >>>>>> 1257 If === 1575 1820 >>>>>> 1820 Bool === ... >>>>>> 5846 CastPP === 1581 677 >>>>>> 1581 Proj === 2137 #0 Type:control >>>>>> 2137 Unlock === ... >>>>>> 677 Phi === ... >>>>>> >>>>>> 773 CheckCastPP === 1260 733 >>>>>> 1260 IfFalse === 1257 >>>>>> 1257 If === 1819 1820 >>>>>> 1820 Bool === ... >>>>>> 733 CastPP === 1230 677 >>>>>> 1230 IfTrue === 1225 >>>>>> 1225 If === ... >>>>>> 677 Phi === ... >>>>>> >>>>>> === next loop opts pass >>>>>> >>>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>>> Loop: N4445/N2197 body={ } >>>>>> >>>>>> #2: 5846 => 733 >>>>>> >>>>>> 574 CheckCastPP === 1260 733 >>>>>> 773 CheckCastPP === 1260 733 >>>>>> >>>>>> 1260 IfFalse === 1257 >>>>>> 1257 If === 1819 1820 >>>>>> 1820 Bool === ... >>>>>> >>>>>> 733 CastPP === 1230 677 >>>>>> 1230 IfTrue === 1225 >>>>>> 1225 If === ... >>>>>> 677 Phi === ... >>>>>> >>>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>>> Loop: N4445/N2197 body={ } >>>>>> >>>>>> #3: 773 => 574 >>>>>> >>>>>> Loop: N4442/N1838 body={ 1257 1260 } // NB! missing 574 >>>>>> Loop: N4445/N2197 body={ } >>>>>> >>>>>> #4: Peel N4442 (NB! same loop opt iteration, so loop tree hasn't been >>>>>> rebuilt yet.) >>>>>> - 574 isn't cloned since it's not part of the loop body and it >>>>>> leads to unscheduleable IR. >>>>>> >>>>>> === Crash >>>>>> >>>>>>> If there was a new node created during loopopts then >>>>>>> PhaseIdealLoop::register_new_node() should be called for it already. >>>>>>> >>>>>>> Vladimir K >>>>>>> >>>>>>>> >>>>>>>> Or do you suggest to rewrite >>>>>>>> ConstraintCastNode::dominating_cast() to >>>>>>>> use get_ctrl() instead of in(0)? 
>>>>>>>> >>>>>>>> Best regards, >>>>>>>> Vladimir Ivanov >>>>>>>> >>>>>>>>>> >>>>>>>>>> The fix is to catch the case when dom_cast doesn't dominate n >>>>>>>>>> based on >>>>>>>>>> info from PhaseIdealLoop and update control info accordingly. >>>>>>>>>> >>>>>>>>>> Testing: manual (replayed problematic compilation & eyeballed the >>>>>>>>>> IR), >>>>>>>>>> JPRT, RBT (hs-tier0-comp, in progress). >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Vladimir Ivanov >>>>>>>>>> >>>>>>>>>> PS: thanks to Roland for helping with the fix. From attila at hontvari.net Fri May 26 13:15:45 2017 From: attila at hontvari.net (Hontvári Attila) Date: Fri, 26 May 2017 15:15:45 +0200 Subject: escape analysis issue with nested objects In-Reply-To: References: <15ad210e-d4b8-f04a-ca5d-a988d93ae819@hontvari.net> Message-ID: I've found a bug report that likely describes the same problem you mentioned in your example: JDK-8171828. Another interesting thing: if we replace new Object() with the creation of a primitive box object from a constant value, the allocation is eliminated, even without a local variable. private static void singleBoxedConstant() { Object[] array = new Object[]{new Double(123.456)}; Object a = array[0]; } It also works with multiple box objects: private static void multiBoxedConstant() { Object[] array = new Object[]{new Double(21), new Double(1.2)}; Object a = array[0]; Object b = array[1]; } But if we replace the box constructor argument with a non-constant value, elimination fails again. private static volatile double x = 43; private static void multiBoxedNonconstant() { Object[] array = new Object[]{new Double(x), new Double(1.2)}; Object a = array[0]; Object b = array[1]; } On 2017-05-26 14:18, Vitaly Davidovich wrote: > Indeed, you're right - I see the same (unsurprisingly) on 8u121. I > wasn't looking at the loop compilation before, but just the > single/multi methods standalone (analogous to them not being inlined).
> > What's interesting also is that even single(), with just a slight > re-arrangement, doesn't get eliminated. For instance: > > static void single() { > final Object[] array = new Object[] {new Object()}; > final Object a = array[0]; > } > > > On Fri, May 26, 2017 at 7:34 AM, Hontvári Attila wrote: > > They are eliminated only if multi() is not inlined into an > enclosing loop. (8u101) If I disable inlining it with > CompileCommand, C2 compiles it to an noop method. But when > inlining is enabled, the wrapper loop compiles into a long code: > http://pastebin.com/KAq5v2XE > > Maybe this issue is related to JDK-8155769. > > On 2017-05-25 21:50, Vitaly Davidovich wrote: >> Looking at C2 generated assembly (8u121), both cases are >> eliminated - the single() and multi() are dead code. To >> alleviate the truly dead code scenario here, I had these methods >> return something from the allocated object (I created a dummy >> class that holds an int) and accessed those objects via array >> indexing - no allocations are generated (not the array, and not >> the objects). >> >> On Thu, May 25, 2017 at 3:12 PM, Kirk Pepperdine wrote: >> >> I can see the arrays being DVE'ed.. in 8_121. EA seems to be >> doing it's job. >> >> Kind regards, >> Kirk >> >> > On May 25, 2017, at 9:03 PM, Vladimir Kozlov wrote: >> > >> > Hi Attila, >> > >> > What Java version you are using? >> > >> > I tried jdk 7, 8, 9.
>> > >> > What I see is that in both cases only arrays are eliminated >> (-XX:+PrintEliminateAllocations with debug version of VM): >> > >> > BEGIN single >> > Scalar 73 AllocateArray === 51 46 47 8 1 ( 71 >> 59 21 58 41 1 1 ) [[ 74 75 76 83 84 85 ]] >> rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, >> int ) NewMain::single @ bci:9 !jvms: NewMain::single @ bci:9 >> > ++++ Eliminated: 73 AllocateArray >> > END single >> > BEGIN multi >> > Scalar 116 AllocateArray === 91 86 87 8 1 ( 114 >> 103 21 102 41 76 1 1 1 ) [[ 117 118 119 126 127 128 >> ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, >> bool, int ) NewMain::multi @ bci:17 !jvms: NewMain::multi @ >> bci:17 >> > ++++ Eliminated: 116 AllocateArray >> > DONE >> > >> > But based on code non of objects should be allocated - both >> methods should have only 'return' generated. >> > >> > Looks like EA missing such case when objects are stored >> into small array. >> > >> > Thanks, >> > Vladimir >> > >> > On 5/25/17 9:16 AM, Hontvári Attila wrote: >> >> When creating a non-escaping array and putting a newly >> created, non-escaping object in it, the EA works, there are >> no heap allocations. >> >> private static void single() { >> >> Object x = new Object(); >> >> Object[] array = new Object[]{x}; >> >> Object a = array[0]; >> >> } >> >> But if we do the same with two or more objects, the array >> will be allocated on the heap, and not eliminated. >> >> private static void multi() { >> >> Object x = new Object(); >> >> Object y = new Object(); >> >> Object[] array = new Object[]{x, y}; >> >> Object a = array[0]; >> >> Object b = array[1]; >> >> } >> >> Is there a reason why it is only working in the first case? >> >> This would be useful for example, >> MethodHandle::invokeWithArguments, when the primitive types >> are boxed, and put into a varargs array, see my older email [1].
>> >> A complete test source code is in [2], if we run it with >> -verbose:gc, we can see there are many GCs in the second >> case, but there are no GCs in the first case. >> >> [1] >> http://mail.openjdk.java.net/pipermail/jigsaw-dev/2017-January/010933.html >> >> >> [2] >> https://gist.github.com/anonymous/bd46075ef1ebd858dae49fe6cfe39da8 >> >> >> > > From dean.long at oracle.com Fri May 26 20:10:19 2017 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 26 May 2017 13:10:19 -0700 Subject: Some JVMCI/Graal questions related to AOT In-Reply-To: <2108DEBD-0FE7-4F78-AC4D-52950F8C6D15@oracle.com> References: <22330f24-9bbe-3055-41e4-35b1f6854fa4@oracle.com> <1f1b91ba-2c0f-ddd2-d822-64ca81c554c5@oracle.com> <2108DEBD-0FE7-4F78-AC4D-52950F8C6D15@oracle.com> Message-ID: <8ed3398b-84ec-9ae2-95f1-f53902dd6238@oracle.com> OK, it makes sense now :-) dl On 5/26/17 10:44 AM, Thomas Wuerthinger wrote: > Dean, > > Those intermediate objects are not actually created when the code is compiled with a compiler supporting escape analysis (e.g., Graal ;)). This pattern is useful for flags that are only set at initialisation time, because it allows to declare those flags as "final". > > - thomas > > >> On 26 May 2017, at 00:02, dean.long at oracle.com wrote: >> >> I have something mostly working, but I noticed that checkcasts on the appendix object are getting folded away. Eventually I want to support that, by adding the type as a dependency that AOT can check at runtime, but for this initial phase I'll probably just wrap the object constant in a LoadConstantIndirectlyNode and return that to the parser.
>> >> I played around with guarding my changes with (EnableJVMCI && !UseJVMCICompiler) via HotSpotVMConfig, but I want more fine-level control over how invokedynamic constant pool slots are resolved and the adapter and appendix types exposed to the parser, so I'm thinking about adding more flags to GraphBuilderConfiguration or even better introducing a new StaticCompilationPlugin. Does a new plugin sound reasonable? Then I can move the hooks out of HotSpotConstantPool. >> >> dl >> >> PS - adding new flag fields to GraphBuilderConfiguration looks painful, with all the flags getting passed to the constructor and needing to change all the other withFlag methods. I'm tempted to rewrite it to use setFlag methods. I don't see why we need to create intermediate objects just to set fields. >> >> On 5/23/17 12:11 PM, dean.long at oracle.com wrote: >>> Thanks, I'll try that. >>> >>> dl >>> >>> >>> On 5/23/17 12:29 AM, Doug Simon wrote: >>> >>>>> On 23 May 2017, at 00:41, dean.long at oracle.com wrote: >>>>> >>>>> 1) I'm working on "8132547: [AOT] support invokedynamic instructions" and I've hacked up jdk.vm.ci.hotspot.HotSpotConstantPool.java to handle things like the invokedynamic appendix differently. However, since this will only be used by AOT, I'm thinking I need to put my changes in an AOTHotSpotConstantPool subclass. My question is, where is a good place to put such as class (which hopefully won't require messing with modules)? >>>> Depending on the nature of the changes, I suspect they can simply be added to HotSpotConstantPool, guarded by a VM flag exposed by HotSpotVMConfig if necessary. HotSpotConstantPool is currently final and I don't see a natural place for an AOT specific subclass >>>> >>>>> 2) How can I tell if a ResolvedJavaType corresponds to a VM anonymous class (Klass::is_anonymous())? I can't rely on getFingerprint() returning 0, because I want fingerprints for anonymous classes. Is there something existing, or do I need to add something to JVMCI? 
>>>> You'd need to add something to JVMCI by exposing the required flags and fields in HotSpotVMConfig. >>>> >>>> -Doug From vladimir.kozlov at oracle.com Sat May 27 00:22:56 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 26 May 2017 17:22:56 -0700 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: <3964efc3-00d8-35da-bbec-df035b694598@oracle.com> References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> <37ed70d5-60be-de34-5143-938c2cb34576@oracle.com> <3964efc3-00d8-35da-bbec-df035b694598@oracle.com> Message-ID: Looks good. Thanks, Vladimir On 5/26/17 6:09 AM, Vladimir Ivanov wrote: > How about the following version? > http://cr.openjdk.java.net/~vlivanov/8179882/webrev.01/ > > Best regards, > Vladimir Ivanov > > On 5/25/17 8:23 PM, Vladimir Kozlov wrote: >> Okay, now I understand the problem. >> >> An other thing I do not like about original change is that it could >> place dom_cast above its inputs which could be incorrect: >> >> set_ctrl_and_loop(dom_cast, dom_lca(dom_cast_ctrl, n_ctrl)); >> >> I like [1] approach which do replacement only with clear domination. >> >> Also would be nice if you add comment in this code explaining it. >> >> Thanks, >> Vladimir >> >> On 5/25/17 7:28 AM, Vladimir Ivanov wrote: >>>>> Sorry for the confusion. #1 & #2-#3 happen in different (consequent) >>>>> loop opts iterations. Loop info is rebuilt after #2 and it goes out of >>>>> sync during #2-#3. >>>> >>>> You mean: rebuild after #1. Right? >>> >>> Yes, typo. >>> >>>>>> Based on code it look like we are doing split_if_with_blocks when >>>>>> this >>>>>> happens. major_progress should be set and exit this iteration of loop >>>>>> opts. So why "loop cloning" happens? 
>>>>> >>>>> Cast node elimination in split_if_with_blocks_pre doesn't bump >>>>> _major_progress, so after split-ifs are over (#2-#3 are part of them), >>>>> the compiler proceeds to loop opts. >>>> >>>> Rereading this part: >>>> >>>>>>> The problem is that 773 was part of N4442 and 574 doesn't dominate >>>>>>> the >>>>>>> loop body, but #3 doesn't place 574 there when replacing 773. So, >>>> >>>> Would it help if 574 CheckCastPP is replaced with 773? >>> >>> Yes, that's another way to fix it [1]. Either one or the other cast >>> node dominates. If we are unlucky, compiler encounters the "wrong one" >>> (dominator) first and tries to replace it with a dominated one. If the >>> compiler skips it, it will eventually find dominated node and will >>> attempt to replace it with the dominator (which will succeed). >>> >>> BTW it's equivalent to using get_ctrl() instead of in(0) in >>> ConstraintCastNode::dominating_cast() when called with PhaseIdealLoop. >>> >>> Let me know what approach to fix the bug you prefer. >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> [1] >>> diff --git a/src/share/vm/opto/loopopts.cpp >>> b/src/share/vm/opto/loopopts.cpp >>> --- a/src/share/vm/opto/loopopts.cpp >>> +++ b/src/share/vm/opto/loopopts.cpp >>> @@ -913,7 +913,7 @@ >>> >>> if (n->is_ConstraintCast()) { >>> Node* dom_cast = n->as_ConstraintCast()->dominating_cast(this); >>> - if (dom_cast != NULL) { >>> + if (dom_cast != NULL && is_dominator(get_ctrl(dom_cast), >>> get_ctrl(n)) { >>> _igvn.replace_node(n, dom_cast); >>> return dom_cast; >>> } >>> >>>>> Calling set_major_progress() when dom_cast doesn't dominate n in >>>>> split_if_with_blocks_pre() is another way to fix the bug, but it is >>>>> much more expensive (introduces additional loop opts pass). 
>>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>>>> >>>>>>> #2: CastPP: 5846 => 733 >>>>>>> - 733 dominates 5846 >>>>>>> >>>>>>> #3: CheckCastPP: 773 => 574 >>>>>>> - 773 & 574 have the same control node after #1 (1260/1257), >>>>>>> but >>>>>>> get_ctrl(dom_cast) points to 1581 (before #2: 574 -1-> 5846 -0-> >>>>>>> 1581). >>>>>>> >>>>>>> The problem is that 773 was part of N4442 and 574 doesn't dominate >>>>>>> the >>>>>>> loop body, but #3 doesn't place 574 there when replacing 773. So, >>>>>>> subsequent cloning of loop body during loop peeling of N4442 >>>>>>> produces >>>>>>> unschedulable graph. >>>>>>> >>>>>>> The fix moves 574 up the dominator tree and puts it inside N4442, so >>>>>>> it dominates all users of 773 & 574 after #3. >>>>>>> >>>>>>> get_ctrl() & in(0) point to different CFG nodes after #1, but the >>>>>>> loop >>>>>>> info is correct until #3 happens which requires 574 to be moved up. >>>>>>> >>>>>>> Best regards, >>>>>>> Vladimir Ivanov >>>>>>> >>>>>>> [1] >>>>>>> === new loop opt pass >>>>>>> >>>>>>> #0: Initial state >>>>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>>>> Loop: N4445/N2197 body={ 1012 574 1075 } >>>>>>> >>>>>>> 574 CheckCastPP === 1075 5846 >>>>>>> Oop:java/lang/Integer:NotNull:exact * >>>>>>> 1075 IfFalse === 1012 >>>>>>> 1012 If === 1575 1820 >>>>>>> 1820 Bool === ... >>>>>>> 5846 CastPP === 1581 677 >>>>>>> 1581 Proj === 2137 #0 Type:control >>>>>>> 2137 Unlock === ... >>>>>>> 677 Phi === ... >>>>>>> >>>>>>> 773 CheckCastPP === 1260 733 >>>>>>> Oop:java/lang/Integer:NotNull:exact * >>>>>>> 1260 IfFalse === 1257 >>>>>>> 1257 If === 1819 1820 >>>>>>> 1820 Bool === ... >>>>>>> 733 CastPP === 1230 677 >>>>>>> 1230 IfTrue === 1225 >>>>>>> 1225 If === ... >>>>>>> 677 Phi === ... 
>>>>>>> >>>>>>> >>>>>>> #1: 1012 => 1257 (same condition, 1257 dominates 1012) and 574 isn't >>>>>>> part of N4445 anymore >>>>>>> >>>>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>>>> Loop: N4445/N2197 body={ } >>>>>>> >>>>>>> 574 CheckCastPP === 1260 5846 >>>>>>> 1260 IfFalse === 1257 >>>>>>> 1257 If === 1575 1820 >>>>>>> 1820 Bool === ... >>>>>>> 5846 CastPP === 1581 677 >>>>>>> 1581 Proj === 2137 #0 Type:control >>>>>>> 2137 Unlock === ... >>>>>>> 677 Phi === ... >>>>>>> >>>>>>> 773 CheckCastPP === 1260 733 >>>>>>> 1260 IfFalse === 1257 >>>>>>> 1257 If === 1819 1820 >>>>>>> 1820 Bool === ... >>>>>>> 733 CastPP === 1230 677 >>>>>>> 1230 IfTrue === 1225 >>>>>>> 1225 If === ... >>>>>>> 677 Phi === ... >>>>>>> >>>>>>> === next loop opts pass >>>>>>> >>>>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>>>> Loop: N4445/N2197 body={ } >>>>>>> >>>>>>> #2: 5846 => 733 >>>>>>> >>>>>>> 574 CheckCastPP === 1260 733 >>>>>>> 773 CheckCastPP === 1260 733 >>>>>>> >>>>>>> 1260 IfFalse === 1257 >>>>>>> 1257 If === 1819 1820 >>>>>>> 1820 Bool === ... >>>>>>> >>>>>>> 733 CastPP === 1230 677 >>>>>>> 1230 IfTrue === 1225 >>>>>>> 1225 If === ... >>>>>>> 677 Phi === ... >>>>>>> >>>>>>> Loop: N4442/N1838 body={ 1257 773 1260 } >>>>>>> Loop: N4445/N2197 body={ } >>>>>>> >>>>>>> #3: 773 => 574 >>>>>>> >>>>>>> Loop: N4442/N1838 body={ 1257 1260 } // NB! missing 574 >>>>>>> Loop: N4445/N2197 body={ } >>>>>>> >>>>>>> #4: Peel N4442 (NB! same loop opt iteration, so loop tree hasn't >>>>>>> been >>>>>>> rebuilt yet.) >>>>>>> - 574 isn't cloned since it's not part of the loop body and it >>>>>>> leads to unscheduleable IR. >>>>>>> >>>>>>> === Crash >>>>>>> >>>>>>>> If there was a new node created during loopopts then >>>>>>>> PhaseIdealLoop::register_new_node() should be called for it >>>>>>>> already. >>>>>>>> >>>>>>>> Vladimir K >>>>>>>> >>>>>>>>> >>>>>>>>> Or do you suggest to rewrite >>>>>>>>> ConstraintCastNode::dominating_cast() to >>>>>>>>> use get_ctrl() instead of in(0)? 
>>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Vladimir Ivanov >>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> The fix is to catch the case when dom_cast doesn't dominate n >>>>>>>>>>> based on >>>>>>>>>>> info from PhaseIdealLoop and update control info accordingly. >>>>>>>>>>> >>>>>>>>>>> Testing: manual (replayed problematic compilation & eyeballed >>>>>>>>>>> the >>>>>>>>>>> IR), >>>>>>>>>>> JPRT, RBT (hs-tier0-comp, in progress). >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> Vladimir Ivanov >>>>>>>>>>> >>>>>>>>>>> PS: thanks to Roland for helping with the fix. From vladimir.kozlov at oracle.com Sat May 27 00:25:14 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 26 May 2017 17:25:14 -0700 Subject: [9] RFR(XS): 8180855: Null pointer dereference in OopMapSet::all_do of oopMap.cpp:394 In-Reply-To: References: Message-ID: <8c5fbda8-a390-ad72-dd39-09cf5cd5bdeb@oracle.com> Zoltan, you missed webrev link. Is it this one?: http://cr.openjdk.java.net/~zmajo/8180855/webrev.00/ It looks good. Thanks, Vladimir On 5/26/17 4:36 AM, Zoltán Majó wrote: > Hi, > > > please review the following fix for 8180855. > https://bugs.openjdk.java.net/browse/JDK-8180855 > https://bugs.openjdk.java.net/browse/JDK-8180855 > > We check if the contents of the memory location returned by > oopmapreg_to_location() is NULL, but we do not check if the memory > location itself is NULL. The fix adds the missing check. JPRT testing > passes, RBT pre-PIT testing is in progress. > > Thank you!
> > Best regards, > > > Zoltan > From rwestrel at redhat.com Mon May 29 07:43:05 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 29 May 2017 09:43:05 +0200 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: <3964efc3-00d8-35da-bbec-df035b694598@oracle.com> References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> <37ed70d5-60be-de34-5143-938c2cb34576@oracle.com> <3964efc3-00d8-35da-bbec-df035b694598@oracle.com> Message-ID: > http://cr.openjdk.java.net/~vlivanov/8179882/webrev.01/ That looks good to me. Roland. From rwestrel at redhat.com Mon May 29 08:10:08 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 29 May 2017 10:10:08 +0200 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> <6cea2fa0-1bcf-3c86-1494-a51f66430577@oracle.com> <8de1ea64-79cd-e7ec-adb4-1f1f276612bf@oracle.com> Message-ID: Hi Volker, Thanks for reviewing this again. > I think it's good that you now use SIGILL on all platforms. The change > looks good except the following minor nits I've already mentioned in > my first review. I'll leave it up to you if you want to fix them and > in the case you do there's no need for a new webrev. Sorry, I forgot about those. I will fix the comments. > test/compiler/unsafe/TestMaybeNullUnsafeAccess.java > > - wouldn't it be safer to run the test with -Xbatch and > -XX:-UseOnStackReplacement for any case and to make it evident that > the test relies on the fact that test1() and test2() are both > compiled but not inlined? Ok.
Do you suggest adding a comment that states that test1/test2 shouldn't be inlined? > - you should also update the copyright on most files you've touched. I don't think I've updated a copyright in years. I thought it was ok to let bulk periodic copyright updates take care of that. Roland. From zoltan.majo at oracle.com Mon May 29 08:28:39 2017 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Mon, 29 May 2017 10:28:39 +0200 Subject: [9] RFR(XS): 8180855: Null pointer dereference in OopMapSet::all_do of oopMap.cpp:394 In-Reply-To: <8c5fbda8-a390-ad72-dd39-09cf5cd5bdeb@oracle.com> References: <8c5fbda8-a390-ad72-dd39-09cf5cd5bdeb@oracle.com> Message-ID: <8200cb65-9a8e-dcec-29cb-d4fd0d348d65@oracle.com> Hi Vladimir, On 05/27/2017 02:25 AM, Vladimir Kozlov wrote: > Zoltan, you missed webrev link. Is it this one?: > > http://cr.openjdk.java.net/~zmajo/8180855/webrev.00/ sorry, I indeed missed the link. You posted the correct link -- thank you for digging it up. > It looks good. RBT testing looks good, I'll push the fix. Thank you for the review! Best regards, Zoltan > > Thanks, > Vladimir > > On 5/26/17 4:36 AM, Zoltán Majó wrote: >> Hi, >> >> >> please review the following fix for 8180855. >> https://bugs.openjdk.java.net/browse/JDK-8180855 >> https://bugs.openjdk.java.net/browse/JDK-8180855 >> >> We check if the contents of the memory location returned by >> oopmapreg_to_location() is NULL, but we do not check if the memory >> location itself is NULL. The fix adds the missing check. JPRT testing >> passes, RBT pre-PIT testing is in progress. >> >> Thank you!
>> >> Best regards, >> >> >> Zoltan >> From martin.doerr at sap.com Mon May 29 10:58:19 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 29 May 2017 10:58:19 +0000 Subject: [10] [s390] RFR(XS): micro-optimization in resize_frame_absolute() In-Reply-To: <26853C9C-4CBE-4551-ACAB-20509AB82478@sap.com> References: <26853C9C-4CBE-4551-ACAB-20509AB82478@sap.com> Message-ID: Hi Lutz, thanks for providing the webrev. Is it allowed to write to the stack before updating the SP? I know that the PPC ABI allows this within a certain range, but I'm not aware of such an exception on s390x. I'd also prefer separate functions instead of one with so many cases. E.g. one function which copies the fp and one which takes a given fp like: void MacroAssembler::resize_frame_absolute(Register newSP, Register fp) { assert_different_registers(newSP, fp, Z_SP); z_lgr(Z_SP, newSP); z_stg(fp, _z_abi(callers_sp), (newSP == Z_R0) ? Z_SP : newSP); } Thanks and best regards, Martin From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Schmidt, Lutz Sent: Tuesday, 23 May 2017 12:47 To: hotspot-compiler-dev at openjdk.java.net Subject: [10] [s390] RFR(XS): micro-optimization in resize_frame_absolute() Dear all, I would like to request reviews for this tiny, s390-only enhancement: Bug: https://bugs.openjdk.java.net/browse/JDK-8180659 Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8180659.00/ Thank you! Lutz -------------- next part -------------- An HTML attachment was scrubbed...
URL: From vladimir.x.ivanov at oracle.com Mon May 29 11:22:49 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 29 May 2017 14:22:49 +0300 Subject: RFR [9] (XS): 8179882: C2: Stale control info after cast node elimination during loop optimization pass In-Reply-To: References: <8983a735-8435-9b27-6580-8eed965b9bc2@oracle.com> <613dc165-c8d2-c38b-bbc4-12f1884944dc@oracle.com> <8e502315-5c87-100b-783c-fd8dab4cbb9c@oracle.com> <488d42b7-be5f-f739-1887-56691bcc39c1@oracle.com> <37ed70d5-60be-de34-5143-938c2cb34576@oracle.com> <3964efc3-00d8-35da-bbec-df035b694598@oracle.com> Message-ID: Vladimir, Roland, thanks! Best regards, Vladimir Ivanov On 5/29/17 10:43 AM, Roland Westrelin wrote: > >> http://cr.openjdk.java.net/~vlivanov/8179882/webrev.01/ > > That looks good to me. > > Roland. > From tobias.hartmann at oracle.com Mon May 29 15:48:02 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 29 May 2017 17:48:02 +0200 Subject: [8u] RFR(XS): 8180813: Null pointer dereference of CodeCache::find_blob() result Message-ID: <594c466b-ed3e-6c71-ab82-a520c3a9c952@oracle.com> Hi, please review the following backport to JDK 8u: 8180813: Null pointer dereference of CodeCache::find_blob() result http://cr.openjdk.java.net/~thartmann/8180813/webrev.00/ http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/63ac6d565c21 http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026307.html The backport does not apply cleanly because "is_compiled" is called "is_nmethod" and "as_compiled_method_or_null" is called "as_nmethod_or_null" in JDK 8u. Here's the new webrev: http://cr.openjdk.java.net/~thartmann/8180813_8u/webrev.00/ I'll request approval in a separate thread on jdk8u-dev. 
Thanks, Tobias From vladimir.kozlov at oracle.com Mon May 29 17:44:52 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 29 May 2017 10:44:52 -0700 Subject: [8u] RFR(XS): 8180813: Null pointer dereference of CodeCache::find_blob() result In-Reply-To: <594c466b-ed3e-6c71-ab82-a520c3a9c952@oracle.com> References: <594c466b-ed3e-6c71-ab82-a520c3a9c952@oracle.com> Message-ID: <0c93b64c-47ea-823f-b8bf-b75c56ed5052@oracle.com> Looks good. Thanks, Vladimir On 5/29/17 8:48 AM, Tobias Hartmann wrote: > Hi, > > please review the following backport to JDK 8u: > > 8180813: Null pointer dereference of CodeCache::find_blob() result > http://cr.openjdk.java.net/~thartmann/8180813/webrev.00/ > http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/63ac6d565c21 > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026307.html > > The backport does not apply cleanly because "is_compiled" is called "is_nmethod" and "as_compiled_method_or_null" is called "as_nmethod_or_null" in JDK 8u. > > Here's the new webrev: > http://cr.openjdk.java.net/~thartmann/8180813_8u/webrev.00/ > > I'll request approval in a separate thread on jdk8u-dev. > > Thanks, > Tobias > From tobias.hartmann at oracle.com Tue May 30 05:00:44 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 30 May 2017 07:00:44 +0200 Subject: [8u] RFR(XS): 8180813: Null pointer dereference of CodeCache::find_blob() result In-Reply-To: <0c93b64c-47ea-823f-b8bf-b75c56ed5052@oracle.com> References: <594c466b-ed3e-6c71-ab82-a520c3a9c952@oracle.com> <0c93b64c-47ea-823f-b8bf-b75c56ed5052@oracle.com> Message-ID: Thanks Vladimir! Best regards, Tobias On 29.05.2017 19:44, Vladimir Kozlov wrote: > Looks good. 
> > Thanks, > Vladimir > > On 5/29/17 8:48 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following backport to JDK 8u: >> >> 8180813: Null pointer dereference of CodeCache::find_blob() result >> http://cr.openjdk.java.net/~thartmann/8180813/webrev.00/ >> http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/63ac6d565c21 >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-May/026307.html >> >> The backport does not apply cleanly because "is_compiled" is called "is_nmethod" and "as_compiled_method_or_null" is called "as_nmethod_or_null" in JDK 8u. >> >> Here's the new webrev: >> http://cr.openjdk.java.net/~thartmann/8180813_8u/webrev.00/ >> >> I'll request approval in a separate thread on jdk8u-dev. >> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Tue May 30 12:17:10 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 30 May 2017 14:17:10 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: <43893ed5-cbf5-4cf9-a727-8733850260de@oracle.com> <31848120-9bfa-505a-60dd-fa8b1c0b7efa@oracle.com> Message-ID: Hi Roland, okay, looks good. I'll re-run test. Thanks, Tobias On 24.05.2017 09:55, Roland Westrelin wrote: > >> with your fix, compiler/arraycopy/TestEliminatedArrayCopyDeopt fails >> on all platforms with: > > Thanks. The problem here is that with -XX:-ReduceInitialCardMarks, C2 > adds a g1 post barrier between the arraycopy node and the following > membar that ArrayCopyNode::may_modify() doesn't expect. Here is a new > webrev: > > http://cr.openjdk.java.net/~roland/8179678/webrev.04/ > > with a fixed ArrayCopyNode::may_modify(). I also added verification code > that calls ArrayCopyNode::may_modify() after an array copy node is > expanded to verify that ArrayCopyNode::may_modify() is consistent with > the just expanded subgraph. > > Roland. 
> From robbin.ehn at oracle.com Tue May 30 12:33:06 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 30 May 2017 14:33:06 +0200 Subject: Low-Overhead Heap Profiling In-Reply-To: References: <2af975e6-3827-bd57-0c3d-fadd54867a67@oracle.com> <365499b6-3f4d-a4df-9e7e-e72a739fb26b@oracle.com> Message-ID: <102c59b8-25b6-8c21-8eef-1de7d0bbf629@oracle.com> Hi Jc, On 05/22/2017 08:47 PM, JC Beyler wrote: > Dear all, > > I have a new webrev up: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.03/ I liked this! Two small things: heapMonitoring.hpp class HeapMonitoring should extend AllStatic heapMonitoring.cpp class MuxLocker should extend StackObj But I think you should skip MuxLocker or push it separate generic enhancement. Great with the jtreg test, thanks alot! > > This webrev has, I hope, fixed a lot of the comments from Robbin: > - The casts normally are all C++ style > - Moved this to jdk10-hs > - I have not tested slowdebug yet, hopefully it does not break there > - Added the garbage collection system: > - Now live sampled allocations are tracked throughout their lifetime > - When GC happens, it moves the sampled allocation information to two lists: recent and frequent GC lists > - Those lists use the array system that the live objects were using before but have different re-use strategies > - Added the JVMTI API for them via a GetFrequentGarbageTraces and GetGarbageTraces > - Both use the same JVMTI structures > - Added the calls to them for the test, though I've kept that test simple for now: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.03/raw_files/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c > - As I write this, I notice my webrev is missing a final change I made to the test that calls the same ReleaseTraces to each live/garbage/frequent structure. This > is updated in my local repo and will get in the next webrev. 
> > Next steps for this work are: > - Putting the TLAB implementation (yes not forgotten ;-)) > - Adding more testing and separate the current test system to check things a bit more thoroughly > - Have not tried to circumvent AsyncGetCallTrace yet > - Still have to double check the stack walker a bit more Looking forward to this. Could someone from compiler take a look please? Thanks! /Robbin > > Happy webrev perusal! > Jc > > > On Tue, May 16, 2017 at 5:20 AM, Robbin Ehn > wrote: > > Just a few answers, > > On 05/15/2017 06:48 PM, JC Beyler wrote: > > Dear all, > > I've updated the webrev to: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/ > > > > > I'll look at this later, thanks! > > > Robbin, > I believe I have addressed most of your items with webrev 02: > - I added a JTreg test to show how it works: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_files/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c > > > > - I've modified the code to use its own data structures both internally and externally, this will make it easier to move out of AsyncGetCallTrace as we move > forward, that is still on my TODOs > - I cleaned up the JVMTI API by passing a structure that handles the num_traces and put in a ReleaseTraces as well > - I cleaned up other issues as well. > > However, I have three questions, which are probably because I'm new in this community: > 1) My previous webrevs were based off of JDK9 by mistake. When I took JDK10 via : hg clone http://hg.openjdk.java.net/jdk10/jdk10 > > jdk10 > - I don't see code compatible with what you were showing (ie your patches don't make sense for that code base; ex: klass is still accessed via klass() for > example in collectedHeap.inline.hpp) > - Would you know what is the right hg clone command so we are working on the same code base? > > > We use jdk10-hs, e.g. 
> hg tclone http://hg.openjdk.java.net/jdk10/hs 10-hs > > There is sporadic big merges going from jdk9->jdk10->jdk10-hs and jdk10-hs->jdk10, so 10 is moving... > > > 2) You mentioned I was using os::malloc, new, NEW_C_HEAP_ARRAY; I cleaned out the os::malloc but which of the new vs NEW_C_HEAP_ARRAY should I use. It might be > that I don't understand when one uses one or the other but I see both used around the code base? > - Is it that new is to be used for anything internal and NEW_C_HEAP_ARRAY anything provided to the JVMTI users outside of the JVM? > > > We overload new operator when you extend correct base class, e.g. CHeapObj so use 'new' > But for arrays you will need the macro NEW_C_HEAP_ARRAY. > > > 3) Casts: same kind question: which should I use. The code was using a bit of everything, I'll refactor it entirely but I was not clear if I should go to C casts > or C++ casts as I see both in the codebase. What is the convention I should use? > > > Just be consist, use what suites you, C++ casts might be preferable, if we are moving towards C++11. > And use 'right' cast, e.g. going from Thread* to JavaThread* you should use C cast or static_cast, not reinterpret_cast I would say. > > > Final notes on this webrev: > - I am still missing: > - Putting a TLAB implementation so that we can compare both webrevs > - Have not tried to circumvent AsyncGetCallTrace > - Putting in the handling of GC'd objects > - Fix a stack walker issue I have seen, I think I know the problem and will test that theory out for the next webrev > > I will work on integrating those items for the next webrev! > > > Thanks! > > > Thanks for your help, > Jc > > Ps: I tested this on a new repo: > > hg clone http://hg.openjdk.java.net/jdk10/jdk10 > jdk10 > ... 
building it > cd test > jtreg -nativepath:/build/linux-x86_64-normal-server-release/support/test/hotspot/jtreg/native/lib/ -jdk > /linux-x86_64-normal-server-release/images/jdk ../hotspot/test/serviceability/jvmti/HeapMonitor/ > > > I'll test it out! > > /Robbin > > > > On Thu, May 4, 2017 at 11:21 PM, serguei.spitsyn at oracle.com > >> wrote: > > Robbin, > > Thank you for forwarding! > I will review it. > > Thanks, > Serguei > > > > On 5/4/17 02:13, Robbin Ehn wrote: > > Hi, > > To me the compiler changes looks what is expected. > It would be good if someone from compiler could take a look at that. > Added compiler to mail thread. > > Also adding Serguei, It would be good with his view also. > > My initial take on it, read through most of the code and took it for a ride. > > ############################## > - Regarding the compiler changes: I think we need the 'TLAB end' trickery (mentioned by Tony P) > instead of a separate check for sampling in fast path for the final version. > > ############################## > - This patch I had to apply to get it compile on JDK 10: > > diff -r ac3ded340b35 src/share/vm/gc/shared/collectedHeap.inline.hpp > --- a/src/share/vm/gc/shared/collectedHeap.inline.hpp Fri Apr 28 14:31:38 2017 +0200 > +++ b/src/share/vm/gc/shared/collectedHeap.inline.hpp Thu May 04 10:22:56 2017 +0200 > @@ -87,3 +87,3 @@ > // support for object alloc event (no-op most of the time) > - if (klass() != NULL && klass()->name() != NULL) { > + if (klass != NULL && klass->name() != NULL) { > Thread *base_thread = Thread::current(); > diff -r ac3ded340b35 src/share/vm/runtime/heapMonitoring.cpp > --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 14:31:38 2017 +0200 > +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:22:56 2017 +0200 > @@ -316,3 +316,3 @@ > JavaThread *thread = reinterpret_cast(Thread::current()); > - assert(o->size() << LogHeapWordSize == byte_size, > + assert(o->size() << LogHeapWordSize == (long)byte_size, > "Object size is 
incorrect."); > > ############################## > - This patch I had to apply to get it not asserting during slowdebug: > > --- a/src/share/vm/runtime/heapMonitoring.cpp Fri Apr 28 15:15:16 2017 +0200 > +++ b/src/share/vm/runtime/heapMonitoring.cpp Thu May 04 10:24:25 2017 +0200 > @@ -32,3 +32,3 @@ > // TODO(jcbeyler): should we make this into a JVMTI structure? > -struct StackTraceData { > +struct StackTraceData : CHeapObj { > ASGCT_CallTrace *trace; > @@ -143,3 +143,2 @@ > StackTraceStorage::StackTraceStorage() : > - _allocated_traces(new StackTraceData*[MaxHeapTraces]), > _allocated_traces_size(MaxHeapTraces), > @@ -147,2 +146,3 @@ > _allocated_count(0) { > + _allocated_traces = NEW_C_HEAP_ARRAY(StackTraceData*, MaxHeapTraces, mtInternal); > memset(_allocated_traces, 0, sizeof(*_allocated_traces) * MaxHeapTraces); > @@ -152,3 +152,3 @@ > StackTraceStorage::~StackTraceStorage() { > - delete[] _allocated_traces; > + FREE_C_HEAP_ARRAY(StackTraceData*, _allocated_traces); > } > > - Classes should extend correct base class for which type of memory is used for it e.g.: CHeapObj or StackObj or AllStatic > - The style in heapMonitoring.cpp is a bit different from normal vm-style, e.g. using C++ casts instead of C. You mix NEW_C_HEAP_ARRAY, os::malloc and new. > - In jvmtiHeapTransition.hpp you use C cast instead. > > ############################## > - This patch I had to apply to get traces without setting an "unrelated" capability > - Should this not be a new capability?
> > diff -r c02a5d8785bf src/share/vm/prims/forte.cpp > --- a/src/share/vm/prims/forte.cpp Fri Apr 28 15:15:16 2017 +0200 > +++ b/src/share/vm/prims/forte.cpp Thu May 04 10:24:25 2017 +0200 > @@ -530,6 +530,6 @@ > > - if (!JvmtiExport::should_post_class_load()) { > +/* if (!JvmtiExport::should_post_class_load()) { > trace->num_frames = ticks_no_class_load; // -1 > return; > - } > + }*/ > > ############################## > - forte.cpp: (I know this is not part of your changes but) > find_jmethod_id_or_null give me NULL for my test. > It looks like we actually want the regular jmethod_id() ? > > Since we are the thread we are talking about (and in same ucontext) and thread is in vm and have a last java frame, > I think most of the checks done in AsyncGetCallTrace is irrelevant, so you should be-able to call forte_fill_call_trace_given_top directly. > But since we might need jmethod_id() if possible to avoid getting method id NULL, > we need some fixes in forte code, or just do the vframStream loop inside heapMonitoring.cpp and not use forte.cpp. > > Something like: > > if (jthread->has_last_Java_frame()) { // just to be safe > vframeStream vfst(jthread); > while (!vfst.at_end()) { > Method* m = vfst.method(); > m->jmethod_id(); > m->line_number_from_bci(vfst.bci()); > vfst.next(); > } > > - This is a bit confusing in forte.cpp, trace->frames[count].lineno = bci. > Line number should be m->line_number_from_bci(bci); > Do the heapMonitoring suppose to trace with bci or line number? > I would say bci, meaning we should either rename ASGCT_CallFrame?lineno or use another data structure which says bci. > > ############################## > - // TODO(jcbeyler): remove this extra code handling the extra trace for > Please fix all these TODO's :) > > ############################## > - heapMonitoring.hpp: > // TODO(jcbeyler): is this algorithm acceptable in open source? > > Why is this comment here? What is the implication? > Have you tested any simpler algorithm? 
> > ############################## > - Create a sanity jtreg test. (./hotspot/make/test/JtregNative.gmk for building the agent) > > ############################## > - monitoring_period vs HeapMonitorRate, pick rate or period. > > ############################## > - globals.hpp > Why is MaxHeapTraces not settable/overridable from jvmti interface? That would be handy. > > ############################## > - jvmtiStackTraceData + ASGCT_CallFrame memory > Are the agent suppose to loop through and free all ASGCT_CallFrame? > Wouldn't it be better with some kinda protocol, like: > (*jvmti)->GetLiveTraces(jvmti, &stack_traces, &num_traces); > (*jvmti)->ReleaseTraces(jvmti, stack_traces, num_traces); > > Also using another data structure that have num_traces inside it simplifies things. > So I'm not convinced using the async structure is the best way forward. > > > I have more questions, but I think it's better if you respond and update the code first. > > Thanks! > > /Robbin > > > On 04/21/2017 11:34 PM, JC Beyler wrote: > > Hi all, > > I've added size information to the allocation sampling system. This allows the callback to remember the size of each sampled allocation. > http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ > > > > The new webrev.01 also adds the actual heap monitoring sampling system in files: > http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.cpp.patch > > > > and > http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.hpp.patch > > > > > My next step is to add the GC part to the webrev, which will allow users to determine what objects are live and what are garbage. > > Thanks for your attention and let me know if there are any questions! > > Have a wonderful Friday! > Jc > > On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler > > >>> wrote: > > Hi all, > > I worked on getting a few numbers for overhead and accuracy for my feature. 
I'm unsure if here is the right place to provide the full data, so I > am just > summarizing > here for now. > > - Overhead of the feature > > Using the Dacapo benchmark (http://dacapobench.org/). My initial results are that sampling provides 2.4% with a 512k sampling, 512k being our > default setting. > > - Note: this was without the tradesoap, tradebeans and tomcat benchmarks since they did not work with my JDK9 (issue between Dacapo and JDK9 it seems) > - I want to rerun next week to ensure number stability > > - Accuracy of the feature > > I wrote a small microbenchmark that allocates from two different stacktraces at a given ratio. For example, 10% of stacktrace S1 and 90% from > stacktrace > S2. The > microbenchmark was run 20 times, I averaged the results and looked for accuracy. It seems that statistically it is sound since if I allocated10% > S1 and 90% > S2, with a > sampling rate of 512k, I obtained 9.61% S1 and 90.49% S2. > > Let me know if there are any questions on the numbers and if you'd like to see some more data. > > Note: this was done using our internal JDK8 implementation since the webrev provided by > http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html > > > > >> does not yet contain the whole > implementation and therefore would have been misleading. > > Thanks, > Jc > > > On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler > >>> wrote: > > Hi all, > > To move the discussion forward, with Chuck Rasbold's help to make a webrev, we pushed this: > http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html > > > > >> > 415 lines changed: 399 ins; 13 del; 3 mod; 51122 unchg > > This is not a final change that does the whole proposition from the JBS entry: https://bugs.openjdk.java.net/browse/JDK-8177374 > > > > >>; what it does show is parts of the implementation that is > proposed and hopefully can start the conversation going > as I work through the details. 
> > For example, the changes to C2 are done here for the allocations: > http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch > > > > > >> > > Hopefully this all makes sense and thank you for all your future comments! > Jc > > > On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler > >>> > wrote: > > Hello all, > > This is a follow-up from Jeremy's initial email from last year: > http://mail.openjdk.java.net/pipermail/serviceability-dev/2015-June/017543.html > > > >> > > I've gone ahead and started working on preparing this and Jeremy and I went down the route of actually writing it up in JEP form: > https://bugs.openjdk.java.net/browse/JDK-8171119 > > > I think original conversation that happened last year in that thread still holds true: > > - We have a patch at Google that we think others might be interested in > - It provides a means to understand where the allocation hotspots are at a very low overhead > - Since it is at a low overhead, we can leave it on by default > > So I come to the mailing list with Jeremy's initial question: > "I thought I would ask if there is any interest / if I should write a JEP / if I should just forget it." > > A year ago, it seemed some thought it was a good idea, is this still true? > > Thanks, > Jc > > > > > > > > From rwestrel at redhat.com Tue May 30 13:21:58 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 30 May 2017 15:21:58 +0200 Subject: [9] RFR(S): 8179678: ArrayCopy with same src and dst can cause incorrect execution or compiler crash In-Reply-To: References: <43893ed5-cbf5-4cf9-a727-8733850260de@oracle.com> <31848120-9bfa-505a-60dd-fa8b1c0b7efa@oracle.com> Message-ID: > okay, looks good. I'll re-run test. Thanks. Roland. 
From rwestrel at redhat.com Tue May 30 14:07:27 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 30 May 2017 16:07:27 +0200 Subject: RFR(M): 8181211: C2: Use profiling data to optimize on/off heap unsafe accesses Message-ID: http://cr.openjdk.java.net/~roland/8181211/webrev.00/ The not yet pushed 8176506 makes code emitted for unsafe accesses a lot more conservative and harder for c2 to optimize when an access is not known to be on or off heap. This change proposes that profile data be used to detect on or off heap accesses allowing c2 to better optimize sequences of unsafe accesses (at the cost of an extra null check/assert). With this change: 1- argument profiling is enabled at unsafe put* and get* call site 2- I found that enabling argument profiling but not return value profiling is broken. That change fixes it. 3- C2 needs to be able to tell, from profile data, whether a pointer is always null, never null or sometimes null. So far, whether a pointer is always null wasn't made available to c2. 4- the C2 unsafe code now checks profile data when it cannot tell whether an access is on or off heap and emits a guard. 5- load from non null objects at an offset guaranteed to fall in the object are not pinned anymore. With this change, Paul's benchmarks perform as well with 8176506+8181211 as they do without (actually better because of 5). Andrew's byte buffer tests generate a main loop with 8176506+8181211 that's identical to what is generated currently. Roland. From paul.sandoz at oracle.com Tue May 30 19:19:23 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Tue, 30 May 2017 12:19:23 -0700 Subject: RFR(M): 8181211: C2: Use profiling data to optimize on/off heap unsafe accesses In-Reply-To: References: Message-ID: <119D0A87-6995-46C3-AB31-227784187A20@oracle.com> Hi Roland, Thanks for looking at this. 
A further clean up of the ByteBuffer code is to unify heap and direct implementations to consistently use the double addressing mode, where for the latter a null from a field is read and passed for the heap object. We conservatively held off doing that due to performance concerns. Do you think your patch will help here too? Paul. > On 30 May 2017, at 07:07, Roland Westrelin wrote: > > > http://cr.openjdk.java.net/~roland/8181211/webrev.00/ > > The not yet pushed 8176506 makes code emitted for unsafe accesses a lot > more conservative and harder for c2 to optimize when an access is not > known to be on or off heap. This change proposes that profile data be > used to detect on or off heap accesses allowing c2 to better optimize > sequences of unsafe accesses (at the cost of an extra null > check/assert). > > With this change: > > 1- argument profiling is enabled at unsafe put* and get* call site > > 2- I found that enabling argument profiling but not return value > profiling is broken. That change fixes it. > > 3- C2 needs to be able to tell, from profile data, whether a pointer is > always null, never null or sometimes null. So far, whether a pointer > is always null wasn't made available to c2. > > 4- the C2 unsafe code now checks profile data when it cannot tell > whether an access is on or off heap and emits a guard. > > 5- load from non null objects at an offset guaranteed to fall in the > object are not pinned anymore. > > With this change, Paul's benchmarks perform as well with 8176506+8181211 > as they do without (actually better because of 5). Andrew's byte buffer > tests generate a main loop with 8176506+8181211 that's identical to what > is generated currently. > > Roland.
From ekaterina.pavlova at oracle.com Tue May 30 21:28:22 2017 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Tue, 30 May 2017 14:28:22 -0700 Subject: RFR: 8181124 Get rid of compiler.testlibrary.rtm.predicate Message-ID: <10980278-19fa-2c5d-9dc6-d7312db11435@oracle.com> Hi all, Please review these changes, which refactor the compiler/rtm tests. Many compiler/rtm tests use compiler.testlibrary.rtm.predicate.* predicates to check whether the test should be executed. If a test is not supposed to be run, it will not do real testing and will be marked as passed. It would be better not to run such tests at all. It would be more efficient from a performance point of view and more accurate from a reporting point of view. The fix removes the use of the SupportedCPU, SupportedOS and SupportedVM predicates and uses proper "@requires" instead. New vm.rtm.cpu and vm.rtm.os 'requires' properties have been implemented. These changes are in the second webrev. A local Platform.fileAsString() function was added instead of using Utils.fileAsString() to avoid a full dependency on the test/lib/jdk/test/lib library. It is not ideal, but otherwise we would need to refactor test/lib/jdk/test/lib. We agreed with Igor Ig. to postpone this refactoring. The recent JDK-8180612 fix added range checks for RTMAbortRatio and RTMTotalCountIncrRate. I fixed compiler/rtm/cli/TestRTMAbortRatioOptionOnUnsupportedConfig.java and compiler/rtm/cli/TestRTMTotalCountIncrRateOptionOnUnsupportedConfig.java so that they do not pass invalid option values. Otherwise these tests would fail. bug: https://bugs.openjdk.java.net/browse/JDK-8181124 webrev[1]: http://cr.openjdk.java.net/~epavlova//8181124_hs/webrev.00/ [2]: http://cr.openjdk.java.net/~epavlova//8181124_test/webrev.00/ Tested by running jprt and running compiler/rtm tests on all supported platforms. thanks, -katya p.s. Igor Ignatyev volunteered to sponsor this change.
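The difference between the two approaches in the RFR above (a predicate checked at run time versus a jtreg "@requires" tag) could be sketched roughly as follows. This is only an illustration, not code from the webrev: the class name RtmRequiresSketch, the runIfSupported() helper and the result strings are hypothetical, and only the vm.rtm.cpu and vm.rtm.os property names are taken from the message.

```java
/*
 * @test
 * @summary Hypothetical sketch: let jtreg select the test via @requires
 *          instead of having the test decide at run time.
 * @requires vm.rtm.cpu & vm.rtm.os
 * @run main/othervm RtmRequiresSketch
 */
import java.util.function.BooleanSupplier;

public class RtmRequiresSketch {

    // Old style (hypothetical stand-in for a runtime predicate check):
    // on an unsupported platform the test returns early and is still
    // reported as passed, although nothing was tested.
    static String runIfSupported(BooleanSupplier supported) {
        if (!supported.getAsBoolean()) {
            return "passed without testing";
        }
        return doRealTesting();
    }

    // Placeholder for the actual test logic.
    static String doRealTesting() {
        return "tested";
    }

    // New style: no runtime check is needed here, because the @requires
    // tag above keeps jtreg from selecting the test on unsupported platforms.
    public static void main(String[] args) {
        System.out.println(doRealTesting());
    }
}
```

With the tag in place, an unsupported platform shows up as "not selected" in the jtreg report rather than as a pass.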
From rwestrel at redhat.com Wed May 31 08:54:30 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 31 May 2017 10:54:30 +0200 Subject: RFR(M): 8181211: C2: Use profiling data to optimize on/off heap unsafe accesses In-Reply-To: <119D0A87-6995-46C3-AB31-227784187A20@oracle.com> References: <119D0A87-6995-46C3-AB31-227784187A20@oracle.com> Message-ID: Hi Paul, > A further clean up of the ByteBuffer code is to unify heap and direct > implementations to consistently use the double addressing mode, where > for the latter a null from a field is read and passed for the heap > object. We conservatively held off doing that due to performance > concerns. Do you think your patch will help here too? It would help if the application either only uses on heap accesses or off heap accesses. A mix of the two would cause profile pollution and a performance drop. MethodHandles.byteBufferViewVarHandle() shares one implementation for both off and on heap accesses, right? AFAICT, it is subject to that problem. Roland. From volker.simonis at gmail.com Wed May 31 09:10:08 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 31 May 2017 11:10:08 +0200 Subject: [10] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash In-Reply-To: References: <83aed7fe-cafa-f28b-577d-c6871123e269@oracle.com> <8cfe23ac-b308-24e3-2725-44992031cddd@oracle.com> <352faa06-3fe4-de8a-eeb6-3506bf555a1e@redhat.com> <4050c7c9-b688-5d52-a85c-f72284f50ccf@oracle.com> <6cea2fa0-1bcf-3c86-1494-a51f66430577@oracle.com> <8de1ea64-79cd-e7ec-adb4-1f1f276612bf@oracle.com> Message-ID: On Mon, May 29, 2017 at 10:10 AM, Roland Westrelin wrote: > > Hi Volker, > > Thanks for reviewing this again. > >> I think it's good that you now use SIGILL on all platforms. The change >> looks good except the following minor nits I've already mentioned in >> my first review. I'll leave it up to you if you want to fix them and >> in the case you do there's no need for a new webrev. 
> > Sorry, I forgot about those. I will fix the comments. > >> test/compiler/unsafe/TestMaybeNullUnsafeAccess.java >> >> - wouldn't it be safer to run the test with -Xbatch and >> -XX:-UseOnStackReplacement in any case and to make it evident that >> the test relies on the fact that test1() and test2() are both >> compiled but not inlined? > > Ok. Do you suggest adding a comment that states that test1/test2 > shouldn't be inlined? > No, I suggest using: * @run main/othervm -Xbatch -XX:-UseOnStackReplacement TestMaybeNullUnsafeAccess instead of: * @run main/othervm TestMaybeNullUnsafeAccess in test/compiler/unsafe/TestMaybeNullUnsafeAccess.java >> - you should also update the copyright on most files you've touched. > > I don't think I've updated a copyright in years. I thought it was ok to > let bulk periodic copyright updates take care of that. > > Roland. From vladimir.kozlov at oracle.com Wed May 31 16:39:43 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 31 May 2017 09:39:43 -0700 Subject: RFR: 8181124 Get rid of compiler.testlibrary.rtm.predicate In-Reply-To: <10980278-19fa-2c5d-9dc6-d7312db11435@oracle.com> References: <10980278-19fa-2c5d-9dc6-d7312db11435@oracle.com> Message-ID: <53984a33-4df8-a99a-e389-5425b4749bea@oracle.com> Nice. Thank you for fixing it. Vladimir On 5/30/17 2:28 PM, Ekaterina Pavlova wrote: > Hi all, > > Please review these changes, which refactor the compiler/rtm tests. > > Many compiler/rtm tests use compiler.testlibrary.rtm.predicate.* predicates > to check whether the test should be executed. If a test is not supposed to be run, it will > not do real testing and will be marked as passed. It would be better not to run such tests at all. > It would be more efficient from a performance point of view and more accurate from a reporting point of view. > The fix removes the use of the SupportedCPU, SupportedOS and SupportedVM predicates and uses proper > "@requires" instead.
> > New vm.rtm.cpu and vm.rtm.os 'requires' properties have been implemented. > These changes are in the second webrev. > A local Platform.fileAsString() function was added instead of using Utils.fileAsString() > to avoid a full dependency on the test/lib/jdk/test/lib library. It is not ideal, but otherwise > we would need to refactor test/lib/jdk/test/lib. We agreed with Igor Ig. to postpone this refactoring. > > The recent JDK-8180612 fix added range checks for RTMAbortRatio and RTMTotalCountIncrRate. > I fixed compiler/rtm/cli/TestRTMAbortRatioOptionOnUnsupportedConfig.java and > compiler/rtm/cli/TestRTMTotalCountIncrRateOptionOnUnsupportedConfig.java so that they do not pass invalid option values. > Otherwise these tests would fail. > > > bug: https://bugs.openjdk.java.net/browse/JDK-8181124 > webrev[1]: http://cr.openjdk.java.net/~epavlova//8181124_hs/webrev.00/ > [2]: http://cr.openjdk.java.net/~epavlova//8181124_test/webrev.00/ > > Tested by running jprt and running compiler/rtm tests on all supported platforms. > > thanks, > -katya > > p.s. > Igor Ignatyev volunteered to sponsor this change. > From paul.sandoz at oracle.com Wed May 31 16:42:43 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 31 May 2017 09:42:43 -0700 Subject: RFR(M): 8181211: C2: Use profiling data to optimize on/off heap unsafe accesses In-Reply-To: References: <119D0A87-6995-46C3-AB31-227784187A20@oracle.com> Message-ID: > On 31 May 2017, at 01:54, Roland Westrelin wrote: > > > Hi Paul, > >> A further clean up of the ByteBuffer code is to unify heap and direct >> implementations to consistently use the double addressing mode, where >> for the latter a null from a field is read and passed for the heap >> object. We conservatively held off doing that due to performance >> concerns. Do you think your patch will help here too? > > It would help if the application either only uses on heap accesses or > off heap accesses. A mix of the two would cause profile pollution and > a performance drop.
> > MethodHandles.byteBufferViewVarHandle() shares one implementation for > both off and on heap accesses, right? AFAICT, it is subject to that > problem. Yes, it avoided the profile pollution induced by virtual calls to multiple buffer implementations by accessing common state, but at the cost of taking a hit on uniformly reading the heap object value from a field. So I suspect your patch might marginally improve things in certain cases? Paul. From igor.veresov at oracle.com Wed May 31 19:57:21 2017 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 31 May 2017 12:57:21 -0700 Subject: RFR(XL) 8181369: Update Graal Message-ID: Please see the issue in the JBS for the list of changesets included in this update. Webrev: http://cr.openjdk.java.net/~iveresov/8181369/webrev.00/ JBS: https://bugs.openjdk.java.net/browse/JDK-8181369 Thanks! igor From ekaterina.pavlova at oracle.com Wed May 31 20:15:16 2017 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Wed, 31 May 2017 13:15:16 -0700 Subject: RFR: 8181124 Get rid of compiler.testlibrary.rtm.predicate In-Reply-To: <53984a33-4df8-a99a-e389-5425b4749bea@oracle.com> References: <10980278-19fa-2c5d-9dc6-d7312db11435@oracle.com> <53984a33-4df8-a99a-e389-5425b4749bea@oracle.com> Message-ID: Vladimir, thanks for the prompt review. -katya On 5/31/17 9:39 AM, Vladimir Kozlov wrote: > Nice. Thank you for fixing it. > > Vladimir > > On 5/30/17 2:28 PM, Ekaterina Pavlova wrote: >> Hi all, >> >> Please review these changes, which refactor the compiler/rtm tests. >> >> Many compiler/rtm tests use compiler.testlibrary.rtm.predicate.* predicates >> to check whether the test should be executed.
If a test is not supposed to be run, it will >> not do real testing and will be marked as passed. It would be better not to run such tests at all. >> It would be more efficient from a performance point of view and more accurate from a reporting point of view. >> The fix removes the use of the SupportedCPU, SupportedOS and SupportedVM predicates and uses proper >> "@requires" instead. >> >> New vm.rtm.cpu and vm.rtm.os 'requires' properties have been implemented. >> These changes are in the second webrev. >> A local Platform.fileAsString() function was added instead of using Utils.fileAsString() >> to avoid a full dependency on the test/lib/jdk/test/lib library. It is not ideal, but otherwise >> we would need to refactor test/lib/jdk/test/lib. We agreed with Igor Ig. to postpone this refactoring. >> >> The recent JDK-8180612 fix added range checks for RTMAbortRatio and RTMTotalCountIncrRate. >> I fixed compiler/rtm/cli/TestRTMAbortRatioOptionOnUnsupportedConfig.java and >> compiler/rtm/cli/TestRTMTotalCountIncrRateOptionOnUnsupportedConfig.java so that they do not pass invalid option values. >> Otherwise these tests would fail. >> >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8181124 >> webrev[1]: http://cr.openjdk.java.net/~epavlova//8181124_hs/webrev.00/ >> [2]: http://cr.openjdk.java.net/~epavlova//8181124_test/webrev.00/ >> >> Tested by running jprt and running compiler/rtm tests on all supported platforms. >> >> thanks, >> -katya >> >> p.s. >> Igor Ignatyev volunteered to sponsor this change. >> From vladimir.kozlov at oracle.com Wed May 31 22:31:27 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 31 May 2017 15:31:27 -0700 Subject: RFR(XL) 8181369: Update Graal In-Reply-To: References: Message-ID: <4a926eb5-e397-1cf8-752b-6c5145867645@oracle.com> Good. Thanks, Vladimir On 5/31/17 12:57 PM, Igor Veresov wrote: > Please see the issue in the JBS for the list of changesets included in > this update.
> > Webrev: http://cr.openjdk.java.net/~iveresov/8181369/webrev.00/ > JBS: https://bugs.openjdk.java.net/browse/JDK-8181369 > > Thanks! > igor From thomas.wuerthinger at oracle.com Fri May 26 17:45:45 2017 From: thomas.wuerthinger at oracle.com (Thomas Wuerthinger) Date: Fri, 26 May 2017 17:45:45 -0000 Subject: Some JVMCI/Graal questions related to AOT In-Reply-To: <1f1b91ba-2c0f-ddd2-d822-64ca81c554c5@oracle.com> References: <22330f24-9bbe-3055-41e4-35b1f6854fa4@oracle.com> <1f1b91ba-2c0f-ddd2-d822-64ca81c554c5@oracle.com> Message-ID: <2108DEBD-0FE7-4F78-AC4D-52950F8C6D15@oracle.com> Dean, Those intermediate objects are not actually created when the code is compiled with a compiler supporting escape analysis (e.g., Graal ;)). This pattern is useful for flags that are only set at initialisation time, because it allows those flags to be declared as "final". - thomas > On 26 May 2017, at 00:02, dean.long at oracle.com wrote: > > I have something mostly working, but I noticed that checkcasts on the appendix object are getting folded away. Eventually I want to support that, by adding the type as a dependency that AOT can check at runtime, but for this initial phase I'll probably just wrap the object constant in a LoadConstantIndirectlyNode and return that to the parser. > > I played around with guarding my changes with (EnableJVMCI && !UseJVMCICompiler) via HotSpotVMConfig, but I want more fine-level control over how invokedynamic constant pool slots are resolved and the adapter and appendix types exposed to the parser, so I'm thinking about adding more flags to GraphBuilderConfiguration or even better introducing a new StaticCompilationPlugin. Does a new plugin sound reasonable? Then I can move the hooks out of HotSpotConstantPool. > > dl > > PS - adding new flag fields to GraphBuilderConfiguration looks painful, with all the flags getting passed to the constructor and needing to change all the other withFlag methods. I'm tempted to rewrite it to use setFlag methods.
I don't see why we need to create intermediate objects just to set fields. > > On 5/23/17 12:11 PM, dean.long at oracle.com wrote: >> Thanks, I'll try that. >> >> dl >> >> >> On 5/23/17 12:29 AM, Doug Simon wrote: >> >>>> On 23 May 2017, at 00:41, dean.long at oracle.com wrote: >>>> >>>> 1) I'm working on "8132547: [AOT] support invokedynamic instructions" and I've hacked up jdk.vm.ci.hotspot.HotSpotConstantPool.java to handle things like the invokedynamic appendix differently. However, since this will only be used by AOT, I'm thinking I need to put my changes in an AOTHotSpotConstantPool subclass. My question is, where is a good place to put such a class (which hopefully won't require messing with modules)? >>> Depending on the nature of the changes, I suspect they can simply be added to HotSpotConstantPool, guarded by a VM flag exposed by HotSpotVMConfig if necessary. HotSpotConstantPool is currently final and I don't see a natural place for an AOT-specific subclass. >>> >>>> 2) How can I tell if a ResolvedJavaType corresponds to a VM anonymous class (Klass::is_anonymous())? I can't rely on getFingerprint() returning 0, because I want fingerprints for anonymous classes. Is there something existing, or do I need to add something to JVMCI? >>> You'd need to add something to JVMCI by exposing the required flags and fields in HotSpotVMConfig. >>> >>> -Doug >> >
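The trade-off discussed in this thread (withFlag methods that return a new object, so that the flag fields can stay final, versus setFlag methods that mutate in place) might be sketched as below. This is a minimal illustration under stated assumptions, not the actual GraphBuilderConfiguration code: the class FlagConfigSketch and its two flags are hypothetical names chosen for the example.

```java
public final class FlagConfigSketch {
    // Both flags are final: they can only be set at construction time.
    private final boolean eagerResolving;
    private final boolean omitAssertions;

    private FlagConfigSketch(boolean eagerResolving, boolean omitAssertions) {
        this.eagerResolving = eagerResolving;
        this.omitAssertions = omitAssertions;
    }

    public static FlagConfigSketch defaults() {
        return new FlagConfigSketch(false, false);
    }

    // Each "with" method copies every field into a new instance and changes
    // one of them. Adding a flag means touching the constructor and every
    // "with" method, which is the maintenance cost raised in the thread.
    public FlagConfigSketch withEagerResolving(boolean value) {
        return new FlagConfigSketch(value, omitAssertions);
    }

    public FlagConfigSketch withOmitAssertions(boolean value) {
        return new FlagConfigSketch(eagerResolving, value);
    }

    public boolean eagerResolving() {
        return eagerResolving;
    }

    public boolean omitAssertions() {
        return omitAssertions;
    }

    public static void main(String[] args) {
        FlagConfigSketch base = defaults();
        // The chained call creates one intermediate object; the original
        // instance is left unchanged.
        FlagConfigSketch derived = base.withEagerResolving(true).withOmitAssertions(true);
        System.out.println(base.eagerResolving() + " " + derived.eagerResolving());
    }
}
```

Whether the intermediate instance in the chained call is actually allocated depends on the compiler, as noted in the reply above; the sketch only shows why the fields can remain final.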