From chris.plummer at oracle.com Sat Aug 1 04:06:03 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 31 Jul 2020 21:06:03 -0700
Subject: RFR (trivial): 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To: <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com>
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com>
Message-ID: <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com>

On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> On 2020/07/31 7:29, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> If I understand correctly we first call add_map_info() for all the PT_LOAD segments in the core file. We then process all the library segments, calling add_map_info() for them if the target_vaddr has not already been added. If it has already been added, which I assume is the case for any library segment that is already in the core file, then the core file version is replaced with the library version. I'm a little unclear on the purpose of replacing the core PT_LOAD segments with those found in the libraries. If you could explain this, that would help me understand your change.
>
> Read-only segments in the ELF file should not differ from the PT_LOAD segments in the core.
> Also, the head of the ELF header might be included in the coredump (see JDK-7133122). Thus we need to replace the PT_LOAD segments with the library version.

Ok. The code in the area really should have been commented better when first written. The purpose is not understandable simply by reading the code.

>
>> I'm also unsure why existing_map->fd would ever be something other than the core file. Why would another library map the same target_vaddr?
>
> When mmap() is called on read-only ELF segments / sections, the Linux kernel seems to allocate other memory segments which have the same top virtual memory address. I have not yet found this in the Linux kernel code, but I confirmed the behavior in GDB.

Ok. Same comment as above. This should have been explained with comments in the code.

As for your fix, if I understand correctly the issue is that a single segment in the library is being split into two segments in the process (and therefore in the core file) due to an mprotect being done on part of the segment. Because of this the segment size in the library does not match the segment size in the core file. So with your fix the library segment is used, but what about the other half of the segment that is in the core file? Don't we now have overlapping segments: the full original segment from the library, and then a second segment that overlaps the tail end of the library segment? Will that cause any confusion later on?

thanks,

Chris

>
> Thanks,
>
> Yasumasa
>
>> thanks,
>>
>> Chris
>>
>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial. Although it is just a one line change, it takes extensive knowledge to understand the impact. I'll read up on the filed graal issue and try to understand the ELF code a bit better.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> Please review this trivial change:
>>>>
>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>
>>>> I played with Truffle NFI on GraalVM, but I could not get Java stacks from the coredump via jhsdb.
>>>>
>>>> I've reported this issue to the GraalVM community [1], and I've found that the cause is that .svm_heap is split into RO and RW areas by mprotect() calls at run time, even though .svm_heap is an RO section in the ELF file (please see [1] for details).
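For readers unfamiliar with the mechanism: the following is a minimal, Linux-only sketch (not taken from the webrev or from Substrate VM) of how a single read-only mapping ends up as two entries once mprotect() changes the protection of part of it. The helper names count_vmas and split_by_mprotect are hypothetical, and an anonymous mapping stands in for the .svm_heap section.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the entries in /proc/self/maps that overlap [start, end).
 * Returns -1 if the maps file cannot be opened. */
static int count_vmas(unsigned long start, unsigned long end) {
  FILE *f = fopen("/proc/self/maps", "r");
  if (f == NULL) return -1;
  char line[512];
  int n = 0;
  while (fgets(line, sizeof(line), f) != NULL) {
    unsigned long lo, hi;
    /* Each entry starts with "<start>-<end>" in hexadecimal. */
    if (sscanf(line, "%lx-%lx", &lo, &hi) == 2 && lo < end && hi > start) n++;
  }
  fclose(f);
  return n;
}

/* Map two read-only pages, then make only the second one writable.
 * Stores the number of overlapping map entries before and after the
 * mprotect() call. Returns 0 on success, -1 on failure. */
static int split_by_mprotect(int *before, int *after) {
  assert(before != NULL && after != NULL);
  long page = sysconf(_SC_PAGESIZE);
  char *p = mmap(NULL, 2 * (size_t)page, PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return -1;
  unsigned long start = (unsigned long)p;
  unsigned long end = start + 2 * (unsigned long)page;

  *before = count_vmas(start, end);
  int rc = mprotect(p + page, (size_t)page, PROT_READ | PROT_WRITE);
  *after = (rc == 0) ? count_vmas(start, end) : -1;

  munmap(p, 2 * (size_t)page);
  return (rc == 0 && *before >= 0 && *after >= 0) ? 0 : -1;
}
```

On a typical Linux system, split_by_mprotect(&before, &after) should report one entry before the call and two after it: an RO part followed by an RW part, which is the shape of the split discussed in this thread.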
>>>>
>>>> It is a corner case, but we will see the same problem in jhsdb when we attempt to analyze a coredump from applications / libraries which split RO sections in ELF like Substrate VM does.
>>>>
>>>> I sent a PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested that I discuss it in serviceability-dev.
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>> [1] https://github.com/oracle/graal/issues/2579
>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9

From suenaga at oss.nttdata.com Sun Aug 2 00:20:03 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sun, 2 Aug 2020 09:20:03 +0900
Subject: RFR (trivial): 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To: <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com>
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com>
Message-ID: <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com>

Hi Chris,

Thanks for your comment!
I pushed a new change to the submit repo, but the build failed on macOS. Could you share the details?
(I do not have a Mac)

  commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
  job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989

On 2020/08/01 13:06, Chris Plummer wrote:
> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> On 2020/07/31 7:29, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> If I understand correctly we first call add_map_info() for all the PT_LOAD segments in the core file. We then process all the library segments, calling add_map_info() for them if the target_vaddr has not already been added. If it has already been added, which I assume is the case for any library segment that is already in the core file, then the core file version is replaced with the library version. I'm a little unclear on the purpose of replacing the core PT_LOAD segments with those found in the libraries. If you could explain this, that would help me understand your change.
>>
>> Read-only segments in the ELF file should not differ from the PT_LOAD segments in the core.
>> Also, the head of the ELF header might be included in the coredump (see JDK-7133122). Thus we need to replace the PT_LOAD segments with the library version.
> Ok. The code in the area really should have been commented better when first written. The purpose is not understandable simply by reading the code.

I added some comments to the existing code. Please tell me if they are insufficient.

>>> I'm also unsure why existing_map->fd would ever be something other than the core file. Why would another library map the same target_vaddr?
>>
>> When mmap() is called on read-only ELF segments / sections, the Linux kernel seems to allocate other memory segments which have the same top virtual memory address. I have not yet found this in the Linux kernel code, but I confirmed the behavior in GDB.
> Ok. Same comment as above. This should have been explained with comments in the code.

Added some comments.

> As for your fix, if I understand correctly the issue is that a single segment in the library is being split into two segments in the process (and therefore in the core file) due to an mprotect being done on part of the segment. Because of this the segment size in the library does not match the segment size in the core file. So with your fix the library segment is used, but what about the other half of the segment that is in the core file? Don't we now have overlapping segments: the full original segment from the library, and then a second segment that overlaps the tail end of the library segment? Will that cause any confusion later on?

As long as the vaddr is valid, it does not matter if segments overlap, because SA sorts the map by vaddr and looks addresses up by it.
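A minimal sketch of why overlapping entries can be harmless when lookups go through a vaddr-sorted table. This is not the libsaproc code; seg_t, sort_segs and lookup are hypothetical names, and the lookup rule (pick the entry with the greatest start address that is not above the target) is an assumption about how such a table is typically searched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
  uintptr_t vaddr;      /* start address of the mapping */
  size_t    memsz;      /* size of the mapping in bytes */
  int       from_core;  /* 1 = entry came from the core file, 0 = from a library */
} seg_t;

/* qsort comparator: ascending by start address. */
static int cmp_vaddr(const void *a, const void *b) {
  uintptr_t x = ((const seg_t *)a)->vaddr;
  uintptr_t y = ((const seg_t *)b)->vaddr;
  return (x > y) - (x < y);
}

static void sort_segs(seg_t *segs, size_t n) {
  qsort(segs, n, sizeof(seg_t), cmp_vaddr);
}

/* Binary search for the entry with the greatest vaddr <= addr, then check
 * that addr really falls inside it. Returns NULL if nothing matches. */
static const seg_t *lookup(const seg_t *segs, size_t n, uintptr_t addr) {
  assert(segs != NULL || n == 0);
  size_t lo = 0, hi = n;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (segs[mid].vaddr <= addr) lo = mid + 1; else hi = mid;
  }
  if (lo == 0) return NULL;
  const seg_t *s = &segs[lo - 1];
  return (addr - s->vaddr < s->memsz) ? s : NULL;
}
```

With a library segment at 0x1000 (size 0x3000) and a core segment at 0x3000 (size 0x1000), a lookup for 0x1800 returns the library entry and a lookup for 0x3800 returns the core entry, even though the two ranges overlap.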
In Substrate VM, there are RO and RW sections in that order, so webrev.00 works there. However, it might not be appropriate in general, because an RW section might be at the top of a PT_LOAD segment.

To make it more general, I changed it to the commit on the submit repo.
It compares the access flags in the coredump with those in the binary. If they differ, we respect the current map (loaded from the coredump) because it might have been changed at runtime.

The change for LabsJDK 11 is simpler because JDK 11 does not have ps_core_common.c, so I am sharing it as well. It may help you:

  http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch

Thanks,

Yasumasa

> thanks,
>
> Chris
>>
>> Thanks,
>>
>> Yasumasa
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial. Although it is just a one line change, it takes extensive knowledge to understand the impact. I'll read up on the filed graal issue and try to understand the ELF code a bit better.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review this trivial change:
>>>>>
>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>
>>>>> I played with Truffle NFI on GraalVM, but I could not get Java stacks from the coredump via jhsdb.
>>>>>
>>>>> I've reported this issue to the GraalVM community [1], and I've found that the cause is that .svm_heap is split into RO and RW areas by mprotect() calls at run time, even though .svm_heap is an RO section in the ELF file (please see [1] for details).
>>>>>
>>>>> It is a corner case, but we will see the same problem in jhsdb when we attempt to analyze a coredump from applications / libraries which split RO sections in ELF like Substrate VM does.
>>>>>
>>>>> I sent a PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested that I discuss it in serviceability-dev.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9

From chris.plummer at oracle.com Sun Aug 2 01:22:10 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sat, 1 Aug 2020 18:22:10 -0700
Subject: RFR (trivial): 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To: <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com>
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com>
Message-ID: <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com>

Hi Yasumasa,

[2020-08-01T14:15:42,514Z] Creating support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8 file(s)
[2020-08-01T14:15:43,961Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: error: no member named 'flags' in 'struct map_info'
[2020-08-01T14:15:43,961Z]   map->flags  = flags;
[2020-08-01T14:15:43,961Z]   ~~~  ^
[2020-08-01T14:15:43,963Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: error: use of undeclared identifier 'PF_R'
[2020-08-01T14:15:43,963Z]                                offset, vaddr, memsz, PF_R)) == NULL) {

I'll look at the code changes later. No time at the moment.

thanks,

Chris

2020-08-01-1405571.suenaga.source

On 8/1/20 5:20 PM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Thanks for your comment!
> I pushed a new change to the submit repo, but the build failed on macOS.
> Could you share the details?
> (I do not have a Mac)
>
>
commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297 > ? job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989 > > On 2020/08/01 13:06, Chris Plummer wrote: >> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote: >>> Hi Chris, >>> >>> On 2020/07/31 7:29, Chris Plummer wrote: >>>> Hi Yasumasa, >>>> >>>> If I understand correctly we first call add_map_info() for all the >>>> PT_LOAD segments in the core file. We then process all the library >>>> segments, calling add_map_info() for them if the target_vaddr has >>>> not already been addded. If has already been added, which I assume >>>> is the case for any library segment that is already in the core >>>> file, then the core file version is replaced the the library >>>> version.? I'm a little unclear of the purpose of this replacing of >>>> the core PT_LOAD segments with those found in the libraries. If you >>>> could explain this that would help me understand your change. >>> >>> Read only segments in ELF should not be any different from PT_LOAD >>> segments in the core. >>> And head of ELF header might be included in coredump (See >>> JDK-7133122). Thus we need to replace PT_LOAD segments the library >>> version. >> Ok. The code in the area really should have been commented better >> when first written. The purpose is not understandable simply by >> reading the code. > > I added some comments to existing code. Please tell me if it is > insufficient. > > >>>> I'm also unsure why existing_map->fd would ever be something other >>>> than the core file. Why would another library map the same >>>> target_vaddr. >>> >>> When mmap() is called to read-only ELF segments / sections, Linux >>> kernel seems to allocate other memory segments which has same top >>> virtual memory address. I've not yet found out from the code of >>> Linux kernel, but I confirmed this behavior on GDB. >> Ok. Same comment as above. This should have been explained with >> comments in the code. > > Added some comments. 
> > >> As for your fix, if I understand correctly the issue is that a single >> segment in the library is being split into two segments in the >> process (and therefore in the core file) due to an mprotect being >> done on part of the segment. Because of this the segment size in the >> library does match the segment size in the core file. So with your >> fix the library segment is used, but what about the other half of the >> segment that is in the core file? Don't we now have overlapping >> segments; the full original segment from the library, and then a >> second segment that overlaps the tail end of the library segment? >> Will that cause any confusion later on? > > As long as vaddr is valid, it doesn't matter even if it overlaps > because SA would sort the map with vaddr, and would lookup with it. > In Substrate VM, there are RO and RW sections in that order, so it is > ok with webrev.00 . However it might not be appropriate because RW > section might be top of PT_LOAD. > > To make it more generalized, I changed it to the commit on submit repo. > It would check access flags between in coredump and in binary. If they > are different, we respect current (loaded from coredump) map because > it might be changed at runtime. > > The change for LabsJDK 11 is more simple because JDK 11 does not have > ps_core_common.c . > So I share you it. It may help you: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch > > > Thanks, > > Yasumasa > > >> thanks, >> >> Chris >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/30/20 1:18 PM, Chris Plummer wrote: >>>>> Hi Yasumasa, >>>>> >>>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed >>>>> as trivial. Although it is just a one line change, it takes an >>>>> extensive knowledge to understand the impact. I'll read up on the >>>>> filed graal issue and try to understand the ELF code a bit better. 
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review this trivial change:
>>>>>>
>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>>
>>>>>> I played with Truffle NFI on GraalVM, but I could not get Java stacks from the coredump via jhsdb.
>>>>>>
>>>>>> I've reported this issue to the GraalVM community [1], and I've found that the cause is that .svm_heap is split into RO and RW areas by mprotect() calls at run time, even though .svm_heap is an RO section in the ELF file (please see [1] for details).
>>>>>>
>>>>>> It is a corner case, but we will see the same problem in jhsdb when we attempt to analyze a coredump from applications / libraries which split RO sections in ELF like Substrate VM does.
>>>>>>
>>>>>> I sent a PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested that I discuss it in serviceability-dev.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9

From suenaga at oss.nttdata.com Sun Aug 2 07:18:30 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sun, 2 Aug 2020 16:18:30 +0900
Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To: <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com>
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com>
Message-ID:

Hi Chris,

(Removed "trivial" from the subject.)

Thanks for the information! I fixed the errors in a new webrev.
It passed tests on the submit repo (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525).

  http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/

I tried to use elf.h instead of a #define for PF_R, but it failed (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).

  http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d

Thus I added a #define for it in this webrev.

Thanks,

Yasumasa

On 2020/08/02 10:22, Chris Plummer wrote:
> Hi Yasumasa,
>
> [2020-08-01T14:15:42,514Z] Creating support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8 file(s)
> [2020-08-01T14:15:43,961Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: error: no member named 'flags' in 'struct map_info'
> [2020-08-01T14:15:43,961Z]   map->flags  = flags;
> [2020-08-01T14:15:43,961Z]   ~~~  ^
> [2020-08-01T14:15:43,963Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: error: use of undeclared identifier 'PF_R'
> [2020-08-01T14:15:43,963Z]                                offset, vaddr, memsz, PF_R)) == NULL) {
>
> I'll look at the code changes later. No time at the moment.
>
> thanks,
>
> Chris
>
> 2020-08-01-1405571.suenaga.source
>
> On 8/1/20 5:20 PM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> Thanks for your comment!
>> I pushed a new change to the submit repo, but the build failed on macOS. Could you share the details?
>> (I do not have a Mac)
>>
>>   commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>   job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>
>> On 2020/08/01 13:06, Chris Plummer wrote:
>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>> Hi Chris,
>>>>
>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> If I understand correctly we first call add_map_info() for all the PT_LOAD segments in the core file.
We then process all the library segments, calling add_map_info() for them if the target_vaddr has not already been addded. If has already been added, which I assume is the case for any library segment that is already in the core file, then the core file version is replaced the the library version.? I'm a little unclear of the purpose of this replacing of the core PT_LOAD segments with those found in the libraries. If you could explain this that would help me understand your change. >>>> >>>> Read only segments in ELF should not be any different from PT_LOAD segments in the core. >>>> And head of ELF header might be included in coredump (See JDK-7133122). Thus we need to replace PT_LOAD segments the library version. >>> Ok. The code in the area really should have been commented better when first written. The purpose is not understandable simply by reading the code. >> >> I added some comments to existing code. Please tell me if it is insufficient. >> >> >>>>> I'm also unsure why existing_map->fd would ever be something other than the core file. Why would another library map the same target_vaddr. >>>> >>>> When mmap() is called to read-only ELF segments / sections, Linux kernel seems to allocate other memory segments which has same top virtual memory address. I've not yet found out from the code of Linux kernel, but I confirmed this behavior on GDB. >>> Ok. Same comment as above. This should have been explained with comments in the code. >> >> Added some comments. >> >> >>> As for your fix, if I understand correctly the issue is that a single segment in the library is being split into two segments in the process (and therefore in the core file) due to an mprotect being done on part of the segment. Because of this the segment size in the library does match the segment size in the core file. So with your fix the library segment is used, but what about the other half of the segment that is in the core file? 
Don't we now have overlapping segments; the full original segment from the library, and then a second segment that overlaps the tail end of the library segment? Will that cause any confusion later on? >> >> As long as vaddr is valid, it doesn't matter even if it overlaps because SA would sort the map with vaddr, and would lookup with it. >> In Substrate VM, there are RO and RW sections in that order, so it is ok with webrev.00 . However it might not be appropriate because RW section might be top of PT_LOAD. >> >> To make it more generalized, I changed it to the commit on submit repo. >> It would check access flags between in coredump and in binary. If they are different, we respect current (loaded from coredump) map because it might be changed at runtime. >> >> The change for LabsJDK 11 is more simple because JDK 11 does not have ps_core_common.c . >> So I share you it. It may help you: >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch >> >> >> Thanks, >> >> Yasumasa >> >> >>> thanks, >>> >>> Chris >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/30/20 1:18 PM, Chris Plummer wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial. Although it is just a one line change, it takes an extensive knowledge to understand the impact. I'll read up on the filed graal issue and try to understand the ELF code a bit better. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review this trivial change: >>>>>>> >>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250826 >>>>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/ >>>>>>> >>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks from coredump via jhsdb. 
>>>>>>>
>>>>>>> I've reported this issue to the GraalVM community [1], and I've found that the cause is that .svm_heap is split into RO and RW areas by mprotect() calls at run time, even though .svm_heap is an RO section in the ELF file (please see [1] for details).
>>>>>>>
>>>>>>> It is a corner case, but we will see the same problem in jhsdb when we attempt to analyze a coredump from applications / libraries which split RO sections in ELF like Substrate VM does.
>>>>>>>
>>>>>>> I sent a PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested that I discuss it in serviceability-dev.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9

From suenaga at oss.nttdata.com Mon Aug 3 05:55:51 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 3 Aug 2020 14:55:51 +0900
Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization
Message-ID: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com>

Hi all,

Please review this change:

  JBS: https://bugs.openjdk.java.net/browse/JDK-8250930
  webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/

The following tests failed when compiled with GCC 10.2:

 - vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java
 - vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java

They have a native module, which is commented as below:

```
    // execute infinite loop to be sure that thread in native method
    while (always_true) {
        // Need some dummy code so the optimizer does not remove this loop.
        dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1;
    }
    // The optimizer can be surprisingly clever.
    // Use dummy_counter so it can never be optimized out.
    // This statement will always return 0.
    return dummy_counter >= 0 ? 0 : 1;
```

The C compiler may still eliminate this loop. We should use a different approach here, so that the test no longer depends on what the compiler optimizes.

Thanks,

Yasumasa

From linzang at tencent.com Mon Aug 3 14:51:19 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Mon, 3 Aug 2020 14:51:19 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com>
References: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com>
Message-ID: <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com>

Dear Stefan,

May I ask for your help to review again? I have made a delta based on the last changeset you reviewed (webrev04); I hope it eases your reviewing work.

  webrev: https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/
  delta (vs webrev04): https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/
  bug: https://bugs.openjdk.java.net/browse/JDK-8215624
  CSR (approved): https://bugs.openjdk.java.net/browse/JDK-8239290

BRs,
Lin

On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote:

A submit repo run with this succeeded, so afaic you're good to go.

Stefan, you reviewed the GC part before, it'd be great if you could ok the final version.
Thanks,
Paul

On 7/29/20, 5:02 AM, "linzang(臧琳)" wrote:

I uploaded a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/

It fixes a failure on Windows:

####################################
In heapInspection.cpp
- size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) {
+ uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) {
####################################
In heapInspection.hpp
- size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0);
+ uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0);
####################################

BRs,
Lin

On 2020/7/27, 11:26 AM, "linzang(臧琳)" wrote:

I uploaded a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09

It includes a tiny fix for a build failure on Windows:

####################################
In attachListener.cpp:
- uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8);
+ uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8);
####################################

BRs,
Lin

On 2020/7/23, 11:56 AM, "linzang(臧琳)" wrote:

Hi Paul,

Thanks for your help, that all looks good to me.
Just 2 minor changes:
 - delete the final return in ParHeapInspectTask::work; you mentioned it, but it seems not to be included in the webrev :-)
 - delete an unnecessary blank line in heapInspection.cpp at merge_entry()

#########################################################################
--- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800
+++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800
@@ -251,7 +251,6 @@
     _size_of_instances_in_words += cie->words();
     return true;
   }
-
   return false;
 }
@@ -568,7 +567,6 @@
     Atomic::add(&_missed_count, missed_count);
   } else {
     Atomic::store(&_success, false);
-    return;
   }
 }
#########################################################################

Here is the webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/

BRs,
Lin

---------------------------------------------
From: "Hohensee, Paul"
Date: Thursday, July 23, 2020 at 6:48 AM
To: "linzang(臧琳)" , Stefan Karlsson , "serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev , "hotspot-gc-dev at openjdk.java.net"
Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Just small things.

heapInspection.cpp:

In ParHeapInspectTask::work, remove the final return statement and fix the following "}" indent. I.e., replace

+ Atomic::store(&_success, false);
+ return;
+ }

with

+ Atomic::store(&_success, false);
+ }

In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0.

attachListener.cpp:

In heap_inspection, initial_processor_count returns an int, so cast its result to a uint.

Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num.

BasicJMapTest.java:

I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible.
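The default worker count discussed in this thread can be sketched in plain C as follows. default_parallel_threads is a hypothetical helper, not a HotSpot function; it only mirrors the arithmetic of the MAX2(1, (uint)count * 3 / 8) expression, with the cast to unsigned applied before the multiplication.

```c
#include <assert.h>

/* Hypothetical helper: three eighths of the available processors,
 * but never fewer than one worker. */
static unsigned default_parallel_threads(int active_processors) {
  assert(active_processors >= 0);
  /* Cast first so the whole expression is evaluated as unsigned. */
  unsigned n = (unsigned)active_processors * 3u / 8u;
  return n > 1u ? n : 1u;
}
```

For 1 or 2 processors the result is 1, for 8 it is 3, and for 16 it is 6, so the default stays below half of the processors while always leaving at least one worker.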
Webrev with the above changes in http://cr.openjdk.java.net/~phh/8214535/webrev.01/

Thanks,
Paul

On 7/15/20, 2:13 AM, "linzang(臧琳)" wrote:

I uploaded a new webrev at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/

It fixes a potential issue where an unexpected number of threads may be calculated for the "parallel" option of jmap -histo in a container.

As shown at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html

############### attachListener.cpp ####################
@@ -252,11 +252,11 @@ static jint heap_inspection(AttachOperation* op, outputStream* out) {
   bool live_objects_only = true; // default is true to retain the behavior before this change is made
   outputStream* os = out; // if path not specified or path is NULL, use out
   fileStream* fs = NULL;
   const char* arg0 = op->arg(0);
-  uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors.
+  uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors.
   if (arg0 != NULL && (strlen(arg0) > 0)) {
     if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) {
       out->print_cr("Invalid argument to inspectheap operation: %s", arg0);
       return JNI_ERR;
     }
###################################################

Thanks.

BRs,
Lin

On 2020/7/9, 3:22 PM, "linzang(臧琳)" wrote:

Hi Paul,

Thanks for reviewing!

>>
>> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp.
>>

The reason I changed JMap.java to compose all arguments into one string, instead of passing 3 arguments, is to avoid the compatibility issue we discussed in
http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240

The root cause of the compatibility issue is that the max argument count in HotspotVirtualMachineImpl.java and attachListener.cpp needs to be enlarged (changes like http://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if a user uses an old jcmd/jmap tool, it may get stuck at socket read(), because the "max argument count" values don't match.

I re-checked this change: the argument count of jmap histo is equal to 3 (live, file, parallel), so it works normally even without the change to argument passing. But I think we will have to face the problem if more arguments are added to jcmd-like tools later; I am not sure whether it should be solved (or worked around) in this changeset.

And here are the latest webrev and delta:

  http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/
  http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/

Cheers,
Lin

On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote:

I'd like to see this feature added. :)

The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm).

I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp.
heapInspection.hpp: _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). Comment copy-edit: +// Parallel heap inspection task. Parallel inspection can fail due to +// a native OOM when allocating memory for TL-KlassInfoTable. +// _success will be set false on an OOM, and serial inspection tried. _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. heapInspection.cpp: You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace + } else { + return false; + } with + return false; KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
I'd reorganize the code in populate_table() to make it more clear, viz. (I changed _shared_missed_count to _missed_count)

+  if (cit.allocation_failed()) {
+    // fail to allocate memory, stop parallel mode
+    Atomic::store(&_success, false);
+    return;
+  }
+  RecordInstanceClosure ric(&cit, _filter);
+  _poi->object_iterate(&ric, worker_id);
+  missed_count = ric.missed_count();
+  {
+    MutexLocker x(&_mutex);
+    merge_success = _shared_cit->merge(&cit);
+  }
+  if (merge_success) {
+    Atomic::add(&_missed_count, missed_count);
+  } else {
+    Atomic::store(&_success, false);
+  }

Thanks,
Paul

On 6/29/20, 7:20 PM, "linzang(??)" wrote:

Dear All,

Sorry to bother you again. I just want to make sure: is this change worth continuing to work on? If the decision is not to, I think I can drop this work and stop asking for help with reviewing...

Thanks for all your help with reviewing this previously.

BRs,
Lin

On 2020/5/9, 3:47 PM, "linzang(??)" wrote:

Dear All,

May I ask for your help again to review the latest change? Thanks!

BRs,
Lin

On 2020/4/28, 1:54 PM, "linzang(??)" wrote:

Hi Stefan,

>> - Adding Atomic::load/store.
>> - Removing the time measurement in the run_task. I renamed G1's function
>> to run_task_timed. If we need this outside of G1, we can rethink the API
>> at that point.
>> - ZGC style cleanups

Thanks for revising the patch, the changes all look good to me, and I have made a tiny change based on it:

http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/

It reduces the scope of the mutex in ParHeapInspectTask and deletes unnecessary comments.

BRs,
Lin

On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:

Hi Lin,

On 2020-04-26 05:10, linzang(??) wrote:
> Hi Stefan and Paul,
> I have made a new patch based on your comments and Stefan's Poc code: > Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > Delta(based on Stefan's change:) : http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ Thanks for providing a delta patch. It makes it much easier to look at, and more likely for reviewers to continue reviewing. I'm going to continue focusing on the GC parts, and leave the rest to others to review. > > And Here are main changed I made and want to discuss with you: > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? In these situations you should be using the Atomic::load/store primitives. We're moving toward a later C++ standard were data races are considered undefined behavior. > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. 
Another way is to left run_task() unimplemented, which one do you think is better? I don't have a strong opinion about this. And also please help take a look at the zHeap, as there is a class zTask that wrap the abstractGangTask, and the collectedHeap::run_task() only accept AbstraceGangTask* as argument, so I made a delegate class to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! I've created a few cleanups and changes on top of your latest patch: https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta https://cr.openjdk.java.net/~stefank/8215624/webrev.02 - Adding Atomic::load/store. - Removing the time measurement in the run_task. I renamed G1's function to run_task_timed. If we need this outside of G1, we can rethink the API at that point. - ZGC style cleanups Thanks, StefanK > > BRs, > Lin > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > BRs, > Lin > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > Thanks, > Paul > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" wrote: > > Dear Stefan, > > Thanks a lot! I agree with you to decouple the heap inspection code with GC's. > I will start from your POC code, may discuss with you later. > > > BRs, > Lin > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > Hi Lin, > > I took a look at this earlier and saw that the heap inspection code is > strongly coupled with the CollectedHeap and G1CollectedHeap. 
I'd prefer > if we'd abstract this away, so that the GCs only provide a "parallel > object iteration" interface, and the heap inspection code is kept elsewhere. > > I started experimenting with doing that, but other higher-priority (to > me) tasks have had to take precedence. > > I've uploaded my work-in-progress / proof-of-concept: > https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > The current code doesn't handle the lifecycle (deletion) of the > ParallelObjectIterators. There's also code left unimplemented in around > CollectedHeap::run_task. However, I think this could work as a basis to > pull out the heap inspection code out of the GCs. > > Thanks, > StefanK > > On 2020-04-22 02:21, linzang(??) wrote: > > Dear all, > > May I ask you help to review? This RFR has been there for quite a while. > > Thanks! > > > > BRs, > > Lin > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > >> webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > >> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > >> CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > >> BRs, > >> Lin > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > >> > > >> > Dear all, > >> > Let me try to ease the reviewing work by some explanation :P > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. 
I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > >> > This patch actually do several things: > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR: https://bugs.openjdk.java.net/browse/JDK-8239290) > >> > 2. Make a change in how Jmap passing arguments, changes in http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > >> > 3. 
Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > >> > 5. Add related test. > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > >> > > >> > Hope these info could help on code review and initate the discussion :-) > >> > Thanks! > >> > > >> > BRs, > >> > Lin > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > >> > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > >> > > please ignore the previous wrong post. sorry for troubles. > >> > > > >> > > webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > >> > > Hi bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > >> > > CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > >> > > -------------- > >> > > Lin > >> > > >Hi Lin, > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > >> > > >the message subject? > >> > > >It will be more trackable this way. > >> > > > > >> > > >Thanks, > >> > > >Serguei > >> > > > > >> > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > >> > > >> Dear David, > >> > > >> Thanks a lot! > >> > > >> I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. 
> >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > >> > > >> > >> > > >> Thanks, > >> > > >> -------------- > >> > > >> Lin > >> > > >>> Hi Lin, > >> > > >>> > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > >> > > >>> worker threads, and whether it needs to be extended beyond G1. > >> > > >>> > >> > > >>> I happened to spot one nit when browsing: > >> > > >>> > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > >> > > >>> > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > >> > > >>> + BoolObjectClosure* filter, > >> > > >>> + size_t* missed_count, > >> > > >>> + size_t thread_num) { > >> > > >>> + return NULL; > >> > > >>> > >> > > >>> s/NULL/false/ > >> > > >>> > >> > > >>> Cheers, > >> > > >>> David > > > > > >>> > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > >> > > >>>> Dear All, > >> > > >>>> May I ask your help to review the follow changes: > >> > > >>>> webrev: > >> > > >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > >> > > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > >> > > >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. 
> >> > > >>>>
> >> > > >>>> ------------------------------------------------------------------------
> >> > > >>>> BRs,
> >> > > >>>> Lin

From chris.plummer at oracle.com Mon Aug 3 21:41:00 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 3 Aug 2020 14:41:00 -0700
Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To:
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com>
Message-ID:

Hi Yasumasa,

Your updated fix resulted in using the core file map whereas the original fix used the library map. In both cases the assert is avoided, which I think is the main goal. Does it matter which map is used?

  42 #ifndef PF_R
  43 #define PF_R 0x4
  44 #endif

 156   if ((map = allocate_init_map(ph->core->classes_jsa_fd,
 157                                offset, vaddr, memsz, PF_R)) == NULL) {

I'm not so sure this is appropriate for OSX. It uses mach-o files, not elf files. The segment_command flags field comes from loader.h [1]. I don't see anything in there that looks like the equivalent of ELF access flags.

/* Constants for the flags field of the segment_command */
#define SG_HIGHVM    0x1    /* the file contents for this segment is for
                               the high part of the VM space, the low part
                               is zero filled (for stacks in core files) */
#define SG_FVMLIB    0x2    /* this segment is the VM that is allocated by
                               a fixed VM library, for overlap checking in
                               the link editor */
#define SG_NORELOC   0x4    /* this segment has nothing that was relocated
                               in it and nothing relocated to it, that is
                               it maybe safely replaced without relocation*/
#define SG_PROTECTED_VERSION_1    0x8 /* This segment is protected.  If the
                                         segment starts at file offset 0, the
                                         first page of the segment is not
                                         protected.  All other pages of the
                                         segment are protected. */

Since the flags don't matter for OSX, maybe you should just pass 0. You can do something like:

#ifdef PF_R
#define MAP_R_FLAG PF_R
#else
#define MAP_R_FLAG 0
#endif

Some minor comment fixes are needed:

 397         // Access flags fot this memory region is different between the library

"fot" -> "for"
"is" -> "are"

 399         // We should respect to coredump.

"to" -> "the"

 404         // And head of ELF header might be included in coredump (See JDK-7133122).
 405         // Thus we need to replace PT_LOAD segments the library version.

How about:

 404         // Also the first page of the ELF header might be included in the coredump (See JDK-7133122).
 405         // Thus we need to replace the PT_LOAD segment with the library version.

thanks,

Chris

[1] https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html

On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> (Remove "trivial" from subject)
>
> Thanks for the information! I fixed errors in new webrev. It passed
> tests on submit repo
> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>
>
> I tried to use elf.h instead of #define for PF_R, however it failed
> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>
>   http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>
> Thus I added #define for it in this webrev.
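
The fallback-macro idea discussed above can be sketched as follows. This is only a minimal sketch, not the code in the webrev: it assumes PF_R comes from an ELF header when one is available, that the compiler supports __has_include, and the helper name read_only_map_flags is hypothetical. Note that the guard has to be #ifdef (not #ifndef) for MAP_R_FLAG to pick up PF_R where it exists.

```c
#include <assert.h>
#include <stdint.h>

/* Pull in the ELF definitions only where the header exists (assumption:
 * the compiler supports __has_include; otherwise PF_R stays undefined). */
#if defined(__has_include)
#  if __has_include(<elf.h>)
#    include <elf.h>
#  endif
#endif

/* Use the ELF read flag when the platform defines it; otherwise fall back
 * to 0, since (per the discussion) the Mach-O segment_command flags field
 * has no equivalent of the ELF access flags. */
#ifdef PF_R
#define MAP_R_FLAG PF_R
#else
#define MAP_R_FLAG 0
#endif

/* Hypothetical helper: the flags to record for a read-only mapping. */
static uint32_t read_only_map_flags(void) {
  return (uint32_t)MAP_R_FLAG;
}
```

With this shape, call sites can pass MAP_R_FLAG unconditionally and need no platform checks of their own.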
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/08/02 10:22, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> [2020-08-01T14:15:42,514Z] Creating support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8 file(s)
>> [2020-08-01T14:15:43,961Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: error: no member named 'flags' in 'struct map_info'
>> [2020-08-01T14:15:43,961Z]   map->flags  = flags;
>> [2020-08-01T14:15:43,961Z]   ~~~  ^
>> [2020-08-01T14:15:43,963Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: error: use of undeclared identifier 'PF_R'
>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) {
>>
>> I'll look at the code changes later. No time at the moment.
>>
>> thanks,
>>
>> Chris
>>
>> On 8/1/20 5:20 PM, Yasumasa Suenaga wrote:
>>> Hi Chris,
>>>
>>> Thanks for your comment!
>>> I pushed new change to submit repo, but the build failed on macOS.
>>> Could you share details?
>>> (I do not have Mac)
>>>
>>>   commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>>   job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>>
>>> On 2020/08/01 13:06, Chris Plummer wrote:
>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> If I understand correctly we first call add_map_info() for all
>>>>>> the PT_LOAD segments in the core file. We then process all the
>>>>>> library segments, calling add_map_info() for them if the
>>>>>> target_vaddr has not already been added. If it has already been
>>>>>> added, which I assume is the case for any library segment that is
>>>>>> already in the core file, then the core file version is replaced
>>>>>> with the library version.
I'm a little unclear of the purpose of >>>>>> this replacing of the core PT_LOAD segments with those found in >>>>>> the libraries. If you could explain this that would help me >>>>>> understand your change. >>>>> >>>>> Read only segments in ELF should not be any different from PT_LOAD >>>>> segments in the core. >>>>> And head of ELF header might be included in coredump (See >>>>> JDK-7133122). Thus we need to replace PT_LOAD segments the library >>>>> version. >>>> Ok. The code in the area really should have been commented better >>>> when first written. The purpose is not understandable simply by >>>> reading the code. >>> >>> I added some comments to existing code. Please tell me if it is >>> insufficient. >>> >>> >>>>>> I'm also unsure why existing_map->fd would ever be something >>>>>> other than the core file. Why would another library map the same >>>>>> target_vaddr. >>>>> >>>>> When mmap() is called to read-only ELF segments / sections, Linux >>>>> kernel seems to allocate other memory segments which has same top >>>>> virtual memory address. I've not yet found out from the code of >>>>> Linux kernel, but I confirmed this behavior on GDB. >>>> Ok. Same comment as above. This should have been explained with >>>> comments in the code. >>> >>> Added some comments. >>> >>> >>>> As for your fix, if I understand correctly the issue is that a >>>> single segment in the library is being split into two segments in >>>> the process (and therefore in the core file) due to an mprotect >>>> being done on part of the segment. Because of this the segment size >>>> in the library does match the segment size in the core file. So >>>> with your fix the library segment is used, but what about the other >>>> half of the segment that is in the core file? Don't we now have >>>> overlapping segments; the full original segment from the library, >>>> and then a second segment that overlaps the tail end of the library >>>> segment? Will that cause any confusion later on? 
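
The overlap being asked about can be modelled with a small sketch. It assumes a hypothetical, simplified map entry (seg_t) and a first-match lookup over entries sorted by vaddr; these names and the lookup policy are illustrative assumptions, not libsaproc's actual code. Under those assumptions, an address in the tail half resolves to the library-backed entry even though the core file also holds that range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified map entry; not libsaproc's struct map_info. */
typedef struct {
  uintptr_t vaddr;      /* start address of the mapping */
  size_t    memsz;      /* size of the mapping in bytes */
  int       from_core;  /* 1 if backed by the core file, 0 if by the library */
} seg_t;

/* Two half-open ranges [vaddr, vaddr + memsz) overlap if each one starts
 * before the other ends. */
static int seg_overlaps(const seg_t *a, const seg_t *b) {
  return a->vaddr < b->vaddr + b->memsz && b->vaddr < a->vaddr + a->memsz;
}

/* Return the first entry (in array order, assumed sorted by vaddr) that
 * contains addr, or NULL if no entry does. */
static const seg_t *seg_lookup(const seg_t *maps, size_t n, uintptr_t addr) {
  for (size_t i = 0; i < n; i++) {
    if (addr >= maps[i].vaddr && addr - maps[i].vaddr < maps[i].memsz) {
      return &maps[i];
    }
  }
  return NULL;
}

/* Scenario from the question: the library segment covers [0x1000, 0x3000),
 * while the core file still holds the tail half [0x2000, 0x3000) that was
 * split off at run time. Returns 1 if the two entries overlap and a lookup
 * inside the tail resolves to the library-backed entry. */
static int tail_resolves_to_library(void) {
  const seg_t maps[] = {
    { 0x1000, 0x2000, 0 },  /* full segment taken from the library */
    { 0x2000, 0x1000, 1 },  /* tail half remaining from the core file */
  };
  const seg_t *hit = seg_lookup(maps, 2, 0x2800);
  return seg_overlaps(&maps[0], &maps[1]) && hit != NULL && !hit->from_core;
}
```

Whether this matters in practice depends on which contents are authoritative for the overlapping range, which is the point the question raises.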
>>> >>> As long as vaddr is valid, it doesn't matter even if it overlaps >>> because SA would sort the map with vaddr, and would lookup with it. >>> In Substrate VM, there are RO and RW sections in that order, so it >>> is ok with webrev.00 . However it might not be appropriate because >>> RW section might be top of PT_LOAD. >>> >>> To make it more generalized, I changed it to the commit on submit repo. >>> It would check access flags between in coredump and in binary. If >>> they are different, we respect current (loaded from coredump) map >>> because it might be changed at runtime. >>> >>> The change for LabsJDK 11 is more simple because JDK 11 does not >>> have ps_core_common.c . >>> So I share you it. It may help you: >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch >>> >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> thanks, >>>> >>>> Chris >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be >>>>>>> pushed as trivial. Although it is just a one line change, it >>>>>>> takes an extensive knowledge to understand the impact. I'll read >>>>>>> up on the filed graal issue and try to understand the ELF code a >>>>>>> bit better. >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review this trivial change: >>>>>>>> >>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250826 >>>>>>>> ? webrev: >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/ >>>>>>>> >>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks >>>>>>>> from coredump via jhsdb. 
>>>>>>>> >>>>>>>> I've reported this issue to GraalVM community [1], and I 've >>>>>>>> found out the cause of this issue is .svm_heap would be >>>>>>>> separated to RO and RW areas by mprotect() calls in run time in >>>>>>>> spite of .svm_heap is RO section in ELF (please see [1] for >>>>>>>> details). >>>>>>>> >>>>>>>> It is corner case, but we will see same problem on jhsdb when >>>>>>>> we attempt to analyze coredump which comes from some >>>>>>>> applications / libraries which would separate RO sections in >>>>>>>> ELF like Substrate VM. >>>>>>>> >>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this issue [2], >>>>>>>> then community members suggested me to discuss in >>>>>>>> serviceability-dev. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> [1] https://github.com/oracle/graal/issues/2579 >>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9 >>>>>>> >>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From serguei.spitsyn at oracle.com Mon Aug 3 21:51:21 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 3 Aug 2020 14:51:21 -0700 Subject: RFR(S): 8250750: JDK-8247515 fix for OSX pc_to_symbol() lookup fails with some symbols In-Reply-To: <2e35e463-9940-02a7-ec35-e1eace2ce718@oracle.com> References: <33d6d75e-e9f0-65c2-fa56-b1dc2f2be223@oracle.com> <2e35e463-9940-02a7-ec35-e1eace2ce718@oracle.com> Message-ID: Hi Chris, LGTM++ Thanks, Serguei On 7/30/20 02:16, Kevin Walls wrote: > Hi Chris - Yes, that's a good discovery, looks good, > > Thanks > Kevin > > > On 29/07/2020 21:08, Chris Plummer wrote: >> Hello, >> >> Please help review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8250750 >> http://cr.openjdk.java.net/~cjplummer/8250750/webrev.00/index.html >> >> Details are in the CR description. 
>> >> thanks, >> >> Chris From chris.plummer at oracle.com Mon Aug 3 21:53:00 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 3 Aug 2020 14:53:00 -0700 Subject: Fwd: RFR(XXS): 8249150: SA core file tests sometimes time out on OSX with "java.io.IOException: App waiting timeout" In-Reply-To: <124869c8-dc29-d453-1df1-fc4488f25acb@oracle.com> References: <124869c8-dc29-d453-1df1-fc4488f25acb@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Mon Aug 3 21:53:23 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 3 Aug 2020 14:53:23 -0700 Subject: RFR(S): 8250750: JDK-8247515 fix for OSX pc_to_symbol() lookup fails with some symbols In-Reply-To: References: <33d6d75e-e9f0-65c2-fa56-b1dc2f2be223@oracle.com> <2e35e463-9940-02a7-ec35-e1eace2ce718@oracle.com> Message-ID: <45bb2319-220f-943a-798a-bbf9633b336e@oracle.com> Thanks Kevin and Serguei! Chris On 8/3/20 2:51 PM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > LGTM++ > > Thanks, > Serguei > > > On 7/30/20 02:16, Kevin Walls wrote: >> Hi Chris - Yes, that's a good discovery, looks good, >> >> Thanks >> Kevin >> >> >> On 29/07/2020 21:08, Chris Plummer wrote: >>> Hello, >>> >>> Please help review the following: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8250750 >>> http://cr.openjdk.java.net/~cjplummer/8250750/webrev.00/index.html >>> >>> Details are in the CR description. >>> >>> thanks, >>> >>> Chris > From alexey.menkov at oracle.com Mon Aug 3 23:19:25 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 3 Aug 2020 16:19:25 -0700 Subject: Fwd: RFR(XXS): 8249150: SA core file tests sometimes time out on OSX with "java.io.IOException: App waiting timeout" In-Reply-To: References: <124869c8-dc29-d453-1df1-fc4488f25acb@oracle.com> Message-ID: <6bb15142-f778-10cb-7e2b-6c98b616ddac@oracle.com> Hi Chris, Looks good --alex On 08/03/2020 14:53, Chris Plummer wrote: > Ping! 
This is a fairly trivial testing fix to avoid timeouts when using > LingeredApp to generate a core dump. No knowledge of SA is needed. > > thanks, > > Chris > > -------- Forwarded Message -------- > Subject: RFR(XXS): 8249150: SA core file tests sometimes time out on > OSX with "java.io.IOException: App waiting timeout" > Date: Fri, 31 Jul 2020 11:07:01 -0700 > From: Chris Plummer > To: serviceability-dev > > > > Hello, > > Please help review the following: > > https://bugs.openjdk.java.net/browse/JDK-8249150 > http://cr.openjdk.java.net/~cjplummer/8249150/webrev.00/index.html > > The tests in question use recently added LingeredApp support for dumping > a core file. On OSX a core file dump sometimes takes a very long time, > exceeding the amount of time Lingeredapp.waitAppReady() is willing to > wait. This wait time needs to be increased to allow the core file to > finish dumping, and also a couple of the tests that use the LingeredApp > core file support need longer test timeouts. Note ClhsdbFindPC already > has a long timeout, so no timeout changes were needed for it. > > Tested by running all serviceability/sa tests once on all platforms, and > 400 times on OSX (200 regular runs and 100 -Xcomp runs). > > thanks, > > Chris From serguei.spitsyn at oracle.com Mon Aug 3 23:46:14 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 3 Aug 2020 16:46:14 -0700 Subject: Fwd: RFR(XXS): 8249150: SA core file tests sometimes time out on OSX with "java.io.IOException: App waiting timeout" In-Reply-To: <6bb15142-f778-10cb-7e2b-6c98b616ddac@oracle.com> References: <124869c8-dc29-d453-1df1-fc4488f25acb@oracle.com> <6bb15142-f778-10cb-7e2b-6c98b616ddac@oracle.com> Message-ID: <770fec66-6ce2-72a6-317e-f4d4bc2a149d@oracle.com> Hi Chris, LGTM++ Thanks, Serguei On 8/3/20 16:19, Alex Menkov wrote: > Hi Chris, > > Looks good > > --alex > > On 08/03/2020 14:53, Chris Plummer wrote: >> Ping! 
This is a fairly trivial testing fix to avoid timeouts when >> using LingeredApp to generate a core dump. No knowledge of SA is needed. >> >> thanks, >> >> Chris >> >> -------- Forwarded Message -------- >> Subject:???? RFR(XXS): 8249150: SA core file tests sometimes time out >> on OSX with "java.io.IOException: App waiting timeout" >> Date:???? Fri, 31 Jul 2020 11:07:01 -0700 >> From:???? Chris Plummer >> To:???? serviceability-dev >> >> >> >> Hello, >> >> Please help review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8249150 >> http://cr.openjdk.java.net/~cjplummer/8249150/webrev.00/index.html >> >> The tests in question use recently added LingeredApp support for >> dumping a core file. On OSX a core file dump sometimes takes a very >> long time, exceeding the amount of time Lingeredapp.waitAppReady() is >> willing to wait. This wait time needs to be increased to allow the >> core file to finish dumping, and also a couple of the tests that use >> the LingeredApp core file support need longer test timeouts. Note >> ClhsdbFindPC already has a long timeout, so no timeout changes were >> needed for it. >> >> Tested by running all serviceability/sa tests once on all platforms, >> and 400 times on OSX (200 regular runs and 100 -Xcomp runs). >> >> thanks, >> >> Chris From chris.plummer at oracle.com Tue Aug 4 00:38:28 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 3 Aug 2020 17:38:28 -0700 Subject: Fwd: RFR(XXS): 8249150: SA core file tests sometimes time out on OSX with "java.io.IOException: App waiting timeout" In-Reply-To: <770fec66-6ce2-72a6-317e-f4d4bc2a149d@oracle.com> References: <124869c8-dc29-d453-1df1-fc4488f25acb@oracle.com> <6bb15142-f778-10cb-7e2b-6c98b616ddac@oracle.com> <770fec66-6ce2-72a6-317e-f4d4bc2a149d@oracle.com> Message-ID: <465ac65b-1f31-4063-0d96-e38ceaecd9c5@oracle.com> Thanks Alex and Serguei! 
On 8/3/20 4:46 PM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > LGTM++ > > Thanks, > Serguei > > > On 8/3/20 16:19, Alex Menkov wrote: >> Hi Chris, >> >> Looks good >> >> --alex >> >> On 08/03/2020 14:53, Chris Plummer wrote: >>> Ping! This is a fairly trivial testing fix to avoid timeouts when >>> using LingeredApp to generate a core dump. No knowledge of SA is >>> needed. >>> >>> thanks, >>> >>> Chris >>> >>> -------- Forwarded Message -------- >>> Subject:???? RFR(XXS): 8249150: SA core file tests sometimes time >>> out on OSX with "java.io.IOException: App waiting timeout" >>> Date:???? Fri, 31 Jul 2020 11:07:01 -0700 >>> From:???? Chris Plummer >>> To:???? serviceability-dev >>> >>> >>> >>> Hello, >>> >>> Please help review the following: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8249150 >>> http://cr.openjdk.java.net/~cjplummer/8249150/webrev.00/index.html >>> >>> The tests in question use recently added LingeredApp support for >>> dumping a core file. On OSX a core file dump sometimes takes a very >>> long time, exceeding the amount of time Lingeredapp.waitAppReady() >>> is willing to wait. This wait time needs to be increased to allow >>> the core file to finish dumping, and also a couple of the tests that >>> use the LingeredApp core file support need longer test timeouts. >>> Note ClhsdbFindPC already has a long timeout, so no timeout changes >>> were needed for it. >>> >>> Tested by running all serviceability/sa tests once on all platforms, >>> and 400 times on OSX (200 regular runs and 100 -Xcomp runs). 
>>>
>>> thanks,
>>>
>>> Chris
>

From suenaga at oss.nttdata.com Tue Aug 4 00:47:09 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 4 Aug 2020 09:47:09 +0900
Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To:
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com>
Message-ID: <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com>

Hi Chris,

Thank you for the comment!
I updated the webrev:

  http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/
  Diff from webrev.01: http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2

On 2020/08/04 6:41, Chris Plummer wrote:
> Hi Yasumasa,
>
> Your updated fix resulted in using the core file map whereas the original fix used the library map. In both cases the assert is avoided, which I think is the main goal. Does it matter which map is used?

In GraalVM the conflict is in a read-only segment, so it does not matter which map is used.
However, this webrev is more general, so the segments in the coredump should be used.

>   42 #ifndef PF_R
>   43 #define PF_R 0x4
>   44 #endif
>
>  156   if ((map = allocate_init_map(ph->core->classes_jsa_fd,
>  157                                offset, vaddr, memsz, PF_R)) == NULL) {
>
> I'm not so sure this is appropriate for OSX. It uses mach-o files, not elf files. The segment_command flags field comes from loader.h [1]. I don't see anything in there that looks like the equivalent of ELF access flags.
>
> /* Constants for the flags field of the segment_command */
> #define SG_HIGHVM  0x1  /* the file contents for this segment is for
>                            the high part of the VM space, the low part
>                            is zero filled (for stacks in core files) */
> #define SG_FVMLIB  0x2  /* this segment is the VM that is allocated by
>                            a fixed VM library, for overlap checking in
>                            the link editor */
> #define SG_NORELOC 0x4  /* this segment has nothing that was relocated
>                            in it and nothing relocated to it, that is
>                            it maybe safely replaced without relocation*/
> #define SG_PROTECTED_VERSION_1 0x8 /* This segment is protected.  If the
>                                       segment starts at file offset 0, the
>                                       first page of the segment is not
>                                       protected.  All other pages of the
>                                       segment are protected. */
>
> Since the flags don't matter for OSX, maybe you should just pass 0. You can do something like:
>
> #ifdef PF_R
> #define MAP_R_FLAG PF_R
> #else
> #define MAP_R_FLAG 0
> #endif

Thanks!
I thought PF_R from elf.h could be used on macOS:
  https://opensource.apple.com/source/dtrace/dtrace-90/sys/elf.h

I merged your code into this webrev.

> Some minor comment fixes are needed:
>
>  397         // Access flags fot this memory region is different between the library
>
> "fot" -> "for"
> "is" -> "are"
>
>  399         // We should respect to coredump.
>
> "to" -> "the"
>
>  404         // And head of ELF header might be included in coredump (See JDK-7133122).
>  405         // Thus we need to replace PT_LOAD segments the library version.
>
> How about:
>
>  404         // Also the first page of the ELF header might be included in the coredump (See JDK-7133122).
>  405         // Thus we need to replace the PT_LOAD segment with the library version.

Fixed them.


Thanks,

Yasumasa


> thanks,
>
> Chris
>
> [1] https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
>
> On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> (Remove "trivial" from subject)
>>
>> Thanks for the information! I fixed errors in new webrev.
It passed tests on submit repo (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>>
>>
>> I tried to use elf.h instead of #define for PF_R, however it failed (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>>
>>   http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>>
>> Thus I added #define for it in this webrev.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/08/02 10:22, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> [2020-08-01T14:15:42,514Z] Creating support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8 file(s)
>>> [2020-08-01T14:15:43,961Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: error: no member named 'flags' in 'struct map_info'
>>> [2020-08-01T14:15:43,961Z]   map->flags  = flags;
>>> [2020-08-01T14:15:43,961Z]   ~~~  ^
>>> [2020-08-01T14:15:43,963Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: error: use of undeclared identifier 'PF_R'
>>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) {
>>>
>>> I'll look at the code changes later. No time at the moment.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source 2020-08-01-1405571.suenaga.source
>>> On 8/1/20 5:20 PM, Yasumasa Suenaga wrote:
>>>> Hi Chris,
>>>>
>>>> Thanks for your comment!
>>>> I pushed new change to submit repo, but the build failed on macOS. Could you share details?
>>>> (I do not have Mac)
>>>>
>>>>   commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>>>   job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>>>
>>>> On 2020/08/01 13:06, Chris Plummer wrote:
>>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> If I understand correctly we first call add_map_info() for all the PT_LOAD segments in the core file.
We then process all the library segments, calling add_map_info() for them if the target_vaddr has not already been added. If it has already been added, which I assume is the case for any library segment that is already in the core file, then the core file version is replaced with the library version. I'm a little unclear of the purpose of this replacing of the core PT_LOAD segments with those found in the libraries. If you could explain this that would help me understand your change.
>>>>>>
>>>>>> Read only segments in ELF should not be any different from PT_LOAD segments in the core.
>>>>>> And head of ELF header might be included in coredump (See JDK-7133122). Thus we need to replace PT_LOAD segments with the library version.
>>>>> Ok. The code in the area really should have been commented better when first written. The purpose is not understandable simply by reading the code.
>>>>
>>>> I added some comments to existing code. Please tell me if it is insufficient.
>>>>
>>>>
>>>>>>> I'm also unsure why existing_map->fd would ever be something other than the core file. Why would another library map the same target_vaddr?
>>>>>>
>>>>>> When mmap() is called on read-only ELF segments / sections, the Linux kernel seems to allocate other memory segments which have the same top virtual memory address. I have not yet found this in the Linux kernel code, but I confirmed this behavior in GDB.
>>>>> Ok. Same comment as above. This should have been explained with comments in the code.
>>>>
>>>> Added some comments.
>>>>
>>>>
>>>>> As for your fix, if I understand correctly the issue is that a single segment in the library is being split into two segments in the process (and therefore in the core file) due to an mprotect being done on part of the segment. Because of this the segment size in the library does not match the segment size in the core file. So with your fix the library segment is used, but what about the other half of the segment that is in the core file?
Don't we now have overlapping segments; the full original segment from the library, and then a second segment that overlaps the tail end of the library segment? Will that cause any confusion later on?
>>>>
>>>> As long as vaddr is valid, it doesn't matter even if it overlaps, because SA sorts the map by vaddr and looks it up by vaddr.
>>>> In Substrate VM, there are RO and RW sections in that order, so it is ok with webrev.00. However it might not be appropriate because the RW section might be at the top of PT_LOAD.
>>>>
>>>> To make it more general, I changed it to the commit on submit repo.
>>>> It compares the access flags in the coredump with those in the binary. If they are different, we respect the current map (loaded from the coredump) because it might have been changed at runtime.
>>>>
>>>> The change for LabsJDK 11 is simpler because JDK 11 does not have ps_core_common.c.
>>>> So I am sharing it with you. It may help you:
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial. Although it is just a one line change, it takes extensive knowledge to understand the impact. I'll read up on the filed graal issue and try to understand the ELF code a bit better.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review this trivial change:
>>>>>>>>>
>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>>>>>
>>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks from coredump via jhsdb.
>>>>>>>>>
>>>>>>>>> I've reported this issue to GraalVM community [1], and I've found out the cause of this issue is that .svm_heap is separated into RO and RW areas by mprotect() calls at run time even though .svm_heap is an RO section in ELF (please see [1] for details).
>>>>>>>>>
>>>>>>>>> It is a corner case, but we will see the same problem in jhsdb when we attempt to analyze a coredump which comes from applications / libraries which separate RO sections in ELF like Substrate VM.
>>>>>>>>>
>>>>>>>>> I sent a PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested that I discuss it in serviceability-dev.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9

From chris.plummer at oracle.com Tue Aug 4 00:51:17 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 3 Aug 2020 17:51:17 -0700
Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization
In-Reply-To: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com>
References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

Although I don't doubt that it works, calling fgetc() seems like an odd way to resolve this issue. I had some internal discussions on how to safely cause an infinite loop. Something like the following should work:

    static volatile int dummy_counter = 0;
    while (dummy_counter == 0) {}

volatile is important because it prevents gcc from assuming dummy_counter will always be 0.

thanks,

Chris

On 8/2/20 10:55 PM, Yasumasa Suenaga wrote:
> Hi all,
>
> Please review this change:
>
>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250930
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/
>
> The following tests, compiled with GCC 10.2, failed.
>
>  - vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java
>  - vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java
>
> They have a native module, which is commented as below:
>
> ```
>    // execute infinite loop to be sure that thread in native method
>    while (always_true)
>    {
>        // Need some dummy code so the optimizer does not remove this loop.
>        dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1;
>    }
>    // The optimizer can be surprisingly clever.
>    // Use dummy_counter so it can never be optimized out.
>    // This statement will always return 0.
>    return dummy_counter >= 0 ? 0 : 1;
> ```
>
> The C compiler may eliminate this loop. We should use another solution so that we
> do not need to consider compiler optimization at this point.
>
>
> Thanks,
>
> Yasumasa

From chris.plummer at oracle.com Tue Aug 4 00:54:25 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 3 Aug 2020 17:54:25 -0700
Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To: <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com>
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com> <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com>
Message-ID: <6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com>

Hi Yasumasa,

Your changes look good now.

thanks,

Chris

On 8/3/20 5:47 PM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Thank you for the comment!
> I updated webrev:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/
>   Diff from webrev.01: http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2
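
The rule settled on in this thread for JDK-8250826 (when a library segment and a core-file mapping start at the same address, compare their access flags and keep the core-file mapping if they differ, otherwise use the library version) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the webrev code: `map_entry`, `choose_map` and the `SKETCH_PF_*` constants are hypothetical names, and the flag values merely mirror the ELF PF_X / PF_W / PF_R bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants mirroring the ELF segment permission bits. */
#define SKETCH_PF_X 0x1u
#define SKETCH_PF_W 0x2u
#define SKETCH_PF_R 0x4u

/* Simplified, hypothetical mapping record; not the real struct map_info. */
struct map_entry {
  int       fd;     /* file that backs the mapping (core file or library) */
  uintptr_t vaddr;  /* start virtual address of the mapping */
  size_t    memsz;  /* size of the mapping in bytes */
  uint32_t  flags;  /* access flags recorded for the mapping */
};

/*
 * Decide which mapping to keep when a library segment starts at the same
 * address as a mapping already read from the core file.
 *
 * - Different start address: the two do not describe the same region, so
 *   NULL is returned and the caller would simply add the library segment.
 * - Same address, different flags: the process changed the protection at
 *   run time (for example via mprotect()), so the core-file mapping wins.
 * - Same address, same flags: the library version is used.
 */
static const struct map_entry *choose_map(const struct map_entry *core_map,
                                          const struct map_entry *lib_map) {
  if (core_map->vaddr != lib_map->vaddr) {
    return NULL;
  }
  if (core_map->flags != lib_map->flags) {
    return core_map;
  }
  return lib_map;
}
```

With a core-file mapping flagged `SKETCH_PF_R | SKETCH_PF_W` and a library segment flagged `SKETCH_PF_R` at the same `vaddr`, `choose_map` returns the core-file mapping; with identical flags it returns the library segment.
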

From suenaga at oss.nttdata.com Tue Aug 4 00:54:58 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 4 Aug 2020 09:54:58 +0900
Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To: <6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com>
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com> <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com> <6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com>
Message-ID: <29af768a-bebe-13fe-ac51-3062ede936bf@oss.nttdata.com>

Thanks Chris!
I will push it when I get a second reviewer.

Yasumasa

On 2020/08/04 9:54, Chris Plummer wrote:
> Hi Yasumasa,
>
> Your changes look good now.
>
> thanks,
>
> Chris
>
> On 8/3/20 5:47 PM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> Thank you for the comment!
>> I updated webrev:
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/
>>   Diff from webrev.01: http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2
>>
>> On 2020/08/04 6:41, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> Your updated fix resulted in using the core file map whereas the original fix used the library map. In both cases the assert is avoided, which I think is the main goal. Does it matter which map is used?
>>
>> In GraalVM the conflict is in a read-only segment, so it does not matter which map is used.
>> However, this webrev is more general, so the segments in the coredump should be used.
>>
>>>   42 #ifndef PF_R
>>>   43 #define PF_R 0x4
44 #endif >>> >>> ??156?? if ((map = allocate_init_map(ph->core->classes_jsa_fd, >>> ??157??????????????????????????????? offset, vaddr, memsz, PF_R)) == NULL) { >>> >>> I'm not so sure this is appropriate for OSX. It uses mach-o files, not elf files. The segment_command flags field comes from loader.h [1]. I don't see anything in there that looks like the equivalent of ELF access flags. >>> >>> /* Constants for the flags field of the segment_command */ >>> #define??? SG_HIGHVM??? 0x1??? /* the file contents for this segment is for >>> ???? ??? ??? ??? ?? the high part of the VM space, the low part >>> ???? ??? ??? ??? ?? is zero filled (for stacks in core files) */ >>> #define??? SG_FVMLIB??? 0x2??? /* this segment is the VM that is allocated by >>> ???? ??? ??? ??? ?? a fixed VM library, for overlap checking in >>> ???? ??? ??? ??? ?? the link editor */ >>> #define??? SG_NORELOC??? 0x4??? /* this segment has nothing that was relocated >>> ???? ??? ??? ??? ?? in it and nothing relocated to it, that is >>> ???? ??? ??? ??? ?? it maybe safely replaced without relocation*/ >>> #define SG_PROTECTED_VERSION_1??? 0x8 /* This segment is protected. If the >>> ???? ??? ??? ??? ?????? segment starts at file offset 0, the >>> ???? ??? ??? ??? ?????? first page of the segment is not >>> ???? ??? ??? ??? ?????? protected.? All other pages of the >>> ???? ??? ??? ??? ?????? segment are protected. */ >>> >>> Since the flags don't matter for OSX, maybe you should just pass 0. You can do something like: >>> >>> #ifndef PF_R >>> #define MAP_R_FLAG PF_R >>> #else >>> #define MAP_R_FLAG 0 >>> #endif >> >> Thanks! >> I thought PF_R can be used PF_R from elf.h on macOS: >> ? https://opensource.apple.com/source/dtrace/dtrace-90/sys/elf.h >> >> I merged your code in this webrev. >> >>> Some minor comment fixes are needed: >>> >>> ??397???????? // Access flags fot this memory region is different between the library >>> >>> "fot" -> "for" >>> "is" -> "are" >>> >>> ??399???????? 
// We should respect to coredump. >>> >>> "to" -> "the" >>> >>> ??404???????? // And head of ELF header might be included in coredump (See JDK-7133122). >>> ??405???????? // Thus we need to replace PT_LOAD segments the library version. >>> >>> How about: >>> >>> ??404???????? // Also the first page of the ELF header might be included in the coredump (See JDK-7133122). >>> ??405???????? // Thus we need to replace the PT_LOAD segment with the library version. >> >> Fixed them. >> >> >> Thanks, >> >> Yasumasa >> >> >>> thanks, >>> >>> Chris >>> >>> [1] https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html >>> >>> On 8/2/20 12:18 AM, Yasumasa Suenaga wrote: >>>> Hi Chris, >>>> >>>> (Remove "trivial" from subject) >>>> >>>> Thanks for the information! I fixed errors in new webrev. It passed tests on submit repo (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525) >>>> >>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/ >>>> >>>> >>>> I tried to use elf.h instead of #define for PF_R, however it failed (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335). >>>> >>>> ? http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d >>>> >>>> Thus I added #define for it in this webrev. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/08/02 10:22, Chris Plummer wrote: >>>>> Hi Yasumasa, >>>>> >>>>> [2020-08-01T14:15:42,514Z] Creating support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8 file(s) >>>>> [2020-08-01T14:15:43,961Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: error: no member named 'flags' in 'struct map_info' >>>>> [2020-08-01T14:15:43,961Z]?? map->flags? = flags; >>>>> [2020-08-01T14:15:43,961Z]?? ~~~? 
^ >>>>> [2020-08-01T14:15:43,963Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: error: use of undeclared identifier 'PF_R' >>>>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) { >>>>> >>>>> I'll look at the code changes later. No time at the moment. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source 2020-08-01-1405571.suenaga.source On 8/1/20 5:20 PM, Yasumasa Suenaga wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Thanks for your comment! >>>>>> I pushed new change to submit repo, but the build failed on macOS. Could you share details? >>>>>> (I do not have Mac) >>>>>> >>>>>> ? commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297 >>>>>> ? job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989 >>>>>> >>>>>> On 2020/08/01 13:06, Chris Plummer wrote: >>>>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> On 2020/07/31 7:29, Chris Plummer wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> If I understand correctly we first call add_map_info() for all the PT_LOAD segments in the core file. We then process all the library segments, calling add_map_info() for them if the target_vaddr has not already been addded. If has already been added, which I assume is the case for any library segment that is already in the core file, then the core file version is replaced the the library version.? I'm a little unclear of the purpose of this replacing of the core PT_LOAD segments with those found in the libraries. If you could explain this that would help me understand your change. >>>>>>>> >>>>>>>> Read only segments in ELF should not be any different from PT_LOAD segments in the core. >>>>>>>> And head of ELF header might be included in coredump (See JDK-7133122). Thus we need to replace PT_LOAD segments the library version. >>>>>>> Ok. The code in the area really should have been commented better when first written. 
The purpose is not understandable simply by reading the code. >>>>>> >>>>>> I added some comments to existing code. Please tell me if it is insufficient. >>>>>> >>>>>> >>>>>>>>> I'm also unsure why existing_map->fd would ever be something other than the core file. Why would another library map the same target_vaddr. >>>>>>>> >>>>>>>> When mmap() is called to read-only ELF segments / sections, Linux kernel seems to allocate other memory segments which has same top virtual memory address. I've not yet found out from the code of Linux kernel, but I confirmed this behavior on GDB. >>>>>>> Ok. Same comment as above. This should have been explained with comments in the code. >>>>>> >>>>>> Added some comments. >>>>>> >>>>>> >>>>>>> As for your fix, if I understand correctly the issue is that a single segment in the library is being split into two segments in the process (and therefore in the core file) due to an mprotect being done on part of the segment. Because of this the segment size in the library does match the segment size in the core file. So with your fix the library segment is used, but what about the other half of the segment that is in the core file? Don't we now have overlapping segments; the full original segment from the library, and then a second segment that overlaps the tail end of the library segment? Will that cause any confusion later on? >>>>>> >>>>>> As long as vaddr is valid, it doesn't matter even if it overlaps because SA would sort the map with vaddr, and would lookup with it. >>>>>> In Substrate VM, there are RO and RW sections in that order, so it is ok with webrev.00 . However it might not be appropriate because RW section might be top of PT_LOAD. >>>>>> >>>>>> To make it more generalized, I changed it to the commit on submit repo. >>>>>> It would check access flags between in coredump and in binary. If they are different, we respect current (loaded from coredump) map because it might be changed at runtime. 
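
A minimal sketch of the access-flag check described above, assuming a simplified map entry. The names map_sketch and maybe_use_library_segment are hypothetical and are not taken from libsaproc; only the PF_* values are the standard ELF segment flag bits. The idea is that the library segment replaces the coredump entry only when both carry the same flags. Otherwise the coredump entry is kept, since it may reflect an mprotect() done at runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard ELF program header flag bits, defined here so that elf.h
 * is not required (the thread notes elf.h was a problem on macOS). */
#ifndef PF_X
#define PF_X 0x1
#endif
#ifndef PF_W
#define PF_W 0x2
#endif
#ifndef PF_R
#define PF_R 0x4
#endif

/* Hypothetical, simplified stand-in for a map entry. */
typedef struct map_sketch {
    int      fd;     /* file that backs this range (core or library) */
    uint64_t vaddr;  /* start virtual address */
    size_t   memsz;  /* size of the range in memory */
    uint32_t flags;  /* PF_R / PF_W / PF_X as seen for this range */
} map_sketch;

/* Replace the coredump entry with the library segment only if the
 * access flags agree. Returns 1 if replaced, 0 if the coredump entry
 * was kept. */
int maybe_use_library_segment(map_sketch *existing, int lib_fd,
                              size_t lib_memsz, uint32_t lib_flags)
{
    if (existing->flags != lib_flags) {
        /* Flags differ: protection may have changed at runtime,
         * so respect what the coredump recorded. */
        return 0;
    }
    existing->fd = lib_fd;
    existing->memsz = lib_memsz;
    return 1;
}
```

With this shape, a read-only library segment would replace a read-only coredump entry at the same vaddr, while a coredump entry that became read-write at runtime would be left alone.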
>>>>>> >>>>>> The change for LabsJDK 11 is more simple because JDK 11 does not have ps_core_common.c . >>>>>> So I share you it. It may help you: >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial. Although it is just a one line change, it takes an extensive knowledge to understand the impact. I'll read up on the filed graal issue and try to understand the ELF code a bit better. >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> Please review this trivial change: >>>>>>>>>>> >>>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250826 >>>>>>>>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/ >>>>>>>>>>> >>>>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks from coredump via jhsdb. >>>>>>>>>>> >>>>>>>>>>> I've reported this issue to GraalVM community [1], and I 've found out the cause of this issue is .svm_heap would be separated to RO and RW areas by mprotect() calls in run time in spite of .svm_heap is RO section in ELF (please see [1] for details). >>>>>>>>>>> >>>>>>>>>>> It is corner case, but we will see same problem on jhsdb when we attempt to analyze coredump which comes from some applications / libraries which would separate RO sections in ELF like Substrate VM. >>>>>>>>>>> >>>>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested me to discuss in serviceability-dev. 
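
To illustrate the mprotect() behaviour described above, here is a small sketch, assuming a POSIX system with anonymous mappings. It does not use Substrate VM or .svm_heap; the function name split_readonly_mapping is hypothetical. It maps two read-only pages and then makes only the second one writable, which is the kind of runtime change that leaves one originally read-only range with two different protections.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps two read-only pages, makes the second page read-write with
 * mprotect(), and writes to it. Returns 0 on success, -1 on failure. */
int split_readonly_mapping(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        return -1;
    }
    size_t len = (size_t)page * 2;

    unsigned char *base = mmap(NULL, len, PROT_READ,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return -1;
    }

    /* Only the tail of the range becomes writable; the head stays RO. */
    if (mprotect(base + page, (size_t)page, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, len);
        return -1;
    }

    base[page] = 42;  /* allowed now: second page is RW */
    int ok = (base[page] == 42) && (base[0] == 0);  /* first page still readable */

    munmap(base, len);
    return ok ? 0 : -1;
}
```

On Linux the two pages would be expected to show up as separate ranges with different permissions (for example in /proc/self/maps), which matches the RO and RW areas mentioned in the report.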
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
From suenaga at oss.nttdata.com Tue Aug 4 01:38:08 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 4 Aug 2020 10:38:08 +0900
Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization
In-Reply-To:
References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com>
Message-ID: <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com>

Hi Chris,

Thanks for your comment!
I updated webrev:

  http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/

This change produces an infinite loop as shown below, and it works fine.

  1150:       8b 05 ae 2e 00 00       mov    0x2eae(%rip),%eax        # 4004 <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter>
  1156:       85 c0                   test   %eax,%eax
  1158:       74 f6                   je     1150


Thanks,

Yasumasa


On 2020/08/04 9:51, Chris Plummer wrote:
> Hi Yasumasa,
>
> Although I don't doubt that it works, calling fgetc() seems like an odd way to resolve this issue. I had some internal discussions on how to safely cause an infinite loop. Something like the following should work:
>
> static volatile int dummy_counter = 0;
>
> while (dummy_counter == 0) {}
>
> volatile is important because it prevents gcc from assuming dummy_counter will always be 0.
>
> thanks,
>
> Chris
>
> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Please review this change:
>>
>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250930
>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/
>>
>> The following tests, compiled with GCC 10.2, failed.
>>
>>  - vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java
>>  - vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java
>>
>> They have a native module, which is commented as below:
>>
>> ```
>>    // execute infinite loop to be sure that thread in native method
>>    while (always_true)
>>    {
>>        // Need some dummy code so the optimizer does not remove this loop.
>>        dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1;
>>    }
>>    // The optimizer can be surprisingly clever.
>>    // Use dummy_counter so it can never be optimized out.
>>    // This statement will always return 0.
>>    return dummy_counter >= 0 ? 0 : 1;
>> ```
>>
>> The C compiler may eliminate this loop. We should use another solution that does not depend on compiler optimization.
>>
>>
>> Thanks,
>>
>> Yasumasa
>
>
From suenaga at oss.nttdata.com Tue Aug 4 02:41:53 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 4 Aug 2020 11:41:53 +0900
Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization
In-Reply-To: <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com>
References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com> <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com>
Message-ID: <862f35e6-ea2a-5d1e-4865-6734e41d6dcb@oss.nttdata.com>

The submit repo reported a build failure on macOS.
Can you share details?

  Job: mach5-one-ysuenaga-JDK-8250930-1-20200804-0139-13140081


Thanks,

Yasumasa


On 2020/08/04 10:38, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Thanks for your comment!
> I updated webrev:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/
>
> This change produces infinite loop as below, it works fine.
>
>   1150:       8b 05 ae 2e 00 00       mov    0x2eae(%rip),%eax
# 4004 <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter> > ? 1156:?????? 85 c0?????????????????? test?? %eax,%eax > ? 1158:?????? 74 f6?????????????????? je???? 1150 > > > Thanks, > > Yasumasa > > > On 2020/08/04 9:51, Chris Plummer wrote: >> Hi Yasumasa, >> >> Although I don't doubt that it works, calling fgetc() seems like an odd way to resolve this issue. I had some internal discussions on how to safely cause an infinite loop. Something like the following should work: >> >> static volatile int dummy_counter = 0; >> >> while (dummy_counter == 0) {} >> >> volatile is important because it prevents gcc from assuming dummy_counter will always be 0. >> >> thanks, >> >> Chris >> >> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> Please review this change: >>> >>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250930 >>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/ >>> >>> Following tests which were compiled by GCC 10.2 failed. >>> >>> ?- vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java >>> ?- vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java >>> >>> They have native module, and they are commented as below: >>> >>> ``` >>> ?? // execute infinite loop to be sure that thread in native method >>> ?? while (always_true) >>> ?? { >>> ?????? // Need some dummy code so the optimizer does not remove this loop. >>> ?????? dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1; >>> ?? } >>> ?? // The optimizer can be surprisingly clever. >>> ?? // Use dummy_counter so it can never be optimized out. >>> ?? // This statement will always return 0. >>> ?? return dummy_counter >= 0 ? 0 : 1; >>> ``` >>> >>> C compiler maybe eliminate this loop. We should not consider compiler optimization at this point with other solution. 
>>> >>> >>> Thanks, >>> >>> Yasumasa >> >> From chris.plummer at oracle.com Tue Aug 4 03:18:04 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 3 Aug 2020 20:18:04 -0700 Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization In-Reply-To: <862f35e6-ea2a-5d1e-4865-6734e41d6dcb@oss.nttdata.com> References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com> <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com> <862f35e6-ea2a-5d1e-4865-6734e41d6dcb@oss.nttdata.com> Message-ID: <79d4eba8-2041-bd66-c4ff-a4c56a712096@oracle.com> Hi Yasumasa, I'm not sure yet. I'm waiting for a answer from the build support team. It looks like some sort of sporadic build failure unrelated to your changes. Only one of several macosx builds failed. You might just want to try to submit the changes again. thanks, Chris On 8/3/20 7:41 PM, Yasumasa Suenaga wrote: > Submit repo reported build failure on macOS. > Can you share details? > > ? Job: mach5-one-ysuenaga-JDK-8250930-1-20200804-0139-13140081 > > > Thanks, > > Yasumasa > > > On 2020/08/04 10:38, Yasumasa Suenaga wrote: >> Hi Chris, >> >> Thanks for your comment! >> I updated webrev: >> >> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/ >> >> This change produces infinite loop as below, it works fine. >> >> ?? 1150:?????? 8b 05 ae 2e 00 00?????? mov 0x2eae(%rip),%eax??????? # >> 4004 >> <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter> >> ?? 1156:?????? 85 c0?????????????????? test?? %eax,%eax >> ?? 1158:?????? 74 f6?????????????????? je???? 1150 >> >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/08/04 9:51, Chris Plummer wrote: >>> Hi Yasumasa, >>> >>> Although I don't doubt that it works, calling fgetc() seems like an >>> odd way to resolve this issue. I had some internal discussions on >>> how to safely cause an infinite loop. 
Something like the following >>> should work: >>> >>> static volatile int dummy_counter = 0; >>> >>> while (dummy_counter == 0) {} >>> >>> volatile is important because it prevents gcc from assuming >>> dummy_counter will always be 0. >>> >>> thanks, >>> >>> Chris >>> >>> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> Please review this change: >>>> >>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250930 >>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/ >>>> >>>> Following tests which were compiled by GCC 10.2 failed. >>>> >>>> ?- >>>> vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java >>>> ?- >>>> vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java >>>> >>>> They have native module, and they are commented as below: >>>> >>>> ``` >>>> ?? // execute infinite loop to be sure that thread in native method >>>> ?? while (always_true) >>>> ?? { >>>> ?????? // Need some dummy code so the optimizer does not remove >>>> this loop. >>>> ?????? dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1; >>>> ?? } >>>> ?? // The optimizer can be surprisingly clever. >>>> ?? // Use dummy_counter so it can never be optimized out. >>>> ?? // This statement will always return 0. >>>> ?? return dummy_counter >= 0 ? 0 : 1; >>>> ``` >>>> >>>> C compiler maybe eliminate this loop. We should not consider >>>> compiler optimization at this point with other solution. 
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>> >>> From chris.plummer at oracle.com Tue Aug 4 03:19:14 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 3 Aug 2020 20:19:14 -0700 Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization In-Reply-To: <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com> References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com> <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com> Message-ID: <5edb082d-08ed-60e5-70b8-0d2f29f88a35@oracle.com> Hi Yasumasa, The changes look good. thanks, Chris On 8/3/20 6:38 PM, Yasumasa Suenaga wrote: > Hi Chris, > > Thanks for your comment! > I updated webrev: > > ? http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/ > > This change produces infinite loop as below, it works fine. > > ? 1150:?????? 8b 05 ae 2e 00 00?????? mov 0x2eae(%rip),%eax??????? # > 4004 > <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter> > ? 1156:?????? 85 c0?????????????????? test?? %eax,%eax > ? 1158:?????? 74 f6?????????????????? je???? 1150 > > > > Thanks, > > Yasumasa > > > On 2020/08/04 9:51, Chris Plummer wrote: >> Hi Yasumasa, >> >> Although I don't doubt that it works, calling fgetc() seems like an >> odd way to resolve this issue. I had some internal discussions on how >> to safely cause an infinite loop. Something like the following should >> work: >> >> static volatile int dummy_counter = 0; >> >> while (dummy_counter == 0) {} >> >> volatile is important because it prevents gcc from assuming >> dummy_counter will always be 0. >> >> thanks, >> >> Chris >> >> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> Please review this change: >>> >>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250930 >>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/ >>> >>> Following tests which were compiled by GCC 10.2 failed. 
>>>
>>>  - vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java
>>>  - vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java
>>>
>>> They have native module, and they are commented as below:
>>>
>>> ```
>>>    // execute infinite loop to be sure that thread in native method
>>>    while (always_true)
>>>    {
>>>        // Need some dummy code so the optimizer does not remove this loop.
>>>        dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1;
>>>    }
>>>    // The optimizer can be surprisingly clever.
>>>    // Use dummy_counter so it can never be optimized out.
>>>    // This statement will always return 0.
>>>    return dummy_counter >= 0 ? 0 : 1;
>>> ```
>>>
>>> C compiler maybe eliminate this loop. We should not consider
>>> compiler optimization at this point with other solution.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>
>>
From chris.plummer at oracle.com Tue Aug 4 04:10:07 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 3 Aug 2020 21:10:07 -0700
Subject: RFR(S): 8247516: DSO.closestSymbolToPC() should use dbg.lookup() rather than rely on java ELF file support
In-Reply-To: <2873b4bd-09c6-7f29-04fc-3910f360def8@oracle.com>
References: <2873b4bd-09c6-7f29-04fc-3910f360def8@oracle.com>
Message-ID: <3108b833-83e1-c2f6-d1fa-7200650f5279@oracle.com>

Ping!

On 7/27/20 10:04 PM, Chris Plummer wrote:
> I should have mentioned that currently there is no testing of this
> code. There will be with the changes for [1] JDK-8247514, which will
> restore the clhsdb "whatis" functionality that was lost when JavaScript
> support went away. "whatis" used DSO.closestSymbolToPC(), so as part
> of JDK-8247514 I'm adding this support to the PointerFinder class so
> that "findpc" will also be able to do address to native symbol lookups,
> and ClhsdbFindPC will check that it is working.
> > [1] https://bugs.openjdk.java.net/browse/JDK-8247514 > > thanks, > > Chris > > On 7/27/20 9:32 PM, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8247516 >> http://cr.openjdk.java.net/~cjplummer/8247516/webrev.00/index.html >> >> I put all the details in the description of the CR, including some >> background on how symbol lookups are done, including what LoadObjects >> are and their class hierarchy, and also info on JVMDebugger subclasses. >> >> One thing not covered in the bug description is the additional >> gutting of DSO.java that comes with these changes. Many APIs were not >> used so I removed them, such as setBase(), lookupSymbol(), and >> isDSO(). Doing so allowed completely severing any need for java ELF >> file support. Note I plan on removing the java ELF file support >> itself with another CR after pushing these changes. >> >> thanks, >> >> Chris > From suenaga at oss.nttdata.com Tue Aug 4 04:43:44 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Tue, 4 Aug 2020 13:43:44 +0900 Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization In-Reply-To: <79d4eba8-2041-bd66-c4ff-a4c56a712096@oracle.com> References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com> <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com> <862f35e6-ea2a-5d1e-4865-6734e41d6dcb@oss.nttdata.com> <79d4eba8-2041-bd66-c4ff-a4c56a712096@oracle.com> Message-ID: Hi Chris, On 2020/08/04 12:18, Chris Plummer wrote: > Hi Yasumasa, > > I'm not sure yet. I'm waiting for a answer from the build support team. It looks like some sort of sporadic build failure unrelated to your changes. Only one of several macosx builds failed. You might just want to try to submit the changes again. I resubmitted the change, then it succeeded (mach5-one-ysuenaga-JDK-8250930-2-20200804-0330-13141411). It might be sporadic failure as you said. 
I will push it to jdk/jdk when you get answer from the build support team. Thanks, Yasumasa > thanks, > > Chris > > On 8/3/20 7:41 PM, Yasumasa Suenaga wrote: >> Submit repo reported build failure on macOS. >> Can you share details? >> >> ? Job: mach5-one-ysuenaga-JDK-8250930-1-20200804-0139-13140081 >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/08/04 10:38, Yasumasa Suenaga wrote: >>> Hi Chris, >>> >>> Thanks for your comment! >>> I updated webrev: >>> >>> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/ >>> >>> This change produces infinite loop as below, it works fine. >>> >>> ?? 1150:?????? 8b 05 ae 2e 00 00?????? mov 0x2eae(%rip),%eax??????? # 4004 <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter> >>> ?? 1156:?????? 85 c0?????????????????? test?? %eax,%eax >>> ?? 1158:?????? 74 f6?????????????????? je???? 1150 >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/08/04 9:51, Chris Plummer wrote: >>>> Hi Yasumasa, >>>> >>>> Although I don't doubt that it works, calling fgetc() seems like an odd way to resolve this issue. I had some internal discussions on how to safely cause an infinite loop. Something like the following should work: >>>> >>>> static volatile int dummy_counter = 0; >>>> >>>> while (dummy_counter == 0) {} >>>> >>>> volatile is important because it prevents gcc from assuming dummy_counter will always be 0. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote: >>>>> Hi all, >>>>> >>>>> Please review this change: >>>>> >>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250930 >>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/ >>>>> >>>>> Following tests which were compiled by GCC 10.2 failed. 
>>>>> >>>>> ?- vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java >>>>> ?- vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java >>>>> >>>>> They have native module, and they are commented as below: >>>>> >>>>> ``` >>>>> ?? // execute infinite loop to be sure that thread in native method >>>>> ?? while (always_true) >>>>> ?? { >>>>> ?????? // Need some dummy code so the optimizer does not remove this loop. >>>>> ?????? dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1; >>>>> ?? } >>>>> ?? // The optimizer can be surprisingly clever. >>>>> ?? // Use dummy_counter so it can never be optimized out. >>>>> ?? // This statement will always return 0. >>>>> ?? return dummy_counter >= 0 ? 0 : 1; >>>>> ``` >>>>> >>>>> C compiler maybe eliminate this loop. We should not consider compiler optimization at this point with other solution. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>> >>>> > > From suenaga at oss.nttdata.com Tue Aug 4 05:10:12 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Tue, 4 Aug 2020 14:10:12 +0900 Subject: RFR(S): 8247516: DSO.closestSymbolToPC() should use dbg.lookup() rather than rely on java ELF file support In-Reply-To: <3108b833-83e1-c2f6-d1fa-7200650f5279@oracle.com> References: <2873b4bd-09c6-7f29-04fc-3910f360def8@oracle.com> <3108b833-83e1-c2f6-d1fa-7200650f5279@oracle.com> Message-ID: <18037da5-ef72-0de3-a7cb-a5dc4470f285@oss.nttdata.com> Hi Chris, Looks good. Yasumasa On 2020/08/04 13:10, Chris Plummer wrote: > Ping! > > On 7/27/20 10:04 PM, Chris Plummer wrote: >> I should have mentioned that currently there is no testing of this code. There will with the changes for [1] JDK-8247514, which will add the lost clhsdb "whatis" functionality, which was lost when JavaScript support went away. 
"whatis" used DSO.closestSymbolToPC(), so as part of JDK-8247514 I'm adding this support to the PointerFinder class so the "findpc" will also be able to do address to native symbol lookups, and the ClhsdbFindPC will check that it is working. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8247514 >> >> thanks, >> >> Chris >> >> On 7/27/20 9:32 PM, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8247516 >>> http://cr.openjdk.java.net/~cjplummer/8247516/webrev.00/index.html >>> >>> I put all the details in the description of the CR, including some background on how symbol lookups are done, including what LoadObjects are and their class hierarchy, and also info on JVMDebugger subclasses. >>> >>> One thing not covered in the bug description is the additional gutting of DSO.java that comes with these changes. Many APIs were not used so I removed them, such as setBase(), lookupSymbol(), and isDSO(). Doing so allowed completely severing any need for java ELF file support. Note I plan on removing the java ELF file support itself with another CR after pushing these changes. >>> >>> thanks, >>> >>> Chris >> > From chris.plummer at oracle.com Tue Aug 4 05:12:12 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 3 Aug 2020 22:12:12 -0700 Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization In-Reply-To: References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com> <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com> <862f35e6-ea2a-5d1e-4865-6734e41d6dcb@oss.nttdata.com> <79d4eba8-2041-bd66-c4ff-a4c56a712096@oracle.com> Message-ID: <9cb65898-9d95-1a13-4956-ad4e759d844f@oracle.com> On 8/3/20 9:43 PM, Yasumasa Suenaga wrote: > Hi Chris, > > On 2020/08/04 12:18, Chris Plummer wrote: >> Hi Yasumasa, >> >> I'm not sure yet. I'm waiting for a answer from the build support >> team. 
It looks like some sort of sporadic build failure unrelated to >> your changes. Only one of several macosx builds failed. You might >> just want to try to submit the changes again. > > I resubmitted the change, then it succeeded > (mach5-one-ysuenaga-JDK-8250930-2-20200804-0330-13141411). > It might be sporadic failure as you said. > > I will push it to jdk/jdk when you get answer from the build support > team. > Still not 100% what went wrong, but it's clear it's not due to your changes. You can push, but you still need to get a 2nd reviewer. thanks, Chris > > Thanks, > > Yasumasa > > >> thanks, >> >> Chris >> >> On 8/3/20 7:41 PM, Yasumasa Suenaga wrote: >>> Submit repo reported build failure on macOS. >>> Can you share details? >>> >>> ? Job: mach5-one-ysuenaga-JDK-8250930-1-20200804-0139-13140081 >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/08/04 10:38, Yasumasa Suenaga wrote: >>>> Hi Chris, >>>> >>>> Thanks for your comment! >>>> I updated webrev: >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/ >>>> >>>> This change produces infinite loop as below, it works fine. >>>> >>>> ?? 1150:?????? 8b 05 ae 2e 00 00?????? mov 0x2eae(%rip),%eax??????? >>>> # 4004 >>>> <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter> >>>> ?? 1156:?????? 85 c0?????????????????? test?? %eax,%eax >>>> ?? 1158:?????? 74 f6?????????????????? je???? 1150 >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/08/04 9:51, Chris Plummer wrote: >>>>> Hi Yasumasa, >>>>> >>>>> Although I don't doubt that it works, calling fgetc() seems like >>>>> an odd way to resolve this issue. I had some internal discussions >>>>> on how to safely cause an infinite loop. 
Something like the >>>>> following should work: >>>>> >>>>> static volatile int dummy_counter = 0; >>>>> >>>>> while (dummy_counter == 0) {} >>>>> >>>>> volatile is important because it prevents gcc from assuming >>>>> dummy_counter will always be 0. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review this change: >>>>>> >>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250930 >>>>>> ? webrev: >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/ >>>>>> >>>>>> Following tests which were compiled by GCC 10.2 failed. >>>>>> >>>>>> ?- >>>>>> vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java >>>>>> ?- >>>>>> vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java >>>>>> >>>>>> They have native module, and they are commented as below: >>>>>> >>>>>> ``` >>>>>> ?? // execute infinite loop to be sure that thread in native method >>>>>> ?? while (always_true) >>>>>> ?? { >>>>>> ?????? // Need some dummy code so the optimizer does not remove >>>>>> this loop. >>>>>> ?????? dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1; >>>>>> ?? } >>>>>> ?? // The optimizer can be surprisingly clever. >>>>>> ?? // Use dummy_counter so it can never be optimized out. >>>>>> ?? // This statement will always return 0. >>>>>> ?? return dummy_counter >= 0 ? 0 : 1; >>>>>> ``` >>>>>> >>>>>> C compiler maybe eliminate this loop. We should not consider >>>>>> compiler optimization at this point with other solution. 
>>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>> >>>>> >> >> From suenaga at oss.nttdata.com Tue Aug 4 05:13:54 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Tue, 4 Aug 2020 14:13:54 +0900 Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization In-Reply-To: <9cb65898-9d95-1a13-4956-ad4e759d844f@oracle.com> References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com> <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com> <862f35e6-ea2a-5d1e-4865-6734e41d6dcb@oss.nttdata.com> <79d4eba8-2041-bd66-c4ff-a4c56a712096@oracle.com> <9cb65898-9d95-1a13-4956-ad4e759d844f@oracle.com> Message-ID: On 2020/08/04 14:12, Chris Plummer wrote: > On 8/3/20 9:43 PM, Yasumasa Suenaga wrote: >> Hi Chris, >> >> On 2020/08/04 12:18, Chris Plummer wrote: >>> Hi Yasumasa, >>> >>> I'm not sure yet. I'm waiting for a answer from the build support team. It looks like some sort of sporadic build failure unrelated to your changes. Only one of several macosx builds failed. You might just want to try to submit the changes again. >> >> I resubmitted the change, then it succeeded (mach5-one-ysuenaga-JDK-8250930-2-20200804-0330-13141411). >> It might be sporadic failure as you said. >> >> I will push it to jdk/jdk when you get answer from the build support team. >> > Still not 100% what went wrong, but it's clear it's not due to your changes. You can push, but you still need to get a 2nd reviewer. Ok, I'm waiting for 2nd reviewer for this change. Thanks, Yasumasa > thanks, > > Chris >> >> Thanks, >> >> Yasumasa >> >> >>> thanks, >>> >>> Chris >>> >>> On 8/3/20 7:41 PM, Yasumasa Suenaga wrote: >>>> Submit repo reported build failure on macOS. >>>> Can you share details? >>>> >>>> ? Job: mach5-one-ysuenaga-JDK-8250930-1-20200804-0139-13140081 >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/08/04 10:38, Yasumasa Suenaga wrote: >>>>> Hi Chris, >>>>> >>>>> Thanks for your comment! 
>>>>> I updated webrev: >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/ >>>>> >>>>> This change produces infinite loop as below, it works fine. >>>>> >>>>> ?? 1150:?????? 8b 05 ae 2e 00 00?????? mov 0x2eae(%rip),%eax # 4004 <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter> >>>>> ?? 1156:?????? 85 c0?????????????????? test?? %eax,%eax >>>>> ?? 1158:?????? 74 f6?????????????????? je???? 1150 >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/08/04 9:51, Chris Plummer wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> Although I don't doubt that it works, calling fgetc() seems like an odd way to resolve this issue. I had some internal discussions on how to safely cause an infinite loop. Something like the following should work: >>>>>> >>>>>> static volatile int dummy_counter = 0; >>>>>> >>>>>> while (dummy_counter == 0) {} >>>>>> >>>>>> volatile is important because it prevents gcc from assuming dummy_counter will always be 0. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review this change: >>>>>>> >>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250930 >>>>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/ >>>>>>> >>>>>>> Following tests which were compiled by GCC 10.2 failed. >>>>>>> >>>>>>> ?- vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java >>>>>>> ?- vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java >>>>>>> >>>>>>> They have native module, and they are commented as below: >>>>>>> >>>>>>> ``` >>>>>>> ?? // execute infinite loop to be sure that thread in native method >>>>>>> ?? while (always_true) >>>>>>> ?? { >>>>>>> ?????? // Need some dummy code so the optimizer does not remove this loop. >>>>>>> ?????? 
dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1; >>>>>>> ?? } >>>>>>> ?? // The optimizer can be surprisingly clever. >>>>>>> ?? // Use dummy_counter so it can never be optimized out. >>>>>>> ?? // This statement will always return 0. >>>>>>> ?? return dummy_counter >= 0 ? 0 : 1; >>>>>>> ``` >>>>>>> >>>>>>> C compiler maybe eliminate this loop. We should not consider compiler optimization at this point with other solution. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>> >>>>>> >>> >>> > > From david.holmes at oracle.com Tue Aug 4 06:18:03 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 4 Aug 2020 16:18:03 +1000 Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization In-Reply-To: References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com> <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com> <862f35e6-ea2a-5d1e-4865-6734e41d6dcb@oss.nttdata.com> <79d4eba8-2041-bd66-c4ff-a4c56a712096@oracle.com> <9cb65898-9d95-1a13-4956-ad4e759d844f@oracle.com> Message-ID: <81a92427-82f8-dffc-c4ff-14d2b282f254@oracle.com> Looks good to me. Thanks, David On 4/08/2020 3:13 pm, Yasumasa Suenaga wrote: > On 2020/08/04 14:12, Chris Plummer wrote: >> On 8/3/20 9:43 PM, Yasumasa Suenaga wrote: >>> Hi Chris, >>> >>> On 2020/08/04 12:18, Chris Plummer wrote: >>>> Hi Yasumasa, >>>> >>>> I'm not sure yet. I'm waiting for a answer from the build support >>>> team. It looks like some sort of sporadic build failure unrelated to >>>> your changes. Only one of several macosx builds failed. You might >>>> just want to try to submit the changes again. >>> >>> I resubmitted the change, then it succeeded >>> (mach5-one-ysuenaga-JDK-8250930-2-20200804-0330-13141411). >>> It might be sporadic failure as you said. >>> >>> I will push it to jdk/jdk when you get answer from the build support >>> team. >>> >> Still not 100% what went wrong, but it's clear it's not due to your >> changes. 
You can push, but you still need to get a 2nd reviewer. > > Ok, I'm waiting for 2nd reviewer for this change. > > > Thanks, > > Yasumasa > > >> thanks, >> >> Chris >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 8/3/20 7:41 PM, Yasumasa Suenaga wrote: >>>>> Submit repo reported build failure on macOS. >>>>> Can you share details? >>>>> >>>>> ? Job: mach5-one-ysuenaga-JDK-8250930-1-20200804-0139-13140081 >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/08/04 10:38, Yasumasa Suenaga wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Thanks for your comment! >>>>>> I updated webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/ >>>>>> >>>>>> This change produces infinite loop as below, it works fine. >>>>>> >>>>>> ?? 1150:?????? 8b 05 ae 2e 00 00?????? mov 0x2eae(%rip),%eax # >>>>>> 4004 >>>>>> <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter> >>>>>> >>>>>> ?? 1156:?????? 85 c0?????????????????? test?? %eax,%eax >>>>>> ?? 1158:?????? 74 f6?????????????????? je???? 1150 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/08/04 9:51, Chris Plummer wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> Although I don't doubt that it works, calling fgetc() seems like >>>>>>> an odd way to resolve this issue. I had some internal discussions >>>>>>> on how to safely cause an infinite loop. Something like the >>>>>>> following should work: >>>>>>> >>>>>>> static volatile int dummy_counter = 0; >>>>>>> >>>>>>> while (dummy_counter == 0) {} >>>>>>> >>>>>>> volatile is important because it prevents gcc from assuming >>>>>>> dummy_counter will always be 0. >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review this change: >>>>>>>> >>>>>>>> ? 
JBS: https://bugs.openjdk.java.net/browse/JDK-8250930 >>>>>>>> ? webrev: >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/ >>>>>>>> >>>>>>>> Following tests which were compiled by GCC 10.2 failed. >>>>>>>> >>>>>>>> ?- >>>>>>>> vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java >>>>>>>> >>>>>>>> ?- >>>>>>>> vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java >>>>>>>> >>>>>>>> >>>>>>>> They have native module, and they are commented as below: >>>>>>>> >>>>>>>> ``` >>>>>>>> ?? // execute infinite loop to be sure that thread in native method >>>>>>>> ?? while (always_true) >>>>>>>> ?? { >>>>>>>> ?????? // Need some dummy code so the optimizer does not remove >>>>>>>> this loop. >>>>>>>> ?????? dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter >>>>>>>> + 1; >>>>>>>> ?? } >>>>>>>> ?? // The optimizer can be surprisingly clever. >>>>>>>> ?? // Use dummy_counter so it can never be optimized out. >>>>>>>> ?? // This statement will always return 0. >>>>>>>> ?? return dummy_counter >= 0 ? 0 : 1; >>>>>>>> ``` >>>>>>>> >>>>>>>> C compiler maybe eliminate this loop. We should not consider >>>>>>>> compiler optimization at this point with other solution. 
>>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>> >>>>>>> >>>> >>>> >> >> From suenaga at oss.nttdata.com Tue Aug 4 06:24:30 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Tue, 4 Aug 2020 15:24:30 +0900 Subject: RFR: 8250930: [TESTBUG] Some forceEarlyReturn00* tests failed due to compiler optimization In-Reply-To: <81a92427-82f8-dffc-c4ff-14d2b282f254@oracle.com> References: <16dae51b-f1c5-c283-acc6-c106366d82be@oss.nttdata.com> <1d59b607-aa20-8808-0bbc-9cce1c8ed28c@oss.nttdata.com> <862f35e6-ea2a-5d1e-4865-6734e41d6dcb@oss.nttdata.com> <79d4eba8-2041-bd66-c4ff-a4c56a712096@oracle.com> <9cb65898-9d95-1a13-4956-ad4e759d844f@oracle.com> <81a92427-82f8-dffc-c4ff-14d2b282f254@oracle.com> Message-ID: Thanks David! Yasumasa On 2020/08/04 15:18, David Holmes wrote: > Looks good to me. > > Thanks, > David > > On 4/08/2020 3:13 pm, Yasumasa Suenaga wrote: >> On 2020/08/04 14:12, Chris Plummer wrote: >>> On 8/3/20 9:43 PM, Yasumasa Suenaga wrote: >>>> Hi Chris, >>>> >>>> On 2020/08/04 12:18, Chris Plummer wrote: >>>>> Hi Yasumasa, >>>>> >>>>> I'm not sure yet. I'm waiting for a answer from the build support team. It looks like some sort of sporadic build failure unrelated to your changes. Only one of several macosx builds failed. You might just want to try to submit the changes again. >>>> >>>> I resubmitted the change, then it succeeded (mach5-one-ysuenaga-JDK-8250930-2-20200804-0330-13141411). >>>> It might be sporadic failure as you said. >>>> >>>> I will push it to jdk/jdk when you get answer from the build support team. >>>> >>> Still not 100% what went wrong, but it's clear it's not due to your changes. You can push, but you still need to get a 2nd reviewer. >> >> Ok, I'm waiting for 2nd reviewer for this change. 
>> >> >> Thanks, >> >> Yasumasa >> >> >>> thanks, >>> >>> Chris >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 8/3/20 7:41 PM, Yasumasa Suenaga wrote: >>>>>> Submit repo reported build failure on macOS. >>>>>> Can you share details? >>>>>> >>>>>> ? Job: mach5-one-ysuenaga-JDK-8250930-1-20200804-0139-13140081 >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/08/04 10:38, Yasumasa Suenaga wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Thanks for your comment! >>>>>>> I updated webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.01/ >>>>>>> >>>>>>> This change produces infinite loop as below, it works fine. >>>>>>> >>>>>>> ?? 1150:?????? 8b 05 ae 2e 00 00?????? mov 0x2eae(%rip),%eax # 4004 <_ZZ100Java_nsk_jdwp_ThreadReference_ForceEarlyReturn_forceEarlyReturn002_forceEarlyReturn002a_nativeMethodE13dummy_counter> >>>>>>> ?? 1156:?????? 85 c0?????????????????? test?? %eax,%eax >>>>>>> ?? 1158:?????? 74 f6?????????????????? je???? 1150 >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2020/08/04 9:51, Chris Plummer wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> Although I don't doubt that it works, calling fgetc() seems like an odd way to resolve this issue. I had some internal discussions on how to safely cause an infinite loop. Something like the following should work: >>>>>>>> >>>>>>>> static volatile int dummy_counter = 0; >>>>>>>> >>>>>>>> while (dummy_counter == 0) {} >>>>>>>> >>>>>>>> volatile is important because it prevents gcc from assuming dummy_counter will always be 0. >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 8/2/20 10:55 PM, Yasumasa Suenaga wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Please review this change: >>>>>>>>> >>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250930 >>>>>>>>> ? 
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250930/webrev.00/ >>>>>>>>> >>>>>>>>> Following tests which were compiled by GCC 10.2 failed. >>>>>>>>> >>>>>>>>> ?- vmTestbase/nsk/jdi/ThreadReference/forceEarlyReturn/forceEarlyReturn004/forceEarlyReturn004.java >>>>>>>>> ?- vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java >>>>>>>>> >>>>>>>>> They have native module, and they are commented as below: >>>>>>>>> >>>>>>>>> ``` >>>>>>>>> ?? // execute infinite loop to be sure that thread in native method >>>>>>>>> ?? while (always_true) >>>>>>>>> ?? { >>>>>>>>> ?????? // Need some dummy code so the optimizer does not remove this loop. >>>>>>>>> ?????? dummy_counter = dummy_counter < 1000 ? 0 : dummy_counter + 1; >>>>>>>>> ?? } >>>>>>>>> ?? // The optimizer can be surprisingly clever. >>>>>>>>> ?? // Use dummy_counter so it can never be optimized out. >>>>>>>>> ?? // This statement will always return 0. >>>>>>>>> ?? return dummy_counter >= 0 ? 0 : 1; >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> C compiler maybe eliminate this loop. We should not consider compiler optimization at this point with other solution. 
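The volatile-based loop suggested earlier in this thread can be sketched as follows. This is a minimal illustration only, not the code from the webrev: the helper name spin_while_zero and the max_spins bound are hypothetical, and the bound exists only so the sketch terminates when exercised.

```cpp
#include <cassert>

// Hypothetical helper (not part of the nsk test sources).
// Spins while *flag is 0, giving up after max_spins iterations.
// Because flag points to a volatile int, the compiler must reload the
// value on every iteration and cannot assume that it stays 0, so the
// loop cannot be folded away by the optimizer.
int spin_while_zero(const volatile int* flag, int max_spins) {
    assert(max_spins >= 0);
    int spins = 0;
    while (*flag == 0 && spins < max_spins) {
        ++spins;
    }
    return spins;
}
```

In the tests discussed here the loop is meant to be unbounded; dropping the max_spins condition gives the `while (dummy_counter == 0) {}` form quoted in this thread.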
>>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>> >>>>> >>> >>> From serguei.spitsyn at oracle.com Tue Aug 4 09:11:52 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Aug 2020 02:11:52 -0700 Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM In-Reply-To: <29af768a-bebe-13fe-ac51-3062ede936bf@oss.nttdata.com> References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com> <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com> <6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com> <29af768a-bebe-13fe-ac51-3062ede936bf@oss.nttdata.com> Message-ID: <2ca95f3f-c47f-d2c9-ab4f-347f65e74d9f@oracle.com> An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Tue Aug 4 12:11:30 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 4 Aug 2020 14:11:30 +0200 Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) In-Reply-To: <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com> References: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com> <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com> Message-ID: <1aa24f16-4813-6ef6-fb37-54a974c69940@oracle.com> Hi Lin, Some small nits: Could you go over the patch and move both declaration and definition of the newly added heap functions, so that their location match the one chosen in collectedHeap.hpp? And that the locations is consistent between the hpp and cpp files? https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/gc/serial/serialHeap.hpp.udiff.html + // Runs the given AbstractGangTask with the current active workers. Since the SerialGC doesn't use "workers", this comment needs to be updated. 
Maybe use the comments from the serialHeap.cpp change?

https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/gc/shared/collectedHeap.hpp.udiff.html

 #include "memory/allocation.hpp"
 #include "memory/universe.hpp"
+#include "memory/heapInspection.hpp"

The new include breaks the sorting.

https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/gc/shared/gcVMOperations.hpp.patch

You changed the indentation here:

+ VM_GC_HeapInspection(outputStream* out, bool request_full_gc,
+ uint parallel_thread_num = 1) :
+ VM_GC_Operation(0 /* total collections, dummy, ignored */,

Could you reindent VM_GC_Operation and subsequent lines?

https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/gc/z/zHeap.hpp.udiff.html

+ void run_task(AbstractGangTask* task);
 // Reference processing
 ReferenceDiscoverer* reference_discoverer();
 void set_soft_reference_policy(bool clear);

The grouping of this is awkward. The run_task function has nothing to do with reference processing and shouldn't be grouped with it. I propose that you add a newline between line 103 and 104.

Except for these nits, the rest of the GC code looks good. Note that I'm only reviewing the changes to share/gc, not the rest of the changes. I think it would be prudent to get two other reviewers for the rest of the code changes.

With that said, I saw the comment and the change from 'size_t missed_count' to 'uint missed_count'. This changes the variable to a 32 bit variable on 64 bit builds. It seems like that could cause overflows. Since missed_count wasn't added by this change, maybe not change the type as part of this RFE?

Thanks,
StefanK

On 2020-08-03 16:51, linzang wrote:
> Dear Stefan,
> May I ask your help to review again? I have made a delta based on the last changeset you reviewed (webrev04); I hope it eases your reviewing work.
> webrev: https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/
> delta (vs webrev04): https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/
> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> CSR (approved): https://bugs.openjdk.java.net/browse/JDK-8239290
>
> BRs,
> Lin
>
> On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote:
>
> A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version.
>
> Thanks,
> Paul
>
> On 7/29/20, 5:02 AM, "linzang" wrote:
>
> Uploaded a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/
> It fixes a failure on Windows:
>
> ####################################
> In heapInspection.cpp
> - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) {
> + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) {
> ####################################
> In heapInspection.hpp
> - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0);
> + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0);
> ####################################
>
>
> BRs,
> Lin
>
> On 2020/7/27, 11:26 AM, "linzang" wrote:
>
> I uploaded a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09
> It includes a tiny fix for a build failure on Windows:
> ####################################
> In attachListener.cpp:
> - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8);
> + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8);
> ####################################
>
> BRs,
> Lin
>
> On 2020/7/23, 11:56 AM, "linzang" wrote:
>
> Hi Paul,
> Thanks for your help, that all
looks good to me. > Just 2 minor changes: > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > ? delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > @@ -251,7 +251,6 @@ > _size_of_instances_in_words += cie->words(); > return true; > } > - > return false; > } > > @@ -568,7 +567,6 @@ > Atomic::add(&_missed_count, missed_count); > } else { > Atomic::store(&_success, false); > - return; > } > } > ######################################################################### > > > Here is the webrev http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > Lin > --------------------------------------------- > From: "Hohensee, Paul" > Date: Thursday, July 23, 2020 at 6:48 AM > To: "linzang(??)" , Stefan Karlsson , "serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev , "hotspot-gc-dev at openjdk.java.net" > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > + return; > + } > > with > > + Atomic::store(&_success, false); > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. > > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. 
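The default thread count that appears in the diffs in this thread (three eighths of the active processors, never fewer than one) together with the int-to-uint cast requested above can be sketched on its own as follows. This is an illustrative stand-alone version, assuming a plain int input; default_parallel_thread_num is a hypothetical name and does not exist in HotSpot.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-alone sketch (not HotSpot code).
// Converts the processor count to unsigned first, so that both arguments
// to std::max have the same type, then takes 3/8 of it with a floor of 1.
unsigned default_parallel_thread_num(int active_processors) {
    assert(active_processors >= 0);
    unsigned procs = static_cast<unsigned>(active_processors);
    return std::max(1u, procs * 3 / 8);
}
```

With integer division, 8 processors give 3 threads and 2 processors give 0, which the floor raises to 1.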
>
> BasicJMapTest.java:
>
> I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible.
>
> Webrev with the above changes in
>
> http://cr.openjdk.java.net/~phh/8214535/webrev.01/
>
> Thanks,
> Paul
>
> On 7/15/20, 2:13 AM, "linzang" wrote:
>
> Uploaded a new webrev at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/
> It fixes a potential issue where an unexpected number of threads may be calculated for the "parallel" option of jmap -histo in a container.
> As shown at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html
>
> ############### attachListener.cpp ####################
> @@ -252,11 +252,11 @@
>  static jint heap_inspection(AttachOperation* op, outputStream* out) {
>    bool live_objects_only = true; // default is true to retain the behavior before this change is made
>    outputStream* os = out; // if path not specified or path is NULL, use out
>    fileStream* fs = NULL;
>    const char* arg0 = op->arg(0);
> -  uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors.
> +  uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors.
>    if (arg0 != NULL && (strlen(arg0) > 0)) {
>      if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) {
>        out->print_cr("Invalid argument to inspectheap operation: %s", arg0);
>        return JNI_ERR;
>      }
> ###################################################
>
> Thanks.
>
> BRs,
> Lin
>
> On 2020/7/9, 3:22 PM, "linzang" wrote:
>
> Hi Paul,
> Thanks for reviewing!
> >>
> >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel".
I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > >> > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed in http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes like http://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". 
I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > +// Parallel heap inspection task. Parallel inspection can fail due to > +// a native OOM when allocating memory for TL-KlassInfoTable. > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > + } else { > + return false; > + } > with > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
>
> I'd reorganize the code in populate_table() to make it more clear, viz. (I changed _shared_missed_count to _missed_count)
> + if (cit.allocation_failed()) {
> +   // fail to allocate memory, stop parallel mode
> +   Atomic::store(&_success, false);
> +   return;
> + }
> + RecordInstanceClosure ric(&cit, _filter);
> + _poi->object_iterate(&ric, worker_id);
> + missed_count = ric.missed_count();
> + {
> +   MutexLocker x(&_mutex);
> +   merge_success = _shared_cit->merge(&cit);
> + }
> + if (merge_success) {
> +   Atomic::add(&_missed_count, missed_count);
> + } else {
> +   Atomic::store(&_success, false);
> + }
>
> Thanks,
> Paul
>
> On 6/29/20, 7:20 PM, "linzang" wrote:
>
> Dear All,
> Sorry to bother you again. I just want to check: is this change worth continuing to work on? If the decision is not to, I can drop this work and stop asking for review help...
> Thanks for all your help with reviewing this previously.
>
> BRs,
> Lin
>
> On 2020/5/9, 3:47 PM, "linzang" wrote:
>
> Dear All,
> May I ask your help again to review the latest change? Thanks!
>
> BRs,
> Lin
>
> On 2020/4/28, 1:54 PM, "linzang" wrote:
>
> Hi Stefan,
> >> - Adding Atomic::load/store.
> >> - Removing the time measurement in the run_task. I renamed G1's function
> >> to run_task_timed. If we need this outside of G1, we can rethink the API
> >> at that point.
> >> - ZGC style cleanups
> Thanks for revising the patch, they are all good to me, and I have made a tiny change based on it:
> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/
> It reduces the scope of the mutex in ParHeapInspectTask and deletes unnecessary comments.
>
> BRs,
> Lin
>
> On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:
>
> Hi Lin,
>
> On 2020-04-26 05:10, linzang wrote:
> > Hi Stefan and Paul,
> > I have made a new patch based on your comments and Stefan's Poc code: > > Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > Delta(based on Stefan's change:) : http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > others to review. > > > > > And Here are main changed I made and want to discuss with you: > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > primitives. We're moving toward a later C++ standard were data races are > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > only accept AbstraceGangTask* as argument, so I made a delegate class > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > - Removing the time measurement in the run_task. I renamed G1's function > to run_task_timed. If we need this outside of G1, we can rethink the API > at that point. > - ZGC style cleanups > > Thanks, > StefanK > > > > > BRs, > > Lin > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > BRs, > > Lin > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > Thanks, > > Paul > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" wrote: > > > > Dear Stefan, > > > > Thanks a lot! I agree with you to decouple the heap inspection code with GC's. 
> > I will start from your POC code, may discuss with you later. > > > > > > BRs, > > Lin > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > Hi Lin, > > > > I took a look at this earlier and saw that the heap inspection code is > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > if we'd abstract this away, so that the GCs only provide a "parallel > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > I started experimenting with doing that, but other higher-priority (to > > me) tasks have had to take precedence. > > > > I've uploaded my work-in-progress / proof-of-concept: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > The current code doesn't handle the lifecycle (deletion) of the > > ParallelObjectIterators. There's also code left unimplemented in around > > CollectedHeap::run_task. However, I think this could work as a basis to > > pull out the heap inspection code out of the GCs. > > > > Thanks, > > StefanK > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > Dear all, > > > May I ask you help to review? This RFR has been there for quite a while. > > > Thanks! > > > > > > BRs, > > > Lin > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). 
> > >> webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/
> > >> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> > >> CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> > >> BRs,
> > >> Lin
> > >> > On 2020/3/2, 9:56 PM, "linzang" wrote:
> > >> >
> > >> > Dear all,
> > >> > Let me try to ease the reviewing work with some explanation :P
> > >> > The patch's target is to speed up heap iteration for jmap -histo; from my experience this is necessary for investigating large heaps. E.g. in a big-data scenario I have tried to run jmap -histo against a 180GB heap, and it does take quite a while.
> > >> > And if my understanding is correct, even jmap -histo without the "live" option does heap inspection with the heap lock acquired, so it is very likely to block mutator threads in an allocation-sensitive scenario. I would say the faster the heap inspection runs, the shorter the mutators are blocked. This is why parallel iteration for jmap is necessary.
> > >> > I think parallel heap inspection should be applied to all kinds of heap. However, considering that the heap layouts are different for the GCs, much time is required to understand all kinds of heap layout to make the whole change. IMO, it is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally. The first patch (this one) is going to confirm the implementation details of how jmap accepts the new option and passes it to the attachListener of the JVM process, and then how to make the parallel inspection closure generic enough to be easy to extend to different heap layouts, and also how to implement the heap inspection in a specific GC's heap. This patch uses G1's heap as the beginning.
> > >> > This patch actually does several things:
> > >> > 1. Add an option "parallelThreadNum=<N>" to jmap -histo. The default behavior is to set N to 0, which means letting the JVM decide how many threads to use for heap inspection. Setting this option to 1 will disable parallel heap inspection. (More details in CSR: https://bugs.openjdk.java.net/browse/JDK-8239290)
> > >> > 2. Make a change in how JMap passes arguments, changes in http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it passes options as separate arguments to attachListener; this patch changes it so that all options are composed into a single string. So the arg_count_max in attachListener.hpp does not need to be changed, which avoids the compatibility issue discussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
> > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp. Its work(uint worker_id) method prepares the data structure (KlassInfoTable) needed for every parallel worker thread, and then calls do_object_iterate_parallel(), which is the heap-specific implementation. I also added some mechanisms in KlassInfoTable to support parallel iteration, such as merge().
> > >> > 4. In the specific heap (G1 in this patch), create a subclass of ParHeapInspectTask and implement do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invokes g1CollectedHeap's object_iterate_parallel().
> > >> > 5. Add related tests.
> > >> > 6. It should be easy to extend this patch to other kinds of heap by creating a subclass of ParHeapInspectTask and implementing do_object_iterate_parallel().
> > >> >
> > >> > Hope this info helps with code review and initiates the discussion :-)
> > >> > Thanks!
> > >> >
> > >> > BRs,
> > >> > Lin
> > >> > > On 2020/2/19, 9:40 AM, "linzang" wrote:
> > >> > >
> > >> > > Re-posting this RFR with the correct enhancement number to make it trackable.
> > >> > > Please ignore the previous wrong post. Sorry for the trouble.
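The per-worker flow described in points 3 and 4 above (fill a thread-local table, then merge it into the shared table under a lock) can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the HotSpot implementation: Histogram stands in for KlassInfoTable, class names are plain strings, and inspect_slice is a hypothetical function standing in for one worker's work(worker_id) call.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for KlassInfoTable: class name -> instance count.
using Histogram = std::map<std::string, long>;

// One worker's share of the inspection (hypothetical, not HotSpot code).
// The worker counts every workers-th object into a private table without
// taking any lock, then holds the mutex only for the short merge step.
void inspect_slice(const std::vector<std::string>& objects,
                   unsigned worker_id, unsigned workers,
                   Histogram& shared, std::mutex& shared_lock) {
    assert(workers > 0 && worker_id < workers);
    Histogram local;
    for (std::size_t i = worker_id; i < objects.size(); i += workers) {
        ++local[objects[i]];
    }
    std::lock_guard<std::mutex> guard(shared_lock);
    for (const auto& entry : local) {
        shared[entry.first] += entry.second;
    }
}
```

In a real parallel run each worker thread would call inspect_slice with its own worker_id; keeping the counting outside the lock is what lets the workers proceed independently until the merge.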
> > >> > > > > >> > > webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > >> > > Hi bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > > >> > > CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > > >> > > -------------- > > >> > > Lin > > >> > > >Hi Lin, > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > >> > > >the message subject? > > >> > > >It will be more trackable this way. > > >> > > > > > >> > > >Thanks, > > >> > > >Serguei > > >> > > > > > >> > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > >> > > >> Dear David, > > >> > > >> Thanks a lot! > > >> > > >> I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > >> > > >> > > >> > > >> Thanks, > > >> > > >> -------------- > > >> > > >> Lin > > >> > > >>> Hi Lin, > > >> > > >>> > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > >> > > >>> > > >> > > >>> I happened to spot one nit when browsing: > > >> > > >>> > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > >> > > >>> > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > >> > > >>> + BoolObjectClosure* filter, > > >> > > >>> + size_t* missed_count, > > >> > > >>> + size_t thread_num) { > > >> > > >>> + return NULL; > > >> > > >>> > > >> > > >>> s/NULL/false/ > > >> > > >>> > > >> > > >>> Cheers, > > >> > > >>> David > > > > > > >>> > > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > > >> > > >>>> Dear All, > > >> > > >>>> May I ask your help to review the follow changes: > > >> > > >>>> webrev: > > >> > > >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > >> > > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > > >> > > >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. 
> > >> > > >>>>
> > >> > > >>>> ------------------------------------------------------------------------
> > >> > > >>>> BRs,
> > >> > > >>>> Lin

From suenaga at oss.nttdata.com Tue Aug 4 12:22:43 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 4 Aug 2020 21:22:43 +0900
Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To: <2ca95f3f-c47f-d2c9-ab4f-347f65e74d9f@oracle.com>
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com> <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com> <6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com> <29af768a-bebe-13fe-ac51-3062ede936bf@oss.nttdata.com> <2ca95f3f-c47f-d2c9-ab4f-347f65e74d9f@oracle.com>
Message-ID: <94d17e2d-9b58-a50a-62cb-60d391101f47@oss.nttdata.com>

Hi Serguei,

Thanks for your comment!

On 2020/08/04 18:11, serguei.spitsyn at oracle.com wrote:
> Hi Yasumasa,
>
> The fix looks good to me.
> Thanks to Chris for discussing the details in previous emails.
>
> Just one suggestion:
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c.frames.html
>
> #ifdef PF_R
> #define MAP_R_FLAG PF_R
> #else
> #define MAP_R_FLAG 0
> #endif
>
> Could you, please add a small comment before?
> Something like this would be enough, I think:
>   // Define a segment permission flag allowing read.

I added it to new webrev:

http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.03/

Thanks,

Yasumasa

> Thanks,
> Serguei
>
>
> On 8/3/20 17:54, Yasumasa Suenaga wrote:
>> Thanks Chris!
>> I will push it when I got second reviewer.
>>
>>
>> Yasumasa
>>
>>
>> On 2020/08/04 9:54, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> Your changes look good now.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 8/3/20 5:47 PM, Yasumasa Suenaga wrote:
>>>> Hi Chris,
>>>>
>>>> Thank you for the comment!
>>>> I updated webrev:
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/
>>>>   Diff from webrev.01: http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2
>>>>
>>>> On 2020/08/04 6:41, Chris Plummer wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> Your updated fix resulted in using the core file map whereas the original fix used the library map. In both cases the assert is avoided, which I think is the main goal. Does it matter which map is used?
>>>>
>>>> In GraalVM, the conflict is on a read-only segment, so it does not matter which map is used.
>>>> However, this webrev is more general, so segments in the coredump should be used.
>>>>
>>>>> #ifndef PF_R
>>>>> #define PF_R 0x4
>>>>> #endif
>>>>>
>>>>>   if ((map = allocate_init_map(ph->core->classes_jsa_fd,
>>>>>                                offset, vaddr, memsz, PF_R)) == NULL) {
>>>>>
>>>>> I'm not so sure this is appropriate for OSX. It uses mach-o files, not elf files. The segment_command flags field comes from loader.h [1]. I don't see anything in there that looks like the equivalent of ELF access flags.
>>>>>
>>>>> /* Constants for the flags field of the segment_command */
>>>>> #define SG_HIGHVM  0x1  /* the file contents for this segment is for
>>>>>                            the high part of the VM space, the low part
>>>>>                            is zero filled (for stacks in core files) */
>>>>> #define SG_FVMLIB  0x2  /* this segment is the VM that is allocated by
>>>>>                            a fixed VM library, for overlap checking in
>>>>>                            the link editor */
>>>>> #define SG_NORELOC 0x4  /* this segment has nothing that was relocated
>>>>>                            in it and nothing relocated to it, that is
>>>>>                            it maybe safely replaced without relocation*/
>>>>> #define SG_PROTECTED_VERSION_1 0x8 /* This segment is protected.  If the
>>>>>                                       segment starts at file offset 0, the
>>>>>                                       first page of the segment is not
>>>>>                                       protected.  All other pages of the
>>>>>                                       segment are protected. */
>>>>>
>>>>> Since the flags don't matter for OSX, maybe you should just pass 0. You can do something like:
>>>>>
>>>>> #ifdef PF_R
>>>>> #define MAP_R_FLAG PF_R
>>>>> #else
>>>>> #define MAP_R_FLAG 0
>>>>> #endif
>>>>
>>>> Thanks!
>>>> I thought PF_R from elf.h could be used on macOS:
>>>> https://opensource.apple.com/source/dtrace/dtrace-90/sys/elf.h
>>>>
>>>> I merged your code in this webrev.
>>>>
>>>>> Some minor comment fixes are needed:
>>>>>
>>>>>   // Access flags fot this memory region is different between the library
>>>>>
>>>>> "fot" -> "for"
>>>>> "is" -> "are"
>>>>>
>>>>>   // We should respect to coredump.
>>>>>
>>>>> "to" -> "the"
>>>>>
>>>>>   // And head of ELF header might be included in coredump (See JDK-7133122).
>>>>>   // Thus we need to replace PT_LOAD segments the library version.
>>>>>
>>>>> How about:
>>>>>
>>>>>   // Also the first page of the ELF header might be included in the coredump (See JDK-7133122).
>>>>>   // Thus we need to replace the PT_LOAD segment with the library version.
>>>>
>>>> Fixed them.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> [1] https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
>>>>>
>>>>> On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> (Remove "trivial" from subject)
>>>>>>
>>>>>> Thanks for the information! I fixed errors in new webrev.
>>>>>> It passed tests on submit repo (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>>>>>>
>>>>>>
>>>>>> I tried to use elf.h instead of #define for PF_R, however it failed (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>>>>>>
>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>>>>>>
>>>>>> Thus I added #define for it in this webrev.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/08/02 10:22, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> [2020-08-01T14:15:42,514Z] Creating support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8 file(s)
>>>>>>> [2020-08-01T14:15:43,961Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: error: no member named 'flags' in 'struct map_info'
>>>>>>> [2020-08-01T14:15:43,961Z]   map->flags  = flags;
>>>>>>> [2020-08-01T14:15:43,961Z]   ~~~  ^
>>>>>>> [2020-08-01T14:15:43,963Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: error: use of undeclared identifier 'PF_R'
>>>>>>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) {
>>>>>>>
>>>>>>> I'll look at the code changes later. No time at the moment.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 8/1/20 5:20 PM, Yasumasa Suenaga wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Thanks for your comment!
>>>>>>>> I pushed new change to submit repo, but the build failed on macOS. Could you share details?
>>>>>>>> (I do not have Mac)
>>>>>>>>
>>>>>>>>   commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>>>>>>>   job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>>>>>>>
>>>>>>>> On 2020/08/01 13:06, Chris Plummer wrote:
>>>>>>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>
>>>>>>>>>>> If I understand correctly we first call add_map_info() for all the PT_LOAD segments in the core file. We then process all the library segments, calling add_map_info() for them if the target_vaddr has not already been added. If it has already been added, which I assume is the case for any library segment that is already in the core file, then the core file version is replaced with the library version. I'm a little unclear of the purpose of this replacing of the core PT_LOAD segments with those found in the libraries. If you could explain this that would help me understand your change.
>>>>>>>>>>
>>>>>>>>>> Read only segments in ELF should not be any different from PT_LOAD segments in the core.
>>>>>>>>>> And head of ELF header might be included in coredump (See JDK-7133122). Thus we need to replace PT_LOAD segments with the library version.
>>>>>>>>> Ok. The code in the area really should have been commented better when first written. The purpose is not understandable simply by reading the code.
>>>>>>>>
>>>>>>>> I added some comments to existing code. Please tell me if it is insufficient.
>>>>>>>>
>>>>>>>>
>>>>>>>>>>> I'm also unsure why existing_map->fd would ever be something other than the core file. Why would another library map the same target_vaddr?
>>>>>>>>>>
>>>>>>>>>> When mmap() is called to read-only ELF segments / sections, Linux kernel seems to allocate other memory segments which have the same top virtual memory address. I've not yet found out from the code of Linux kernel, but I confirmed this behavior on GDB.
>>>>>>>>> Ok. Same comment as above. This should have been explained with comments in the code.
>>>>>>>>
>>>>>>>> Added some comments.
>>>>>>>>
>>>>>>>>
>>>>>>>>> As for your fix, if I understand correctly the issue is that a single segment in the library is being split into two segments in the process (and therefore in the core file) due to an mprotect being done on part of the segment. Because of this the segment size in the library does not match the segment size in the core file. So with your fix the library segment is used, but what about the other half of the segment that is in the core file? Don't we now have overlapping segments: the full original segment from the library, and then a second segment that overlaps the tail end of the library segment? Will that cause any confusion later on?
>>>>>>>>
>>>>>>>> As long as vaddr is valid, it doesn't matter even if it overlaps, because SA sorts the maps by vaddr and looks them up by it.
>>>>>>>> In Substrate VM, there are RO and RW sections in that order, so it is ok with webrev.00. However it might not be appropriate because the RW section might be at the top of PT_LOAD.
>>>>>>>>
>>>>>>>> To make it more general, I changed it to the commit on the submit repo.
>>>>>>>> It checks the access flags in the coredump against those in the binary. If they differ, we respect the current map (loaded from the coredump) because the flags might have been changed at runtime.
>>>>>>>>
>>>>>>>> The change for LabsJDK 11 is simpler because JDK 11 does not have ps_core_common.c.
>>>>>>>> So I am sharing it with you. It may help:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial.
Although it is just a one line change, it takes an extensive knowledge to understand the impact. I'll read up on the filed graal issue and try to understand the ELF code a bit better. >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> Please review this trivial change: >>>>>>>>>>>>> >>>>>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250826 >>>>>>>>>>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/ >>>>>>>>>>>>> >>>>>>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks from coredump via jhsdb. >>>>>>>>>>>>> >>>>>>>>>>>>> I've reported this issue to GraalVM community [1], and I 've found out the cause of this issue is .svm_heap would be separated to RO and RW areas by mprotect() calls in run time in spite of .svm_heap is RO section in ELF (please see [1] for details). >>>>>>>>>>>>> >>>>>>>>>>>>> It is corner case, but we will see same problem on jhsdb when we attempt to analyze coredump which comes from some applications / libraries which would separate RO sections in ELF like Substrate VM. >>>>>>>>>>>>> >>>>>>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested me to discuss in serviceability-dev. 
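
The rule Yasumasa describes in his reply to Chris (keep the coredump's map when the access flags differ, otherwise switch to the library's segment) can be sketched as below. This is a minimal illustration and not the code in the webrev: `MapInfo`, `merge_library_segment`, and the `kRead`/`kWrite` constants are hypothetical names, and the struct is a simplified stand-in for the SA's `map_info`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the SA's map_info; the field names
// are illustrative only.
struct MapInfo {
  int fd;               // file that backs this mapping
  std::uint64_t vaddr;  // start address in the target process
  std::uint64_t memsz;  // size of the mapping
  std::uint32_t flags;  // access flags in ELF p_flags style
};

// ELF segment permission bits, with the values used by elf.h.
constexpr std::uint32_t kWrite = 0x2;  // PF_W
constexpr std::uint32_t kRead = 0x4;   // PF_R

// Decide what to do with a library segment whose vaddr is already covered by
// a map taken from the core file. Returns true if the map now points at the
// library, false if the coredump's map was kept.
static bool merge_library_segment(MapInfo& existing, const MapInfo& lib) {
  if (existing.flags != lib.flags) {
    // Permissions differ, for example after mprotect() at run time. The
    // coredump reflects the process state, so keep it.
    return false;
  }
  // Same permissions: use the library's copy of the segment.
  existing.fd = lib.fd;
  existing.memsz = lib.memsz;
  return true;
}

// Small self-check covering both branches.
static void check_merge_library_segment() {
  const MapInfo lib_ro{7, 0x1000, 0x4000, kRead};
  MapInfo core_rw{3, 0x1000, 0x2000, kRead | kWrite};
  assert(!merge_library_segment(core_rw, lib_ro));
  MapInfo core_ro{3, 0x1000, 0x1000, kRead};
  assert(merge_library_segment(core_ro, lib_ro));
}
```

In the Substrate VM case from the bug report, the part of `.svm_heap` that was made writable at run time would take the first branch and stay backed by the core file.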
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>

From hohensee at amazon.com Tue Aug 4 14:56:07 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 4 Aug 2020 14:56:07 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
Message-ID: <4C0BAC93-D23F-4958-A554-467F0CEDF450@amazon.com>

Hi, Stefan,

I suggested changing the missed_count type because it's a count, not a size. It didn't seem to me that 32 bits would overflow, but if that's a concern, then using uint64_t would make the size explicit.

Thanks,
Paul

On 8/4/20, 5:13 AM, "Stefan Karlsson" wrote:

Hi Lin,

Some small nits:

Could you go over the patch and move both the declarations and the definitions of the newly added heap functions, so that their location matches the one chosen in collectedHeap.hpp? And so that the locations are consistent between the hpp and cpp files?

https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/gc/serial/serialHeap.hpp.udiff.html

+ // Runs the given AbstractGangTask with the current active workers.

Since the SerialGC doesn't use "workers", this comment needs to be updated. Maybe use the comments from the serialHeap.cpp change?

https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/gc/shared/collectedHeap.hpp.udiff.html

 #include "memory/allocation.hpp"
 #include "memory/universe.hpp"
+#include "memory/heapInspection.hpp"

The new include breaks the sorting.
https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/gc/shared/gcVMOperations.hpp.patch You changed the indentation here: + VM_GC_HeapInspection(outputStream* out, bool request_full_gc, + uint parallel_thread_num = 1) : + VM_GC_Operation(0 /* total collections, dummy, ignored */, Could you reindent VM_GC_Operation and subsequent lines? https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/gc/z/zHeap.hpp.udiff.html + void run_task(AbstractGangTask* task); // Reference processing ReferenceDiscoverer* reference_discoverer(); void set_soft_reference_policy(bool clear); The grouping of this is awkward. The run_task function has nothing to do with reference processing and shouldn't be grouped with it. I propose that you add a newline between line 103 and 104. Except for these nits, the rest of the GC code looks good. Note that I'm only reviewing the changes to share/gc the rest of the changes. I think it would be prudent to get two other reviewers for the rest of the code changes. With that said, I saw the comment and change of from the 'size_t missed_count' to 'uint missed_count'. This changes the variable to a 32 bit variable on 64 bit builds. It seems like that could cause overflows. Since missed_count wasn't added by this change, maybe not change the type as part of this RFE? Thanks, StefanK On 2020-08-03 16:51, linzang(??) wrote: > Dear Stefan, > May I ask your help to review again? I have made a delta based on the last changeset you have reviewed(webrev04),hope it could ease your reviewing work. 
> webrev: https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > delta (vs webrev04): https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > CSR(approved): https://bugs.openjdk.java.net/browse/JDK-8239290 > > BRs, > Lin > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. > > Thanks, > Paul > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > Upload a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > It fix an issue of windows fail : > > #################################### > In heapInspect.cpp > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > #################################### > In heapInspect.hpp > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > #################################### > > > BRs, > Lin > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > I update a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > It includes a tiny fix of build failure on windows: > #################################### > In attachListener.cpp: > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > #################################### > > BRs, > Lin > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > Hi Paul, > Thanks for your help, that all 
looks good to me. > Just 2 minor changes: > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > ? delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > @@ -251,7 +251,6 @@ > _size_of_instances_in_words += cie->words(); > return true; > } > - > return false; > } > > @@ -568,7 +567,6 @@ > Atomic::add(&_missed_count, missed_count); > } else { > Atomic::store(&_success, false); > - return; > } > } > ######################################################################### > > > Here is the webrev http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > Lin > --------------------------------------------- > From: "Hohensee, Paul" > Date: Thursday, July 23, 2020 at 6:48 AM > To: "linzang(??)" , Stefan Karlsson , "serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev , "hotspot-gc-dev at openjdk.java.net" > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > + return; > + } > > with > > + Atomic::store(&_success, false); > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. > > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. 
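
The default worker count discussed for attachListener.cpp (three eighths of the active processors, but never fewer than one) comes down to the arithmetic below. This is a minimal sketch and not the HotSpot code: `default_parallel_threads` is a hypothetical helper, its argument stands in for the value of `os::initial_active_processor_count()`, and `std::max` is used in place of HotSpot's `MAX2`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not part of HotSpot. The int parameter stands in for
// the result of os::initial_active_processor_count().
static unsigned default_parallel_threads(int active_processors) {
  // Convert to unsigned first so both std::max arguments share one type,
  // then take 3/8 of the processors using integer division.
  unsigned three_eighths = static_cast<unsigned>(active_processors) * 3 / 8;
  return std::max(1u, three_eighths);
}

// Small self-check of the arithmetic.
static void check_default_parallel_threads() {
  assert(default_parallel_threads(1) == 1);   // 3/8 rounds down to 0, clamped to 1
  assert(default_parallel_threads(8) == 3);
  assert(default_parallel_threads(64) == 24);
}
```

Because the division truncates, any machine with one or two processors ends up with a single worker, which is the same as serial inspection.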
> > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > As shown at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > @@ -252,11 +252,11 @@ > static jint heap_inspection(AttachOperation* op, outputStream* out) { > bool live_objects_only = true; // default is true to retain the behavior before this change is made > outputStream* os = out; // if path not specified or path is NULL, use out > fileStream* fs = NULL; > const char* arg0 = op->arg(0); > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > if (arg0 != NULL && (strlen(arg0) > 0)) { > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > return JNI_ERR; > } > ################################################### > > Thanks. > > BRs, > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > Thanks for reviewing! > >> > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". 
I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > >> > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed in http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes like http://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". 
I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > +// Parallel heap inspection task. Parallel inspection can fail due to > +// a native OOM when allocating memory for TL-KlassInfoTable. > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > + } else { > + return false; > + } > with > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
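
The pattern under review here (a private table per worker, merged into a shared table under a lock, with an atomic success flag that triggers a serial retry) can be sketched in standalone C++. This is not the HotSpot code: `Table`, `count_serial`, `count_parallel`, and `count_with_fallback` are hypothetical names, `std::map` stands in for KlassInfoTable, and `std::thread` stands in for the GC worker gang.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <new>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for KlassInfoTable: class name -> instance count.
using Table = std::map<std::string, std::size_t>;

// Serial path, also used as the fallback.
static Table count_serial(const std::vector<std::string>& objects) {
  Table table;
  for (const std::string& name : objects) {
    ++table[name];
  }
  return table;
}

// Parallel path: each worker fills a private table, then merges it into the
// shared table while holding the lock only for the merge. A failed
// allocation clears the shared success flag instead of aborting.
static Table count_parallel(const std::vector<std::string>& objects,
                            unsigned workers, std::atomic<bool>& success) {
  Table shared;
  std::mutex merge_lock;
  std::vector<std::thread> gang;
  for (unsigned id = 0; id < workers; ++id) {
    gang.emplace_back([&objects, &shared, &merge_lock, &success, workers, id] {
      try {
        Table local;
        for (std::size_t i = id; i < objects.size(); i += workers) {
          ++local[objects[i]];
        }
        std::lock_guard<std::mutex> guard(merge_lock);
        for (const auto& entry : local) {
          shared[entry.first] += entry.second;
        }
      } catch (const std::bad_alloc&) {
        success.store(false);
      }
    });
  }
  for (std::thread& worker : gang) {
    worker.join();
  }
  return shared;
}

// Try the parallel path first; redo the work serially if any worker failed
// or if no workers were requested.
static Table count_with_fallback(const std::vector<std::string>& objects,
                                 unsigned workers) {
  if (workers > 0) {
    std::atomic<bool> success(true);
    Table table = count_parallel(objects, workers, success);
    if (success.load()) {
      return table;
    }
  }
  return count_serial(objects);
}

// Small self-check: both paths must agree.
static void check_counts() {
  const std::vector<std::string> objects = {"A", "B", "A", "C", "A", "B"};
  assert(count_with_fallback(objects, 3) == count_serial(objects));
}
```

Keeping the lock scope down to the merge step means workers only contend once each, at the end of their slice.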
>
> I'd reorganize the code in populate_table() to make it clearer, viz. (I changed _shared_missed_count to _missed_count)
> +  if (cit.allocation_failed()) {
> +    // fail to allocate memory, stop parallel mode
> +    Atomic::store(&_success, false);
> +    return;
> +  }
> +  RecordInstanceClosure ric(&cit, _filter);
> +  _poi->object_iterate(&ric, worker_id);
> +  missed_count = ric.missed_count();
> +  {
> +    MutexLocker x(&_mutex);
> +    merge_success = _shared_cit->merge(&cit);
> +  }
> +  if (merge_success) {
> +    Atomic::add(&_missed_count, missed_count);
> +  } else {
> +    Atomic::store(&_success, false);
> +  }
>
> Thanks,
> Paul
>
> On 6/29/20, 7:20 PM, "linzang(??)" wrote:
>
> Dear All,
> Sorry to bother you again. I just want to make sure whether this change is worth continuing to work on. If the decision is not to, I think I can drop this work and stop asking for review help...
> Thanks for all your help with reviewing this previously.
>
> BRs,
> Lin
>
> On 2020/5/9, 3:47 PM, "linzang(??)" wrote:
>
> Dear All,
> May I ask for your help again to review the latest change? Thanks!
>
> BRs,
> Lin
>
> On 2020/4/28, 1:54 PM, "linzang(??)" wrote:
>
> Hi Stefan,
> >> - Adding Atomic::load/store.
> >> - Removing the time measurement in the run_task. I renamed G1's function
> >> to run_task_timed. If we need this outside of G1, we can rethink the API
> >> at that point.
> >> - ZGC style cleanups
> Thanks for revising the patch. The changes all look good to me, and I have made a tiny change based on them:
> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/
> It reduces the scope of the mutex in ParHeapInspectTask and deletes unnecessary comments.
>
> BRs,
> Lin
>
> On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:
>
> Hi Lin,
>
> On 2020-04-26 05:10, linzang(??) wrote:
> > Hi Stefan and Paul,
> > I have made a new patch based on your comments and Stefan's Poc code: > > Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > Delta(based on Stefan's change:) : http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > others to review. > > > > > And Here are main changed I made and want to discuss with you: > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > primitives. We're moving toward a later C++ standard were data races are > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > only accept AbstraceGangTask* as argument, so I made a delegate class > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > - Removing the time measurement in the run_task. I renamed G1's function > to run_task_timed. If we need this outside of G1, we can rethink the API > at that point. > - ZGC style cleanups > > Thanks, > StefanK > > > > > BRs, > > Lin > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > BRs, > > Lin > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > Thanks, > > Paul > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" wrote: > > > > Dear Stefan, > > > > Thanks a lot! I agree with you to decouple the heap inspection code with GC's. 
> > I will start from your POC code, may discuss with you later. > > > > > > BRs, > > Lin > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > Hi Lin, > > > > I took a look at this earlier and saw that the heap inspection code is > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > if we'd abstract this away, so that the GCs only provide a "parallel > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > I started experimenting with doing that, but other higher-priority (to > > me) tasks have had to take precedence. > > > > I've uploaded my work-in-progress / proof-of-concept: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > The current code doesn't handle the lifecycle (deletion) of the > > ParallelObjectIterators. There's also code left unimplemented in around > > CollectedHeap::run_task. However, I think this could work as a basis to > > pull out the heap inspection code out of the GCs. > > > > Thanks, > > StefanK > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > Dear all, > > > May I ask you help to review? This RFR has been there for quite a while. > > > Thanks! > > > > > > BRs, > > > Lin > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). 
> > >> webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > >> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > > >> CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > > >> BRs, > > >> Lin > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > >> > > > >> > Dear all, > > >> > Let me try to ease the reviewing work by some explanation :P > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > >> > This patch actually do several things: > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. 
Set this option to 1 will disable parallel heap inspection. (more details in CSR: https://bugs.openjdk.java.net/browse/JDK-8239290) > > >> > 2. Make a change in how Jmap passing arguments, changes in http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > > >> > 5. Add related test. > > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > > >> > > > >> > Hope these info could help on code review and initate the discussion :-) > > >> > Thanks! > > >> > > > >> > BRs, > > >> > Lin > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > > >> > > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > > >> > > please ignore the previous wrong post. sorry for troubles. 
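
The worker scheme described in items 3 and 4 of the list above (each worker fills its own table, the tables are merged into a shared one, and a failure is recorded in a shared flag) can be sketched roughly as follows. This is a standalone illustration in standard C++ and not HotSpot code: CountTable and ParInspectTask are hypothetical stand-ins for KlassInfoTable and ParHeapInspectTask, and std::mutex / std::atomic are assumed substitutes for HotSpot's MutexLocker and Atomic::load/store.

```cpp
#include <atomic>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for KlassInfoTable: counts instances per class name.
class CountTable {
 public:
  void record(const std::string& klass) { ++counts_[klass]; }

  // Fold another table into this one. Returns false to signal failure.
  // This sketch never fails; a real merge could fail on allocation.
  bool merge(const CountTable& other) {
    for (const auto& entry : other.counts_) {
      counts_[entry.first] += entry.second;
    }
    return true;
  }

  std::size_t count(const std::string& klass) const {
    auto it = counts_.find(klass);
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  std::map<std::string, std::size_t> counts_;
};

// Hypothetical stand-in for ParHeapInspectTask.
class ParInspectTask {
 public:
  explicit ParInspectTask(CountTable* shared)
      : shared_(shared), success_(true) {}

  // Body that each worker would run on its own slice of the heap.
  void work(const std::vector<std::string>& slice) {
    CountTable local;  // worker-local table, filled without any locking
    for (const auto& klass : slice) {
      local.record(klass);
    }
    bool merged;
    {
      // Hold the lock only while merging into the shared table.
      std::lock_guard<std::mutex> guard(mutex_);
      merged = shared_->merge(local);
    }
    if (!merged) {
      // Atomic store: the flag only ever changes from true to false.
      success_.store(false);
    }
  }

  bool success() const { return success_.load(); }

 private:
  CountTable* shared_;
  std::mutex mutex_;
  std::atomic<bool> success_;
};
```

In this sketch work() is what each worker thread would call; keeping the lock scope down to the merge step means the per-slice counting itself never contends on the mutex.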
> > >> > > > > >> > > webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > >> > > Hi bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > > >> > > CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > > >> > > -------------- > > >> > > Lin > > >> > > >Hi Lin, > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > >> > > >the message subject? > > >> > > >It will be more trackable this way. > > >> > > > > > >> > > >Thanks, > > >> > > >Serguei > > >> > > > > > >> > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > >> > > >> Dear David, > > >> > > >> Thanks a lot! > > >> > > >> I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > >> > > >> > > >> > > >> Thanks, > > >> > > >> -------------- > > >> > > >> Lin > > >> > > >>> Hi Lin, > > >> > > >>> > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > >> > > >>>
> > >> > > >>> I happened to spot one nit when browsing:
> > >> > > >>>
> > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp
> > >> > > >>>
> > >> > > >>> +   virtual bool run_par_heap_inspect_task(KlassInfoTable* cit,
> > >> > > >>> +                                          BoolObjectClosure* filter,
> > >> > > >>> +                                          size_t* missed_count,
> > >> > > >>> +                                          size_t thread_num) {
> > >> > > >>> +     return NULL;
> > >> > > >>>
> > >> > > >>> s/NULL/false/
> > >> > > >>>
> > >> > > >>> Cheers,
> > >> > > >>> David
> > >> > > >>>
> > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote:
> > >> > > >>>> Dear All,
> > >> > > >>>> May I ask for your help to review the following changes:
> > >> > > >>>> webrev:
> > >> > > >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
> > >> > > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> > >> > > >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> > >> > > >>>> This patch enables parallel heap inspection of G1 for jmap -histo.
> > >> > > >>>> My simple test showed that it can speed up jmap -histo by 2x with
> > >> > > >>>> parallelThreadNum set to 2 for a heap of ~500M on a 4-core platform.
> > >> > > >>>>
> > >> > > >>>> ------------------------------------------------------------------------
> > >> > > >>>> BRs,
> > >> > > >>>> Lin

From jiefu at tencent.com Tue Aug 4 15:10:13 2020
From: jiefu at tencent.com (jiefu(傅杰))
Date: Tue, 4 Aug 2020 15:10:13 +0000
Subject: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits
In-Reply-To: <8A8E8751-AF91-471A-B668-5D6461B72356@tencent.com>
References: <8A8E8751-AF91-471A-B668-5D6461B72356@tencent.com>
Message-ID: <3C8F8615-C994-4725-B18B-4F856DA68AF3@tencent.com>

Forwarding this to serviceability-dev since the issue in the JBS has been moved from hotspot/runtime to core-svc/java.lang.management.

Please review it.

Thanks.

Best regards,
Jie

From: "jiefu(傅杰)"
Date: Tuesday, August 4, 2020 at 5:10 PM
To: "hotspot-runtime-dev at openjdk.java.net"
Subject: RFR: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits

Hi all,

JBS: https://bugs.openjdk.java.net/browse/JDK-8251031
Webrev: http://cr.openjdk.java.net/~jiefu/8251031/webrev.00/

Some vmTestbase/nsk/monitoring/RuntimeMXBean tests failed in our test infrastructure.
The reason is that these tests reject hostnames starting with digits.
However, hostnames starting with digits are actually valid according to RFC1123 [1][2].

It would be better to fix it.

Thanks a lot.
Best regards,
Jie

[1] https://tools.ietf.org/html/rfc1123#page-13
[2] https://en.wikipedia.org/wiki/Hostname

From serguei.spitsyn at oracle.com Tue Aug 4 16:27:26 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 4 Aug 2020 09:27:26 -0700
Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
In-Reply-To: <94d17e2d-9b58-a50a-62cb-60d391101f47@oss.nttdata.com>
References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com> <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com> <6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com> <29af768a-bebe-13fe-ac51-3062ede936bf@oss.nttdata.com> <2ca95f3f-c47f-d2c9-ab4f-347f65e74d9f@oracle.com> <94d17e2d-9b58-a50a-62cb-60d391101f47@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

It looks good.
I forgot to say there is no need for a new webrev.

Thanks,
Serguei

On 8/4/20 05:22, Yasumasa Suenaga wrote:
> Hi Serguei,
>
> Thanks for your comment!
>
> On 2020/08/04 18:11, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> The fix looks good to me.
>> Thanks to Chris for discussing the details in previous emails.
>>
>> Just one suggestion:
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c.frames.html
>>
>>
>>   44 #ifdef PF_R
>>   45 #define MAP_R_FLAG PF_R
>>   46 #else
>>   47 #define MAP_R_FLAG 0
>>   48 #endif
>>
>> Could you please add a small comment before it?
>> Something like this would be enough, I think:
>>    // Define a segment permission flag allowing read.
>
> I added it to new webrev:
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.03/
>
>
> Thanks,
>
> Yasumasa
>
>
>> Thanks,
>> Serguei
>>
>>
>> On 8/3/20 17:54, Yasumasa Suenaga wrote:
>>> Thanks Chris!
>>> I will push it when I get a second reviewer.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/08/04 9:54, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Your changes look good now.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 8/3/20 5:47 PM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Thank you for the comment!
>>>>> I updated webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/
>>>>>   Diff from webrev.01:
>>>>> http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2
>>>>>
>>>>> On 2020/08/04 6:41, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> Your updated fix resulted in using the core file map whereas the
>>>>>> original fix used the library map. In both cases the assert is
>>>>>> avoided, which I think is the main goal. Does it matter which map
>>>>>> is used?
>>>>>
>>>>> In GraalVM, the conflict is in a read-only segment, so it does not
>>>>> matter which map is used.
>>>>> However, this webrev is more general, so the segments in the coredump
>>>>> should be used.
>>>>>
>>>>>>    42 #ifndef PF_R
>>>>>>    43 #define PF_R 0x4
>>>>>>    44 #endif
>>>>>>
>>>>>>   156   if ((map = allocate_init_map(ph->core->classes_jsa_fd,
>>>>>>   157                                offset, vaddr, memsz, PF_R)) == NULL) {
>>>>>>
>>>>>> I'm not so sure this is appropriate for OSX. It uses mach-o
>>>>>> files, not elf files. The segment_command flags field comes from
>>>>>> loader.h [1]. I don't see anything in there that looks like the
>>>>>> equivalent of ELF access flags.
>>>>>>
>>>>>> /* Constants for the flags field of the segment_command */
>>>>>> #define SG_HIGHVM  0x1  /* the file contents for this segment is for
>>>>>>                            the high part of the VM space, the low part
>>>>>>                            is zero filled (for stacks in core files) */
>>>>>> #define SG_FVMLIB  0x2  /* this segment is the VM that is allocated by
>>>>>>                            a fixed VM library, for overlap checking in
>>>>>>                            the link editor */
>>>>>> #define SG_NORELOC 0x4  /* this segment has nothing that was relocated
>>>>>>                            in it and nothing relocated to it, that is
>>>>>>                            it maybe safely replaced without relocation*/
>>>>>> #define SG_PROTECTED_VERSION_1 0x8 /* This segment is protected.  If the
>>>>>>                                       segment starts at file offset 0, the
>>>>>>                                       first page of the segment is not
>>>>>>                                       protected.  All other pages of the
>>>>>>                                       segment are protected. */
>>>>>>
>>>>>> Since the flags don't matter for OSX, maybe you should just pass
>>>>>> 0. You can do something like:
>>>>>>
>>>>>> #ifdef PF_R
>>>>>> #define MAP_R_FLAG PF_R
>>>>>> #else
>>>>>> #define MAP_R_FLAG 0
>>>>>> #endif
>>>>>
>>>>> Thanks!
>>>>> I thought PF_R from elf.h could be used on macOS:
>>>>> https://opensource.apple.com/source/dtrace/dtrace-90/sys/elf.h
>>>>>
>>>>> I merged your code in this webrev.
>>>>>
>>>>>> Some minor comment fixes are needed:
>>>>>>
>>>>>>   397         // Access flags fot this memory region is different
>>>>>> between the library
>>>>>>
>>>>>> "fot" -> "for"
>>>>>> "is" -> "are"
>>>>>>
>>>>>>   399         // We should respect to coredump.
>>>>>>
>>>>>> "to" -> "the"
>>>>>>
>>>>>>   404         // And head of ELF header might be included in
>>>>>> coredump (See JDK-7133122).
>>>>>>   405         // Thus we need to replace PT_LOAD segments the
>>>>>> library version.
>>>>>>
>>>>>> How about:
>>>>>>
>>>>>>   404         // Also the first page of the ELF header might be
>>>>>> included in the coredump (See JDK-7133122).
>>>>>>   405         // Thus we need to replace the PT_LOAD segment with
>>>>>> the library version.
>>>>>
>>>>> Fixed them.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> [1]
>>>>>> https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
>>>>>>
>>>>>> On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> (Remove "trivial" from subject)
>>>>>>>
>>>>>>> Thanks for the information! I fixed the errors in a new webrev. It
>>>>>>> passed tests on submit repo
>>>>>>> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>>>>>>>
>>>>>>>
>>>>>>> I tried to use elf.h instead of #define for PF_R, however it
>>>>>>> failed (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>>>>>>>
>>>>>>> Thus I added #define for it in this webrev.
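
The fallback #define mentioned just above can be illustrated with a minimal sketch. The guard and the value 0x4 are taken from the quoted webrev lines; the helper is_readable() is a hypothetical name added only to show how such a flag might be tested, and the snippet is written as C++ for illustration although the file under review is C.

```cpp
#include <cstdint>

// Fallback for platforms whose headers do not provide PF_R.
// 0x4 is the ELF program-header "readable" permission bit; if an
// earlier header already defined PF_R, this guard leaves it alone.
#ifndef PF_R
#define PF_R 0x4
#endif

// Hypothetical helper: true if a segment's flags mark it as readable.
inline bool is_readable(std::uint32_t flags) {
  return (flags & PF_R) != 0;
}
```

The later revision in this thread replaces the direct use of PF_R with a MAP_R_FLAG macro that falls back to 0 when PF_R is absent, since the flag has no meaning for Mach-O segments.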
>>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2020/08/02 10:22, Chris Plummer wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> [2020-08-01T14:15:42,514Z] Creating >>>>>>>> support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a >>>>>>>> from 8 file(s) >>>>>>>> [2020-08-01T14:15:43,961Z] >>>>>>>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: >>>>>>>> error: no member named 'flags' in 'struct map_info' >>>>>>>> [2020-08-01T14:15:43,961Z]?? map->flags? = flags; >>>>>>>> [2020-08-01T14:15:43,961Z]?? ~~~? ^ >>>>>>>> [2020-08-01T14:15:43,963Z] >>>>>>>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: >>>>>>>> error: use of undeclared identifier 'PF_R' >>>>>>>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) { >>>>>>>> >>>>>>>> I'll look at the code changes later. No time at the moment. >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source >>>>>>>> 2020-08-01-1405571.suenaga.source On 8/1/20 5:20 PM, Yasumasa >>>>>>>> Suenaga wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Thanks for your comment! >>>>>>>>> I pushed new change to submit repo, but the build failed on >>>>>>>>> macOS. Could you share details? >>>>>>>>> (I do not have Mac) >>>>>>>>> >>>>>>>>> ? commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297 >>>>>>>>> ? job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989 >>>>>>>>> >>>>>>>>> On 2020/08/01 13:06, Chris Plummer wrote: >>>>>>>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> On 2020/07/31 7:29, Chris Plummer wrote: >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> If I understand correctly we first call add_map_info() for >>>>>>>>>>>> all the PT_LOAD segments in the core file. 
We then process >>>>>>>>>>>> all the library segments, calling add_map_info() for them >>>>>>>>>>>> if the target_vaddr has not already been addded. If has >>>>>>>>>>>> already been added, which I assume is the case for any >>>>>>>>>>>> library segment that is already in the core file, then the >>>>>>>>>>>> core file version is replaced the the library version.? I'm >>>>>>>>>>>> a little unclear of the purpose of this replacing of the >>>>>>>>>>>> core PT_LOAD segments with those found in the libraries. If >>>>>>>>>>>> you could explain this that would help me understand your >>>>>>>>>>>> change. >>>>>>>>>>> >>>>>>>>>>> Read only segments in ELF should not be any different from >>>>>>>>>>> PT_LOAD segments in the core. >>>>>>>>>>> And head of ELF header might be included in coredump (See >>>>>>>>>>> JDK-7133122). Thus we need to replace PT_LOAD segments the >>>>>>>>>>> library version. >>>>>>>>>> Ok. The code in the area really should have been commented >>>>>>>>>> better when first written. The purpose is not understandable >>>>>>>>>> simply by reading the code. >>>>>>>>> >>>>>>>>> I added some comments to existing code. Please tell me if it >>>>>>>>> is insufficient. >>>>>>>>> >>>>>>>>> >>>>>>>>>>>> I'm also unsure why existing_map->fd would ever be >>>>>>>>>>>> something other than the core file. Why would another >>>>>>>>>>>> library map the same target_vaddr. >>>>>>>>>>> >>>>>>>>>>> When mmap() is called to read-only ELF segments / sections, >>>>>>>>>>> Linux kernel seems to allocate other memory segments which >>>>>>>>>>> has same top virtual memory address. I've not yet found out >>>>>>>>>>> from the code of Linux kernel, but I confirmed this behavior >>>>>>>>>>> on GDB. >>>>>>>>>> Ok. Same comment as above. This should have been explained >>>>>>>>>> with comments in the code. >>>>>>>>> >>>>>>>>> Added some comments. 
>>>>>>>>> >>>>>>>>> >>>>>>>>>> As for your fix, if I understand correctly the issue is that >>>>>>>>>> a single segment in the library is being split into two >>>>>>>>>> segments in the process (and therefore in the core file) due >>>>>>>>>> to an mprotect being done on part of the segment. Because of >>>>>>>>>> this the segment size in the library does match the segment >>>>>>>>>> size in the core file. So with your fix the library segment >>>>>>>>>> is used, but what about the other half of the segment that is >>>>>>>>>> in the core file? Don't we now have overlapping segments; the >>>>>>>>>> full original segment from the library, and then a second >>>>>>>>>> segment that overlaps the tail end of the library segment? >>>>>>>>>> Will that cause any confusion later on? >>>>>>>>> >>>>>>>>> As long as vaddr is valid, it doesn't matter even if it >>>>>>>>> overlaps because SA would sort the map with vaddr, and would >>>>>>>>> lookup with it. >>>>>>>>> In Substrate VM, there are RO and RW sections in that order, >>>>>>>>> so it is ok with webrev.00 . However it might not be >>>>>>>>> appropriate because RW section might be top of PT_LOAD. >>>>>>>>> >>>>>>>>> To make it more generalized, I changed it to the commit on >>>>>>>>> submit repo. >>>>>>>>> It would check access flags between in coredump and in binary. >>>>>>>>> If they are different, we respect current (loaded from >>>>>>>>> coredump) map because it might be changed at runtime. >>>>>>>>> >>>>>>>>> The change for LabsJDK 11 is more simple because JDK 11 does >>>>>>>>> not have ps_core_common.c . >>>>>>>>> So I share you it. 
It may help you: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be >>>>>>>>>>>>> pushed as trivial. Although it is just a one line change, >>>>>>>>>>>>> it takes an extensive knowledge to understand the impact. >>>>>>>>>>>>> I'll read up on the filed graal issue and try to >>>>>>>>>>>>> understand the ELF code a bit better. >>>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please review this trivial change: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250826 >>>>>>>>>>>>>> ? webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java >>>>>>>>>>>>>> stacks from coredump via jhsdb. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I've reported this issue to GraalVM community [1], and I >>>>>>>>>>>>>> 've found out the cause of this issue is .svm_heap would >>>>>>>>>>>>>> be separated to RO and RW areas by mprotect() calls in >>>>>>>>>>>>>> run time in spite of .svm_heap is RO section in ELF >>>>>>>>>>>>>> (please see [1] for details). 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> It is corner case, but we will see same problem on jhsdb >>>>>>>>>>>>>> when we attempt to analyze coredump which comes from some >>>>>>>>>>>>>> applications / libraries which would separate RO sections >>>>>>>>>>>>>> in ELF like Substrate VM. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this >>>>>>>>>>>>>> issue [2], then community members suggested me to discuss >>>>>>>>>>>>>> in serviceability-dev. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> [1] https://github.com/oracle/graal/issues/2579 >>>>>>>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> From serguei.spitsyn at oracle.com Tue Aug 4 16:38:43 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Aug 2020 09:38:43 -0700 Subject: RFR(S): 8247516: DSO.closestSymbolToPC() should use dbg.lookup() rather than rely on java ELF file support In-Reply-To: <18037da5-ef72-0de3-a7cb-a5dc4470f285@oss.nttdata.com> References: <2873b4bd-09c6-7f29-04fc-3910f360def8@oracle.com> <3108b833-83e1-c2f6-d1fa-7200650f5279@oracle.com> <18037da5-ef72-0de3-a7cb-a5dc4470f285@oss.nttdata.com> Message-ID: <15cdec99-8648-c3eb-df1f-7b25fa3cf64a@oracle.com> Hi Chris, LGTM++ Thanks, Serguei On 8/3/20 22:10, Yasumasa Suenaga wrote: > Hi Chris, > > Looks good. > > > Yasumasa > > On 2020/08/04 13:10, Chris Plummer wrote: >> Ping! >> >> On 7/27/20 10:04 PM, Chris Plummer wrote: >>> I should have mentioned that currently there is no testing of this >>> code. There will with the changes for [1] JDK-8247514, which will >>> add the lost clhsdb "whatis" functionality, which was lost when >>> JavaScript support went away. 
"whatis" used DSO.closestSymbolToPC(), >>> so as part of JDK-8247514 I'm adding this support to the >>> PointerFinder class so the "findpc" will also be able to do address >>> to native symbol lookups, and the ClhsdbFindPC will check that it is >>> working. >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8247514 >>> >>> thanks, >>> >>> Chris >>> >>> On 7/27/20 9:32 PM, Chris Plummer wrote: >>>> Hello, >>>> >>>> Please review the following: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8247516 >>>> http://cr.openjdk.java.net/~cjplummer/8247516/webrev.00/index.html >>>> >>>> I put all the details in the description of the CR, including some >>>> background on how symbol lookups are done, including what >>>> LoadObjects are and their class hierarchy, and also info on >>>> JVMDebugger subclasses. >>>> >>>> One thing not covered in the bug description is the additional >>>> gutting of DSO.java that comes with these changes. Many APIs were >>>> not used so I removed them, such as setBase(), lookupSymbol(), and >>>> isDSO(). Doing so allowed completely severing any need for java ELF >>>> file support. Note I plan on removing the java ELF file support >>>> itself with another CR after pushing these changes. 
>>>> >>>> thanks, >>>> >>>> Chris >>> >> From chris.plummer at oracle.com Tue Aug 4 18:04:04 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 4 Aug 2020 11:04:04 -0700 Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM In-Reply-To: <2ca95f3f-c47f-d2c9-ab4f-347f65e74d9f@oracle.com> References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com> <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com> <6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com> <29af768a-bebe-13fe-ac51-3062ede936bf@oss.nttdata.com> <2ca95f3f-c47f-d2c9-ab4f-347f65e74d9f@oracle.com> Message-ID: <2cbb7908-7af1-6c18-7b53-7c943bdfcc7e@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Tue Aug 4 18:20:28 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 4 Aug 2020 11:20:28 -0700 Subject: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits In-Reply-To: <3C8F8615-C994-4725-B18B-4F856DA68AF3@tencent.com> References: <8A8E8751-AF91-471A-B668-5D6461B72356@tencent.com> <3C8F8615-C994-4725-B18B-4F856DA68AF3@tencent.com> Message-ID: <7349629f-1a21-c41b-089e-8c0ebbdec7ba@oracle.com> Hi Jie, The fix appears to directly address the issue of allowing a hostname to start with a digit. I'm still not convinced that the check will properly validate the hostname in all cases, but maybe that's a fix for another day: https://stackoverflow.com/questions/1418423/the-hostname-regex thanks, Chris On 8/4/20 8:10 AM, jiefu(??) wrote: > > Forward it to serviceability-dev since this issue in the JBS has been > moved ?from hotspot/runtime to core-svc/java.lang.management. > > Please review it. > > Thanks. 
> > Best regards, > > Jie > > *From: *"jiefu(??)" > *Date: *Tuesday, August 4, 2020 at 5:10 PM > *To: *"hotspot-runtime-dev at openjdk.java.net" > > *Subject: *RFR: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean > tests fail with hostnames starting from digits > > Hi all, > > JBS: https://bugs.openjdk.java.net/browse/JDK-8251031 > > Webrev: http://cr.openjdk.java.net/~jiefu/8251031/webrev.00/ > > Some vmTestbase/nsk/monitoring/RuntimeMXBean tests failed in our test > infrastructure. > > The reason is that these tests reject hostnames starting with digits. > > However, hostnames starting from digits are actually valid according > to RFC1123 [1][2]. > > It would be better to fix it. > > Thanks a lot. > > Best regards, > > Jie > > [1] https://tools.ietf.org/html/rfc1123#page-13 > > [2] https://en.wikipedia.org/wiki/Hostname > From chris.plummer at oracle.com Tue Aug 4 18:46:08 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 4 Aug 2020 11:46:08 -0700 Subject: RFR(S): 8247516: DSO.closestSymbolToPC() should use dbg.lookup() rather than rely on java ELF file support In-Reply-To: <15cdec99-8648-c3eb-df1f-7b25fa3cf64a@oracle.com> References: <2873b4bd-09c6-7f29-04fc-3910f360def8@oracle.com> <3108b833-83e1-c2f6-d1fa-7200650f5279@oracle.com> <18037da5-ef72-0de3-a7cb-a5dc4470f285@oss.nttdata.com> <15cdec99-8648-c3eb-df1f-7b25fa3cf64a@oracle.com> Message-ID: <8f11755e-bacb-b4b2-7f8e-cb6518811052@oracle.com> Thanks Serguei and Yasumasa! On 8/4/20 9:38 AM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > LGTM++ > > Thanks, > Serguei > > > On 8/3/20 22:10, Yasumasa Suenaga wrote: >> Hi Chris, >> >> Looks good. >> >> >> Yasumasa >> >> On 2020/08/04 13:10, Chris Plummer wrote: >>> Ping! >>> >>> On 7/27/20 10:04 PM, Chris Plummer wrote: >>>> I should have mentioned that currently there is no testing of this >>>> code. 
There will be with the changes for [1] JDK-8247514, which will
>>>> add the lost clhsdb "whatis" functionality, which was lost when
>>>> JavaScript support went away. "whatis" used
>>>> DSO.closestSymbolToPC(), so as part of JDK-8247514 I'm adding this
>>>> support to the PointerFinder class so that "findpc" will also be
>>>> able to do address to native symbol lookups, and the ClhsdbFindPC
>>>> will check that it is working.
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8247514
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 7/27/20 9:32 PM, Chris Plummer wrote:
>>>>> Hello,
>>>>>
>>>>> Please review the following:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8247516
>>>>> http://cr.openjdk.java.net/~cjplummer/8247516/webrev.00/index.html
>>>>>
>>>>> I put all the details in the description of the CR, including some
>>>>> background on how symbol lookups are done, including what
>>>>> LoadObjects are and their class hierarchy, and also info on
>>>>> JVMDebugger subclasses.
>>>>>
>>>>> One thing not covered in the bug description is the additional
>>>>> gutting of DSO.java that comes with these changes. Many APIs were
>>>>> not used so I removed them, such as setBase(), lookupSymbol(), and
>>>>> isDSO(). Doing so allowed completely severing any need for java
>>>>> ELF file support. Note I plan on removing the java ELF file
>>>>> support itself with another CR after pushing these changes.
>>>>> >>>>> thanks, >>>>> >>>>> Chris >>>> >>> > From serguei.spitsyn at oracle.com Tue Aug 4 19:15:15 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Aug 2020 12:15:15 -0700 Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM In-Reply-To: <2cbb7908-7af1-6c18-7b53-7c943bdfcc7e@oracle.com> References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com> <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com> <6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com> <29af768a-bebe-13fe-ac51-3062ede936bf@oss.nttdata.com> <2ca95f3f-c47f-d2c9-ab4f-347f65e74d9f@oracle.com> <2cbb7908-7af1-6c18-7b53-7c943bdfcc7e@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Tue Aug 4 21:04:26 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 4 Aug 2020 14:04:26 -0700 Subject: RFR: 8244537: JDI tests fail due to "ERROR: Exception : nsk.share.jdi.JDITestRuntimeException: JDITestRuntimeException : ** event IS NOT a breakpoint **" In-Reply-To: <5CA977F6-2B49-4E47-B5BD-E4EA68F361FA@oracle.com> References: <16EB9126-382A-4092-BB61-14E9C3CF208C@oracle.com> <68FBDE47-11DA-45C3-AD07-8200DCC9BA29@oracle.com> <5d78a07f-6374-7edb-c57c-8a34faf4d840@oracle.com> <4CF129DD-1733-47A0-91F9-B27891BCC63E@oracle.com> <11e29ad2-8e06-ad30-df48-6dcfea0bd4dd@oracle.com> <764e3f2d-75a8-143d-d29f-01ea44d0b6e0@oracle.com> <5CA977F6-2B49-4E47-B5BD-E4EA68F361FA@oracle.com> Message-ID: <9590d62a-47ed-65d2-50a4-22f71cf4ceb5@oracle.com> LGTM --alex On 07/28/2020 23:26, Leonid Mesnik wrote: > ok, let change it to following > > diff -r f489d5d13a51 > test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/setEnabled/setenabled002.java > --- > 
a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/setEnabled/setenabled002.java
>  Thu Jul 23 16:36:44 2020 -0400
> +++
> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/setEnabled/setenabled002.java
>  Tue Jul 28 23:17:22 2020 -0700
> @@ -368,6 +368,7 @@
>                        throw new JDITestRuntimeException("** default
> case 2 **");
>              }
>
> +            vm.suspend();
>              if (eventRequest1 instanceof StepRequest) {
>                  try {
>                      log2("......eventRequest1.setEnabled(true);
>  IllegalThreadStateException is expected");
> @@ -405,6 +406,7 @@
>              } catch ( InvalidRequestStateException e ) {
>                      log2("       InvalidRequestStateException");
>              }
> +            vm.resume();
>
>              //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>          }
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8244537/webrev.04/
> Leonid
>
>> On Jul 28, 2020, at 11:06 PM, serguei.spitsyn at oracle.com
>> wrote:
>>
>> http://cr.openjdk.java.net/~lmesnik/8244537/webrev.03/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/setEnabled/setenabled002.java.frames.html
>>
>> I'd suggest to simplify it:
>>
>>  - insert suspend before the line:
>> 371 if (eventRequest1 instanceof StepRequest) {
>>  - keep resume at the line:
>> 411 vm.resume();
>>  - these lines are not needed:
>> 389 vm.resume();
>> 394 vm.suspend();
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/28/20 21:15, Leonid Mesnik wrote:
>>> Included in webrev.03
>>> http://cr.openjdk.java.net/~lmesnik/8244537/webrev.03
>>>
>>> Leonid
>>>
>>>> On Jul 28, 2020, at 6:44 PM, serguei.spitsyn at oracle.com
>>>> wrote:
>>>>
>>>> http://cr.openjdk.java.net/~lmesnik/8244537/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/setEnabled/setenabled002.java.frames.html
>>>> 371 if (eventRequest1 instanceof StepRequest) {
>>>> 372     try {
>>>> 373         log2("......eventRequest1.setEnabled(true); IllegalThreadStateException is expected");
>>>> 374         eventRequest1.setEnabled(true);
>>>> 375         testExitCode = FAILED;
>>>> 376         log3("ERROR: NO IllegalThreadStateException for StepRequest");
>>>> 377     } catch ( IllegalThreadStateException e ) {
>>>> 378         log2(" IllegalThreadStateException");
>>>> 379     }
>>>> 380     try {
>>>> 381         log2("......eventRequest1.setEnabled(false); IllegalThreadStateException is not expected");
>>>> 382         eventRequest1.setEnabled(false);
>>>> 383         log2(" no IllegalThreadStateException for StepRequest");
>>>> 384     } catch ( IllegalThreadStateException e ) {
>>>> 385         testExitCode = FAILED;
>>>> 386         log3("ERROR: IllegalThreadStateException");
>>>> 387     }
>>>> 388 }
>>>> Above is one more case where suspend/resume is needed, I guess.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 7/28/20 17:45, serguei.spitsyn at oracle.com wrote:
>>>>> http://cr.openjdk.java.net/~lmesnik/8244537/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/setEnabled/setenabled003.java.frames.html
>>>>> 288 case 0:
>>>>> 289     thread1 = (ThreadReference)
>>>>> 290         debuggeeClass.getValue(debuggeeClass.fieldByName(threadName1));
>>>>> 291
>>>>> 292     log2("......setting up StepRequest");
>>>>> 293     eventRequest1 = eventRManager.createStepRequest
>>>>> 294         (thread1, StepRequest.STEP_MIN, StepRequest.STEP_INTO);
>>>>> 295
>>>>> *296     vm.suspend();* ...
>>>>> 360 default:
>>>>> 361     throw new JDITestRuntimeException("** default case 2 **");
>>>>> 362 }
>>>>> 363 vm.resume();
>>>>>
>>>>> Sorry, the fix is not going to work correctly.
>>>>> The first vm.suspend() has to be before the switch statement to
>>>>> work for all 3 cases.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 7/28/20 16:49, Leonid Mesnik wrote:
>>>>>> I've updated to suspend/resume in all cases.
>>>>>> >>>>>> new webrev: http://cr.openjdk.java.net/~lmesnik/8244537/webrev.01/ >>>>>> >>>>>> Leonid >>>>>> >>>>>>> On Jul 28, 2020, at 2:06 PM, serguei.spitsyn at oracle.com >>>>>>> wrote: >>>>>>> >>>>>>> I prefer to suspend/resume in all cases, so we avoid all these >>>>>>> unexpected failures. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 7/28/20 13:59, Leonid Mesnik wrote: >>>>>>>> It should be failure anyway if we managed to enable events, so >>>>>>>> we don't expect to really enable anything in these cases. >>>>>>>> However I agree that adding suspend/resume shouldn't make it >>>>>>>> worse, just possible cleaner log (in very rare cases also). If >>>>>>>> you feel it is need I will just add suspension for all cases. >>>>>>>> >>>>>>>> Leonid >>>>>>>> >>>>>>>>> On Jul 28, 2020, at 1:54 PM, serguei.spitsyn at oracle.com >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Does it mean, you did not fix cases 0 and 2 because the related >>>>>>>>> failures have never been observed? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7/28/20 13:51, Leonid Mesnik wrote: >>>>>>>>>> Test should fail in cases 0 and 2 with >>>>>>>>>> IllegalThreadStateException if we can enable events. Such >>>>>>>>>> failures should be easily identified by reading logs. >>>>>>>>>> >>>>>>>>>> Leonid >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Jul 27, 2020, at 10:28 PM, serguei.spitsyn at oracle.com >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Leonid, >>>>>>>>>>> >>>>>>>>>>> The fix looks good in general. >>>>>>>>>>> You missed to explain that the suspend/resume are added to >>>>>>>>>>> avoid actual generation of event that cause this issue. >>>>>>>>>>> The reason is that these events are not actually required. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~lmesnik/8244537/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/setEnabled/setenabled003.java.frames.html >>>>>>>>>>> 316 case 1: >>>>>>>>>>> 317 vm.suspend(); ... 
336 vm.resume(); >>>>>>>>>>> >>>>>>>>>>> Q: Why is only in case 1 suspend/resume used? >>>>>>>>>>> ?? What about cases 0 and 2? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 7/27/20 18:08, Leonid Mesnik wrote: >>>>>>>>>>>> Hi >>>>>>>>>>>> >>>>>>>>>>>> Could you please review following fix which suspends >>>>>>>>>>>> debugger VM while enabling/disabling events. >>>>>>>>>>>> >>>>>>>>>>>> All changed tests fail intermittently getting unexpected >>>>>>>>>>>> events instead of breakpoint used for communication between >>>>>>>>>>>> debugger/debuggee VM. The tests request different events and >>>>>>>>>>>> verify request's properties but don't process/verify events >>>>>>>>>>>> themselves. Test doesn't aware if events are generated or >>>>>>>>>>>> not. The vm suspension doesn't affect JDWP native agent and >>>>>>>>>>>> it still should get and verify JDWP commands. >>>>>>>>>>>> >>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8244537/webrev.00/ >>>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8244537 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Leonid >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From alexey.menkov at oracle.com Tue Aug 4 22:05:36 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 4 Aug 2020 15:05:36 -0700 Subject: RFR(XS): 8248879: SA core file support on OSX has some bugs trying to locate the jvm libraries In-Reply-To: <7fc9666b-ce1e-f83a-e7a7-e052ca451a09@oracle.com> References: <6089f692-152b-8615-dfca-779f2b5028a0@oracle.com> <219dd806-ca93-64ce-f5f9-d75dd7acb7f1@oracle.com> <37eee0d3-6b00-eb11-6198-d8d87e3a7ad4@oracle.com> <99909975-074c-34a1-0f9b-69c4fb8f0eff@oracle.com> <8c7edf32-3037-b8d7-a18e-587f7ae8d294@oracle.com> <9bd21f67-982a-ef98-7c9a-3515c38b688c@oracle.com> <0906c510-8c2c-2c56-803f-0a0bf4340df6@oracle.com> <6f8cebde-428e-9e70-ed15-b05b6f6446e8@oracle.com> <7fc9666b-ce1e-f83a-e7a7-e052ca451a09@oracle.com> Message-ID: 
<8c82117c-047e-9fd5-190c-ffbd9d73711d@oracle.com> Hi Chris, 396 posbin = strstr(execname, "/bin/java"); I suppose this should be rstrstr. --alex On 07/28/2020 20:40, Chris Plummer wrote: > Hi Serguei and Alex, > > Sorry about the delay getting back to this. I got sidetracked with other > bugs and also realized the code needed more work than just Alex's > suggestion for rstrstr(). > > As a bit of background first, get_real_path() is used to locate any > library that is referenced from the core file using a relative path. So > the core file will, for example, refer to @rpath/libjvm.dylib, and > get_real_path() will convert that to a usable path to the file. Usually > only JDK libraries and user libraries are specified with @rpath. System > libraries all use full path names. > > get_real_path() had a couple of shortcomings. The way it worked is if > the specified execname ended in bin/java or if $JAVA_HOME was set, then > it only checked for libraries in subdirs of the first one of those 2 > that it found to be valid. It would not look in both directories if both > were valid, only in the first to be found valid. Only if neither of > those were valid did it look in DYLD_LIBRARY_PATH. So, for example, as > long as execname ended in bin/java, that's the only jdk directory that > was checked for libraries. If it didn't end in bin/java, and $JAVA_HOME > was set, then only it was checked. Then I added a 3rd option looking for > the existence of any "bin/" in execname. Only if none of these 3 paths > existed did the code defer to DYLD_LIBRARY_PATH. That made is hard to > locate non JDK libraries, such as user JNI libraries, or to override the > execname search for the JDK by setting $JAVA_HOME. > > I've fixed this by having it check all 3 of the potential JDK locations > not only to see if the paths are valid, but also if the library is in > any of the paths, and then check all the paths DYLD_LIBRARY_PATH if it > failed to find the library in the JDK paths. 
So now all the potential > locations are checked to see if they contain the library. By doing this > I was able to make it find the JDK libraries by properly specifying the > execname or JAVA_HOME, and still find a user JNI library in > DYLD_LIBRARY_PATH. > > Since the code was kind of a mess and not well suited to just fix with > some minor adjustments, I for the most part rewrote it. Although it > still does a lot of the same things, it's much cleaner and easier to > read now, and there's less replication of similar code. I also replaced > strcat and strcpy calls with strncat and strncpy to prevent overflows. I > would suggest for this review to just start by looking at > get_real_path() and follow the code, and not compare the diffs, which > aren't very readable. > > http://cr.openjdk.java.net/~cjplummer/8248879/webrev.02/index.html > > thanks, > > Chris > > > On 7/14/20 8:54 PM, serguei.spitsyn at oracle.com wrote: >> Hi Alex, >> >> Yes, I understand this. >> After some thinking, I doubt my suggestion to check all occurrences or >> "/bin/" is good. :) >> >> Thanks, >> Serguei >> >> On 7/14/20 18:19, Alex Menkov wrote: >>> Hi Serguei, >>> >>> On 07/14/2020 15:55, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris and Alex, >>>> >>>> I agree the last occurrence of "/bin/" is better than the first. >>>> But I wonder if it makes sense to check all occurrences. >>> >>> The problem is strrstr (search for last occurrence) is not a part of >>> std C lib. >>> So to avoid dependency on new library I suggested this simple >>> implementation using standard strstr. >>> >>> --alex >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/14/20 15:14, Alex Menkov wrote: >>>>> Yes, you are right. >>>>> This is not a function from strings.h >>>>> >>>>> Ok, you can leave strstr (and keep in mind that the path can't >>>>> contain "/bin/" other than jdk's bin) or implement the >>>>> functionality. 
It should be something simple like
>>>>>
>>>>> static const char* rstrstr(const char *str, const char *sub) {
>>>>>   const char *result = NULL;
>>>>>   for (const char *p = strstr(str, sub); p != NULL; p = strstr(p + 1, sub)) {
>>>>>     result = p;
>>>>>   }
>>>>>   return result;
>>>>> }
>>>>>
>>>>> --alex
>>>>>
>>>>> On 07/14/2020 13:43, Chris Plummer wrote:
>>>>>> Actually it's not so easy. I don't see any other references to
>>>>>> strrstr in our source. When I reference strstr, it gives a warning
>>>>>> because it's not declared. The only man page I can find says to
>>>>>> include sstring2.h, but this file does not exist. It also says to
>>>>>> link with -lsstrings2.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/14/20 1:37 PM, Chris Plummer wrote:
>>>>>>> Ok. I'll change both references to use strrstr.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/14/20 1:11 PM, Alex Menkov wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> I think it would be better to use strrstr to correctly handle
>>>>>>>> paths like
>>>>>>>> /something/bin/jdk/bin/jhsdb
>>>>>>>>
>>>>>>>> And I'd updated
>>>>>>>> 358   char* posbin = strstr(execname, "/bin/java");
>>>>>>>> to use strrstr as well
>>>>>>>>
>>>>>>>> --alex
>>>>>>>>
>>>>>>>> On 07/14/2020 12:01, Chris Plummer wrote:
>>>>>>>>> Ping!
>>>>>>>>>
>>>>>>>>> On 7/6/20 9:31 PM, Chris Plummer wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Please help review the following:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.00/index.html
>>>>>>>>>>
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248879
>>>>>>>>>>
>>>>>>>>>> The description of the problem and the fix are both in the CR.
>>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>> >> > > From chris.plummer at oracle.com Tue Aug 4 23:27:18 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 4 Aug 2020 16:27:18 -0700 Subject: RFR(XS): 8248879: SA core file support on OSX has some bugs trying to locate the jvm libraries In-Reply-To: <8c82117c-047e-9fd5-190c-ffbd9d73711d@oracle.com> References: <6089f692-152b-8615-dfca-779f2b5028a0@oracle.com> <219dd806-ca93-64ce-f5f9-d75dd7acb7f1@oracle.com> <37eee0d3-6b00-eb11-6198-d8d87e3a7ad4@oracle.com> <99909975-074c-34a1-0f9b-69c4fb8f0eff@oracle.com> <8c7edf32-3037-b8d7-a18e-587f7ae8d294@oracle.com> <9bd21f67-982a-ef98-7c9a-3515c38b688c@oracle.com> <0906c510-8c2c-2c56-803f-0a0bf4340df6@oracle.com> <6f8cebde-428e-9e70-ed15-b05b6f6446e8@oracle.com> <7fc9666b-ce1e-f83a-e7a7-e052ca451a09@oracle.com> <8c82117c-047e-9fd5-190c-ffbd9d73711d@oracle.com> Message-ID: <3d6e5cea-6077-3e02-bf32-fe7ce23f53b8@oracle.com> Hi Alex, Since this is searching for a file and not a directory, I didn't think it was necessary, but I can see now that rstrstr may be better just in case /bin/java appears somewhere in the middle of the path such as ~/bin/java16/bin/java. thanks, Chris On 8/4/20 3:05 PM, Alex Menkov wrote: > Hi Chris, > > 396?? posbin = strstr(execname, "/bin/java"); > > I suppose this should be rstrstr. > > > --alex > > > On 07/28/2020 20:40, Chris Plummer wrote: >> Hi Serguei and Alex, >> >> Sorry about the delay getting back to this. I got sidetracked with >> other bugs and also realized the code needed more work than just >> Alex's suggestion for rstrstr(). >> >> As a bit of background first, get_real_path() is used to locate any >> library that is referenced from the core file using a relative path. >> So the core file will, for example, refer to @rpath/libjvm.dylib, and >> get_real_path() will convert that to a usable path to the file. 
>> Usually only JDK libraries and user libraries are specified with >> @rpath. System libraries all use full path names. >> >> get_real_path() had a couple of shortcomings. The way it worked is if >> the specified execname ended in bin/java or if $JAVA_HOME was set, >> then it only checked for libraries in subdirs of the first one of >> those 2 that it found to be valid. It would not look in both >> directories if both were valid, only in the first to be found valid. >> Only if neither of those were valid did it look in DYLD_LIBRARY_PATH. >> So, for example, as long as execname ended in bin/java, that's the >> only jdk directory that was checked for libraries. If it didn't end >> in bin/java, and $JAVA_HOME was set, then only it was checked. Then I >> added a 3rd option looking for the existence of any "bin/" in >> execname. Only if none of these 3 paths existed did the code defer to >> DYLD_LIBRARY_PATH. That made is hard to locate non JDK libraries, >> such as user JNI libraries, or to override the execname search for >> the JDK by setting $JAVA_HOME. >> >> I've fixed this by having it check all 3 of the potential JDK >> locations not only to see if the paths are valid, but also if the >> library is in any of the paths, and then check all the paths >> DYLD_LIBRARY_PATH if it failed to find the library in the JDK paths. >> So now all the potential locations are checked to see if they contain >> the library. By doing this I was able to make it find the JDK >> libraries by properly specifying the execname or JAVA_HOME, and still >> find a user JNI library in DYLD_LIBRARY_PATH. >> >> Since the code was kind of a mess and not well suited to just fix >> with some minor adjustments, I for the most part rewrote it. Although >> it still does a lot of the same things, it's much cleaner and easier >> to read now, and there's less replication of similar code. I also >> replaced strcat and strcpy calls with strncat and strncpy to prevent >> overflows. 
I would suggest for this review to just start by looking >> at get_real_path() and follow the code, and not compare the diffs, >> which aren't very readable. >> >> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.02/index.html >> >> thanks, >> >> Chris >> >> >> On 7/14/20 8:54 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Alex, >>> >>> Yes, I understand this. >>> After some thinking, I doubt my suggestion to check all occurrences >>> or "/bin/" is good. :) >>> >>> Thanks, >>> Serguei >>> >>> On 7/14/20 18:19, Alex Menkov wrote: >>>> Hi Serguei, >>>> >>>> On 07/14/2020 15:55, serguei.spitsyn at oracle.com wrote: >>>>> Hi Chris and Alex, >>>>> >>>>> I agree the last occurrence of "/bin/" is better than the first. >>>>> But I wonder if it makes sense to check all occurrences. >>>> >>>> The problem is strrstr (search for last occurrence) is not a part >>>> of std C lib. >>>> So to avoid dependency on new library I suggested this simple >>>> implementation using standard strstr. >>>> >>>> --alex >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 7/14/20 15:14, Alex Menkov wrote: >>>>>> Yes, you are right. >>>>>> This is not a function from strings.h >>>>>> >>>>>> Ok, you can leave strstr (and keep in mind that the path can't >>>>>> contain "/bin/" other than jdk's bin) or implement the >>>>>> functionality. It should be something simple like >>>>>> >>>>>> static const char* rstrstr(const char *str, const char *sub) { >>>>>> ? const char *result = NULL; >>>>>> ? for (const char *p = strstr(str, sub); p != NULL; p = strstr(p >>>>>> + 1, sub)) { >>>>>> ??? result = p; >>>>>> ? } >>>>>> ? return result; >>>>>> } >>>>>> >>>>>> --alex >>>>>> >>>>>> On 07/14/2020 13:43, Chris Plummer wrote: >>>>>>> Actually it's not so easy. I don't see any other references to >>>>>>> strrstr in our source. When I reference strstr, it gives a >>>>>>> warning because it's not declared. The only man page I can find >>>>>>> says to include sstring2.h, but this file does not exist. 
It
>>>>>>> also says to link with -lsstrings2.
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/14/20 1:37 PM, Chris Plummer wrote:
>>>>>>>> Ok. I'll change both references to use strrstr.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 7/14/20 1:11 PM, Alex Menkov wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> I think it would be better to use strrstr to correctly handle
>>>>>>>>> paths like
>>>>>>>>> /something/bin/jdk/bin/jhsdb
>>>>>>>>>
>>>>>>>>> And I'd updated
>>>>>>>>> 358   char* posbin = strstr(execname, "/bin/java");
>>>>>>>>> to use strrstr as well
>>>>>>>>>
>>>>>>>>> --alex
>>>>>>>>>
>>>>>>>>> On 07/14/2020 12:01, Chris Plummer wrote:
>>>>>>>>>> Ping!
>>>>>>>>>>
>>>>>>>>>> On 7/6/20 9:31 PM, Chris Plummer wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Please help review the following:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.00/index.html
>>>>>>>>>>>
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248879
>>>>>>>>>>>
>>>>>>>>>>> The description of the problem and the fix are both in the CR.
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>>
>>

From alexey.menkov at oracle.com Tue Aug 4 23:32:05 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 4 Aug 2020 16:32:05 -0700
Subject: Ping: RFR: JDK-8249550: jdb should use loopback address when not using remote agent
In-Reply-To: <5b200434-6181-1478-6423-70cd9150ce56@oracle.com>
References: <541a2f57-4b86-b453-7739-f1de35b52212@oracle.com> <5b200434-6181-1478-6423-70cd9150ce56@oracle.com>
Message-ID: <467c37ea-3146-aec2-2fc5-94153acd5acc@oracle.com>

Needs one more reviewer.

One more detail to simplify the review.
SocketTransportService extends TransportService and the spec for TransportService.startListening() is:

    /**
     * Listens on an address chosen by the transport service.
     *
     * <p> This convenience method works as if by invoking
     * {@link #startListening(String) startListening(null)}.
     *
     * @return  a listen key to be used in subsequent calls to be
     *          {@link #accept accept} or {@link #stopListening
     *          stopListening} methods.
     *
     * @throws  IOException
     *          If an I/O error occurs.
     */
    public abstract ListenKey startListening() throws IOException;

I.e. the fix updates SocketTransportService to comply with the spec.

--alex

On 07/23/2020 13:05, Chris Plummer wrote:
> Hi Alex,
>
> I'm no expert in this area, but the changes appear to do what you
> describe (use the loopback address), so thumbs up.
>
> thanks,
>
> Chris
>
> On 7/21/20 3:04 PM, Alex Menkov wrote:
>> Hi all,
>>
>> please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8249550
>>
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_loopback/webrev/
>>
>> some background:
>> https://bugs.openjdk.java.net/browse/JDK-8041435 made default
>> listening on loopback address.
>> Later https://bugs.openjdk.java.net/browse/JDK-8184770 added handling
>> of "*" address to listen on all addresses, but it didn't fix the
>> "default" startListening() method (used by jdb through
>> SunCommandLineLauncher).
>>
>> The method called startListening(String localaddress, int port) with
>> localaddress == null, but this method for null localaddress starts
>> listening on all addresses (i.e. handles null value as "*").
>> The fix changes it to startListening(String address) which handles
>> null address the same way as JDI socket connector does (i.e. listens
>> on loopback address only)
>>
>> --alex
>

From igor.ignatyev at oracle.com Tue Aug 4 23:59:32 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 4 Aug 2020 16:59:32 -0700
Subject: RFR(L/S) : 8249030 : clean up FileInstaller $test.src $cwd in vmTestbase_nsk_jdi tests
In-Reply-To: <637D6495-24E6-4874-9024-1B0082492085@oracle.com>
References: <637D6495-24E6-4874-9024-1B0082492085@oracle.com>
Message-ID:

ping?
-- Igor > On Jul 31, 2020, at 1:24 PM, Igor Ignatyev wrote: > > http://cr.openjdk.java.net/~iignatyev//8249030/webrev.00 >> 2258 lines changed: 0 ins; 1144 del; 1114 mod; > > Hi all, > > could you please review the clean-up of nsk_jdi tests? > from main issue(8204985) : >> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests. > > > the patch removes FileInstaller actions in the said tests, and as before, the biggest part of patch was done by `ag -l '@run driver jdk.test.lib.FileInstaller . .' $DIR | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}` with $DIR being test/hotspot/jtreg/vmTestbase/nsk/jdi/. > > the 10 tests which had '-configFile ./<...>', and hence were looking for config file in the current directory, were updated to search for a config file in 'test.src' directory: http://cr.openjdk.java.net/~iignatyev//8249030/webrev.00-configFile > > testing: :vmTestbase_nsk_jdi on {linux,windows,macos}-x64 > JBS: https://bugs.openjdk.java.net/browse/JDK-8249030 > webrev: http://cr.openjdk.java.net/~iignatyev/8249030/webrev.00/ > > Thanks, > -- Igor From suenaga at oss.nttdata.com Wed Aug 5 00:06:54 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Wed, 5 Aug 2020 09:06:54 +0900 Subject: RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM In-Reply-To: References: <06e53c65-3cc6-8c51-33d0-26e207f4d0a0@oss.nttdata.com> <3805349a-342f-a130-3b0b-7ee26f23a278@oss.nttdata.com> <6009b927-3949-6d03-51ab-bef939cfbd7f@oracle.com> <313155c0-58b2-cef8-68f8-1aa3ffd7e7df@oss.nttdata.com> <7788f89a-507d-0d4a-ab81-52a09a2d5161@oracle.com> <439d0860-ae4a-98d4-6402-40ebfa833159@oss.nttdata.com> 
<6740ad1d-5e35-458c-7c2c-caf6da6b1276@oracle.com> <29af768a-bebe-13fe-ac51-3062ede936bf@oss.nttdata.com> <2ca95f3f-c47f-d2c9-ab4f-347f65e74d9f@oracle.com> <2cbb7908-7af1-6c18-7b53-7c943bdfcc7e@oracle.com> Message-ID: <3c021f85-3c42-1c04-ec9b-77fd8bd3c56d@oss.nttdata.com> On 2020/08/05 4:15, serguei.spitsyn at oracle.com wrote: > On 8/4/20 11:04, Chris Plummer wrote: >> On 8/4/20 2:11 AM, serguei.spitsyn at oracle.com wrote: >>> Hi Yasumasa, >>> >>> The fix looks good to me. >>> Thanks to Chris for discussing the details in previous emails. >>> >>> Just one suggestion: >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c.frames.html >>> 44 #ifdef PF_R >>> 45 #define MAP_R_FLAG PF_R >>> 46 #else >>> 47 #define MAP_R_FLAG 0 >>> 48 #endif >>> Could you, please add a small comment before? >>> Something like this would be enough, I think: >>> ? // Define a segment permission flag allowing read. >>> >> That comment is a little bit misleading, since on OSX the flag really has no meaning. Maybe the following would be better: >> >> ? // Define a segment permission flag allowing read if there is a read flag. Otherwise use 0. > > Thanks, this is more precise. > In fact, I was more concerned to explain what the macro PF_R is about. Ok, I will push the change with this comment. Thanks, Yasumasa > Thanks, > Serguei > >> >> thanks, >> >> Chris >>> Thanks, >>> Serguei >>> >>> >>> On 8/3/20 17:54, Yasumasa Suenaga wrote: >>>> Thanks Chris! >>>> I will push it when I got second reviewer. >>>> >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/08/04 9:54, Chris Plummer wrote: >>>>> Hi Yasumasa, >>>>> >>>>> Your changes look good now. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 8/3/20 5:47 PM, Yasumasa Suenaga wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Thank you for the comment! >>>>>> I updated webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/ >>>>>> ? 
Diff from webrev.01: http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2 >>>>>> >>>>>> On 2020/08/04 6:41, Chris Plummer wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> Your updated fix resulted in using the core file map whereas the original fix used the library map. In both cases the assert is avoided, which I think is the main goal. Does it matter which map is used? >>>>>> >>>>>> In GraalVM, read only segment is conflicted, thus it does not matter which map is used. >>>>>> However this webrev is more generalize, so segments in coredump should be used. >>>>>> >>>>>>> ?? 42 #ifndef PF_R >>>>>>> ?? 43 #define PF_R 0x4 >>>>>>> ?? 44 #endif >>>>>>> >>>>>>> ??156?? if ((map = allocate_init_map(ph->core->classes_jsa_fd, >>>>>>> ??157??????????????????????????????? offset, vaddr, memsz, PF_R)) == NULL) { >>>>>>> >>>>>>> I'm not so sure this is appropriate for OSX. It uses mach-o files, not elf files. The segment_command flags field comes from loader.h [1]. I don't see anything in there that looks like the equivalent of ELF access flags. >>>>>>> >>>>>>> /* Constants for the flags field of the segment_command */ >>>>>>> #define??? SG_HIGHVM??? 0x1??? /* the file contents for this segment is for >>>>>>> ???? ??? ??? ??? ?? the high part of the VM space, the low part >>>>>>> ???? ??? ??? ??? ?? is zero filled (for stacks in core files) */ >>>>>>> #define??? SG_FVMLIB??? 0x2??? /* this segment is the VM that is allocated by >>>>>>> ???? ??? ??? ??? ?? a fixed VM library, for overlap checking in >>>>>>> ???? ??? ??? ??? ?? the link editor */ >>>>>>> #define??? SG_NORELOC??? 0x4??? /* this segment has nothing that was relocated >>>>>>> ???? ??? ??? ??? ?? in it and nothing relocated to it, that is >>>>>>> ???? ??? ??? ??? ?? it maybe safely replaced without relocation*/ >>>>>>> #define SG_PROTECTED_VERSION_1??? 0x8 /* This segment is protected. If the >>>>>>> ???? ??? ??? ??? ?????? segment starts at file offset 0, the >>>>>>> ???? ??? ??? ??? ?????? 
first page of the segment is not
>>>>>>>                        protected.  All other pages of the
>>>>>>>                        segment are protected. */
>>>>>>>
>>>>>>> Since the flags don't matter for OSX, maybe you should just pass 0. You can do something like:
>>>>>>>
>>>>>>> #ifdef PF_R
>>>>>>> #define MAP_R_FLAG PF_R
>>>>>>> #else
>>>>>>> #define MAP_R_FLAG 0
>>>>>>> #endif
>>>>>>
>>>>>> Thanks!
>>>>>> I thought PF_R from elf.h could be used on macOS:
>>>>>> https://opensource.apple.com/source/dtrace/dtrace-90/sys/elf.h
>>>>>>
>>>>>> I merged your code in this webrev.
>>>>>>
>>>>>>> Some minor comment fixes are needed:
>>>>>>>
>>>>>>>  397         // Access flags fot this memory region is different between the library
>>>>>>>
>>>>>>> "fot" -> "for"
>>>>>>> "is" -> "are"
>>>>>>>
>>>>>>>  399         // We should respect to coredump.
>>>>>>>
>>>>>>> "to" -> "the"
>>>>>>>
>>>>>>>  404         // And head of ELF header might be included in coredump (See JDK-7133122).
>>>>>>>  405         // Thus we need to replace PT_LOAD segments the library version.
>>>>>>>
>>>>>>> How about:
>>>>>>>
>>>>>>>  404         // Also the first page of the ELF header might be included in the coredump (See JDK-7133122).
>>>>>>>  405         // Thus we need to replace the PT_LOAD segment with the library version.
>>>>>>
>>>>>> Fixed them.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> [1] https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
>>>>>>>
>>>>>>> On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> (Remove "trivial" from subject)
>>>>>>>>
>>>>>>>> Thanks for the information! I fixed errors in new webrev.
It passed tests on submit repo (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525) >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/ >>>>>>>> >>>>>>>> >>>>>>>> I tried to use elf.h instead of #define for PF_R, however it failed (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335). >>>>>>>> >>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d >>>>>>>> >>>>>>>> Thus I added #define for it in this webrev. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2020/08/02 10:22, Chris Plummer wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> [2020-08-01T14:15:42,514Z] Creating support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8 file(s) >>>>>>>>> [2020-08-01T14:15:43,961Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: error: no member named 'flags' in 'struct map_info' >>>>>>>>> [2020-08-01T14:15:43,961Z]?? map->flags? = flags; >>>>>>>>> [2020-08-01T14:15:43,961Z]?? ~~~? ^ >>>>>>>>> [2020-08-01T14:15:43,963Z] ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: error: use of undeclared identifier 'PF_R' >>>>>>>>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) { >>>>>>>>> >>>>>>>>> I'll look at the code changes later. No time at the moment. >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source 2020-08-01-1405571.suenaga.source On 8/1/20 5:20 PM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> Thanks for your comment! >>>>>>>>>> I pushed new change to submit repo, but the build failed on macOS. Could you share details? >>>>>>>>>> (I do not have Mac) >>>>>>>>>> >>>>>>>>>> ? commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297 >>>>>>>>>> ? 
job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989 >>>>>>>>>> >>>>>>>>>> On 2020/08/01 13:06, Chris Plummer wrote: >>>>>>>>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi Chris, >>>>>>>>>>>> >>>>>>>>>>>> On 2020/07/31 7:29, Chris Plummer wrote: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> If I understand correctly we first call add_map_info() for all the PT_LOAD segments in the core file. We then process all the library segments, calling add_map_info() for them if the target_vaddr has not already been addded. If has already been added, which I assume is the case for any library segment that is already in the core file, then the core file version is replaced the the library version.? I'm a little unclear of the purpose of this replacing of the core PT_LOAD segments with those found in the libraries. If you could explain this that would help me understand your change. >>>>>>>>>>>> >>>>>>>>>>>> Read only segments in ELF should not be any different from PT_LOAD segments in the core. >>>>>>>>>>>> And head of ELF header might be included in coredump (See JDK-7133122). Thus we need to replace PT_LOAD segments the library version. >>>>>>>>>>> Ok. The code in the area really should have been commented better when first written. The purpose is not understandable simply by reading the code. >>>>>>>>>> >>>>>>>>>> I added some comments to existing code. Please tell me if it is insufficient. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> I'm also unsure why existing_map->fd would ever be something other than the core file. Why would another library map the same target_vaddr. >>>>>>>>>>>> >>>>>>>>>>>> When mmap() is called to read-only ELF segments / sections, Linux kernel seems to allocate other memory segments which has same top virtual memory address. I've not yet found out from the code of Linux kernel, but I confirmed this behavior on GDB. >>>>>>>>>>> Ok. Same comment as above. This should have been explained with comments in the code. 
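The two-pass map construction described above (core PT_LOAD segments first, then library segments that either fill a gap or replace the core entry at the same vaddr) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the libsaproc code: `MapInfo`, `MapTable`, `add_core_segment` and `add_library_segment` are hypothetical names, and the real `add_map_info()` operates on different structures.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for SA's map_info; field names are illustrative only.
struct MapInfo {
    std::string source;    // "core", or the library path backing this mapping
    std::uint64_t offset;  // file offset of the segment data
    std::uint64_t memsz;   // size of the mapping in bytes
};

// Mappings keyed by virtual address, so they stay ordered by vaddr.
using MapTable = std::map<std::uint64_t, MapInfo>;

// Pass 1: record a PT_LOAD segment found in the core file.
void add_core_segment(MapTable& table, std::uint64_t vaddr,
                      std::uint64_t offset, std::uint64_t memsz) {
    table[vaddr] = MapInfo{"core", offset, memsz};
}

// Pass 2: record a read-only library segment. If the core file already
// provided a mapping at this vaddr, point that entry at the library's copy;
// otherwise add a new entry.
void add_library_segment(MapTable& table, const std::string& lib,
                         std::uint64_t vaddr, std::uint64_t offset,
                         std::uint64_t memsz) {
    auto it = table.find(vaddr);
    if (it == table.end()) {
        table[vaddr] = MapInfo{lib, offset, memsz};
    } else {
        it->second.source = lib;
        it->second.offset = offset;
    }
}
```

In this sketch the replaced entry keeps the size recorded from the core file; the case discussed in this thread is exactly the one where that size and the library's segment size disagree.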
>>>>>>>>>>
>>>>>>>>>> Added some comments.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> As for your fix, if I understand correctly the issue is that a single segment in the library is being split into two segments in the process (and therefore in the core file) due to an mprotect being done on part of the segment. Because of this the segment size in the library does not match the segment size in the core file. So with your fix the library segment is used, but what about the other half of the segment that is in the core file? Don't we now have overlapping segments: the full original segment from the library, and then a second segment that overlaps the tail end of the library segment? Will that cause any confusion later on?
>>>>>>>>>>
>>>>>>>>>> As long as vaddr is valid, it doesn't matter even if the segments overlap, because SA sorts the maps by vaddr and looks them up by vaddr.
>>>>>>>>>> In Substrate VM the RO and RW sections come in that order, so webrev.00 is OK there. However it might not be appropriate in general, because an RW section might be at the top of PT_LOAD.
>>>>>>>>>>
>>>>>>>>>> To make it more general, I changed it to the commit on the submit repo.
>>>>>>>>>> It compares the access flags in the coredump with those in the binary. If they differ, we respect the current map (loaded from the coredump) because the flags might have been changed at runtime.
>>>>>>>>>>
>>>>>>>>>> The change for LabsJDK 11 is simpler because JDK 11 does not have ps_core_common.c.
>>>>>>>>>> So I will share it with you.
It may help you: >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote: >>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial. Although it is just a one line change, it takes an extensive knowledge to understand the impact. I'll read up on the filed graal issue and try to understand the ELF code a bit better. >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please review this trivial change: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8250826 >>>>>>>>>>>>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks from coredump via jhsdb. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've reported this issue to GraalVM community [1], and I 've found out the cause of this issue is .svm_heap would be separated to RO and RW areas by mprotect() calls in run time in spite of .svm_heap is RO section in ELF (please see [1] for details). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It is corner case, but we will see same problem on jhsdb when we attempt to analyze coredump which comes from some applications / libraries which would separate RO sections in ELF like Substrate VM. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested me to discuss in serviceability-dev. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] https://github.com/oracle/graal/issues/2579 >>>>>>>>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>> >> > From serguei.spitsyn at oracle.com Wed Aug 5 00:22:47 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Aug 2020 17:22:47 -0700 Subject: Ping: RFR: JDK-8249550: jdb should use loopback address when not using remote agent In-Reply-To: <467c37ea-3146-aec2-2fc5-94153acd5acc@oracle.com> References: <541a2f57-4b86-b453-7739-f1de35b52212@oracle.com> <5b200434-6181-1478-6423-70cd9150ce56@oracle.com> <467c37ea-3146-aec2-2fc5-94153acd5acc@oracle.com> Message-ID: Hi Alex, This looks good to me. But do we need a CSR for this? I understand that the intention is to comply with the TransportService spec but the behavior is being changed. How long did we have this behavior? Thanks, Serguei On 8/4/20 16:32, Alex Menkov wrote: > Needs one more reviewer. > > One more details to simplify review. > SocketTransportService extends TransportService and spec for > TransportService.startListening() is: > > ??? /** > ???? * Listens on an address chosen by the transport service. > ???? * > ???? *

This convenience method works as if by invoking
>      * {@link #startListening(String) startListening(null)}.
>      *
>      * @return  a listen key to be used in subsequent calls to be
>      *          {@link #accept accept} or {@link #stopListening
>      *          stopListening} methods.
>      *
>      * @throws  IOException
>      *          If an I/O error occurs.
>      */
>     public abstract ListenKey startListening() throws IOException;
>
> I.e. the fix updates SocketTransportService to comply with the spec.
>
> --alex
>
> On 07/23/2020 13:05, Chris Plummer wrote:
>> Hi Alex,
>>
>> I'm no expert in this area, but the changes appear to do what you
>> describe (use the loopback address), so thumbs up.
>>
>> thanks,
>>
>> Chris
>>
>> On 7/21/20 3:04 PM, Alex Menkov wrote:
>>> Hi all,
>>>
>>> please review the fix for
>>> https://bugs.openjdk.java.net/browse/JDK-8249550
>>>
>>> webrev:
>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_loopback/webrev/
>>>
>>> some background:
>>> https://bugs.openjdk.java.net/browse/JDK-8041435 made default
>>> listening on loopback address.
>>> Later https://bugs.openjdk.java.net/browse/JDK-8184770 added
>>> handling of "*" address to listen on all addresses, but it didn't
>>> fix the "default" startListening() method (used by jdb through
>>> SunCommandLineLauncher).
>>>
>>> The method called startListening(String localaddress, int port) with
>>> localaddress == null, but for a null localaddress this method starts
>>> listening on all addresses (i.e. handles the null value as "*").
>>> The fix changes it to startListening(String address) which handles
>>> null address the same way as JDI socket connector does (i.e. listens
>>> on loopback address only)
>>>
>>> --alex
>>
From david.holmes at oracle.com Wed Aug 5 01:03:43 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 5 Aug 2020 11:03:43 +1000
Subject: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits
In-Reply-To: <3C8F8615-C994-4725-B18B-4F856DA68AF3@tencent.com>
References: <8A8E8751-AF91-471A-B668-5D6461B72356@tencent.com> <3C8F8615-C994-4725-B18B-4F856DA68AF3@tencent.com>
Message-ID: <9ad2e87b-a099-75a0-15dc-73a344ac8806@oracle.com>

My Review still stands. :)

Thanks,
David

On 5/08/2020 1:10 am, jiefu(??) wrote:
> Forward it to serviceability-dev since this issue in the JBS has been
> moved from hotspot/runtime to core-svc/java.lang.management.
>
> Please review it.
>
> Thanks.
>
> Best regards,
>
> Jie
>
> *From: *"jiefu(??)"
> *Date: *Tuesday, August 4, 2020 at 5:10 PM
> *To: *"hotspot-runtime-dev at openjdk.java.net"
>
> *Subject: *RFR: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean
> tests fail with hostnames starting from digits
>
> Hi all,
>
> JBS:    https://bugs.openjdk.java.net/browse/JDK-8251031
>
> Webrev: http://cr.openjdk.java.net/~jiefu/8251031/webrev.00/
>
> Some vmTestbase/nsk/monitoring/RuntimeMXBean tests failed in our test
> infrastructure.
>
> The reason is that these tests reject hostnames starting with digits.
>
> However, hostnames starting with digits are actually valid according to
> RFC1123 [1][2].
>
> It would be better to fix it.
>
> Thanks a lot.
> > Best regards, > > Jie > > [1] https://tools.ietf.org/html/rfc1123#page-13 > > [2] https://en.wikipedia.org/wiki/Hostname > From jiefu at tencent.com Wed Aug 5 01:38:54 2020 From: jiefu at tencent.com (=?gb2312?B?amllZnUouLW93Ck=?=) Date: Wed, 5 Aug 2020 01:38:54 +0000 Subject: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits(Internet mail) In-Reply-To: <9ad2e87b-a099-75a0-15dc-73a344ac8806@oracle.com> References: <8A8E8751-AF91-471A-B668-5D6461B72356@tencent.com> <3C8F8615-C994-4725-B18B-4F856DA68AF3@tencent.com>, <9ad2e87b-a099-75a0-15dc-73a344ac8806@oracle.com> Message-ID: <421fb49a041d4e2b87a2158dec4c676b@tencent.com> Thanks Chris and David for your review. Will push it later. Best regards, Jie ________________________________ From: David Holmes Sent: Wednesday, August 5, 2020 9:03 AM To: jiefu(??); serviceability-dev at openjdk.java.net Subject: Re: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits(Internet mail) My Review still stands. :) Thanks, David On 5/08/2020 1:10 am, jiefu(??) wrote: > Forward it to serviceability-dev since this issue in the JBS has been > moved from hotspot/runtime to core-svc/java.lang.management. > > Please review it. > > Thanks. > > Best regards, > > Jie > > *From: *"jiefu(??)" > *Date: *Tuesday, August 4, 2020 at 5:10 PM > *To: *"hotspot-runtime-dev at openjdk.java.net" > > *Subject: *RFR: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean > tests fail with hostnames starting from digits > > Hi all, > > JBS: https://bugs.openjdk.java.net/browse/JDK-8251031 > > Webrev: http://cr.openjdk.java.net/~jiefu/8251031/webrev.00/ > > Some vmTestbase/nsk/monitoring/RuntimeMXBean tests failed in our test > infrastructure. > > The reason is that these tests reject hostnames starting with digits. > > However, hostnames starting from digits are actually valid according to > RFC1123 [1][2]. > > It would be better to fix it. 
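The RFC 1123 point made above (a hostname label may start with a digit) can be illustrated with a small check. This is a minimal sketch, not the pattern used by the nsk RuntimeMXBean tests: the function name `is_valid_hostname` is hypothetical, and the 63-character label limit is assumed from the usual DNS rules rather than taken from this thread.

```cpp
#include <regex>
#include <string>

// Returns true if `name` consists of dot-separated labels made of letters,
// digits and hyphens, where every label starts and ends with a letter or
// digit (so a leading digit is accepted) and is at most 63 characters long.
bool is_valid_hostname(const std::string& name) {
    static const std::regex pattern(
        R"([A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?(?:\.[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*)");
    // regex_match requires the whole string to match the pattern.
    return std::regex_match(name, pattern);
}
```

A check that only accepts a letter in the first position would reject a name such as `3com.example.com`, which is the kind of failure reported for the tests in this thread.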
> > Thanks a lot. > > Best regards, > > Jie > > [1] https://tools.ietf.org/html/rfc1123#page-13 > > [2] https://en.wikipedia.org/wiki/Hostname > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Aug 5 02:10:45 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Aug 2020 19:10:45 -0700 Subject: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits(Internet mail) In-Reply-To: <421fb49a041d4e2b87a2158dec4c676b@tencent.com> References: <8A8E8751-AF91-471A-B668-5D6461B72356@tencent.com> <3C8F8615-C994-4725-B18B-4F856DA68AF3@tencent.com> <9ad2e87b-a099-75a0-15dc-73a344ac8806@oracle.com> <421fb49a041d4e2b87a2158dec4c676b@tencent.com> Message-ID: An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Aug 5 02:22:20 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Aug 2020 19:22:20 -0700 Subject: RFR(L/S) : 8249030 : clean up FileInstaller $test.src $cwd in vmTestbase_nsk_jdi tests In-Reply-To: References: <637D6495-24E6-4874-9024-1B0082492085@oracle.com> Message-ID: Hi Igor, It looks okay to me. At least, I've not noticed any issues. Thanks, Serguei On 8/4/20 16:59, Igor Ignatyev wrote: > ping? > -- Igor > >> On Jul 31, 2020, at 1:24 PM, Igor Ignatyev wrote: >> >> http://cr.openjdk.java.net/~iignatyev//8249030/webrev.00 >>> 2258 lines changed: 0 ins; 1144 del; 1114 mod; >> Hi all, >> >> could you please review the clean-up of nsk_jdi tests? >> from main issue(8204985) : >>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests. 
>> >> the patch removes FileInstaller actions in the said tests, and as before, the biggest part of patch was done by `ag -l '@run driver jdk.test.lib.FileInstaller . .' $DIR | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}` with $DIR being test/hotspot/jtreg/vmTestbase/nsk/jdi/. >> >> the 10 tests which had '-configFile ./<...>', and hence were looking for config file in the current directory, were updated to search for a config file in 'test.src' directory: http://cr.openjdk.java.net/~iignatyev//8249030/webrev.00-configFile >> >> testing: :vmTestbase_nsk_jdi on {linux,windows,macos}-x64 >> JBS: https://bugs.openjdk.java.net/browse/JDK-8249030 >> webrev: http://cr.openjdk.java.net/~iignatyev/8249030/webrev.00/ >> >> Thanks, >> -- Igor From jiefu at tencent.com Wed Aug 5 02:23:43 2020 From: jiefu at tencent.com (=?gb2312?B?amllZnUouLW93Ck=?=) Date: Wed, 5 Aug 2020 02:23:43 +0000 Subject: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits(Internet mail) In-Reply-To: References: <8A8E8751-AF91-471A-B668-5D6461B72356@tencent.com> <3C8F8615-C994-4725-B18B-4F856DA68AF3@tencent.com> <9ad2e87b-a099-75a0-15dc-73a344ac8806@oracle.com> <421fb49a041d4e2b87a2158dec4c676b@tencent.com>, Message-ID: Thanks Serguei for your review. I'll try to split it. Best regards, Jie ________________________________ From: serguei.spitsyn at oracle.com Sent: Wednesday, August 5, 2020 10:10 AM To: jiefu(??); chris.plummer at oracle.com; David Holmes Cc: serviceability-dev at openjdk.java.net Subject: Re: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits(Internet mail) Hi Jie, Could you, please, split the format string in two shorter lines? Otherwise, it looks okay. There is no need in another webrev. Thanks, Serguei On 8/4/20 18:38, jiefu(??) wrote: Thanks Chris and David for your review. Will push it later. 
Best regards, Jie ________________________________ From: David Holmes Sent: Wednesday, August 5, 2020 9:03 AM To: jiefu(??); serviceability-dev at openjdk.java.net Subject: Re: FW: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean tests fail with hostnames starting from digits(Internet mail) My Review still stands. :) Thanks, David On 5/08/2020 1:10 am, jiefu(??) wrote: > Forward it to serviceability-dev since this issue in the JBS has been > moved from hotspot/runtime to core-svc/java.lang.management. > > Please review it. > > Thanks. > > Best regards, > > Jie > > *From: *"jiefu(??)" > *Date: *Tuesday, August 4, 2020 at 5:10 PM > *To: *"hotspot-runtime-dev at openjdk.java.net" > > *Subject: *RFR: 8251031: Some vmTestbase/nsk/monitoring/RuntimeMXBean > tests fail with hostnames starting from digits > > Hi all, > > JBS: https://bugs.openjdk.java.net/browse/JDK-8251031 > > Webrev: http://cr.openjdk.java.net/~jiefu/8251031/webrev.00/ > > Some vmTestbase/nsk/monitoring/RuntimeMXBean tests failed in our test > infrastructure. > > The reason is that these tests reject hostnames starting with digits. > > However, hostnames starting from digits are actually valid according to > RFC1123 [1][2]. > > It would be better to fix it. > > Thanks a lot. > > Best regards, > > Jie > > [1] https://tools.ietf.org/html/rfc1123#page-13 > > [2] https://en.wikipedia.org/wiki/Hostname > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Wed Aug 5 02:26:24 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 5 Aug 2020 12:26:24 +1000 Subject: RFR(L/S) : 8249030 : clean up FileInstaller $test.src $cwd in vmTestbase_nsk_jdi tests In-Reply-To: References: <637D6495-24E6-4874-9024-1B0082492085@oracle.com> Message-ID: That was a hard slog :) But looks okay to me. Thanks, David On 5/08/2020 9:59 am, Igor Ignatyev wrote: > ping? 
> -- Igor > >> On Jul 31, 2020, at 1:24 PM, Igor Ignatyev wrote: >> >> http://cr.openjdk.java.net/~iignatyev//8249030/webrev.00 >>> 2258 lines changed: 0 ins; 1144 del; 1114 mod; >> >> Hi all, >> >> could you please review the clean-up of nsk_jdi tests? >> from main issue(8204985) : >>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests. >> >> >> the patch removes FileInstaller actions in the said tests, and as before, the biggest part of patch was done by `ag -l '@run driver jdk.test.lib.FileInstaller . .' $DIR | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}` with $DIR being test/hotspot/jtreg/vmTestbase/nsk/jdi/. >> >> the 10 tests which had '-configFile ./<...>', and hence were looking for config file in the current directory, were updated to search for a config file in 'test.src' directory: http://cr.openjdk.java.net/~iignatyev//8249030/webrev.00-configFile >> >> testing: :vmTestbase_nsk_jdi on {linux,windows,macos}-x64 >> JBS: https://bugs.openjdk.java.net/browse/JDK-8249030 >> webrev: http://cr.openjdk.java.net/~iignatyev/8249030/webrev.00/ >> >> Thanks, >> -- Igor > From igor.ignatyev at oracle.com Wed Aug 5 03:08:45 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 4 Aug 2020 20:08:45 -0700 Subject: RFR(L/S) : 8249030 : clean up FileInstaller $test.src $cwd in vmTestbase_nsk_jdi tests In-Reply-To: References: <637D6495-24E6-4874-9024-1B0082492085@oracle.com> Message-ID: <9E8AE6ED-8DCD-473C-9AEB-9B0951F0180D@oracle.com> Serguei, David, thank you for your reviews. pushed. -- Igor > On Aug 4, 2020, at 7:26 PM, David Holmes wrote: > > That was a hard slog :) > > But looks okay to me. 
> > Thanks, > David > > On 5/08/2020 9:59 am, Igor Ignatyev wrote: >> ping? >> -- Igor >>> On Jul 31, 2020, at 1:24 PM, Igor Ignatyev wrote: >>> >>> http://cr.openjdk.java.net/~iignatyev//8249030/webrev.00 >>>> 2258 lines changed: 0 ins; 1144 del; 1114 mod; >>> >>> Hi all, >>> >>> could you please review the clean-up of nsk_jdi tests? >>> from main issue(8204985) : >>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests. >>> >>> >>> the patch removes FileInstaller actions in the said tests, and as before, the biggest part of patch was done by `ag -l '@run driver jdk.test.lib.FileInstaller . .' $DIR | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}` with $DIR being test/hotspot/jtreg/vmTestbase/nsk/jdi/. 
>>> >>> the 10 tests which had '-configFile ./<...>', and hence were looking for config file in the current directory, were updated to search for a config file in 'test.src' directory: http://cr.openjdk.java.net/~iignatyev//8249030/webrev.00-configFile >>> >>> testing: :vmTestbase_nsk_jdi on {linux,windows,macos}-x64 >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249030 >>> webrev: http://cr.openjdk.java.net/~iignatyev/8249030/webrev.00/ >>> >>> Thanks, >>> -- Igor From serguei.spitsyn at oracle.com Wed Aug 5 04:56:44 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Aug 2020 21:56:44 -0700 Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) In-Reply-To: <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com> References: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com> <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com> Message-ID: <5fe3c424-3e48-7fb5-e964-90d9b52fad92@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Aug 5 05:01:18 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Aug 2020 22:01:18 -0700 Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) In-Reply-To: <5fe3c424-3e48-7fb5-e964-90d9b52fad92@oracle.com> References: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com> <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com> <5fe3c424-3e48-7fb5-e964-90d9b52fad92@oracle.com> Message-ID: <936c5cb9-3895-e518-c912-5e5cd8a71e66@oracle.com> An HTML attachment was scrubbed... 
URL:
From linzang at tencent.com Wed Aug 5 05:22:45 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Wed, 5 Aug 2020 05:22:45 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <936c5cb9-3895-e518-c912-5e5cd8a71e66@oracle.com>
References: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com> <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com> <5fe3c424-3e48-7fb5-e964-90d9b52fad92@oracle.com> <936c5cb9-3895-e518-c912-5e5cd8a71e66@oracle.com>
Message-ID:

Hi Serguei,

No problem, thanks for reviewing :)
I will upload a new webrev later, so may I ask for your help to review it again?

Hi Stefan,

As Paul mentioned, _missed_count is not a size, so size_t may not be clear. What's your opinion about uint64_t?
It seems a uint overflow may happen on a 64-bit machine with a large heap, e.g. there may be more than 4 Giga objects (8-byte header + 8-byte klassptr + 8-byte field) in a heap that is larger than 96 GB. uint64_t is OK in this case.

BRs,
Lin

From: "serguei.spitsyn at oracle.com"
Date: Wednesday, August 5, 2020 at 1:02 PM
To: "linzang(??)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Oh, sorry for the confusion, please, skip my question. :)
C++ does not have the '&&=' operator.

Thanks,
Serguei

On 8/4/20 21:56, serguei.spitsyn at oracle.com wrote:
Hi Lin,

https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html

+class KlassInfoTableMergeClosure : public KlassInfoClosure {
+private:
+  KlassInfoTable* _dest;
+  bool _success;
+public:
+  KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), _success(true) {}
+  void do_cinfo(KlassInfoEntry* cie) {
+    _success &= _dest->merge_entry(cie);
+  }

The operator '&=' above looks strange.
Did you actually want to use the operator '&&=' instead? : + _success &&= _dest->merge_entry(cie); Thanks, Serguei On 8/3/20 07:51, linzang(??) wrote: Dear Stefan, May I ask your help to review again? I have made a delta based on the last changeset you have reviewed(webrev04),hope it could ease your reviewing work. webrev: https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ delta (vs webrev04): https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ bug: https://bugs.openjdk.java.net/browse/JDK-8215624 CSR(approved): https://bugs.openjdk.java.net/browse/JDK-8239290 BRs, Lin On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. Thanks, Paul On 7/29/20, 5:02 AM, "linzang(??)" wrote: Upload a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ It fix an issue of windows fail : #################################### In heapInspect.cpp - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { #################################### In heapInspect.hpp - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); #################################### BRs, Lin On 2020/7/27, 11:26 AM, "linzang(??)" wrote: I update a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 It includes a tiny fix of build failure on windows: #################################### In attachListener.cpp: - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); + uint 
parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); #################################### BRs, Lin On 2020/7/23, 11:56 AM, "linzang(??)" wrote: Hi Paul, Thanks for your help, that all looks good to me. Just 2 minor changes: ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) ? delete a unnecessary blank line in heapInspect.cpp at merge_entry() ######################################################################### --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 @@ -251,7 +251,6 @@ _size_of_instances_in_words += cie->words(); return true; } - return false; } @@ -568,7 +567,6 @@ Atomic::add(&_missed_count, missed_count); } else { Atomic::store(&_success, false); - return; } } ######################################################################### Here is the webrev http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ BRs, Lin --------------------------------------------- From: "Hohensee, Paul" Date: Thursday, July 23, 2020 at 6:48 AM To: "linzang(??)" , Stefan Karlsson , "serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev , "hotspot-gc-dev at openjdk.java.net" Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) Just small things. heapInspection.cpp: In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace + Atomic::store(&_success, false); + return; + } with + Atomic::store(&_success, false); + } In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. attachListener.cpp: In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. 
Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num.

BasicJMapTest.java:

I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible.

Webrev with the above changes in http://cr.openjdk.java.net/~phh/8214535/webrev.01/

Thanks,
Paul

On 7/15/20, 2:13 AM, "linzang(??)" wrote:

I uploaded a new webrev at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/
It fixes a potential issue where an unexpected number of threads may be calculated for the "parallel" option of jmap -histo in a container.
As shown at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html
############### attachListener.cpp ####################
@@ -252,11 +252,11 @@
 static jint heap_inspection(AttachOperation* op, outputStream* out) {
   bool live_objects_only = true;   // default is true to retain the behavior before this change is made
   outputStream* os = out;   // if path not specified or path is NULL, use out
   fileStream* fs = NULL;
   const char* arg0 = op->arg(0);
-  uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors.
+  uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors.
   if (arg0 != NULL && (strlen(arg0) > 0)) {
     if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) {
       out->print_cr("Invalid argument to inspectheap operation: %s", arg0);
       return JNI_ERR;
     }
###################################################
Thanks.

BRs,
Lin

On 2020/7/9, 3:22 PM, "linzang(??)" wrote:

Hi Paul,
Thanks for reviewing!

>> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp.

The reason I changed JMap.java to compose all arguments into 1 string, instead of passing 3 arguments, is to avoid the compatibility issue we discussed in http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is that the max argument count in HotspotVirtualMachineImpl.java and attachListener.cpp needs to be enlarged (changes like http://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if the user uses an old jcmd/jmap tool, it may get stuck at socket read(), because the "max argument count" doesn't match.
I re-checked this change: the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change to argument passing. But I think we will have to face the problem if more arguments are added to jcmd-like tools later; I am not sure whether it should be solved (or worked around) in this changeset.

And here are the latest webrev and delta:
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/

Cheers,
Lin

On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote:

I'd like to see this feature added. :)

The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm).

I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot.
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp.

heapInspection.hpp:

_shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table().

Comment copy-edit:

+// Parallel heap inspection task. Parallel inspection can fail due to
+// a native OOM when allocating memory for TL-KlassInfoTable.
+// _success will be set false on an OOM, and serial inspection tried.

_shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable.

heapInspection.cpp:

You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code.

Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace

+  } else {
+    return false;
+  }

with

+  return false;

KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter.
I'd reorganize the code in populate_table() to make it more clear, viz. (I changed _shared_missed_count to _missed_count)

+  if (cit.allocation_failed()) {
+    // fail to allocate memory, stop parallel mode
+    Atomic::store(&_success, false);
+    return;
+  }
+  RecordInstanceClosure ric(&cit, _filter);
+  _poi->object_iterate(&ric, worker_id);
+  missed_count = ric.missed_count();
+  {
+    MutexLocker x(&_mutex);
+    merge_success = _shared_cit->merge(&cit);
+  }
+  if (merge_success) {
+    Atomic::add(&_missed_count, missed_count);
+  } else {
+    Atomic::store(&_success, false);
+  }

Thanks,
Paul

On 6/29/20, 7:20 PM, "linzang(??)" wrote:

Dear All,
Sorry to bother you again. I just want to make sure whether this change is worth continuing to work on. If the decision is that it is not, I think I can drop this work and stop asking for review help...
Thanks for all your help reviewing this previously.

BRs,
Lin

On 2020/5/9, 3:47 PM, "linzang(??)" wrote:

Dear All,
May I ask your help again to review the latest change? Thanks!

BRs,
Lin

On 2020/4/28, 1:54 PM, "linzang(??)" wrote:

Hi Stefan,

>> - Adding Atomic::load/store.
>> - Removing the time measurement in the run_task. I renamed G1's function
>> to run_task_timed. If we need this outside of G1, we can rethink the API
>> at that point.
>> - ZGC style cleanups

Thanks for revising the patch, the changes all look good to me, and I have made a tiny change based on it:
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/

It reduces the scope of the mutex in ParHeapInspectTask, and deletes unnecessary comments.

BRs,
Lin

On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:

Hi Lin,

On 2020-04-26 05:10, linzang(??) wrote:
> Hi Stefan and Paul,
> I have made a new patch based on your comments and Stefan's PoC code:
> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/
> Delta (based on Stefan's change): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/

Thanks for providing a delta patch. It makes it much easier to look at, and more likely for reviewers to continue reviewing.

I'm going to continue focusing on the GC parts, and leave the rest to others to review.

>
> And here are the main changes I made and want to discuss with you:
> 1. Changed "parallelThreadNum=" to "parallel=" for jmap -histo options.
> 2. Added logic to handle the case where parallel heap inspection fails, in heapInspection.cpp
> This is because the ParHeapInspectTask creates a thread-local KlassInfoTable in its work() method, and this may fail because of native OOM. In this case, the parallel inspection should fail and serial heap inspection can be tried.
> One more thing I want to discuss with you is the member "_success" of ParHeapInspectTask: when native OOM happens, it is set to false. And since this "set" operation can be performed by multiple threads, should it be an atomic op? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it back to true after the ParHeapInspectTask instance is created, so it is safe to be non-atomic. Do you agree with that?

In these situations you should be using the Atomic::load/store primitives. We're moving toward a later C++ standard where data races are considered undefined behavior.

> 3. Made CollectedHeap::run_task() an abstract virtual func, so that every subclass of CollectedHeap must support it, and later implementations of new CollectedHeaps will not miss the "parallel" features.
> The problem I want to discuss with you is about EpsilonHeap and SerialHeap: as they may not need parallel heap iteration, I only call task->work(0), in case run_task() is invoked some way in the future. Another way is to leave run_task() unimplemented. Which one do you think is better?

I don't have a strong opinion about this.

> And also please help take a look at the ZHeap, as there is a class ZTask that wraps the AbstractGangTask, and CollectedHeap::run_task() only accepts AbstractGangTask* as argument, so I made a delegate class to adapt it. Please see src/hotspot/share/gc/z/zHeap.cpp.
>
> There may be other better ways to solve the above problems; any comments are welcome. Thanks!

I've created a few cleanups and changes on top of your latest patch:

https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta
https://cr.openjdk.java.net/~stefank/8215624/webrev.02

- Adding Atomic::load/store.
- Removing the time measurement in the run_task. I renamed G1's function to run_task_timed. If we need this outside of G1, we can rethink the API at that point.
- ZGC style cleanups

Thanks,
StefanK

>
> BRs,
> Lin
>
> On 2020/4/23, 11:08 AM, "linzang(??)" wrote:
>
> Thanks Paul! I agree with using "parallel", and will make the update in the next patch. Thanks for helping update the CSR.
>
> BRs,
> Lin
>
> On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote:
>
> For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage.
>
> Thanks,
> Paul
>
> On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" wrote:
>
> Dear Stefan,
>
> Thanks a lot! I agree with you about decoupling the heap inspection code from the GCs.
> I will start from your POC code, and may discuss it with you later.
>
>
> BRs,
> Lin
>
> On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote:
>
> Hi Lin,
>
> I took a look at this earlier and saw that the heap inspection code is
> strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer
> if we'd abstract this away, so that the GCs only provide a "parallel
> object iteration" interface, and the heap inspection code is kept elsewhere.
>
> I started experimenting with doing that, but other higher-priority (to
> me) tasks have had to take precedence.
>
> I've uploaded my work-in-progress / proof-of-concept:
> https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/
> https://cr.openjdk.java.net/~stefank/8215624/webrev.01/
>
> The current code doesn't handle the lifecycle (deletion) of the
> ParallelObjectIterators. There's also code left unimplemented in around
> CollectedHeap::run_task. However, I think this could work as a basis to
> pull out the heap inspection code out of the GCs.
>
> Thanks,
> StefanK
>
> On 2020-04-22 02:21, linzang(??) wrote:
> > Dear all,
> > May I ask your help to review? This RFR has been there for quite a while.
> > Thanks!
> >
> > BRs,
> > Lin
> >
> > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:
> >
> >> Just updated a new patch; my preliminary measurement shows about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4).
> >> webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/
> >> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> >> CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> >> BRs,
> >> Lin
> >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote:
> >> >
> >> > Dear all,
> >> > Let me try to ease the reviewing work with some explanation :P
> >> > The patch's target is to speed up jmap -histo for heap iteration; from my experience it is necessary for large heap investigation. E.g. in a big-data scenario I have tried to run jmap -histo against a 180GB heap, and it does take quite a while.
> >> > And if my understanding is correct, even jmap -histo without the "live" option does heap inspection with the heap lock acquired, so it is very likely to block mutator threads in allocation-sensitive scenarios. I would say the faster the heap inspection runs, the shorter the mutators are blocked. This is why parallel iteration for jmap is necessary.
> >> > I think parallel heap inspection should be applied to all kinds of heap. However, considering that the heap layouts are different for each GC, much time is required to understand all kinds of heap layout to make the whole change. IMO, it is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally. The first patch (this one) is going to confirm the implementation details of how jmap accepts the new option and passes it to the attachListener of the JVM process, and then how to make the parallel inspection closure generic enough to make it easy to extend to different heap layouts, and also how to implement the heap inspection in a specific GC's heap. This patch uses G1's heap as the beginning.
> >> > This patch actually does several things:
> >> > 1. Add an option "parallelThreadNum=<N>" to jmap -histo. The default behavior is to set N to 0, which means letting the JVM decide how many threads to use for heap inspection. Setting this option to 1 will disable parallel heap inspection. (more details in CSR: https://bugs.openjdk.java.net/browse/JDK-8239290)
> >> > 2. Make a change in how JMap passes arguments, see http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html. Originally it passed options as separate arguments to attachListener; this patch changes that so all options are composed into a single string. So the arg_count_max in attachListener.hpp does not need to be changed, which avoids the compatibility issue, as discussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
> >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp. Its work(uint worker_id) method prepares the data structure (KlassInfoTable) needed for every parallel worker thread, and then calls do_object_iterate_parallel(), which is the heap-specific implementation. I also added some mechanisms in KlassInfoTable to support parallel iteration, such as merge().
> >> > 4. In the specific heap (G1 in this patch), create a subclass of ParHeapInspectTask and implement do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invokes g1CollectedHeap's object_iterate_parallel().
> >> > 5. Add related tests.
> >> > 6. It may be easy to extend this patch to other kinds of heap by creating a subclass of ParHeapInspectTask and implementing do_object_iterate_parallel().
> >> >
> >> > Hope this info helps with code review and initiates the discussion :-)
> >> > Thanks!
> >> >
> >> > BRs,
> >> > Lin
> >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:
> >> > >
> >> > > Re-posting this RFR with the correct enhancement number to make it trackable.
> >> > > Please ignore the previous wrong post. Sorry for the trouble.
> >> > >
> >> > > webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/
> >> > > bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> >> > > CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> >> > > --------------
> >> > > Lin
> >> > > >Hi Lin,
> >> > > >
> >> > > >Could you, please, re-post your RFR with the right enhancement number in
> >> > > >the message subject?
> >> > > >It will be more trackable this way.
> >> > > >
> >> > > >Thanks,
> >> > > >Serguei
> >> > > >
> >> > > >
> >> > > >On 2/17/20 10:29 PM, linzang(??) wrote:
> >> > > >> Dear David,
> >> > > >> Thanks a lot!
> >> > > >> I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/.
> >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration.
> >> > > >> Maybe we can first use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with CollectedHeap; then we can extend the solution to other kinds of heap.
> >> > > >>
> >> > > >> Thanks,
> >> > > >> --------------
> >> > > >> Lin
> >> > > >>> Hi Lin,
> >> > > >>>
> >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC
> >> > > >>> worker threads, and whether it needs to be extended beyond G1.
> >> > > >>>
> >> > > >>> I happened to spot one nit when browsing:
> >> > > >>>
> >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp
> >> > > >>>
> >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit,
> >> > > >>> + BoolObjectClosure* filter,
> >> > > >>> + size_t* missed_count,
> >> > > >>> + size_t thread_num) {
> >> > > >>> + return NULL;
> >> > > >>>
> >> > > >>> s/NULL/false/
> >> > > >>>
> >> > > >>> Cheers,
> >> > > >>> David
> >> > > >>>
> >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote:
> >> > > >>>> Dear All,
> >> > > >>>> May I ask your help to review the following changes:
> >> > > >>>> webrev:
> >> > > >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
> >> > > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> >> > > >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> >> > > >>>> This patch enables parallel heap inspection of G1 for jmap histo.
> >> > > >>>> My simple test showed it can speed up jmap -histo by 2x with
> >> > > >>>> parallelThreadNum set to 2 for a heap at ~500M on a 4-core platform.
> >> > > >>>>
> >> > > >>>> ------------------------------------------------------------------------
> >> > > >>>> BRs,
> >> > > >>>> Lin

From serguei.spitsyn at oracle.com Wed Aug 5 06:16:32 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 4 Aug 2020 23:16:32 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To:
References: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com> <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com> <5fe3c424-3e48-7fb5-e964-90d9b52fad92@oracle.com> <936c5cb9-3895-e518-c912-5e5cd8a71e66@oracle.com>
Message-ID: <6d86ce31-1784-a371-64db-0c9a914ac13e@oracle.com>

An HTML attachment was scrubbed...

From jiefu at tencent.com Wed Aug 5 07:18:00 2020
From: jiefu at tencent.com (jiefu(傅杰))
Date: Wed, 5 Aug 2020 07:18:00 +0000
Subject: RFR: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits
Message-ID: <483D9D17-F2C3-47D4-8578-0DAF1E353AF5@tencent.com>

Hi all,

JBS: https://bugs.openjdk.java.net/browse/JDK-8251155
Webrev: http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/

HostIdentifier fails to canonicalize hostname:port if the hostname starts with digits.
The current implementation ends up treating the hostname as the URI scheme ("scheme = hostname").
But a scheme must not start with a digit, which leads to this bug.

Thanks a lot.
Best regards,
Jie
From stefan.karlsson at oracle.com Wed Aug 5 07:38:04 2020
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 5 Aug 2020 09:38:04 +0200
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To:
References: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com> <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com> <5fe3c424-3e48-7fb5-e964-90d9b52fad92@oracle.com> <936c5cb9-3895-e518-c912-5e5cd8a71e66@oracle.com>
Message-ID:

On 2020-08-05 07:22, linzang(??) wrote:
> Hi Serguei,
>
> No problem, thanks for your reviewing :)
>
> I will upload a new webrev later, so may I ask your help to review it
> again?
>
> Hi Stefan,
>
> As Paul mentioned, the _missed_count is not a size, so size_t may
> not be clear. What's your opinion about uint64_t?

We typically don't restrict the usage of size_t to only *sizes* in HotSpot. If you search the code you'll find many count variables using size_t, so I personally don't see the need to change the type. However, if you really do want to change it, then maybe use another type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using uint64_t with some of the Atomic operations is problematic on some 32-bit platforms, so using a type that matches the word size of the targeted machine saves you from having to think about that.

>
> It seems the uint overflow may happen on a 64-bit machine with a large
> heap, e.g. there may be more than 4 Giga objects (8-byte header + 8-byte
> klassptr + 8-byte field) in a heap that is larger than 96 GB; uint64_t
> is ok in this case.

Exactly.
Thanks,
StefanK

> BRs,
> Lin
>
> From: "serguei.spitsyn at oracle.com"
> Date: Wednesday, August 5, 2020 at 1:02 PM
> To: "linzang(??)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>
> Oh, sorry for the confusion, please, skip my question. :)
> C++ does not have the '&&=' operator.
>
> Thanks,
> Serguei
>
> On 8/4/20 21:56, serguei.spitsyn at oracle.com wrote:
>
> Hi Lin,
>
> https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html
>
> +class KlassInfoTableMergeClosure : public KlassInfoClosure {
> +private:
> +  KlassInfoTable* _dest;
> +  bool _success;
> +public:
> +  KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), _success(true) {}
> +  void do_cinfo(KlassInfoEntry* cie) {
> +    _success &= _dest->merge_entry(cie);
> +  }
>
> The operator '&=' above looks strange.
> Did you actually want to use the operator '&&=' instead?
>
> +    _success &&= _dest->merge_entry(cie);
>
> Thanks,
> Serguei
>
> On 8/3/20 07:51, linzang(??) wrote:
>
> Dear Stefan,
> May I ask your help to review again? I have made a delta based on the last changeset you have reviewed (webrev04), hope it could ease your reviewing work.
> webrev: https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/
> delta (vs webrev04): https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/
> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> CSR (approved): https://bugs.openjdk.java.net/browse/JDK-8239290
>
> BRs,
> Lin
>
> On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote:
>
> A submit repo run with this succeeded, so afaic you're good to go.
> >       On 2020/5/9, 3:47 PM, "linzang(臧琳)" wrote:
> >         Dear All,
> >         May I ask your help again to review the latest change? Thanks!
> >         BRs,
> >         Lin
> >
> >         On 2020/4/28, 1:54 PM, "linzang(臧琳)" wrote:
> >           Hi Stefan,
> >           >>  - Adding Atomic::load/store.
> >           >>  - Removing the time measurement in the run_task. I renamed G1's function
> >           >>  to run_task_timed. If we need this outside of G1, we can rethink the API
> >           >>  at that point.
> >           >>  - ZGC style cleanups
> >           Thanks for revising the patch, the changes all look good to me, and I have made a
> >           tiny change based on it:
> >           http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
> >           http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/
> >           It reduces the scope of the mutex in ParHeapInspectTask and deletes unnecessary
> >           comments.
> >           BRs,
> >           Lin
> >
> >           On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:
> >             Hi Lin,
> >
> >             On 2020-04-26 05:10, linzang(臧琳) wrote:
> >             > Hi Stefan and Paul,
> >             >     I have made a new patch based on your comments and Stefan's PoC code:
> >             >     Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/
> >             >     Delta (based on Stefan's change): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/
> >
> >             Thanks for providing a delta patch. It makes it much easier to look at,
> >             and more likely for reviewers to continue reviewing.
> >
> >             I'm going to continue focusing on the GC parts, and leave the rest to
> >             others to review.
> >
> >             >
> >             >     And here are the main changes I made and want to discuss with you:
> >             >     1. Changed "parallelThreadNum=" to "parallel=" for jmap -histo options.
> >             >     2. Added logic to detect when parallel heap inspection fails, in heapInspection.cpp.
> >             >        This is because parHeapInspectTask creates a thread-local KlassInfoTable in
> >             >        its work() method, and this may fail because of a native OOM; in this case
> >             >        the parallel inspection should fail and serial heap inspection can be tried.
> >             >        One more thing I want to discuss with you is the member "_success" of
> >             >        parHeapInspectTask: when a native OOM happens, it is set to false. Since this
> >             >        "set" operation can be performed by multiple threads, should it be an atomic
> >             >        op? IMO, this is not necessary because "_success" can only be set to false,
> >             >        and there is no way to change it back to true after the ParHeapInspectTask
> >             >        instance is created, so it is safe for it to be non-atomic. Do you agree?
> >
> >             In these situations you should be using the Atomic::load/store
> >             primitives. We're moving toward a later C++ standard where data races are
> >             considered undefined behavior.
> >
> >             >     3. Made CollectedHeap::run_task() an abstract virtual function, so that every
> >             >        subclass of collectedHeap must support it, and later implementations of a
> >             >        new collectedHeap will not miss the "parallel" feature.
> >             >        The problem I want to discuss with you is about epsilonHeap and SerialHeap:
> >             >        as they may not need parallel heap iteration, I only call task->work(0), in
> >             >        case run_task() is invoked somehow in the future. Another way is to leave
> >             >        run_task() unimplemented. Which one do you think is better?
> >
> >             I don't have a strong opinion about this.
> >
> >             >        And also please help take a look at the zHeap, as there is a class
> >             >        zTask that wraps the abstractGangTask, and collectedHeap::run_task()
> >             >        only accepts an AbstractGangTask* as argument, so I made a delegate class
> >             >        to adapt it; please see src/hotspot/share/gc/z/zHeap.cpp.
> >             >
> >             >     There may be other, better ways to solve the above problems; any comments are
> >             >     welcome. Thanks!
> >
> >             I've created a few cleanups and changes on top of your latest patch:
> >             https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta
> >             https://cr.openjdk.java.net/~stefank/8215624/webrev.02
> >
> >             - Adding Atomic::load/store.
> >             - Removing the time measurement in the run_task. I renamed G1's function
> >             to run_task_timed. If we need this outside of G1, we can rethink the API
> >             at that point.
> >             - ZGC style cleanups
> >
> >             Thanks,
> >             StefanK
> >
> >             >
> >             > BRs,
> >             > Lin
> >             >
> >             > On 2020/4/23, 11:08 AM, "linzang(臧琳)" wrote:
> >             >
> >             >   Thanks Paul! I agree with using "parallel" and will make the update in the next
> >             >   patch. Thanks for helping to update the CSR.
> >             >
> >             >   BRs,
> >             >   Lin
> >             >
> >             >   On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote:
> >             >
> >             >     For the interface, I'd use "parallel" instead of "parallelThreadNum". All the
> >             >     other options are lower case, and it's a lot easier to type "parallel". I took
> >             >     the liberty of updating the CSR. If you're ok with it, you might want to change
> >             >     variable names and such, plus of course JMap.usage.
> >             >
> >             >     Thanks,
> >             >     Paul
> >             >
> >             >     On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(臧琳)" linzang at tencent.com> wrote:
> >             >
> >             >       Dear Stefan,
> >             >
> >             >       Thanks a lot! I agree with you about decoupling the heap inspection code from
> >             >       the GC's.
> >             >       I will start from your POC code, and may discuss with you later.
> >             >
> >             >       BRs,
> >             >       Lin
> >             >
> >             >       On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote:
> >             >
> >             >         Hi Lin,
> >             >
> >             >         I took a look at this earlier and saw that the heap inspection code is
> >             >         strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer
> >             >         if we'd abstract this away, so that the GCs only provide a "parallel
> >             >         object iteration" interface, and the heap inspection code is kept elsewhere.
> >             >
> >             >         I started experimenting with doing that, but other higher-priority (to
> >             >         me) tasks have had to take precedence.
> >             >
> >             >         I've uploaded my work-in-progress / proof-of-concept:
> >             >         https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/
> >             >         https://cr.openjdk.java.net/~stefank/8215624/webrev.01/
> >             >
> >             >         The current code doesn't handle the lifecycle (deletion) of the
> >             >         ParallelObjectIterators. There's also code left unimplemented in and around
> >             >         CollectedHeap::run_task. However, I think this could work as a basis to
> >             >         pull the heap inspection code out of the GCs.
> >             >
> >             >         Thanks,
> >             >         StefanK
> >             >
> >             >         On 2020-04-22 02:21, linzang(臧琳) wrote:
> >             >         > Dear all,
> >             >         >   May I ask your help to review? This RFR has been there for quite a while.
> >             >         >   Thanks!
> >             >         >
> >             >         > BRs,
> >             >         > Lin
> >             >         >
> >             >         > > On 2020/3/16, 5:18 PM, "linzang(臧琳)" wrote:
> >             >         >
> >             >         >>  Just uploaded a new patch; my preliminary measurements show about a 3.5x
> >             >         >>  speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with
> >             >         >>  the parallel thread number set to 4).
> >             >         >>  webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/
> >             >         >>  bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> >             >         >>  CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> >             >         >>  BRs,
> >             >         >>  Lin
> >             >         >>  > On 2020/3/2, 9:56 PM, "linzang(臧琳)" wrote:
> >             >         >>  >
> >             >         >>  >   Dear all,
> >             >         >>  >     Let me try to ease the reviewing work with some explanation :P
> >             >         >>  >     The patch's target is to speed up the heap iteration of jmap -histo;
> >             >         >>  >     from my experience this is necessary for large heap investigation.
> >             >         >>  >     E.g., in a big-data scenario I have tried to run jmap -histo against
> >             >         >>  >     a 180GB heap, and it does take quite a while.
> >             >         >>  >     And if my understanding is correct, even jmap -histo without the
> >             >         >>  >     "live" option does heap inspection with the heap lock acquired, so it
> >             >         >>  >     is very likely to block mutator threads in allocation-sensitive
> >             >         >>  >     scenarios. I would say the faster the heap inspection is, the shorter
> >             >         >>  >     the mutators are blocked. This is why parallel iteration for jmap is
> >             >         >>  >     necessary.
> >             >         >>  >     I think parallel heap inspection should be applied to all kinds of
> >             >         >>  >     heap. However, since the heap layouts differ between GCs, much time is
> >             >         >>  >     required to understand all of them to make the whole change. IMO, it
> >             >         >>  >     is not wise to have a huge patch for the whole solution at once, and
> >             >         >>  >     it would be even harder to review. So I plan to implement it
> >             >         >>  >     incrementally. The first patch (this one) is meant to settle the
> >             >         >>  >     implementation details of how jmap accepts the new option and passes
> >             >         >>  >     it to the attachListener of the JVM process, how to make the parallel
> >             >         >>  >     inspection closure generic enough to be easily extended to different
> >             >         >>  >     heap layouts, and how to implement the heap inspection in a specific
> >             >         >>  >     GC's heap. This patch uses G1's heap as the beginning.
> >             >         >>  >     This patch actually does several things:
> >             >         >>  >     1. Add an option "parallelThreadNum=" to jmap -histo. The default
> >             >         >>  >        behavior is to set N to 0, which means letting the JVM decide how
> >             >         >>  >        many threads to use for heap inspection. Setting this option to 1
> >             >         >>  >        disables parallel heap inspection. (More details in CSR:
> >             >         >>  >        https://bugs.openjdk.java.net/browse/JDK-8239290)
> >             >         >>  >     2. Change how JMap passes arguments (changes in
> >             >         >>  >        http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html).
> >             >         >>  >        Originally it passed options as separate arguments to
> >             >         >>  >        attachListener; this patch changes it so that all options are
> >             >         >>  >        composed into a single string. So arg_count_max in
> >             >         >>  >        attachListener.hpp does not need to be changed, which avoids the
> >             >         >>  >        compatibility issue discussed at
> >             >         >>  >        https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
> >             >         >>  >     3. Add an abstract class ParHeapInspectTask in heapInspection.hpp /
> >             >         >>  >        heapInspection.cpp. Its work(uint worker_id) method prepares the
> >             >         >>  >        data structure (KlassInfoTable) needed by every parallel worker
> >             >         >>  >        thread, and then calls do_object_iterate_parallel(), which is the
> >             >         >>  >        heap-specific implementation. I also added some mechanisms in
> >             >         >>  >        KlassInfoTable to support parallel iteration, such as merge().
> >             >         >>  >     4. In the specific heap (G1 in this patch), create a subclass of
> >             >         >>  >        ParHeapInspectTask and implement do_object_iterate_parallel() for
> >             >         >>  >        parallel heap inspection. For G1, it simply invokes
> >             >         >>  >        g1CollectedHeap's object_iterate_parallel().
> >             >         >>  >     5. Add a related test.
> >             >         >>  >     6. It may be easy to extend this patch to other kinds of heap by
> >             >         >>  >        creating a subclass of ParHeapInspectTask and implementing
> >             >         >>  >        do_object_iterate_parallel().
> >             >         >>  >
> >             >         >>  >   Hope this info helps with the code review and initiates the discussion :-)
> >             >         >>  >   Thanks!
> >             >         >>  >
> >             >         >>  >   BRs,
> >             >         >>  >   Lin
> >             >         >>  >   >On 2020/2/19, 9:40 AM, "linzang(臧琳)" wrote:
> >             >         >>  >   >
> >             >         >>  >   >  Re-posting this RFR with the correct enhancement number to make it trackable.
> >             >         >>  >   >  Please ignore the previous wrong post. Sorry for the trouble.
> >             >         >>  >   >
> >             >         >>  >   >   webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/
> >             >         >>  >   >   bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> >             >         >>  >   >   CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> >             >         >>  >   >   --------------
> >             >         >>  >   >   Lin
> >             >         >>  >   >   >Hi Lin,
> >             >         >>  >   >   >
> >             >         >>  >   >   >Could you, please, re-post your RFR with the right enhancement number in
> >             >         >>  >   >   >the message subject?
> >             >         >>  >   >   >It will be more trackable this way.
> >             >         >>  >   >   >
> >             >         >>  >   >   >Thanks,
> >             >         >>  >   >   >Serguei
> >             >         >>  >   >   >
> >             >         >>  >   >   >
> >             >         >>  >   >   >On 2/17/20 10:29 PM, linzang(臧琳) wrote:
> >             >         >>  >   >   >> Dear David,
> >             >         >>  >   >   >>       Thanks a lot!
> >             >         >>  >   >   >>       I have uploaded the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/.
> >             >         >>  >   >   >>       IMHO the parallel heap inspection can be extended to all kinds of heap
> >             >         >>  >   >   >>       as long as the heap layout can support parallel iteration.
> >             >         >>  >   >   >>       Maybe we can first use this webrev to discuss how to implement it,
> >             >         >>  >   >   >>       because I am not sure my current implementation is an appropriate way
> >             >         >>  >   >   >>       to communicate with collectedHeap; then we can extend the solution to
> >             >         >>  >   >   >>       other kinds of heap.
> >             >         >>  >   >   >>
> >             >         >>  >   >   >> Thanks,
> >             >         >>  >   >   >> --------------
> >             >         >>  >   >   >> Lin
> >             >         >>  >   >   >>> Hi Lin,
> >             >         >>  >   >   >>>
> >             >         >>  >   >   >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC
> >             >         >>  >   >   >>> worker threads, and whether it needs to be extended beyond G1.
> >             >         >>  >   >   >>>
> >             >         >>  >   >   >>> I happened to spot one nit when browsing:
> >             >         >>  >   >   >>>
> >             >         >>  >   >   >>> src/hotspot/share/gc/shared/collectedHeap.hpp
> >             >         >>  >   >   >>>
> >             >         >>  >   >   >>> +   virtual bool run_par_heap_inspect_task(KlassInfoTable* cit,
> >             >         >>  >   >   >>> +                                          BoolObjectClosure* filter,
> >             >         >>  >   >   >>> +                                          size_t* missed_count,
> >             >         >>  >   >   >>> +                                          size_t thread_num) {
> >             >         >>  >   >   >>> +     return NULL;
> >             >         >>  >   >   >>>
> >             >         >>  >   >   >>> s/NULL/false/
> >             >         >>  >   >   >>>
> >             >         >>  >   >   >>> Cheers,
> >             >         >>  >   >   >>> David
> >             >         >>  >   >   >>>
> >             >         >>  >   >   >>> On 18/02/2020 2:15 pm, linzang(臧琳) wrote:
> >             >         >>  >   >   >>>> Dear All,
> >             >         >>  >   >   >>>>       May I ask your help to review the following changes:
> >             >         >>  >   >   >>>>       webrev:
> >             >         >>  >   >   >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
> >             >         >>  >   >   >>>>       bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> >             >         >>  >   >   >>>>       related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> >             >         >>  >   >   >>>>       This patch enables parallel heap inspection of G1 for jmap histo.
> >             >         >>  >   >   >>>>       My simple test showed it can speed up jmap -histo by 2x with
> >             >         >>  >   >   >>>> parallelThreadNum set to 2 for a heap at ~500M on a 4-core platform.
> >             >         >>  >   >   >>>>
> >             >         >>  >   >   >>>> ------------------------------------------------------------------------
> >             >         >>  >   >   >>>> BRs,
> >             >         >>  >   >   >>>> Lin

From linzang at tencent.com  Wed Aug  5 08:12:47 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Wed, 5 Aug 2020 08:12:47 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: 
References: <01181077-0DD7-4AEE-945B-5D6E21224A31@amazon.com>
 <5EC2DEE8-C1F0-4C88-9F90-A8A55BBE39FE@tencent.com>
 <5fe3c424-3e48-7fb5-e964-90d9b52fad92@oracle.com>
 <936c5cb9-3895-e518-c912-5e5cd8a71e66@oracle.com>
Message-ID: 

Hi Stefan,
  I got your point, thanks for the explanation. I missed the atomic part when considering it.
Hi Paul,
  Do you think it is ok to use uintx? I checked, and it is actually a uintptr_t type.

BRs,
Lin

On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote:

    On 2020-08-05 07:22, linzang(臧琳) wrote:
    > Hi Serguei,
    >
    > No problem, thanks for your reviewing :)
    >
    > I will upload a new webrev later, so may I ask your help to review it
    > again?
    >
    > Hi Stefan,
    >
    > As Paul mentioned, the _missed_count is not a size, so size_t may
    > not be clear. What's your opinion about uint64_t?

    We typically don't restrict the usage of size_t to only *sizes* in
    HotSpot. If you search the code you'll find many count variables using
    size_t, so I personally don't see the need to change the type.
    However, if you really do want to change it then maybe use another
    type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using
    uint64_t with some of the Atomic operations is problematic on some
    32-bit platforms, so using a type that matches the word size of the
    targeted machine saves you from having to think about that.

    >
    > It seems a uint overflow may happen on a 64-bit machine with a large
    > heap, e.g. there may be more than 4 Giga objects (8 byte header + 8 byte
    > klassptr + 8 byte field) in a heap that is larger than 96 GB. uint64_t
    > is ok in this case.

    Exactly.

    Thanks,
    StefanK

    >
    > BRs,
    >
    > Lin
    >
    > *From: *"serguei.spitsyn at oracle.com"
    > *Date: *Wednesday, August 5, 2020 at 1:02 PM
    > *To: *"linzang(臧琳)" , "Hohensee, Paul"
    > , Stefan Karlsson ,
    > David Holmes , serviceability-dev
    > , "hotspot-gc-dev at openjdk.java.net"
    >
    > *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for
    > jmap histo(G1)(Internet mail)
    >
    > Oh, sorry for the confusion, please, skip my question. :)
    > C++ does not have the '&&=' operator.
    >
    > Thanks,
    > Serguei
    >
    > On 8/4/20 21:56, serguei.spitsyn at oracle.com
    > wrote:
    >
    > Hi Lin,
    >
    > https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html
    >
    > +class KlassInfoTableMergeClosure : public KlassInfoClosure {
    > +private:
    > +  KlassInfoTable* _dest;
    > +  bool _success;
    > +public:
    > +  KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), _success(true) {}
    > +  void do_cinfo(KlassInfoEntry* cie) {
    > +    _success &= _dest->merge_entry(cie);
    > +  }
    >
    > The operator '&=' above looks strange.
    > Did you actually want to use the operator '&&=' instead? :
    >
    > +    _success &&= _dest->merge_entry(cie);
    >
    >
    > Thanks,
    > Serguei
    >
    >
    > On 8/3/20 07:51, linzang(臧琳) wrote:
    > > Dear Stefan,
    > > May I ask your help to review again? I have made a delta based on the last changeset
    > > you reviewed (webrev04); I hope it eases your reviewing work.
    > > webrev: https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/
    > > delta (vs webrev04): https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/
    > > bug: https://bugs.openjdk.java.net/browse/JDK-8215624
    > > CSR (approved): https://bugs.openjdk.java.net/browse/JDK-8239290
    > >
    > > BRs,
    > > Lin
    > >
    > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote:
    > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed
    > > the GC part before, it'd be great if you could ok the final version.
    > > Thanks,
    > > Paul
    > >
    > > On 7/29/20, 5:02 AM, "linzang(臧琳)" wrote:
    > > Uploaded a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/
    > > It fixes a failure on Windows:
    > > ####################################
    > > In heapInspection.cpp
    > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) {
    > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) {
    > > ####################################
    > > In heapInspection.hpp
    > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0);
    > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0);
    > > ####################################
    > > BRs,
    > > Lin
    > >
    > > On 2020/7/27, 11:26 AM, "linzang(臧琳)" wrote:
    > > I uploaded a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09
    > > It includes a tiny fix for a build failure on Windows:
    > > ####################################
    > > In attachListener.cpp:
    > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8);
    > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8);
    > > ####################################
    > > BRs,
    > > Lin
    > >
    > > On 2020/7/23, 11:56 AM, "linzang(臧琳)" wrote:
    > > Hi Paul,
    > > Thanks for your help, that all looks good to me.
    > > Just 2 minor changes:
    > > - delete the final return in ParHeapInspectTask::work; you mentioned it but it seems not to be included in the webrev :-)
    > > - delete an unnecessary blank line in heapInspection.cpp at merge_entry()
    > > #########################################################################
    > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800
    > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800
    > > @@ -251,7 +251,6 @@
    > >      _size_of_instances_in_words += cie->words();
    > >      return true;
    > >    }
    > > -
    > >    return false;
    > >  }
    > > @@ -568,7 +567,6 @@
    > >        Atomic::add(&_missed_count, missed_count);
    > >      } else {
    > >        Atomic::store(&_success, false);
    > > -      return;
    > >      }
    > >  }
    > > #########################################################################
    > > Here is the webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/
    > > BRs,
    > > Lin
    > > ---------------------------------------------
    > > From: "Hohensee, Paul"
    > > Date: Thursday, July 23, 2020 at 6:48 AM
    > > To: "linzang(臧琳)" , Stefan Karlsson , "serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev , "hotspot-gc-dev at openjdk.java.net"
    > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
    > > Just small things.
    > > heapInspection.cpp:
    > > In ParHeapInspectTask::work, remove the final return statement and fix the following "}" indent. I.e., replace
    > > +      Atomic::store(&_success, false);
    > > +      return;
    > > +    }
    > > with
    > > +      Atomic::store(&_success, false);
    > > +    }
    > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0.
    > > attachListener.cpp:
    > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint.
> > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. > > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > > As shown athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > > @@ -252,11 +252,11 @@ > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > outputStream* os = out; // if path not specified or path is NULL, use out > > fileStream* fs = NULL; > > const char* arg0 = op->arg(0); > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > return JNI_ERR; > > } > > ################################################### > > Thanks. > > BRs, > > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for reviewing! > > >> > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > >> > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > > +// Parallel heap inspection task. Parallel inspection can fail due to > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > > + } else { > > + return false; > > + } > > with > > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
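For readers following the merge_entry() and merge-closure comments above, here is a minimal standalone sketch of the pattern under discussion: a closure walks one table, merges each entry into a destination table, and accumulates the result with '&=' on a bool. It is only an illustration, not the HotSpot code. Table, Entry, MergeClosure, count_of() and the max_entries limit are hypothetical stand-ins (the limit simulates the native allocation failure that makes merge_entry() return false). Because '&=' does not short-circuit, later entries are still merged after an earlier one fails, and the flag stays false once it has been cleared.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical per-class totals; a stand-in, not HotSpot's KlassInfoEntry.
struct Entry {
  std::size_t count;
  std::size_t words;
};

// Hypothetical stand-in for a class histogram table: class name -> totals.
class Table {
 public:
  explicit Table(std::size_t max_entries) : _max_entries(max_entries) {}

  // Adds e to the entry for name. Returns false when a new entry would be
  // needed but the table is full, which models an allocation failure.
  bool merge_entry(const std::string& name, const Entry& e) {
    auto it = _entries.find(name);
    if (it == _entries.end()) {
      if (_entries.size() >= _max_entries) {
        return false;
      }
      it = _entries.emplace(name, Entry{0, 0}).first;
    }
    it->second.count += e.count;
    it->second.words += e.words;
    return true;
  }

  std::size_t count_of(const std::string& name) const {
    auto it = _entries.find(name);
    return it == _entries.end() ? 0 : it->second.count;
  }

  // Calls c.do_entry(name, entry) for every entry, in name order.
  template <typename Closure>
  void iterate(Closure& c) const {
    for (const auto& kv : _entries) {
      c.do_entry(kv.first, kv.second);
    }
  }

 private:
  std::size_t _max_entries;
  std::map<std::string, Entry> _entries;
};

// Merges every visited entry into dest and remembers whether all merges worked.
class MergeClosure {
 public:
  explicit MergeClosure(Table* dest) : _dest(dest), _success(true), _visited(0) {}

  void do_entry(const std::string& name, const Entry& e) {
    _visited++;
    // '&=' always evaluates merge_entry(); once _success is false it stays false.
    _success &= _dest->merge_entry(name, e);
  }

  bool success() const { return _success; }
  std::size_t visited() const { return _visited; }

 private:
  Table* _dest;
  bool _success;
  std::size_t _visited;
};
```

A short-circuiting form such as "_success = _success && merge_entry(...)" would stop merging after the first failure. Which behaviour is preferable depends on whether a partially merged table is still of use to the caller.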
> > I'd reorganize the code in populate_table() to make it clearer, viz. (I changed _shared_missed_count to _missed_count)
> > +  if (cit.allocation_failed()) {
> > +    // fail to allocate memory, stop parallel mode
> > +    Atomic::store(&_success, false);
> > +    return;
> > +  }
> > +  RecordInstanceClosure ric(&cit, _filter);
> > +  _poi->object_iterate(&ric, worker_id);
> > +  missed_count = ric.missed_count();
> > +  {
> > +    MutexLocker x(&_mutex);
> > +    merge_success = _shared_cit->merge(&cit);
> > +  }
> > +  if (merge_success) {
> > +    Atomic::add(&_missed_count, missed_count);
> > +  } else {
> > +    Atomic::store(&_success, false);
> > +  }
> > Thanks,
> > Paul
> > On 6/29/20, 7:20 PM, "linzang(臧琳)" wrote:
> > Dear All,
> > Sorry to bother you again. I just want to check whether this change is worth continuing to work on. If the decision is that it is not, I can drop this work and stop asking for help with reviewing...
> > Thanks for all your help with reviewing this previously.
> > BRs,
> > Lin
> > On 2020/5/9, 3:47 PM, "linzang(臧琳)" wrote:
> > Dear All,
> > May I ask your help again to review the latest change? Thanks!
> > BRs,
> > Lin
> > On 2020/4/28, 1:54 PM, "linzang(臧琳)" wrote:
> > Hi Stefan,
> > >> - Adding Atomic::load/store.
> > >> - Removing the time measurement in the run_task. I renamed G1's function
> > >> to run_task_timed. If we need this outside of G1, we can rethink the API
> > >> at that point.
> > >> - ZGC style cleanups
> > Thanks for revising the patch. The changes all look good to me, and I have made a tiny change on top of them:
> > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
> > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/
> > It reduces the scope of the mutex in ParHeapInspectTask and deletes unnecessary comments.
> > BRs,
> > Lin
> > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:
> > Hi Lin,
> > On 2020-04-26 05:10, linzang(臧琳) wrote:
> > > Hi Stefan and Paul,
> > > I have made a new patch based on your comments and Stefan's Poc code: > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > > others to review. > > > > > > And Here are main changed I made and want to discuss with you: > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > > primitives. We're moving toward a later C++ standard were data races are > > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > only accept AbstraceGangTask* as argument, so I made a delegate class > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > > - Removing the time measurement in the run_task. I renamed G1's function > > to run_task_timed. If we need this outside of G1, we can rethink the API > > at that point. > > - ZGC style cleanups > > Thanks, > > StefanK > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > Thanks, > > > Paul > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" linzang at tencent.com> wrote: > > > > > > Dear Stefan, > > > > > > Thanks a lot! 
I agree with you to decouple the heap inspection code with GC's. > > > I will start from your POC code, may discuss with you later. > > > > > > > > > BRs, > > > Lin > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > Hi Lin, > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > I started experimenting with doing that, but other higher-priority (to > > > me) tasks have had to take precedence. > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > ParallelObjectIterators. There's also code left unimplemented in around > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > pull out the heap inspection code out of the GCs. > > > > > > Thanks, > > > StefanK > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > Dear all, > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > Thanks! > > > > > > > > BRs, > > > > Lin > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > ?? 
> >> BRs, > > > >> Lin > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > >> > > > > >> > Dear all, > > > >> > Let me try to ease the reviewing work by some explanation :P > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > >> > This patch actually do several things: > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > >> > 2. 
Make a change in how Jmap passing arguments, changes inhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed athttps://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > > > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > > > >> > 5. Add related test. > > > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > > > >> > > > > >> > Hope these info could help on code review and initate the discussion :-) > > > >> > Thanks! > > > >> > > > > >> > BRs, > > > >> > Lin > > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > > > >> > > > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > > > >> > > please ignore the previous wrong post. sorry for troubles. 
> > > >> > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > -------------- > > > >> > > Lin > > > >> > > >Hi Lin, > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > >> > > >the message subject? > > > >> > > >It will be more trackable this way. > > > >> > > > > > > >> > > >Thanks, > > > >> > > >Serguei > > > >> > > > > > > >> > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > > >> > > >> Dear David, > > > >> > > >> Thanks a lot! > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > >> > > >> > > > >> > > >> Thanks, > > > >> > > >> -------------- > > > >> > > >> Lin > > > >> > > >>> Hi Lin, > > > >> > > >>> > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > > >> > > >>>
> > > >> > > >>> I happened to spot one nit when browsing:
> > > >> > > >>>
> > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp
> > > >> > > >>>
> > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit,
> > > >> > > >>> +                                        BoolObjectClosure* filter,
> > > >> > > >>> +                                        size_t* missed_count,
> > > >> > > >>> +                                        size_t thread_num) {
> > > >> > > >>> +   return NULL;
> > > >> > > >>>
> > > >> > > >>> s/NULL/false/
> > > >> > > >>>
> > > >> > > >>> Cheers,
> > > >> > > >>> David
> > > >> > > >>>
> > > >> > > >>> On 18/02/2020 2:15 pm, linzang(臧琳) wrote:
> > > >> > > >>>> Dear All,
> > > >> > > >>>> May I ask your help to review the following changes:
> > > >> > > >>>> webrev:
> > > >> > > >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
> > > >> > > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> > > >> > > >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> > > >> > > >>>> This patch enables parallel heap inspection of G1 for jmap histo.
> > > >> > > >>>> My simple test shows it can speed up jmap -histo by 2x with
> > > >> > > >>>> parallelThreadNum set to 2 for a heap of ~500M on a 4-core platform.
> > > >> > > >>>>
> > > >> > > >>>> ------------------------------------------------------------------------
> > > >> > > >>>> BRs,
> > > >> > > >>>> Lin

From hohensee at amazon.com Wed Aug 5 12:55:49 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 5 Aug 2020 12:55:49 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
Message-ID: <6133B284-A654-4EE1-B510-61A85944C1CB@amazon.com>

uintx is fine with me.

Thanks,
Paul

On 8/5/20, 1:14 AM, "linzang(臧琳)" wrote:

Hi Stefan,
I got your point, thanks for the explanation. I missed the atomic part when considering it.

Hi Paul,
Do you think it is OK to use uintx?
I checked it is actually a uintptr_t type. BRs, Lin On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote: On 2020-08-05 07:22, linzang(??) wrote: > Hi Serguei, > > No problem, Thanks for your reviewing :) > > I wll upload a new webrev later, so may I ask your help to review it > again? > > Hi Stefan, > > As Paul mentioned, the _/missed/_count is not a size, so size_t may > not be clear, what?s your opinion about uint64_t? We typically don't restrict the usage of size_t to only *sizes* in the HotSpot. If you search the code you'll find many count variables using size_t, so I personally don't see the need to change the type. However, if you really do want to change it then maybe using another type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using uint64_t and some of the Atomics operations are problematic on some 32-bit platforms, so using a type that matches the word size of the targetted machine helps you not having to think about that. > > It seems the uint overflow may happened on 64bit machine with large > heap, e.g. may be more than 4 Giga objects (8byte header + 8 byte > klassptr + 8byte field) in a heap that is larger than 96 GB, uint64_t > is ok in this case. Exactly. Thanks, StefanK > > BRs, > > Lin > > *From: *"serguei.spitsyn at oracle.com" > *Date: *Wednesday, August 5, 2020 at 1:02 PM > *To: *"linzang(??)" , "Hohensee, Paul" > , Stefan Karlsson , > David Holmes , serviceability-dev > , "hotspot-gc-dev at openjdk.java.net" > > *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for > jmap histo(G1)(Internet mail) > > Oh, sorry for the confusion, please, skip my question. :) > C++ does not have the '&&=' operator. 
> > Thanks, > Serguei > > On 8/4/20 21:56, serguei.spitsyn at oracle.com > wrote: > > Hi Lin, > > https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html > > +class KlassInfoTableMergeClosure : public KlassInfoClosure { > > +private: > > + KlassInfoTable* _dest; > > + bool _success; > > +public: > > + KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), > _success(true) {} > > + void do_cinfo(KlassInfoEntry* cie) { > > + _success &= _dest->merge_entry(cie); > > + } > > The operator '&=' above looks strange. > Did you actually want to use the operator '&&=' instead? : > > + _success &&= _dest->merge_entry(cie); > > > Thanks, > Serguei > > > > > On 8/3/20 07:51, linzang(??) wrote: > > Dear Stefan, > > May I ask your help to review again? I have made a delta based on the last changeset you have reviewed(webrev04),hope it could ease your reviewing work. > > webrev:https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > delta (vs webrev04):https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > > bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > CSR(approved):https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > BRs, > > Lin > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. 
> > Thanks, > > Paul > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > Upload a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > It fix an issue of windows fail : > > #################################### > > In heapInspect.cpp > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > #################################### > > In heapInspect.hpp > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > #################################### > > BRs, > > Lin > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > I update a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > > It includes a tiny fix of build failure on windows: > > #################################### > > In attachListener.cpp: > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > > #################################### > > BRs, > > Lin > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for your help, that all looks good to me. > > Just 2 minor changes: > > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > > ? 
delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > > @@ -251,7 +251,6 @@ > > _size_of_instances_in_words += cie->words(); > > return true; > > } > > - > > return false; > > } > > @@ -568,7 +567,6 @@ > > Atomic::add(&_missed_count, missed_count); > > } else { > > Atomic::store(&_success, false); > > - return; > > } > > } > > ######################################################################### > > Here is the webrevhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > > Lin > > --------------------------------------------- > > From: "Hohensee, Paul" > > Date: Thursday, July 23, 2020 at 6:48 AM > > To: "linzang(??)" , Stefan Karlsson ,"serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev ,"hotspot-gc-dev at openjdk.java.net" > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > > + return; > > + } > > with > > + Atomic::store(&_success, false); > > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. > > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. 
> > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > > As shown athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > > @@ -252,11 +252,11 @@ > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > outputStream* os = out; // if path not specified or path is NULL, use out > > fileStream* fs = NULL; > > const char* arg0 = op->arg(0); > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > return JNI_ERR; > > } > > ################################################### > > Thanks. > > BRs, > > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for reviewing! > > >> > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > >> > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > > +// Parallel heap inspection task. Parallel inspection can fail due to > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > > + } else { > > + return false; > > + } > > with > > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
> > I'd reorganize the code in populate_table() to make it more clear, viz. (I changed _shared_missed_count to _missed_count)
> > +  if (cit.allocation_failed()) {
> > +    // fail to allocate memory, stop parallel mode
> > +    Atomic::store(&_success, false);
> > +    return;
> > +  }
> > +  RecordInstanceClosure ric(&cit, _filter);
> > +  _poi->object_iterate(&ric, worker_id);
> > +  missed_count = ric.missed_count();
> > +  {
> > +    MutexLocker x(&_mutex);
> > +    merge_success = _shared_cit->merge(&cit);
> > +  }
> > +  if (merge_success) {
> > +    Atomic::add(&_missed_count, missed_count);
> > +  } else {
> > +    Atomic::store(&_success, false);
> > +  }
> > Thanks,
> > Paul
> > On 6/29/20, 7:20 PM, "linzang(臧琳)" wrote:
> > Dear All,
> > Sorry to bother you again. I just want to check whether this change is worth continuing to work on. If the decision is not to, I can drop this work and stop asking for review help...
> > Thanks for all your earlier help reviewing this.
> > BRs,
> > Lin
> > On 2020/5/9, 3:47 PM, "linzang(臧琳)" wrote:
> > Dear All,
> > May I ask for your help again to review the latest change? Thanks!
> > BRs,
> > Lin
> > On 2020/4/28, 1:54 PM, "linzang(臧琳)" wrote:
> > Hi Stefan,
> > >> - Adding Atomic::load/store.
> > >> - Removing the time measurement in the run_task. I renamed G1's function
> > >> to run_task_timed. If we need this outside of G1, we can rethink the API
> > >> at that point.
> > >> - ZGC style cleanups
> > Thanks for revising the patch, the changes all look good to me, and I have made a tiny change based on it:
> > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
> > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/
> > It reduces the scope of the mutex in ParHeapInspectTask and deletes unnecessary comments.
> > BRs,
> > Lin
> > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:
> > Hi Lin,
> > On 2020-04-26 05:10, linzang(臧琳) wrote:
> > > Hi Stefan and Paul,
> > > I have made a new patch based on your comments and Stefan's Poc code: > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > > others to review. > > > > > > And Here are main changed I made and want to discuss with you: > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > > primitives. We're moving toward a later C++ standard were data races are > > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > only accept AbstraceGangTask* as argument, so I made a delegate class > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > > - Removing the time measurement in the run_task. I renamed G1's function > > to run_task_timed. If we need this outside of G1, we can rethink the API > > at that point. > > - ZGC style cleanups > > Thanks, > > StefanK > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > Thanks, > > > Paul > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" linzang at tencent.com> wrote: > > > > > > Dear Stefan, > > > > > > Thanks a lot! 
I agree with you to decouple the heap inspection code with GC's. > > > I will start from your POC code, may discuss with you later. > > > > > > > > > BRs, > > > Lin > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > Hi Lin, > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > I started experimenting with doing that, but other higher-priority (to > > > me) tasks have had to take precedence. > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > ParallelObjectIterators. There's also code left unimplemented in around > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > pull out the heap inspection code out of the GCs. > > > > > > Thanks, > > > StefanK > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > Dear all, > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > Thanks! > > > > > > > > BRs, > > > > Lin > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > ?? 
> >> BRs, > > > >> Lin > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > >> > > > > >> > Dear all, > > > >> > Let me try to ease the reviewing work by some explanation :P > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > >> > This patch actually do several things: > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > >> > 2. 
Make a change in how Jmap passing arguments, changes inhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed athttps://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > > > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > > > >> > 5. Add related test. > > > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > > > >> > > > > >> > Hope these info could help on code review and initate the discussion :-) > > > >> > Thanks! > > > >> > > > > >> > BRs, > > > >> > Lin > > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > > > >> > > > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > > > >> > > please ignore the previous wrong post. sorry for troubles. 
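The per-worker table plus merge step described in points 3 and 4 above can be illustrated outside HotSpot. What follows is only a sketch under stated assumptions: it uses standard C++ threads rather than HotSpot's worker gangs, and `LocalTable`, `count_parallel` and the string keys are hypothetical stand-ins for KlassInfoTable and its klass entries, not names from the patch.

```cpp
#include <atomic>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-class instance-count table.
using LocalTable = std::map<std::string, std::size_t>;

// Counts occurrences of each key using num_workers threads.
// Each worker fills a private table, then merges it into the shared
// table while holding a mutex, so the counting loop itself takes no lock.
// Returns false if any worker reported a failure.
bool count_parallel(const std::vector<std::string>& items,
                    unsigned num_workers,
                    LocalTable* shared) {
  if (num_workers == 0) num_workers = 1;
  std::mutex merge_lock;
  std::atomic<bool> success{true};
  std::vector<std::thread> workers;

  for (unsigned id = 0; id < num_workers; ++id) {
    workers.emplace_back([&, id] {
      try {
        LocalTable local;
        // Strided partition: worker id handles items id, id+N, id+2N, ...
        for (std::size_t i = id; i < items.size(); i += num_workers) {
          ++local[items[i]];
        }
        // Only the merge is serialized.
        std::lock_guard<std::mutex> guard(merge_lock);
        for (const auto& kv : local) {
          (*shared)[kv.first] += kv.second;
        }
      } catch (...) {
        // For example an allocation failure: flag it atomically so the
        // caller can fall back to a serial pass.
        success.store(false);
      }
    });
  }
  for (auto& t : workers) t.join();
  return success.load();
}
```

In this sketch a `false` return means the shared table may be incomplete, so a caller would discard it and recount on a single thread, which mirrors the serial fallback discussed in the thread.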
> > > >> > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > -------------- > > > >> > > Lin > > > >> > > >Hi Lin, > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > >> > > >the message subject? > > > >> > > >It will be more trackable this way. > > > >> > > > > > > >> > > >Thanks, > > > >> > > >Serguei > > > >> > > > > > > >> > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > > >> > > >> Dear David, > > > >> > > >> Thanks a lot! > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > >> > > >> > > > >> > > >> Thanks, > > > >> > > >> -------------- > > > >> > > >> Lin > > > >> > > >>> Hi Lin, > > > >> > > >>> > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > > >> > > >>>
> > > >> > > >>> I happened to spot one nit when browsing:
> > > >> > > >>>
> > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp
> > > >> > > >>>
> > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit,
> > > >> > > >>> +                                        BoolObjectClosure* filter,
> > > >> > > >>> +                                        size_t* missed_count,
> > > >> > > >>> +                                        size_t thread_num) {
> > > >> > > >>> +   return NULL;
> > > >> > > >>>
> > > >> > > >>> s/NULL/false/
> > > >> > > >>>
> > > >> > > >>> Cheers,
> > > >> > > >>> David
> > > >> > > >>>
> > > >> > > >>> On 18/02/2020 2:15 pm, linzang(臧琳) wrote:
> > > >> > > >>>> Dear All,
> > > >> > > >>>> May I ask your help to review the following changes:
> > > >> > > >>>> webrev:
> > > >> > > >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
> > > >> > > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> > > >> > > >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> > > >> > > >>>> This patch enables parallel heap inspection of G1 for jmap histo.
> > > >> > > >>>> My simple test showed about a 2x speedup of jmap -histo with
> > > >> > > >>>> parallelThreadNum set to 2 for a heap at ~500M on a 4-core platform.
> > > >> > > >>>>
> > > >> > > >>>> ------------------------------------------------------------------------
> > > >> > > >>>> BRs,
> > > >> > > >>>> Lin

From linzang at tencent.com Wed Aug 5 13:45:03 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Wed, 5 Aug 2020 13:45:03 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <6133B284-A654-4EE1-B510-61A85944C1CB@amazon.com>
References: <6133B284-A654-4EE1-B510-61A85944C1CB@amazon.com>
Message-ID: <49147493-03F7-4691-B726-EF03B536472F@tencent.com>

Hi Paul, Stefan and Serguei,

I have uploaded a new changeset; could you please help review it again?
Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/

P.S. I am in the process of building it in a Windows environment as a double check, and may update with the result later.

Thanks!

BRs,
Lin

On 2020/8/5, 8:56 PM, "Hohensee, Paul" wrote:

uintx is fine with me.

Thanks,
Paul

On 8/5/20, 1:14 AM, "linzang(臧琳)" wrote:

Hi Stefan,

I got your point, thanks for the explanation. I missed the atomic part when considering it.

Hi Paul,

Do you think it is ok to use uintx? I checked, and it is actually a uintptr_t type.

BRs,
Lin

On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote:

On 2020-08-05 07:22, linzang(臧琳) wrote:
> Hi Serguei,
>
> No problem, thanks for your reviewing :)
>
> I will upload a new webrev later, so may I ask your help to review it
> again?
>
> Hi Stefan,
>
> As Paul mentioned, the _missed_count is not a size, so size_t may
> not be clear, what's your opinion about uint64_t?

We typically don't restrict the usage of size_t to only *sizes* in HotSpot. If you search the code you'll find many count variables using size_t, so I personally don't see the need to change the type. However, if you really do want to change it, then maybe use another type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using uint64_t with some of the Atomic operations is problematic on some 32-bit platforms, so using a type that matches the word size of the targeted machine saves you from having to think about that.

>
> It seems the uint overflow may happen on a 64-bit machine with a large
> heap, e.g. there may be more than 4 Giga objects (8-byte header + 8-byte
> klassptr + 8-byte field) in a heap that is larger than 96 GB, uint64_t
> is ok in this case.

Exactly.
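The arithmetic behind the overflow concern above can be checked with a small standalone sketch. It is plain C++, not HotSpot code; the 24-byte object size simply follows the example layout mentioned in the thread (8-byte header, 8-byte klass pointer, one 8-byte field), and all names below are hypothetical.

```cpp
#include <cstdint>

// Assumed minimal object size from the thread's example:
// 8-byte header + 8-byte klass pointer + one 8-byte field.
constexpr std::uint64_t kAssumedObjectBytes = 8 + 8 + 8;

// Number of distinct values a 32-bit unsigned counter can hold (2^32).
constexpr std::uint64_t kUint32Values = std::uint64_t{1} << 32;

// Heap size in bytes at which the number of such objects reaches 2^32,
// i.e. the point where a 32-bit counter would wrap around.
constexpr std::uint64_t heap_bytes_at_uint32_wrap() {
  return kUint32Values * kAssumedObjectBytes;
}

// Shows what a 32-bit counter would report for a given true count.
constexpr std::uint32_t as_uint32_counter(std::uint64_t true_count) {
  return static_cast<std::uint32_t>(true_count);  // wraps modulo 2^32
}
```

Under these assumptions the wrap point is 2^32 × 24 bytes = 96 GiB, which matches the figure quoted above; a word-sized counter (64 bits on a 64-bit VM) does not wrap at that heap size.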
Thanks, StefanK > > BRs, > > Lin > > *From: *"serguei.spitsyn at oracle.com" > *Date: *Wednesday, August 5, 2020 at 1:02 PM > *To: *"linzang(??)" , "Hohensee, Paul" > , Stefan Karlsson , > David Holmes , serviceability-dev > , "hotspot-gc-dev at openjdk.java.net" > > *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for > jmap histo(G1)(Internet mail) > > Oh, sorry for the confusion, please, skip my question. :) > C++ does not have the '&&=' operator. > > Thanks, > Serguei > > On 8/4/20 21:56, serguei.spitsyn at oracle.com > wrote: > > Hi Lin, > > https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html > > +class KlassInfoTableMergeClosure : public KlassInfoClosure { > > +private: > > + KlassInfoTable* _dest; > > + bool _success; > > +public: > > + KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), > _success(true) {} > > + void do_cinfo(KlassInfoEntry* cie) { > > + _success &= _dest->merge_entry(cie); > > + } > > The operator '&=' above looks strange. > Did you actually want to use the operator '&&=' instead? : > > + _success &&= _dest->merge_entry(cie); > > > Thanks, > Serguei > > > > > On 8/3/20 07:51, linzang(??) wrote: > > Dear Stefan, > > May I ask your help to review again? I have made a delta based on the last changeset you have reviewed(webrev04),hope it could ease your reviewing work. > > webrev:https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > delta (vs webrev04):https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > > bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > CSR(approved):https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > BRs, > > Lin > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. 
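On the `&=` question above: for `bool` operands, `a &= b` is well defined in C++ (both sides are promoted to `int` and the result converts back to `bool`), and C++ has no `&&=` operator. One practical difference from `a = a && b` is that `&=` does not short-circuit, so the right-hand call still runs after an earlier failure. A minimal standalone sketch, using a hypothetical `merge_entry_stub` rather than the real KlassInfoTable code:

```cpp
// Counts how often the stand-in merge function was called.
static int g_calls = 0;

// Hypothetical stand-in for a per-entry merge that can fail.
static bool merge_entry_stub(bool ok) {
  ++g_calls;
  return ok;
}

// Accumulate with '&=': every entry is still visited after a failure.
static bool merge_all_bitand(const bool* results, int n) {
  bool success = true;
  for (int i = 0; i < n; ++i) {
    success &= merge_entry_stub(results[i]);
  }
  return success;
}

// Accumulate with '&&': entries after the first failure are skipped.
static bool merge_all_shortcircuit(const bool* results, int n) {
  bool success = true;
  for (int i = 0; i < n; ++i) {
    success = success && merge_entry_stub(results[i]);
  }
  return success;
}
```

Both variants report the same overall result; which one is preferable depends on whether the remaining entries should still be processed once one merge has failed.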
> > Thanks, > > Paul > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > Upload a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > It fix an issue of windows fail : > > #################################### > > In heapInspect.cpp > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > #################################### > > In heapInspect.hpp > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > #################################### > > BRs, > > Lin > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > I update a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > > It includes a tiny fix of build failure on windows: > > #################################### > > In attachListener.cpp: > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > > #################################### > > BRs, > > Lin > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for your help, that all looks good to me. > > Just 2 minor changes: > > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > > ? 
delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > > @@ -251,7 +251,6 @@ > > _size_of_instances_in_words += cie->words(); > > return true; > > } > > - > > return false; > > } > > @@ -568,7 +567,6 @@ > > Atomic::add(&_missed_count, missed_count); > > } else { > > Atomic::store(&_success, false); > > - return; > > } > > } > > ######################################################################### > > Here is the webrevhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > > Lin > > --------------------------------------------- > > From: "Hohensee, Paul" > > Date: Thursday, July 23, 2020 at 6:48 AM > > To: "linzang(??)" , Stefan Karlsson ,"serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev ,"hotspot-gc-dev at openjdk.java.net" > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > > + return; > > + } > > with > > + Atomic::store(&_success, false); > > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. > > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. 
> > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > > As shown athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > > @@ -252,11 +252,11 @@ > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > outputStream* os = out; // if path not specified or path is NULL, use out > > fileStream* fs = NULL; > > const char* arg0 = op->arg(0); > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > return JNI_ERR; > > } > > ################################################### > > Thanks. > > BRs, > > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for reviewing! > > >> > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > >> > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > > +// Parallel heap inspection task. Parallel inspection can fail due to > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > > + } else { > > + return false; > > + } > > with > > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
> > I'd reorganize the code in populate_table() to make it more clear, viz. (I changed _shared_missed_count to _missed_count)
> > +  if (cit.allocation_failed()) {
> > +    // fail to allocate memory, stop parallel mode
> > +    Atomic::store(&_success, false);
> > +    return;
> > +  }
> > +  RecordInstanceClosure ric(&cit, _filter);
> > +  _poi->object_iterate(&ric, worker_id);
> > +  missed_count = ric.missed_count();
> > +  {
> > +    MutexLocker x(&_mutex);
> > +    merge_success = _shared_cit->merge(&cit);
> > +  }
> > +  if (merge_success) {
> > +    Atomic::add(&_missed_count, missed_count);
> > +  } else {
> > +    Atomic::store(&_success, false);
> > +  }
> > Thanks,
> > Paul
> > On 6/29/20, 7:20 PM, "linzang(臧琳)" wrote:
> > Dear All,
> > Sorry to bother you again. I just want to check whether this change is worth continuing to work on. If the decision is not to, I can drop this work and stop asking for review help...
> > Thanks for all your earlier help reviewing this.
> > BRs,
> > Lin
> > On 2020/5/9, 3:47 PM, "linzang(臧琳)" wrote:
> > Dear All,
> > May I ask for your help again to review the latest change? Thanks!
> > BRs,
> > Lin
> > On 2020/4/28, 1:54 PM, "linzang(臧琳)" wrote:
> > Hi Stefan,
> > >> - Adding Atomic::load/store.
> > >> - Removing the time measurement in the run_task. I renamed G1's function
> > >> to run_task_timed. If we need this outside of G1, we can rethink the API
> > >> at that point.
> > >> - ZGC style cleanups
> > Thanks for revising the patch, the changes all look good to me, and I have made a tiny change based on it:
> > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
> > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/
> > It reduces the scope of the mutex in ParHeapInspectTask and deletes unnecessary comments.
> > BRs,
> > Lin
> > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:
> > Hi Lin,
> > On 2020-04-26 05:10, linzang(臧琳) wrote:
> > > Hi Stefan and Paul,
> > > I have made a new patch based on your comments and Stefan's Poc code: > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > > others to review. > > > > > > And Here are main changed I made and want to discuss with you: > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > > primitives. We're moving toward a later C++ standard were data races are > > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > only accept AbstraceGangTask* as argument, so I made a delegate class > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > > - Removing the time measurement in the run_task. I renamed G1's function > > to run_task_timed. If we need this outside of G1, we can rethink the API > > at that point. > > - ZGC style cleanups > > Thanks, > > StefanK > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > Thanks, > > > Paul > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" linzang at tencent.com> wrote: > > > > > > Dear Stefan, > > > > > > Thanks a lot! 
I agree with you to decouple the heap inspection code with GC's. > > > I will start from your POC code, may discuss with you later. > > > > > > > > > BRs, > > > Lin > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > Hi Lin, > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > I started experimenting with doing that, but other higher-priority (to > > > me) tasks have had to take precedence. > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > ParallelObjectIterators. There's also code left unimplemented in around > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > pull out the heap inspection code out of the GCs. > > > > > > Thanks, > > > StefanK > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > Dear all, > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > Thanks! > > > > > > > > BRs, > > > > Lin > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > ?? 
> >> BRs, > > > >> Lin > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > >> > > > > >> > Dear all, > > > >> > Let me try to ease the reviewing work by some explanation :P > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > >> > This patch actually do several things: > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > >> > 2. 
Make a change in how Jmap passing arguments, changes inhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed athttps://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > > > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > > > >> > 5. Add related test. > > > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > > > >> > > > > >> > Hope these info could help on code review and initate the discussion :-) > > > >> > Thanks! > > > >> > > > > >> > BRs, > > > >> > Lin > > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > > > >> > > > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > > > >> > > please ignore the previous wrong post. sorry for troubles. 
> > > >> > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > -------------- > > > >> > > Lin > > > >> > > >Hi Lin, > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > >> > > >the message subject? > > > >> > > >It will be more trackable this way. > > > >> > > > > > > >> > > >Thanks, > > > >> > > >Serguei > > > >> > > > > > > >> > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > > >> > > >> Dear David, > > > >> > > >> Thanks a lot! > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > >> > > >> > > > >> > > >> Thanks, > > > >> > > >> -------------- > > > >> > > >> Lin > > > >> > > >>> Hi Lin, > > > >> > > >>> > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > > >> > > >>> > > > >> > > >>> I happened to spot one nit when browsing: > > > >> > > >>> > > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > > >> > > >>> > > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > > >> > > >>> + BoolObjectClosure* filter, > > > >> > > >>> + size_t* missed_count, > > > >> > > >>> + size_t thread_num) { > > > >> > > >>> + return NULL; > > > >> > > >>> > > > >> > > >>> s/NULL/false/ > > > >> > > >>> > > > >> > > >>> Cheers, > > > >> > > >>> David > > > > > > > >>> > > > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > > > >> > > >>>> Dear All, > > > >> > > >>>> May I ask your help to review the follow changes: > > > >> > > >>>> webrev: > > > >> > > >>>>http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > > >> > > >>>> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > >>>> related CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. 
> > > >> > > >>>>
> > > >> > > >>>> ------------------------------------------------------------------------
> > > >> > > >>>> BRs,
> > > >> > > >>>> Lin

From chris.plummer at oracle.com Wed Aug 5 19:47:23 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 5 Aug 2020 12:47:23 -0700
Subject: Fwd: Re: RFR(XS): 8248879: SA core file support on OSX has some bugs trying to locate the jvm libraries
In-Reply-To: <5fc88753-3d07-7dc2-6925-ee954a938971@oracle.com>
References: <7fc9666b-ce1e-f83a-e7a7-e052ca451a09@oracle.com> <625675d7-541b-7641-7306-91807fb7487c@oracle.com> <5e9ed560-32d2-33eb-8cbc-887bb97e90ca@oracle.com> <396ebbbd-42c0-2784-2426-01ccc5ead589@oracle.com> <431f7e78-a738-f1c3-8fc3-20e507b65b98@oracle.com> <5fc88753-3d07-7dc2-6925-ee954a938971@oracle.com>
Message-ID: <276014d9-c0c0-1fe8-8bd6-ac33262b5b8e@oracle.com>

An HTML attachment was scrubbed...
URL:

From hohensee at amazon.com Wed Aug 5 20:16:40 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 5 Aug 2020 20:16:40 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
Message-ID: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com>

Two tiny nits that don't need a new webrev:

In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().

In heapInspection.hpp, you can delete two of the three blank lines before

#endif // SHARE_MEMORY_HEAPINSPECTION_HPP

Thanks,
Paul

On 8/5/20, 6:46 AM, "linzang(??)" wrote:

Hi Paul, Stefan and Serguei,

Here I uploaded a new changeset, would you like to help review again?

Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/

P.S. I am in process of building it on windows environment for a double check. May update result later.

Thanks!
BRs,
Lin

On 2020/8/5, 8:56 PM, "Hohensee, Paul" wrote:

uintx is fine with me.

Thanks,
Paul

On 8/5/20, 1:14 AM, "linzang(??)" wrote:

Hi Stefan,
I got your point, thanks for the explanation. I missed the atomic part when considering it.

Hi Paul,
Do you think it is ok to use uintx? I checked, and it is actually a uintptr_t type.

BRs,
Lin

On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote:

On 2020-08-05 07:22, linzang(??) wrote:
> Hi Serguei,
>
> No problem, thanks for your review :)
>
> I will upload a new webrev later, so may I ask your help to review it
> again?
>
> Hi Stefan,
>
> As Paul mentioned, _missed_count is not a size, so size_t may
> not be clear. What's your opinion about uint64_t?

We typically don't restrict the usage of size_t to only *sizes* in HotSpot. If you search the code you'll find many count variables using size_t, so I personally don't see the need to change the type. However, if you really do want to change it, then maybe use another type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using uint64_t with some of the Atomic operations is problematic on some 32-bit platforms, so using a type that matches the word size of the targeted machine saves you from having to think about that.

> It seems uint overflow may happen on a 64-bit machine with a large
> heap, e.g. there may be more than 4 Giga objects (8-byte header + 8-byte
> klassptr + 8-byte field) in a heap that is larger than 96 GB. uint64_t
> is ok in this case.

Exactly.

Thanks,
StefanK

>
> BRs,
>
> Lin
>
> *From: *"serguei.spitsyn at oracle.com"
> *Date: *Wednesday, August 5, 2020 at 1:02 PM
> *To: *"linzang(??)", "Hohensee, Paul", Stefan Karlsson,
> David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
> *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for
> jmap histo(G1)(Internet mail)
>
> Oh, sorry for the confusion, please, skip my question. :)
> C++ does not have the '&&=' operator.
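For readers puzzling over the same point: the `_success &= _dest->merge_entry(cie);` idiom quoted in this thread works on `bool` operands because `&=` is the ordinary bitwise-AND assignment, and a `bool` participates as 0 or 1. The practical difference from a `&&` form is that `&=` does not short-circuit, so the right-hand call always runs. A minimal sketch follows; `Merger`, `merge_all` and `merge_until_failure` are hypothetical names used only for illustration and are not part of the patch under review.

```cpp
#include <vector>

// Hypothetical stand-in for a table whose merge_entry() can fail.
// It counts calls so the two accumulation styles can be compared.
struct Merger {
    int calls = 0;
    bool merge_entry(bool ok) { ++calls; return ok; }
};

// Accumulate with `&=`: every entry is visited, and the result
// becomes (and stays) false once any merge fails.
bool merge_all(Merger& m, const std::vector<bool>& entries) {
    bool success = true;
    for (bool e : entries) {
        success &= m.merge_entry(e);
    }
    return success;
}

// Short-circuit variant: `&&` skips merge_entry() after the first failure.
bool merge_until_failure(Merger& m, const std::vector<bool>& entries) {
    bool success = true;
    for (bool e : entries) {
        success = success && m.merge_entry(e);
    }
    return success;
}
```

With the entries `{true, false, true}`, both functions return false, but `merge_all` calls `merge_entry()` three times while `merge_until_failure` calls it twice.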
> > Thanks, > Serguei > > On 8/4/20 21:56, serguei.spitsyn at oracle.com > wrote: > > Hi Lin, > > https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html > > +class KlassInfoTableMergeClosure : public KlassInfoClosure { > > +private: > > + KlassInfoTable* _dest; > > + bool _success; > > +public: > > + KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), > _success(true) {} > > + void do_cinfo(KlassInfoEntry* cie) { > > + _success &= _dest->merge_entry(cie); > > + } > > The operator '&=' above looks strange. > Did you actually want to use the operator '&&=' instead? : > > + _success &&= _dest->merge_entry(cie); > > > Thanks, > Serguei > > > > > On 8/3/20 07:51, linzang(??) wrote: > > Dear Stefan, > > May I ask your help to review again? I have made a delta based on the last changeset you have reviewed(webrev04),hope it could ease your reviewing work. > > webrev:https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > delta (vs webrev04):https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > > bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > CSR(approved):https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > BRs, > > Lin > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. 
> > Thanks, > > Paul > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > Upload a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > It fix an issue of windows fail : > > #################################### > > In heapInspect.cpp > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > #################################### > > In heapInspect.hpp > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > #################################### > > BRs, > > Lin > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > I update a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > > It includes a tiny fix of build failure on windows: > > #################################### > > In attachListener.cpp: > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > > #################################### > > BRs, > > Lin > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for your help, that all looks good to me. > > Just 2 minor changes: > > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > > ? 
delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > > @@ -251,7 +251,6 @@ > > _size_of_instances_in_words += cie->words(); > > return true; > > } > > - > > return false; > > } > > @@ -568,7 +567,6 @@ > > Atomic::add(&_missed_count, missed_count); > > } else { > > Atomic::store(&_success, false); > > - return; > > } > > } > > ######################################################################### > > Here is the webrevhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > > Lin > > --------------------------------------------- > > From: "Hohensee, Paul" > > Date: Thursday, July 23, 2020 at 6:48 AM > > To: "linzang(??)" , Stefan Karlsson ,"serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev ,"hotspot-gc-dev at openjdk.java.net" > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > > + return; > > + } > > with > > + Atomic::store(&_success, false); > > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. > > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. 
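The two casts suggested for attachListener.cpp can be illustrated outside HotSpot as follows. This is a sketch under assumptions: `default_parallel_threads` and `narrow_thread_count` are hypothetical helper names, a plain `int` argument (assumed positive) stands in for the processor-count query, `std::uintptr_t` stands in for `uintx`, and the clamping in the second helper is an addition of this sketch rather than something the review asks for.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Default worker count: three eighths of the active processors, never less
// than one. The int input is cast to unsigned before the arithmetic so that
// both arguments of std::max have the same type.
unsigned default_parallel_threads(int active_processors) {
    unsigned n = static_cast<unsigned>(active_processors);
    return std::max(1u, n * 3 / 8);
}

// Narrow a word-sized parsed value to unsigned. Clamping to the largest
// unsigned value is an assumption of this sketch; a bare cast would
// silently truncate on 64-bit platforms.
unsigned narrow_thread_count(std::uintptr_t parsed) {
    const std::uintptr_t limit = std::numeric_limits<unsigned>::max();
    return static_cast<unsigned>(std::min(parsed, limit));
}
```

For example, 8 active processors give a default of 3 workers, and 2 active processors give 1, because the integer result 0 is raised to the minimum.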
> > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > > As shown athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > > @@ -252,11 +252,11 @@ > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > outputStream* os = out; // if path not specified or path is NULL, use out > > fileStream* fs = NULL; > > const char* arg0 = op->arg(0); > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > return JNI_ERR; > > } > > ################################################### > > Thanks. > > BRs, > > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for reviewing! > > >> > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > >> > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > > +// Parallel heap inspection task. Parallel inspection can fail due to > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > > + } else { > > + return false; > > + } > > with > > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
> > I'd reorganize the code in populate_table() to make it more clear, vis (I changed _shared_missed_count to _missed_count) > > + if (cit.allocation_failed()) { > > + // fail to allocate memory, stop parallel mode > > + Atomic::store(&_success, false); > > + return; > > + } > > + RecordInstanceClosure ric(&cit, _filter); > > + _poi->object_iterate(&ric, worker_id); > > + missed_count = ric.missed_count(); > > + { > > + MutexLocker x(&_mutex); > > + merge_success = _shared_cit->merge(&cit); > > + } > > + if (merge_success) { > > + Atomic::add(&_missed_count, missed_count); > > + else { > > + Atomic::store(&_success, false); > > + } > > Thanks, > > Paul > > On 6/29/20, 7:20 PM, "linzang(??)" wrote: > > Dear All, > > Sorry to bother again, I just want to make sure that is this change worth to be continue to work on? If decision is made to not. I think I can drop this work and stop asking for help reviewing... > > Thanks for all your help about reviewing this previously. > > BRs, > > Lin > > On 2020/5/9, 3:47 PM, "linzang(??)" wrote: > > Dear All, > > May I ask your help again for review the latest change? Thanks! > > BRs, > > Lin > > On 2020/4/28, 1:54 PM, "linzang(??)" wrote: > > Hi Stefan, > > >> - Adding Atomic::load/store. > > >> - Removing the time measurement in the run_task. I renamed G1's function > > >> to run_task_timed. If we need this outside of G1, we can rethink the API > > >> at that point. > > >> - ZGC style cleanups > > Thanks for revising the patch, they are all good to me, and I have made a tiny change based on it: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/ > > it reduce the scope of mutex in ParHeapInspectTask, and delete unnecessary comments. > > BRs, > > Lin > > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote: > > Hi Lin, > > On 2020-04-26 05:10, linzang(??) wrote: > > > Hi Stefan and Paul? 
> > > I have made a new patch based on your comments and Stefan's Poc code: > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > > others to review. > > > > > > And Here are main changed I made and want to discuss with you: > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > > primitives. We're moving toward a later C++ standard were data races are > > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > only accept AbstraceGangTask* as argument, so I made a delegate class > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > > - Removing the time measurement in the run_task. I renamed G1's function > > to run_task_timed. If we need this outside of G1, we can rethink the API > > at that point. > > - ZGC style cleanups > > Thanks, > > StefanK > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > Thanks, > > > Paul > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" linzang at tencent.com> wrote: > > > > > > Dear Stefan, > > > > > > Thanks a lot! 
I agree with you to decouple the heap inspection code with GC's. > > > I will start from your POC code, may discuss with you later. > > > > > > > > > BRs, > > > Lin > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > Hi Lin, > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > I started experimenting with doing that, but other higher-priority (to > > > me) tasks have had to take precedence. > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > ParallelObjectIterators. There's also code left unimplemented in around > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > pull out the heap inspection code out of the GCs. > > > > > > Thanks, > > > StefanK > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > Dear all, > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > Thanks! > > > > > > > > BRs, > > > > Lin > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > ?? 
> >> BRs, > > > >> Lin > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > >> > > > > >> > Dear all, > > > >> > Let me try to ease the reviewing work by some explanation :P > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > >> > This patch actually do several things: > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > >> > 2. 
Make a change in how Jmap passing arguments, changes inhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed athttps://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > > > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > > > >> > 5. Add related test. > > > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > > > >> > > > > >> > Hope these info could help on code review and initate the discussion :-) > > > >> > Thanks! > > > >> > > > > >> > BRs, > > > >> > Lin > > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > > > >> > > > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > > > >> > > please ignore the previous wrong post. sorry for troubles. 
> > > >> > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > -------------- > > > >> > > Lin > > > >> > > >Hi Lin, > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > >> > > >the message subject? > > > >> > > >It will be more trackable this way. > > > >> > > > > > > >> > > >Thanks, > > > >> > > >Serguei > > > >> > > > > > > >> > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > > >> > > >> Dear David, > > > >> > > >> Thanks a lot! > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > >> > > >> > > > >> > > >> Thanks, > > > >> > > >> -------------- > > > >> > > >> Lin > > > >> > > >>> Hi Lin, > > > >> > > >>> > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > > >> > > >>> > > > >> > > >>> I happened to spot one nit when browsing: > > > >> > > >>> > > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > > >> > > >>> > > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > > >> > > >>> + BoolObjectClosure* filter, > > > >> > > >>> + size_t* missed_count, > > > >> > > >>> + size_t thread_num) { > > > >> > > >>> + return NULL; > > > >> > > >>> > > > >> > > >>> s/NULL/false/ > > > >> > > >>> > > > >> > > >>> Cheers, > > > >> > > >>> David > > > > > > > >>> > > > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > > > >> > > >>>> Dear All, > > > >> > > >>>> May I ask your help to review the follow changes: > > > >> > > >>>> webrev: > > > >> > > >>>>http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > > >> > > >>>> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > >>>> related CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. 
> > > >> > > >>>> > > > >> > > >>>> ------------------------------------------------------------------------ > > > >> > > >>>> BRs, > > > >> > > >>>> Lin > > > >> > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From alexey.menkov at oracle.com Wed Aug 5 21:08:13 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 5 Aug 2020 14:08:13 -0700 Subject: Ping: RFR: JDK-8249550: jdb should use loopback address when not using remote agent In-Reply-To: References: <541a2f57-4b86-b453-7739-f1de35b52212@oracle.com> <5b200434-6181-1478-6423-70cd9150ce56@oracle.com> <467c37ea-3146-aec2-2fc5-94153acd5acc@oracle.com> Message-ID: <1f47f637-9bfa-a682-fddc-4b434ef1a858@oracle.com> Hi Serguei, Original issue is about jdb with CommandLineLaunch connector on non-Windows systems tries to resolve hostname and start to listening on it. This behavior can cause error (this is what the bug about) and does not makes sense and CommandLineLaunch connector launches local process, so there is no sense to listen for connections from other hosts. SocketTransportService is also used for jdb "listen" and "listenany" commands, but this commands require address to be specified, so startListening(String) is called, not startListening(). So the only behavior change is a debuggee started by CommandLineLaunch connector accepts debuggers only from local machine, but this is how jdb works (it starts local process and connects to it). I don't think this requires CSR. --alex On 08/04/2020 17:22, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > This looks good to me. > But do we need a CSR for this? > I understand that the intention is to comply with the TransportService > spec but the behavior is being changed. > How long did we have this behavior? > > Thanks, > Serguei > > > On 8/4/20 16:32, Alex Menkov wrote: >> Needs one more reviewer. >> >> One more details to simplify review. 
>> SocketTransportService extends TransportService and spec for
>> TransportService.startListening() is:
>>
>>     /**
>>      * Listens on an address chosen by the transport service.
>>      *
>>      * This convenience method works as if by invoking
>>      * {@link #startListening(String) startListening(null)}.
>>      *
>>      * @return  a listen key to be used in subsequent calls to be
>>      *          {@link #accept accept} or {@link #stopListening
>>      *          stopListening} methods.
>>      *
>>      * @throws  IOException
>>      *          If an I/O error occurs.
>>      */
>>     public abstract ListenKey startListening() throws IOException;
>>
>> I.e. the fix updates SocketTransportService to comply with the spec.
>>
>> --alex
>>
>> On 07/23/2020 13:05, Chris Plummer wrote:
>>> Hi Alex,
>>>
>>> I'm no expert in this area, but the changes appear to do what you
>>> describe (use the loopback address), so thumbs up.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 7/21/20 3:04 PM, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> please review the fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8249550
>>>>
>>>> webrev:
>>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_loopback/webrev/
>>>>
>>>> some background:
>>>> https://bugs.openjdk.java.net/browse/JDK-8041435 made default
>>>> listening on loopback address.
>>>> Later https://bugs.openjdk.java.net/browse/JDK-8184770 added
>>>> handling of "*" address to listen on all addresses, but it didn't
>>>> fix "default" startListening() method (used by jdb through
>>>> SunCommandLineLauncher).
>>>>
>>>> The method called startListening(String localaddress, int port) with
>>>> localaddress == null, but this method for null localaddress starts
>>>> listening on all addresses (i.e. handle null value as "*").
>>>> The fix changes it to startListening(String address) which handles
>>>> null address the same way as JDI socket connector does (i.e.
listens
>>>> on loopback address only)
>>>>
>>>> --alex
>>>
>

From chris.plummer at oracle.com Thu Aug 6 01:16:55 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 5 Aug 2020 18:16:55 -0700
Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS
Message-ID:

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8251121
http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html

On OSX (and possibly some Linux systems), core files are not produced in the cwd, but instead end up in some well-known location. For OSX it is the /cores directory. The core files tend to accumulate there. This fixes the core file accumulation problem by moving the core file into the cwd, allowing jtreg to manage it. By default jtreg will delete the core if the test passes, and retain it if the test fails or RETAIN=all is specified.

I got rid of the code in ClhsdbCDSCore.java that explicitly deletes the core file because we don't want it deleted if RETAIN=all is used.

thanks,

Chris

From philip.race at oracle.com Thu Aug 6 01:46:59 2020
From: philip.race at oracle.com (Philip Race)
Date: Wed, 05 Aug 2020 18:46:59 -0700
Subject: RFR: 8240487 : Cleanup whitespace in .cc, .hh, .m, and .mm files
Message-ID: <5F2B6113.5030909@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8240487
Webrev: http://cr.openjdk.java.net/~prr/8240487/

In advance of the move to Project Skara/git it is desirable to clean up whitespace in source files that are not currently checked by jcheck so we can add these extensions to jcheck at that time. The fix is therefore to remove tabs and trailing spaces.

The 3rd party harfbuzz library has .cc and .hh files but there are no current violations there since I've cleaned those up when importing harfbuzz upgrades. There is one JDK file that relates to those that inherited tabs that is fixed. But almost all the fixes are in Objective C .m and .mm files.
JDK has no examples of .mm but JavaFX does so I was looking just to be sure. And all but one of the .m violations are in the desktop module which is mainly because that is where all but 5 of the Objective-C files are. The only non-desktop violator is ./jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m and that is included in this webrev and why I've included serviceability-dev.

-phil.

From linzang at tencent.com Thu Aug 6 01:59:39 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Thu, 6 Aug 2020 01:59:39 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com>
Message-ID: <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com>

Thanks Paul!
And I have verified this change could build success in windows.

BRs,
Lin

On 2020/8/6, 4:17 AM, "Hohensee, Paul" wrote:

Two tiny nits that don't need a new webrev:

In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().

In heapInspection.hpp, you can delete two of the three blank lines before

#endif // SHARE_MEMORY_HEAPINSPECTION_HPP

Thanks,
Paul

On 8/5/20, 6:46 AM, "linzang(臧琳)" wrote:

Hi Paul, Stefan and Serguei,

Here I uploaded a new changeset, would you like to help review again?
Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/

P.S. I am in process of building it on windows environment for a double check. May update result later.

Thanks!

BRs,
Lin

On 2020/8/5, 8:56 PM, "Hohensee, Paul" wrote:

uintx is fine with me.

Thanks,
Paul

On 8/5/20, 1:14 AM, "linzang(臧琳)" wrote:

Hi Stefan,
I got your point, thanks for explanation. I missed the atomic part when considering it.

Hi Paul,
Do you think it is ok to use uintx? I checked it is actually a uintptr_t type.
BRs,
Lin

On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote:

On 2020-08-05 07:22, linzang(臧琳) wrote:
> Hi Serguei,
>
> No problem, Thanks for your reviewing :)
>
> I will upload a new webrev later, so may I ask your help to review it
> again?
>
> Hi Stefan,
>
> As Paul mentioned, the _missed_count is not a size, so size_t may
> not be clear, what's your opinion about uint64_t?

We typically don't restrict the usage of size_t to only *sizes* in the HotSpot. If you search the code you'll find many count variables using size_t, so I personally don't see the need to change the type. However, if you really do want to change it then maybe using another type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using uint64_t and some of the Atomics operations are problematic on some 32-bit platforms, so using a type that matches the word size of the targeted machine helps you not having to think about that.

>
> It seems the uint overflow may happen on 64bit machine with large
> heap, e.g. may be more than 4 Giga objects (8byte header + 8 byte
> klassptr + 8byte field) in a heap that is larger than 96 GB, uint64_t
> is ok in this case.

Exactly.

Thanks,
StefanK

>
> BRs,
>
> Lin
>
> *From: *"serguei.spitsyn at oracle.com"
> *Date: *Wednesday, August 5, 2020 at 1:02 PM
> *To: *"linzang(臧琳)" , "Hohensee, Paul"
> , Stefan Karlsson ,
> David Holmes , serviceability-dev
> , "hotspot-gc-dev at openjdk.java.net"
>
> *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for
> jmap histo(G1)(Internet mail)
>
> Oh, sorry for the confusion, please, skip my question. :)
> C++ does not have the '&&=' operator.
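
On the '&=' question in this thread, here is a minimal sketch of how a bool accumulator behaves with '&='. It is not the webrev code: merge_entry() and merge_all() below are hypothetical stand-ins written for illustration, not HotSpot functions. Under those assumptions it shows two things: the flag stays false once any merge fails, and every entry is still visited, because '&=' evaluates its right-hand side unconditionally.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a per-entry merge: reports whether one entry
// could be merged. Negative values simulate a failed merge.
static bool merge_entry(int entry) { return entry >= 0; }

// Accumulates a success flag over all entries with '&='.
// '*calls' counts how many times merge_entry() was invoked.
bool merge_all(const std::vector<int>& entries, int* calls) {
  assert(calls != nullptr);
  bool success = true;
  for (int e : entries) {
    ++*calls;
    // No short-circuit: merge_entry() runs even if success is already false.
    success &= merge_entry(e);
  }
  return success;
}
```

Writing "success = success && merge_entry(e);" instead would skip the remaining merge_entry() calls after the first failure, which may or may not be what a closure walking every entry wants.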
> > Thanks, > Serguei > > On 8/4/20 21:56, serguei.spitsyn at oracle.com > wrote: > > Hi Lin, > > https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html > > +class KlassInfoTableMergeClosure : public KlassInfoClosure { > > +private: > > + KlassInfoTable* _dest; > > + bool _success; > > +public: > > + KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), > _success(true) {} > > + void do_cinfo(KlassInfoEntry* cie) { > > + _success &= _dest->merge_entry(cie); > > + } > > The operator '&=' above looks strange. > Did you actually want to use the operator '&&=' instead? : > > + _success &&= _dest->merge_entry(cie); > > > Thanks, > Serguei > > > > > On 8/3/20 07:51, linzang(??) wrote: > > Dear Stefan, > > May I ask your help to review again? I have made a delta based on the last changeset you have reviewed(webrev04),hope it could ease your reviewing work. > > webrev:https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > delta (vs webrev04):https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > > bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > CSR(approved):https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > BRs, > > Lin > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. 
> > Thanks, > > Paul > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > Upload a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > It fix an issue of windows fail : > > #################################### > > In heapInspect.cpp > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > #################################### > > In heapInspect.hpp > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > #################################### > > BRs, > > Lin > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > I update a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > > It includes a tiny fix of build failure on windows: > > #################################### > > In attachListener.cpp: > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > > #################################### > > BRs, > > Lin > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for your help, that all looks good to me. > > Just 2 minor changes: > > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > > ? 
delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > > @@ -251,7 +251,6 @@ > > _size_of_instances_in_words += cie->words(); > > return true; > > } > > - > > return false; > > } > > @@ -568,7 +567,6 @@ > > Atomic::add(&_missed_count, missed_count); > > } else { > > Atomic::store(&_success, false); > > - return; > > } > > } > > ######################################################################### > > Here is the webrevhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > > Lin > > --------------------------------------------- > > From: "Hohensee, Paul" > > Date: Thursday, July 23, 2020 at 6:48 AM > > To: "linzang(??)" , Stefan Karlsson ,"serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev ,"hotspot-gc-dev at openjdk.java.net" > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > > + return; > > + } > > with > > + Atomic::store(&_success, false); > > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. > > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. 
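
For reference, the default thread count discussed in this thread (three eighths of the active processors, but never fewer than one) can be sketched in standard C++ as follows. This is only an illustration under stated assumptions: default_parallel_threads() is a hypothetical helper, and std::max stands in for the HotSpot macro used in the webrev. The processor count arrives as an int and is cast to unsigned before the arithmetic, as the review asks.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: derive a default worker count from the number of
// active processors, using the 3/8 ratio quoted in the thread.
unsigned default_parallel_threads(int active_processors) {
  assert(active_processors >= 0);
  // Cast first so the whole computation is done in unsigned arithmetic.
  unsigned scaled = static_cast<unsigned>(active_processors) * 3 / 8;
  // std::max needs both arguments to have the same type, hence 1u.
  return std::max(1u, scaled);
}
```

With integer division, 8 processors give 3 workers and 16 give 6, while 1 or 2 processors round down to 0 and are clamped to 1.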
> > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > > As shown athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > > @@ -252,11 +252,11 @@ > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > outputStream* os = out; // if path not specified or path is NULL, use out > > fileStream* fs = NULL; > > const char* arg0 = op->arg(0); > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > return JNI_ERR; > > } > > ################################################### > > Thanks. > > BRs, > > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for reviewing! > > >> > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > >> > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > > +// Parallel heap inspection task. Parallel inspection can fail due to > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > > + } else { > > + return false; > > + } > > with > > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
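
The scheme described in the comment above (worker-local tables, a merge into a shared table, a success flag that any worker may clear, and a serial retry when the parallel attempt fails) can be sketched in standard C++ as follows. This is a rough illustration, not the HotSpot implementation: ParInspectTask here is a simplified stand-in, the "heap" is just a vector of class names, std::atomic and std::mutex replace the HotSpot primitives, and fail_worker is a hypothetical knob that simulates a native OOM in one worker.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

using Histogram = std::map<std::string, unsigned>;

// Hypothetical sentinel: no worker simulates an allocation failure.
static const unsigned kNoFailure = ~0u;

class ParInspectTask {
 public:
  ParInspectTask(const std::vector<std::string>& objs, unsigned fail_worker)
      : _objs(objs), _fail_worker(fail_worker), _success(true) {}

  // Each worker counts a strided slice into a local table, then merges it
  // into the shared table while holding the mutex.
  void work(unsigned worker_id, unsigned num_workers) {
    if (worker_id == _fail_worker) {
      _success.store(false);  // simulated failure to allocate the local table
      return;
    }
    Histogram local;
    for (std::size_t i = worker_id; i < _objs.size(); i += num_workers) {
      ++local[_objs[i]];
    }
    std::lock_guard<std::mutex> guard(_mutex);
    for (const auto& entry : local) {
      _shared[entry.first] += entry.second;
    }
  }

  bool success() const { return _success.load(); }
  const Histogram& result() const { return _shared; }

 private:
  const std::vector<std::string>& _objs;
  unsigned _fail_worker;
  std::atomic<bool> _success;
  std::mutex _mutex;
  Histogram _shared;
};

// Tries the parallel path first and falls back to a serial count if any
// worker reported failure.
Histogram inspect(const std::vector<std::string>& objs, unsigned num_workers,
                  unsigned fail_worker = kNoFailure) {
  assert(num_workers > 0);
  if (num_workers > 1) {
    ParInspectTask task(objs, fail_worker);
    std::vector<std::thread> workers;
    for (unsigned id = 0; id < num_workers; ++id) {
      workers.emplace_back(&ParInspectTask::work, &task, id, num_workers);
    }
    for (auto& t : workers) t.join();
    if (task.success()) return task.result();
    // Parallel attempt failed: discard the partial result and retry serially.
  }
  Histogram serial;
  for (const auto& name : objs) ++serial[name];
  return serial;
}
```

The flag is only ever moved from true to false, but it is still accessed through atomic operations so that concurrent writers do not constitute a data race.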
> > I'd reorganize the code in populate_table() to make it more clear, vis (I changed _shared_missed_count to _missed_count) > > + if (cit.allocation_failed()) { > > + // fail to allocate memory, stop parallel mode > > + Atomic::store(&_success, false); > > + return; > > + } > > + RecordInstanceClosure ric(&cit, _filter); > > + _poi->object_iterate(&ric, worker_id); > > + missed_count = ric.missed_count(); > > + { > > + MutexLocker x(&_mutex); > > + merge_success = _shared_cit->merge(&cit); > > + } > > + if (merge_success) { > > + Atomic::add(&_missed_count, missed_count); > > + else { > > + Atomic::store(&_success, false); > > + } > > Thanks, > > Paul > > On 6/29/20, 7:20 PM, "linzang(??)" wrote: > > Dear All, > > Sorry to bother again, I just want to make sure that is this change worth to be continue to work on? If decision is made to not. I think I can drop this work and stop asking for help reviewing... > > Thanks for all your help about reviewing this previously. > > BRs, > > Lin > > On 2020/5/9, 3:47 PM, "linzang(??)" wrote: > > Dear All, > > May I ask your help again for review the latest change? Thanks! > > BRs, > > Lin > > On 2020/4/28, 1:54 PM, "linzang(??)" wrote: > > Hi Stefan, > > >> - Adding Atomic::load/store. > > >> - Removing the time measurement in the run_task. I renamed G1's function > > >> to run_task_timed. If we need this outside of G1, we can rethink the API > > >> at that point. > > >> - ZGC style cleanups > > Thanks for revising the patch, they are all good to me, and I have made a tiny change based on it: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/ > > it reduce the scope of mutex in ParHeapInspectTask, and delete unnecessary comments. > > BRs, > > Lin > > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote: > > Hi Lin, > > On 2020-04-26 05:10, linzang(??) wrote: > > > Hi Stefan and Paul? 
> > > I have made a new patch based on your comments and Stefan's Poc code: > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > > others to review. > > > > > > And Here are main changed I made and want to discuss with you: > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > > primitives. We're moving toward a later C++ standard were data races are > > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > only accept AbstraceGangTask* as argument, so I made a delegate class > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > > - Removing the time measurement in the run_task. I renamed G1's function > > to run_task_timed. If we need this outside of G1, we can rethink the API > > at that point. > > - ZGC style cleanups > > Thanks, > > StefanK > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > Thanks, > > > Paul > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" linzang at tencent.com> wrote: > > > > > > Dear Stefan, > > > > > > Thanks a lot! 
I agree with you to decouple the heap inspection code with GC's. > > > I will start from your POC code, may discuss with you later. > > > > > > > > > BRs, > > > Lin > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > Hi Lin, > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > I started experimenting with doing that, but other higher-priority (to > > > me) tasks have had to take precedence. > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > ParallelObjectIterators. There's also code left unimplemented in around > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > pull out the heap inspection code out of the GCs. > > > > > > Thanks, > > > StefanK > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > Dear all, > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > Thanks! > > > > > > > > BRs, > > > > Lin > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > ?? 
> >> BRs, > > > >> Lin > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > >> > > > > >> > Dear all, > > > >> > Let me try to ease the reviewing work by some explanation :P > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > >> > This patch actually do several things: > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > >> > 2. 
Make a change in how Jmap passing arguments, changes inhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed athttps://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > > > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > > > >> > 5. Add related test. > > > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > > > >> > > > > >> > Hope these info could help on code review and initate the discussion :-) > > > >> > Thanks! > > > >> > > > > >> > BRs, > > > >> > Lin > > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > > > >> > > > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > > > >> > > please ignore the previous wrong post. sorry for troubles. 
> > > >> > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > -------------- > > > >> > > Lin > > > >> > > >Hi Lin, > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > >> > > >the message subject? > > > >> > > >It will be more trackable this way. > > > >> > > > > > > >> > > >Thanks, > > > >> > > >Serguei > > > >> > > > > > > >> > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > > >> > > >> Dear David, > > > >> > > >> Thanks a lot! > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > >> > > >> > > > >> > > >> Thanks, > > > >> > > >> -------------- > > > >> > > >> Lin > > > >> > > >>> Hi Lin, > > > >> > > >>> > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > > >> > > >>> > > > >> > > >>> I happened to spot one nit when browsing: > > > >> > > >>> > > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > > >> > > >>> > > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > > >> > > >>> + BoolObjectClosure* filter, > > > >> > > >>> + size_t* missed_count, > > > >> > > >>> + size_t thread_num) { > > > >> > > >>> + return NULL; > > > >> > > >>> > > > >> > > >>> s/NULL/false/ > > > >> > > >>> > > > >> > > >>> Cheers, > > > >> > > >>> David > > > > > > > >>> > > > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > > > >> > > >>>> Dear All, > > > >> > > >>>> May I ask your help to review the follow changes: > > > >> > > >>>> webrev: > > > >> > > >>>>http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > > >> > > >>>> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > >>>> related CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. > > > >> > > >>>> > > > >> > > >>>> ------------------------------------------------------------------------ > > > >> > > >>>> BRs, > > > >> > > >>>> Lin > > > >> > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From chris.plummer at oracle.com Thu Aug 6 03:12:38 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 5 Aug 2020 20:12:38 -0700 Subject: RFR: 8240487 : Cleanup whitespace in .cc, .hh, .m, and .mm files In-Reply-To: <5F2B6113.5030909@oracle.com> References: <5F2B6113.5030909@oracle.com> Message-ID: <43459381-f941-78a0-aa44-5484c0278ccb@oracle.com> Hi Philip, The MacosxDebuggerLocal.m changes look fine. It took a while to detect what was actually changed since the html files seem to convert tabs to spaces. 
I ended up looking in the patch file, and could see the tabs there. thanks, Chris On 8/5/20 6:46 PM, Philip Race wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8240487 > Webrev: http://cr.openjdk.java.net/~prr/8240487/ > > In advance of the move to Project Skara/git it is desirable to clean > up whitespace in source files > that are not currently checked by jcheck so we can add these > extensions to jcheck at that time. > > The fix is therefore to remove tabs and trailing spaces. > > The 3rd party harfbuzz library has .cc and .hh files but there are no > current violations there > since I've cleaned those up when importing harfbuzz upgrades. > > There is one JDK file that relates to those that inherited tabs that > is fixed. > > But almost all the fixes are in Objective C .m and .mm files. > JDK has no examples of .mm but JavaFX does so I was looking just to be > sure. > > And all but one of the .m violations are in the desktop module which > is mainly because > that is where all but 5 of the Objective-C files are. > > The only non-desktop violator is > ./jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m > and that is included in this webrev and why I've included > serviceability-dev. > > -phil. From david.holmes at oracle.com Thu Aug 6 04:03:02 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 6 Aug 2020 14:03:02 +1000 Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS In-Reply-To: References: Message-ID: <31d49fd0-4d95-eec9-f85e-2977afb7652d@oracle.com> Hi Chris, On 6/08/2020 11:16 am, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8251121 > http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html > > On OSX (and possibly some linux systems), core files are not produced in > the cwd, but instead end up in some well known location. For OSX it is > the /cores directory. The core files tend to accumulate there. 
> This fixes the core file accumulation problem by moving the core file
> into the cwd, allowing jtreg to manage it. By default jtreg will delete
> the core if the test passes, and retain it if the test fails or
> RETAIN=all is specified.

So the current code returns the absolute path to the corefile, while
your new code just returns the corefile name - which is effectively the
relative path ./corefilename. Is that change going to cause a problem
for any clients of this API?

Second we have theorised about the length of time it can take to dump
the corefile on macOS, and now we are moving that huge corefile to
another location, likely on a different disk. Could that make the
timeout problem worse?

Thanks,
David
-----

> I got rid of the code in ClhsdbCDSCore.java that explicitly deletes the
> core file because we don't want it deleted if RETAIN=all is used.
>
> thanks,
>
> Chris

From chris.plummer at oracle.com Thu Aug 6 04:11:43 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 5 Aug 2020 21:11:43 -0700
Subject: 15: RFR(T): 8251214: ProblemList serviceability/sa/ClhsdbCDSCore.java on linux-x64
Message-ID: <4ee0694c-0bbd-3e23-ab6a-837311336755@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8251214

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -91,6 +91,7 @@
 # :hotspot_serviceability
 
 serviceability/sa/sadebugd/DebugdConnectTest.java 8239062 macosx-x64
+serviceability/sa/ClhsdbCDSCore.java 8246016 linux-x64
 serviceability/sa/TestInstanceKlassSize.java 8230664 linux-ppc64le,linux-ppc64
 serviceability/sa/TestInstanceKlassSizeForInterface.java 8230664 linux-ppc64le,linux-ppc64
 serviceability/sa/TestRevPtrsForInvokeDynamic.java 8241235 generic-all

We need to problem list this test on JDK 15 because it has suddenly
started failing a lot due to [1] JDK-8246016, and we won't be able to get
the fix for JDK-8246016 into JDK 15. Note the other tests this is failing
with on 16 (there are 6 other SA tests that generate cores and can produce
the failure) do not have to be problem listed for JDK 15 because they
either do not have the core file check or are new to 16 and therefore are
not in 15.

[1] https://bugs.openjdk.java.net/browse/JDK-8246016

thanks,

Chris

From chris.plummer at oracle.com Thu Aug 6 04:20:11 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 5 Aug 2020 21:20:11 -0700
Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS
In-Reply-To: <31d49fd0-4d95-eec9-f85e-2977afb7652d@oracle.com>
References: <31d49fd0-4d95-eec9-f85e-2977afb7652d@oracle.com>
Message-ID: <43acfd92-f5f3-e4d8-1e49-b8c7909637a9@oracle.com>

Hi David,

On 8/5/20 9:03 PM, David Holmes wrote:
> Hi Chris,
>
> On 6/08/2020 11:16 am, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8251121
>> http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html
>>
>> On OSX (and possibly some linux systems), core files are not produced
>> in the cwd, but instead end up in some well known location. For OSX
>> it is the /cores directory. The core files tend to accumulate there.
>> This fixes the core file accumulation problem by moving the core file
>> into the cwd, allowing jtreg to manage it. By default jtreg will
>> delete the core if the test passes, and retain it if the test fails
>> or RETAIN=all is specified.
>
> So the current code returns the absolute path to the corefile, while
> your new code just returns the corefile name - which is effectively
> the relative path ./corefilename. Is that change going to cause a
> problem for any clients of this API?

No. They just want the path to the core file, wherever it is. It can be
a relative or absolute path.

>
> Second we have theorised about the length of time it can take to dump
> the corefile on macOS, and now we are moving that huge corefile to
> another location, likely on a different disk. Could that make the
> timeout problem worse?

Possibly, but I wouldn't think by much. We actually don't have an
explanation as to why it takes so long. It's more like the OS is getting
wedged for a while rather than it just having to do a lot of processing
and I/O. (I'm seeing the spinning beach ball in the back of my head
right now). Compared to the 30 minutes we are currently allowing for the
core dump, I would hope a 3-5g disk to disk copy would not take that
long relatively speaking.

thanks,

Chris

>
> Thanks,
> David
> -----
>
>> I got rid of the code in ClhsdbCDSCore.java that explicitly deletes
>> the core file because we don't want it deleted if RETAIN=all is used.
>>
>> thanks,
>>
>> Chris

From david.holmes at oracle.com Thu Aug 6 05:31:08 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 6 Aug 2020 15:31:08 +1000
Subject: 15: RFR(T): 8251214: ProblemList serviceability/sa/ClhsdbCDSCore.java on linux-x64
In-Reply-To: <4ee0694c-0bbd-3e23-ab6a-837311336755@oracle.com>
References: <4ee0694c-0bbd-3e23-ab6a-837311336755@oracle.com>
Message-ID: <44460dbe-9703-13a3-fe6e-7580ae0f6cd4@oracle.com>

Problem listing this test seems fine to me.

Thanks,
David

On 6/08/2020 2:11 pm, Chris Plummer wrote:
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8251214
>
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -91,6 +91,7 @@
>  # :hotspot_serviceability
>
>  serviceability/sa/sadebugd/DebugdConnectTest.java 8239062 macosx-x64
> +serviceability/sa/ClhsdbCDSCore.java 8246016 linux-x64
>  serviceability/sa/TestInstanceKlassSize.java 8230664 linux-ppc64le,linux-ppc64
>  serviceability/sa/TestInstanceKlassSizeForInterface.java 8230664 linux-ppc64le,linux-ppc64
>  serviceability/sa/TestRevPtrsForInvokeDynamic.java 8241235 generic-all
>
> We need to problem list this test on JDK 15 because it has suddenly
> started failing a lot due to [1] JDK-8246016, and we won't be able
> to get the fix for JDK-8246016 into JDK 15. Note the
> other tests this is failing with on 16 (there are 6 other SA tests that
> generate cores and can produce the failure) do not have to be problem
> listed for JDK 15 because they either do not have the core file check or
> are new to 16 and therefore are not in 15.
> > [1] https://bugs.openjdk.java.net/browse/JDK-8246016 > > > thanks, > > Chris > From david.holmes at oracle.com Thu Aug 6 05:33:25 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 6 Aug 2020 15:33:25 +1000 Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS In-Reply-To: <43acfd92-f5f3-e4d8-1e49-b8c7909637a9@oracle.com> References: <31d49fd0-4d95-eec9-f85e-2977afb7652d@oracle.com> <43acfd92-f5f3-e4d8-1e49-b8c7909637a9@oracle.com> Message-ID: <5f20d938-ccc4-5a96-cd42-a98d16bebb94@oracle.com> On 6/08/2020 2:20 pm, Chris Plummer wrote: > Hi David, > > On 8/5/20 9:03 PM, David Holmes wrote: >> Hi Chris, >> >> On 6/08/2020 11:16 am, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8251121 >>> http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html >>> >>> On OSX (and possibly some linux systems), core files are not produced >>> in the cwd, but instead end up in some well known location. For OSX >>> it is the /cores directory. The core files tend to accumulate there. >>> This fixes the core file accumulation problem by moving the core file >>> into the cwd, allowing jtreg to manage it. By default jtreg will >>> delete the core if the test passes, and retain if if the test fails >>> or RETAIN=all is specified. >> >> So the current code returns the absolute path to the corefile, while >> your new code just returns the corefile name - which is effectively >> the relative path ./corefilename. Is that change going to cause a >> problem for any clients of this API? > No. They just want the path to the core file, wherever it is. It can be > a relative or absolute path. >> >> Second we have theorised about the length of time it can take to dump >> the corefile on macOS, and now we are moving that huge corefile to >> another location, likely on a different disk. Could that make the >> timeout problem worse? > Possibly, but I wouldn't think by much. 
We actually don't have an > explanation as to why it takes so long. It's more like the OS is getting > wedged for a while rather than it just having to do a lot of processing > and I/O. (I'm seeing the spinning beach ball in the back of my head > right now). Compared to the 30 minutes we are currently allowing for the > core dump, I would hope a 3-5g disk to disk copy would not take that > long relatively speaking. Okay - I guess we will find out. :) Thanks, David > thanks, > > Chris >> >> Thanks, >> David >> ----- >> >>> I got rid of the code in ClhsdbCDSCore.java that explicitly deletes >>> the core file because we don't want it deleted if RETAIN=all is used. >>> >>> thanks, >>> >>> Chris > From alexander.zuev at oracle.com Thu Aug 6 05:44:59 2020 From: alexander.zuev at oracle.com (Alexander Zuev) Date: Wed, 5 Aug 2020 22:44:59 -0700 Subject: RFR: 8240487 : Cleanup whitespace in .cc, .hh, .m, and .mm files In-Reply-To: <5F2B6113.5030909@oracle.com> References: <5F2B6113.5030909@oracle.com> Message-ID: <997582fe-09f1-3642-d3ce-21c33b3cc118@oracle.com> Looks fine to me. Had to recall the vi settings that visualize spaces and tabs but it was worth it. Some places looks hilarious, like this one: http://cr.openjdk.java.net/~kizune/tmp/extra_spaces.png I mean - someone spent a lot of time creating this invisible art. Good it is going to be gone. /Alex On 8/5/2020 6:46 PM, Philip Race wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8240487 > Webrev: http://cr.openjdk.java.net/~prr/8240487/ > > In advance of the move to Project Skara/git it is desirable to clean > up whitespace in source files > that are not currently checked by jcheck so we can add these > extensions to jcheck at that time. > > The fix is therefore to remove tabs and trailing spaces. > > The 3rd party harfbuzz library has .cc and .hh files but there are no > current violations there > since I've cleaned those up when importing harfbuzz upgrades. 
> > There is one JDK file that relates to those that inherited tabs that > is fixed. > > But almost all the fixes are in Objective C .m and .mm files. > JDK has no examples of .mm but JavaFX does so I was looking just to be > sure. > > And all but one of the .m violations are in the desktop module which > is mainly because > that is where all but 5 of the Objective-C files are. > > The only non-desktop violator is > ./jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m > and that is included in this webrev and why I've included > serviceability-dev. > > -phil. From hohensee at amazon.com Thu Aug 6 13:49:16 2020 From: hohensee at amazon.com (Hohensee, Paul) Date: Thu, 6 Aug 2020 13:49:16 +0000 Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) In-Reply-To: <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> Message-ID: <1C77C7B9-4541-41E3-B16D-FB1B243D8087@amazon.com> And a submit repo run succeeds. Serguei, would you be willing to review? Thanks, Paul ?On 8/5/20, 7:00 PM, "linzang(??)" wrote: CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. Thanks Paul! And I have verified this change could build success in windows. BRs, Lin On 2020/8/6, 4:17 AM, "Hohensee, Paul" wrote: Two tiny nits that don't need a new webrev: In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info(). In heapInspection.hpp, you can delete two of the three blank lines before #endif // SHARE_MEMORY_HEAPINSPECTION_HPP Thanks, Paul On 8/5/20, 6:46 AM, "linzang(??)" wrote: Hi Paul, Stefan and Serguei, Here I uploaded a new changeset, would you like to help review again? 
Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/ Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/ P.S. I am in process of building it on windows environment for a double check. May update result later. Thanks! BRs, Lin On 2020/8/5, 8:56 PM, "Hohensee, Paul" wrote: uintx is fine with me. Thanks, Paul On 8/5/20, 1:14 AM, "linzang(??)" wrote: Hi Stefan, I got your point, thanks for explanation. I missed the atomic part when considering it. Hi Paul, Do you think it is ok to use uintx? I checked it is actually a uintptr_t type. BRs, Lin On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote: On 2020-08-05 07:22, linzang(??) wrote: > Hi Serguei, > > No problem, Thanks for your reviewing :) > > I wll upload a new webrev later, so may I ask your help to review it > again? > > Hi Stefan, > > As Paul mentioned, the _/missed/_count is not a size, so size_t may > not be clear, what?s your opinion about uint64_t? We typically don't restrict the usage of size_t to only *sizes* in the HotSpot. If you search the code you'll find many count variables using size_t, so I personally don't see the need to change the type. However, if you really do want to change it then maybe using another type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using uint64_t and some of the Atomics operations are problematic on some 32-bit platforms, so using a type that matches the word size of the targetted machine helps you not having to think about that. > > It seems the uint overflow may happened on 64bit machine with large > heap, e.g. may be more than 4 Giga objects (8byte header + 8 byte > klassptr + 8byte field) in a heap that is larger than 96 GB, uint64_t > is ok in this case. Exactly. 
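The overflow estimate in the exchange above can be checked with a little arithmetic. The following is a standalone C++ sketch, not HotSpot code: `objects_in_heap` and `fits_in_uint32` are hypothetical helpers, the 24-byte object size is the figure quoted in the message (8-byte header + 8-byte klass pointer + 8-byte field), and "96 GB" is assumed here to mean 96 GiB.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not a HotSpot function): number of objects of a
// given size that fit in a heap of heap_bytes.
constexpr std::uint64_t objects_in_heap(std::uint64_t heap_bytes,
                                        std::uint64_t object_bytes) {
  return heap_bytes / object_bytes;
}

// True if the count can be stored in a 32-bit unsigned counter.
constexpr bool fits_in_uint32(std::uint64_t count) {
  return count <= std::numeric_limits<std::uint32_t>::max();
}

constexpr std::uint64_t GiB = 1ULL << 30;

// 96 GiB of 24-byte objects is exactly 2^32 objects, which is one more
// than the largest value a 32-bit counter can hold.
static_assert(objects_in_heap(96 * GiB, 24) == (1ULL << 32),
              "96 GiB / 24 B == 2^32");
static_assert(!fits_in_uint32(objects_in_heap(96 * GiB, 24)),
              "count does not fit in 32 bits");
```

Under these assumptions a 32-bit count is only at risk once the heap reaches roughly 96 GiB of minimum-size objects, which matches the estimate in the message; a word-sized type such as the `uintx` suggested above avoids the problem on 64-bit machines.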
Thanks, StefanK > > BRs, > > Lin > > *From: *"serguei.spitsyn at oracle.com" > *Date: *Wednesday, August 5, 2020 at 1:02 PM > *To: *"linzang(??)" , "Hohensee, Paul" > , Stefan Karlsson , > David Holmes , serviceability-dev > , "hotspot-gc-dev at openjdk.java.net" > > *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for > jmap histo(G1)(Internet mail) > > Oh, sorry for the confusion, please, skip my question. :) > C++ does not have the '&&=' operator. > > Thanks, > Serguei > > On 8/4/20 21:56, serguei.spitsyn at oracle.com > wrote: > > Hi Lin, > > https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html > > +class KlassInfoTableMergeClosure : public KlassInfoClosure { > > +private: > > + KlassInfoTable* _dest; > > + bool _success; > > +public: > > + KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), > _success(true) {} > > + void do_cinfo(KlassInfoEntry* cie) { > > + _success &= _dest->merge_entry(cie); > > + } > > The operator '&=' above looks strange. > Did you actually want to use the operator '&&=' instead? : > > + _success &&= _dest->merge_entry(cie); > > > Thanks, > Serguei > > > > > On 8/3/20 07:51, linzang(??) wrote: > > Dear Stefan, > > May I ask your help to review again? I have made a delta based on the last changeset you have reviewed(webrev04),hope it could ease your reviewing work. > > webrev:https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > delta (vs webrev04):https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > > bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > CSR(approved):https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > BRs, > > Lin > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. 
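On the `_success &= _dest->merge_entry(cie)` question raised by Serguei earlier in this message: C++ has no `&&=` operator, and `&=` applied to `bool` operands acts as a logical AND without short-circuiting, so every entry is still visited. A minimal standalone sketch follows; `MergeTracker` is a hypothetical stand-in written for illustration and is not the `KlassInfoTableMergeClosure` class from the webrev.

```cpp
#include <cassert>

// Hypothetical stand-in for a merge closure: it is called once per entry
// and remembers whether any merge has failed so far.
class MergeTracker {
 public:
  // merged is the result of merging one entry into the destination table.
  void record(bool merged) {
    ++_visited;
    // bool & bool is computed as 0 or 1 and converted back to bool, so
    // _success stays true only while every result has been true.
    _success &= merged;
  }

  bool success() const { return _success; }
  int visited() const { return _visited; }

 private:
  bool _success = true;
  int _visited = 0;
};
```

Once one call reports `false`, `success()` remains `false`, while `visited()` keeps counting the remaining entries.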
> > Thanks, > > Paul > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > Upload a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > It fix an issue of windows fail : > > #################################### > > In heapInspect.cpp > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > #################################### > > In heapInspect.hpp > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > #################################### > > BRs, > > Lin > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > I update a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > > It includes a tiny fix of build failure on windows: > > #################################### > > In attachListener.cpp: > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > > #################################### > > BRs, > > Lin > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for your help, that all looks good to me. > > Just 2 minor changes: > > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > > ? 
delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > > @@ -251,7 +251,6 @@ > > _size_of_instances_in_words += cie->words(); > > return true; > > } > > - > > return false; > > } > > @@ -568,7 +567,6 @@ > > Atomic::add(&_missed_count, missed_count); > > } else { > > Atomic::store(&_success, false); > > - return; > > } > > } > > ######################################################################### > > Here is the webrevhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > > Lin > > --------------------------------------------- > > From: "Hohensee, Paul" > > Date: Thursday, July 23, 2020 at 6:48 AM > > To: "linzang(??)" , Stefan Karlsson ,"serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev ,"hotspot-gc-dev at openjdk.java.net" > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > > + return; > > + } > > with > > + Atomic::store(&_success, false); > > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. > > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. 
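The default worker count used in this thread (`MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8)`) and the casts requested in the review comments above can be sketched in standalone C++. This is an illustration only: `default_parallel_threads` and `narrow_to_uint` are hypothetical names, `std::max` stands in for HotSpot's `MAX2`, and the saturating behaviour of `narrow_to_uint` is an addition made for this sketch rather than something proposed in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the default computed in heap_inspection():
// about 3/8 of the active processors, but never fewer than one worker.
// The count arrives as an int, so it is cast to unsigned before the
// arithmetic; both std::max arguments then have the same type.
inline unsigned default_parallel_threads(int active_processors) {
  assert(active_processors > 0);
  return std::max(1u, static_cast<unsigned>(active_processors) * 3u / 8u);
}

// Hypothetical helper: narrow a word-sized parsed value to unsigned.
// Saturating instead of wrapping is an assumption of this sketch.
inline unsigned narrow_to_uint(std::uintptr_t value) {
  const std::uintptr_t max_uint = std::numeric_limits<unsigned>::max();
  return static_cast<unsigned>(std::min(value, max_uint));
}
```

With 8 active processors this yields 3 workers; with 1 or 2 processors the integer division gives 0 and the result is clamped to 1.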
> > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > > As shown athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > > @@ -252,11 +252,11 @@ > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > outputStream* os = out; // if path not specified or path is NULL, use out > > fileStream* fs = NULL; > > const char* arg0 = op->arg(0); > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > return JNI_ERR; > > } > > ################################################### > > Thanks. > > BRs, > > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for reviewing! > > >> > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > >> > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > > +// Parallel heap inspection task. Parallel inspection can fail due to > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > > + } else { > > + return false; > > + } > > with > > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
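The parallel-then-serial fallback described in the comment copy-edit above can be sketched with a rough Java analogue. This is not the HotSpot code: ParallelHistoSketch and histogram() are hypothetical names, a HashMap stands in for KlassInfoTable, a list of class names stands in for the heap, and catching OutOfMemoryError only approximates the native OOM case. Each worker fills a thread-local table, merges it into the shared table under a lock, and clears a shared success flag on failure so that the caller can discard the partial result and retry serially.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical Java analogue of the ParHeapInspectTask idea; not HotSpot code.
class ParallelHistoSketch {

    // Counts occurrences of each name using up to 'workers' threads,
    // falling back to a serial count if any worker reports failure.
    static Map<String, Long> histogram(List<String> names, int workers) {
        final int n = Math.max(1, workers);
        final Map<String, Long> shared = new HashMap<>();
        final AtomicBoolean success = new AtomicBoolean(true);

        Thread[] threads = new Thread[n];
        for (int w = 0; w < n; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                try {
                    // Thread-local table, analogous to the TL-KlassInfoTable.
                    Map<String, Long> local = new HashMap<>();
                    for (int i = id; i < names.size(); i += n) {
                        local.merge(names.get(i), 1L, Long::sum);
                    }
                    // Only the merge into the shared table is done under the lock.
                    synchronized (shared) {
                        local.forEach((k, v) -> shared.merge(k, v, Long::sum));
                    }
                } catch (OutOfMemoryError e) {
                    // The flag only ever goes from true to false.
                    success.set(false);
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }

        if (!success.get()) {
            // Drop partial parallel results before trying the serial path.
            shared.clear();
            for (String name : names) {
                shared.merge(name, 1L, Long::sum);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(histogram(List.of("A", "B", "A", "C", "A"), 3));
    }
}
```

Clearing the shared table before the serial retry mirrors the suggestion to release what the parallel attempt allocated before running the serial code.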
> > I'd reorganize the code in populate_table() to make it more clear, vis (I changed _shared_missed_count to _missed_count) > > + if (cit.allocation_failed()) { > > + // fail to allocate memory, stop parallel mode > > + Atomic::store(&_success, false); > > + return; > > + } > > + RecordInstanceClosure ric(&cit, _filter); > > + _poi->object_iterate(&ric, worker_id); > > + missed_count = ric.missed_count(); > > + { > > + MutexLocker x(&_mutex); > > + merge_success = _shared_cit->merge(&cit); > > + } > > + if (merge_success) { > > + Atomic::add(&_missed_count, missed_count); > > + else { > > + Atomic::store(&_success, false); > > + } > > Thanks, > > Paul > > On 6/29/20, 7:20 PM, "linzang(??)" wrote: > > Dear All, > > Sorry to bother again, I just want to make sure that is this change worth to be continue to work on? If decision is made to not. I think I can drop this work and stop asking for help reviewing... > > Thanks for all your help about reviewing this previously. > > BRs, > > Lin > > On 2020/5/9, 3:47 PM, "linzang(??)" wrote: > > Dear All, > > May I ask your help again for review the latest change? Thanks! > > BRs, > > Lin > > On 2020/4/28, 1:54 PM, "linzang(??)" wrote: > > Hi Stefan, > > >> - Adding Atomic::load/store. > > >> - Removing the time measurement in the run_task. I renamed G1's function > > >> to run_task_timed. If we need this outside of G1, we can rethink the API > > >> at that point. > > >> - ZGC style cleanups > > Thanks for revising the patch, they are all good to me, and I have made a tiny change based on it: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/ > > it reduce the scope of mutex in ParHeapInspectTask, and delete unnecessary comments. > > BRs, > > Lin > > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote: > > Hi Lin, > > On 2020-04-26 05:10, linzang(??) wrote: > > > Hi Stefan and Paul? 
> > > I have made a new patch based on your comments and Stefan's Poc code: > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > > others to review. > > > > > > And Here are main changed I made and want to discuss with you: > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > > primitives. We're moving toward a later C++ standard were data races are > > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > only accept AbstraceGangTask* as argument, so I made a delegate class > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > > - Removing the time measurement in the run_task. I renamed G1's function > > to run_task_timed. If we need this outside of G1, we can rethink the API > > at that point. > > - ZGC style cleanups > > Thanks, > > StefanK > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > Thanks, > > > Paul > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" linzang at tencent.com> wrote: > > > > > > Dear Stefan, > > > > > > Thanks a lot! 
I agree with you to decouple the heap inspection code with GC's. > > > I will start from your POC code, may discuss with you later. > > > > > > > > > BRs, > > > Lin > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > Hi Lin, > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > I started experimenting with doing that, but other higher-priority (to > > > me) tasks have had to take precedence. > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > ParallelObjectIterators. There's also code left unimplemented in around > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > pull out the heap inspection code out of the GCs. > > > > > > Thanks, > > > StefanK > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > Dear all, > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > Thanks! > > > > > > > > BRs, > > > > Lin > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > ?? 
> >> BRs, > > > >> Lin > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > >> > > > > >> > Dear all, > > > >> > Let me try to ease the reviewing work by some explanation :P > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > >> > This patch actually do several things: > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > >> > 2. 
Change how JMap passes arguments (see http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html). Originally it passed options as separate arguments to attachListener; this patch composes all options into a single string. So the arg_count_max in attachListener.hpp does not need to be changed, which avoids the compatibility issue discussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp. Its work(uint worker_id) method prepares the data structure (KlassInfoTable) needed by every parallel worker thread, and then calls do_object_iterate_parallel(), which is the heap-specific implementation. I also added some mechanisms in KlassInfoTable to support parallel iteration, such as merge(). > > > >> > 4. In the specific heap (G1 in this patch), create a subclass of ParHeapInspectTask and implement do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invokes g1CollectedHeap's object_iterate_parallel(). > > > >> > 5. Add a related test. > > > >> > 6. It should be easy to extend this patch to other kinds of heap by creating a subclass of ParHeapInspectTask and implementing do_object_iterate_parallel(). > > > >> > > > > >> > Hope this info helps with code review and initiates the discussion :-) > > > >> > Thanks! > > > >> > > > > >> > BRs, > > > >> > Lin > > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote: > > > >> > > > > > >> > > Re-posting this RFR with the correct enhancement number to make it trackable. > > > >> > > Please ignore the previous wrong post. Sorry for the trouble. 
> > > >> > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > -------------- > > > >> > > Lin > > > >> > > >Hi Lin, > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > >> > > >the message subject? > > > >> > > >It will be more trackable this way. > > > >> > > > > > > >> > > >Thanks, > > > >> > > >Serguei > > > >> > > > > > > >> > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > > >> > > >> Dear David, > > > >> > > >> Thanks a lot! > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > >> > > >> > > > >> > > >> Thanks, > > > >> > > >> -------------- > > > >> > > >> Lin > > > >> > > >>> Hi Lin, > > > >> > > >>> > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > > >> > > >>> > > > >> > > >>> I happened to spot one nit when browsing: > > > >> > > >>> > > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > > >> > > >>> > > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > > >> > > >>> + BoolObjectClosure* filter, > > > >> > > >>> + size_t* missed_count, > > > >> > > >>> + size_t thread_num) { > > > >> > > >>> + return NULL; > > > >> > > >>> > > > >> > > >>> s/NULL/false/ > > > >> > > >>> > > > >> > > >>> Cheers, > > > >> > > >>> David > > > > > > > >>> > > > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > > > >> > > >>>> Dear All, > > > >> > > >>>> May I ask your help to review the follow changes: > > > >> > > >>>> webrev: > > > >> > > >>>>http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > > >> > > >>>> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > >>>> related CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. 
> > > >> > > >>>> > > > >> > > >>>> ------------------------------------------------------------------------ > > > >> > > >>>> BRs, > > > >> > > >>>> Lin From serguei.spitsyn at oracle.com Thu Aug 6 15:13:08 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 6 Aug 2020 08:13:08 -0700 Subject: Ping: RFR: JDK-8249550: jdb should use loopback address when not using remote agent In-Reply-To: <1f47f637-9bfa-a682-fddc-4b434ef1a858@oracle.com> References: <541a2f57-4b86-b453-7739-f1de35b52212@oracle.com> <5b200434-6181-1478-6423-70cd9150ce56@oracle.com> <467c37ea-3146-aec2-2fc5-94153acd5acc@oracle.com> <1f47f637-9bfa-a682-fddc-4b434ef1a858@oracle.com> Message-ID: Hi Alex, Thank you for the explanation. Serguei On 8/5/20 14:08, Alex Menkov wrote: > Hi Serguei, > > The original issue is that jdb with the CommandLineLaunch connector on > non-Windows systems tries to resolve the hostname and starts listening > on it. > This behavior can cause an error (this is what the bug is about) and does > not make sense, as the CommandLineLaunch connector launches a local > process, so there is no reason to listen for connections from other hosts. > > SocketTransportService is also used for the jdb "listen" and "listenany" > commands, but these commands require an address to be specified, so > startListening(String) is called, not startListening(). > > So the only behavior change is that a debuggee started by the CommandLineLaunch > connector accepts debuggers only from the local machine, but this is how > jdb works (it starts a local process and connects to it). > I don't think this requires a CSR. > > --alex > > On 08/04/2020 17:22, serguei.spitsyn at oracle.com wrote: >> Hi Alex, >> >> This looks good to me. >> But do we need a CSR for this? >> I understand that the intention is to comply with the >> TransportService spec but the behavior is being changed. >> How long did we have this behavior? 
>> >> Thanks, >> Serguei >> >> >> On 8/4/20 16:32, Alex Menkov wrote: >>> Needs one more reviewer. >>> >>> One more detail to simplify review. >>> SocketTransportService extends TransportService and the spec for >>> TransportService.startListening() is: >>> >>> /** >>> * Listens on an address chosen by the transport service. >>> * >>> * <p> This convenience method works as if by invoking >>> * {@link #startListening(String) startListening(null)}. >>> * >>> * @return a listen key to be used in subsequent calls to be >>> * {@link #accept accept} or {@link #stopListening >>> * stopListening} methods. >>> * >>> * @throws IOException >>> * If an I/O error occurs. >>> */ >>> public abstract ListenKey startListening() throws IOException; >>> >>> I.e. the fix updates SocketTransportService to comply with the spec. >>> >>> --alex >>> >>> On 07/23/2020 13:05, Chris Plummer wrote: >>>> Hi Alex, >>>> >>>> I'm no expert in this area, but the changes appear to do what you >>>> describe (use the loopback address), so thumbs up. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/21/20 3:04 PM, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> please review the fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8249550 >>>>> >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_loopback/webrev/ >>>>> >>>>> some background: >>>>> https://bugs.openjdk.java.net/browse/JDK-8041435 made listening >>>>> on the loopback address the default. >>>>> Later https://bugs.openjdk.java.net/browse/JDK-8184770 added >>>>> handling of the "*" address to listen on all addresses, but it didn't >>>>> fix the "default" startListening() method (used by jdb through >>>>> SunCommandLineLauncher). >>>>> >>>>> The method called startListening(String localaddress, int port) >>>>> with localaddress == null, but for a null localaddress this method >>>>> starts listening on all addresses (i.e. handles the null value as "*"). >>>>> The fix changes it to startListening(String address) which handles >>>>> a null address the same way as the JDI socket connector does (i.e. 
>>>>> listens on loopback address only) >>>>> >>>>> --alex >>>> >> From kevin.rushforth at oracle.com Thu Aug 6 12:44:04 2020 From: kevin.rushforth at oracle.com (Kevin Rushforth) Date: Thu, 6 Aug 2020 05:44:04 -0700 Subject: [OpenJDK 2D-Dev] RFR: 8240487 : Cleanup whitespace in .cc, .hh, .m, and .mm files In-Reply-To: <5F2B6113.5030909@oracle.com> References: <5F2B6113.5030909@oracle.com> Message-ID: <256fa7cf-8913-4638-b32d-7cfe209e81a6@oracle.com> Looks good to me. I verified that the only changes are whitespace changes, and that after applying the patch there are no more whitespace errors. +1 -- Kevin On 8/5/2020 6:46 PM, Philip Race wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8240487 > Webrev: http://cr.openjdk.java.net/~prr/8240487/ > > In advance of the move to Project Skara/git it is desirable to clean > up whitespace in source files > that are not currently checked by jcheck so we can add these > extensions to jcheck at that time. > > The fix is therefore to remove tabs and trailing spaces. > > The 3rd party harfbuzz library has .cc and .hh files but there are no > current violations there > since I've cleaned those up when importing harfbuzz upgrades. > > There is one JDK file that relates to those that inherited tabs that > is fixed. > > But almost all the fixes are in Objective C .m and .mm files. > JDK has no examples of .mm but JavaFX does so I was looking just to be > sure. > > And all but one of the .m violations are in the desktop module which > is mainly because > that is where all but 5 of the Objective-C files are. > > The only non-desktop violator is > ./jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m > and that is included in this webrev and why I've included > serviceability-dev. > > -phil. 
From Sergey.Bylokhov at oracle.com Thu Aug 6 15:16:27 2020 From: Sergey.Bylokhov at oracle.com (Sergey Bylokhov) Date: Thu, 6 Aug 2020 08:16:27 -0700 Subject: [OpenJDK 2D-Dev] RFR: 8240487 : Cleanup whitespace in .cc, .hh, .m, and .mm files In-Reply-To: <5F2B6113.5030909@oracle.com> References: <5F2B6113.5030909@oracle.com> Message-ID: Hi, Phil. Maybe we can enable jcheck for such files at the same time? On 05.08.2020 18:46, Philip Race wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8240487 > Webrev: http://cr.openjdk.java.net/~prr/8240487/ > > In advance of the move to Project Skara/git it is desirable to clean up whitespace in source files > that are not currently checked by jcheck so we can add these extensions to jcheck at that time. > > The fix is therefore to remove tabs and trailing spaces. > > The 3rd party harfbuzz library has .cc and .hh files but there are no current violations there > since I've cleaned those up when importing harfbuzz upgrades. > > There is one JDK file that relates to those that inherited tabs that is fixed. > > But almost all the fixes are in Objective C .m and .mm files. > JDK has no examples of .mm but JavaFX does so I was looking just to be sure. > > And all but one of the .m violations are in the desktop module which is mainly because > that is where all but 5 of the Objective-C files are. > > The only non-desktop violator is > ./jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m > and that is included in this webrev and why I've included serviceability-dev. > > -phil. -- Best regards, Sergey. 
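The two violations named in the quoted RFR (tabs and trailing spaces) are simple to detect mechanically. The following is a minimal sketch only, assuming the check is limited to those two conditions. It is not jcheck's implementation, and WhitespaceCheckSketch and findViolations() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker for the two conditions named in the RFR; not jcheck itself.
class WhitespaceCheckSketch {

    // Returns one entry per violation, prefixed with the 1-based line number.
    static List<String> findViolations(List<String> lines) {
        List<String> violations = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.indexOf('\t') >= 0) {
                violations.add((i + 1) + ": tab");
            }
            if (!line.isEmpty() && Character.isWhitespace(line.charAt(line.length() - 1))) {
                violations.add((i + 1) + ": trailing whitespace");
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // Line 2 contains a tab, line 3 ends with a space.
        System.out.println(findViolations(List.of("int x;", "\tint y;", "int z; ")));
    }
}
```

The lines are expected without their terminators, which is what java.nio.file.Files.readAllLines returns for a source file.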
From serguei.spitsyn at oracle.com Thu Aug 6 15:21:02 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 6 Aug 2020 08:21:02 -0700 Subject: Fwd: Re: RFR(XS): 8248879: SA core file support on OSX has some bugs trying to locate the jvm libraries In-Reply-To: <276014d9-c0c0-1fe8-8bd6-ac33262b5b8e@oracle.com> References: <7fc9666b-ce1e-f83a-e7a7-e052ca451a09@oracle.com> <625675d7-541b-7641-7306-91807fb7487c@oracle.com> <5e9ed560-32d2-33eb-8cbc-887bb97e90ca@oracle.com> <396ebbbd-42c0-2784-2426-01ccc5ead589@oracle.com> <431f7e78-a738-f1c3-8fc3-20e507b65b98@oracle.com> <5fc88753-3d07-7dc2-6925-ee954a938971@oracle.com> <276014d9-c0c0-1fe8-8bd6-ac33262b5b8e@oracle.com> Message-ID: <2c098b61-5416-94dc-deb1-4311c35aa87c@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Thu Aug 6 15:57:45 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 6 Aug 2020 08:57:45 -0700 Subject: 15: RFR(T): 8251214: ProblemList serviceability/sa/ClhsdbCDSCore.java on linux-x64 In-Reply-To: <44460dbe-9703-13a3-fe6e-7580ae0f6cd4@oracle.com> References: <4ee0694c-0bbd-3e23-ab6a-837311336755@oracle.com> <44460dbe-9703-13a3-fe6e-7580ae0f6cd4@oracle.com> Message-ID: <0dc5b775-93ed-e395-5687-735bb4dc53ad@oracle.com> I'm withdrawing this RFR. Failures are due to a host config issue. Chris On 8/5/20 10:31 PM, David Holmes wrote: > Problem listing this test seems fine to me. 
> > Thanks, > David > > On 6/08/2020 2:11 pm, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8251214 >> >> diff --git a/test/hotspot/jtreg/ProblemList.txt >> b/test/hotspot/jtreg/ProblemList.txt >> --- a/test/hotspot/jtreg/ProblemList.txt >> +++ b/test/hotspot/jtreg/ProblemList.txt >> @@ -91,6 +91,7 @@ >> # :hotspot_serviceability >> >> serviceability/sa/sadebugd/DebugdConnectTest.java 8239062 macosx-x64 >> +serviceability/sa/ClhsdbCDSCore.java 8246016 linux-x64 >> serviceability/sa/TestInstanceKlassSize.java 8230664 >> linux-ppc64le,linux-ppc64 >> serviceability/sa/TestInstanceKlassSizeForInterface.java 8230664 >> linux-ppc64le,linux-ppc64 >> serviceability/sa/TestRevPtrsForInvokeDynamic.java 8241235 generic-all >> >> We need to problem list this test on JDK 15 because it has suddenly >> started failing a lot due to [1] JDK-8246016, and we won't be >> able to get the fix for JDK-8246016 into JDK 15. Note >> the other tests this is failing with on 16 (there are 6 other SA >> tests that generate cores and can produce the failure) do not have to >> be problem listed for JDK 15 because they either do not have the core >> file check or are new to 16 and therefore are not in 15. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8246016 >> >> >> thanks, >> >> Chris >> From daniel.daugherty at oracle.com Thu Aug 6 17:17:00 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 6 Aug 2020 13:17:00 -0400 Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS In-Reply-To: References: Message-ID: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com> On 8/5/20 9:16 PM, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8251121 > http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html test/lib/jdk/test/lib/util/CoreUtils.java
You might consider two messages with timestamps: one before the move and one after the move completes. test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java No comments. Thumbs up. No need for another webrev if you decide to update the messages. I'm testing your patch on my MBP13 to verify that it solves the issue that I reported. Dan > > On OSX (and possibly some linux systems), core files are not produced > in the cwd, but instead end up in some well known location. For OSX it > is the /cores directory. The core files tend to accumulate there. This > fixes the core file accumulation problem by moving the core file into > the cwd, allowing jtreg to manage it. By default jtreg will delete the > core if the test passes, and retain it if the test fails or RETAIN=all > is specified. > > I got rid of the code in ClhsdbCDSCore.java that explicitly deletes > the core file because we don't want it deleted if RETAIN=all is used. > > thanks, > > Chris From daniel.daugherty at oracle.com Thu Aug 6 18:22:01 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 6 Aug 2020 14:22:01 -0400 Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS In-Reply-To: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com> References: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com> Message-ID: <9414ec9a-c14a-fda2-e329-c5b30cacb751@oracle.com> $ do_java_test -c fastdebug serviceability/sa 2>&1 | tee do_java_test.8251121.log INFO: GNUMAKE=make INFO: GNUMAKE version is: GNU Make 3.81 INFO: JTREG options: INFO: JOBS=1 INFO: TEST_MODE=othervm INFO: VM_OPTIONS= INFO: test_val=serviceability/sa Test Config: macosx-x86_64-normal-server-fastdebug INFO: TIMEOUT_FACTOR=6 Done testing Test Run macosx-x86_64-normal-server-fastdebug time: 7.48 minutes. TEST TOTAL PASS FAIL ERROR jtreg:open/test/hotspot/jtreg/serviceability/sa 54 54 0 0 Total test time: 7.48 minutes. 
660 2020.08.06 14:10:36 $ ls -l /cores
661 2020.08.06 14:19:18 $

When I have done this test run before, I always had 6 core files left.
Now there are none.

Dan


On 8/6/20 1:17 PM, Daniel D. Daugherty wrote:
> On 8/5/20 9:16 PM, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8251121
>> http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html
>
> test/lib/jdk/test/lib/util/CoreUtils.java
>     You might consider two messages with timestamps: one before the move
>     and one after the move completes.
>
> test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java
>     No comments.
>
> Thumbs up. No need for another webrev if you decide to update the mesgs.
>
> I'm testing your patch on my MBP13 to verify that it solves the issue
> that I reported.
>
> Dan
>
>
>>
>> On OSX (and possibly some linux systems), core files are not produced
>> in the cwd, but instead end up in some well known location. For OSX
>> it is the /cores directory. The core files tend to accumulate there.
>> This fixes the core file accumulation problem by moving the core file
>> into the cwd, allowing jtreg to manage it. By default jtreg will
>> delete the core if the test passes, and retain it if the test fails
>> or RETAIN=all is specified.
>>
>> I got rid of the code in ClhsdbCDSCore.java that explicitly deletes
>> the core file because we don't want it deleted if RETAIN=all is used.
>>
>> thanks,
>>
>> Chris
>

From ioi.lam at oracle.com Thu Aug 6 18:25:42 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 6 Aug 2020 11:25:42 -0700
Subject: RFR (xs) 8251209 [TESTBUG] CDS jvmti tests should use "@modules java.instrument"
Message-ID:

https://bugs.openjdk.java.net/browse/JDK-8251209
http://cr.openjdk.java.net/~iklam/jdk16/8251209-cds-jvmti-tests-modules-tag.v01/

Summary -- changed the tests from (mis)using

 * @requires vm.flavor != "minimal"

to

 * @modules java.instrument

... to be consistent with other jvmti tests.

Thanks
- Ioi

From chris.plummer at oracle.com Thu Aug 6 18:31:01 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 6 Aug 2020 11:31:01 -0700
Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS
In-Reply-To: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com>
References: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com>
Message-ID: <48eed7f6-2740-94d7-5e07-761a264a3842@oracle.com>

Hi Dan,

On 8/6/20 10:17 AM, Daniel D. Daugherty wrote:
> On 8/5/20 9:16 PM, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8251121
>> http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html
>
> test/lib/jdk/test/lib/util/CoreUtils.java
>     You might consider two messages with timestamps: one before the move
>     and one after the move completes.
>
Do we have any standard timestamp printing support for our jtreg tests?
I found the following in vmTestbase/nsk/share/Log.java:

    /**
     * Compose line to print possible prefixing it with timestamp.
     */
    private String composeLine(String message) {
        if (timestamp) {
            long time = System.currentTimeMillis();
            long ms = time % 1000;
            time /= 1000;
            long secs = time % 60;
            time /= 60;
            long mins = time % 60;
            time /= 60;
            long hours = time % 24;
            return "[" + hours + ":" + mins + ":" + secs + "." + ms + "] " + message;
        }
        return message;
    }

Would be nice if we had something like that more generally available to
all jtreg tests.

thanks,

Chris
> test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java
>     No comments.
>
> Thumbs up. No need for another webrev if you decide to update the mesgs.
>
> I'm testing your patch on my MBP13 to verify that it solves the issue
> that I reported.
> > Dan > > >> >> On OSX (and possibly some linux systems), core files are not produced >> in the cwd, but instead end up in some well known location. For OSX >> it is the /cores directory. The core files tend to accumulate there. >> This fixes the core file accumulation problem by moving the core file >> into the cwd, allowing jtreg to manage it. By default jtreg will >> delete the core if the test passes, and retain if if the test fails >> or RETAIN=all is specified. >> >> I got rid of the code in ClhsdbCDSCore.java that explicitly deletes >> the core file because we don't want it deleted if RETAIN=all is used. >> >> thanks, >> >> Chris > From alexey.menkov at oracle.com Thu Aug 6 19:04:12 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 6 Aug 2020 12:04:12 -0700 Subject: Fwd: Re: RFR(XS): 8248879: SA core file support on OSX has some bugs trying to locate the jvm libraries In-Reply-To: <2c098b61-5416-94dc-deb1-4311c35aa87c@oracle.com> References: <7fc9666b-ce1e-f83a-e7a7-e052ca451a09@oracle.com> <625675d7-541b-7641-7306-91807fb7487c@oracle.com> <5e9ed560-32d2-33eb-8cbc-887bb97e90ca@oracle.com> <396ebbbd-42c0-2784-2426-01ccc5ead589@oracle.com> <431f7e78-a738-f1c3-8fc3-20e507b65b98@oracle.com> <5fc88753-3d07-7dc2-6925-ee954a938971@oracle.com> <276014d9-c0c0-1fe8-8bd6-ac33262b5b8e@oracle.com> <2c098b61-5416-94dc-deb1-4311c35aa87c@oracle.com> Message-ID: +1 --alex On 08/06/2020 08:21, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > Thank you for the update. > LGTM > > Thanks, > Serguei > > > On 8/5/20 12:47, Chris Plummer wrote: >> Hi Alex and Serguei, >> >> Here's an update. 
I think I covered all recommendations: >> >> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.03/index.html >> >> Here's a diff of the changes since webrev.02: >> >> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.03/ps_core.c.diff >> >> thanks, >> >> Chris >> >> On 8/4/20 4:59 PM, serguei.spitsyn at oracle.com wrote: >>> On 8/4/20 16:46, Chris Plummer wrote: >>>> On 8/4/20 4:41 PM, Chris Plummer wrote: >>>>> On 8/4/20 4:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>> On 8/4/20 16:01, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Just a quick comment. >>>>>>> >>>>>>> This fragment is not fully safe: >>>>>>> 356 strncpy(filepath, jdk_dir, BUF_SIZE - 1); >>>>>>> 357 strncat(filepath, jdk_subdir, BUF_SIZE -1); >>>>>>> 358 strncat(filepath, filename, BUF_SIZE - 1); >>>>>>> (The line 357 misses a space before 1.) >>>>>>> >>>>>>> Both strncpy and strncat define 'n' as max size of the 'src' >>>>>>> string to be copied. >>>>>>> For instance, the strncat man says: >>>>>>> >>>>>>> *char *strncat(char **/dest/*, const char **/src/*, size_t* /n/*);* >>>>>>> "The *strncat*() function is similar, except that it will use at >>>>>>> most /n/ bytes from /src/; ..." >>>>>>> Please, see: https://linux.die.net/man/3/strncat >>>>>> Forgot to say... >>>>>> >>>>>> This part of the strncpy description looks dangerous: >>>>>> If the length of /src/ is less than /n/, *strncpy*() *writes >>>>>> additional null bytes to **/dest/**to ensure that a total of >>>>>> **/n/**bytes are written*. >>>>>> See: https://linux.die.net/man/3/strncpy >>>>>> >>>>> Yes. The strncpy code is still correct since I only use it for the >>>>> first copy, although it is a bit wasteful to have it add all those >>>>> extra null bytes. I could probably get away with strcpy here since >>>>> we know the incoming path are all limited in size to BUF_SIZE. >>>> I take that back. 
When JAVA_HOME is passed, there is nothing >>>> preventing it from being any size, so it could be bigger than >>>> BUF_SIZE. So I guess I need to leave it as strncpy. >>> >>> Then just keep in mind the 'filepath' in case of a size bigger that >>> BUF_SIZE won't be null-terminated: >>> 356 strncpy(filepath, jdk_dir, BUF_SIZE - 1); >>> You may need to write the '\0' explicitly: >>> filepath[BUF_SIZE - 1] = '\0'; >>> Alternatively, it can be simplier to initialize the local buffer: >>> 355 char filepath[BUF_SIZE] = { '\0' }; >>> >>> Thanks, >>> Serguei >>> >>> >>>> >>>> Chris >>>>> >>>>> Chris >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> So, something like this would be safe: >>>>>>> 356 strncpy(filepath, jdk_dir, BUF_SIZE - 1); >>>>>>> 357 strncat(filepath, jdk_subdir, BUF_SIZE - 1 - strlen(filepath)); >>>>>>> 358 strncat(filepath, filename, BUF_SIZE - 1 - strlen(filepath)); >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 8/4/20 11:05, Chris Plummer wrote: >>>>>>>> Ping! Serguei and Alex can you have a look at this? >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> -------- Forwarded Message -------- >>>>>>>> Subject: Re: RFR(XS): 8248879: SA core file support on OSX has >>>>>>>> some bugs trying to locate the jvm libraries >>>>>>>> Date: Tue, 28 Jul 2020 20:40:31 -0700 >>>>>>>> From: Chris Plummer >>>>>>>> To: serguei.spitsyn at oracle.com , >>>>>>>> Alex Menkov , serviceability-dev >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Hi Serguei and Alex, >>>>>>>> >>>>>>>> Sorry about the delay getting back to this. I got sidetracked >>>>>>>> with other bugs and also realized the code needed more work than >>>>>>>> just Alex's suggestion for rstrstr(). >>>>>>>> >>>>>>>> As a bit of background first, get_real_path() is used to locate >>>>>>>> any library that is referenced from the core file using a >>>>>>>> relative path. 
So the core file will, for example, refer to >>>>>>>> @rpath/libjvm.dylib, and get_real_path() will convert that to a >>>>>>>> usable path to the file. Usually only JDK libraries and user >>>>>>>> libraries are specified with @rpath. System libraries all use >>>>>>>> full path names. >>>>>>>> >>>>>>>> get_real_path() had a couple of shortcomings. The way it worked >>>>>>>> is if the specified execname ended in bin/java or if $JAVA_HOME >>>>>>>> was set, then it only checked for libraries in subdirs of the >>>>>>>> first one of those 2 that it found to be valid. It would not >>>>>>>> look in both directories if both were valid, only in the first >>>>>>>> to be found valid. Only if neither of those were valid did it >>>>>>>> look in DYLD_LIBRARY_PATH. So, for example, as long as execname >>>>>>>> ended in bin/java, that's the only jdk directory that was >>>>>>>> checked for libraries. If it didn't end in bin/java, and >>>>>>>> $JAVA_HOME was set, then only it was checked. Then I added a 3rd >>>>>>>> option looking for the existence of any "bin/" in execname. Only >>>>>>>> if none of these 3 paths existed did the code defer to >>>>>>>> DYLD_LIBRARY_PATH. That made is hard to locate non JDK >>>>>>>> libraries, such as user JNI libraries, or to override the >>>>>>>> execname search for the JDK by setting $JAVA_HOME. >>>>>>>> >>>>>>>> I've fixed this by having it check all 3 of the potential JDK >>>>>>>> locations not only to see if the paths are valid, but also if >>>>>>>> the library is in any of the paths, and then check all the paths >>>>>>>> DYLD_LIBRARY_PATH if it failed to find the library in the JDK >>>>>>>> paths. So now all the potential locations are checked to see if >>>>>>>> they contain the library. By doing this I was able to make it >>>>>>>> find the JDK libraries by properly specifying the execname or >>>>>>>> JAVA_HOME, and still find a user JNI library in DYLD_LIBRARY_PATH. 
>>>>>>>> >>>>>>>> Since the code was kind of a mess and not well suited to just >>>>>>>> fix with some minor adjustments, I for the most part rewrote it. >>>>>>>> Although it still does a lot of the same things, it's much >>>>>>>> cleaner and easier to read now, and there's less replication of >>>>>>>> similar code. I also replaced strcat and strcpy calls with >>>>>>>> strncat and strncpy to prevent overflows. I would suggest for >>>>>>>> this review to just start by looking at get_real_path() and >>>>>>>> follow the code, and not compare the diffs, which aren't very >>>>>>>> readable. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.02/index.html >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> >>>>>>>> On 7/14/20 8:54 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Alex, >>>>>>>>> >>>>>>>>> Yes, I understand this. >>>>>>>>> After some thinking, I doubt my suggestion to check all >>>>>>>>> occurrences or "/bin/" is good. :) >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> On 7/14/20 18:19, Alex Menkov wrote: >>>>>>>>>> Hi Serguei, >>>>>>>>>> >>>>>>>>>> On 07/14/2020 15:55, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi Chris and Alex, >>>>>>>>>>> >>>>>>>>>>> I agree the last occurrence of "/bin/" is better than the first. >>>>>>>>>>> But I wonder if it makes sense to check all occurrences. >>>>>>>>>> >>>>>>>>>> The problem is strrstr (search for last occurrence) is not a >>>>>>>>>> part of std C lib. >>>>>>>>>> So to avoid dependency on new library I suggested this simple >>>>>>>>>> implementation using standard strstr. >>>>>>>>>> >>>>>>>>>> --alex >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 7/14/20 15:14, Alex Menkov wrote: >>>>>>>>>>>> Yes, you are right. 
>>>>>>>>>>>> This is not a function from strings.h >>>>>>>>>>>> >>>>>>>>>>>> Ok, you can leave strstr (and keep in mind that the path >>>>>>>>>>>> can't contain "/bin/" other than jdk's bin) or implement the >>>>>>>>>>>> functionality. It should be something simple like >>>>>>>>>>>> >>>>>>>>>>>> static const char* rstrstr(const char *str, const char *sub) { >>>>>>>>>>>> ?? const char *result = NULL; >>>>>>>>>>>> ?? for (const char *p = strstr(str, sub); p != NULL; p = >>>>>>>>>>>> strstr(p + 1, sub)) { >>>>>>>>>>>> ?????? result = p; >>>>>>>>>>>> ?? } >>>>>>>>>>>> ?? return result; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> --alex >>>>>>>>>>>> >>>>>>>>>>>> On 07/14/2020 13:43, Chris Plummer wrote: >>>>>>>>>>>>> Actually it's not so easy. I don't see any other references >>>>>>>>>>>>> to strrstr in our source. When I reference strstr, it gives >>>>>>>>>>>>> a warning because it's not declared. The only man page I >>>>>>>>>>>>> can find says to include sstring2.h, but this file does not >>>>>>>>>>>>> exist. It also says to link with -lsstrings2. >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 7/14/20 1:37 PM, Chris Plummer wrote: >>>>>>>>>>>>>> Ok. I'll change both references to use strrstr. >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 7/14/20 1:11 PM, Alex Menkov wrote: >>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I think it would be better to use strrstr to correctly >>>>>>>>>>>>>>> handle paths like >>>>>>>>>>>>>>> /something/bin/jdk/bin/jhsdb >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> And I'd updated >>>>>>>>>>>>>>> 358???? char* posbin = strstr(execname, "/bin/java"); >>>>>>>>>>>>>>> to use strrstr as well >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> --alex >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 07/14/2020 12:01, Chris Plummer wrote: >>>>>>>>>>>>>>>> Ping! 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 7/6/20 9:31 PM, Chris Plummer wrote: >>>>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please help review the following: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.00/index.html >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248879 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The description of the problem and the fix are both in >>>>>>>>>>>>>>>>> the CR. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From chris.plummer at oracle.com Thu Aug 6 20:06:26 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 6 Aug 2020 13:06:26 -0700 Subject: Fwd: Re: RFR(XS): 8248879: SA core file support on OSX has some bugs trying to locate the jvm libraries In-Reply-To: References: <7fc9666b-ce1e-f83a-e7a7-e052ca451a09@oracle.com> <625675d7-541b-7641-7306-91807fb7487c@oracle.com> <5e9ed560-32d2-33eb-8cbc-887bb97e90ca@oracle.com> <396ebbbd-42c0-2784-2426-01ccc5ead589@oracle.com> <431f7e78-a738-f1c3-8fc3-20e507b65b98@oracle.com> <5fc88753-3d07-7dc2-6925-ee954a938971@oracle.com> <276014d9-c0c0-1fe8-8bd6-ac33262b5b8e@oracle.com> <2c098b61-5416-94dc-deb1-4311c35aa87c@oracle.com> Message-ID: Thanks! On 8/6/20 12:04 PM, Alex Menkov wrote: > +1 > > --alex > > On 08/06/2020 08:21, serguei.spitsyn at oracle.com wrote: >> Hi Chris, >> >> Thank you for the update. >> LGTM >> >> Thanks, >> Serguei >> >> >> On 8/5/20 12:47, Chris Plummer wrote: >>> Hi Alex and Serguei, >>> >>> Here's an update. 
I think I covered all recommendations: >>> >>> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.03/index.html >>> >>> Here's a diff of the changes since webrev.02: >>> >>> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.03/ps_core.c.diff >>> >>> thanks, >>> >>> Chris >>> >>> On 8/4/20 4:59 PM, serguei.spitsyn at oracle.com wrote: >>>> On 8/4/20 16:46, Chris Plummer wrote: >>>>> On 8/4/20 4:41 PM, Chris Plummer wrote: >>>>>> On 8/4/20 4:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> On 8/4/20 16:01, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Just a quick comment. >>>>>>>> >>>>>>>> This fragment is not fully safe: >>>>>>>> 356 strncpy(filepath, jdk_dir, BUF_SIZE - 1); >>>>>>>> 357 strncat(filepath, jdk_subdir, BUF_SIZE -1); >>>>>>>> 358 strncat(filepath, filename, BUF_SIZE - 1); >>>>>>>> (The line 357 misses a space before 1.) >>>>>>>> >>>>>>>> Both strncpy and strncat define 'n' as max size of the 'src' >>>>>>>> string to be copied. >>>>>>>> For instance, the strncat man says: >>>>>>>> >>>>>>>> *char *strncat(char **/dest/*, const char **/src/*, size_t* >>>>>>>> /n/*);* >>>>>>>> "The *strncat*() function is similar, except that it will use >>>>>>>> at most /n/ bytes from /src/; ..." >>>>>>>> Please, see: https://linux.die.net/man/3/strncat >>>>>>> Forgot to say... >>>>>>> >>>>>>> This part of the strncpy description looks dangerous: >>>>>>> If the length of /src/ is less than /n/, *strncpy*() *writes >>>>>>> additional null bytes to **/dest/**to ensure that a total of >>>>>>> **/n/**bytes are written*. >>>>>>> See: https://linux.die.net/man/3/strncpy >>>>>>> >>>>>> Yes. The strncpy code is still correct since I only use it for >>>>>> the first copy, although it is a bit wasteful to have it add all >>>>>> those extra null bytes. I could probably get away with strcpy >>>>>> here since we know the incoming path are all limited in size to >>>>>> BUF_SIZE. >>>>> I take that back. 
When JAVA_HOME is passed, there is nothing >>>>> preventing it from being any size, so it could be bigger than >>>>> BUF_SIZE. So I guess I need to leave it as strncpy. >>>> >>>> Then just keep in mind the 'filepath' in case of a size bigger that >>>> BUF_SIZE won't be null-terminated: >>>> 356 strncpy(filepath, jdk_dir, BUF_SIZE - 1); >>>> You may need to write the '\0' explicitly: >>>> filepath[BUF_SIZE - 1] = '\0'; >>>> Alternatively, it can be simplier to initialize the local buffer: >>>> 355 char filepath[BUF_SIZE] = { '\0' }; >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>>> >>>>> Chris >>>>>> >>>>>> Chris >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>>> So, something like this would be safe: >>>>>>>> 356 strncpy(filepath, jdk_dir, BUF_SIZE - 1); >>>>>>>> 357 strncat(filepath, jdk_subdir, BUF_SIZE - 1 - >>>>>>>> strlen(filepath)); >>>>>>>> 358 strncat(filepath, filename, BUF_SIZE - 1 - strlen(filepath)); >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 8/4/20 11:05, Chris Plummer wrote: >>>>>>>>> Ping! Serguei and Alex can you have a look at this? >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> -------- Forwarded Message -------- >>>>>>>>> Subject:???? Re: RFR(XS): 8248879: SA core file support on OSX >>>>>>>>> has some bugs trying to locate the jvm libraries >>>>>>>>> Date:???? Tue, 28 Jul 2020 20:40:31 -0700 >>>>>>>>> From:???? Chris Plummer >>>>>>>>> To:???? serguei.spitsyn at oracle.com >>>>>>>>> , Alex Menkov >>>>>>>>> , serviceability-dev >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Serguei and Alex, >>>>>>>>> >>>>>>>>> Sorry about the delay getting back to this. I got sidetracked >>>>>>>>> with other bugs and also realized the code needed more work >>>>>>>>> than just Alex's suggestion for rstrstr(). >>>>>>>>> >>>>>>>>> As a bit of background first, get_real_path() is used to >>>>>>>>> locate any library that is referenced from the core file using >>>>>>>>> a relative path. 
So the core file will, for example, refer to >>>>>>>>> @rpath/libjvm.dylib, and get_real_path() will convert that to >>>>>>>>> a usable path to the file. Usually only JDK libraries and user >>>>>>>>> libraries are specified with @rpath. System libraries all use >>>>>>>>> full path names. >>>>>>>>> >>>>>>>>> get_real_path() had a couple of shortcomings. The way it >>>>>>>>> worked is if the specified execname ended in bin/java or if >>>>>>>>> $JAVA_HOME was set, then it only checked for libraries in >>>>>>>>> subdirs of the first one of those 2 that it found to be valid. >>>>>>>>> It would not look in both directories if both were valid, only >>>>>>>>> in the first to be found valid. Only if neither of those were >>>>>>>>> valid did it look in DYLD_LIBRARY_PATH. So, for example, as >>>>>>>>> long as execname ended in bin/java, that's the only jdk >>>>>>>>> directory that was checked for libraries. If it didn't end in >>>>>>>>> bin/java, and $JAVA_HOME was set, then only it was checked. >>>>>>>>> Then I added a 3rd option looking for the existence of any >>>>>>>>> "bin/" in execname. Only if none of these 3 paths existed did >>>>>>>>> the code defer to DYLD_LIBRARY_PATH. That made is hard to >>>>>>>>> locate non JDK libraries, such as user JNI libraries, or to >>>>>>>>> override the execname search for the JDK by setting $JAVA_HOME. >>>>>>>>> >>>>>>>>> I've fixed this by having it check all 3 of the potential JDK >>>>>>>>> locations not only to see if the paths are valid, but also if >>>>>>>>> the library is in any of the paths, and then check all the >>>>>>>>> paths DYLD_LIBRARY_PATH if it failed to find the library in >>>>>>>>> the JDK paths. So now all the potential locations are checked >>>>>>>>> to see if they contain the library. By doing this I was able >>>>>>>>> to make it find the JDK libraries by properly specifying the >>>>>>>>> execname or JAVA_HOME, and still find a user JNI library in >>>>>>>>> DYLD_LIBRARY_PATH. 
>>>>>>>>> >>>>>>>>> Since the code was kind of a mess and not well suited to just >>>>>>>>> fix with some minor adjustments, I for the most part rewrote >>>>>>>>> it. Although it still does a lot of the same things, it's much >>>>>>>>> cleaner and easier to read now, and there's less replication >>>>>>>>> of similar code. I also replaced strcat and strcpy calls with >>>>>>>>> strncat and strncpy to prevent overflows. I would suggest for >>>>>>>>> this review to just start by looking at get_real_path() and >>>>>>>>> follow the code, and not compare the diffs, which aren't very >>>>>>>>> readable. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.02/index.html >>>>>>>>> >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7/14/20 8:54 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi Alex, >>>>>>>>>> >>>>>>>>>> Yes, I understand this. >>>>>>>>>> After some thinking, I doubt my suggestion to check all >>>>>>>>>> occurrences or "/bin/" is good. :) >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> On 7/14/20 18:19, Alex Menkov wrote: >>>>>>>>>>> Hi Serguei, >>>>>>>>>>> >>>>>>>>>>> On 07/14/2020 15:55, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Hi Chris and Alex, >>>>>>>>>>>> >>>>>>>>>>>> I agree the last occurrence of "/bin/" is better than the >>>>>>>>>>>> first. >>>>>>>>>>>> But I wonder if it makes sense to check all occurrences. >>>>>>>>>>> >>>>>>>>>>> The problem is strrstr (search for last occurrence) is not a >>>>>>>>>>> part of std C lib. >>>>>>>>>>> So to avoid dependency on new library I suggested this >>>>>>>>>>> simple implementation using standard strstr. >>>>>>>>>>> >>>>>>>>>>> --alex >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 7/14/20 15:14, Alex Menkov wrote: >>>>>>>>>>>>> Yes, you are right. 
>>>>>>>>>>>>> This is not a function from strings.h >>>>>>>>>>>>> >>>>>>>>>>>>> Ok, you can leave strstr (and keep in mind that the path >>>>>>>>>>>>> can't contain "/bin/" other than jdk's bin) or implement >>>>>>>>>>>>> the functionality. It should be something simple like >>>>>>>>>>>>> >>>>>>>>>>>>> static const char* rstrstr(const char *str, const char >>>>>>>>>>>>> *sub) { >>>>>>>>>>>>> ?? const char *result = NULL; >>>>>>>>>>>>> ?? for (const char *p = strstr(str, sub); p != NULL; p = >>>>>>>>>>>>> strstr(p + 1, sub)) { >>>>>>>>>>>>> ?????? result = p; >>>>>>>>>>>>> ?? } >>>>>>>>>>>>> ?? return result; >>>>>>>>>>>>> } >>>>>>>>>>>>> >>>>>>>>>>>>> --alex >>>>>>>>>>>>> >>>>>>>>>>>>> On 07/14/2020 13:43, Chris Plummer wrote: >>>>>>>>>>>>>> Actually it's not so easy. I don't see any other >>>>>>>>>>>>>> references to strrstr in our source. When I reference >>>>>>>>>>>>>> strstr, it gives a warning because it's not declared. The >>>>>>>>>>>>>> only man page I can find says to include sstring2.h, but >>>>>>>>>>>>>> this file does not exist. It also says to link with >>>>>>>>>>>>>> -lsstrings2. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 7/14/20 1:37 PM, Chris Plummer wrote: >>>>>>>>>>>>>>> Ok. I'll change both references to use strrstr. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 7/14/20 1:11 PM, Alex Menkov wrote: >>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I think it would be better to use strrstr to correctly >>>>>>>>>>>>>>>> handle paths like >>>>>>>>>>>>>>>> /something/bin/jdk/bin/jhsdb >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And I'd updated >>>>>>>>>>>>>>>> 358???? char* posbin = strstr(execname, "/bin/java"); >>>>>>>>>>>>>>>> to use strrstr as well >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> --alex >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 07/14/2020 12:01, Chris Plummer wrote: >>>>>>>>>>>>>>>>> Ping! 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 7/6/20 9:31 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please help review the following:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8248879/webrev.00/index.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248879
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The description of the problem and the fix are both
>>>>>>>>>>>>>>>>>> in the CR.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>

From chris.plummer at oracle.com Thu Aug 6 21:50:56 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 6 Aug 2020 14:50:56 -0700
Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS
In-Reply-To: <48eed7f6-2740-94d7-5e07-761a264a3842@oracle.com>
References: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com> <48eed7f6-2740-94d7-5e07-761a264a3842@oracle.com>
Message-ID: <1432df97-8207-1f48-85f2-e560b48febec@oracle.com>

Hi David and Dan,

I went with just logging how long the copy takes. Here's all the code
involved:

            if (corePath.getParent() != null) {
                Path coreFileName = corePath.getFileName();
                System.out.println("Moving core file to cwd: " + coreFileName);
                long startTime = System.currentTimeMillis();
                Files.move(corePath, coreFileName);
                System.out.println("Core file move took " + (System.currentTimeMillis() - startTime) + "ms");
                coreFileLocation = coreFileName.toString();
            }

On linux where it didn't actually end up having to move the file (src
and dest paths are the same), it reported 0ms. On OSX where it did move
the file from /cores, it reported 2ms.
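
For anyone who wants to try the move-and-time step outside the test library, here is a minimal standalone sketch. It is not the CoreUtils.java code: the class name CoreMoveSketch, the method moveIntoDir, and the file name core.12345 are hypothetical, and the sketch assumes an explicit same-path check and System.nanoTime() in place of the getParent() test and currentTimeMillis() shown above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CoreMoveSketch {
    /**
     * Moves corePath into targetDir unless it is already located there,
     * logging how long the move took. Returns the resulting location.
     * Assumes corePath names a file (getFileName() is non-null).
     */
    public static Path moveIntoDir(Path corePath, Path targetDir) throws IOException {
        Path dest = targetDir.resolve(corePath.getFileName());
        Path src = corePath.toAbsolutePath().normalize();
        if (src.equals(dest.toAbsolutePath().normalize())) {
            // Nothing to do: the core was already produced in the target dir.
            System.out.println("Core file already in target dir: " + dest);
            return dest;
        }
        System.out.println("Moving core file to: " + dest);
        long start = System.nanoTime();
        // Without REPLACE_EXISTING this fails if dest already exists.
        Files.move(corePath, dest);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Core file move took " + elapsedMs + "ms");
        return dest;
    }

    public static void main(String[] args) throws IOException {
        // Temp directories stand in for /cores and the test's cwd.
        Path coresDir = Files.createTempDirectory("cores");
        Path cwd = Files.createTempDirectory("cwd");
        Path core = Files.createFile(coresDir.resolve("core.12345"));
        Path moved = moveIntoDir(core, cwd);
        System.out.println("Core file now at: " + moved);
    }
}
```

The explicit equality check only makes the "no move needed" case visible in the log; the snippet above relies on Files.move having no effect when source and target are the same file.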
Let me know if you're ok with these changes. thanks, Chris On 8/6/20 11:31 AM, Chris Plummer wrote: > Hi Dan, > > On 8/6/20 10:17 AM, Daniel D. Daugherty wrote: >> On 8/5/20 9:16 PM, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8251121 >>> http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html >> >> test/lib/jdk/test/lib/util/CoreUtils.java >> ??? You might consider two messages with timestamps: one before the move >> ??? and one after the move completes. >> > Do we have an standard timestamp printing support for our jtreg tests? > I found the following in vmTestbase/nsk/share/Log.java: > > ??? /** > ???? * Compose line to print possible prefixing it with timestamp. > ???? */ > ??? private String composeLine(String message) { > ??????? if (timestamp) { > ??????????? long time = System.currentTimeMillis(); > ??????????? long ms = time % 1000; > ??????????? time /= 1000; > ??????????? long secs = time % 60; > ??????????? time /= 60; > ??????????? long mins = time % 60; > ??????????? time /= 60; > ??????????? long hours = time % 24; > ??????????? return "[" + hours + ":" + mins + ":" + secs + "." + ms + > "] " + message; > ??????? } > ??????? return message; > ??? } > > Would be nice if we had something like that more generally available > to all jtreg tests. > > thanks, > > Chris >> test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java >> ??? No comments. >> >> Thumbs up. No need for another webrev if you decide to update the mesgs. >> >> I'm testing your patch on my MBP13 to verify that it solves the issue >> that I reported. >> >> Dan >> >> >>> >>> On OSX (and possibly some linux systems), core files are not >>> produced in the cwd, but instead end up in some well known location. >>> For OSX it is the /cores directory. The core files tend to >>> accumulate there. This fixes the core file accumulation problem by >>> moving the core file into the cwd, allowing jtreg to manage it. 
By >>> default jtreg will delete the core if the test passes, and retain if >>> if the test fails or RETAIN=all is specified. >>> >>> I got rid of the code in ClhsdbCDSCore.java that explicitly deletes >>> the core file because we don't want it deleted if RETAIN=all is used. >>> >>> thanks, >>> >>> Chris >> > From david.holmes at oracle.com Thu Aug 6 21:52:59 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 7 Aug 2020 07:52:59 +1000 Subject: RFR (xs) 8251209 [TESTBUG] CDS jvmti tests should use "@modules java.instrument" In-Reply-To: References: Message-ID: <2b9000ef-9a09-caf4-8610-bb354afb722b@oracle.com> Hi Ioi, On 7/08/2020 4:25 am, Ioi Lam wrote: > https://bugs.openjdk.java.net/browse/JDK-8251209 > http://cr.openjdk.java.net/~iklam/jdk16/8251209-cds-jvmti-tests-modules-tag.v01/ > > > Summary -- changed the tests from (mis)using > > ?* @requires vm.flavor != "minimal" > > to > > ?* @modules java.instrument > > ... to be consistent with other jvmti tests. That seems like an invalid precondition to me. It would have been somewhat valid in the Compact Profiles world when we did not provide "java.instrument" in the profiles which supported MinimalVM, but you can define a minimal VM in a build that still has all modules available. I don't think building the minimal VM makes any changes to the supported modules. Also AIUI the @modules statement simply adds the necessary command-line args to use the java.instrument module (if present), it doesn't ensure that the listed module has to be present. 
David > Thanks > - Ioi From david.holmes at oracle.com Thu Aug 6 21:58:37 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 7 Aug 2020 07:58:37 +1000 Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS In-Reply-To: <1432df97-8207-1f48-85f2-e560b48febec@oracle.com> References: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com> <48eed7f6-2740-94d7-5e07-761a264a3842@oracle.com> <1432df97-8207-1f48-85f2-e560b48febec@oracle.com> Message-ID: <3ad92452-142d-e00c-e25b-042c1814d95b@oracle.com> Update looks good. Thanks, David On 7/08/2020 7:50 am, Chris Plummer wrote: > Hi David and Dan, > > I went with just logging how long the copy takes. Here's all the code > involved: > > ??????????? if (corePath.getParent() != null) { > ??????????????? Path coreFileName = corePath.getFileName(); > ??????????????? System.out.println("Moving core file to cwd: " + > coreFileName); > ??????????????? long startTime = System.currentTimeMillis(); > ??????????????? Files.move(corePath, coreFileName); > ??????????????? System.out.println("Core file move took " + > (System.currentTimeMillis() - startTime) + "ms"); > ??????????????? coreFileLocation = coreFileName.toString(); > ??????????? } > > On linux where it didn't actually end up having to move the file (src > and dest paths are the same), it reported 0ms. On OSX where it did move > the file from /cores, it reported 2ms. > > Let me know if you're ok with these changes. > > thanks, > > Chris > > On 8/6/20 11:31 AM, Chris Plummer wrote: >> Hi Dan, >> >> On 8/6/20 10:17 AM, Daniel D. Daugherty wrote: >>> On 8/5/20 9:16 PM, Chris Plummer wrote: >>>> Hello, >>>> >>>> Please review the following: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8251121 >>>> http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html >>> >>> test/lib/jdk/test/lib/util/CoreUtils.java >>> ??? You might consider two messages with timestamps: one before the move >>> ??? and one after the move completes. 
>>> >> Do we have an standard timestamp printing support for our jtreg tests? >> I found the following in vmTestbase/nsk/share/Log.java: >> >> ??? /** >> ???? * Compose line to print possible prefixing it with timestamp. >> ???? */ >> ??? private String composeLine(String message) { >> ??????? if (timestamp) { >> ??????????? long time = System.currentTimeMillis(); >> ??????????? long ms = time % 1000; >> ??????????? time /= 1000; >> ??????????? long secs = time % 60; >> ??????????? time /= 60; >> ??????????? long mins = time % 60; >> ??????????? time /= 60; >> ??????????? long hours = time % 24; >> ??????????? return "[" + hours + ":" + mins + ":" + secs + "." + ms + >> "] " + message; >> ??????? } >> ??????? return message; >> ??? } >> >> Would be nice if we had something like that more generally available >> to all jtreg tests. >> >> thanks, >> >> Chris >>> test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java >>> ??? No comments. >>> >>> Thumbs up. No need for another webrev if you decide to update the mesgs. >>> >>> I'm testing your patch on my MBP13 to verify that it solves the issue >>> that I reported. >>> >>> Dan >>> >>> >>>> >>>> On OSX (and possibly some linux systems), core files are not >>>> produced in the cwd, but instead end up in some well known location. >>>> For OSX it is the /cores directory. The core files tend to >>>> accumulate there. This fixes the core file accumulation problem by >>>> moving the core file into the cwd, allowing jtreg to manage it. By >>>> default jtreg will delete the core if the test passes, and retain if >>>> if the test fails or RETAIN=all is specified. >>>> >>>> I got rid of the code in ClhsdbCDSCore.java that explicitly deletes >>>> the core file because we don't want it deleted if RETAIN=all is used. >>>> >>>> thanks, >>>> >>>> Chris >>> >> > From daniel.daugherty at oracle.com Thu Aug 6 22:10:58 2020 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 6 Aug 2020 18:10:58 -0400 Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS In-Reply-To: <3ad92452-142d-e00c-e25b-042c1814d95b@oracle.com> References: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com> <48eed7f6-2740-94d7-5e07-761a264a3842@oracle.com> <1432df97-8207-1f48-85f2-e560b48febec@oracle.com> <3ad92452-142d-e00c-e25b-042c1814d95b@oracle.com> Message-ID: <77b47feb-e69b-7333-a296-aff6575f8d16@oracle.com> +1 Dan On 8/6/20 5:58 PM, David Holmes wrote: > Update looks good. > > Thanks, > David > > On 7/08/2020 7:50 am, Chris Plummer wrote: >> Hi David and Dan, >> >> I went with just logging how long the copy takes. Here's all the code >> involved: >> >> ???????????? if (corePath.getParent() != null) { >> ???????????????? Path coreFileName = corePath.getFileName(); >> ???????????????? System.out.println("Moving core file to cwd: " + >> coreFileName); >> ???????????????? long startTime = System.currentTimeMillis(); >> ???????????????? Files.move(corePath, coreFileName); >> ???????????????? System.out.println("Core file move took " + >> (System.currentTimeMillis() - startTime) + "ms"); >> ???????????????? coreFileLocation = coreFileName.toString(); >> ???????????? } >> >> On linux where it didn't actually end up having to move the file (src >> and dest paths are the same), it reported 0ms. On OSX where it did >> move the file from /cores, it reported 2ms. >> >> Let me know if you're ok with these changes. >> >> thanks, >> >> Chris >> >> On 8/6/20 11:31 AM, Chris Plummer wrote: >>> Hi Dan, >>> >>> On 8/6/20 10:17 AM, Daniel D. Daugherty wrote: >>>> On 8/5/20 9:16 PM, Chris Plummer wrote: >>>>> Hello, >>>>> >>>>> Please review the following: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8251121 >>>>> http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html >>>> >>>> test/lib/jdk/test/lib/util/CoreUtils.java >>>> ??? You might consider two messages with timestamps: one before the >>>> move >>>> ??? 
and one after the move completes. >>>> >>> Do we have an standard timestamp printing support for our jtreg >>> tests? I found the following in vmTestbase/nsk/share/Log.java: >>> >>> ??? /** >>> ???? * Compose line to print possible prefixing it with timestamp. >>> ???? */ >>> ??? private String composeLine(String message) { >>> ??????? if (timestamp) { >>> ??????????? long time = System.currentTimeMillis(); >>> ??????????? long ms = time % 1000; >>> ??????????? time /= 1000; >>> ??????????? long secs = time % 60; >>> ??????????? time /= 60; >>> ??????????? long mins = time % 60; >>> ??????????? time /= 60; >>> ??????????? long hours = time % 24; >>> ??????????? return "[" + hours + ":" + mins + ":" + secs + "." + ms >>> + "] " + message; >>> ??????? } >>> ??????? return message; >>> ??? } >>> >>> Would be nice if we had something like that more generally available >>> to all jtreg tests. >>> >>> thanks, >>> >>> Chris >>>> test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java >>>> ??? No comments. >>>> >>>> Thumbs up. No need for another webrev if you decide to update the >>>> mesgs. >>>> >>>> I'm testing your patch on my MBP13 to verify that it solves the issue >>>> that I reported. >>>> >>>> Dan >>>> >>>> >>>>> >>>>> On OSX (and possibly some linux systems), core files are not >>>>> produced in the cwd, but instead end up in some well known >>>>> location. For OSX it is the /cores directory. The core files tend >>>>> to accumulate there. This fixes the core file accumulation problem >>>>> by moving the core file into the cwd, allowing jtreg to manage it. >>>>> By default jtreg will delete the core if the test passes, and >>>>> retain if if the test fails or RETAIN=all is specified. >>>>> >>>>> I got rid of the code in ClhsdbCDSCore.java that explicitly >>>>> deletes the core file because we don't want it deleted if >>>>> RETAIN=all is used. 
>>>>> >>>>> thanks, >>>>> >>>>> Chris >>>> >>> >> From chris.plummer at oracle.com Thu Aug 6 22:53:29 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 6 Aug 2020 15:53:29 -0700 Subject: RFR(XS): 8251121: six SA tests leave core files behind on macOS In-Reply-To: <77b47feb-e69b-7333-a296-aff6575f8d16@oracle.com> References: <26ef8c3d-c59d-a97c-a56b-6940a4fd5283@oracle.com> <48eed7f6-2740-94d7-5e07-761a264a3842@oracle.com> <1432df97-8207-1f48-85f2-e560b48febec@oracle.com> <3ad92452-142d-e00c-e25b-042c1814d95b@oracle.com> <77b47feb-e69b-7333-a296-aff6575f8d16@oracle.com> Message-ID: Thanks! On 8/6/20 3:10 PM, Daniel D. Daugherty wrote: > +1 > > Dan > > > On 8/6/20 5:58 PM, David Holmes wrote: >> Update looks good. >> >> Thanks, >> David >> >> On 7/08/2020 7:50 am, Chris Plummer wrote: >>> Hi David and Dan, >>> >>> I went with just logging how long the copy takes. Here's all the >>> code involved: >>> >>> ???????????? if (corePath.getParent() != null) { >>> ???????????????? Path coreFileName = corePath.getFileName(); >>> ???????????????? System.out.println("Moving core file to cwd: " + >>> coreFileName); >>> ???????????????? long startTime = System.currentTimeMillis(); >>> ???????????????? Files.move(corePath, coreFileName); >>> ???????????????? System.out.println("Core file move took " + >>> (System.currentTimeMillis() - startTime) + "ms"); >>> ???????????????? coreFileLocation = coreFileName.toString(); >>> ???????????? } >>> >>> On linux where it didn't actually end up having to move the file >>> (src and dest paths are the same), it reported 0ms. On OSX where it >>> did move the file from /cores, it reported 2ms. >>> >>> Let me know if you're ok with these changes. >>> >>> thanks, >>> >>> Chris >>> >>> On 8/6/20 11:31 AM, Chris Plummer wrote: >>>> Hi Dan, >>>> >>>> On 8/6/20 10:17 AM, Daniel D. 
Daugherty wrote: >>>>> On 8/5/20 9:16 PM, Chris Plummer wrote: >>>>>> Hello, >>>>>> >>>>>> Please review the following: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8251121 >>>>>> http://cr.openjdk.java.net/~cjplummer/8251121/webrev.00/index.html >>>>> >>>>> test/lib/jdk/test/lib/util/CoreUtils.java >>>>> ??? You might consider two messages with timestamps: one before >>>>> the move >>>>> ??? and one after the move completes. >>>>> >>>> Do we have an standard timestamp printing support for our jtreg >>>> tests? I found the following in vmTestbase/nsk/share/Log.java: >>>> >>>> ??? /** >>>> ???? * Compose line to print possible prefixing it with timestamp. >>>> ???? */ >>>> ??? private String composeLine(String message) { >>>> ??????? if (timestamp) { >>>> ??????????? long time = System.currentTimeMillis(); >>>> ??????????? long ms = time % 1000; >>>> ??????????? time /= 1000; >>>> ??????????? long secs = time % 60; >>>> ??????????? time /= 60; >>>> ??????????? long mins = time % 60; >>>> ??????????? time /= 60; >>>> ??????????? long hours = time % 24; >>>> ??????????? return "[" + hours + ":" + mins + ":" + secs + "." + ms >>>> + "] " + message; >>>> ??????? } >>>> ??????? return message; >>>> ??? } >>>> >>>> Would be nice if we had something like that more generally >>>> available to all jtreg tests. >>>> >>>> thanks, >>>> >>>> Chris >>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java >>>>> ??? No comments. >>>>> >>>>> Thumbs up. No need for another webrev if you decide to update the >>>>> mesgs. >>>>> >>>>> I'm testing your patch on my MBP13 to verify that it solves the issue >>>>> that I reported. >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> On OSX (and possibly some linux systems), core files are not >>>>>> produced in the cwd, but instead end up in some well known >>>>>> location. For OSX it is the /cores directory. The core files tend >>>>>> to accumulate there. 
>>>>>> This fixes the core file accumulation
>>>>>> problem by moving the core file into the cwd, allowing jtreg to
>>>>>> manage it. By default jtreg will delete the core if the test
>>>>>> passes, and retain it if the test fails or RETAIN=all is specified.
>>>>>>
>>>>>> I got rid of the code in ClhsdbCDSCore.java that explicitly
>>>>>> deletes the core file because we don't want it deleted if
>>>>>> RETAIN=all is used.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>
>>>>
>>>
>

From jiefu at tencent.com Thu Aug 6 23:42:24 2020
From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=)
Date: Thu, 6 Aug 2020 23:42:24 +0000
Subject: RFR: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits
In-Reply-To: <483D9D17-F2C3-47D4-8578-0DAF1E353AF5@tencent.com>
References: <483D9D17-F2C3-47D4-8578-0DAF1E353AF5@tencent.com>
Message-ID:

FYI:
This bug leads to failures of the following tests on machines whose
hostname starts with digits.
 - test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java
 - test/jdk/sun/tools/jstatd/TestJstatdPort.java
 - test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java
 - test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java

So it's worth fixing.

Testing:
 - tier1-3 on Linux/x64

Thanks.
Best regards,
Jie

From: "jiefu(傅杰)"
Date: Wednesday, August 5, 2020 at 3:19 PM
To: "serviceability-dev at openjdk.java.net"
Subject: RFR: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits

Hi all,

JBS: https://bugs.openjdk.java.net/browse/JDK-8251155
Webrev: http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/

HostIdentifier fails to canonicalize hostname:port if the hostname
starts with digits.

The current implementation ends up with "scheme = hostname".
But a scheme must not start with a digit, which leads to this bug.

Thanks a lot.
Best regards,
Jie
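
The "scheme = hostname" behaviour described above can be illustrated with plain java.net.URI. This is only a sketch: it assumes the host string ends up being parsed as a URI, and it does not call HostIdentifier itself. The class name SchemeDemo and the helper schemeOf are made up for the illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SchemeDemo {
    /**
     * Returns the scheme that java.net.URI assigns to the input,
     * or null if the input has no scheme or cannot be parsed.
     * (Hypothetical helper, not part of HostIdentifier.)
     */
    static String schemeOf(String s) {
        try {
            return new URI(s).getScheme();
        } catch (URISyntaxException e) {
            // A URI scheme must begin with a letter, so "1host:1099" lands here.
            return null;
        }
    }

    public static void main(String[] args) {
        // "hostname:port" is read as "scheme:opaque-part".
        System.out.println("localhost:1099 -> " + schemeOf("localhost:1099"));
        // A leading digit is not a legal first character of a scheme.
        System.out.println("1host:1099     -> " + schemeOf("1host:1099"));
    }
}
```

With "localhost:1099" the text before the colon is accepted as a scheme, while "1host:1099" is rejected with a URISyntaxException, which matches the failure on hosts whose names start with digits.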

From david.holmes at oracle.com Fri Aug 7 00:58:41 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 7 Aug 2020 10:58:41 +1000
Subject: RFR (xs) 8251209 [TESTBUG] CDS jvmti tests should use "@modules java.instrument"
In-Reply-To: <2b9000ef-9a09-caf4-8610-bb354afb722b@oracle.com>
References: <2b9000ef-9a09-caf4-8610-bb354afb722b@oracle.com>
Message-ID: <138c579d-2ab9-5e35-9150-dfdd74e899eb@oracle.com>

Correction ...

On 7/08/2020 7:52 am, David Holmes wrote:
> Hi Ioi,
>
> On 7/08/2020 4:25 am, Ioi Lam wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8251209
>> http://cr.openjdk.java.net/~iklam/jdk16/8251209-cds-jvmti-tests-modules-tag.v01/
>>
>>
>> Summary -- changed the tests from (mis)using
>>
>>  * @requires vm.flavor != "minimal"
>>
>> to
>>
>>  * @modules java.instrument
>>
>> ... to be consistent with other jvmti tests.
>
> That seems like an invalid precondition to me. It would have been
> somewhat valid in the Compact Profiles world when we did not provide
> "java.instrument" in the profiles which supported MinimalVM, but you can
> define a minimal VM in a build that still has all modules available. I
> don't think building the minimal VM makes any changes to the supported
> modules.
>
> Also AIUI the @modules statement simply adds the necessary command-line
> args to use the java.instrument module (if present), it doesn't ensure
> that the listed module has to be present.

It does in fact ensure that:

"Otherwise, a test will not be run if the system being tested does not
contain all of the specified modules."

http://openjdk.java.net/jtreg/tag-spec.html

But as I said the module could be present in a JRE but you are still
using the MinimalVM.
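
The distinction drawn here (a module being present in the run-time image versus the VM being able to support it) can be seen from a small program. This is a sketch only, not how jtreg evaluates @modules; the class name ModuleCheck and its helper are invented for the illustration.

```java
import java.lang.module.ModuleFinder;

public class ModuleCheck {
    /**
     * Reports whether the named module is part of the run-time image.
     * (Hypothetical helper; it says nothing about VM features such as JVM TI.)
     */
    static boolean isModulePresent(String name) {
        return ModuleFinder.ofSystem().find(name).isPresent();
    }

    public static void main(String[] args) {
        // Presence of the module in the image ...
        System.out.println("java.instrument present: " + isModulePresent("java.instrument"));
        // ... is independent of which VM variant is running it.
        System.out.println("VM name: " + System.getProperty("java.vm.name"));
    }
}
```

The first line answers the question that @modules is concerned with; the second shows that the VM variant is separate information that has to be checked some other way.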
David ----- > David > >> Thanks >> - Ioi From ioi.lam at oracle.com Fri Aug 7 04:41:55 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 6 Aug 2020 21:41:55 -0700 Subject: RFR (xs) 8251209 [TESTBUG] CDS jvmti tests should use "@modules java.instrument" In-Reply-To: <138c579d-2ab9-5e35-9150-dfdd74e899eb@oracle.com> References: <2b9000ef-9a09-caf4-8610-bb354afb722b@oracle.com> <138c579d-2ab9-5e35-9150-dfdd74e899eb@oracle.com> Message-ID: On 8/6/20 5:58 PM, David Holmes wrote: > Correction ... > > On 7/08/2020 7:52 am, David Holmes wrote: >> Hi Ioi, >> >> On 7/08/2020 4:25 am, Ioi Lam wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8251209 >>> http://cr.openjdk.java.net/~iklam/jdk16/8251209-cds-jvmti-tests-modules-tag.v01/ >>> >>> >>> Summary -- changed the tests from (mis)using >>> >>> ??* @requires vm.flavor != "minimal" >>> >>> to >>> >>> ??* @modules java.instrument >>> >>> ... to be consistent with other jvmti tests. >> >> That seems like an invalid precondition to me. It would have been >> somewhat valid in the Compact Profiles world when we did not provide >> "java.instrument" in the profiles which supported MinimalVM, but you >> can define a minimal VM in a build that still has all modules >> available. I don't think building the minimal VM makes any changes to >> the supported modules. >> >> Also AIUI the @modules statement simply adds the necessary >> command-line args to use the java.instrument module (if present), it >> doesn't ensure that the listed module has to be present. > > It does in fact ensure that: > > "Otherwise, a test will not be run if the system being tested does not > contain all of the specified modules." > > http://openjdk.java.net/jtreg/tag-spec.html > > But as I said the module could be present in a JRE but you are still > using the MinimalVM. > Hi David, As I mentioned above, I am following the same rule as other jvmti tests, which only use "@modules java.instrument" and do not check whether the VM is minimal. 
E.g., http://hg.openjdk.java.net/jdk/jdk/file/4d36e29a5410/test/hotspot/jtreg/serviceability/jvmti/GetObjectSizeClass.java ------- If I understand correctly, you're saying someone can build a minimal JDK (configure --with-jvm-variants=minimal), and then try to add the java.instrument module to it. I.e., adding the following module to your JDK (with jlink, or by hand). $ unzip -l ./jmods/java.instrument.jmod ? Length????? Date??? Time??? Name ---------? ---------- -----?? ---- ????? 294? 2020-08-04 17:03?? classes/module-info.class ???? 1102? 2020-08-04 17:03 classes/sun/instrument/TransformerManager$TransformerInfo.class ???? 4294? 2020-08-04 17:03 classes/sun/instrument/TransformerManager.class ????? 911? 2020-08-04 17:03 classes/sun/instrument/InstrumentationImpl$1.class ??? 16663? 2020-08-04 17:03 classes/sun/instrument/InstrumentationImpl.class ???? 1356? 2020-08-04 17:03 classes/java/lang/instrument/ClassFileTransformer.class ????? 554? 2020-08-04 17:03 classes/java/lang/instrument/IllegalClassFormatException.class ???? 1734? 2020-08-04 17:03 classes/java/lang/instrument/Instrumentation.class ????? 563? 2020-08-04 17:03 classes/java/lang/instrument/UnmodifiableModuleException.class ????? 970? 2020-08-04 17:03 classes/java/lang/instrument/ClassDefinition.class ????? 551? 2020-08-04 17:03 classes/java/lang/instrument/UnmodifiableClassException.class ???? 3244? 2020-08-04 17:03?? legal/COPYRIGHT ?????? 44? 2020-08-04 17:03?? legal/LICENSE ??? 50920? 2020-08-04 17:03?? lib/libinstrument.so<<<<<<<<< But this module has a native library, libinstrument.so, which requires JVMTI to be present in libjvm.so. E.g.: ??? jvmtiEnv * ??? retransformableEnvironment(JPLISAgent * agent) { ??? .... ??????? jnierror = (*agent->mJVM)->GetEnv(? agent->mJVM, ??????? ? ?? ????????????????????? (void **) &retransformerEnv, ??????????? ? ?? ????????????????? 
                                            JVMTI_VERSION_1_1);

So if you try to run the CDS JVMTI test cases, they will be executed
(because your JDK says "I have java.instrument") and the tests find out
that your JDK's java.instrument module isn't working properly. So the
test is doing exactly what it's supposed to do.

I would argue that this is better than before (which would exclude the
test when the libjvm.so is a minimal build, and would not detect such a
mis-configured java.instrument module.)


Thanks
- Ioi

From Alan.Bateman at oracle.com Fri Aug 7 05:49:40 2020
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 7 Aug 2020 06:49:40 +0100
Subject: RFR (xs) 8251209 [TESTBUG] CDS jvmti tests should use "@modules java.instrument"
In-Reply-To: <138c579d-2ab9-5e35-9150-dfdd74e899eb@oracle.com>
References: <2b9000ef-9a09-caf4-8610-bb354afb722b@oracle.com>
 <138c579d-2ab9-5e35-9150-dfdd74e899eb@oracle.com>
Message-ID:

On 07/08/2020 01:58, David Holmes wrote:
>
> It does in fact ensure that:
>
> "Otherwise, a test will not be run if the system being tested does not
> contain all of the specified modules."
>
> http://openjdk.java.net/jtreg/tag-spec.html
>
> But as I said the module could be present in a JRE but you are still
> using the MinimalVM.

Right, tests with `@modules java.instrument` will not be selected if the
run-time under test doesn't contain this module.

It would be a bit strange to create a run-time image with a minimal VM
build that doesn't have JVM TI but includes java.instrument. All usages of
-javaagent would be fatal because the JPLIS agent uses JVM TI. A long
time ago there were calls for a way for java.management and
java.instrument to express that they required specific VM features but I
don't think it came to anything. So there is nothing in jlink to catch
this at link-time, at least for the case that there is only one libjvm in
the generated run-time image.
So technically I think the tests would need both @requires and @modules if someone really wanted to be able to run all tests with the minimal VM and expect jtreg to not select these tests. -Alan. From david.holmes at oracle.com Fri Aug 7 06:01:50 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 7 Aug 2020 16:01:50 +1000 Subject: RFR (xs) 8251209 [TESTBUG] CDS jvmti tests should use "@modules java.instrument" In-Reply-To: References: <2b9000ef-9a09-caf4-8610-bb354afb722b@oracle.com> <138c579d-2ab9-5e35-9150-dfdd74e899eb@oracle.com> Message-ID: On 7/08/2020 2:41 pm, Ioi Lam wrote: > On 8/6/20 5:58 PM, David Holmes wrote: >> Correction ... >> >> On 7/08/2020 7:52 am, David Holmes wrote: >>> Hi Ioi, >>> >>> On 7/08/2020 4:25 am, Ioi Lam wrote: >>>> https://bugs.openjdk.java.net/browse/JDK-8251209 >>>> http://cr.openjdk.java.net/~iklam/jdk16/8251209-cds-jvmti-tests-modules-tag.v01/ >>>> >>>> >>>> Summary -- changed the tests from (mis)using >>>> >>>> ??* @requires vm.flavor != "minimal" >>>> >>>> to >>>> >>>> ??* @modules java.instrument >>>> >>>> ... to be consistent with other jvmti tests. >>> >>> That seems like an invalid precondition to me. It would have been >>> somewhat valid in the Compact Profiles world when we did not provide >>> "java.instrument" in the profiles which supported MinimalVM, but you >>> can define a minimal VM in a build that still has all modules >>> available. I don't think building the minimal VM makes any changes to >>> the supported modules. >>> >>> Also AIUI the @modules statement simply adds the necessary >>> command-line args to use the java.instrument module (if present), it >>> doesn't ensure that the listed module has to be present. >> >> It does in fact ensure that: >> >> "Otherwise, a test will not be run if the system being tested does not >> contain all of the specified modules." 
>> >> http://openjdk.java.net/jtreg/tag-spec.html >> >> But as I said the module could be present in a JRE but you are still >> using the MinimalVM. >> > > Hi David, > > As I mentioned above, I am following the same rule as other jvmti tests, > which only use "@modules java.instrument" and do not check whether the > VM is minimal. E.g., > > http://hg.openjdk.java.net/jdk/jdk/file/4d36e29a5410/test/hotspot/jtreg/serviceability/jvmti/GetObjectSizeClass.java Sure but I contend those tests are wrong and the tests you are changing are right (or more right given common test configurations). > > ------- > > If I understand correctly, you're saying someone can build a minimal JDK > (configure --with-jvm-variants=minimal), and then try to add the > java.instrument module to it. I.e., adding the following module to your > JDK (with jlink, or by hand). Just build a JDK with multiple VMs present. > > $ unzip -l ./jmods/java.instrument.jmod > ? Length????? Date??? Time??? Name > ---------? ---------- -----?? ---- > ????? 294? 2020-08-04 17:03?? classes/module-info.class > ???? 1102? 2020-08-04 17:03 > classes/sun/instrument/TransformerManager$TransformerInfo.class > ???? 4294? 2020-08-04 17:03 > classes/sun/instrument/TransformerManager.class > ????? 911? 2020-08-04 17:03 > classes/sun/instrument/InstrumentationImpl$1.class > ??? 16663? 2020-08-04 17:03 > classes/sun/instrument/InstrumentationImpl.class > ???? 1356? 2020-08-04 17:03 > classes/java/lang/instrument/ClassFileTransformer.class > ????? 554? 2020-08-04 17:03 > classes/java/lang/instrument/IllegalClassFormatException.class > ???? 1734? 2020-08-04 17:03 > classes/java/lang/instrument/Instrumentation.class > ????? 563? 2020-08-04 17:03 > classes/java/lang/instrument/UnmodifiableModuleException.class > ????? 970? 2020-08-04 17:03 > classes/java/lang/instrument/ClassDefinition.class > ????? 551? 2020-08-04 17:03 > classes/java/lang/instrument/UnmodifiableClassException.class > ???? 3244? 2020-08-04 17:03?? 
legal/COPYRIGHT > ?????? 44? 2020-08-04 17:03?? legal/LICENSE > ??? 50920? 2020-08-04 17:03?? lib/libinstrument.so<<<<<<<<< > > But this module has a native library, libinstrument.so, which requires > JVMTI to be present in libjvm.so. E.g.: > > ??? jvmtiEnv * > ??? retransformableEnvironment(JPLISAgent * agent) { > ??? .... > ??????? jnierror = (*agent->mJVM)->GetEnv(? agent->mJVM, > ??????? ? ?? ????????????????????? (void **) &retransformerEnv, > ??????????? ? ?? ????????????????? JVMTI_VERSION_1_1); > > So if you try to run the CDS JVMTI test cases, it will be executed > (because your JDK says "I have java.instrument") and the test finds out > that your JDK's java.instrument module isn't working properly. So the > test is doing exactly what it's supposed to do. The whole point of the @requires is to not waste time and resources running a test on a platform that cannot run the test successfully. So the fully correct solution could be to have both settings: @requires vm.flavor != "minimal" @modules java.instrument if you require both a VM that supports JVM TI and you need a JRE that includes the java.instrument module. But that assumes your test does need java.instrument. Not all JVM TI tests need java.instrument, but all instrumentation tests depend on JVM TI. Just looking at the first three of tests in your webrev I don't see any dependency on java.instrument - they are CDS only tests as far as I can see and so require a VM with CDS which means not a Minimal VM - though perhaps it is sufficient to have the @requires vm.cds in those cases? For the other JVM TI related tests using -javaagent they probably need both @requires and @module. David ----- > I would argue that this is better than before (which would exclude the > test when the libjvm.so is a minimal build, and would will not detect > such a mis-configured java.instrument module.) 
> > > Thanks > - Ioi > > From ioi.lam at oracle.com Fri Aug 7 06:13:59 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 6 Aug 2020 23:13:59 -0700 Subject: RFR (xs) 8251209 [TESTBUG] CDS jvmti tests should use "@modules java.instrument" In-Reply-To: References: <2b9000ef-9a09-caf4-8610-bb354afb722b@oracle.com> <138c579d-2ab9-5e35-9150-dfdd74e899eb@oracle.com> Message-ID: <044d8bc3-d359-6ca6-2e38-91a158ec1203@oracle.com> On 8/6/20 11:01 PM, David Holmes wrote: > On 7/08/2020 2:41 pm, Ioi Lam wrote: >> On 8/6/20 5:58 PM, David Holmes wrote: >>> Correction ... >>> >>> On 7/08/2020 7:52 am, David Holmes wrote: >>>> Hi Ioi, >>>> >>>> On 7/08/2020 4:25 am, Ioi Lam wrote: >>>>> https://bugs.openjdk.java.net/browse/JDK-8251209 >>>>> http://cr.openjdk.java.net/~iklam/jdk16/8251209-cds-jvmti-tests-modules-tag.v01/ >>>>> >>>>> >>>>> Summary -- changed the tests from (mis)using >>>>> >>>>> ??* @requires vm.flavor != "minimal" >>>>> >>>>> to >>>>> >>>>> ??* @modules java.instrument >>>>> >>>>> ... to be consistent with other jvmti tests. >>>> >>>> That seems like an invalid precondition to me. It would have been >>>> somewhat valid in the Compact Profiles world when we did not >>>> provide "java.instrument" in the profiles which supported >>>> MinimalVM, but you can define a minimal VM in a build that still >>>> has all modules available. I don't think building the minimal VM >>>> makes any changes to the supported modules. >>>> >>>> Also AIUI the @modules statement simply adds the necessary >>>> command-line args to use the java.instrument module (if present), >>>> it doesn't ensure that the listed module has to be present. >>> >>> It does in fact ensure that: >>> >>> "Otherwise, a test will not be run if the system being tested does >>> not contain all of the specified modules." >>> >>> http://openjdk.java.net/jtreg/tag-spec.html >>> >>> But as I said the module could be present in a JRE but you are still >>> using the MinimalVM. 
>>> >> >> Hi David, >> >> As I mentioned above, I am following the same rule as other jvmti >> tests, which only use "@modules java.instrument" and do not check >> whether the VM is minimal. E.g., >> >> http://hg.openjdk.java.net/jdk/jdk/file/4d36e29a5410/test/hotspot/jtreg/serviceability/jvmti/GetObjectSizeClass.java > > > Sure but I contend those tests are wrong and the tests you are > changing are right (or more right given common test configurations). > >> >> ------- >> >> If I understand correctly, you're saying someone can build a minimal >> JDK (configure --with-jvm-variants=minimal), and then try to add the >> java.instrument module to it. I.e., adding the following module to >> your JDK (with jlink, or by hand). > > Just build a JDK with multiple VMs present. > >> >> $ unzip -l ./jmods/java.instrument.jmod >> ?? Length????? Date??? Time??? Name >> ---------? ---------- -----?? ---- >> ?????? 294? 2020-08-04 17:03?? classes/module-info.class >> ????? 1102? 2020-08-04 17:03 >> classes/sun/instrument/TransformerManager$TransformerInfo.class >> ????? 4294? 2020-08-04 17:03 >> classes/sun/instrument/TransformerManager.class >> ?????? 911? 2020-08-04 17:03 >> classes/sun/instrument/InstrumentationImpl$1.class >> ???? 16663? 2020-08-04 17:03 >> classes/sun/instrument/InstrumentationImpl.class >> ????? 1356? 2020-08-04 17:03 >> classes/java/lang/instrument/ClassFileTransformer.class >> ?????? 554? 2020-08-04 17:03 >> classes/java/lang/instrument/IllegalClassFormatException.class >> ????? 1734? 2020-08-04 17:03 >> classes/java/lang/instrument/Instrumentation.class >> ?????? 563? 2020-08-04 17:03 >> classes/java/lang/instrument/UnmodifiableModuleException.class >> ?????? 970? 2020-08-04 17:03 >> classes/java/lang/instrument/ClassDefinition.class >> ?????? 551? 2020-08-04 17:03 >> classes/java/lang/instrument/UnmodifiableClassException.class >> ????? 3244? 2020-08-04 17:03?? legal/COPYRIGHT >> ??????? 44? 2020-08-04 17:03?? legal/LICENSE >> ???? 50920? 
2020-08-04 17:03 lib/libinstrument.so<<<<<<<<< >> >> But this module has a native library, libinstrument.so, which >> requires JVMTI to be present in libjvm.so. E.g.: >> >> ???? jvmtiEnv * >> ???? retransformableEnvironment(JPLISAgent * agent) { >> ???? .... >> ???????? jnierror = (*agent->mJVM)->GetEnv( agent->mJVM, >> ???????? ? ?? ????????????????????? (void **) &retransformerEnv, >> ???????????? ? ?? ????????????????? JVMTI_VERSION_1_1); >> >> So if you try to run the CDS JVMTI test cases, it will be executed >> (because your JDK says "I have java.instrument") and the test finds >> out that your JDK's java.instrument module isn't working properly. So >> the test is doing exactly what it's supposed to do. > > The whole point of the @requires is to not waste time and resources > running a test on a platform that cannot run the test successfully. > > So the fully correct solution could be to have both settings: > > @requires vm.flavor != "minimal" > @modules java.instrument > > if you require both a VM that supports JVM TI and you need a JRE that > includes the java.instrument module. But that assumes your test does > need java.instrument. Not all JVM TI tests need java.instrument, but > all instrumentation tests depend on JVM TI. Just looking at the first > three of tests in your webrev I don't see any dependency on > java.instrument - they are CDS only tests as far as I can see and so > require a VM with CDS which means not a Minimal VM - though perhaps it > is sufficient to have the > > @requires vm.cds > > in those cases? > > For the other JVM TI related tests using -javaagent they probably need > both @requires and @module. You can disable JVMTI with "configure --disable-jvm-feature-jvmti". So checking for vm.flavor != "minimal" is not sufficient. Similarly, "@requires vm.cds" doesn't guarantee that JVMTI is supported. 
Maybe we should have a new VM prop so we can do @requires vm.jvmti @modules java.instrument Thanks - Ioi > > David > ----- > >> I would argue that this is better than before (which would exclude >> the test when the libjvm.so is a minimal build, and would will not >> detect such a mis-configured java.instrument module.) >> >> >> Thanks >> - Ioi >> >> From david.holmes at oracle.com Fri Aug 7 06:41:58 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 7 Aug 2020 16:41:58 +1000 Subject: RFR (xs) 8251209 [TESTBUG] CDS jvmti tests should use "@modules java.instrument" In-Reply-To: <044d8bc3-d359-6ca6-2e38-91a158ec1203@oracle.com> References: <2b9000ef-9a09-caf4-8610-bb354afb722b@oracle.com> <138c579d-2ab9-5e35-9150-dfdd74e899eb@oracle.com> <044d8bc3-d359-6ca6-2e38-91a158ec1203@oracle.com> Message-ID: <76bb64fd-bb2d-40ed-7624-b5fe8adce3c6@oracle.com> On 7/08/2020 4:13 pm, Ioi Lam wrote: > On 8/6/20 11:01 PM, David Holmes wrote: >> On 7/08/2020 2:41 pm, Ioi Lam wrote: >>> On 8/6/20 5:58 PM, David Holmes wrote: >>>> Correction ... >>>> >>>> On 7/08/2020 7:52 am, David Holmes wrote: >>>>> Hi Ioi, >>>>> >>>>> On 7/08/2020 4:25 am, Ioi Lam wrote: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8251209 >>>>>> http://cr.openjdk.java.net/~iklam/jdk16/8251209-cds-jvmti-tests-modules-tag.v01/ >>>>>> >>>>>> >>>>>> Summary -- changed the tests from (mis)using >>>>>> >>>>>> ??* @requires vm.flavor != "minimal" >>>>>> >>>>>> to >>>>>> >>>>>> ??* @modules java.instrument >>>>>> >>>>>> ... to be consistent with other jvmti tests. >>>>> >>>>> That seems like an invalid precondition to me. It would have been >>>>> somewhat valid in the Compact Profiles world when we did not >>>>> provide "java.instrument" in the profiles which supported >>>>> MinimalVM, but you can define a minimal VM in a build that still >>>>> has all modules available. I don't think building the minimal VM >>>>> makes any changes to the supported modules. 
>>>>> >>>>> Also AIUI the @modules statement simply adds the necessary >>>>> command-line args to use the java.instrument module (if present), >>>>> it doesn't ensure that the listed module has to be present. >>>> >>>> It does in fact ensure that: >>>> >>>> "Otherwise, a test will not be run if the system being tested does >>>> not contain all of the specified modules." >>>> >>>> http://openjdk.java.net/jtreg/tag-spec.html >>>> >>>> But as I said the module could be present in a JRE but you are still >>>> using the MinimalVM. >>>> >>> >>> Hi David, >>> >>> As I mentioned above, I am following the same rule as other jvmti >>> tests, which only use "@modules java.instrument" and do not check >>> whether the VM is minimal. E.g., >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/4d36e29a5410/test/hotspot/jtreg/serviceability/jvmti/GetObjectSizeClass.java >> >> >> >> Sure but I contend those tests are wrong and the tests you are >> changing are right (or more right given common test configurations). >> >>> >>> ------- >>> >>> If I understand correctly, you're saying someone can build a minimal >>> JDK (configure --with-jvm-variants=minimal), and then try to add the >>> java.instrument module to it. I.e., adding the following module to >>> your JDK (with jlink, or by hand). >> >> Just build a JDK with multiple VMs present. >> >>>
>>> $ unzip -l ./jmods/java.instrument.jmod >>> Length Date Time Name >>> --------- ---------- ----- ---- >>> 294 2020-08-04 17:03 classes/module-info.class >>> 1102 2020-08-04 17:03 >>> classes/sun/instrument/TransformerManager$TransformerInfo.class >>> 4294 2020-08-04 17:03 >>> classes/sun/instrument/TransformerManager.class >>> 911 2020-08-04 17:03 >>> classes/sun/instrument/InstrumentationImpl$1.class >>> 16663 2020-08-04 17:03 >>> classes/sun/instrument/InstrumentationImpl.class >>> 1356 2020-08-04 17:03 >>> classes/java/lang/instrument/ClassFileTransformer.class >>> 554 2020-08-04 17:03 >>> classes/java/lang/instrument/IllegalClassFormatException.class >>> 1734 2020-08-04 17:03 >>> classes/java/lang/instrument/Instrumentation.class >>> 563 2020-08-04 17:03 >>> classes/java/lang/instrument/UnmodifiableModuleException.class >>> 970 2020-08-04 17:03 >>> classes/java/lang/instrument/ClassDefinition.class >>> 551 2020-08-04 17:03 >>> classes/java/lang/instrument/UnmodifiableClassException.class >>> 3244 2020-08-04 17:03 legal/COPYRIGHT >>> 44 2020-08-04 17:03 legal/LICENSE >>> 50920 2020-08-04 17:03 lib/libinstrument.so <<<<<<<<< >>> >>> But this module has a native library, libinstrument.so, which >>> requires JVMTI to be present in libjvm.so. E.g.: >>> >>> jvmtiEnv * >>> retransformableEnvironment(JPLISAgent * agent) { >>> .... >>> jnierror = (*agent->mJVM)->GetEnv( agent->mJVM, >>> (void **) &retransformerEnv, >>> JVMTI_VERSION_1_1); >>> >>> So if you try to run the CDS JVMTI test cases, it will be executed >>> (because your JDK says "I have java.instrument") and the test finds >>> out that your JDK's java.instrument module isn't working properly. So >>> the test is doing exactly what it's supposed to do. >> >> The whole point of the @requires is to not waste time and resources >> running a test on a platform that cannot run the test successfully. >> >> So the fully correct solution could be to have both settings: >> >> @requires vm.flavor != "minimal" >> @modules java.instrument >> >> if you require both a VM that supports JVM TI and you need a JRE that >> includes the java.instrument module. But that assumes your test does >> need java.instrument. Not all JVM TI tests need java.instrument, but >> all instrumentation tests depend on JVM TI.
Just looking at the first >> three tests in your webrev I don't see any dependency on >> java.instrument - they are CDS only tests as far as I can see and so >> require a VM with CDS which means not a Minimal VM - though perhaps it >> is sufficient to have the >> >> @requires vm.cds >> >> in those cases? >> >> For the other JVM TI related tests using -javaagent they probably need >> both @requires and @modules. > > You can disable JVMTI with "configure --disable-jvm-feature-jvmti". So > checking for vm.flavor != "minimal" is not sufficient. No it isn't, but we don't try to accommodate every possible VM feature and module that someone could create a JRE with. Otherwise we will need an @requires capability for every selectable feature and we would need to update every test to explicitly list every dependency. But we do build a minimal VM and people can test the minimal VM, and if you build the minimal VM as part of a multi-VM JRE then you do have java.instrument present. So only testing for the module, instead of testing for a non-minimal VM, is not sufficient, and changing tests to do that is a step in the wrong direction IMO. Cheers, David > Similarly, "@requires vm.cds" doesn't guarantee that JVMTI is supported. > > Maybe we should have a new VM prop so we can do > > @requires vm.jvmti > @modules java.instrument > > Thanks > - Ioi > > > >> >> David >> ----- >> >>> I would argue that this is better than before (which would exclude >>> the test when the libjvm.so is a minimal build, and would not >>> detect such a mis-configured java.instrument module.)
>>> >>> >>> Thanks >>> - Ioi >>> >>> > From serguei.spitsyn at oracle.com Fri Aug 7 06:49:33 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 6 Aug 2020 23:49:33 -0700 Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) In-Reply-To: <1C77C7B9-4541-41E3-B16D-FB1B243D8087@amazon.com> References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <1C77C7B9-4541-41E3-B16D-FB1B243D8087@amazon.com> Message-ID: Hi Paul, Yes, I'm reviewing it. Sorry for the late reply. Thanks, Serguei On 8/6/20 06:49, Hohensee, Paul wrote: > And a submit repo run succeeds. > > Serguei, would you be willing to review? > > Thanks, > Paul > > On 8/5/20, 7:00 PM, "linzang(??)" wrote: > > Thanks Paul! > And I have verified that this change builds successfully on Windows. > > BRs, > Lin > > On 2020/8/6, 4:17 AM, "Hohensee, Paul" wrote: > > Two tiny nits that don't need a new webrev: > > In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info(). > > In heapInspection.hpp, you can delete two of the three blank lines before #endif // SHARE_MEMORY_HEAPINSPECTION_HPP > > Thanks, > Paul > > On 8/5/20, 6:46 AM, "linzang(??)" wrote: > > Hi Paul, Stefan and Serguei, > Here I uploaded a new changeset, would you like to help review again? > Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/ > Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/ > > P.S. I am in the process of building it in a Windows environment as a double check. I may update the result later. Thanks! > > > BRs, > Lin > > On 2020/8/5, 8:56 PM, "Hohensee, Paul" wrote: > > uintx is fine with me.
> > Thanks, > Paul > > On 8/5/20, 1:14 AM, "linzang(??)" wrote: > > Hi Stefan, > I got your point, thanks for the explanation. > I missed the atomic part when considering it. > > Hi Paul, > Do you think it is ok to use uintx? I checked it is actually a uintptr_t type. > > > BRs, > Lin > > On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote: > > On 2020-08-05 07:22, linzang(??) wrote: > > Hi Serguei, > > > > No problem, Thanks for your reviewing :) > > > > I will upload a new webrev later, so may I ask your help to review it > > again? > > > > Hi Stefan, > > > > As Paul mentioned, the _missed_count is not a size, so size_t may > > not be clear, what's your opinion about uint64_t? > > We typically don't restrict the usage of size_t to only *sizes* in the > HotSpot. If you search the code you'll find many count variables using > size_t, so I personally don't see the need to change the type. > > However, if you really do want to change it then maybe using another > type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using > uint64_t and some of the Atomics operations are problematic on some > 32-bit platforms, so using a type that matches the word size of the > targeted machine saves you from having to think about that. > > > > > It seems the uint overflow may happen on a 64-bit machine with a large > > heap, e.g. there may be more than 4 Gi objects (8-byte header + 8-byte > > klassptr + 8-byte field) in a heap that is larger than 96 GB, uint64_t > > is ok in this case. > > Exactly. > > Thanks, > StefanK > > > > > BRs, > > > > Lin > > > > *From: *"serguei.spitsyn at oracle.com" > > *Date: *Wednesday, August 5, 2020 at 1:02 PM > > *To: *"linzang(??)" , "Hohensee, Paul" > > , Stefan Karlsson , > > David Holmes , serviceability-dev > > , "hotspot-gc-dev at openjdk.java.net" > > > > *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for > > jmap histo(G1)(Internet mail) > > > > Oh, sorry for the confusion, please, skip my question.
:) > > C++ does not have the '&&=' operator. > > > > Thanks, > > Serguei > > > > On 8/4/20 21:56, serguei.spitsyn at oracle.com > > wrote: > > > > Hi Lin, > > > > https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html > > > > +class KlassInfoTableMergeClosure : public KlassInfoClosure { > > > > +private: > > > > + KlassInfoTable* _dest; > > > > + bool _success; > > > > +public: > > > > + KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), > > _success(true) {} > > > > + void do_cinfo(KlassInfoEntry* cie) { > > > > + _success &= _dest->merge_entry(cie); > > > > + } > > > > The operator '&=' above looks strange. > > Did you actually want to use the operator '&&=' instead? : > > > > + _success &&= _dest->merge_entry(cie); > > > > > > Thanks, > > Serguei > > > > > > > > > > On 8/3/20 07:51, linzang(??) wrote: > > > > Dear Stefan, > > > > May I ask your help to review again? I have made a delta based on the last changeset you reviewed (webrev04); I hope it eases your reviewing work. > > > > webrev: https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > > > delta (vs webrev04): https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > > > > bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > > > > CSR (approved): https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > > > > > BRs, > > > > Lin > > > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version.
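The `&=` question above can be illustrated with a minimal standalone sketch (not the HotSpot code). C++ has no `&&=` operator, and `flag &= f()` on a `bool` always evaluates `f()`, so every entry is still visited after the first failure, whereas `flag = flag && f()` short-circuits. `Table`, `merge_all_bitand` and `merge_all_short_circuit` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for KlassInfoTable::merge_entry(): counts how often
// it is called and reports failure for negative inputs.
struct Table {
  int calls = 0;
  bool merge_entry(int value) {
    ++calls;
    return value >= 0;
  }
};

// Accumulate with '&=': merge_entry() runs for every element, and the result
// stays false once any call has failed.
bool merge_all_bitand(Table& dest, const std::vector<int>& entries) {
  bool success = true;
  for (int e : entries) {
    success &= dest.merge_entry(e);
  }
  return success;
}

// Accumulate with '&&': merge_entry() is skipped after the first failure.
bool merge_all_short_circuit(Table& dest, const std::vector<int>& entries) {
  bool success = true;
  for (int e : entries) {
    success = success && dest.merge_entry(e);
  }
  return success;
}
```

Both variants report the same overall result; they differ only in whether the remaining entries are still processed after a failure.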
> > > > Thanks, > > > > Paul > > > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > > > Upload a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > > > It fixes a Windows build failure: > > > > #################################### > > > > In heapInspection.cpp > > > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > > > #################################### > > > > In heapInspection.hpp > > > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > > > #################################### > > > > BRs, > > > > Lin > > > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > > > I uploaded a new change at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > > > > It includes a tiny fix for a build failure on Windows: > > > > #################################### > > > > In attachListener.cpp: > > > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > > > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > > > > #################################### > > > > BRs, > > > > Lin > > > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > > > Hi Paul, > > > > Thanks for your help, that all looks good to me. > > > > Just 2 minor changes: > > > > - delete the final return in ParHeapInspectTask::work, you mentioned it but it seems not to be included in the webrev :-) > > > > - delete an unnecessary blank line in heapInspection.cpp at merge_entry()
> > > > ######################################################################### > > > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > > > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > > > > @@ -251,7 +251,6 @@ > > > > _size_of_instances_in_words += cie->words(); > > > > return true; > > > > } > > > > - > > > > return false; > > > > } > > > > @@ -568,7 +567,6 @@ > > > > Atomic::add(&_missed_count, missed_count); > > > > } else { > > > > Atomic::store(&_success, false); > > > > - return; > > > > } > > > > } > > > > ######################################################################### > > > > Here is the webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > > > BRs, > > > > Lin > > > > --------------------------------------------- > > > > From: "Hohensee, Paul" > > > > Date: Thursday, July 23, 2020 at 6:48 AM > > > > To: "linzang(??)" , Stefan Karlsson ,"serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev ,"hotspot-gc-dev at openjdk.java.net" > > > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > > > Just small things. > > > > heapInspection.cpp: > > > > In ParHeapInspectTask::work, remove the final return statement and fix the following '}' indent. I.e., replace > > > > + Atomic::store(&_success, false); > > > > + return; > > > > + } > > > > with > > > > + Atomic::store(&_success, false); > > > > + } > > > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > > > attachListener.cpp: > > > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint.
> > > > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. > > > > BasicJMapTest.java: > > > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > > > Webrev with the above changes in > > > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > > > Thanks, > > > > Paul > > > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > > > Upload a new webrev at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > > > It fixes a potential issue where an unexpected number of threads may be calculated for the "parallel" option of jmap -histo in a container. > > > > As shown at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > > > ############### attachListener.cpp #################### > > > > @@ -252,11 +252,11 @@ > > > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > > > outputStream* os = out; // if path not specified or path is NULL, use out > > > > fileStream* fs = NULL; > > > > const char* arg0 = op->arg(0); > > > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > > > return JNI_ERR; > > > > } > > > > ################################################### > > > > Thanks.
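A small standalone sketch of the default computed in the diff above, assuming the processor count is simply passed in as an `int`. `default_parallel_threads` is a hypothetical helper, and `std::max` stands in for HotSpot's `MAX`/`MAX2` macros; the unsigned cast mirrors the `(uint)` cast that appears in the later revisions of the patch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper mirroring the expression in the patch:
//   MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8)
// The processor count is a parameter here instead of an os:: query.
unsigned default_parallel_threads(int active_processors) {
  assert(active_processors >= 0);
  // Cast first so both std::max arguments are unsigned, then use integer
  // arithmetic: three eighths of the processors, rounded down.
  unsigned n = static_cast<unsigned>(active_processors) * 3 / 8;
  // Never go below one worker thread.
  return std::max(1u, n);
}
```

Under these assumptions 8 processors yield 3 worker threads, while 1 or 2 processors compute 0 and are clamped to 1.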
> > > > BRs, > > > > Lin > > > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > > > Hi Paul, > > > > Thanks for reviewing! > > > > >> > > > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > > > >> > > > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > > > And here are the lastest webrev and delta: > > > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > > > Cheers, > > > > Lin > > > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > > > I'd like to see this feature added. 
:) > > > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > > > heapInspection.hpp: > > > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > > > Comment copy-edit: > > > > +// Parallel heap inspection task. Parallel inspection can fail due to > > > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > > > +// _success will be set false on an OOM, and serial inspection tried. > > > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > > > heapInspection.cpp: > > > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". 
And, because it's two lines shorter, I'd replace > > > > + } else { > > > > + return false; > > > > + } > > > > with > > > > + return false; > > > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. > > > > I'd reorganize the code in populate_table() to make it more clear, vis (I changed _shared_missed_count to _missed_count) > > > > + if (cit.allocation_failed()) { > > > > + // fail to allocate memory, stop parallel mode > > > > + Atomic::store(&_success, false); > > > > + return; > > > > + } > > > > + RecordInstanceClosure ric(&cit, _filter); > > > > + _poi->object_iterate(&ric, worker_id); > > > > + missed_count = ric.missed_count(); > > > > + { > > > > + MutexLocker x(&_mutex); > > > > + merge_success = _shared_cit->merge(&cit); > > > > + } > > > > + if (merge_success) { > > > > + Atomic::add(&_missed_count, missed_count); > > > > + else { > > > > + Atomic::store(&_success, false); > > > > + } > > > > Thanks, > > > > Paul > > > > On 6/29/20, 7:20 PM, "linzang(??)" wrote: > > > > Dear All, > > > > Sorry to bother again, I just want to make sure that is this change worth to be continue to work on? If decision is made to not. I think I can drop this work and stop asking for help reviewing... > > > > Thanks for all your help about reviewing this previously. > > > > BRs, > > > > Lin > > > > On 2020/5/9, 3:47 PM, "linzang(??)" wrote: > > > > Dear All, > > > > May I ask your help again for review the latest change? Thanks! > > > > BRs, > > > > Lin > > > > On 2020/4/28, 1:54 PM, "linzang(??)" wrote: > > > > Hi Stefan, > > > > >> - Adding Atomic::load/store. > > > > >> - Removing the time measurement in the run_task. I renamed G1's function > > > > >> to run_task_timed. If we need this outside of G1, we can rethink the API > > > > >> at that point. 
> > > > >> - ZGC style cleanups > > > > Thanks for revising the patch, they are all good to me, and I have made a tiny change based on it: > > > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/ > > > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/ > > > > it reduce the scope of mutex in ParHeapInspectTask, and delete unnecessary comments. > > > > BRs, > > > > Lin > > > > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote: > > > > Hi Lin, > > > > On 2020-04-26 05:10, linzang(??) wrote: > > > > > Hi Stefan and Paul? > > > > > I have made a new patch based on your comments and Stefan's Poc code: > > > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > > > Thanks for providing a delta patch. It makes it much easier to look at, > > > > and more likely for reviewers to continue reviewing. > > > > I'm going to continue focusing on the GC parts, and leave the rest to > > > > others to review. > > > > > > > > > > And Here are main changed I made and want to discuss with you: > > > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? 
IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > > > In these situations you should be using the Atomic::load/store > > > > primitives. We're moving toward a later C++ standard were data races are > > > > considered undefined behavior. > > > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. > > > > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > > > I don't have a strong opinion about this. > > > > And also please help take a look at the zHeap, as there is a class > > > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > > > only accept AbstraceGangTask* as argument, so I made a delegate class > > > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > > > I've created a few cleanups and changes on top of your latest patch: > > > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > > > - Adding Atomic::load/store. > > > > - Removing the time measurement in the run_task. I renamed G1's function > > > > to run_task_timed. If we need this outside of G1, we can rethink the API > > > > at that point. 
> > > > - ZGC style cleanups > > > > Thanks, > > > > StefanK > > > > > > > > > > BRs, > > > > > Lin > > > > > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > > > > > BRs, > > > > > Lin > > > > > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > > > > > Thanks, > > > > > Paul > > > > > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" > linzang at tencent.com> wrote: > > > > > > > > > > Dear Stefan, > > > > > > > > > > Thanks a lot! I agree with you to decouple the heap inspection code with GC's. > > > > > I will start from your POC code, may discuss with you later. > > > > > > > > > > > > > > > BRs, > > > > > Lin > > > > > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > > > > > Hi Lin, > > > > > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > > > > > I started experimenting with doing that, but other higher-priority (to > > > > > me) tasks have had to take precedence. 
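The review points above (a per-worker table, a lock held only around the merge, and an atomically written failure flag) can be sketched in standard C++ as follows. This is an illustration under stated assumptions, not the HotSpot code: `Histogram`, `ParInspectTask` and the `simulate_alloc_failure` parameter are hypothetical, and `std::atomic` and `std::mutex` stand in for HotSpot's `Atomic::load/store` and `MutexLocker`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for KlassInfoTable: class name -> instance count.
using Histogram = std::map<std::string, std::size_t>;

struct ParInspectTask {
  Histogram shared;                 // merged result, guarded by 'mutex'
  std::mutex mutex;
  std::atomic<bool> success{true};  // only ever flipped from true to false

  // Body that each worker would run on its own slice of objects.
  void work(const std::vector<std::string>& slice, bool simulate_alloc_failure) {
    if (simulate_alloc_failure) {
      // Models a failed allocation of the worker-local table: record the
      // failure with an atomic store and give up on this slice.
      success.store(false);
      return;
    }
    Histogram local;  // worker-local table, filled without any locking
    for (const std::string& name : slice) {
      ++local[name];
    }
    // Hold the lock only while merging into the shared table.
    std::lock_guard<std::mutex> guard(mutex);
    for (const auto& kv : local) {
      shared[kv.first] += kv.second;
    }
  }
};
```

In real use each `work` call would run on its own thread, and the caller would read `success.load()` after all workers finish to decide whether to fall back to a serial pass.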
> > > > > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > > > ParallelObjectIterators. There's also code left unimplemented in around > > > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > > > pull out the heap inspection code out of the GCs. > > > > > > > > > > Thanks, > > > > > StefanK > > > > > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > > > Dear all, > > > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > > > Thanks! > > > > > > > > > > > > BRs, > > > > > > Lin > > > > > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > ?? > >> BRs, > > > > > >> Lin > > > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > > > >> > > > > > > >> > Dear all, > > > > > >> > Let me try to ease the reviewing work by some explanation :P > > > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. 
I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > > > >> > This patch actually do several things: > > > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > > > >> > 2. Make a change in how Jmap passing arguments, changes inhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed athttps://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > > > > >> > 3. 
Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > > > > > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > > > > > >> > 5. Add related test. > > > > > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > > > > > >> > > > > > > >> > Hope these info could help on code review and initate the discussion :-) > > > > > >> > Thanks! > > > > > >> > > > > > > >> > BRs, > > > > > >> > Lin > > > > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > > > > > >> > > > > > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > > > > > >> > > please ignore the previous wrong post. sorry for troubles. > > > > > >> > > > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > > >> > > -------------- > > > > > >> > > Lin > > > > > >> > > >Hi Lin, > > > > > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > > > >> > > >the message subject? > > > > > >> > > >It will be more trackable this way. > > > > > >> > > > > > > > > >> > > >Thanks, > > > > > >> > > >Serguei > > > > > >> > > > > > > > > >> > > > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) 
wrote: > > > > > >> > > >> Dear David, > > > > > >> > > >> Thanks a lot! > > > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > > > >> > > >> > > > > > >> > > >> Thanks, > > > > > >> > > >> -------------- > > > > > >> > > >> Lin > > > > > >> > > >>> Hi Lin, > > > > > >> > > >>> > > > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. > > > > > >> > > >>> > > > > > >> > > >>> I happened to spot one nit when browsing: > > > > > >> > > >>> > > > > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > > > > >> > > >>> > > > > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > > > > >> > > >>> + BoolObjectClosure* filter, > > > > > >> > > >>> + size_t* missed_count, > > > > > >> > > >>> + size_t thread_num) { > > > > > >> > > >>> + return NULL; > > > > > >> > > >>> > > > > > >> > > >>> s/NULL/false/ > > > > > >> > > >>> > > > > > >> > > >>> Cheers, > > > > > >> > > >>> David > > > > > > > > > >>> > > > > > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) 
wrote: > > > > > >> > > >>>> Dear All, > > > > > >> > > >>>> May I ask your help to review the follow changes: > > > > > >> > > >>>> webrev: > > > > > >> > > >>>>http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > > > > >> > > >>>> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > > > >> > > >>>> related CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > > > > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > > > > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. > > > > > >> > > >>>> > > > > > >> > > >>>> ------------------------------------------------------------------------ > > > > > >> > > >>>> BRs, > > > > > >> > > >>>> Lin > > > > > >> > > >> > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From serguei.spitsyn at oracle.com Fri Aug 7 07:24:45 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 7 Aug 2020 00:24:45 -0700 Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) In-Reply-To: <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> Message-ID: <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> An HTML attachment was scrubbed... 
URL: 

From serguei.spitsyn at oracle.com Fri Aug 7 07:29:47 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 7 Aug 2020 00:29:47 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com>
Message-ID: <72bb58cb-add7-b25b-9e1c-879886993857@oracle.com>

An HTML attachment was scrubbed...
URL: 

From linzang at tencent.com Fri Aug 7 10:41:06 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Fri, 7 Aug 2020 10:41:06 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com>
Message-ID: <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com>

Dear Serguei,

Thanks a lot for your review!

>> The spec says nothing if the new option 'parallel' is mandatory or optional.
>> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.

For "parallel", the spec says that "parallel=0" is the default behavior, so my assumption is that if parallel is not specified, it will be 0. Do you think that is OK? Is it necessary to state it explicitly, with a comment like "if parallel is not set, the default value 0 is used"?

For "live" and "all": before this changeset, the logic in the code was that both could be set at the same time, and "live" took effect. IMHO this is a little confusing, so I made the change. I am not sure whether I should keep the previous behavior in this change?
And I like your idea of printing a more specific error message when something is wrong with the option settings. However, I checked that before this change, if an option does not match, only the usage is printed. Not only jmap -histo but also jmap -dump has this issue. Do you agree that I fix both in this changeset?

>> What is going to happen if null is passed in place of parallel here? :

The default value 0 will be used if no "parallel" option is set.

>> Should the lines 193-195 be moved after the line 202?

I don't think so; the logic is a little different. At line 193, the case is "parallel=". If they were moved to line 203, it would mean "parallel" is not optional.

Thanks!

BRs,
Lin

From: "serguei.spitsyn at oracle.com"
Date: Friday, August 7, 2020 at 3:28 PM
To: "linzang(??)" , "Hohensee, Paul" , Stefan Karlsson , David Holmes , serviceability-dev , "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

I am not sure I fully understand the spec update and the options processing in the file:
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html

The spec says nothing if the new option 'parallel' is mandatory or optional.
Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.

The JMap.java implementation just prints usage in two cases:

 191             } else if (subopt.startsWith("parallel=")) {
 192                parallel = subopt.substring("parallel=".length());
 193                if (parallel == null) {
 194                     usage(1);
 195                }
  ...
 200         if (set_live && set_all) {
 201             usage(1);
 202         }

That is not very helpful, as the usage does not explain anything about these corner cases.
Also, it allows passing no parallel option at all.
What is going to happen if null is passed in place of parallel here? :

 206         executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);

Should the lines 193-195 be moved after the line 202?
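[Editorial sketch] The option rules discussed above (an optional "parallel" that defaults to 0, "live" and "all" being mutually exclusive, and a specific error message instead of a bare usage dump) can be written down independently of JMap.java. The following is a minimal C++ sketch under stated assumptions: `HistoOptions`, `parse_histo_suboptions` and the error texts are hypothetical names invented for illustration, not code from the webrev. As a side note, Java's `String.substring` returns an empty string rather than null for a bare "parallel=", so an emptiness check is probably the closer fit for the case at line 193.

```cpp
#include <sstream>
#include <string>

// Hypothetical result type; none of these names come from JMap.java.
struct HistoOptions {
    bool live = false;
    bool all = false;
    std::string file;       // empty means "write to stdout"
    unsigned parallel = 0;  // 0 means "let the VM choose the thread count"
    std::string error;      // non-empty when the sub-options are invalid
};

static bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Parses a comma-separated sub-option list such as "live,parallel=4".
HistoOptions parse_histo_suboptions(const std::string& subopts) {
    HistoOptions opts;
    std::istringstream in(subopts);
    std::string subopt;
    while (std::getline(in, subopt, ',')) {
        if (subopt.empty()) {
            continue;
        } else if (subopt == "live") {
            opts.live = true;
        } else if (subopt == "all") {
            opts.all = true;
        } else if (starts_with(subopt, "file=")) {
            opts.file = subopt.substr(5);
            if (opts.file.empty()) {
                opts.error = "file= requires a file name";
                return opts;
            }
        } else if (starts_with(subopt, "parallel=")) {
            const std::string value = subopt.substr(9);
            // Accept only 1 to 9 decimal digits so the conversion cannot overflow.
            if (value.empty() || value.size() > 9 ||
                value.find_first_not_of("0123456789") != std::string::npos) {
                opts.error = "parallel= requires a non-negative number";
                return opts;
            }
            opts.parallel = static_cast<unsigned>(std::stoul(value));
        } else {
            opts.error = "unknown sub-option: " + subopt;
            return opts;
        }
    }
    // Checked after the loop so the order of the sub-options does not matter.
    if (opts.live && opts.all) {
        opts.error = "live and all are mutually exclusive";
    }
    return opts;
}
```

With this shape, omitting "parallel" is simply not an error: the field keeps its default of 0, which matches the answer given in the reply above.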
Thanks,
Serguei

On 8/5/20 18:59, linzang(??) wrote:

Thanks Paul!
I have also verified that this change builds successfully on Windows.

BRs,
Lin

On 2020/8/6, 4:17 AM, "Hohensee, Paul" wrote:

Two tiny nits that don't need a new webrev:

In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().

In heapInspection.hpp, you can delete two of the three blank lines before
#endif // SHARE_MEMORY_HEAPINSPECTION_HPP

Thanks,
Paul

On 8/5/20, 6:46 AM, "linzang(??)" wrote:

Hi Paul, Stefan and Serguei,

I have uploaded a new changeset. Would you please help review it again?

Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/

P.S. I am in the process of building it in a Windows environment as a double check, and may post the result later.

Thanks!

BRs,
Lin

On 2020/8/5, 8:56 PM, "Hohensee, Paul" wrote:

uintx is fine with me.

Thanks,
Paul

On 8/5/20, 1:14 AM, "linzang(??)" wrote:

Hi Stefan,

I got your point, thanks for the explanation. I missed the atomic part when considering it.

Hi Paul,

Do you think it is OK to use uintx? I checked, and it is actually a uintptr_t type.

BRs,
Lin

On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote:

On 2020-08-05 07:22, linzang(??) wrote:
> Hi Serguei,
>
> No problem, thanks for reviewing :)
>
> I will upload a new webrev later, so may I ask for your help to review it
> again?
>
> Hi Stefan,
>
> As Paul mentioned, _missed_count is not a size, so size_t may
> not be clear. What's your opinion about uint64_t?

We typically don't restrict the usage of size_t to only *sizes* in HotSpot. If you search the code you'll find many count variables using size_t, so I personally don't see the need to change the type. However, if you really do want to change it, then maybe use another type that is 32 bits on 32-bit machines, maybe uintx?
IIRC, using uint64_t with some of the Atomic operations is problematic on some 32-bit platforms, so using a type that matches the word size of the targeted machine saves you from having to think about that.

> It seems uint overflow may happen on a 64-bit machine with a large
> heap, e.g. there may be more than 4 giga objects (8-byte header + 8-byte
> klassptr + 8-byte field) in a heap that is larger than 96 GB. uint64_t
> is OK in this case.

Exactly.

Thanks,
StefanK

>
> BRs,
>
> Lin
>
> *From: *"serguei.spitsyn at oracle.com"
> *Date: *Wednesday, August 5, 2020 at 1:02 PM
> *To: *"linzang(??)" , "Hohensee, Paul"
> , Stefan Karlsson ,
> David Holmes , serviceability-dev
> , "hotspot-gc-dev at openjdk.java.net"
>
> *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for
> jmap histo(G1)(Internet mail)
>
> Oh, sorry for the confusion, please, skip my question. :)
> C++ does not have the '&&=' operator.
>
> Thanks,
> Serguei
>
> On 8/4/20 21:56, serguei.spitsyn at oracle.com
> wrote:
>
> Hi Lin,
>
> https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html
>
> +class KlassInfoTableMergeClosure : public KlassInfoClosure {
> +private:
> +  KlassInfoTable* _dest;
> +  bool _success;
> +public:
> +  KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), _success(true) {}
> +  void do_cinfo(KlassInfoEntry* cie) {
> +    _success &= _dest->merge_entry(cie);
> +  }
>
> The operator '&=' above looks strange.
> Did you actually want to use the operator '&&=' instead? :
>
> +    _success &&= _dest->merge_entry(cie);
>
> Thanks,
> Serguei
>
> On 8/3/20 07:51, linzang(??) wrote:
>
> Dear Stefan,
>
> May I ask your help to review again? I have made a delta based on the last changeset you reviewed (webrev04); I hope it eases your reviewing work.
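[Editorial sketch] On the `&=` question quoted above: applied to a `bool`, `&=` accumulates failure across calls without short-circuiting, whereas the spelled-out `success = success && f()` stops calling `f()` after the first failure. The following is a minimal standalone C++ sketch of that difference. `Merger` and its `merge_entry` are hypothetical stand-ins invented for illustration and are not the HotSpot `KlassInfoTable` API.

```cpp
#include <vector>

// Hypothetical stand-in for a table whose merge step can fail.
struct Merger {
    int calls = 0;
    // Pretend that merging a negative entry fails.
    bool merge_entry(int entry) {
        ++calls;
        return entry >= 0;
    }
};

// '&=' form: every entry is visited, even after an earlier failure.
bool merge_all_non_short_circuit(Merger& m, const std::vector<int>& entries) {
    bool success = true;
    for (int e : entries) {
        success &= m.merge_entry(e);  // right-hand side is always evaluated
    }
    return success;
}

// '&&' form: merge_entry() is no longer called once success is false.
bool merge_all_short_circuit(Merger& m, const std::vector<int>& entries) {
    bool success = true;
    for (int e : entries) {
        success = success && m.merge_entry(e);
    }
    return success;
}
```

Both functions return the same result; they differ only in how many times `merge_entry` runs after a failure, which matters if the call has side effects that should or should not continue.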
> > webrev:https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > delta (vs webrev04):https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > > bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > CSR(approved):https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > BRs, > > Lin > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. > > Thanks, > > Paul > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > Upload a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > It fix an issue of windows fail : > > #################################### > > In heapInspect.cpp > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > #################################### > > In heapInspect.hpp > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > #################################### > > BRs, > > Lin > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > I update a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > > It includes a tiny fix of build failure on windows: > > #################################### > > In attachListener.cpp: > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > > #################################### > > BRs, > > Lin > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > Hi 
Paul, > > Thanks for your help, that all looks good to me. > > Just 2 minor changes: > > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > > ? delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > > @@ -251,7 +251,6 @@ > > _size_of_instances_in_words += cie->words(); > > return true; > > } > > - > > return false; > > } > > @@ -568,7 +567,6 @@ > > Atomic::add(&_missed_count, missed_count); > > } else { > > Atomic::store(&_success, false); > > - return; > > } > > } > > ######################################################################### > > Here is the webrevhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > > Lin > > --------------------------------------------- > > From: "Hohensee, Paul" > > Date: Thursday, July 23, 2020 at 6:48 AM > > To: "linzang(??)" , Stefan Karlsson ,"serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev ,"hotspot-gc-dev at openjdk.java.net" > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > > + return; > > + } > > with > > + Atomic::store(&_success, false); > > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. 
> > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. > > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > > As shown athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > > @@ -252,11 +252,11 @@ > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > outputStream* os = out; // if path not specified or path is NULL, use out > > fileStream* fs = NULL; > > const char* arg0 = op->arg(0); > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > return JNI_ERR; > > } > > ################################################### > > Thanks. > > BRs, > > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for reviewing! > > >> > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > >> > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > > +// Parallel heap inspection task. Parallel inspection can fail due to > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > > + } else { > > + return false; > > + } > > with > > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
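[Editorial sketch] The scheme described in the comment copy-edit above (each worker fills a thread-local table, a shared flag is cleared if a worker cannot allocate, and a serial pass is tried when the parallel attempt fails) can be illustrated with standard C++ threads. This is a rough sketch under stated assumptions, not the HotSpot implementation: `Histogram`, `parallel_histogram`, `serial_histogram` and `histogram` are hypothetical names, and `std::thread`, `std::mutex` and `std::atomic` stand in for HotSpot's worker gang, `MutexLocker` and `Atomic::load/store`.

```cpp
#include <atomic>
#include <cstddef>
#include <map>
#include <mutex>
#include <new>
#include <string>
#include <thread>
#include <vector>

// Class name -> instance count; a stand-in for a KlassInfoTable.
using Histogram = std::map<std::string, std::size_t>;

// Serial fallback: one pass over all objects.
Histogram serial_histogram(const std::vector<std::string>& objects) {
    Histogram table;
    for (const auto& name : objects) {
        ++table[name];
    }
    return table;
}

// Parallel attempt. Returns false if any worker failed to allocate;
// in that case the contents of 'shared' must not be used.
bool parallel_histogram(const std::vector<std::string>& objects,
                        unsigned num_workers, Histogram& shared) {
    std::atomic<bool> success{true};
    std::mutex merge_lock;
    std::vector<std::thread> workers;
    for (unsigned id = 0; id < num_workers; ++id) {
        workers.emplace_back([&, id] {
            try {
                Histogram local;  // thread-local table, no locking while counting
                for (std::size_t i = id; i < objects.size(); i += num_workers) {
                    ++local[objects[i]];
                }
                // Only the merge into the shared table is serialized.
                std::lock_guard<std::mutex> guard(merge_lock);
                for (const auto& kv : local) {
                    shared[kv.first] += kv.second;
                }
            } catch (const std::bad_alloc&) {
                success.store(false);  // can only ever go from true to false
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return success.load();
}

// Tries the parallel path first and falls back to a fresh serial pass.
Histogram histogram(const std::vector<std::string>& objects, unsigned num_workers) {
    if (num_workers > 1) {
        Histogram result;
        if (parallel_histogram(objects, num_workers, result)) {
            return result;
        }
        // 'result' is discarded here, so a partial merge cannot leak into the
        // serial answer and its memory is released before the retry.
    }
    return serial_histogram(objects);
}
```

Scoping the parallel table so that it is destroyed before the serial retry mirrors the suggestion above to release the memory of the failed parallel attempt first.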
> > I'd reorganize the code in populate_table() to make it more clear, vis (I changed _shared_missed_count to _missed_count) > > + if (cit.allocation_failed()) { > > + // fail to allocate memory, stop parallel mode > > + Atomic::store(&_success, false); > > + return; > > + } > > + RecordInstanceClosure ric(&cit, _filter); > > + _poi->object_iterate(&ric, worker_id); > > + missed_count = ric.missed_count(); > > + { > > + MutexLocker x(&_mutex); > > + merge_success = _shared_cit->merge(&cit); > > + } > > + if (merge_success) { > > + Atomic::add(&_missed_count, missed_count); > > + else { > > + Atomic::store(&_success, false); > > + } > > Thanks, > > Paul > > On 6/29/20, 7:20 PM, "linzang(??)" wrote: > > Dear All, > > Sorry to bother again, I just want to make sure that is this change worth to be continue to work on? If decision is made to not. I think I can drop this work and stop asking for help reviewing... > > Thanks for all your help about reviewing this previously. > > BRs, > > Lin > > On 2020/5/9, 3:47 PM, "linzang(??)" wrote: > > Dear All, > > May I ask your help again for review the latest change? Thanks! > > BRs, > > Lin > > On 2020/4/28, 1:54 PM, "linzang(??)" wrote: > > Hi Stefan, > > >> - Adding Atomic::load/store. > > >> - Removing the time measurement in the run_task. I renamed G1's function > > >> to run_task_timed. If we need this outside of G1, we can rethink the API > > >> at that point. > > >> - ZGC style cleanups > > Thanks for revising the patch, they are all good to me, and I have made a tiny change based on it: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/ > > it reduce the scope of mutex in ParHeapInspectTask, and delete unnecessary comments. > > BRs, > > Lin > > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote: > > Hi Lin, > > On 2020-04-26 05:10, linzang(??) wrote: > > > Hi Stefan and Paul? 
> > > I have made a new patch based on your comments and Stefan's Poc code: > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > > others to review. > > > > > > And Here are main changed I made and want to discuss with you: > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > > primitives. We're moving toward a later C++ standard were data races are > > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > only accept AbstraceGangTask* as argument, so I made a delegate class > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > > - Removing the time measurement in the run_task. I renamed G1's function > > to run_task_timed. If we need this outside of G1, we can rethink the API > > at that point. > > - ZGC style cleanups > > Thanks, > > StefanK > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > Thanks, > > > Paul > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" > linzang at tencent.com> wrote: > > > > > > Dear Stefan, > > > > > > Thanks a lot! 
I agree with you to decouple the heap inspection code with GC's. > > > I will start from your POC code, may discuss with you later. > > > > > > > > > BRs, > > > Lin > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > Hi Lin, > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > I started experimenting with doing that, but other higher-priority (to > > > me) tasks have had to take precedence. > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > ParallelObjectIterators. There's also code left unimplemented in around > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > pull out the heap inspection code out of the GCs. > > > > > > Thanks, > > > StefanK > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > Dear all, > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > Thanks! > > > > > > > > BRs, > > > > Lin > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > ?? 
> >> BRs, > > > >> Lin > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > >> > > > > >> > Dear all, > > > >> > Let me try to ease the reviewing work by some explanation :P > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > >> > This patch actually do several things: > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > >> > 2. 
Make a change in how Jmap passing arguments, changes inhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as disscussed athttps://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html > > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some machenism in KlassInfoTable to support parallel iteration, such as merge(). > > > >> > 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel(). > > > >> > 5. Add related test. > > > >> > 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel(). > > > >> > > > > >> > Hope these info could help on code review and initate the discussion :-) > > > >> > Thanks! > > > >> > > > > >> > BRs, > > > >> > Lin > > > >> > >On 2020/2/19, 9:40 AM, "linzang(??)" wrote:. > > > >> > > > > > >> > > Re-post this RFR with correct enhancement number to make it trackable. > > > >> > > please ignore the previous wrong post. sorry for troubles. 
> > > >> > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > -------------- > > > >> > > Lin > > > >> > > >Hi Lin, > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > >> > > >the message subject? > > > >> > > >It will be more trackable this way. > > > >> > > > > > > >> > > >Thanks, > > > >> > > >Serguei > > > >> > > > > > > >> > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > > >> > > >> Dear David, > > > >> > > >> Thanks a lot! > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > >> > > >> > > > >> > > >> Thanks, > > > >> > > >> -------------- > > > >> > > >> Lin > > > >> > > >>> Hi Lin, > > > >> > > >>> > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > > >> > > >>> > > > >> > > >>> I happened to spot one nit when browsing: > > > >> > > >>> > > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > > >> > > >>> > > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > > >> > > >>> + BoolObjectClosure* filter, > > > >> > > >>> + size_t* missed_count, > > > >> > > >>> + size_t thread_num) { > > > >> > > >>> + return NULL; > > > >> > > >>> > > > >> > > >>> s/NULL/false/ > > > >> > > >>> > > > >> > > >>> Cheers, > > > >> > > >>> David > > > > > > > >>> > > > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > > > >> > > >>>> Dear All, > > > >> > > >>>> May I ask your help to review the follow changes: > > > >> > > >>>> webrev: > > > >> > > >>>>http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > > >> > > >>>> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > >>>> related CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. > > > >> > > >>>> > > > >> > > >>>> ------------------------------------------------------------------------ > > > >> > > >>>> BRs, > > > >> > > >>>> Lin > > > >> > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
From chris.plummer at oracle.com Fri Aug 7 22:03:57 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 7 Aug 2020 15:03:57 -0700
Subject: RFR(XXS): 8241951: ClhsdbCDSCore.java failed to find core file
Message-ID: <3897af73-334a-816c-9f2a-5db7c64f8dc8@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8241951
http://cr.openjdk.java.net/~cjplummer/8241951/webrev.00/index.html

We need to disable SA core file testing on OSX 10.15.* and later when
the binary is signed because OSX will no longer produce core files with
this configuration.

thanks,

Chris

From alexey.menkov at oracle.com Fri Aug 7 22:09:23 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Fri, 7 Aug 2020 15:09:23 -0700
Subject: RFR: JDK-8234808: jdb quoted option parsing broken
Message-ID:

Hi all,

please review the fix for
https://bugs.openjdk.java.net/browse/JDK-8234808
webrev:
http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/

Some background:
when jdb launches the debuggee process it passes java options from the
"options" value of the CommandLineLaunch connector, and forwards options
specified before the command.

The fix solves several discovered issues:
- proper handling of java options with spaces
- if both ways are used to specify java options, forwarded options
  override options from the "options" value

The VMConnection class implements tricky logic for "options" field parsing
for JFR needs (handling of single and double quotes).
I decided to keep it as is to avoid massive test failures with JFR (there
is no test coverage for this functionality and I'm not sure I understand
all requirements).

--alex

From daniel.daugherty at oracle.com Fri Aug 7 22:11:58 2020
From: daniel.daugherty at oracle.com (Daniel D.
Daugherty)
Date: Fri, 7 Aug 2020 18:11:58 -0400
Subject: RFR(XXS): 8241951: ClhsdbCDSCore.java failed to find core file
In-Reply-To: <3897af73-334a-816c-9f2a-5db7c64f8dc8@oracle.com>
References: <3897af73-334a-816c-9f2a-5db7c64f8dc8@oracle.com>
Message-ID:

On 8/7/20 6:03 PM, Chris Plummer wrote:
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8241951
> http://cr.openjdk.java.net/~cjplummer/8241951/webrev.00/index.html

test/lib/jdk/test/lib/util/CoreUtils.java
    L139:                 if (Platform.getOsVersionMajor() == 10 && Platform.getOsVersionMinor() >= 15) {
        Should the major version check be ">= 10"?
        Platform.isSignedOSX() would return true for signed 11.X. Dunno.

Thumbs up.

Dan

>
> We need to disable SA core file testing on OSX 10.15.* and later when
> the binary is signed because OSX will no longer produce core files
> with this configuration.
>
> thanks,
>
> Chris

From chris.plummer at oracle.com Fri Aug 7 22:22:12 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 7 Aug 2020 15:22:12 -0700
Subject: RFR(XXS): 8241951: ClhsdbCDSCore.java failed to find core file
In-Reply-To:
References: <3897af73-334a-816c-9f2a-5db7c64f8dc8@oracle.com>
Message-ID:

On 8/7/20 3:11 PM, Daniel D. Daugherty wrote:
> On 8/7/20 6:03 PM, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8241951
>> http://cr.openjdk.java.net/~cjplummer/8241951/webrev.00/index.html
>
> test/lib/jdk/test/lib/util/CoreUtils.java
>     L139:                 if (Platform.getOsVersionMajor() == 10 &&
> Platform.getOsVersionMinor() >= 15) {
>         Should the major version check be ">= 10"?
If I'm going to check for >= 10, then it needs to be something like:

   if (major > 10 || (major == 10 && minor >= 15))
>
>         Platform.isSignedOSX() would return true for signed 11.X. Dunno.
>
Yes, it would return true for 11.x if signed, which is what we would want.
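The "10.15 or later" comparison being discussed can be written as a small standalone helper. The sketch below is only an illustration: the class OsVersionCheck and the method isAtLeast are hypothetical names, not part of jdk.test.lib, and the version numbers are passed in as plain ints rather than read from Platform.

```java
// Hypothetical helper, not part of jdk.test.lib: it only illustrates the
// two-part version comparison discussed above.
final class OsVersionCheck {
    private OsVersionCheck() {}

    /** Returns true when major.minor is at least minMajor.minMinor. */
    static boolean isAtLeast(int major, int minor, int minMajor, int minMinor) {
        // A higher major version qualifies regardless of the minor version;
        // an equal major version qualifies only if the minor is high enough.
        return major > minMajor || (major == minMajor && minor >= minMinor);
    }

    public static void main(String[] args) {
        System.out.println(isAtLeast(10, 14, 10, 15)); // false
        System.out.println(isAtLeast(10, 15, 10, 15)); // true
        System.out.println(isAtLeast(11, 0, 10, 15));  // true
    }
}
```

The 11.0 case is the one that a plain "major == 10 && minor >= 15" test would miss.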
Chris
>
> Thumbs up.
>
> Dan
>
>
>>
>> We need to disable SA core file testing on OSX 10.15.* and later when
>> the binary is signed because OSX will no longer produce core files
>> with this configuration.
>>
>> thanks,
>>
>> Chris
>

From chris.plummer at oracle.com Fri Aug 7 22:25:09 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 7 Aug 2020 15:25:09 -0700
Subject: RFR(XXS): 8241951: ClhsdbCDSCore.java failed to find core file
In-Reply-To:
References: <3897af73-334a-816c-9f2a-5db7c64f8dc8@oracle.com>
Message-ID:

On 8/7/20 3:22 PM, Chris Plummer wrote:
> On 8/7/20 3:11 PM, Daniel D. Daugherty wrote:
>> On 8/7/20 6:03 PM, Chris Plummer wrote:
>>> Hello,
>>>
>>> Please review the following:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8241951
>>> http://cr.openjdk.java.net/~cjplummer/8241951/webrev.00/index.html
>>
>> test/lib/jdk/test/lib/util/CoreUtils.java
>>     L139:                 if (Platform.getOsVersionMajor() == 10 &&
>> Platform.getOsVersionMinor() >= 15) {
>>         Should the major version check be ">= 10"?
> If I'm going to check for >= 10, then it needs to be something like:
>
>    if (major > 10 || (major == 10 && minor >= 15))
>
            if (Platform.isSignedOSX()) {
                if (Platform.getOsVersionMajor() > 10 ||
                    (Platform.getOsVersionMajor() == 10 && Platform.getOsVersionMinor() >= 15))
                {
                    // We can't generate core files with signed binaries on OSX 10.15 and later.
                    throw new SkippedException("Cannot produce core file with signed binary on OSX 10.15 and later");
                }
            }

I'll send an updated webrev if you're ok with this

thanks

Chris
>>
>>         Platform.isSignedOSX() would return true for signed 11.X. Dunno.
>>
> Yes, it would return true for 11.x if signed, which is what we would
> want.
>
> Chris
>>
>> Thumbs up.
>> >> Dan >> >> >>> >>> We need to disable SA core file testing on OSX 10.15.* and later >>> when the binary is signed because OSX will no longer produce core >>> files with this configuration. >>> >>> thanks, >>> >>> Chris >> > > From daniel.daugherty at oracle.com Fri Aug 7 22:26:29 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 7 Aug 2020 18:26:29 -0400 Subject: RFR(XXS): 8241951: ClhsdbCDSCore.java failed to find core file In-Reply-To: References: <3897af73-334a-816c-9f2a-5db7c64f8dc8@oracle.com> Message-ID: <772ce481-7d27-71a5-e94d-41fb5a4ba9f1@oracle.com> On 8/7/20 6:25 PM, Chris Plummer wrote: > On 8/7/20 3:22 PM, Chris Plummer wrote: >> On 8/7/20 3:11 PM, Daniel D. Daugherty wrote: >>> On 8/7/20 6:03 PM, Chris Plummer wrote: >>>> Hello, >>>> >>>> Please review the following: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8241951 >>>> http://cr.openjdk.java.net/~cjplummer/8241951/webrev.00/index.html >>> >>> test/lib/jdk/test/lib/util/CoreUtils.java >>> ??? L139: ??????????????? if (Platform.getOsVersionMajor() == 10 && >>> Platform.getOsVersionMinor() >= 15) { >>> ??????? Should the major version check be ">= 10"? >> If I'm going to check for >= 10, then it needs to be something like : >> >> ?? if (major > 10 || (major == 10 && minor >= 15) >> > ??????????? if (Platform.isSignedOSX()) { > ??????????????? if (Platform.getOsVersionMajor() > 10 || > ??????????????????? (Platform.getOsVersionMajor() == 10 && > Platform.getOsVersionMinor() >= 15)) > ??????????????? { > ??????????????????? // We can't generate cores files with signed > binaries on OSX 10.15 and later. > ??????????????????? throw new SkippedException("Cannot produce core > file with signed binary on OSX 10.15 and later"); > ??????????????? } > ??????????? } > > I'll send an updated webrev if you're ok with this I'm okay with (I don't need another webrev), but other folks might want one... Dan > > thanks > > Chris >>> >>> ??????? 
Platform.isSignedOSX() would return true for signed 11.X. >>> Dunno. >>> >> Yes, it would return true for 11.x if signed, which is what we would >> want. >> >> Chris >>> >>> Thumbs up. >>> >>> Dan >>> >>> >>>> >>>> We need to disable SA core file testing on OSX 10.15.* and later >>>> when the binary is signed because OSX will no longer produce core >>>> files with this configuration. >>>> >>>> thanks, >>>> >>>> Chris >>> >> >> > From chris.plummer at oracle.com Fri Aug 7 22:34:28 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 7 Aug 2020 15:34:28 -0700 Subject: RFR(XXS): 8241951: ClhsdbCDSCore.java failed to find core file In-Reply-To: <772ce481-7d27-71a5-e94d-41fb5a4ba9f1@oracle.com> References: <3897af73-334a-816c-9f2a-5db7c64f8dc8@oracle.com> <772ce481-7d27-71a5-e94d-41fb5a4ba9f1@oracle.com> Message-ID: On 8/7/20 3:26 PM, Daniel D. Daugherty wrote: > > > On 8/7/20 6:25 PM, Chris Plummer wrote: >> On 8/7/20 3:22 PM, Chris Plummer wrote: >>> On 8/7/20 3:11 PM, Daniel D. Daugherty wrote: >>>> On 8/7/20 6:03 PM, Chris Plummer wrote: >>>>> Hello, >>>>> >>>>> Please review the following: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8241951 >>>>> http://cr.openjdk.java.net/~cjplummer/8241951/webrev.00/index.html >>>> >>>> test/lib/jdk/test/lib/util/CoreUtils.java >>>> ??? L139: ??????????????? if (Platform.getOsVersionMajor() == 10 && >>>> Platform.getOsVersionMinor() >= 15) { >>>> ??????? Should the major version check be ">= 10"? >>> If I'm going to check for >= 10, then it needs to be something like : >>> >>> ?? if (major > 10 || (major == 10 && minor >= 15) >>> >> ??????????? if (Platform.isSignedOSX()) { >> ??????????????? if (Platform.getOsVersionMajor() > 10 || >> ??????????????????? (Platform.getOsVersionMajor() == 10 && >> Platform.getOsVersionMinor() >= 15)) >> ??????????????? { >> ??????????????????? // We can't generate cores files with signed >> binaries on OSX 10.15 and later. >> ??????????????????? 
throw new SkippedException("Cannot produce core >> file with signed binary on OSX 10.15 and later"); >> ??????????????? } >> ??????????? } >> >> I'll send an updated webrev if you're ok with this > > I'm okay with (I don't need another webrev), but other folks might > want one... http://cr.openjdk.java.net/~cjplummer/8241951/webrev.01/index.html Still testing. Chris > > Dan > >> >> thanks >> >> Chris >>>> >>>> ??????? Platform.isSignedOSX() would return true for signed 11.X. >>>> Dunno. >>>> >>> Yes, it would return true for 11.x if signed, which is what we would >>> want. >>> >>> Chris >>>> >>>> Thumbs up. >>>> >>>> Dan >>>> >>>> >>>>> >>>>> We need to disable SA core file testing on OSX 10.15.* and later >>>>> when the binary is signed because OSX will no longer produce core >>>>> files with this configuration. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>> >>> >>> >> > From alexey.menkov at oracle.com Fri Aug 7 22:36:52 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 7 Aug 2020 15:36:52 -0700 Subject: RFR(XXS): 8241951: ClhsdbCDSCore.java failed to find core file In-Reply-To: References: <3897af73-334a-816c-9f2a-5db7c64f8dc8@oracle.com> <772ce481-7d27-71a5-e94d-41fb5a4ba9f1@oracle.com> Message-ID: LGTM --alex On 08/07/2020 15:34, Chris Plummer wrote: > On 8/7/20 3:26 PM, Daniel D. Daugherty wrote: >> >> >> On 8/7/20 6:25 PM, Chris Plummer wrote: >>> On 8/7/20 3:22 PM, Chris Plummer wrote: >>>> On 8/7/20 3:11 PM, Daniel D. Daugherty wrote: >>>>> On 8/7/20 6:03 PM, Chris Plummer wrote: >>>>>> Hello, >>>>>> >>>>>> Please review the following: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8241951 >>>>>> http://cr.openjdk.java.net/~cjplummer/8241951/webrev.00/index.html >>>>> >>>>> test/lib/jdk/test/lib/util/CoreUtils.java >>>>> ??? L139: ??????????????? if (Platform.getOsVersionMajor() == 10 && >>>>> Platform.getOsVersionMinor() >= 15) { >>>>> ??????? Should the major version check be ">= 10"? 
>>>> If I'm going to check for >= 10, then it needs to be something like : >>>> >>>> ?? if (major > 10 || (major == 10 && minor >= 15) >>>> >>> ??????????? if (Platform.isSignedOSX()) { >>> ??????????????? if (Platform.getOsVersionMajor() > 10 || >>> ??????????????????? (Platform.getOsVersionMajor() == 10 && >>> Platform.getOsVersionMinor() >= 15)) >>> ??????????????? { >>> ??????????????????? // We can't generate cores files with signed >>> binaries on OSX 10.15 and later. >>> ??????????????????? throw new SkippedException("Cannot produce core >>> file with signed binary on OSX 10.15 and later"); >>> ??????????????? } >>> ??????????? } >>> >>> I'll send an updated webrev if you're ok with this >> >> I'm okay with (I don't need another webrev), but other folks might >> want one... > > http://cr.openjdk.java.net/~cjplummer/8241951/webrev.01/index.html > > Still testing. > > Chris >> >> Dan >> >>> >>> thanks >>> >>> Chris >>>>> >>>>> ??????? Platform.isSignedOSX() would return true for signed 11.X. >>>>> Dunno. >>>>> >>>> Yes, it would return true for 11.x if signed, which is what we would >>>> want. >>>> >>>> Chris >>>>> >>>>> Thumbs up. >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> We need to disable SA core file testing on OSX 10.15.* and later >>>>>> when the binary is signed because OSX will no longer produce core >>>>>> files with this configuration. 
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>
>>>>
>>>>
>>>
>>
>

From linzang at tencent.com Sat Aug 8 09:58:48 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Sat, 8 Aug 2020 09:58:48 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <72bb58cb-add7-b25b-9e1c-879886993857@oracle.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <72bb58cb-add7-b25b-9e1c-879886993857@oracle.com>
Message-ID:

Hi Serguei,

>> What is going to happen if the resulting 'parallel' substring above is not a number?

The error handling logic is located at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html (lines 276-284).
In general, an error message is printed if "parallel" is invalid. An example output would be:
############################
$ time jmap -histo:parallel=c 26233
Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c]
        at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227)
        at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309)
        at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133)
        at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206)
        at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112)
############################

Hi Serguei, Paul and Stefan,

Moreover, I will make a new changeset with the following changes:
1. Print an error message plus usage when a parameter check fails in JMap.java.
2. Restore the histo logic so that if "all" and "live" are set at the same time, "live" is used rather than printing an error message. (Not sure which one is better :P)

What do you think?
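The two proposed changes (reject a non-numeric "parallel" value with a clear message, and let "live" win when both "live" and "all" are given) could look roughly like the sketch below. This is not the JMap.java code under review: the class HistoOptions, its fields, and the "Unknown suboption" message are assumptions for illustration, and only the "Invalid parallel thread number: [c]" wording is taken from the output above.

```java
// Hypothetical sketch, not the JMap.java code under review.
final class HistoOptions {
    final boolean liveOnly;
    final String file;   // null when no file= suboption was given
    final int parallel;  // -1 when no parallel= suboption was given

    private HistoOptions(boolean liveOnly, String file, int parallel) {
        this.liveOnly = liveOnly;
        this.file = file;
        this.parallel = parallel;
    }

    /** Parses a comma-separated suboption string such as "live,file=out.txt,parallel=4". */
    static HistoOptions parse(String options) {
        boolean live = false;
        String file = null;
        int parallel = -1;
        if (!options.isEmpty()) {
            for (String subopt : options.split(",")) {
                if (subopt.equals("live")) {
                    live = true;
                } else if (subopt.equals("all")) {
                    // Accepted, but "live" wins if both are present.
                } else if (subopt.startsWith("file=")) {
                    file = subopt.substring("file=".length());
                } else if (subopt.startsWith("parallel=")) {
                    String value = subopt.substring("parallel=".length());
                    try {
                        parallel = Integer.parseInt(value);
                    } catch (NumberFormatException e) {
                        throw new IllegalArgumentException(
                                "Invalid parallel thread number: [" + value + "]");
                    }
                    if (parallel < 0) {
                        throw new IllegalArgumentException(
                                "Invalid parallel thread number: [" + value + "]");
                    }
                } else {
                    throw new IllegalArgumentException("Unknown suboption: " + subopt);
                }
            }
        }
        return new HistoOptions(live, file, parallel);
    }

    public static void main(String[] args) {
        HistoOptions ok = parse("live,all,parallel=4");
        System.out.println(ok.liveOnly + " " + ok.parallel); // true 4
        try {
            parse("parallel=c");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Invalid parallel thread number: [c]
        }
    }
}
```

Validating on the client side like this would report the problem before the attach operation is sent; whether that is preferable to the server-side check in attachListener.cpp is the open question in this thread.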
BRs,
Lin

From: "serguei.spitsyn at oracle.com"
Date: Saturday, August 8, 2020 at 1:16 AM
To: "linzang(??)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

On 8/7/20 00:24, serguei.spitsyn at oracle.com wrote:

Hi Lin,

Not sure, I fully understand the spec update and the options processing in the file:
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html

The spec says nothing if the new option 'parallel' is mandatory or optional.
Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
The JMap.java implementation just print usage in two cases:

191             } else if (subopt.startsWith("parallel=")) {
192                 parallel = subopt.substring("parallel=".length());

Forgot to ask...
What is going to happen if the resulting 'parallel' substring above is not a number?

Thanks,
Serguei

193                 if (parallel == null) {
194                     usage(1);
195                 }
...
200         if (set_live && set_all) {
201             usage(1);
202         }

It is not that helpful as the usage does not explain anything about these corner cases.
Also, it allows to pass no parallel option.
What is going to happen if null is passed in place of parallel here? :

206         executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);

Should the lines 193-195 be moved after the line 202?

Thanks,
Serguei

On 8/5/20 18:59, linzang(??) wrote:

Thanks Paul! And I have verified this change could build success in windows.

BRs,
Lin

On 2020/8/6, 4:17 AM, "Hohensee, Paul" wrote:

Two tiny nits that don't need a new webrev:

In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().
In heapInspection.hpp, you can delete two of the three blank lines before #endif // SHARE_MEMORY_HEAPINSPECTION_HPP Thanks, Paul On 8/5/20, 6:46 AM, "linzang(??)" wrote: Hi Paul, Stefan and Serguei, Here I uploaded a new changeset, would you like to help review again? Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/ Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/ P.S. I am in process of building it on windows environment for a double check. May update result later. Thanks! BRs, Lin On 2020/8/5, 8:56 PM, "Hohensee, Paul" wrote: uintx is fine with me. Thanks, Paul On 8/5/20, 1:14 AM, "linzang(??)" wrote: Hi Stefan, I got your point, thanks for explanation. I missed the atomic part when considering it. Hi Paul, Do you think it is ok to use uintx? I checked it is actually a uintptr_t type. BRs, Lin On 2020/8/5, 3:39 PM, "Stefan Karlsson" wrote: On 2020-08-05 07:22, linzang(??) wrote: > Hi Serguei, > > No problem, Thanks for your reviewing :) > > I wll upload a new webrev later, so may I ask your help to review it > again? > > Hi Stefan, > > As Paul mentioned, the _/missed/_count is not a size, so size_t may > not be clear, what?s your opinion about uint64_t? We typically don't restrict the usage of size_t to only *sizes* in the HotSpot. If you search the code you'll find many count variables using size_t, so I personally don't see the need to change the type. However, if you really do want to change it then maybe using another type that is 32 bits on 32-bit machines, maybe uintx? IIRC, using uint64_t and some of the Atomics operations are problematic on some 32-bit platforms, so using a type that matches the word size of the targetted machine helps you not having to think about that. > > It seems the uint overflow may happened on 64bit machine with large > heap, e.g. 
may be more than 4 Giga objects (8byte header + 8 byte > klassptr + 8byte field) in a heap that is larger than 96 GB, uint64_t > is ok in this case. Exactly. Thanks, StefanK > > BRs, > > Lin > > *From: *"serguei.spitsyn at oracle.com" > *Date: *Wednesday, August 5, 2020 at 1:02 PM > *To: *"linzang(??)" , "Hohensee, Paul" > , Stefan Karlsson , > David Holmes , serviceability-dev > , "hotspot-gc-dev at openjdk.java.net" > > *Subject: *Re: RFR(L): 8215624: add parallel heap inspection support for > jmap histo(G1)(Internet mail) > > Oh, sorry for the confusion, please, skip my question. :) > C++ does not have the '&&=' operator. > > Thanks, > Serguei > > On 8/4/20 21:56, serguei.spitsyn at oracle.com > wrote: > > Hi Lin, > > https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/src/hotspot/share/memory/heapInspection.cpp.udiff.html > > +class KlassInfoTableMergeClosure : public KlassInfoClosure { > > +private: > > + KlassInfoTable* _dest; > > + bool _success; > > +public: > > + KlassInfoTableMergeClosure(KlassInfoTable* table) : _dest(table), > _success(true) {} > > + void do_cinfo(KlassInfoEntry* cie) { > > + _success &= _dest->merge_entry(cie); > > + } > > The operator '&=' above looks strange. > Did you actually want to use the operator '&&=' instead? : > > + _success &&= _dest->merge_entry(cie); > > > Thanks, > Serguei > > > > > On 8/3/20 07:51, linzang(??) wrote: > > Dear Stefan, > > May I ask your help to review again? I have made a delta based on the last changeset you have reviewed(webrev04),hope it could ease your reviewing work. 
> > webrev:https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > delta (vs webrev04):https://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/delta_10vs04/webrev/ > > bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > CSR(approved):https://bugs.openjdk.java.net/browse/JDK-8239290 > > > > BRs, > > Lin > > On 2020/7/30, 5:21 AM, "Hohensee, Paul" wrote: > > A submit repo run with this succeeded, so afaic you're good to go. Stefan, you reviewed the GC part before, it'd be great if you could ok the final version. > > Thanks, > > Paul > > On 7/29/20, 5:02 AM, "linzang(??)" wrote: > > Upload a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_10/ > > It fix an issue of windows fail : > > #################################### > > In heapInspect.cpp > > - size_t HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > + uint HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, uint parallel_thread_num) { > > #################################### > > In heapInspect.hpp > > - size_t populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > + uint populate_table(KlassInfoTable* cit, BoolObjectClosure* filter = NULL, uint parallel_thread_num = 1) NOT_SERVICES_RETURN_(0); > > #################################### > > BRs, > > Lin > > On 2020/7/27, 11:26 AM, "linzang(??)" wrote: > > I update a new change athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_09 > > It includes a tiny fix of build failure on windows: > > #################################### > > In attachListener.cpp: > > - uint parallel_thread_num = MAX(1, (uint)os::initial_active_processor_count() * 3 / 8); > > + uint parallel_thread_num = MAX2(1, (uint)os::initial_active_processor_count() * 3 / 8); > > #################################### > > BRs, > > Lin > > On 2020/7/23, 11:56 AM, "linzang(??)" wrote: > > Hi 
Paul, > > Thanks for your help, that all looks good to me. > > Just 2 minor changes: > > ? delete the final return in ParHeapInspectTask::work, you mentioned it but seems not include in the webrev :-) > > ? delete a unnecessary blank line in heapInspect.cpp at merge_entry() > > ######################################################################### > > --- old/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.281666456 +0800 > > +++ new/src/hotspot/share/memory/heapInspection.cpp 2020-07-23 11:23:29.017666447 +0800 > > @@ -251,7 +251,6 @@ > > _size_of_instances_in_words += cie->words(); > > return true; > > } > > - > > return false; > > } > > @@ -568,7 +567,6 @@ > > Atomic::add(&_missed_count, missed_count); > > } else { > > Atomic::store(&_success, false); > > - return; > > } > > } > > ######################################################################### > > Here is the webrevhttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_08/ > > BRs, > > Lin > > --------------------------------------------- > > From: "Hohensee, Paul" > > Date: Thursday, July 23, 2020 at 6:48 AM > > To: "linzang(??)" , Stefan Karlsson ,"serguei.spitsyn at oracle.com" , David Holmes , serviceability-dev ,"hotspot-gc-dev at openjdk.java.net" > > Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > > Just small things. > > heapInspection.cpp: > > In ParHeapInspectTask::work, remove the final return statement and fix the following ?}? indent. I.e., replace > > + Atomic::store(&_success, false); > > + return; > > + } > > with > > + Atomic::store(&_success, false); > > + } > > In HeapInspection::heap_inspection, missed_count should be a uint to match other missed_count declarations, and should be initialized to the result of populate_table() rather than separately to 0. > > attachListener.cpp: > > In heap_inspection, initial_processor_count returns an int, so cast its result to a uint. 
> > Similarly, parse_uintx returns a uintx, so cast its result (num) to uint when assigning to parallel_thread_num. > > BasicJMapTest.java: > > I took the liberty of refactoring testHisto*/histoToFile/testDump*/dump to remove redundant interposition methods and make histoToFile and dump look as similar as possible. > > Webrev with the above changes in > > http://cr.openjdk.java.net/~phh/8214535/webrev.01/ > > Thanks, > > Paul > > On 7/15/20, 2:13 AM, "linzang(??)" wrote: > > Upload a new webrev athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07/ > > It fix a potential issue that unexpected number of threads maybe calculated for "parallel" option of jmap -histo in container. > > As shown athttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_07-delta/src/hotspot/share/services/attachListener.cpp.udiff.html > > ############### attachListener.cpp #################### > > @@ -252,11 +252,11 @@ > > static jint heap_inspection(AttachOperation* op, outputStream* out) { > > bool live_objects_only = true; // default is true to retain the behavior before this change is made > > outputStream* os = out; // if path not specified or path is NULL, use out > > fileStream* fs = NULL; > > const char* arg0 = op->arg(0); > > - uint parallel_thread_num = MAX(1, os::processor_count() * 3 / 8); // default is less than half of processors. > > + uint parallel_thread_num = MAX(1, os::initial_active_processor_count() * 3 / 8); // default is less than half of processors. > > if (arg0 != NULL && (strlen(arg0) > 0)) { > > if (strcmp(arg0, "-all") != 0 && strcmp(arg0, "-live") != 0) { > > out->print_cr("Invalid argument to inspectheap operation: %s", arg0); > > return JNI_ERR; > > } > > ################################################### > > Thanks. > > BRs, > > Lin > > On 2020/7/9, 3:22 PM, "linzang(??)" wrote: > > Hi Paul, > > Thanks for reviewing! > > >> > > >> I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > >> > > The reason I made the change in Jmap.java that compose all arguments as 1 string , instead of passing 3 argments, is to avoid the compatibility issue, as we discussed inhttp://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/thread.html#27240. The root cause of the compatibility issue is because max argument count in HotspotVirtualMachineImpl.java and attachlistener.cpp need to be enlarged (changes likehttp://hg.openjdk.java.net/jdk/jdk/rev/e7cf035682e3#l2.1) when jmap has more than 3 arguments. But if user use an old jcmd/jmap tool, it may stuck at socket read(), because the "max argument count" don't match. > > I re-checked this change, the argument count of jmap histo is equal to 3 (live, file, parallel), so it can work normally even without the change of passing argument. But I think we have to face the problem if more arguments is added in jcmd alike tools later, not sure whether it should be sloved (or a workaround) in this changeset. > > And here are the lastest webrev and delta: > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06/ > > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_06-delta/ > > Cheers, > > Lin > > On 2020/7/7, 5:57 AM, "Hohensee, Paul" wrote: > > I'd like to see this feature added. :) > > The CSR looks good, as does the basic parallel inspection algorithm. Stefan's done the GC part, so I'll stick to the non-GC part (fwiw, the GC part lgtm). > > I'd move all the argument parsing code to JMap.java and just pass the results to Hotspot. 
Both histo() in JMap.java and code in attachListener.* parse the command line arguments, though the code in histo() doesn't parse the argument to "parallel". I'd upgrade the code in histo() to do a complete parse and pass the option values to executeCommandForPid as before: there would just be more of them now. That would allow you to eliminate all the parsing code in attachListener.cpp as well as the change to arguments.hpp. > > heapInspection.hpp: > > _shared_miss_count (s/b _missed_count, see below) isn't a size, so it should be a uint instead of a size_t. Same with the new parallel_thread_num argument to heap_inspection() and populate_table(). > > Comment copy-edit: > > +// Parallel heap inspection task. Parallel inspection can fail due to > > +// a native OOM when allocating memory for TL-KlassInfoTable. > > +// _success will be set false on an OOM, and serial inspection tried. > > _shared_miss_count should be _missed_count to match the missed_count() getter, or rename missed_count() to be shared_miss_count(). Whichever way you go, the field type should match the getter result type: uint is reasonable. > > heapInspection.cpp: > > You might use ResourceMark twice in populate_table, separately for the parallel attempt and the serial code. If the parallel attempt fails and available memory is low, it would be good to clean up the memory used by the parallel attempt before doing the serial code. > > Style nit in KlassInfoTable::merge_entry(). I'd line up the definitions of k and elt, so "k" is even with "elt". And, because it's two lines shorter, I'd replace > > + } else { > > + return false; > > + } > > with > > + return false; > > KlassInfoTableMergeClosure.is_success() should be just success() (i.e., no "is_" prefix) because it's a getter. 
> > I'd reorganize the code in populate_table() to make it more clear, viz. (I changed _shared_missed_count to _missed_count)
> > +  if (cit.allocation_failed()) {
> > +    // fail to allocate memory, stop parallel mode
> > +    Atomic::store(&_success, false);
> > +    return;
> > +  }
> > +  RecordInstanceClosure ric(&cit, _filter);
> > +  _poi->object_iterate(&ric, worker_id);
> > +  missed_count = ric.missed_count();
> > +  {
> > +    MutexLocker x(&_mutex);
> > +    merge_success = _shared_cit->merge(&cit);
> > +  }
> > +  if (merge_success) {
> > +    Atomic::add(&_missed_count, missed_count);
> > +  } else {
> > +    Atomic::store(&_success, false);
> > +  }
> > Thanks,
> > Paul
> > On 6/29/20, 7:20 PM, "linzang(??)" wrote:
> > Dear All,
> > Sorry to bother you again. I just want to make sure: is this change worth continuing to work on? If the decision is made not to, I think I can drop this work and stop asking for help reviewing...
> > Thanks for all your help reviewing this previously.
> > BRs,
> > Lin
> > On 2020/5/9, 3:47 PM, "linzang(??)" wrote:
> > Dear All,
> > May I ask your help again to review the latest change? Thanks!
> > BRs,
> > Lin
> > On 2020/4/28, 1:54 PM, "linzang(??)" wrote:
> > Hi Stefan,
> > >> - Adding Atomic::load/store.
> > >> - Removing the time measurement in the run_task. I renamed G1's function
> > >> to run_task_timed. If we need this outside of G1, we can rethink the API
> > >> at that point.
> > >> - ZGC style cleanups
> > Thanks for revising the patch. The changes all look good to me, and I have made a tiny change based on it:
> > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/
> > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/
> > It reduces the scope of the mutex in ParHeapInspectTask and deletes unnecessary comments.
> > BRs,
> > Lin
> > On 2020/4/27, 4:34 PM, "Stefan Karlsson" wrote:
> > Hi Lin,
> > On 2020-04-26 05:10, linzang(??) wrote:
> > > Hi Stefan and Paul,
> > > I have made a new patch based on your comments and Stefan's Poc code: > > > Webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/ > > > Delta(based on Stefan's change:) :http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/ > > Thanks for providing a delta patch. It makes it much easier to look at, > > and more likely for reviewers to continue reviewing. > > I'm going to continue focusing on the GC parts, and leave the rest to > > others to review. > > > > > > And Here are main changed I made and want to discuss with you: > > > 1. changed"parallelThreadNum=" to "parallel=" for jmap -histo options. > > > 2. Add logic to test where parallelHeapInspection is fail, in heapInspection.cpp > > > This is because the parHeapInspectTask create thread local KlassInfoTable in it's work() method, and this may fail because of native OOM, in this case, the parallel should fail and serial heap inspection can be tried. > > > One more thing I want discuss with you is about the member "_success" of parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation can be conducted in multiple threads, should it be atomic ops? IMO, this is not necessary because "_success" can only be set to false, and there is no way to change it from back to true after the ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that? > > In these situations you should be using the Atomic::load/store > > primitives. We're moving toward a later C++ standard were data races are > > considered undefined behavior. > > > 3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass of collectedHeap should support it, so later implementation of new collectedHeap will not miss the "parallel" features. 
> > > The problem I want to discuss with you is about epsilonHeap and SerialHeap, as they may not need parallel heap iteration, so I only make task->work(0), in case the run_task() is invoked someway in future. Another way is to left run_task() unimplemented, which one do you think is better? > > I don't have a strong opinion about this. > > And also please help take a look at the zHeap, as there is a class > > zTask that wrap the abstractGangTask, and the collectedHeap::run_task() > > only accept AbstraceGangTask* as argument, so I made a delegate class > > to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp. > > > > > > There maybe other better ways to sovle the above problems, welcome for any comments, Thanks! > > I've created a few cleanups and changes on top of your latest patch: > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta > > https://cr.openjdk.java.net/~stefank/8215624/webrev.02 > > - Adding Atomic::load/store. > > - Removing the time measurement in the run_task. I renamed G1's function > > to run_task_timed. If we need this outside of G1, we can rethink the API > > at that point. > > - ZGC style cleanups > > Thanks, > > StefanK > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 11:08 AM, "linzang(??)" wrote: > > > > > > Thanks Paul! I agree with using "parallel", will make the update in next patch, Thanks for help update the CSR. > > > > > > BRs, > > > Lin > > > > > > On 2020/4/23, 4:42 AM, "Hohensee, Paul" wrote: > > > > > > For the interface, I'd use "parallel" instead of "parallelThreadNum". All the other options are lower case, and it's a lot easier to type "parallel". I took the liberty of updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course JMap.usage. > > > > > > Thanks, > > > Paul > > > > > > On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(??)" > linzang at tencent.com> wrote: > > > > > > Dear Stefan, > > > > > > Thanks a lot! 
I agree with you to decouple the heap inspection code with GC's. > > > I will start from your POC code, may discuss with you later. > > > > > > > > > BRs, > > > Lin > > > > > > On 2020/4/22, 5:14 PM, "Stefan Karlsson" wrote: > > > > > > Hi Lin, > > > > > > I took a look at this earlier and saw that the heap inspection code is > > > strongly coupled with the CollectedHeap and G1CollectedHeap. I'd prefer > > > if we'd abstract this away, so that the GCs only provide a "parallel > > > object iteration" interface, and the heap inspection code is kept elsewhere. > > > > > > I started experimenting with doing that, but other higher-priority (to > > > me) tasks have had to take precedence. > > > > > > I've uploaded my work-in-progress / proof-of-concept: > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/ > > >https://cr.openjdk.java.net/~stefank/8215624/webrev.01/ > > > > > > The current code doesn't handle the lifecycle (deletion) of the > > > ParallelObjectIterators. There's also code left unimplemented in around > > > CollectedHeap::run_task. However, I think this could work as a basis to > > > pull out the heap inspection code out of the GCs. > > > > > > Thanks, > > > StefanK > > > > > > On 2020-04-22 02:21, linzang(??) wrote: > > > > Dear all, > > > > May I ask you help to review? This RFR has been there for quite a while. > > > > Thanks! > > > > > > > > BRs, > > > > Lin > > > > > > > > > On 2020/3/16, 5:18 PM, "linzang(??)" wrote:> > > > > > > > >> Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). > > > >> webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ > > > >> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > ?? 
> >> BRs, > > > >> Lin > > > >> > On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > > >> > > > > >> > Dear all, > > > >> > Let me try to ease the reviewing work by some explanation :P > > > >> > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > > > >> > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > > > >> > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. So I plan to implement it incrementally, the first patch (this one) is going to confirm the implemention detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the begining. > > > >> > This patch actually do several things: > > > >> > 1. Add an option "parallelThreadNum=" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR:https://bugs.openjdk.java.net/browse/JDK-8239290) > > > >> > 2. 
Change how jmap passes arguments; see the changes in http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html. Originally it passed options as separate arguments to attachListener; this patch composes all options into a single string. So arg_count_max in attachListener.hpp does not need to be changed, which avoids the compatibility issue discussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
> > > >> > 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp. Its work(uint worker_id) method prepares the data structure (KlassInfoTable) needed by every parallel worker thread, and then calls do_object_iterate_parallel(), which is the heap-specific implementation. I also added some mechanisms in KlassInfoTable to support parallel iteration, such as merge().
> > > >> > 4. In the specific heap (G1 in this patch), create a subclass of ParHeapInspectTask and implement do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invokes g1CollectedHeap's object_iterate_parallel().
> > > >> > 5. Add a related test.
> > > >> > 6. It may be easy to extend this patch to other kinds of heap by creating a subclass of ParHeapInspectTask and implementing do_object_iterate_parallel().
> > > >> >
> > > >> > Hope this info helps with the code review and initiates the discussion :-)
> > > >> > Thanks!
> > > >> >
> > > >> > BRs,
> > > >> > Lin
> > > >> > >On 2020/2/19, 9:40 AM, "linzang(臧琳)" wrote:
> > > >> > >
> > > >> > > Re-post this RFR with correct enhancement number to make it trackable.
> > > >> > > please ignore the previous wrong post. sorry for troubles.
> > > >> > > > > > >> > > webrev:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ > > > >> > > Hi bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > -------------- > > > >> > > Lin > > > >> > > >Hi Lin, > > > > > > > > > > > >> > > >Could you, please, re-post your RFR with the right enhancement number in > > > >> > > >the message subject? > > > >> > > >It will be more trackable this way. > > > >> > > > > > > >> > > >Thanks, > > > >> > > >Serguei > > > >> > > > > > > >> > > > > > > >> > > >On 2/17/20 10:29 PM, linzang(??) wrote: > > > >> > > >> Dear David, > > > >> > > >> Thanks a lot! > > > >> > > >> I have updated the refined code tohttp://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > > > >> > > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > > > >> > > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. > > > >> > > >> > > > >> > > >> Thanks, > > > >> > > >> -------------- > > > >> > > >> Lin > > > >> > > >>> Hi Lin, > > > >> > > >>> > > > >> > > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC > > > >> > > >>> worker threads, and whether it needs to be extended beyond G1. 
> > > >> > > >>> > > > >> > > >>> I happened to spot one nit when browsing: > > > >> > > >>> > > > >> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > > >> > > >>> > > > >> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > > >> > > >>> + BoolObjectClosure* filter, > > > >> > > >>> + size_t* missed_count, > > > >> > > >>> + size_t thread_num) { > > > >> > > >>> + return NULL; > > > >> > > >>> > > > >> > > >>> s/NULL/false/ > > > >> > > >>> > > > >> > > >>> Cheers, > > > >> > > >>> David > > > > > > > >>> > > > >> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > > > >> > > >>>> Dear All, > > > >> > > >>>> May I ask your help to review the follow changes: > > > >> > > >>>> webrev: > > > >> > > >>>>http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > > >> > > >>>> bug:https://bugs.openjdk.java.net/browse/JDK-8215624 > > > >> > > >>>> related CSR:https://bugs.openjdk.java.net/browse/JDK-8239290 > > > >> > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > > >> > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > > >> > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. > > > >> > > >>>> > > > >> > > >>>> ------------------------------------------------------------------------ > > > >> > > >>>> BRs, > > > >> > > >>>> Lin > > > >> > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Mon Aug 10 05:35:15 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 10 Aug 2020 15:35:15 +1000 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: References: Message-ID: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> Hi Richard, On 31/07/2020 5:28 pm, Reingruber, Richard wrote: > Hi, > > I rebase the fix after JDK-8250042. 
> > New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/ The general fix for this seems good. A minor nit: 588 if (!is_assignable(signature, ob_k, Thread::current())) { You know that the current thread is the VMThread so can use VMThread::vm_thread(). Similarly for this existing code: 694 Thread* current_thread = Thread::current(); --- Looking at the test code ... I'm less clear on exactly what is happening and the use of spin-waits raises some red-flags for me in terms of test reliability on different platforms. The "while (--waitCycles > 0)" loop in particular offers no certainty that the agent thread is executing anything in particular. And the use of the spin_count as a guide to future waiting time seems somewhat arbitrary. In all seriousness I got a headache trying to work out how the test was expecting to operate. Some parts could be simplified using raw monitors, I think. But there's no sure way to know the agent thread is in the midst of the stackwalk when the target thread wants to leave the native code. So I understand what you are trying to achieve here, I'm just not sure how reliably it will actually achieve it. test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp 32 static volatile jlong spinn_count = 0; Using a 64-bit counter seems like it will be a problem on 32-bit systems. Should be spin_count not spinn_count. 36 // Agent thread waits for value != 0, then performas the JVMTI call to get local variable. typo: performas Thanks, David ----- > Thanks, Richard. > > -----Original Message----- > From: serviceability-dev On Behalf Of Reingruber, Richard > Sent: Montag, 27. Juli 2020 09:45 > To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net > Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > > > I tested it on Linux and Windows but not yet on MacOS. > > The test succeeded now on all platforms. > > Thanks, Richard. 
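The review above questions the spin-wait loops in the test agent and suggests raw monitors for part of the synchronization. As a rough illustration of the difference, here is a blocking handshake in standard C++. It is a sketch only and is not taken from the webrev: `std::mutex` and `std::condition_variable` stand in for a JVMTI raw monitor, and `Handshake` and `run_round` are hypothetical names.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical two-party handshake. The waiter blocks on a condition variable
// instead of counting down a spin loop, so progress does not depend on timing.
class Handshake {
 public:
  // Called by the target thread once it has reached the state of interest.
  void signal_ready() {
    {
      std::lock_guard<std::mutex> guard(lock_);
      ready_ = true;
    }
    cv_.notify_all();
  }

  // Called by the agent thread: blocks until signal_ready() has been called.
  void wait_ready() {
    std::unique_lock<std::mutex> guard(lock_);
    cv_.wait(guard, [this] { return ready_; });
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  bool ready_ = false;
};

// One round: the "target" signals, the "agent" waits and records that it
// observed the signal. Returns true if the agent saw the signal.
bool run_round() {
  Handshake hs;
  bool observed = false;
  std::thread agent([&] {
    hs.wait_ready();
    observed = true;  // read below only after join(), so no data race
  });
  std::thread target([&] { hs.signal_ready(); });
  target.join();
  agent.join();
  return observed;
}
```

This only makes the hand-off itself deterministic. It does not address the other point raised in the review, namely that there is no sure way to know the agent thread is in the middle of the stack walk when the target thread leaves native code.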
> > -----Original Message----- > From: Reingruber, Richard > Sent: Freitag, 24. Juli 2020 15:04 > To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net > Subject: RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > >> The fix itself looks good to me. > > thanks for looking at the fix. > >> I still need another look at new test. >> Could you, please, convert the agent of new test to C++? >> It will make it a little bit simpler. > > Sure, here is the new webrev.1 with a C++ version of the test agent: > > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/ > > I tested it on Linux and Windows but not yet on MacOS. > > Thanks, > Richard. > > -----Original Message----- > From: serguei.spitsyn at oracle.com > Sent: Freitag, 24. Juli 2020 00:00 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Richard, > > Thank you for filing the CR and taking care about it! > The fix itself looks good to me. > I still need another look at new test. > Could you, please, convert the agent of new test to C++? > It will make it a little bit simpler. > > Thanks, > Serguei > > > On 7/20/20 01:15, Reingruber, Richard wrote: >> Hi, >> >> please help review this fix for VM_GetOrSetLocal. It moves the unsafe stackwalk from the vm >> operation prologue before the safepoint into the doit() method executed at the safepoint. >> >> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html >> Bug: https://bugs.openjdk.java.net/browse/JDK-8249293 >> >> According to the JVMTI spec on local variable access it is not required to suspend the target thread >> T [1]. The operation will simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing >> bytecodes. It will succeed though if T is blocked because of synchronization or executing some native >> code. 
>> >> The issue is that in the latter case the stack walk in VM_GetOrSetLocal::doit_prologue() to prepare >> the access to the local variable is unsafe, because it is done before the safepoint and it races >> with T returning to execute bytecodes making its stack not walkable. The included test shows that >> this can crash the VM if T wins the race. >> >> Manual testing: >> >> - new test test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti >> - test/hotspot/jtreg/serviceability/jvmti >> >> Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, >> Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms >> >> Thanks, Richard. >> >> [1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local > From serguei.spitsyn at oracle.com Mon Aug 10 17:21:42 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 10 Aug 2020 10:21:42 -0700 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> Message-ID: <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Aug 10 17:23:59 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 10 Aug 2020 10:23:59 -0700 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> Message-ID: <4b23be41-fe50-1178-76db-673e37f53e59@oracle.com> An HTML attachment was scrubbed... 
URL: 

From coleen.phillimore at oracle.com Mon Aug 10 19:37:08 2020
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Aug 2020 15:37:08 -0400
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
Message-ID: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com>

These OopHandles are created and released during breakpoints and Thread stack walking operations. They should have their own OopStorage so that GC can detect whether these things affect timing.

Tested with tier1-6.

open webrev at http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8251302

Thanks,
Coleen

From serguei.spitsyn at oracle.com Mon Aug 10 20:34:27 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Aug 2020 13:34:27 -0700
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com>
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com>
Message-ID: <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com>

An HTML attachment was scrubbed...
URL: 

From serguei.spitsyn at oracle.com Mon Aug 10 20:38:56 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Aug 2020 13:38:56 -0700
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To: <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com>
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com>
Message-ID: <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com>

An HTML attachment was scrubbed...
URL: 

From serguei.spitsyn at oracle.com Mon Aug 10 21:10:05 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Aug 2020 14:10:05 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com>
Message-ID: <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com>

An HTML attachment was scrubbed...
URL: 

From coleen.phillimore at oracle.com Mon Aug 10 21:28:59 2020
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Aug 2020 17:28:59 -0400
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To: <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com>
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com> <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com>
Message-ID: <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com>

On 8/10/20 4:38 PM, serguei.spitsyn at oracle.com wrote:
> On 8/10/20 13:34, serguei.spitsyn at oracle.com wrote:
>> Hi Coleen,
>>
>> It looks good to me.
>> Minor:
>> +void JvmtiExport::initialize_oop_storage() {
>> +  // OopStorage needs to be created early in startup and unconditionally
>> +  // because of OopStorageSet static array indices.
>> +  _jvmti_oop_storage = OopStorageSet::create_strong("Thread OopStorage");
>> +}
>> +
>> Would it be better to replace "Thread OopStorage" with "JVMTI OopStorage"?
>
> In the file
> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev/test/jdk/jdk/jfr/event/gc/collection/TestG1ParallelPhases.java.udiff.html
> I see this:
>   "Thread OopStorage",
> + "ThreadService OopStorage",
> It is not clear if we can simply add "JVMTI OopStorage" above.

Serguei,
Thank you for finding this. I was wondering why I didn't have to add JVMTI OopStorage to the test. I'd cut/pasted the same string for Thread OopStorage.

I'll fix this and the test and retest.

thanks,
Coleen
>
> Thanks,
> Serguei
>
>
>> No need for another webrev.
>>
>> Thanks,
>> Serguei
>>
>> On 8/10/20 12:37, Coleen Phillimore wrote:
>>> These OopHandles are created and released during breakpoints and
>>> Thread stack walking operations. They should have their own
>>> OopStorage so that GC can detect whether these things affect timing.
>>>
>>> Tested with tier1-6.
>>>
>>> open webrev at
>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8251302
>>>
>>> Thanks,
>>> Coleen
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From serguei.spitsyn at oracle.com Mon Aug 10 22:23:09 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Aug 2020 15:23:09 -0700
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To: <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com>
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com> <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com> <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com>
Message-ID: <7a0de293-3aba-c580-84df-83d06306e7e3@oracle.com>

An HTML attachment was scrubbed...
URL: 

From coleen.phillimore at oracle.com Mon Aug 10 22:39:47 2020
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Aug 2020 18:39:47 -0400
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To: 
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com> <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com> <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com>
Message-ID: <20c3370f-ae8c-aadc-53b1-95db92776f31@oracle.com>

Adding back serviceability-dev.
Thanks for reviewing Serguei.
Coleen

On 8/10/20 6:11 PM, Coleen Phillimore wrote:
>
>
> On 8/10/20 5:28 PM, Coleen Phillimore wrote:
>>
>>
>> On 8/10/20 4:38 PM, serguei.spitsyn at oracle.com wrote:
>>> On 8/10/20 13:34, serguei.spitsyn at oracle.com wrote:
>>>> Hi Coleen,
>>>>
>>>> It looks good to me.
>>>> Minor:
>>>> +void JvmtiExport::initialize_oop_storage() {
>>>> +  // OopStorage needs to be created early in startup and
>>>> unconditionally
>>>> +  // because of OopStorageSet static array indices.
>>>> +  _jvmti_oop_storage = OopStorageSet::create_strong("Thread
>>>> OopStorage");
>>>> +}
>>>> +
>>>> Would it be better to replace "Thread OopStorage" with "JVMTI
>>>> OopStorage"?
>>>
>>> In the file
>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev/test/jdk/jdk/jfr/event/gc/collection/TestG1ParallelPhases.java.udiff.html
>>>
>>> I see this:
>>>   "Thread OopStorage",
>>> + "ThreadService OopStorage",
>>> It is not clear if we can simply add "JVMTI OopStorage" above.
>>
>> Serguei, Thank you for finding this. I was wondering why I didn't
>> have to add JVMTI OopStorage to the test. I'd cut/pasted the same
>> string for Thread OopStorage.
>>
>> I'll fix this and the test and retest.
>
> Hi Serguei,
>
> open webrev at
> http://cr.openjdk.java.net/~coleenp/2020/8251302.02.incr/webrev
>
> This fixes the test as well.
>
> Thanks!
> Coleen
>
>>
>> thanks,
>> Coleen
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>> No need for another webrev.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>> On 8/10/20 12:37, Coleen Phillimore wrote:
>>>>> These OopHandles are created and released during breakpoints and
>>>>> Thread stack walking operations. They should have their own
>>>>> OopStorage so that GC can detect whether these things affect timing.
>>>>>
>>>>> Tested with tier1-6.
>>>>>
>>>>> open webrev at
>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8251302
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>
>>>
>>
>

From linzang at tencent.com Mon Aug 10 23:23:44 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Mon, 10 Aug 2020 23:23:44 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com>
Message-ID: 

Dear Serguei,

Here is my reply to your question about a non-numeric value for "parallel" (somehow the replies in this thread got out of order, not sure why).

> >> What is going to happen if the resulting 'parallel' substring above is not a number?
> The error handling logic is located at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html (lines 276-284).
> Generally, an error message will be printed if "parallel" is illegal. An example output would be:
> ############################
> $ time jmap -histo:parallel=c 26233
> Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c]
>     at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227)
>     at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309)
>     at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133)
>     at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206)
>     at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112)
>
> ############################
>
> Hi Serguei, Paul and Stefan.
> Moreover, I will make a new changeset with the following changes:
> * Print an error message + usage when a parameter check fails in JMap.java
> * Restore the histo logic that if "all" and "live" are set at the same time, "live" is used, rather than printing an error message. (not sure which one is better :P)

My last point is to restore the previous behavior for compatibility.
And do you think making a separate enhancement for the spec is reasonable?

Thanks!

BRs,
Lin

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, August 11, 2020 at 5:11 AM
To: "linzang(臧琳)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

On 8/7/20 03:41, linzang(臧琳) wrote:

Dear Serguei,
    Thanks a lot for your review!

>> The spec says nothing if the new option 'parallel' is mandatory or optional.
>> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.

    For "parallel", the spec adds that "parallel=0" is the default behavior. So my assumption is that if parallel is not used, it will be 0. Do you think that is OK? Is it necessary to explicitly add a comment like "if no parallel is set, use the default value 0"?

It'd be nice to make it clear.
But the CSR will need to be updated.
In fact, I did not want you to go through this cycle again.
But maybe it is worth improving the specs in this regard.
Maybe Paul has some alternative suggestions.

    For "live" and "all": before the changeset, the logic in the code was that both of them could be set at the same time, and "live" would take effect. IMHO this may be a little confusing. So I made the change; I am not sure whether I should keep the same behavior as before in this change?

It is better to clearly specify what is allowed and what the behavior is.

    And I like your idea of printing more error messages if something is wrong with the option settings, but I checked that before the change, if there is no matching option, it only prints usage. And not only jmap -histo but also jmap -dump has this issue; do you agree if I fix both in the changeset?

Yes, it'd be nice to make it clear in both specs.

>> What is going to happen if null is passed in place of parallel here? :
    The default value 0 will be used if no "parallel" option is set.

Okay, thanks.

>> Should the lines 193-195 be moved after the line 202?
    I don't think so, the logic is a little different. At line 193, the case is "parallel=". If they are moved to line 203, it means "parallel" is not optional.

Okay, I see what you mean.
The problem is that the help/spec says nothing about the flag 'parallel' as being optional.
I also asked this question:
  Q: What is going to happen if the resulting 'parallel' sub-string above is not a number?

Thanks,
Serguei

    Thanks!

BRs,
Lin
From: serguei.spitsyn at oracle.com
Date: Friday, August 7, 2020 at 3:28 PM
To: "linzang(臧琳)" <linzang at tencent.com>, "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

Not sure I fully understand the spec update and the options processing in the file:
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html

The spec says nothing if the new option 'parallel' is mandatory or optional.
Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
The JMap.java implementation just prints usage in two cases:

191             } else if (subopt.startsWith("parallel=")) {
192                parallel = subopt.substring("parallel=".length());
193                if (parallel == null) {
194                     usage(1);
195                }
...
200         if (set_live && set_all) {
201             usage(1);
202         }

It is not that helpful, as the usage does not explain anything about these corner cases.
Also, it allows passing no parallel option. What is going to happen if null is passed in place of parallel here? :

206         executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);

Should the lines 193-195 be moved after the line 202?

Thanks,
Serguei

On 8/5/20 18:59, linzang(臧琳) wrote:

Thanks Paul!
And I have verified that this change builds successfully on Windows.

BRs,
Lin

On 2020/8/6, 4:17 AM, "Hohensee, Paul" <hohensee at amazon.com> wrote:

    Two tiny nits that don't need a new webrev:
    In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().

    In heapInspection.hpp, you can delete two of the three blank lines before

    #endif // SHARE_MEMORY_HEAPINSPECTION_HPP

    Thanks,
    Paul

    On 8/5/20, 6:46 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:

        Hi Paul, Stefan and Serguei,
            Here I uploaded a new changeset, would you like to help review again?
            Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
            Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/

            P.S. I am in process of building it on windows environment for a double check. May update result later.
            Thanks!

        BRs,
        Lin

From serguei.spitsyn at oracle.com Mon Aug 10 23:29:35 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Aug 2020 16:29:35 -0700
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To: <20c3370f-ae8c-aadc-53b1-95db92776f31@oracle.com>
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com> <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com> <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com> <20c3370f-ae8c-aadc-53b1-95db92776f31@oracle.com>
Message-ID: <02cf0c6f-120c-9c96-6193-2a10a7e26f6a@oracle.com>

Coleen,

The update looks good to me.

Thanks,
Serguei

On 8/10/20 15:39, Coleen Phillimore wrote:
> Adding back serviceability-dev.
> Thanks for reviewing Serguei.
> Coleen
>
> On 8/10/20 6:11 PM, Coleen Phillimore wrote:
>>
>>
>> On 8/10/20 5:28 PM, Coleen Phillimore wrote:
>>>
>>>
>>> On 8/10/20 4:38 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 8/10/20 13:34, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> It looks good to me.
>>>>> Minor:
>>>>> +void JvmtiExport::initialize_oop_storage() {
>>>>> +  // OopStorage needs to be created early in startup and unconditionally
>>>>> +  // because of OopStorageSet static array indices.
>>>>> +  _jvmti_oop_storage = OopStorageSet::create_strong("Thread OopStorage");
>>>>> +}
>>>>> +
>>>>> Would it be better to replace "Thread OopStorage" with "JVMTI
>>>>> OopStorage"?
>>>>
>>>> In the file
>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev/test/jdk/jdk/jfr/event/gc/collection/TestG1ParallelPhases.java.udiff.html
>>>>
>>>> I see this:
>>>>               "Thread OopStorage",
>>>> + "ThreadService OopStorage",
>>>> It is not clear if we can simply add "JVMTI OopStorage" above.
>>>
>>> Serguei,  Thank you for finding this.  I was wondering why I didn't
>>> have to add JVMTI OopStorage to the test.  I'd cut/pasted the same
>>> string for Thread OopStorage.
>>>
>>> I'll fix this and the test and retest.
>>
>> Hi Serguei,
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2020/8251302.02.incr/webrev
>>
>> This fixes the test as well.
>>
>> Thanks!
>> Coleen
>>
>>>
>>> thanks,
>>> Coleen
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>>> No need in another webrev.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>> On 8/10/20 12:37, Coleen Phillimore wrote:
>>>>>> These OopHandles are created and released during breakpoints and
>>>>>> Thread stack walking operations.  They should have their own
>>>>>> OopStorage so that GC can detect whether these things affect timing.
>>>>>>
>>>>>> Tested with tier1-6.
>>>>>>
>>>>>> open webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8251302
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>
>

From linzang at tencent.com Mon Aug 10 23:46:07 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Mon, 10 Aug 2020 23:46:07 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To:
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com>
Message-ID: <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com>

And here is the latest refined changeset:
Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12/
Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12_delta/

BRs,
Lin

On 2020/8/11, 7:23 AM, "linzang(臧琳)" wrote:

Dear Serguei,
Here is my reply to your question about a non-numeric value for "parallel" (somehow the replies in this thread got out of order, not sure why).

> >> What is going to happen if the resulting 'parallel' substring above is not a number?
> The error handling logic is located at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html (lines 276-284).
> Generally, the result is that an error message will be printed if "parallel" is illegal.
> An example output would be:
> ############################
> $ time jmap -histo:parallel=c 26233
> Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c]
>     at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227)
>     at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309)
>     at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133)
>     at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206)
>     at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112)
>
> ############################
>
> Hi Serguei, Paul and Stefan.
> Moreover, I will make a new changeset with the following changes:
> * Print error message + usage when the parameter check fails in JMap.java
> * Restore the histo logic that if "all" and "live" are set at the same time, "live" is used, rather than printing an error message. (not sure which one is better :P)

My last point is to restore the behavior for compatibility.
And do you think making a separate enhancement about the spec is reasonable?

Thanks!
BRs,
Lin

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, August 11, 2020 at 5:11 AM
To: "linzang(臧琳)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

On 8/7/20 03:41, linzang(臧琳) wrote:

Dear Serguei,
Thanks a lot for your review!

>> The spec says nothing if the new option 'parallel' is mandatory or optional.
>> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.

For "parallel", the spec adds that "parallel=0" is the default behavior. So my assumption is that if parallel is not used, it will be 0. Do you think it is OK? Is it necessary to explicitly add a comment like "if no parallel is set, use the default value 0"?

It'd be nice to make it clear.
But the CSR will need to be updated.
In fact, I did not want you to go through this cycle again.
But maybe it is worth improving the specs in this regard.
Maybe Paul has some alternative suggestions.

For "live" and "all", before the changeset, the logic I see in the code is that both of them can be set at the same time, and "live" takes effect. IMHO this may be a little confusing. So I made the change; I am not sure whether I should keep the same behavior as before in this change?

It is better to clearly specify what is allowed and what the behavior is.

And I like your idea of printing more error messages if something is wrong with the options setting, but I checked that before the change, if there is no matching option, it only prints usage. And not only jmap -histo but also jmap -dump has this issue. Do you agree if I fix both in the changeset?

Yes, it'd be nice to make it clear in both specs.

>> What is going to happen if null is passed in place of parallel here? :

The default value 0 will be used if no "parallel" option is set.

Okay, thanks.

>> Should the lines 193-195 be moved after the line 202?

I don't think so, the logic is a little different. At line 193, the case is "parallel=". If they are moved to line 203, it means "parallel" is not optional.

Okay, I see what you mean.
The problem is that the help/spec says nothing about the flag 'parallel' as being optional.
I also asked this question:
  Q: What is going to happen if the resulting 'parallel' sub-string above is not a number?

Thanks,
Serguei

Thanks!
BRs,
Lin

From: serguei.spitsyn at oracle.com
Date: Friday, August 7, 2020 at 3:28 PM
To: "linzang(臧琳)" <linzang at tencent.com>, "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

Not sure I fully understand the spec update and the options processing in the file:
  http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html

The spec says nothing if the new option 'parallel' is mandatory or optional.
Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
The JMap.java implementation just prints usage in two cases:

 191             } else if (subopt.startsWith("parallel=")) {
 192                parallel = subopt.substring("parallel=".length());
 193                if (parallel == null) {
 194                     usage(1);
 195                }
 ...
 200         if (set_live && set_all) {
 201             usage(1);
 202         }

It is not that helpful as the usage does not explain anything about these corner cases.
Also, it allows to pass no parallel option.
What is going to happen if null is passed in place of parallel here? :

 206         executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);

Should the lines 193-195 be moved after the line 202?

Thanks,
Serguei

On 8/5/20 18:59, linzang(臧琳) wrote:

Thanks Paul!
And I have verified that this change builds successfully on Windows.

BRs,
Lin

On 2020/8/6, 4:17 AM, "Hohensee, Paul" <hohensee at amazon.com> wrote:

    Two tiny nits that don't need a new webrev:

    In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().
    In heapInspection.hpp, you can delete two of the three blank lines before

    #endif // SHARE_MEMORY_HEAPINSPECTION_HPP

    Thanks,
    Paul

    On 8/5/20, 6:46 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:

        Hi Paul, Stefan and Serguei,
            Here I uploaded a new changeset, would you like to help review again?
            Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
            Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/

            P.S. I am in process of building it on windows environment for a double check. May update result later.
            Thanks!

        BRs,
        Lin

From david.holmes at oracle.com Tue Aug 11 00:04:43 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Aug 2020 10:04:43 +1000
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To: <20c3370f-ae8c-aadc-53b1-95db92776f31@oracle.com>
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com> <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com> <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com> <20c3370f-ae8c-aadc-53b1-95db92776f31@oracle.com>
Message-ID:

Hi Coleen,

This looks good to me too!

Were the changes in src/hotspot/share/utilities/macros.hpp just for completeness? You don't seem to use the new macro.

src/hotspot/share/runtime/init.cpp

It seems a little odd having an explicit call to JvmtiExport::initialize_oop_storage() rather than that call being inside one of the other init functions. But I don't really see an appropriate place for it. I thought perhaps management_init as it seems to combine a few related things, but it isn't really the right place either.

I was also a little unsure if this initialization point would always be early enough, but it seems the oop-storage can't be touched by anything until the live phase, so this seems okay.
Though I was wondering whether it should be done in vm_init_globals after universe_oopstorage_init, to maintain the same initialization point as it currently has?

Thanks,
David
-----

On 11/08/2020 8:39 am, Coleen Phillimore wrote:
> Adding back serviceability-dev.
> Thanks for reviewing Serguei.
> Coleen
>
> On 8/10/20 6:11 PM, Coleen Phillimore wrote:
>>
>>
>> On 8/10/20 5:28 PM, Coleen Phillimore wrote:
>>>
>>>
>>> On 8/10/20 4:38 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 8/10/20 13:34, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> It looks good to me.
>>>>> Minor:
>>>>> +void JvmtiExport::initialize_oop_storage() {
>>>>> +  // OopStorage needs to be created early in startup and unconditionally
>>>>> +  // because of OopStorageSet static array indices.
>>>>> +  _jvmti_oop_storage = OopStorageSet::create_strong("Thread OopStorage");
>>>>> +}
>>>>> +
>>>>> Would it be better to replace "Thread OopStorage" with "JVMTI
>>>>> OopStorage"?
>>>>
>>>> In the file
>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev/test/jdk/jdk/jfr/event/gc/collection/TestG1ParallelPhases.java.udiff.html
>>>>
>>>> I see this:
>>>>               "Thread OopStorage",
>>>> + "ThreadService OopStorage",
>>>> It is not clear if we can simply add "JVMTI OopStorage" above.
>>>
>>> Serguei,  Thank you for finding this.  I was wondering why I didn't
>>> have to add JVMTI OopStorage to the test.  I'd cut/pasted the same
>>> string for Thread OopStorage.
>>>
>>> I'll fix this and the test and retest.
>>
>> Hi Serguei,
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2020/8251302.02.incr/webrev
>>
>> This fixes the test as well.
>>
>> Thanks!
>> Coleen
>>
>>>
>>> thanks,
>>> Coleen
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>>> No need in another webrev.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>> On 8/10/20 12:37, Coleen Phillimore wrote:
>>>>>> These OopHandles are created and released during breakpoints and
>>>>>> Thread stack walking operations.  They should have their own
>>>>>> OopStorage so that GC can detect whether these things affect timing.
>>>>>>
>>>>>> Tested with tier1-6.
>>>>>>
>>>>>> open webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8251302
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>
>

From serguei.spitsyn at oracle.com Tue Aug 11 00:39:43 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Aug 2020 17:39:43 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com>
Message-ID: <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com>

An HTML attachment was scrubbed...
URL:

From coleen.phillimore at oracle.com Tue Aug 11 00:48:04 2020
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Aug 2020 20:48:04 -0400
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To:
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com> <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com> <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com> <20c3370f-ae8c-aadc-53b1-95db92776f31@oracle.com>
Message-ID: <86edc516-a7e5-b9aa-1f17-80b53f26d57d@oracle.com>

Hi David,  Thank you for reviewing.

On 8/10/20 8:04 PM, David Holmes wrote:
> Hi Coleen,
>
> This looks good to me too!
>
> Were the changes in src/hotspot/share/utilities/macros.hpp just for
> completeness? You don't seem to use the new macro.

It was left over from an earlier edit.  I reverted it.

>
>  src/hotspot/share/runtime/init.cpp
>
> It seems a little odd having an explicit call to
> JvmtiExport::initialize_oop_storage() rather than that call being
> inside one of the other init functions. But I don't really see an
> appropriate place for it. I thought perhaps management_init as it
> seems to combine a few related things, but it isn't really the right
> place either.

management_init() isn't the right place for JVMTI.  I could have added a jvmti_init() that calls JvmtiExport::initialize_oop_storage() but honestly I think this whole thing should be rewritten to call static functions rather than through these forward declarations.  There are other places that call qualified static initialization functions in this code and I think this should migrate to that.

> I was also a little unsure if this initialization point would always
> be early enough, but it seems the oop-storage can't be touched by
> anything until the live phase, so this seems okay. Though I was
> wondering whether it should be done in vm_init_globals after
> universe_oopstorage_init, to maintain the same initialization point as
> it currently has?

There are no jvmti specific initializations in the short amount of initialization code between vm_init_globals and init_globals so I chose the later initialization because it worked, and the later initialization risks dragging more dependencies forward with it. Doing jvmti initialization in what looks like very early initialization doesn't seem appropriate.  And isn't needed for correctness.

thanks,
Coleen

>
> Thanks,
> David
> -----
>
>
> On 11/08/2020 8:39 am, Coleen Phillimore wrote:
>> Adding back serviceability-dev.
>> Thanks for reviewing Serguei.
>> Coleen
>>
>> On 8/10/20 6:11 PM, Coleen Phillimore wrote:
>>>
>>>
>>> On 8/10/20 5:28 PM, Coleen Phillimore wrote:
>>>>
>>>>
>>>> On 8/10/20 4:38 PM, serguei.spitsyn at oracle.com wrote:
>>>>> On 8/10/20 13:34, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> It looks good to me.
>>>>>> Minor:
>>>>>> +void JvmtiExport::initialize_oop_storage() {
>>>>>> +  // OopStorage needs to be created early in startup and unconditionally
>>>>>> +  // because of OopStorageSet static array indices.
>>>>>> +  _jvmti_oop_storage = OopStorageSet::create_strong("Thread OopStorage");
>>>>>> +}
>>>>>> +
>>>>>> Would it be better to replace "Thread OopStorage" with "JVMTI
>>>>>> OopStorage"?
>>>>>
>>>>> In the file
>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev/test/jdk/jdk/jfr/event/gc/collection/TestG1ParallelPhases.java.udiff.html
>>>>>
>>>>> I see this:
>>>>>               "Thread OopStorage",
>>>>> + "ThreadService OopStorage",
>>>>> It is not clear if we can simply add "JVMTI OopStorage" above.
>>>>
>>>> Serguei,  Thank you for finding this.  I was wondering why I didn't
>>>> have to add JVMTI OopStorage to the test.  I'd cut/pasted the same
>>>> string for Thread OopStorage.
>>>>
>>>> I'll fix this and the test and retest.
>>>
>>> Hi Serguei,
>>>
>>> open webrev at
>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.02.incr/webrev
>>>
>>> This fixes the test as well.
>>>
>>> Thanks!
>>> Coleen
>>>
>>>>
>>>> thanks,
>>>> Coleen
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>> No need in another webrev.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>> On 8/10/20 12:37, Coleen Phillimore wrote:
>>>>>>> These OopHandles are created and released during breakpoints and
>>>>>>> Thread stack walking operations.  They should have their own
>>>>>>> OopStorage so that GC can detect whether these things affect
>>>>>>> timing.
>>>>>>>
>>>>>>> Tested with tier1-6.
>>>>>>>
>>>>>>> open webrev at
>>>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8251302
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>
>>>>>
>>>>
>>>
>>

From david.holmes at oracle.com Tue Aug 11 01:59:30 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Aug 2020 11:59:30 +1000
Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
In-Reply-To: <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com>
References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com>
Message-ID: <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com>

Hi Serguei,

On 11/08/2020 3:21 am, serguei.spitsyn at oracle.com wrote:
> Hi Richard and David,
>
> The implementation looks good to me.
>
> But I do not understand what the test is doing with all this counters
> and recursions.
>
> For instance, these fragments:
>
>   86         recursions = 0;
>   87         try {
>   88             recursiveMethod(1<<20);
>   89         } catch (StackOverflowError e) {
>   90             msg("Caught StackOverflowError as expected");
>   91         }
>   92         int target_depth = recursions-100; // spaces are missed around the '-' sigh
>
> It is not obvious that the 'recursion' is updated in the recursiveMethod.
> I would suggest to make it more explicit:
>    recursiveMethod(M);
>    int target_depth = M - 100;
>
> Then the variable 'recursions' can be removed or become local.

The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so:

target_depth = - 100;

Possibly recursiveMethod could return the actual recursions instead of using the global variables?

David
-----

> This method will be:
>
>   47     private static final int M = 1 << 20;
>   ...
>  121     public long recursiveMethod(int depth) {
>  123         if (depth == 0) {
>  124             notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify);
>  126         } else {
>  127             recursiveMethod(--depth);
>  128         }
>  129     }
>
>
> At least, the test is missing the comments explaining all these.
>
> Thanks,
> Serguei
>
>
>
> On 8/9/20 22:35, David Holmes wrote:
>> Hi Richard,
>>
>> On 31/07/2020 5:28 pm, Reingruber, Richard wrote:
>>> Hi,
>>>
>>> I rebase the fix after JDK-8250042.
>>>
>>> New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/
>>
>> The general fix for this seems good. A minor nit:
>>
>>  588     if (!is_assignable(signature, ob_k, Thread::current())) {
>>
>> You know that the current thread is the VMThread so can use
>> VMThread::vm_thread().
>>
>> Similarly for this existing code:
>>
>>  694     Thread* current_thread = Thread::current();
>>
>> ---
>>
>> Looking at the test code ... I'm less clear on exactly what is
>> happening and the use of spin-waits raises some red-flags for me in
>> terms of test reliability on different platforms. The "while
>> (--waitCycles > 0)" loop in particular offers no certainty that the
>> agent thread is executing anything in particular. And the use of the
>> spin_count as a guide to future waiting time seems somewhat arbitrary.
>> In all seriousness I got a headache trying to work out how the test
>> was expecting to operate. Some parts could be simplified using raw
>> monitors, I think. But there's no sure way to know the agent thread is
>> in the midst of the stackwalk when the target thread wants to leave
>> the native code. So I understand what you are trying to achieve here,
>> I'm just not sure how reliably it will actually achieve it.
>>
>> test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp
>>
>>
>>  32 static volatile jlong spinn_count     = 0;
>>
>> Using a 64-bit counter seems like it will be a problem on 32-bit systems.
>>
>> Should be spin_count not spinn_count.
>>
>>  36 // Agent thread waits for value != 0, then performas the JVMTI
>> call to get local variable.
>>
>> typo: performas
>>
>> Thanks,
>> David
>> -----
>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: serviceability-dev
>>> On Behalf Of Reingruber, Richard
>>> Sent: Monday, 27 July 2020 09:45
>>> To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
>>> Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in
>>> VM_GetOrSetLocal::doit_prologue()
>>>
>>> Hi Serguei,
>>>
>>>    > I tested it on Linux and Windows but not yet on MacOS.
>>>
>>> The test succeeded now on all platforms.
>>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: Reingruber, Richard
>>> Sent: Friday, 24 July 2020 15:04
>>> To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
>>> Subject: RE: RFR(S) 8249293: Unsafe stackwalk in
>>> VM_GetOrSetLocal::doit_prologue()
>>>
>>> Hi Serguei,
>>>
>>>> The fix itself looks good to me.
>>>
>>> thanks for looking at the fix.
>>>
>>>> I still need another look at new test.
>>>> Could you, please, convert the agent of new test to C++?
>>>> It will make it a little bit simpler.
>>>
>>> Sure, here is the new webrev.1 with a C++ version of the test agent:
>>>
>>> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/
>>>
>>> I tested it on Linux and Windows but not yet on MacOS.
>>>
>>> Thanks,
>>> Richard.
>>>
>>> -----Original Message-----
>>> From: serguei.spitsyn at oracle.com
>>> Sent: Friday, 24 July 2020 00:00
>>> To: Reingruber, Richard;
>>> serviceability-dev at openjdk.java.net
>>> Subject: Re: RFR(S) 8249293: Unsafe stackwalk in
>>> VM_GetOrSetLocal::doit_prologue()
>>>
>>> Hi Richard,
>>>
>>> Thank you for filing the CR and taking care about it!
>>> The fix itself looks good to me.
>>> I still need another look at new test.
>>> Could you, please, convert the agent of new test to C++?
>>> It will make it a little bit simpler.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 7/20/20 01:15, Reingruber, Richard wrote:
>>>> Hi,
>>>>
>>>> please help review this fix for VM_GetOrSetLocal. It moves the
>>>> unsafe stackwalk from the vm operation prologue before the
>>>> safepoint into the doit() method executed at the safepoint.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249293
>>>>
>>>> According to the JVMTI spec on local variable access it is not
>>>> required to suspend the target thread T [1]. The operation will
>>>> simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing
>>>> bytecodes. It will succeed though if T is blocked because of
>>>> synchronization or executing some native code.
>>>>
>>>> The issue is that in the latter case the stack walk in
>>>> VM_GetOrSetLocal::doit_prologue() to prepare the access to the
>>>> local variable is unsafe, because it is done before the safepoint
>>>> and it races with T returning to execute bytecodes making its
>>>> stack not walkable. The included test shows that this can crash
>>>> the VM if T wins the race.
>>>>
>>>> Manual testing:
>>>>
>>>>     - new test
>>>> test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java
>>>>     - test/hotspot/jtreg/vmTestbase/nsk/jvmti
>>>>     - test/hotspot/jtreg/serviceability/jvmti
>>>>
>>>> Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode,
>>>> SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests
>>>> with fastdebug and release builds on all platforms
>>>>
>>>> Thanks, Richard.
>>>>
>>>> [1]
>>>> https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local
>>>
>

From david.holmes at oracle.com Tue Aug 11 02:07:28 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Aug 2020 12:07:28 +1000
Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti
In-Reply-To: <86edc516-a7e5-b9aa-1f17-80b53f26d57d@oracle.com>
References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com> <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com> <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com> <20c3370f-ae8c-aadc-53b1-95db92776f31@oracle.com> <86edc516-a7e5-b9aa-1f17-80b53f26d57d@oracle.com>
Message-ID: <0ce2b90e-02b6-a4e8-f14f-522578a9e924@oracle.com>

On 11/08/2020 10:48 am, Coleen Phillimore wrote:
>
> Hi David,  Thank you for reviewing.
>
> On 8/10/20 8:04 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> This looks good to me too!
>>
>> Were the changes in src/hotspot/share/utilities/macros.hpp just for
>> completeness? You don't seem to use the new macro.
>
> It was left over from an earlier edit.  I reverted it.

Okay.

>>
>>  src/hotspot/share/runtime/init.cpp
>>
>> It seems a little odd having an explicit call to
>> JvmtiExport::initialize_oop_storage() rather than that call being
>> inside one of the other init functions. But I don't really see an
>> appropriate place for it. I thought perhaps management_init as it
>> seems to combine a few related things, but it isn't really the right
>> place either.
>
> management_init() isn't the right place for JVMTI.  I could have added
> a jvmti_init() that calls JvmtiExport::initialize_oop_storage() but
> honestly I think this whole thing should be rewritten to call static
> functions rather than through these forward declarations.  There are
> other places that call qualified static initialization functions in this
> code and I think this should migrate to that.
>
>> I was also a little unsure if this initialization point would always
>> be early enough, but it seems the oop-storage can't be touched by
>> anything until the live phase, so this seems okay. Though I was
>> wondering whether it should be done in vm_init_globals after
>> universe_oopstorage_init, to maintain the same initialization point as
>> it currently has?
>
> There are no jvmti specific initializations in the short amount of
> initialization code between vm_init_globals and init_globals so I chose
> the later initialization because it worked, and the later initialization
> risks dragging more dependencies forward with it. Doing jvmti
> initialization in what looks like very early initialization doesn't seem
> appropriate.  And isn't needed for correctness.

Okay.

Thanks,
David

> thanks,
> Coleen
>>
>> Thanks,
>> David
>> -----
>>
>>
>> On 11/08/2020 8:39 am, Coleen Phillimore wrote:
>>> Adding back serviceability-dev.
>>> Thanks for reviewing Serguei.
>>> Coleen
>>>
>>> On 8/10/20 6:11 PM, Coleen Phillimore wrote:
>>>>
>>>>
>>>> On 8/10/20 5:28 PM, Coleen Phillimore wrote:
>>>>>
>>>>>
>>>>> On 8/10/20 4:38 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> On 8/10/20 13:34, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> It looks good to me.
>>>>>>> Minor:
>>>>>>> +void JvmtiExport::initialize_oop_storage() {
>>>>>>> +  // OopStorage needs to be created early in startup and unconditionally
>>>>>>> +  // because of OopStorageSet static array indices.
>>>>>>> +  _jvmti_oop_storage = OopStorageSet::create_strong("Thread OopStorage");
>>>>>>> +}
>>>>>>> +
>>>>>>> Would it be better to replace "Thread OopStorage" with "JVMTI
>>>>>>> OopStorage"?
>>>>>>
>>>>>> In the file
>>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev/test/jdk/jdk/jfr/event/gc/collection/TestG1ParallelPhases.java.udiff.html
>>>>>>
>>>>>> I see this:
>>>>>>               "Thread OopStorage",
>>>>>> + "ThreadService OopStorage",
>>>>>> It is not clear if we can simply add "JVMTI OopStorage" above.
>>>>>
>>>>> Serguei,  Thank you for finding this.  I was wondering why I didn't
>>>>> have to add JVMTI OopStorage to the test.  I'd cut/pasted the same
>>>>> string for Thread OopStorage.
>>>>>
>>>>> I'll fix this and the test and retest.
>>>>
>>>> Hi Serguei,
>>>>
>>>> open webrev at
>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.02.incr/webrev
>>>>
>>>> This fixes the test as well.
>>>>
>>>> Thanks!
>>>> Coleen
>>>>
>>>>>
>>>>> thanks,
>>>>> Coleen
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>>> No need in another webrev.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>> On 8/10/20 12:37, Coleen Phillimore wrote:
>>>>>>>> These OopHandles are created and released during breakpoints and
>>>>>>>> Thread stack walking operations.  They should have their own
>>>>>>>> OopStorage so that GC can detect whether these things affect
>>>>>>>> timing.
>>>>>>>>
>>>>>>>> Tested with tier1-6.
>>>>>>>>
>>>>>>>> open webrev at
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8251302
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

From linzang at tencent.com Tue Aug 11 02:25:54 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Tue, 11 Aug 2020 02:25:54 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com>
Message-ID: <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com>

Hi Serguei,

>> First, the CSR does not include any update for 'live' and 'all' options, does it?
>> If so, then I'm confused why do you need all these changes related to these two options.
>> Did you intend to really change anything?

Yes, you're correct, the CSR doesn't mention anything about 'live' and 'all', so all the changes related to them become unnecessary.

Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13
Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13_delta

BTW, while refining this changeset I also found an issue that jmap -dump could accept undefined options. I will set up a new issue in JBS and fix it separately soon.

BRs,
Lin

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, August 11, 2020 at 8:40 AM
To: "linzang(臧琳)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

A couple of things.

First, the CSR does not include any update for 'live' and 'all' options, does it?
If so, then I'm confused why do you need all these changes related to these two options.
Did you intend to really change anything?

Second, new error messages do not look useful as they say nothing about what is wrong.
Printing usage does not help either.
Could these messages be more specific?
My suggestions are:

 188             if (filename == null) {
 189                 System.err.println("Fail at processing option '" + subopt +"'");
 190                 usage(1); // invalid options or no filename
 191             }
     System.err.println("Fail: invalid option or no file name: '" + subopt +"'");

 194             if (parallel == null) {
 195                 System.err.println("Fail at processing option '" + subopt + "'");
 196                 usage(1);
 197             }
     System.err.println("Fail: no number provided in option: '" + subopt +"'");

 198         } else {
 199             System.err.println("Fail at processing option '" + subopt + "'");
 200             usage(1);
 201         }
     System.err.println("Fail: invalid option: '" + subopt +"'");

The default value is listed in the 'parallel' flag description:
    parallel= generate histogram using this many parallel threads, default 0
It means that the flag is optional.

I'm okay to file a separate enhancement to add a clarification for 'live' and 'all' flags.

Thanks,
Serguei

On 8/10/20 16:46, linzang(臧琳) wrote:

And here is the latest refined changeset
Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12/
Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12_delta/

BRs,
Lin

On 2020/8/11, 7:23 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:

Dear Serguei,
Here is my reply to your question about a non-numeric value for 'parallel' (somehow the replies in this thread got out of order, not sure why).

> >> What is going to happen if the resulting 'parallel' substring above is not a number?
> The error handling logic is located at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html (lines 276-284)
> Generally, the result is that an error message will be printed if 'parallel' is illegal. An example output would be:
> ############################
> $ time jmap -histo:parallel=c 26233
> Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c]
>     at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227)
>     at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309)
>     at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133)
>     at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206)
>     at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112)
>
> ############################
>
> Hi Serguei, Paul and Stefan.
> Moreover, I will make a new changeset with the following changes:
> * Print error message + usage when a parameter check fails in JMap.java
> * Restore the histo logic that if 'all' and 'live' are set at the same time, 'live' is used, rather than printing an error message. (not sure which one is better :P)

My last point is to restore the previous behavior for compatibility. And do you think making a separate enhancement about the spec is reasonable?

Thanks!

BRs,
Lin

From: serguei.spitsyn at oracle.com <serguei.spitsyn at oracle.com>
Date: Tuesday, August 11, 2020 at 5:11 AM
To: "linzang(臧琳)" <linzang at tencent.com>, "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net <hotspot-gc-dev at openjdk.java.net>
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

On 8/7/20 03:41, linzang(臧琳) wrote:

Dear Serguei,
Thanks a lot for your review!

>> The spec says nothing if the new option 'parallel' is mandatory or optional.
>> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.

For 'parallel', the spec adds that 'parallel=0' is the default behavior. So my assumption is that if parallel is not used, it will be 0. Do you think that is OK? Is it necessary to explicitly add a comment like 'if no parallel is set, use the default value 0'?

It'd be nice to make it clear.
But the CSR will need to be updated.
In fact, I did not want you to go through this cycle again.
But maybe it is worth to improve the specs in this regard.
May be Paul has some alternative suggestions.

For 'live' and 'all', before the changeset, the logic I see in the code is that both of them can be set at the same time, and 'live' will take effect. IMHO this may be a little confusing. So I made the change, but I am not sure whether I should keep the same behavior as before in this change?

This is better to clearly specify what is allowed and what is the behavior.

And I like your idea of printing more error messages if something is wrong with the option settings, but I checked that before the change, if there is no matching option, it only prints usage. And not only jmap -histo but also jmap -dump has this issue. Do you agree if I fix both in the changeset?

Yes, it'd be nice to make it clear in both specs.

>> What is going to happen if null is passed in place of parallel here? :

The default value 0 will be used if no 'parallel' option is set.

Okay, thanks.

>> Should the lines 193-195 be moved after the line 202?

I don't think so, the logic is a little different. At line 193, the case is 'parallel='. If they are moved to line 203, it means 'parallel' is not optional.

Okay, I see what you mean.
The problem is that the help/spec says nothing about the flag 'parallel' as being optional.

I also asked this question:
Q: What is going to happen if the resulting 'parallel' sub-string above is not a number?

Thanks,
Serguei

Thanks!

BRs,
Lin

From: serguei.spitsyn at oracle.com <serguei.spitsyn at oracle.com>
Date: Friday, August 7, 2020 at 3:28 PM
To: "linzang(臧琳)" <linzang at tencent.com>, "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net <hotspot-gc-dev at openjdk.java.net>
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

Not sure, I fully understand the spec update and the options processing in the file:
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html

The spec says nothing if the new option 'parallel' is mandatory or optional.
Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.

The JMap.java implementation just print usage in two cases:

 191             } else if (subopt.startsWith("parallel=")) {
 192                 parallel = subopt.substring("parallel=".length());
 193                 if (parallel == null) {
 194                     usage(1);
 195                 }
 ...
 200         if (set_live && set_all) {
 201             usage(1);
 202         }

It is not that helpful as the usage does not explain anything about these corner cases.
Also, it allows to pass no parallel option.
What is going to happen if null is passed in place of parallel here? :

 206         executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);

Should the lines 193-195 be moved after the line 202?

Thanks,
Serguei

On 8/5/20 18:59, linzang(臧琳) wrote:

Thanks Paul!
And I have verified that this change builds successfully on Windows.

BRs,
Lin

On 2020/8/6, 4:17 AM, "Hohensee, Paul" <hohensee at amazon.com> wrote:

Two tiny nits that don't need a new webrev:

In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().

In heapInspection.hpp, you can delete two of the three blank lines before

#endif // SHARE_MEMORY_HEAPINSPECTION_HPP

Thanks,
Paul

On 8/5/20, 6:46 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:

Hi Paul, Stefan and Serguei,
Here I uploaded a new changeset, would you like to help review again?
Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/

P.S. I am in the process of building it in a Windows environment for a double check. May update the result later. Thanks!

BRs,
Lin

From linzang at tencent.com Tue Aug 11 02:52:58 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Tue, 11 Aug 2020 02:52:58 +0000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options
Message-ID: <77F863FB-3B42-4B2E-B48E-0113E2F4DA37@tencent.com>

Hi All,

May I ask your help to review this tiny patch?
It fixes an issue where jmap -dump could wrongly accept invalid options.

Bugs: https://bugs.openjdk.java.net/browse/JDK-8251374
Patch: (Can not connect to webrev ftp currently, will try it later, following are all code changes)

################################
--- old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:32.044567791 +0800
+++ new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:31.876568681 +0800
@@ -207,6 +207,11 @@
                 liveopt = "-live";
             } else if (subopt.startsWith("file=")) {
                 filename = parseFileName(subopt);
+            } else if (subopt.equals("format=b")) {
+                // ignore format (not needed at this time)
+            } else {
+                System.err.println("Fail: invalid option: '" + subopt +"'");
+                System.exit(1);
             }
         }
################################

Thanks,
Lin

From linzang at tencent.com Tue Aug 11 02:57:35 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Tue, 11 Aug 2020 02:57:35 +0000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options
In-Reply-To: <77F863FB-3B42-4B2E-B48E-0113E2F4DA37@tencent.com>
References: <77F863FB-3B42-4B2E-B48E-0113E2F4DA37@tencent.com>
Message-ID: <52647A5E-8808-4916-92AD-1674468C746C@tencent.com>

Here is the webrev: http://cr.openjdk.java.net/~lzang/8251374/webrev01/

BRs,
Lin

On 2020/8/11, 10:52 AM, "linzang(臧琳)" wrote:

Hi All,

May I ask your help to review this tiny patch?
It fixes an issue where jmap -dump could wrongly accept invalid options.

Bugs: https://bugs.openjdk.java.net/browse/JDK-8251374
Patch: (Can not connect to webrev ftp currently, will try it later, following are all code changes)

################################
--- old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:32.044567791 +0800
+++ new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:31.876568681 +0800
@@ -207,6 +207,11 @@
                 liveopt = "-live";
             } else if (subopt.startsWith("file=")) {
                 filename = parseFileName(subopt);
+            } else if (subopt.equals("format=b")) {
+                // ignore format (not needed at this time)
+            } else {
+                System.err.println("Fail: invalid option: '" + subopt +"'");
+                System.exit(1);
             }
         }
################################

Thanks,
Lin

From serguei.spitsyn at oracle.com Tue Aug 11 03:04:38 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Aug 2020 20:04:38 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com> <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com>
Message-ID: <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com>

Hi Lin,

I've re-reviewed the JMap.java only.
It looks good except there was no need to replace the usage(1) call with the System.exit(1).
I did not say usage is not needed, just that it is not enough.

Thanks,
Serguei

On 8/10/20 19:25, linzang(臧琳) wrote:
> Hi Serguei,
>
> >> First, the CSR does not include any update for 'live' and 'all' options, does it?
> >> If so, then I'm confused why do you need all these changes related to these two options.
>
> BRs,
> Lin
>
> On 2020/8/6, 4:17 AM, "Hohensee, Paul" <hohensee at amazon.com> wrote:
>
> Two tiny nits that don't need a new webrev:
>
> In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().
>
> In heapInspection.hpp, you can delete two of the three blank lines before
>
> #endif // SHARE_MEMORY_HEAPINSPECTION_HPP
>
> Thanks,
> Paul
>
> On 8/5/20, 6:46 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:
>
> Hi Paul, Stefan and Serguei,
> Here I uploaded a new changeset, would you like to help review again?
> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
> Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/
>
> P.S. I am in process of building it on windows environment for a double check. May update result later. Thanks!
>
>
> BRs,
> Lin
>

From linzang at tencent.com Tue Aug 11 03:23:32 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Tue, 11 Aug 2020 03:23:32 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com> <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com>, <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com>
Message-ID: <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com>

Hi Serguei,

I got your point, I just thought the usage output may be a little verbose: it prints almost my whole screen, which could flush the error message away.
I also checked that other jcmd tools usually call System.exit() after printing errors. So I made the change.

Thanks!
Lin

> On Aug 11, 2020, at 11:05 AM, "serguei.spitsyn at oracle.com" wrote:
>
> Hi Lin,
>
> I've re-reviewed the JMap.java only.
> It looks good except there was no need to replace the usage(1) call with the System.exit(1).
> I did not say usage is not needed, just that it is not enough.
>
> Thanks,
> Serguei
>
>
>> On 8/10/20 19:25, linzang(臧琳) wrote:
>> Hi Serguei,
>>
>> First, the CSR does not include any update for 'live' and 'all' options, does it?
>>
>> If so, then I'm confused why do you need all these changes related to these two options.
>>
>> Did you intend to really change anything?
>> Yes, you're correct, CSR doesn't mention any thing about 'live' and 'all'. so all those changes related become unnecessary.
>>
>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13
>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13_delta
>>
>> BTW, during refining this changeset I also found an issue that jmap -dump could accept undefined options, will setup a new issue in JBS and fix it separately soon.
>>
>>
>> BRs,
>> Lin
>>
>> From: "serguei.spitsyn at oracle.com"
>> Date: Tuesday, August 11, 2020 at 8:40 AM
>> To: "linzang(臧琳)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>>
>> Hi Lin,
>>
>> A couple of things.
>>
>> First, the CSR does not include any update for 'live' and 'all' options, does it?
>> If so, then I'm confused why do you need all these changes related to these two options.
>> Did you intend to really change anything?
>>
>> Second, new error messages do not look useful as they say nothing about what is wrong.
>> Printing usage does not help either.
>> Could these messages be more specific?
>> My suggestions are: >> 188 if (filename == null) { >> 189 System.err.println("Fail at processing option '" + subopt +"'"); >> 190 usage(1); // invalid options or no filename >> 191 } >> System.err.println("Fail: invalid option or no file name: '" + subopt +"'"); >> 194 if (parallel == null) { >> 195 System.err.println("Fail at processing option '" + subopt + "'"); >> 196 usage(1); >> 197 } >> System.err.println("Fail: no number provided in option: '" + subopt +"'"); >> 198 } else { >> 199 System.err.println("Fail at processing option '" + subopt + "'"); >> 200 usage(1); >> 201 } >> System.err.println("Fail: invalid option: '" + subopt +"'"); >> >> >> The default value is listed in the 'parallel' flag description: >> parallel= generate histogram using this many parallel threads, default 0 >> It means that the flag is optionl. >> >> I'm okay to file a separate enhancement to add a clarification for 'live' and 'all' flags. >> >> Thanks, >> Serguei >> >> >> On 8/10/20 16:46, linzang(??) wrote: >> And Here is the latest refined changeset >> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12/ >> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12_delta/ >> >> BRs, >> Lin >> >> On 2020/8/11, 7:23 AM, "linzang(??)" mailto:linzang at tencent.com wrote: >> >> Dear Serguei, >> Here is my reply for your question about non-numeric value for ?parallel? (somehow the thread of replay became out of order, not sure why). >> >> > >> What is going to happen if the resulting 'parallel' substring above is not a number? >> > The error handling logic locates at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html, (line 276-284) >> > Generally, the result is error message will be print if ?parallel? is illegal. 
An example output would be: >> > ############################ >> > $ time jmap -histo:parallel=c 26233 >> > Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c] >> > at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227) >> > at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309) >> > at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133) >> > at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206) >> > at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112) >> > >> > ############################ >> > >> > Hi Serguei, Paul and Stefan. >> > Moreover, I will made a new changeset with following changes: >> > * Print error message + usage when parameter check fail in Jmap.java >> > *Retrive the histo logic that if ?all? and ?live? are set at same time, use ?live?, rather than print error message. (not sure which one is better :P) >> >> My last point is to retrive the behavior for compatibility. And do you think make a separate enhancement about spec is reasonable ? >> >> Thanks! >> >> BRs, >> Lin >> >> From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com >> Date: Tuesday, August 11, 2020 at 5:11 AM >> To: "linzang(??)" mailto:linzang at tencent.com, "Hohensee, Paul" mailto:hohensee at amazon.com, Stefan Karlsson mailto:stefan.karlsson at oracle.com, David Holmes mailto:david.holmes at oracle.com, serviceability-dev mailto:serviceability-dev at openjdk.java.net, mailto:hotspot-gc-dev at openjdk.java.net mailto:hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) >> >> Hi Lin, >> >> >> On 8/7/20 03:41, linzang(??) wrote: >> Dear Serguei, >> Thanks a lot for your review! >> >> The spec says nothing if the new option 'parallel' is mandatory or optional. 
>> >> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive. >> For ?parallel?, the spec adds ?parallel=0? is the default behavior. So my assumption is if parallel is not used, it will be 0. Do you think it is ok ? is it necessary to obviously add comments like ?if no parallel is set, use the default value 0?? >> >> It'd be nice to make it clear. >> But the CSR will need to be updated. >> In fact, I did not want you to go through this cycle again. >> But maybe it is worth to improve the specs in this regard. >> May be Paul has some alternative suggestions. >> >> >> For ?live? and ?all?, before the changeset , I see the logic from the code is that both of them can be set at the same time, and the ?live? will take effect. IMHO this may be a little confused. So I made the change, not sure whether I should keep the same behavior as before in this change? >> >> This is better to clearly specify what is allowed and what is the behavior. >> >> >> And I like your idea of printing more error msg if something wrong with the options setting, but I checked that before the change, if there is not a match option, it only print usage. and not only jmap -histo but also jmap -dump has this issue, do you agree if I fix both in the changeset? >> >> Yes, it'd be nice to make it clear in both specs. >> >> >> What is going to happen if null is passed in place of parallel here? : >> The default value 0 will be used if no ?parallel? option is set. >> >> Okay, thanks. >> >> >> >> Should the lines 193-195 be moved after the line 202? >> I don?t think so, the logic is a little different. At line 193, the case is ?parallel=?. If move them to line 203, it mean ?parallel? is not optional. >> Okay, I see what you mean. >> The problem is that the help/spec says nothing about the flag 'parallel' as being optional. >> >> >> I also asked this question: >> Q: What is going to happen if the resulting 'parallel' sub-string above is not a number? 
>>
>> Thanks,
>> Serguei
>>
>> Thanks!
>>
>> BRs,
>> Lin
>>
>> From: mailto:serguei.spitsyn at oracle.com
>> Date: Friday, August 7, 2020 at 3:28 PM
>> To: "linzang(??)" mailto:linzang at tencent.com, "Hohensee, Paul" mailto:hohensee at amazon.com, Stefan Karlsson mailto:stefan.karlsson at oracle.com, David Holmes mailto:david.holmes at oracle.com, serviceability-dev mailto:serviceability-dev at openjdk.java.net, mailto:hotspot-gc-dev at openjdk.java.net
>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>>
>> Hi Lin,
>>
>> Not sure, I fully understand the spec update and the options processing in the file:
>>   http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html
>>
>> The spec says nothing if the new option 'parallel' is mandatory or optional.
>> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
>>
>> The JMap.java implementation just print usage in two cases:
>>  191 } else if (subopt.startsWith("parallel=")) {
>>  192     parallel = subopt.substring("parallel=".length());
>>  193     if (parallel == null) {
>>  194         usage(1);
>>  195     }
>>  ...
>>  200 if (set_live && set_all) {
>>  201     usage(1);
>>  202 }
>> It is not that helpful as the usage does not explain anything about these corner cases.
>> Also, it allows to pass no parallel option.
>> What is going to happen if null is passed in place of parallel here? :
>>  206 executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);
>>
>> Should the lines 193-195 be moved after the line 202?
>>
>> Thanks,
>> Serguei
>>
>>
>> On 8/5/20 18:59, linzang(??) wrote:
>> Thanks Paul!
>> And I have verified this change could build success in windows.
>>
>> BRs,
>> Lin
>>
>> On 2020/8/6, 4:17 AM, "Hohensee, Paul" mailto:hohensee at amazon.com wrote:
>>
>> Two tiny nits that don't need a new webrev:
>>
>> In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().
>>
>> In heapInspection.hpp, you can delete two of the three blank lines before
>>
>> #endif // SHARE_MEMORY_HEAPINSPECTION_HPP
>>
>> Thanks,
>> Paul
>>
>> On 8/5/20, 6:46 AM, "linzang(??)" mailto:linzang at tencent.com wrote:
>>
>> Hi Paul, Stefan and Serguei,
>> Here I uploaded a new changeset, would you like to help review again?
>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
>> Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/
>>
>> P.S. I am in process of building it on windows environment for a double check. May update result later. Thanks!
>>
>>
>> BRs,
>> Lin
>>

From serguei.spitsyn at oracle.com Tue Aug 11 06:40:07 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Aug 2020 23:40:07 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com>
 <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com>
 <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com>
 <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com>
 <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com>
 <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com>
 <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com>
 <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com>
 <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com>
 <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com>
Message-ID:

Hi Lin,

I prefer a conservative approach and do not change things without a real need.

Thanks,
Serguei


On 8/10/20 20:23, linzang(??) wrote:
> Hi Serguei
> I got your point, just thought usage may be a little verbosity, it prints almost my whole screen which could flush the error message.
And I checked that other jcmd tools usually use System.exit() after print errors. So I made the change.
>
> Thanks!
> Lin
>
>> On Aug 11, 2020, at 11:05 AM, "serguei.spitsyn at oracle.com" wrote:
>>
>> Hi Lin,
>>
>> I've re-reviewed the JMap.java only.
>> It looks good except there was no need to replace the usage(1) call with the System.exit(1).
>> I did not say usage is not needed, just that it is not enough.
>>
>> Thanks,
>> Serguei
>>
>>
>>> On 8/10/20 19:25, linzang(??) wrote:
>>> Hi Serguei,
>>> >> First, the CSR does not include any update for 'live' and 'all' options, does it?
>>> >> If so, then I'm confused why do you need all these changes related to these two options.
>>> >> Did you intend to really change anything?
>>> Yes, you're correct, CSR doesn't mention any thing about 'live' and 'all'. so all those changes related become unnecessary.
>>>
>>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13
>>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13_delta
>>>
>>> BTW, during refining this changeset I also found an issue that jmap -dump could accept undefined options, will setup a new issue in JBS and fix it separately soon.
>>>
>>>
>>> BRs,
>>> Lin
>>>
>>> From: "serguei.spitsyn at oracle.com"
>>> Date: Tuesday, August 11, 2020 at 8:40 AM
>>> To: "linzang(??)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
>>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>>>
>>> Hi Lin,
>>>
>>> A couple of things.
>>>
>>> First, the CSR does not include any update for 'live' and 'all' options, does it?
>>> If so, then I'm confused why do you need all these changes related to these two options.
>>> Did you intend to really change anything?
>>>
>>> Second, new error messages do not look useful as they say nothing about what is wrong.
>>> Printing usage does not help either.
>>> Could these messages be more specific? >>> My suggestions are: >>> 188 if (filename == null) { >>> 189 System.err.println("Fail at processing option '" + subopt +"'"); >>> 190 usage(1); // invalid options or no filename >>> 191 } >>> System.err.println("Fail: invalid option or no file name: '" + subopt +"'"); >>> 194 if (parallel == null) { >>> 195 System.err.println("Fail at processing option '" + subopt + "'"); >>> 196 usage(1); >>> 197 } >>> System.err.println("Fail: no number provided in option: '" + subopt +"'"); >>> 198 } else { >>> 199 System.err.println("Fail at processing option '" + subopt + "'"); >>> 200 usage(1); >>> 201 } >>> System.err.println("Fail: invalid option: '" + subopt +"'"); >>> >>> >>> The default value is listed in the 'parallel' flag description: >>> parallel= generate histogram using this many parallel threads, default 0 >>> It means that the flag is optionl. >>> >>> I'm okay to file a separate enhancement to add a clarification for 'live' and 'all' flags. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 8/10/20 16:46, linzang(??) wrote: >>> And Here is the latest refined changeset >>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12/ >>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12_delta/ >>> >>> BRs, >>> Lin >>> >>> On 2020/8/11, 7:23 AM, "linzang(??)" mailto:linzang at tencent.com wrote: >>> >>> Dear Serguei, >>> Here is my reply for your question about non-numeric value for ?parallel? (somehow the thread of replay became out of order, not sure why). >>> >>> > >> What is going to happen if the resulting 'parallel' substring above is not a number? >>> > The error handling logic locates at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html, (line 276-284) >>> > Generally, the result is error message will be print if ?parallel? is illegal. 
An example output would be: >>> > ############################ >>> > $ time jmap -histo:parallel=c 26233 >>> > Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c] >>> > at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227) >>> > at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309) >>> > at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133) >>> > at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206) >>> > at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112) >>> > >>> > ############################ >>> > >>> > Hi Serguei, Paul and Stefan. >>> > Moreover, I will made a new changeset with following changes: >>> > * Print error message + usage when parameter check fail in Jmap.java >>> > *Retrive the histo logic that if ?all? and ?live? are set at same time, use ?live?, rather than print error message. (not sure which one is better :P) >>> >>> My last point is to retrive the behavior for compatibility. And do you think make a separate enhancement about spec is reasonable ? >>> >>> Thanks! >>> >>> BRs, >>> Lin >>> >>> From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com >>> Date: Tuesday, August 11, 2020 at 5:11 AM >>> To: "linzang(??)" mailto:linzang at tencent.com, "Hohensee, Paul" mailto:hohensee at amazon.com, Stefan Karlsson mailto:stefan.karlsson at oracle.com, David Holmes mailto:david.holmes at oracle.com, serviceability-dev mailto:serviceability-dev at openjdk.java.net, mailto:hotspot-gc-dev at openjdk.java.net mailto:hotspot-gc-dev at openjdk.java.net >>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) >>> >>> Hi Lin, >>> >>> >>> On 8/7/20 03:41, linzang(??) wrote: >>> Dear Serguei, >>> Thanks a lot for your review! >>> >> The spec says nothing if the new option 'parallel' is mandatory or optional. 
>>> >> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive. >>> For ?parallel?, the spec adds ?parallel=0? is the default behavior. So my assumption is if parallel is not used, it will be 0. Do you think it is ok ? is it necessary to obviously add comments like ?if no parallel is set, use the default value 0?? >>> >>> It'd be nice to make it clear. >>> But the CSR will need to be updated. >>> In fact, I did not want you to go through this cycle again. >>> But maybe it is worth to improve the specs in this regard. >>> May be Paul has some alternative suggestions. >>> >>> >>> For ?live? and ?all?, before the changeset , I see the logic from the code is that both of them can be set at the same time, and the ?live? will take effect. IMHO this may be a little confused. So I made the change, not sure whether I should keep the same behavior as before in this change? >>> >>> This is better to clearly specify what is allowed and what is the behavior. >>> >>> >>> And I like your idea of printing more error msg if something wrong with the options setting, but I checked that before the change, if there is not a match option, it only print usage. and not only jmap -histo but also jmap -dump has this issue, do you agree if I fix both in the changeset? >>> >>> Yes, it'd be nice to make it clear in both specs. >>> >>> >> What is going to happen if null is passed in place of parallel here? : >>> The default value 0 will be used if no ?parallel? option is set. >>> >>> Okay, thanks. >>> >>> >>> >> Should the lines 193-195 be moved after the line 202? >>> I don?t think so, the logic is a little different. At line 193, the case is ?parallel=?. If move them to line 203, it mean ?parallel? is not optional. >>> Okay, I see what you mean. >>> The problem is that the help/spec says nothing about the flag 'parallel' as being optional. 
>>>
>>>
>>> I also asked this question:
>>> Q: What is going to happen if the resulting 'parallel' sub-string above is not a number?
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>> Thanks!
>>>
>>> BRs,
>>> Lin
>>>
>>> From: mailto:serguei.spitsyn at oracle.com
>>> Date: Friday, August 7, 2020 at 3:28 PM
>>> To: "linzang(??)" mailto:linzang at tencent.com, "Hohensee, Paul" mailto:hohensee at amazon.com, Stefan Karlsson mailto:stefan.karlsson at oracle.com, David Holmes mailto:david.holmes at oracle.com, serviceability-dev mailto:serviceability-dev at openjdk.java.net, mailto:hotspot-gc-dev at openjdk.java.net
>>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>>>
>>> Hi Lin,
>>>
>>> Not sure, I fully understand the spec update and the options processing in the file:
>>>   http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html
>>>
>>> The spec says nothing if the new option 'parallel' is mandatory or optional.
>>> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
>>>
>>> The JMap.java implementation just print usage in two cases:
>>>  191 } else if (subopt.startsWith("parallel=")) {
>>>  192     parallel = subopt.substring("parallel=".length());
>>>  193     if (parallel == null) {
>>>  194         usage(1);
>>>  195     }
>>>  ...
>>>  200 if (set_live && set_all) {
>>>  201     usage(1);
>>>  202 }
>>> It is not that helpful as the usage does not explain anything about these corner cases.
>>> Also, it allows to pass no parallel option.
>>> What is going to happen if null is passed in place of parallel here? :
>>>  206 executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);
>>>
>>> Should the lines 193-195 be moved after the line 202?
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 8/5/20 18:59, linzang(??) wrote:
>>> Thanks Paul!
>>> And I have verified this change could build success in windows.
>>>
>>> BRs,
>>> Lin
>>>
>>> On 2020/8/6, 4:17 AM, "Hohensee, Paul" mailto:hohensee at amazon.com wrote:
>>>
>>> Two tiny nits that don't need a new webrev:
>>>
>>> In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().
>>>
>>> In heapInspection.hpp, you can delete two of the three blank lines before
>>>
>>> #endif // SHARE_MEMORY_HEAPINSPECTION_HPP
>>>
>>> Thanks,
>>> Paul
>>>
>>> On 8/5/20, 6:46 AM, "linzang(??)" mailto:linzang at tencent.com wrote:
>>>
>>> Hi Paul, Stefan and Serguei,
>>> Here I uploaded a new changeset, would you like to help review again?
>>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
>>> Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/
>>>
>>> P.S. I am in process of building it on windows environment for a double check. May update result later. Thanks!
>>>
>>>
>>> BRs,
>>> Lin
>>>

From richard.reingruber at sap.com Tue Aug 11 09:50:13 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 11 Aug 2020 09:50:13 +0000
Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
In-Reply-To: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com>
References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com>
Message-ID:

Hi David,

thanks for looking at this. I've prepared a new webrev based on your feedback.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3/
Delta:  http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/

I'm answering your points inline below.

Thanks, Richard.

> On 31/07/2020 5:28 pm, Reingruber, Richard wrote:
> > Hi,
> >
> > I rebase the fix after JDK-8250042.
> >
> > New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/

> The general fix for this seems good. A minor nit:

>   588     if (!is_assignable(signature, ob_k, Thread::current())) {

> You know that the current thread is the VMThread so can use
> VMThread::vm_thread().

> Similarly for this existing code:

>   694     Thread* current_thread = Thread::current();

> ---

Done.
> Looking at the test code ... I'm less clear on exactly what is happening The @comment in GetLocalWithoutSuspendTest.java describes the intent. > and the use of spin-waits raises some red-flags for me in terms of test > reliability on different platforms. The "while (--waitCycles > 0)" loop > in particular offers no certainty that the agent thread is executing > anything in particular. And the use of the spin_count as a guide to > future waiting time seems somewhat arbitrary. In all seriousness I got > a headache trying to work out how the test was expecting to operate. > Some parts could be simplified using raw monitors, I think. But there's > no sure way to know the agent thread is in the midst of the stackwalk > when the target thread wants to leave the native code. So I understand > what you are trying to achieve here, I'm just not sure how reliably it > will actually achieve it. I've replaced 2 of 3 spin waits with wait calls on a raw monitor. The state of the test is kept in the global variable test_state. Agent and target thread use it to synchronize. I hope this helps to make clear what is happening. As you noticed the remaining spin wait cannot be replaced with a monitor wait. It should not be eliminated either, though. Without it the target thread might always return from the native call before the unsafe stack walk was started. I guess you have noticed that the target thread doubles the spin cycles in each iteration searching the sweet spot :) About reliability: on my Windows notebook and on a Linux server the test has never succeeded without the fix. The probability for failure could be increased by using a thread with a larger stack, but I don't think this is necessary. On uniprocessor systems the test might not work very reliably. I experimented with pinning the jvm to one cpu. Executed like this the test reliably fails without the fix as well, though. Without modification of the jvm the test cannot be made 100% reliable. 
But there are many tests like this I reckon. > test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp > 32 static volatile jlong spinn_count = 0; > Using a 64-bit counter seems like it will be a problem on 32-bit systems. > Should be spin_count not spinn_count. This is actually a dummy variable. I have renamed it to dummy_counter. Its purpose is to prevent the c++ compiler from eliminating the spin wait loop. I have replaced all longs with ints in the test. -----Original Message----- From: David Holmes Sent: Montag, 10. August 2020 07:35 To: Reingruber, Richard ; serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, On 31/07/2020 5:28 pm, Reingruber, Richard wrote: > Hi, > > I rebase the fix after JDK-8250042. > > New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/ The general fix for this seems good. A minor nit: 588 if (!is_assignable(signature, ob_k, Thread::current())) { You know that the current thread is the VMThread so can use VMThread::vm_thread(). Similarly for this existing code: 694 Thread* current_thread = Thread::current(); --- Looking at the test code ... I'm less clear on exactly what is happening and the use of spin-waits raises some red-flags for me in terms of test reliability on different platforms. The "while (--waitCycles > 0)" loop in particular offers no certainty that the agent thread is executing anything in particular. And the use of the spin_count as a guide to future waiting time seems somewhat arbitrary. In all seriousness I got a headache trying to work out how the test was expecting to operate. Some parts could be simplified using raw monitors, I think. But there's no sure way to know the agent thread is in the midst of the stackwalk when the target thread wants to leave the native code. 
So I understand what you are trying to achieve here, I'm just not sure how reliably it will actually achieve it. test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp 32 static volatile jlong spinn_count = 0; Using a 64-bit counter seems like it will be a problem on 32-bit systems. Should be spin_count not spinn_count. 36 // Agent thread waits for value != 0, then performas the JVMTI call to get local variable. typo: performas Thanks, David ----- > Thanks, Richard. > > -----Original Message----- > From: serviceability-dev On Behalf Of Reingruber, Richard > Sent: Montag, 27. Juli 2020 09:45 > To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net > Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > > > I tested it on Linux and Windows but not yet on MacOS. > > The test succeeded now on all platforms. > > Thanks, Richard. > > -----Original Message----- > From: Reingruber, Richard > Sent: Freitag, 24. Juli 2020 15:04 > To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net > Subject: RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > >> The fix itself looks good to me. > > thanks for looking at the fix. > >> I still need another look at new test. >> Could you, please, convert the agent of new test to C++? >> It will make it a little bit simpler. > > Sure, here is the new webrev.1 with a C++ version of the test agent: > > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/ > > I tested it on Linux and Windows but not yet on MacOS. > > Thanks, > Richard. > > -----Original Message----- > From: serguei.spitsyn at oracle.com > Sent: Freitag, 24. Juli 2020 00:00 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Richard, > > Thank you for filing the CR and taking care about it! 
> The fix itself looks good to me. > I still need another look at new test. > Could you, please, convert the agent of new test to C++? > It will make it a little bit simpler. > > Thanks, > Serguei > > > On 7/20/20 01:15, Reingruber, Richard wrote: >> Hi, >> >> please help review this fix for VM_GetOrSetLocal. It moves the unsafe stackwalk from the vm >> operation prologue before the safepoint into the doit() method executed at the safepoint. >> >> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html >> Bug: https://bugs.openjdk.java.net/browse/JDK-8249293 >> >> According to the JVMTI spec on local variable access it is not required to suspend the target thread >> T [1]. The operation will simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing >> bytecodes. It will succeed though if T is blocked because of synchronization or executing some native >> code. >> >> The issue is that in the latter case the stack walk in VM_GetOrSetLocal::doit_prologue() to prepare >> the access to the local variable is unsafe, because it is done before the safepoint and it races >> with T returning to execute bytecodes making its stack not walkable. The included test shows that >> this can crash the VM if T wins the race. >> >> Manual testing: >> >> - new test test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java >> - test/hotspot/jtreg/vmTestbase/nsk/jvmti >> - test/hotspot/jtreg/serviceability/jvmti >> >> Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, >> Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms >> >> Thanks, Richard. 
>> >> [1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local > From richard.reingruber at sap.com Tue Aug 11 09:52:20 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 11 Aug 2020 09:52:20 +0000 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> Message-ID: Hi Serguei, > The implementation looks good to me. Thanks. > But I do not understand what the test is doing with all this counters and recursions. The @comment gives an explanation: the target thread builds a stack as large as possible to prolong the unsafe stackwalk. This is done by means of recursion. > For instance, these fragments: > 86 recursions = 0; > 87 try { > 88 recursiveMethod(1<<20); > 89 } catch (StackOverflowError e) { > 90 msg("Caught StackOverflowError as expected"); > 91 } > 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh Here we calibrate the test for maximum stack depth. We measure how large the stack can grow by calling recursiveMethod() until a StackOverflowError is raised. We use recursions - 100 as target_depth to avoid the StackOverflowError during the actual test. > It is not obvious that the 'recursion' is updated in the recursiveMethod. > I would suggestto make it more explicit: > recursiveMethod(M); > int target_depth = M - 100; > Then the variable 'recursions' can be removed or become local. > This method will be: > 47 private static final int M = 1 << 20; > ... > 121 public long recursiveMethod(int depth) { > 123 if (depth == 0) { > 124 notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify); > 126 } else { > 127 recursiveMethod(--depth); > 128 } > 129 } I don't think this would work. A StackOverflowError would be raised before notifyAgentToGetLocalAndWaitShortly() is called. 
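The calibration described above (recurse until a StackOverflowError, then stay a margin below the measured depth) could be sketched roughly as follows. This is only an illustration and not the code from the webrev: the class name RecursionDepthSketch, the helper recurse(), and the constants are hypothetical, and the real test additionally notifies the JVMTI agent once the target depth is reached.

```java
// Hypothetical sketch, not the test from the webrev: it only illustrates
// measuring the reachable recursion depth and deriving a safe target depth.
class RecursionDepthSketch {
    // Upper bound on the recursion depth (assumed, mirrors the 1 << 20 quoted above).
    static final int MAX_RECURSIONS = 1 << 20;
    // Safety margin kept below the measured maximum (the "- 100" quoted above).
    static final int MARGIN = 100;

    // Recurses until 'limit' is reached or the stack overflows and returns
    // the depth at which the recursion ended.
    static int recurse(int depth, int limit) {
        if (depth >= limit) {
            return depth;
        }
        try {
            return recurse(depth + 1, limit);
        } catch (StackOverflowError e) {
            // Nothing else is done here; the frames below are still on the stack.
            return depth;
        }
    }

    // Calibration run: how deep can this thread recurse?
    static int measureMaxDepth() {
        return recurse(0, MAX_RECURSIONS);
    }

    public static void main(String[] args) {
        int maxDepth = measureMaxDepth();
        int targetDepth = maxDepth - MARGIN;
        // The actual run stays below the measured maximum, so it is not
        // expected to hit a StackOverflowError.
        int reached = recurse(0, targetDepth);
        System.out.println("max depth " + maxDepth
                + ", target depth " + targetDepth
                + ", reached " + reached);
    }
}
```

Returning the depth instead of updating a static counter keeps the data flow visible at the call site, which is the readability concern raised in the quoted review comment.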
> At least, the test is missing the comments explaining all these.

The arguments to the msg() method serve as documentation too. I would not want to repeat the msg strings in comments.

Thanks, Richard.

-------------------------------------------------------------
From: serguei.spitsyn at oracle.com
Sent: Montag, 10. August 2020 19:22
To: David Holmes; Reingruber, Richard; serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Richard and David,

The implementation looks good to me.

But I do not understand what the test is doing with all these counters and recursions.

For instance, these fragments:

  86         recursions = 0;
  87         try {
  88             recursiveMethod(1<<20);
  89         } catch (StackOverflowError e) {
  90             msg("Caught StackOverflowError as expected");
  91         }
  92         int target_depth = recursions-100; // spaces are missed around the '-' sign

It is not obvious that the 'recursion' is updated in the recursiveMethod.
I would suggest to make it more explicit:
   recursiveMethod(M);
   int target_depth = M - 100;

Then the variable 'recursions' can be removed or become local.
This method will be:

  47     private static final int M = 1 << 20;
  ...
 121     public long recursiveMethod(int depth) {
 123         if (depth == 0) {
 124             notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify);
 126         } else {
 127             recursiveMethod(--depth);
 128         }
 129     }

At least, the test is missing the comments explaining all these.

Thanks,
Serguei


On 8/9/20 22:35, David Holmes wrote:

Hi Richard,

On 31/07/2020 5:28 pm, Reingruber, Richard wrote:

Hi,

I rebase the fix after JDK-8250042.

New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/

The general fix for this seems good. A minor nit:

 588     if (!is_assignable(signature, ob_k, Thread::current())) {

You know that the current thread is the VMThread so can use VMThread::vm_thread().

Similarly for this existing code:

 694
Thread* current_thread = Thread::current();

---

Looking at the test code ... I'm less clear on exactly what is happening and the use of spin-waits raises some red-flags for me in terms of test reliability on different platforms. The "while (--waitCycles > 0)" loop in particular offers no certainty that the agent thread is executing anything in particular. And the use of the spin_count as a guide to future waiting time seems somewhat arbitrary. In all seriousness I got a headache trying to work out how the test was expecting to operate. Some parts could be simplified using raw monitors, I think. But there's no sure way to know the agent thread is in the midst of the stackwalk when the target thread wants to leave the native code. So I understand what you are trying to achieve here, I'm just not sure how reliably it will actually achieve it.

test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp

 32 static volatile jlong spinn_count     = 0;

Using a 64-bit counter seems like it will be a problem on 32-bit systems.

Should be spin_count not spinn_count.

 36 // Agent thread waits for value != 0, then performas the JVMTI call to get local variable.

typo: performas

Thanks,
David
-----

Thanks, Richard.

-----Original Message-----
From: serviceability-dev mailto:serviceability-dev-retn at openjdk.java.net On Behalf Of Reingruber, Richard
Sent: Montag, 27. Juli 2020 09:45
To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net
Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Serguei,

   > I tested it on Linux and Windows but not yet on MacOS.

The test succeeded now on all platforms.

Thanks, Richard.

-----Original Message-----
From: Reingruber, Richard
Sent: Freitag, 24.
Juli 2020 15:04
To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net
Subject: RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Serguei,

> The fix itself looks good to me.

thanks for looking at the fix.

> I still need another look at new test.
> Could you, please, convert the agent of new test to C++?
> It will make it a little bit simpler.

Sure, here is the new webrev.1 with a C++ version of the test agent:

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/

I tested it on Linux and Windows but not yet on MacOS.

Thanks,
Richard.

-----Original Message-----
From: mailto:serguei.spitsyn at oracle.com
Sent: Freitag, 24. Juli 2020 00:00
To: Reingruber, Richard mailto:richard.reingruber at sap.com; mailto:serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Richard,

Thank you for filing the CR and taking care about it!
The fix itself looks good to me.
I still need another look at new test.
Could you, please, convert the agent of new test to C++?
It will make it a little bit simpler.

Thanks,
Serguei


On 7/20/20 01:15, Reingruber, Richard wrote:

Hi,

please help review this fix for VM_GetOrSetLocal. It moves the unsafe stackwalk from the vm operation prologue before the safepoint into the doit() method executed at the safepoint.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html
Bug:    https://bugs.openjdk.java.net/browse/JDK-8249293

According to the JVMTI spec on local variable access it is not required to suspend the target thread T [1]. The operation will simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing bytecodes. It will succeed though if T is blocked because of synchronization or executing some native code.
The issue is that in the latter case the stack walk in VM_GetOrSetLocal::doit_prologue() to prepare the access to the local variable is unsafe, because it is done before the safepoint and it races with T returning to execute bytecodes making its stack not walkable. The included test shows that this can crash the VM if T wins the race.

Manual testing:

    - new test test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java
    - test/hotspot/jtreg/vmTestbase/nsk/jvmti
    - test/hotspot/jtreg/serviceability/jvmti

Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms

Thanks, Richard.

[1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local

From richard.reingruber at sap.com Tue Aug 11 10:02:43 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 11 Aug 2020 10:02:43 +0000
Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
In-Reply-To: <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com>
References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com>
 <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com>
 <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com>
Message-ID:

Hi David and Serguei,

> On 11/08/2020 3:21 am, serguei.spitsyn at oracle.com wrote:
> > Hi Richard and David,
> >
> > The implementation looks good to me.
> >
> > But I do not understand what the test is doing with all these counters
> > and recursions.
> >
> > For instance, these fragments:
> >
> >   86         recursions = 0;
> >   87         try {
> >   88             recursiveMethod(1<<20);
> >   89         } catch (StackOverflowError e) {
> >   90             msg("Caught StackOverflowError as expected");
> >   91         }
> >   92         int target_depth = recursions-100; // spaces are missed around the '-' sign
> >
> > It is not obvious that the 'recursion' is updated in the recursiveMethod.
> > I would suggestto make it more explicit: > > recursiveMethod(M); > > int target_depth = M - 100; > > > > Then the variable 'recursions' can be removed or become local. > The recursiveMethod takes in the maximum recursions to try and updates > the recursions variable to record how many recursions were possible - so: > target_depth = - 100; > Possibly recursiveMethod could return the actual recursions instead of > using the global variables? I've eliminated the static 'recursions' variable. recursiveMethod() now returns the depth at which the recursion was ended. I hesitated doing this, because I had to handle the StackOverflowError with all those frames still on stack. But the handler is empty, so it should not cause problems. This is the new webrev (as posted previously): Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/ Thanks, Richard. -----Original Message----- From: David Holmes Sent: Dienstag, 11. August 2020 04:00 To: serguei.spitsyn at oracle.com; Reingruber, Richard ; serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Serguei, On 11/08/2020 3:21 am, serguei.spitsyn at oracle.com wrote: > Hi Richard and David, > > The implementation looks good to me. > > But I do not understand what the test is doing with all this counters > and recursions. > > For instance, these fragments: > > 86 recursions = 0; > 87 try { > 88 recursiveMethod(1<<20); > 89 } catch (StackOverflowError e) { > 90 msg("Caught StackOverflowError as expected"); > 91 } > 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh > > It is not obvious that the 'recursion' is updated in the recursiveMethod. > I would suggestto make it more explicit: > ?? recursiveMethod(M); > ?? int target_depth = M - 100; > > Then the variable 'recursions' can be removed or become local. 
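
To make the idea under discussion concrete, here is a minimal sketch of a recursiveMethod() that returns the depth at which the recursion ended instead of updating a static counter. It is not the code from the webrev: the class name, the two-argument signature and the per-frame catch are assumptions made only for illustration.

```java
final class RecursionDepthSketch {

    // Upper bound on the recursion depth to attempt (1 << 20, as in the quoted test fragment).
    static final int MAX_DEPTH = 1 << 20;

    /**
     * Recurses until maxDepth is reached or the stack overflows and
     * returns the depth at which the recursion ended.
     */
    static int recursiveMethod(int depth, int maxDepth) {
        if (depth >= maxDepth) {
            return depth;
        }
        try {
            return recursiveMethod(depth + 1, maxDepth);
        } catch (StackOverflowError e) {
            // Deliberately minimal handler: nearly all frames are still on the
            // stack at this point, so doing real work here could overflow again.
            return depth;
        }
    }

    public static void main(String[] args) {
        int reached = recursiveMethod(0, MAX_DEPTH);
        // Keep some headroom below the deepest frame, mirroring "recursions - 100".
        int targetDepth = Math.max(0, reached - 100);
        System.out.println("reached=" + reached + " targetDepth=" + targetDepth);
    }
}
```

The depth actually reached depends on the thread stack size and on whether the method has been compiled, so the sketch only assumes it lies somewhere between 1 and MAX_DEPTH.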
The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so: target_depth = - 100; Possibly recursiveMethod could return the actual recursions instead of using the global variables? David ----- > This method will be: > > 47 private static final int M = 1 << 20; > ... > 121 public long recursiveMethod(int depth) { > 123 if (depth == 0) { > 124 notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify); > 126 } else { > 127 recursiveMethod(--depth); > 128 } > 129 } > > > At least, he test is missing the comments explaining all these. > > Thanks, > Serguei > > > > On 8/9/20 22:35, David Holmes wrote: >> Hi Richard, >> >> On 31/07/2020 5:28 pm, Reingruber, Richard wrote: >>> Hi, >>> >>> I rebase the fix after JDK-8250042. >>> >>> New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/ >> >> The general fix for this seems good. A minor nit: >> >> ?588???? if (!is_assignable(signature, ob_k, Thread::current())) { >> >> You know that the current thread is the VMThread so can use >> VMThread::vm_thread(). >> >> Similarly for this existing code: >> >> ?694???? Thread* current_thread = Thread::current(); >> >> --- >> >> Looking at the test code ... I'm less clear on exactly what is >> happening and the use of spin-waits raises some red-flags for me in >> terms of test reliability on different platforms. The "while >> (--waitCycles > 0)" loop in particular offers no certainty that the >> agent thread is executing anything in particular. And the use of the >> spin_count as a guide to future waiting time seems somewhat arbitrary. >> In all seriousness I got a headache trying to work out how the test >> was expecting to operate. Some parts could be simplified using raw >> monitors, I think. But there's no sure way to know the agent thread is >> in the midst of the stackwalk when the target thread wants to leave >> the native code. 
So I understand what you are trying to achieve here, >> I'm just not sure how reliably it will actually achieve it. >> >> test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp >> >> >> ?32 static volatile jlong spinn_count???? = 0; >> >> Using a 64-bit counter seems like it will be a problem on 32-bit systems. >> >> Should be spin_count not spinn_count. >> >> ?36 // Agent thread waits for value != 0, then performas the JVMTI >> call to get local variable. >> >> typo: performas >> >> Thanks, >> David >> ----- >> >>> Thanks, Richard. >>> >>> -----Original Message----- >>> From: serviceability-dev >>> On Behalf Of Reingruber, Richard >>> Sent: Montag, 27. Juli 2020 09:45 >>> To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net >>> Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in >>> VM_GetOrSetLocal::doit_prologue() >>> >>> Hi Serguei, >>> >>> ?? > I tested it on Linux and Windows but not yet on MacOS. >>> >>> The test succeeded now on all platforms. >>> >>> Thanks, Richard. >>> >>> -----Original Message----- >>> From: Reingruber, Richard >>> Sent: Freitag, 24. Juli 2020 15:04 >>> To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net >>> Subject: RE: RFR(S) 8249293: Unsafe stackwalk in >>> VM_GetOrSetLocal::doit_prologue() >>> >>> Hi Serguei, >>> >>>> The fix itself looks good to me. >>> >>> thanks for looking at the fix. >>> >>>> I still need another look at new test. >>>> Could you, please, convert the agent of new test to C++? >>>> It will make it a little bit simpler. >>> >>> Sure, here is the new webrev.1 with a C++ version of the test agent: >>> >>> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/ >>> >>> I tested it on Linux and Windows but not yet on MacOS. >>> >>> Thanks, >>> Richard. >>> >>> -----Original Message----- >>> From: serguei.spitsyn at oracle.com >>> Sent: Freitag, 24. 
Juli 2020 00:00 >>> To: Reingruber, Richard ; >>> serviceability-dev at openjdk.java.net >>> Subject: Re: RFR(S) 8249293: Unsafe stackwalk in >>> VM_GetOrSetLocal::doit_prologue() >>> >>> Hi Richard, >>> >>> Thank you for filing the CR and taking care about it! >>> The fix itself looks good to me. >>> I still need another look at new test. >>> Could you, please, convert the agent of new test to C++? >>> It will make it a little bit simpler. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/20/20 01:15, Reingruber, Richard wrote: >>>> Hi, >>>> >>>> please help review this fix for VM_GetOrSetLocal. It moves the >>>> unsafe stackwalk from the vm >>>> operation prologue before the safepoint into the doit() method >>>> executed at the safepoint. >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249293 >>>> >>>> According to the JVMTI spec on local variable access it is not >>>> required to suspend the target thread >>>> T [1]. The operation will simply fail with >>>> JVMTI_ERROR_NO_MORE_FRAMES if T is executing >>>> bytecodes. It will succeed though if T is blocked because of >>>> synchronization or executing some native >>>> code. >>>> >>>> The issue is that in the latter case the stack walk in >>>> VM_GetOrSetLocal::doit_prologue() to prepare >>>> the access to the local variable is unsafe, because it is done >>>> before the safepoint and it races >>>> with T returning to execute bytecodes making its stack not walkable. >>>> The included test shows that >>>> this can crash the VM if T wins the race. >>>> >>>> Manual testing: >>>> >>>> ??? - new test >>>> test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java >>>> ??? - test/hotspot/jtreg/vmTestbase/nsk/jvmti >>>> ??? 
- test/hotspot/jtreg/serviceability/jvmti >>>> >>>> Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, >>>> SPECjvm2008, SPECjbb2015, >>>> Renaissance Suite, SAP specific tests with fastdebug and release >>>> builds on all platforms >>>> >>>> Thanks, Richard. >>>> >>>> [1] >>>> https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local >>> > From coleen.phillimore at oracle.com Tue Aug 11 11:28:01 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 11 Aug 2020 07:28:01 -0400 Subject: RFR (S) 8251302: Create dedicated OopStorages for Management and Jvmti In-Reply-To: <0ce2b90e-02b6-a4e8-f14f-522578a9e924@oracle.com> References: <3493f679-c5e2-9f05-4bfa-aaab1543361d@oracle.com> <49934bad-680d-121b-a1e5-56b9cd42d4a0@oracle.com> <024599a2-6c5d-98e0-8f26-bfe183e9c474@oracle.com> <36dc666d-f12c-0250-9b65-4794f05641db@oracle.com> <20c3370f-ae8c-aadc-53b1-95db92776f31@oracle.com> <86edc516-a7e5-b9aa-1f17-80b53f26d57d@oracle.com> <0ce2b90e-02b6-a4e8-f14f-522578a9e924@oracle.com> Message-ID: <1ef431be-152c-1d7d-4390-c0a21594c8cc@oracle.com> Thanks David! Coleen On 8/10/20 10:07 PM, David Holmes wrote: > On 11/08/2020 10:48 am, Coleen Phillimore wrote: >> >> Hi David,? Thank you for reviewing. >> >> On 8/10/20 8:04 PM, David Holmes wrote: >>> Hi Coleen, >>> >>> This looks good to me too! >>> >>> Were the changes in src/hotspot/share/utilities/macros.hpp just for >>> completeness? You don't seem to use the new macro. >> >> It was left over from an earlier edit.? I reverted it. > > Okay. > >>> >>> ?src/hotspot/share/runtime/init.cpp >>> >>> It seems a little odd having an explicit call to >>> JvmtiExport::initialize_oop_storage() rather than that call being >>> inside on of the other init functions. But I don't really see an >>> appropriate place for it. I thought perhaps management_init as it >>> seems to combine a few related things, but it isn't really the right >>> place either. 
>> >> management_init() isn't the right place for JVMTI.?? I could have >> added a jvmti_init() that calls JvmtiExport::initialize_oop_storage() >> but honestly I think this whole thing should be rewritten to call >> static functions rather than through these forward declarations.? >> There are other places that call qualified static initialization >> functions in this code and I think this should migrate to that. >> >>> I was also a little unsure if this initialization point would always >>> be early enough, but it seems the oop-storage can't be touched by >>> anything until the live phase, so this seems okay. Though I was >>> wondering whether it should be done in vm_init_globals after >>> universe_oopstorage_init, to maintain the same initialization point >>> as it currnetly has? >> >> There are no jvmti specific initializations in the short amount of >> initialization code between vm_init_globals and init_globals so I >> chose the later initialization because it worked, and the later >> initialization risks dragging more dependencies forward with it. >> Doing jvmti initialization in what looks like very early >> initialization doesn't seem appropriate.? And isn't needed for >> correctness. > > Okay. > > Thanks, > David > >> thanks, >> Coleen >>> >>> Thanks, >>> David >>> ----- >>> >>> >>> On 11/08/2020 8:39 am, Coleen Phillimore wrote: >>>> Adding back serviceability-dev. >>>> Thanks for reviewing Serguei. >>>> Coleen >>>> >>>> On 8/10/20 6:11 PM, Coleen Phillimore wrote: >>>>> >>>>> >>>>> On 8/10/20 5:28 PM, Coleen Phillimore wrote: >>>>>> >>>>>> >>>>>> On 8/10/20 4:38 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> On 8/10/20 13:34, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> It looks good to me. >>>>>>>> Minor: >>>>>>>> +void JvmtiExport::initialize_oop_storage() { >>>>>>>> + // OopStorage needs to be created early in startup and >>>>>>>> unconditionally >>>>>>>> + // because of OopStorageSet static array indices. 
>>>>>>>> + _jvmti_oop_storage = OopStorageSet::create_strong("Thread >>>>>>>> OopStorage"); >>>>>>>> +} >>>>>>>> + >>>>>>>> Would it better to replace "Thread Oopstorage" with "JVMTI >>>>>>>> OopStorage"? >>>>>>> >>>>>>> In the file >>>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev/test/jdk/jdk/jfr/event/gc/collection/TestG1ParallelPhases.java.udiff.html >>>>>>> >>>>>>> I see this: >>>>>>> ????????????? "Thread OopStorage", >>>>>>> + "ThreadService OopStorage", >>>>>>> It is not clear if we can simply add ""JVMTI OopStorage" above. >>>>>> >>>>>> Serguei,? Thank you for finding this.? I was wondering why I >>>>>> didn't have to add JVMTI OopStorage to the test. I'd cut/pasted >>>>>> the same string for Thread OopStorage. >>>>>> >>>>>> I'll fix this and the test and retest. >>>>> >>>>> Hi Serguei, >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.02.incr/webrev >>>>> >>>>> This fixes the test as well. >>>>> >>>>> Thanks! >>>>> Coleen >>>>> >>>>>> >>>>>> thanks, >>>>>> Coleen >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>>> No need in another webrev. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> On 8/10/20 12:37, Coleen Phillimore wrote: >>>>>>>>> These OopHandles are created and released during breakpoints >>>>>>>>> and Thread stack walking operations.? They should have their >>>>>>>>> own OopStorage so that GC can detect whether these things >>>>>>>>> affect timing. >>>>>>>>> >>>>>>>>> Tested with tier1-6. 
>>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/2020/8251302.01/webrev >>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8251302 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> From linzang at tencent.com Tue Aug 11 15:22:56 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Tue, 11 Aug 2020 15:22:56 +0000 Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) In-Reply-To: References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com> <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com> <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com> <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com> Message-ID: <572A8E2C-7F04-45BE-9D39-E7B08BE1079A@tencent.com> Hi Serguei, Thanks a lot for your advice. I agree your concern and will take care of it in future. Here is the latest webrev based on your comments: (delta is just retrieving the usage(1)) http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev14/ So may I assume that the patch is OK with you now? Hi All, In summary, Here are the status of this change at present: * Paul and Serguei have helped review the runtime/JMap part and the changes now is Okay with them. * Stefan has helped review the GC part and it is Okay with him now. So does it need more review and approval for pushing this change? BRs, Lin ?On 2020/8/11, 2:40 PM, "serguei.spitsyn at oracle.com" wrote: Hi Lin, I prefer a conservative approach and do not change things without a real need. Thanks, Serguei On 8/10/20 20:23, linzang(??) 
wrote: > Hi Serguei > I got your point, just thought usage may be a little verbosity, it prints almost my whole screen which could flush the error message. And I checked that other jcmd tools usually use System.exit() after print errors. So I made the change. > > Thanks! > Lin > >> On Aug 11, 2020, at 11:05 AM, "serguei.spitsyn at oracle.com" wrote: >> >> Hi Lin, >> >> I've re-reviewed the JMap.java only. >> It looks good except there was no need to replace the usage(1) call with the System.exit(1). >> I did not say usage is not needed, just that it is not enough. >> >> Thanks, >> Serguei >> >> >>> On 8/10/20 19:25, linzang(??) wrote: >>> Hi Serguei, >>> >> First, the CSR does not include any update for 'live' and 'all' options, does it? >>> >> If so, then I'm confused why do you need all these changes related to these two options. >>> >> Did you intend to really change anything? >>> Yes, you?re correct, CSR doesn?t mention any thing about ?live? and ?all?. so all those changes related become unnecessary. >>> >>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13 >>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13_delta >>> >>> BTW, during refining this changeset I also found an issue that jmap -dump could accept undefined options, will setup a new issue in JBS and fix it separately soon. >>> >>> >>> BRs, >>> Lin >>> >>> From: "serguei.spitsyn at oracle.com" >>> Date: Tuesday, August 11, 2020 at 8:40 AM >>> To: "linzang(??)" , "Hohensee, Paul" , Stefan Karlsson , David Holmes , serviceability-dev , "hotspot-gc-dev at openjdk.java.net" >>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) >>> >>> Hi Lin, >>> >>> A couple of things. >>> >>> First, the CSR does not include any update for 'live' and 'all' options, does it? >>> If so, then I'm confused why do you need all these changes related to these two options. >>> Did you intend to really change anything? 
>>> >>> Second, new error messages do not look useful as they say nothing about what is wrong. >>> Printing usage does not help either. >>> Could these messages be more specific? >>> My suggestions are: >>> 188 if (filename == null) { >>> 189 System.err.println("Fail at processing option '" + subopt +"'"); >>> 190 usage(1); // invalid options or no filename >>> 191 } >>> System.err.println("Fail: invalid option or no file name: '" + subopt +"'"); >>> 194 if (parallel == null) { >>> 195 System.err.println("Fail at processing option '" + subopt + "'"); >>> 196 usage(1); >>> 197 } >>> System.err.println("Fail: no number provided in option: '" + subopt +"'"); >>> 198 } else { >>> 199 System.err.println("Fail at processing option '" + subopt + "'"); >>> 200 usage(1); >>> 201 } >>> System.err.println("Fail: invalid option: '" + subopt +"'"); >>> >>> >>> The default value is listed in the 'parallel' flag description: >>> parallel= generate histogram using this many parallel threads, default 0 >>> It means that the flag is optionl. >>> >>> I'm okay to file a separate enhancement to add a clarification for 'live' and 'all' flags. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 8/10/20 16:46, linzang(??) wrote: >>> And Here is the latest refined changeset >>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12/ >>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12_delta/ >>> >>> BRs, >>> Lin >>> >>> On 2020/8/11, 7:23 AM, "linzang(??)" mailto:linzang at tencent.com wrote: >>> >>> Dear Serguei, >>> Here is my reply for your question about non-numeric value for ?parallel? (somehow the thread of replay became out of order, not sure why). >>> >>> > >> What is going to happen if the resulting 'parallel' substring above is not a number? 
>>> > The error handling logic locates at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html, (line 276-284) >>> > Generally, the result is error message will be print if ?parallel? is illegal. An example output would be: >>> > ############################ >>> > $ time jmap -histo:parallel=c 26233 >>> > Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c] >>> > at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227) >>> > at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309) >>> > at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133) >>> > at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206) >>> > at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112) >>> > >>> > ############################ >>> > >>> > Hi Serguei, Paul and Stefan. >>> > Moreover, I will made a new changeset with following changes: >>> > * Print error message + usage when parameter check fail in Jmap.java >>> > *Retrive the histo logic that if ?all? and ?live? are set at same time, use ?live?, rather than print error message. (not sure which one is better :P) >>> >>> My last point is to retrive the behavior for compatibility. And do you think make a separate enhancement about spec is reasonable ? >>> >>> Thanks! 
>>> >>> BRs, >>> Lin >>> >>> From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com >>> Date: Tuesday, August 11, 2020 at 5:11 AM >>> To: "linzang(??)" mailto:linzang at tencent.com, "Hohensee, Paul" mailto:hohensee at amazon.com, Stefan Karlsson mailto:stefan.karlsson at oracle.com, David Holmes mailto:david.holmes at oracle.com, serviceability-dev mailto:serviceability-dev at openjdk.java.net, mailto:hotspot-gc-dev at openjdk.java.net mailto:hotspot-gc-dev at openjdk.java.net >>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) >>> >>> Hi Lin, >>> >>> >>> On 8/7/20 03:41, linzang(??) wrote: >>> Dear Serguei, >>> Thanks a lot for your review! >>> >> The spec says nothing if the new option 'parallel' is mandatory or optional. >>> >> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive. >>> For ?parallel?, the spec adds ?parallel=0? is the default behavior. So my assumption is if parallel is not used, it will be 0. Do you think it is ok ? is it necessary to obviously add comments like ?if no parallel is set, use the default value 0?? >>> >>> It'd be nice to make it clear. >>> But the CSR will need to be updated. >>> In fact, I did not want you to go through this cycle again. >>> But maybe it is worth to improve the specs in this regard. >>> May be Paul has some alternative suggestions. >>> >>> >>> For ?live? and ?all?, before the changeset , I see the logic from the code is that both of them can be set at the same time, and the ?live? will take effect. IMHO this may be a little confused. So I made the change, not sure whether I should keep the same behavior as before in this change? >>> >>> This is better to clearly specify what is allowed and what is the behavior. 
>>> >>> >>> And I like your idea of printing more error msg if something wrong with the options setting, but I checked that before the change, if there is not a match option, it only print usage. and not only jmap -histo but also jmap -dump has this issue, do you agree if I fix both in the changeset? >>> >>> Yes, it'd be nice to make it clear in both specs. >>> >>> >> What is going to happen if null is passed in place of parallel here? : >>> The default value 0 will be used if no ?parallel? option is set. >>> >>> Okay, thanks. >>> >>> >>> >> Should the lines 193-195 be moved after the line 202? >>> I don?t think so, the logic is a little different. At line 193, the case is ?parallel=?. If move them to line 203, it mean ?parallel? is not optional. >>> Okay, I see what you mean. >>> The problem is that the help/spec says nothing about the flag 'parallel' as being optional. >>> >>> >>> I also asked this question: >>> Q: What is going to happen if the resulting 'parallel' sub-string above is not a number? >>> >>> >>> Thanks, >>> Serguei >>> >>> >>> >>> Thanks! 
>>>
>>>
>>> BRs,
>>> Lin
>>>
>>> From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com
>>> Date: Friday, August 7, 2020 at 3:28 PM
>>> To: "linzang(臧琳)" mailto:linzang at tencent.com, "Hohensee, Paul" mailto:hohensee at amazon.com, Stefan Karlsson mailto:stefan.karlsson at oracle.com, David Holmes mailto:david.holmes at oracle.com, serviceability-dev mailto:serviceability-dev at openjdk.java.net, mailto:hotspot-gc-dev at openjdk.java.net mailto:hotspot-gc-dev at openjdk.java.net
>>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>>>
>>> Hi Lin,
>>>
>>> Not sure, I fully understand the spec update and the options processing in the file:
>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html
>>>
>>> The spec says nothing if the new option 'parallel' is mandatory or optional.
>>> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
>>>
>>> The JMap.java implementation just print usage in two cases:
>>> 191 } else if (subopt.startsWith("parallel=")) {
>>> 192 parallel = subopt.substring("parallel=".length());
>>> 193 if (parallel == null) {
>>> 194 usage(1);
>>> 195 }
>>> ...
>>> 200 if (set_live && set_all) {
>>> 201 usage(1);
>>> 202 }
>>> It is not that helpful as the usage does not explain anything about these corner cases.
>>> Also, it allows to pass no parallel option.
>>> What is going to happen if null is passed in place of parallel here? :
>>> 206 executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);
>>>
>>> Should the lines 193-195 be moved after the line 202?
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 8/5/20 18:59, linzang(臧琳) wrote:
>>> Thanks Paul!
>>> And I have verified this change could build success in windows.
>>>
>>> BRs,
>>> Lin
>>>
>>> On 2020/8/6, 4:17 AM, "Hohensee, Paul" mailto:hohensee at amazon.com wrote:
>>>
>>> Two tiny nits that don't need a new webrev:
>>>
>>> In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().
>>>
>>> In heapInspection.hpp, you can delete two of the three blank lines before
>>>
>>> #endif // SHARE_MEMORY_HEAPINSPECTION_HPP
>>>
>>> Thanks,
>>> Paul
>>>
>>> On 8/5/20, 6:46 AM, "linzang(臧琳)" mailto:linzang at tencent.com wrote:
>>>
>>> Hi Paul, Stefan and Serguei,
>>> Here I uploaded a new changeset, would you like to help review again?
>>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
>>> Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/
>>>
>>> P.S. I am in process of building it on windows environment for a double check. May update result later. Thanks!
>>>
>>>
>>> BRs,
>>> Lin
>>>
>>>
>>>
>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From serguei.spitsyn at oracle.com Tue Aug 11 20:20:28 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 11 Aug 2020 13:20:28 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <572A8E2C-7F04-45BE-9D39-E7B08BE1079A@tencent.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com> <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com> <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com> <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com> <572A8E2C-7F04-45BE-9D39-E7B08BE1079A@tencent.com>
Message-ID: <7f9bbee0-b6f5-01fe-b358-177cf551c70e@oracle.com>

An HTML attachment was scrubbed...
URL:
From linzang at tencent.com Wed Aug 12 00:22:55 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Wed, 12 Aug 2020 00:22:55 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <7f9bbee0-b6f5-01fe-b358-177cf551c70e@oracle.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com> <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com> <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com> <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com> <572A8E2C-7F04-45BE-9D39-E7B08BE1079A@tencent.com> <7f9bbee0-b6f5-01fe-b358-177cf551c70e@oracle.com>
Message-ID: <1D640939-A03A-456F-BABD-4A9D863341A9@tencent.com>

Dear Serguei,

Here are the tests I have done:

Generally, the new version of jmap works with the old version of hotspot; the 'parallel' option has no effect there. The old version of jmap works with the new version of hotspot as long as no parallel option is given, and the JVM side then runs in parallel heap inspection mode by default. The old jmap does not accept the 'parallel' option, so usage is printed.

| histo options    | Jmap version | hotspot version | result |
| no option        | latest       | 1.8.0.232       | work normally (parallel take no effect) |
| live             | latest       | 1.8.0_232       | work normally (parallel take no effect) |
| live, parallel=0 | latest       | 1.8.0_232       | work normally (parallel take no effect) |
| live, parallel=1 | latest       | 1.8.0_232       | work normally (parallel take no effect) |
| live, parallel=2 | latest       | 1.8.0_232       | work normally (parallel take no effect) |
| parallel=3       | latest       | 1.8.0.232       | work normally (parallel take no effect) |
| no option        | latest       | 11.0.2          | work normally (parallel take no effect) |
| live             | latest       | 11.0.2          | work normally (parallel take no effect) |
| live, parallel=0 | latest       | 11.0.2          | work normally (parallel take no effect) |
| live, parallel=1 | latest       | 11.0.2          | work normally (parallel take no effect) |
| live, parallel=2 | latest       | 11.0.2          | work normally (parallel take no effect) |
| parallel=3       | latest       | 11.0.2          | work normally (parallel take no effect) |
| no option        | latest       | 14.0.2          | work normally (parallel take no effect) |
| live             | latest       | 14.0.2          | work normally (parallel take no effect) |
| live, parallel=0 | latest       | 14.0.2          | work normally (parallel take no effect) |
| live, parallel=1 | latest       | 14.0.2          | work normally (parallel take no effect) |
| live, parallel=2 | latest       | 14.0.2          | work normally (parallel take no effect) |
| parallel=3       | latest       | 14.0.2          | work normally (parallel take no effect) |
| no option        | 1.8.0.232    | latest          | work normally (parallel by default) |
| live             | 1.8.0.232    | latest          | work normally (parallel by default) |
| live, parallel=0 | 1.8.0.232    | latest          | usage printed |
| live, parallel=1 | 1.8.0.232    | latest          | usage printed |
| live, parallel=2 | 1.8.0.232    | latest          | usage printed |
| parallel=3       | 1.8.0.232    | latest          | usage printed |
| no option        | 11.0.2       | latest          | work normally (parallel by default) |
| live             | 11.0.2       | latest          | work normally (parallel by default) |
| live, parallel=0 | 11.0.2       | latest          | usage printed |
| live, parallel=1 | 11.0.2       | latest          | usage printed |
| live, parallel=2 | 11.0.2       | latest          | usage printed |
| parallel=3       | 11.0.2       | latest          | usage printed |
| no option        | 14.0.2       | latest          | work normally (parallel by default) |
| live             | 14.0.2       | latest          | work normally (parallel by default) |
| live, parallel=0 | 14.0.2       | latest          | usage printed |
| live, parallel=1 | 14.0.2       | latest          | usage printed |
| live, parallel=2 | 14.0.2       | latest          | usage printed |
| parallel=3       | 14.0.2       | latest          | usage printed |

BRs,
Lin

From: "serguei.spitsyn at oracle.com"
Date: Wednesday, August 12, 2020 at 4:23 AM
To: "linzang(臧琳)"
Cc: "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

The latest webrev looks good to me.
Just want to double check, how did you check no regressions are introduced with your fix?

Thanks,
Serguei

On 8/11/20 08:22, linzang(臧琳) wrote:

Hi Serguei,
Thanks a lot for your advice. I agree your concern and will take care of it in future.
Here is the latest webrev based on your comments: (delta is just retrieving the usage(1))
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev14/
So may I assume that the patch is OK with you now?

Hi All,
In summary, Here are the status of this change at present:
* Paul and Serguei have helped review the runtime/JMap part and the changes now is Okay with them.
* Stefan has helped review the GC part and it is Okay with him now.
So does it need more review and approval for pushing this change?

BRs,
Lin

On 2020/8/11, 2:40 PM, "serguei.spitsyn at oracle.com" wrote:

Hi Lin,

I prefer a conservative approach and do not change things without a real need.

Thanks,
Serguei

On 8/10/20 20:23, linzang(臧琳)
wrote:
> Hi Serguei,
> I got your point. I just thought the usage output may be a little verbose: it fills almost my whole screen, which could push the error message out of view. And I checked that other jcmd tools usually call System.exit() after printing errors. So I made the change.
>
> Thanks!
> Lin
>
>> On Aug 11, 2020, at 11:05 AM, "serguei.spitsyn at oracle.com" wrote:
>>
>> Hi Lin,
>>
>> I've re-reviewed the JMap.java only.
>> It looks good except there was no need to replace the usage(1) call with the System.exit(1).
>> I did not say usage is not needed, just that it is not enough.
>>
>> Thanks,
>> Serguei
>>
>>
>>> On 8/10/20 19:25, linzang(臧琳) wrote:
>>> Hi Serguei,
>>> >> First, the CSR does not include any update for 'live' and 'all' options, does it?
>>> >> If so, then I'm confused why do you need all these changes related to these two options.
>>> >> Did you intend to really change anything?
>>> Yes, you're correct, the CSR doesn't mention anything about "live" and "all", so all those related changes become unnecessary.
>>>
>>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13
>>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13_delta
>>>
>>> BTW, while refining this changeset I also found an issue that jmap -dump could accept undefined options. I will open a new issue in JBS and fix it separately soon.
>>>
>>>
>>> BRs,
>>> Lin
>>>
>>> From: "serguei.spitsyn at oracle.com"
>>> Date: Tuesday, August 11, 2020 at 8:40 AM
>>> To: "linzang(臧琳)", "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
>>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>>>
>>> Hi Lin,
>>>
>>> A couple of things.
>>>
>>> First, the CSR does not include any update for 'live' and 'all' options, does it?
>>> If so, then I'm confused why do you need all these changes related to these two options.
>>> Did you intend to really change anything?
>>>
>>> Second, new error messages do not look useful as they say nothing about what is wrong.
>>> Printing usage does not help either.
>>> Could these messages be more specific?
>>> My suggestions are:
>>> 188     if (filename == null) {
>>> 189         System.err.println("Fail at processing option '" + subopt +"'");
>>> 190         usage(1); // invalid options or no filename
>>> 191     }
>>>   System.err.println("Fail: invalid option or no file name: '" + subopt +"'");
>>> 194     if (parallel == null) {
>>> 195         System.err.println("Fail at processing option '" + subopt + "'");
>>> 196         usage(1);
>>> 197     }
>>>   System.err.println("Fail: no number provided in option: '" + subopt +"'");
>>> 198     } else {
>>> 199         System.err.println("Fail at processing option '" + subopt + "'");
>>> 200         usage(1);
>>> 201     }
>>>   System.err.println("Fail: invalid option: '" + subopt +"'");
>>>
>>>
>>> The default value is listed in the 'parallel' flag description:
>>>   parallel= generate histogram using this many parallel threads, default 0
>>> It means that the flag is optional.
>>>
>>> I'm okay to file a separate enhancement to add a clarification for 'live' and 'all' flags.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 8/10/20 16:46, linzang(臧琳) wrote:
>>> And here is the latest refined changeset:
>>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12/
>>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12_delta/
>>>
>>> BRs,
>>> Lin
>>>
>>> On 2020/8/11, 7:23 AM, "linzang(臧琳)" linzang at tencent.com wrote:
>>>
>>> Dear Serguei,
>>> Here is my reply to your question about a non-numeric value for "parallel" (somehow the reply thread got out of order, not sure why).
>>>
>>> > >> What is going to happen if the resulting 'parallel' substring above is not a number?
>>> > The error handling logic is located at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html (lines 276-284).
>>> > Generally, an error message will be printed if "parallel" is illegal. An example output would be:
>>> > ############################
>>> > $ time jmap -histo:parallel=c 26233
>>> > Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c]
>>> >     at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227)
>>> >     at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309)
>>> >     at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133)
>>> >     at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206)
>>> >     at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112)
>>> >
>>> > ############################
>>> >
>>> > Hi Serguei, Paul and Stefan.
>>> > Moreover, I will make a new changeset with the following changes:
>>> > * Print an error message + usage when a parameter check fails in JMap.java
>>> > * Restore the histo logic that if "all" and "live" are set at the same time, "live" is used, rather than printing an error message. (not sure which one is better :P)
>>>
>>> My last point is to restore the previous behavior for compatibility. And do you think making a separate enhancement about the spec is reasonable?
>>>
>>> Thanks!
>>>
>>> BRs,
>>> Lin
>>>
>>> From: serguei.spitsyn at oracle.com
>>> Date: Tuesday, August 11, 2020 at 5:11 AM
>>> To: "linzang(臧琳)" linzang at tencent.com, "Hohensee, Paul" hohensee at amazon.com, Stefan Karlsson stefan.karlsson at oracle.com, David Holmes david.holmes at oracle.com, serviceability-dev serviceability-dev at openjdk.java.net, hotspot-gc-dev at openjdk.java.net
>>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>>>
>>> Hi Lin,
>>>
>>>
>>> On 8/7/20 03:41, linzang(臧琳) wrote:
>>> Dear Serguei,
>>> Thanks a lot for your review!
>>> >> The spec says nothing if the new option 'parallel' is mandatory or optional.
>>> >> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
>>> For "parallel", the spec adds that "parallel=0" is the default behavior. So my assumption is that if parallel is not used, it will be 0. Do you think that is OK? Is it necessary to explicitly add a comment like "if no parallel is set, use the default value 0"?
>>>
>>> It'd be nice to make it clear.
>>> But the CSR will need to be updated.
>>> In fact, I did not want you to go through this cycle again.
>>> But maybe it is worth improving the specs in this regard.
>>> Maybe Paul has some alternative suggestions.
>>>
>>>
>>> For "live" and "all", before the changeset, the logic I see in the code is that both of them can be set at the same time, and "live" will take effect. IMHO this may be a little confusing. So I made the change; I am not sure whether I should keep the same behavior as before in this change?
>>>
>>> It is better to clearly specify what is allowed and what the behavior is.
>>>
>>>
>>> And I like your idea of printing more error messages if something is wrong with the option settings, but I checked that before the change, if there is no matching option, it only prints usage. And not only jmap -histo but also jmap -dump has this issue. Do you agree if I fix both in the changeset?
>>>
>>> Yes, it'd be nice to make it clear in both specs.
>>>
>>> >> What is going to happen if null is passed in place of parallel here? :
>>> The default value 0 will be used if no "parallel" option is set.
>>>
>>> Okay, thanks.
>>>
>>>
>>> >> Should the lines 193-195 be moved after the line 202?
>>> I don't think so, the logic is a little different. At line 193, the case is "parallel=". If they are moved to line 203, it means "parallel" is not optional.
>>> Okay, I see what you mean.
>>> The problem is that the help/spec says nothing about the flag 'parallel' as being optional.
>>>
>>>
>>> I also asked this question:
>>> Q: What is going to happen if the resulting 'parallel' sub-string above is not a number?
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>
>>> Thanks!
>>>
>>>
>>> BRs,
>>> Lin
>>>
>>> From: serguei.spitsyn at oracle.com
>>> Date: Friday, August 7, 2020 at 3:28 PM
>>> To: "linzang(臧琳)" linzang at tencent.com, "Hohensee, Paul" hohensee at amazon.com, Stefan Karlsson stefan.karlsson at oracle.com, David Holmes david.holmes at oracle.com, serviceability-dev serviceability-dev at openjdk.java.net, hotspot-gc-dev at openjdk.java.net
>>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>>>
>>> Hi Lin,
>>>
>>> Not sure I fully understand the spec update and the options processing in the file:
>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html
>>>
>>> The spec says nothing if the new option 'parallel' is mandatory or optional.
>>> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
>>>
>>> The JMap.java implementation just prints usage in two cases:
>>> 191     } else if (subopt.startsWith("parallel=")) {
>>> 192         parallel = subopt.substring("parallel=".length());
>>> 193         if (parallel == null) {
>>> 194             usage(1);
>>> 195         }
>>> ...
>>> 200     if (set_live && set_all) {
>>> 201         usage(1);
>>> 202     }
>>> It is not that helpful as the usage does not explain anything about these corner cases.
>>> Also, it allows to pass no parallel option.
>>> What is going to happen if null is passed in place of parallel here? :
>>> 206     executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);
>>>
>>> Should the lines 193-195 be moved after the line 202?
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 8/5/20 18:59, linzang(臧琳) wrote:
>>> Thanks Paul!
>>> And I have verified that this change builds successfully on Windows.
>>>
>>> BRs,
>>> Lin
>>>
>>> On 2020/8/6, 4:17 AM, "Hohensee, Paul" hohensee at amazon.com wrote:
>>>
>>> Two tiny nits that don't need a new webrev:
>>>
>>> In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().
>>>
>>> In heapInspection.hpp, you can delete two of the three blank lines before
>>>
>>> #endif // SHARE_MEMORY_HEAPINSPECTION_HPP
>>>
>>> Thanks,
>>> Paul
>>>
>>> On 8/5/20, 6:46 AM, "linzang(臧琳)" linzang at tencent.com wrote:
>>>
>>> Hi Paul, Stefan and Serguei,
>>> Here I uploaded a new changeset, would you like to help review again?
>>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
>>> Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/
>>>
>>> P.S. I am in the process of building it in a Windows environment as a double check. I may update the result later. Thanks!
>>>
>>>
>>> BRs,
>>> Lin
>>>
-------------- next part --------------
An HTML attachment was scrubbed...

From hohensee at amazon.com Wed Aug 12 14:07:48 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 12 Aug 2020 14:07:48 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
Message-ID:

The latest webrev looks good to me also.

Thanks,
Paul

From: "linzang(臧琳)"
Date: Tuesday, August 11, 2020 at 5:24 PM
To: "serguei.spitsyn at oracle.com"
Cc: "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: RE: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Dear Serguei,

Here are the tests I have done:

Generally, the new version of jmap works with the old versions of hotspot, where the "parallel" option has no effect. The old versions of jmap work with the new version of hotspot when no parallel option is given, and the JVM side runs heap inspection in parallel mode by default. The old jmap does not accept the "parallel" option, so usage is printed.
-------------- next part --------------
An HTML attachment was scrubbed...

From hohensee at amazon.com Wed Aug 12 14:32:15 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 12 Aug 2020 14:32:15 +0000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options
Message-ID:

Looks good, but for readability, please add a space before the trailing single quote string, viz.

System.err.println("Fail: invalid option: '" + subopt +"'");

should be

System.err.println("Fail: invalid option: '" + subopt + "'");

No need for a new webrev.

Thanks,
Paul

On 8/10/20, 8:00 PM, "serviceability-dev on behalf of linzang(臧琳)" wrote:

Here is the webrev: http://cr.openjdk.java.net/~lzang/8251374/webrev01/

BRs,
Lin

On 2020/8/11, 10:52 AM, "linzang(臧琳)" wrote:

Hi All,
May I ask your help to review this tiny patch? It fixes an issue that jmap -dump could wrongly accept invalid options.
Bugs: https://bugs.openjdk.java.net/browse/JDK-8251374
Patch: (Cannot connect to the webrev ftp currently, will try it later; the following are all the code changes)

################################
--- old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:32.044567791 +0800
+++ new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:31.876568681 +0800
@@ -207,6 +207,11 @@
                 liveopt = "-live";
             } else if (subopt.startsWith("file=")) {
                 filename = parseFileName(subopt);
+            } else if (subopt.equals("format=b")) {
+                // ignore format (not needed at this time)
+            } else {
+                System.err.println("Fail: invalid option: '" + subopt +"'");
+                System.exit(1);
             }
         }
################################

Thanks,
Lin

From serguei.spitsyn at oracle.com Wed Aug 12 16:54:34 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 12 Aug 2020 09:54:34 -0700
Subject: RFR(S):8251374:jmap -dump should not accept invalid options
In-Reply-To: <52647A5E-8808-4916-92AD-1674468C746C@tencent.com>
References: <77F863FB-3B42-4B2E-B48E-0113E2F4DA37@tencent.com> <52647A5E-8808-4916-92AD-1674468C746C@tencent.com>
Message-ID: <547d8c1c-ff2e-2e5b-1e0b-89e422fccbca@oracle.com>

Hi Lin,

It looks good.
Just one comment.

+ System.err.println("Fail: invalid option: '" + subopt +"'");
+ System.exit(1);

Exit needs to be replaced with usage for consistency.

Thanks,
Serguei

On 8/10/20 19:57, linzang(臧琳) wrote:
> Here is the webrev: http://cr.openjdk.java.net/~lzang/8251374/webrev01/
>
> BRs,
> Lin
>
> On 2020/8/11, 10:52 AM, "linzang(臧琳)" wrote:
>
> Hi All,
> May I ask your help to review this tiny patch? It fixes an issue that jmap -dump could wrongly accept invalid options.
> Bugs: https://bugs.openjdk.java.net/browse/JDK-8251374
> Patch: (Cannot connect to the webrev ftp currently, will try it later; the following are all the code changes)
>
> ################################
> --- old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:32.044567791 +0800
> +++ new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:31.876568681 +0800
> @@ -207,6 +207,11 @@
>                  liveopt = "-live";
>              } else if (subopt.startsWith("file=")) {
>                  filename = parseFileName(subopt);
> +            } else if (subopt.equals("format=b")) {
> +                // ignore format (not needed at this time)
> +            } else {
> +                System.err.println("Fail: invalid option: '" + subopt +"'");
> +                System.exit(1);
>              }
>          }
> ################################
>
> Thanks,
> Lin
>
>

From serguei.spitsyn at oracle.com Wed Aug 12 17:05:04 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 12 Aug 2020 10:05:04 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To: <1D640939-A03A-456F-BABD-4A9D863341A9@tencent.com>
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com> <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com> <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com> <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com> <572A8E2C-7F04-45BE-9D39-E7B08BE1079A@tencent.com> <7f9bbee0-b6f5-01fe-b358-177cf551c70e@oracle.com> <1D640939-A03A-456F-BABD-4A9D863341A9@tencent.com>
Message-ID:

An HTML attachment was scrubbed...
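
For readers following the two review threads above, the sub-option handling under discussion can be sketched roughly as follows. This is a minimal illustration and not the JDK source: the class `DumpOptionSketch`, the `Parsed` holder, the `parse` method, the `"-all"` default and the usage line are all assumptions made for this example, while the `live`, `file=` and `format=b` branches and the "Fail: invalid option" message mirror the quoted patch. It throws instead of exiting so that the caller can print the specific message first and the usage text afterwards, which is the behavior the reviewers asked for.

```java
/**
 * Illustrative only: a stand-in for the jmap -dump sub-option loop discussed
 * in the thread. All names in this class are hypothetical, not JDK code.
 */
final class DumpOptionSketch {

    /** Holds the two values the quoted patch assigns: liveopt and filename. */
    static final class Parsed {
        final String liveopt;
        final String filename;

        Parsed(String liveopt, String filename) {
            this.liveopt = liveopt;
            this.filename = filename;
        }
    }

    /**
     * Parses a comma-separated option string such as "live,format=b,file=heap.hprof".
     * Throws IllegalArgumentException with a specific message instead of exiting,
     * so the caller decides whether to print usage as well.
     */
    static Parsed parse(String options) {
        String liveopt = "-all";  // assumed default; the quoted hunk only shows "-live"
        String filename = null;
        for (String subopt : options.split(",")) {
            if (subopt.equals("live")) {
                liveopt = "-live";
            } else if (subopt.startsWith("file=")) {
                // The real code calls parseFileName(subopt); a plain substring stands in here.
                filename = subopt.substring("file=".length());
            } else if (subopt.equals("format=b")) {
                // ignore format (not needed at this time)
            } else {
                throw new IllegalArgumentException("Fail: invalid option: '" + subopt + "'");
            }
        }
        if (filename == null || filename.isEmpty()) {
            throw new IllegalArgumentException("Fail: invalid option or no file name");
        }
        return new Parsed(liveopt, filename);
    }

    public static void main(String[] args) {
        String options = args.length > 0 ? args[0] : "live,format=b,file=heap.hprof";
        try {
            Parsed p = parse(options);
            System.out.println("liveopt=" + p.liveopt + " filename=" + p.filename);
        } catch (IllegalArgumentException e) {
            // Specific message first, then a placeholder for the usage(1) call.
            System.err.println(e.getMessage());
            System.err.println("usage (placeholder text): -dump:[live,]format=b,file=<filename>");
            System.exit(1);
        }
    }
}
```

Keeping the parsing separate from the exit path is only a convenience for this sketch; it makes the "message, then usage" ordering easy to see and to test.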

From linzang at tencent.com Wed Aug 12 23:46:37 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Wed, 12 Aug 2020 23:46:37 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To:
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com> <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com> <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com> <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com> <572A8E2C-7F04-45BE-9D39-E7B08BE1079A@tencent.com> <7f9bbee0-b6f5-01fe-b358-177cf551c70e@oracle.com> <1D640939-A03A-456F-BABD-4A9D863341A9@tencent.com>
Message-ID:

Dear All,
Thank you very much for the time and effort you spent reviewing this webrev.
It now has 3 approvals (from Paul, Serguei and Stefan), so I think it may be okay to push it?
If it needs more review, here are the latest webrev and related info.

Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev14/
Bug: https://bugs.openjdk.java.net/browse/JDK-8215624
CSR (approved): https://bugs.openjdk.java.net/browse/JDK-8239290

BRs,
Lin

From: "serguei.spitsyn at oracle.com"
Date: Thursday, August 13, 2020 at 1:06 AM
To: "linzang(臧琳)"
Cc: "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

Thank you for the testing details, it looks good.

Thanks,
Serguei

On 8/11/20 17:22, linzang(臧琳) wrote:
Dear Serguei,
    Here are the tests I have done:
    In general, the new version of jmap works with the old versions of hotspot; the 'parallel' option takes no effect.
    The old versions of jmap work with the new version of hotspot when no 'parallel' option is given, and the JVM side then runs heap inspection in parallel mode by default. The old jmap does not accept the 'parallel' option, so the usage is printed.

| histo options    | jmap version | hotspot version | result                                   |
| no option        | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
| live             | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
| live, parallel=0 | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
| live, parallel=1 | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
| live, parallel=2 | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
| parallel=3       | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
| no option        | latest       | 11.0.2          | work normally (parallel takes no effect) |
| live             | latest       | 11.0.2          | work normally (parallel takes no effect) |
| live, parallel=0 | latest       | 11.0.2          | work normally (parallel takes no effect) |
| live, parallel=1 | latest       | 11.0.2          | work normally (parallel takes no effect) |
| live, parallel=2 | latest       | 11.0.2          | work normally (parallel takes no effect) |
| parallel=3       | latest       | 11.0.2          | work normally (parallel takes no effect) |
| no option        | latest       | 14.0.2          | work normally (parallel takes no effect) |
| live             | latest       | 14.0.2          | work normally (parallel takes no effect) |
| live, parallel=0 | latest       | 14.0.2          | work normally (parallel takes no effect) |
| live, parallel=1 | latest       | 14.0.2          | work normally (parallel takes no effect) |
| live, parallel=2 | latest       | 14.0.2          | work normally (parallel takes no effect) |
| parallel=3       | latest       | 14.0.2          | work normally (parallel takes no effect) |

| no option        | 1.8.0_232    | latest          | work normally (parallel by default)      |
| live             | 1.8.0_232    | latest          | work normally (parallel by default)      |
| live, parallel=0 | 1.8.0_232    | latest          | usage printed                            |
| live, parallel=1 | 1.8.0_232    | latest          | usage printed                            |
| live, parallel=2 | 1.8.0_232    | latest          | usage printed                            |
| parallel=3       | 1.8.0_232    | latest          | usage printed                            |
| no option        | 11.0.2       | latest          | work normally (parallel by default)      |
| live             | 11.0.2       | latest          | work normally (parallel by default)      |
| live, parallel=0 | 11.0.2       | latest          | usage printed                            |
| live, parallel=1 | 11.0.2       | latest          | usage printed                            |
| live, parallel=2 | 11.0.2       | latest          | usage printed                            |
| parallel=3       | 11.0.2       | latest          | usage printed                            |
| no option        | 14.0.2       | latest          | work normally (parallel by default)      |
| live             | 14.0.2       | latest          | work normally (parallel by default)      |
| live, parallel=0 | 14.0.2       | latest          | usage printed                            |
| live, parallel=1 | 14.0.2       | latest          | usage printed                            |
| live, parallel=2 | 14.0.2       | latest          | usage printed                            |
| parallel=3       | 14.0.2       | latest          | usage printed                            |

BRs,
Lin

From: serguei.spitsyn at oracle.com
Date: Wednesday, August 12, 2020 at 4:23 AM
To: "linzang(臧琳)" <linzang at tencent.com>
Cc: "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net
Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Hi Lin,

The latest webrev looks good to me.
Just want to double check, how did you check no regressions are introduced with your fix?

Thanks,
Serguei

On 8/11/20 08:22, linzang(臧琳) wrote:
Hi Serguei,
    Thanks a lot for your advice. I agree with your concern and will take care of it in the future.
    Here is the latest webrev based on your comments (the delta just restores the usage(1) call):
    http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev14/
    So may I assume that the patch is OK with you now?
Hi All,
    In summary, here is the current status of this change:
    * Paul and Serguei have helped review the runtime/JMap part and the changes are now okay with them.
    * Stefan has helped review the GC part and it is now okay with him.
    So does this change need more reviews or approvals before it can be pushed?

BRs,
Lin

On 2020/8/11, 2:40 PM, serguei.spitsyn at oracle.com wrote:

    Hi Lin,

    I prefer a conservative approach and do not change things without a real
    need.

    Thanks,
    Serguei

    On 8/10/20 20:23, linzang(臧琳) wrote:
    > Hi Serguei
    >      I got your point. I just thought the usage output may be a little verbose: it fills almost my whole screen, which could push the error message out of view. I also checked that other jcmd tools usually call System.exit() after printing errors, so I made the change.
    >
    > Thanks!
    > Lin
    >
    >> On Aug 11, 2020, at 11:05 AM, serguei.spitsyn at oracle.com wrote:
    >>
    >> Hi Lin,
    >>
    >> I've re-reviewed the JMap.java only.
    >> It looks good except there was no need to replace the usage(1) call with the System.exit(1).
    >> I did not say usage is not needed, just that it is not enough.
    >>
    >> Thanks,
    >> Serguei
    >>
    >>
    >>> On 8/10/20 19:25, linzang(臧琳) wrote:
    >>> Hi Serguei,
    >>>     >> First, the CSR does not include any update for 'live' and 'all' options, does it?
    >>>     >> If so, then I'm confused why do you need all these changes related to these two options.
    >>>     >> Did you intend to really change anything?
    >>>     Yes, you're correct, the CSR doesn't mention anything about 'live' and 'all', so all of the related changes are unnecessary.
    >>>
    >>>     Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13
    >>>     Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13_delta
    >>>
    >>>     BTW, while refining this changeset I also found that jmap -dump accepts undefined options. I will file a new issue in JBS and fix it separately soon.
    >>>
    >>>
    >>> BRs,
    >>> Lin
    >>>
    >>> From: serguei.spitsyn at oracle.com
    >>> Date: Tuesday, August 11, 2020 at 8:40 AM
    >>> To: "linzang(臧琳)" <linzang at tencent.com>, "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net
    >>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
    >>>
    >>> Hi Lin,
    >>>
    >>> A couple of things.
    >>>
    >>> First, the CSR does not include any update for 'live' and 'all' options, does it?
    >>> If so, then I'm confused why do you need all these changes related to these two options.
    >>> Did you intend to really change anything?
    >>>
    >>> Second, new error messages do not look useful as they say nothing about what is wrong.
    >>> Printing usage does not help either.
    >>> Could these messages be more specific?
    >>> My suggestions are:
    >>>   188                 if (filename == null) {
    >>>   189                     System.err.println("Fail at processing option '" + subopt +"'");
    >>>   190                     usage(1); // invalid options or no filename
    >>>   191                 }
    >>>    System.err.println("Fail: invalid option or no file name: '" + subopt +"'");
    >>>   194                if (parallel == null) {
    >>>   195                     System.err.println("Fail at processing option '" + subopt + "'");
    >>>   196                     usage(1);
    >>>   197                }
    >>>    System.err.println("Fail: no number provided in option: '" + subopt +"'");
    >>>   198             } else {
    >>>   199                 System.err.println("Fail at processing option '" + subopt + "'");
    >>>   200                 usage(1);
    >>>   201             }
    >>>    System.err.println("Fail: invalid option: '" + subopt +"'");
    >>>
    >>>
    >>> The default value is listed in the 'parallel' flag description:
    >>>    parallel= generate histogram using this many parallel threads, default 0
    >>> It means that the flag is optional.
    >>>
    >>> I'm okay to file a separate enhancement to add a clarification for 'live' and 'all' flags.
    >>>
    >>> Thanks,
    >>> Serguei
    >>>
    >>>
    >>> On 8/10/20 16:46, linzang(臧琳) wrote:
    >>> And here is the latest refined changeset
    >>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12/
    >>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12_delta/
    >>>
    >>> BRs,
    >>> Lin
    >>>
    >>> On 2020/8/11, 7:23 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:
    >>>
    >>>     Dear Serguei,
    >>>         Here is my reply to your question about a non-numeric value for 'parallel' (somehow the replies in the thread got out of order, not sure why).
    >>>
    >>>     >     >> What is going to happen if the resulting 'parallel' substring above is not a number?
    >>>     >     The error handling logic is located at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html (lines 276-284)
    >>>     >     Generally, an error message is printed if the 'parallel' value is illegal. An example output would be:
    >>>     >     ############################
    >>>     >         $ time jmap -histo:parallel=c 26233
    >>>     >         Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c]
    >>>     >             at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227)
    >>>     >             at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309)
    >>>     >             at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133)
    >>>     >             at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206)
    >>>     >             at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112)
    >>>     >
    >>>     >     ############################
    >>>     >
    >>>     > Hi Serguei, Paul and Stefan.
    >>>     >     Moreover, I will make a new changeset with the following changes:
    >>>     >     * Print an error message + usage when the parameter check fails in JMap.java
    >>>     >     * Restore the histo logic that uses 'live' if 'all' and 'live' are set at the same time, rather than printing an error message (not sure which one is better :P)
    >>>
    >>>     My last point is meant to restore the previous behavior for compatibility. Do you think filing a separate enhancement for the spec is reasonable?
    >>>
    >>>     Thanks!
    >>>
    >>>     BRs,
    >>>     Lin
    >>>
    >>>     From: serguei.spitsyn at oracle.com
    >>>     Date: Tuesday, August 11, 2020 at 5:11 AM
    >>>     To: "linzang(臧琳)" <linzang at tencent.com>, "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net
    >>>     Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
    >>>
    >>>     Hi Lin,
    >>>
    >>>
    >>>     On 8/7/20 03:41, linzang(臧琳) wrote:
    >>>     Dear Serguei,
    >>>             Thanks a lot for your review!
    >>>     >> The spec says nothing if the new option 'parallel' is mandatory or optional.
    >>>     >> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
    >>>             For 'parallel', the spec says that 'parallel=0' is the default behavior. So my assumption is that if 'parallel' is not given, it will be 0. Do you think that is OK? Is it necessary to explicitly add a comment like 'if no parallel is set, use the default value 0'?
    >>>
    >>>     It'd be nice to make it clear.
    >>>     But the CSR will need to be updated.
    >>>     In fact, I did not want you to go through this cycle again.
    >>>     But maybe it is worth improving the specs in this regard.
    >>>     Maybe Paul has some alternative suggestions.
    >>>
    >>>
    >>>             For 'live' and 'all': before the changeset, the logic in the code is that both can be set at the same time, and 'live' takes effect. IMHO this may be a little confusing, so I made the change. I am not sure whether I should keep the previous behavior in this change?
    >>>
    >>>     It is better to clearly specify what is allowed and what the behavior is.
    >>>
    >>>
    >>>             And I like your idea of printing more specific error messages if something is wrong with the options, but I checked that before the change only the usage is printed when an option does not match. Both jmap -histo and jmap -dump have this issue; do you agree if I fix both in this changeset?
    >>>
    >>>     Yes, it'd be nice to make it clear in both specs.
    >>>
    >>>             >> What is going to happen if null is passed in place of parallel here? :
    >>>             The default value 0 will be used if no 'parallel' option is set.
    >>>
    >>>     Okay, thanks.
    >>>
    >>>
    >>>             >> Should the lines 193-195 be moved after the line 202?
    >>>             I don't think so, the logic is a little different. At line 193, the case is 'parallel='. Moving them to line 203 would mean that 'parallel' is not optional.
    >>>     Okay, I see what you mean.
    >>>     The problem is that the help/spec says nothing about the flag 'parallel' as being optional.
    >>>
    >>>
    >>>     I also asked this question:
    >>>       Q: What is going to happen if the resulting 'parallel' sub-string above is not a number?
    >>>
    >>>
    >>>     Thanks,
    >>>     Serguei
    >>>
    >>>
    >>>
    >>>             Thanks!
    >>>
    >>>
    >>>
    >>>     BRs,
    >>>     Lin
    >>>
    >>>     From: serguei.spitsyn at oracle.com
    >>>     Date: Friday, August 7, 2020 at 3:28 PM
    >>>     To: "linzang(臧琳)" <linzang at tencent.com>, "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net
    >>>     Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
    >>>
    >>>     Hi Lin,
    >>>
    >>>     Not sure I fully understand the spec update and the options processing in the file:
    >>>       http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.html
    >>>
    >>>     The spec says nothing if the new option 'parallel' is mandatory or optional.
    >>>     Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive.
    >>>     The JMap.java implementation just prints usage in two cases:
    >>>      191             } else if (subopt.startsWith("parallel=")) {
    >>>      192                parallel = subopt.substring("parallel=".length());
    >>>      193                if (parallel == null) {
    >>>      194                     usage(1);
    >>>      195                }
    >>>      ...
    >>>      200         if (set_live && set_all) {
    >>>      201             usage(1);
    >>>      202         }
    >>>     It is not that helpful as the usage does not explain anything about these corner cases.
    >>>     Also, it allows passing no parallel option.
    >>>     What is going to happen if null is passed in place of parallel here? :
    >>>      206         executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel);
    >>>
    >>>     Should the lines 193-195 be moved after the line 202?
    >>>
    >>>     Thanks,
    >>>     Serguei
    >>>
    >>>
    >>>     On 8/5/20 18:59, linzang(臧琳) wrote:
    >>>     Thanks Paul!
    >>>     And I have verified that this change builds successfully on Windows.
    >>>
    >>>     BRs,
    >>>     Lin
    >>>
    >>>     On 2020/8/6, 4:17 AM, "Hohensee, Paul" <hohensee at amazon.com> wrote:
    >>>
    >>>         Two tiny nits that don't need a new webrev:
    >>>
    >>>         In heapInspection.cpp, you don't need to cast missed_count to uintx in the call to log_info().
    >>>
    >>>         In heapInspection.hpp, you can delete two of the three blank lines before
    >>>
    >>>         #endif // SHARE_MEMORY_HEAPINSPECTION_HPP
    >>>
    >>>         Thanks,
    >>>         Paul
    >>>
    >>>         On 8/5/20, 6:46 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:
    >>>
    >>>             Hi Paul, Stefan and Serguei,
    >>>                 I have uploaded a new changeset; could you please help review it again?
    >>>                 Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/
    >>>                 Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/
    >>>
    >>>                 P.S. I am in the process of building it in a Windows environment as a double check. I may update with the result later. Thanks!
    >>>
    >>>
    >>>             BRs,
    >>>             Lin
    >>>
    >>

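The sub-option handling debated in the message above can be sketched as follows. This is only an illustrative sketch, not the code from the webrev: the names `HistoOptions`, `HistoOptionParser` and `BadOption` are hypothetical, the sketch throws an exception where the thread discusses usage(1) or System.exit(1) so that it can be exercised in isolation, and it validates the 'parallel' value on the client side, whereas the thread says the reviewed patch reports a non-numeric value from attachListener.cpp on the VM side. The first three error strings follow the wording Serguei suggested; the others are made up for the sketch.

```java
/** Hypothetical holder for parsed "-histo" sub-options; not the real JMap.java. */
final class HistoOptions {
    boolean live;   // true when "live" was given; "all" is the default
    String file;    // value of "file=...", or null when absent
    int parallel;   // value of "parallel=..."; 0 is the default named in the flag description
}

final class HistoOptionParser {
    /** Hypothetical exception, used here instead of usage(1) or System.exit(1). */
    static final class BadOption extends IllegalArgumentException {
        BadOption(String message) {
            super(message);
        }
    }

    /** Parses a comma-separated sub-option string such as "live,parallel=2". */
    static HistoOptions parse(String options) {
        HistoOptions result = new HistoOptions();
        if (options.isEmpty()) {
            return result; // no sub-options: keep all defaults
        }
        for (String subopt : options.split(",")) {
            if (subopt.equals("all")) {
                // Default; if both 'all' and 'live' are given, 'live' takes effect.
            } else if (subopt.equals("live")) {
                result.live = true;
            } else if (subopt.startsWith("file=")) {
                String name = subopt.substring("file=".length());
                if (name.isEmpty()) {
                    throw new BadOption("Fail: invalid option or no file name: '" + subopt + "'");
                }
                result.file = name;
            } else if (subopt.startsWith("parallel=")) {
                // substring() never returns null, so test for an empty value instead.
                String number = subopt.substring("parallel=".length());
                if (number.isEmpty()) {
                    throw new BadOption("Fail: no number provided in option: '" + subopt + "'");
                }
                try {
                    result.parallel = Integer.parseInt(number);
                } catch (NumberFormatException e) {
                    throw new BadOption("Fail: not a number in option: '" + subopt + "'");
                }
                if (result.parallel < 0) {
                    throw new BadOption("Fail: negative number in option: '" + subopt + "'");
                }
            } else {
                throw new BadOption("Fail: invalid option: '" + subopt + "'");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        HistoOptions parsed = parse(args.length > 0 ? args[0] : "live,parallel=2");
        System.out.println("live=" + parsed.live + " file=" + parsed.file
                + " parallel=" + parsed.parallel);
    }
}
```

Because 'parallel' stays at 0 when the sub-option is absent, the flag is optional in this sketch, which matches the reading of the flag description given in the thread.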
From serguei.spitsyn at oracle.com Wed Aug 12 23:49:11 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 12 Aug 2020 16:49:11 -0700
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To:
References: <107799EA-8956-426D-B05E-B3344AFCBA28@amazon.com> <63236320-6741-46F4-A22A-24FCE699E26A@tencent.com> <7015e935-51df-54de-8eea-7dc375db4c85@oracle.com> <35B7B432-29F1-45B5-8FD8-3889C0EAC440@tencent.com> <84327b71-8a2d-d6bf-4e05-51ea944d524e@oracle.com> <34627F19-76BD-48FC-8B5A-C316FAEB3C0F@tencent.com> <74470c79-086a-34a1-54f3-d642fef101b0@oracle.com> <526DE8E4-A838-41C7-83BD-3219384B67F7@tencent.com> <6cbc8167-c81b-a13a-6eb9-3a5cb704fd8b@oracle.com> <1CD7647F-FCED-4A77-AB8D-735A06E49FF6@tencent.com> <572A8E2C-7F04-45BE-9D39-E7B08BE1079A@tencent.com> <7f9bbee0-b6f5-01fe-b358-177cf551c70e@oracle.com> <1D640939-A03A-456F-BABD-4A9D863341A9@tencent.com>
Message-ID:

Hi Lin,

Yes, this fix can be pushed.

Thanks,
Serguei

On 8/12/20 16:46, linzang(臧琳) wrote:
> Dear All,
> Thank you very much for the time and effort you spent reviewing this webrev.
> It now has 3 approvals (from Paul, Serguei and Stefan), so I think it may be okay to push it?
> If it needs more review, here are the latest webrev and related info.
>
> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev14/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> CSR (approved): https://bugs.openjdk.java.net/browse/JDK-8239290
>
> BRs,
> Lin
>
> From: "serguei.spitsyn at oracle.com"
> Date: Thursday, August 13, 2020 at 1:06 AM
> To: "linzang(臧琳)"
> Cc: "Hohensee, Paul", Stefan Karlsson, David Holmes, serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>
> Hi Lin,
>
> Thank you for the testing details, it looks good.
>
> Thanks,
> Serguei
>
>
> On 8/11/20 17:22, linzang(臧琳) wrote:
> Dear Serguei,
>     Here are the tests I have done:
>     In general, the new version of jmap works with the old versions of hotspot; the 'parallel' option takes no effect.
>     The old versions of jmap work with the new version of hotspot when no 'parallel' option is given, and the JVM side then runs heap inspection in parallel mode by default. The old jmap does not accept the 'parallel' option, so the usage is printed.
>
> | histo options    | jmap version | hotspot version | result                                   |
> | no option        | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
> | live             | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
> | live, parallel=0 | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
> | live, parallel=1 | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
> | live, parallel=2 | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
> | parallel=3       | latest       | 1.8.0_232       | work normally (parallel takes no effect) |
> | no option        | latest       | 11.0.2          | work normally (parallel takes no effect) |
> | live             | latest       | 11.0.2          | work normally (parallel takes no effect) |
> | live, parallel=0 | latest       | 11.0.2          | work normally (parallel takes no effect) |
> | live, parallel=1 | latest       | 11.0.2          | work normally (parallel takes no effect) |
> | live, parallel=2 | latest       | 11.0.2          | work normally (parallel takes no effect) |
> | parallel=3       | latest       | 11.0.2          | work normally (parallel takes no effect) |
> | no option        | latest       | 14.0.2          | work normally (parallel takes no effect) |
> | live             | latest       | 14.0.2          | work normally (parallel takes no effect) |
> | live, parallel=0 | latest       | 14.0.2          | work normally (parallel takes no effect) |
> | live, parallel=1 | latest       | 14.0.2          | work normally (parallel takes no effect) |
> | live, parallel=2 | latest       | 14.0.2          | work normally (parallel takes no effect) |
> | parallel=3       | latest       | 14.0.2          | work normally (parallel takes no effect) |
>
> | no option        | 1.8.0_232    | latest          | work normally (parallel by default)      |
> | live             | 1.8.0_232    | latest          | work normally (parallel by default)      |
> | live, parallel=0 | 1.8.0_232    | latest          | usage printed                            |
> | live, parallel=1 | 1.8.0_232    | latest          | usage printed                            |
> | live, parallel=2 | 1.8.0_232    | latest          | usage printed                            |
> | parallel=3       | 1.8.0_232    | latest          | usage printed                            |
> | no option        | 11.0.2       | latest          | work normally (parallel by default)      |
> | live             | 11.0.2       | latest          | work normally (parallel by default)      |
> | live, parallel=0 | 11.0.2       | latest          | usage printed                            |
> | live, parallel=1 | 11.0.2       | latest          | usage printed                            |
> | live, parallel=2 | 11.0.2       | latest          | usage printed                            |
> | parallel=3       | 11.0.2       | latest          | usage printed                            |
> | no option        | 14.0.2       | latest          | work normally (parallel by default)      |
> | live             | 14.0.2       | latest          | work normally (parallel by default)      |
> | live, parallel=0 | 14.0.2       | latest          | usage printed                            |
> | live, parallel=1 | 14.0.2       | latest          | usage printed                            |
> | live, parallel=2 | 14.0.2       | latest          | usage printed                            |
> | parallel=3       | 14.0.2       | latest          | usage printed                            |
>
>
> BRs,
> Lin
>
> From: serguei.spitsyn at oracle.com
> Date: Wednesday, August 12, 2020 at 4:23 AM
> To: "linzang(臧琳)" <linzang at tencent.com>
> Cc: "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net
> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>
> Hi Lin,
>
> The latest webrev looks good to me.
> Just want to double check, how did you check no regressions are introduced with your fix?
>
> Thanks,
> Serguei
>
>
>
> On 8/11/20 08:22, linzang(臧琳) wrote:
> Hi Serguei,
>     Thanks a lot for your advice. I agree with your concern and will take care of it in the future.
>     Here is the latest webrev based on your comments (the delta just restores the usage(1) call):
>     http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev14/
>     So may I assume that the patch is OK with you now?
> Hi All,
>     In summary, here is the current status of this change:
>     * Paul and Serguei have helped review the runtime/JMap part and the changes are now okay with them.
>     * Stefan has helped review the GC part and it is now okay with him.
>     So does this change need more reviews or approvals before it can be pushed?
>
> BRs,
> Lin
>
> On 2020/8/11, 2:40 PM, serguei.spitsyn at oracle.com wrote:
>
>     Hi Lin,
>
>     I prefer a conservative approach and do not change things without a real
>     need.
>
>     Thanks,
>     Serguei
>
>     On 8/10/20 20:23, linzang(臧琳) wrote:
>     > Hi Serguei
>     >      I got your point. I just thought the usage output may be a little verbose: it fills almost my whole screen, which could push the error message out of view. I also checked that other jcmd tools usually call System.exit() after printing errors, so I made the change.
>     >
>     > Thanks!
>     > Lin
>     >
>     >> On Aug 11, 2020, at 11:05 AM, serguei.spitsyn at oracle.com wrote:
>     >>
>     >> Hi Lin,
>     >>
>     >> I've re-reviewed the JMap.java only.
>     >> It looks good except there was no need to replace the usage(1) call with the System.exit(1).
>     >> I did not say usage is not needed, just that it is not enough.
>     >>
>     >> Thanks,
>     >> Serguei
>     >>
>     >>
>     >>> On 8/10/20 19:25, linzang(臧琳) wrote:
>     >>> Hi Serguei,
>     >>>     >> First, the CSR does not include any update for 'live' and 'all' options, does it?
>     >>>     >> If so, then I'm confused why do you need all these changes related to these two options.
>     >>>     >> Did you intend to really change anything?
>     >>>     Yes, you're correct, the CSR doesn't mention anything about 'live' and 'all', so all of the related changes are unnecessary.
>     >>>
>     >>>     Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13
>     >>>     Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_13_delta
>     >>>
>     >>>     BTW, while refining this changeset I also found that jmap -dump accepts undefined options. I will file a new issue in JBS and fix it separately soon.
>     >>>
>     >>>
>     >>> BRs,
>     >>> Lin
>     >>>
>     >>> From: serguei.spitsyn at oracle.com
>     >>> Date: Tuesday, August 11, 2020 at 8:40 AM
>     >>> To: "linzang(臧琳)" <linzang at tencent.com>, "Hohensee, Paul" <hohensee at amazon.com>, Stefan Karlsson <stefan.karlsson at oracle.com>, David Holmes <david.holmes at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>, hotspot-gc-dev at openjdk.java.net
>     >>> Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)
>     >>>
>     >>> Hi Lin,
>     >>>
>     >>> A couple of things.
>     >>>
>     >>> First, the CSR does not include any update for 'live' and 'all' options, does it?
>     >>> If so, then I'm confused why do you need all these changes related to these two options.
>     >>> Did you intend to really change anything?
>     >>>
>     >>> Second, new error messages do not look useful as they say nothing about what is wrong.
>     >>> Printing usage does not help either.
>     >>> Could these messages be more specific?
>     >>> My suggestions are:
>     >>>   188                 if (filename == null) {
>     >>>   189                     System.err.println("Fail at processing option '" + subopt +"'");
>     >>>   190                     usage(1); // invalid options or no filename
>     >>>   191                 }
>     >>>    System.err.println("Fail: invalid option or no file name: '" + subopt +"'");
>     >>>   194                if (parallel == null) {
>     >>>   195                     System.err.println("Fail at processing option '" + subopt + "'");
>     >>>   196                     usage(1);
>     >>>   197                }
>     >>>    System.err.println("Fail: no number provided in option: '" + subopt +"'");
>     >>>   198             } else {
>     >>>   199                 System.err.println("Fail at processing option '" + subopt + "'");
>     >>>   200                 usage(1);
>     >>>   201             }
>     >>>    System.err.println("Fail: invalid option: '" + subopt +"'");
>     >>>
>>> > ??? >>> The default value is listed in the 'parallel' flag description: > ??? >>>??? parallel= generate histogram using this many parallel threads, default 0 > ??? >>> It means that the flag is optionl. > ? ??>>> > ??? >>> I'm okay to file a separate enhancement to add a clarification for 'live' and 'all' flags. > ??? >>> > ??? >>> Thanks, > ??? >>> Serguei > ??? >>> > ??? >>> > ??? >>> On 8/10/20 16:46, linzang(??) wrote: > ??? >>> And Here is the latest refined changeset > ??? >>> Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12/ > ??? >>> Delta: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_12_delta/ > ??? >>> > ??? >>> BRs, > ??? >>> Lin > ??? >>> > ??? >>> On 2020/8/11, 7:23 AM, "linzang(??)" mailto:linzang at tencent.com wrote: > ??? >>> > ??? >>>????? Dear Serguei, > ??? >>>??????????????? Here is my reply for your question about non-numeric value for ?parallel? (somehow the thread of replay became out of order, not sure why). > ??? >>> > ??? >>>????? >????? >> What is going to happen if the resulting 'parallel' substring above is not a number? > ??? >>>????? >????? The error handling logic locates at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/hotspot/share/services/attachListener.cpp.frames.html, (line 276-284) > ??? >>>????? >????? Generally, the result is error message will be print if ?parallel? is illegal.? An example output would be: > ??? >>>????? >???? ############################ > ???>>>????? >????????? $ time jmap -histo:parallel=c 26233 > ??? >>>????? >??????? Exception in thread "main" com.sun.tools.attach.AttachOperationFailedException: Invalid parallel thread number: [c] > ??? >>>????? >???????????????????????????????????????????? ?at jdk.attach/sun.tools.attach.VirtualMachineImpl.execute(VirtualMachineImpl.java:227) > ??? >>>????? >?????????????????????????????????????????????? 
at jdk.attach/sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:309) > ??? >>>????? >?????????????????????????????????????????????? at jdk.jcmd/sun.tools.jmap.JMap.executeCommandForPid(JMap.java:133) > ??? >>>????? >??????????????????????????????????????????????? at jdk.jcmd/sun.tools.jmap.JMap.histo(JMap.java:206) > ??? >>>????? >???? ??????????????????????????????????????????at jdk.jcmd/sun.tools.jmap.JMap.main(JMap.java:112) > ??? >>>????? > > ??? >>>????? >??? ############################ > ??? >>>????? > > ??? >>>????? > Hi Serguei, Paul and Stefan. > ??? >>>????? >????? Moreover, I will made a new changeset with following changes: > ??? >>>????? >??? * Print error message + usage when parameter check fail in Jmap.java > ??? >>>????? >??? *Retrive the histo logic that if ?all? and ?live? are set at same time, use ?live?, rather than print error message. (not sure which one is better :P) > ??? >>> > ??? >>>????? My last point is to retrive the behavior for compatibility.? And do you think make a separate enhancement about spec is reasonable ? > ??? >>> > ??? >>>???? ?Thanks! > ??? >>> > ??? >>>????? BRs, > ??? >>>????? Lin > ??? >>> > ??? >>>????? From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com > ??? >>>????? Date: Tuesday, August 11, 2020 at 5:11 AM > ??? >>>????? To: "linzang(??)mailto:linzang at tencent.com,Hohensee, Paul" mailto:hohensee at amazon.com, Stefan Karlsson mailto:stefan.karlsson at oracle.com, David Holmes mailto:david.holmes at oracle.com, serviceability-dev mailto:serviceability-dev at openjdk.java.net, mailto:hotspot-gc-dev at openjdk.java.net mailto:hotspot-gc-dev at openjdk.java.net > ??? >>>????? Subject: Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail) > ??? >>> > ??? >>>????? Hi Lin, > ??? >>> > ??? >>> > ??? >>>????? On 8/7/20 03:41, linzang(??) wrote: > ??? >>>????? Dear Serguei, > ??? >>>?????????????? 
Thanks a lot for your review! > ??? >>>????? >> The spec says nothing if the new option 'parallel' is mandatory or optional. > ??? >>>????? >> Also, (it was before your fix) the spec does not say if the options 'live' and 'all' are mutually exclusive. > ??? >>>?????????????? For ?parallel?,? the spec adds ?parallel=0? is the default behavior. So my assumption is if parallel is not used, it will be 0.? Do you think it is ok ? is it necessary to obviously add comments like ?if no parallel is set, use the default value 0?? > ??? >>> > ??? >>>????? It'd be nice to make it clear. > ??? >>>????? But the CSR will need to be updated. > ??? >>>????? In fact, I did not want you to go through this cycle again. > ??? >>>????? But maybe it is worth to improve the specs in this regard. > ??? >>>????? May be Paul has some alternative suggestions. > ??? >>> > ??? >>> > ??? >>>?????????????? For ?live? and ?all?, before the changeset , I see the logic from the code is that both of them can be set at the same time, and the ?live? will take effect. IMHO this may be a little confused. So I made the change, not sure whether I should keep the same behavior as before in this change? > ??? >>> > ??? >>>????? This is better to clearly specify what is allowed and what is the behavior. > ??? >>> > ??? >>> > ??? >>>?????????????? And I like your idea of printing more error msg if something wrong with the options setting, but I checked that before the change, if there is not a match option, it only print usage. and not only jmap -histo but also jmap -dump has this issue, do you agree if I fix both in the changeset? > ??? >>> > ??? >>>????? Yes, it'd be nice to make it clear in both specs. > ??? >>> > ??? >>>????????????????????? >> What is going to happen if null is passed in place of parallel here? : > ??? >>>????????????? The default value 0 will be used if no ?parallel? option is set. > ??? >>> > ??? >>>????? Okay, thanks. > ??? >>> > ??? >>> > ??? >>>???????????????????????????????????? 
>>? Should the lines 193-195 be moved after the line 202? > ??? >>>?????????????? I don?t think so, the logic is a little different.? At line 193, the case is ?parallel=?.?? If move them to line 203, it mean ?parallel? is not optional. > ??? >>>????? Okay, I see what you mean. > ??? >>>????? The problem is that the help/spec says nothing about the flag 'parallel' as being optional. > ??? >>> > ??? >>> > ??? >>>????? I also asked this question: > ??? >>>??????? Q: What is going to happen if the resulting 'parallel' sub-string above is not a number? > ??? >>> > ??? >>> > ??? >>>????? Thanks, > ??? >>>????? Serguei > ??? >>> > ??? >>> > ??? >>> > ??? >>>????????????? Thanks! > ??? >>> > ??? >>> > ??? >>>????? BRs, > ??? >>>????? Lin > ??? >>> > ??? >>>????? From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com > ??? >>>????? Date: Friday, August 7, 2020 at 3:28 PM > ??? >>>????? To: "linzang(??)mailto:linzang at tencent.com,Hohensee, mailto:Paulmailto:hohensee at amazon.com,StefanKarlssonmailto:stefan.karlsson at oracle.com,DavidHolmesmailto:david.holmes at oracle.com,serviceability-devmailto:serviceability-dev at openjdk.java.net,mailto:hotspot-gc-dev at openjdk.java.netmailto:hotspot-gc-dev at openjdk.java.netSubject:Re:RFR(L):8215624:addparallelheapinspectionsupportforjmaphisto(G1)(Internetmail)HiLin,Notsure,Ifullyunderstandthespecupdateandtheoptionsprocessinginthefile:http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.frames.htmlThespecsaysnothingifthenewoption'parallel'ismandatoryoroptional.Also,(itwasbeforeyourfix)thespecdoesnotsayiftheoptions'live'and'all'aremutuallyexclusive.TheJMap.javaimplementationjustprintusageintwocases:191}elseif(subopt.startsWith(parallel=")) { > ??? >>>?????? 192?? ?????????????parallel = subopt.substring("parallel=".length()); > ??? >>>?????? 193??????????????? if (parallel == null) { > ??? >>>?????? 194???????????????????? usage(1); > ??? 
>>>?????? 195??????????????? } > ??? >>>?????? ... > ??? >>>?????? 200???????? if (set_live && set_all) { > ??? >>>?????? 201???????????? usage(1); > ??? >>>?????? 202???????? } > ??? >>>????? It is not that helpful as the usage does not explain anything about these corner cases. > ??? >>>????? Also, it allows to pass no parallel option. > ??? >>>??? ??What is going to happen if null is passed in place of parallel here? : > ??? >>>?????? 206???????? executeCommandForPid(pid, "inspectheap", liveopt, filename, parallel); > ??? >>> > ??? >>>????? Should the lines 193-195 be moved after the line 202? > ??? >>> > ??? >>>????? Thanks, > ??? >>>????? Serguei > ??? >>> > ??? >>> > ??? >>>????? On 8/5/20 18:59, linzang(??) wrote: > ??? >>>????? Thanks Paul! > ??? >>>????? And I have verified this change could build success in windows. > ??? >>> > ??? >>>????? BRs, > ??? >>>????? Lin > ?? ?>>> > ??? >>>????? On 2020/8/6, 4:17 AM, "Hohensee, mailto:Paulmailto:hohensee at amazon.comwrote:Twotinynitsthatdon'tneedanewwebrev:InheapInspection.cpp,youdon'tneedtocastmissed_counttouintxinthecalltolog_info().InheapInspection.hpp,youcandeletetwoofthethreeblanklinesbefore#endif//SHARE_MEMORY_HEAPINSPECTION_HPPThanks,PaulOn8/5/20,6:46AM,linzang(??)" mailto:linzang at tencent.com wrote: > ??? >>> > ??? >>>????????????? Hi Paul, Stefan and Serguei, > ??? >>>????????????????? Here I uploaded a new changeset, would you like to help review again? > ??? >>>????????????????? Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11/ > ??? >>>????????????????? Delta (based on webrev10): http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_11_delta/ > ??? >>> > ??? >>>????????????????? P.S.? I am in process of building it on windows environment for a double check. May update result later. Thanks! > ??? >>> > ??? >>> > ??? >>>????????????? BRs, > ??? >>>????????????? Lin > ??? >>> > ??? >>> > ??? >>> > ??? >>> > ??? >>> > ??? 
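
The review comments quoted above ask for an error message that names the failing suboption instead of printing usage alone. A minimal sketch of that idea follows, assuming a comma-separated suboption string. The class HistoOptionSketch and its validate helper are hypothetical and are not the sun.tools.jmap.JMap code; only the three message strings are taken from the suggestions in the thread. The sketch also does not check that the 'parallel' value is numeric, since the thread says that check happens on the VM side.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the actual sun.tools.jmap.JMap implementation.
public class HistoOptionSketch {

    // Returns one message per rejected suboption; an empty list means all were accepted.
    static List<String> validate(String options) {
        List<String> errors = new ArrayList<>();
        if (options.isEmpty()) {
            return errors; // no suboptions given: defaults apply
        }
        for (String subopt : options.split(",")) {
            if (subopt.equals("live") || subopt.equals("all")) {
                // accepted as-is
            } else if (subopt.startsWith("file=")) {
                if (subopt.length() == "file=".length()) {
                    errors.add("Fail: invalid option or no file name: '" + subopt + "'");
                }
            } else if (subopt.startsWith("parallel=")) {
                if (subopt.length() == "parallel=".length()) {
                    errors.add("Fail: no number provided in option: '" + subopt + "'");
                }
            } else {
                errors.add("Fail: invalid option: '" + subopt + "'");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        for (String opts : new String[] {"live,parallel=3", "parallel=", "bogus"}) {
            List<String> errors = validate(opts);
            if (errors.isEmpty()) {
                System.out.println(opts + " -> accepted");
            } else {
                // A real tool would follow the specific message with usage and a non-zero exit.
                errors.forEach(System.err::println);
            }
        }
    }
}
```

Returning the messages rather than calling System.exit() inside the parser keeps the choice between "message then usage" and "message then exit", which is the point debated in the thread, with the caller.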
From linzang at tencent.com Thu Aug 13 00:08:10 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Thu, 13 Aug 2020 00:08:10 +0000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To: <547d8c1c-ff2e-2e5b-1e0b-89e422fccbca@oracle.com>
References: <77F863FB-3B42-4B2E-B48E-0113E2F4DA37@tencent.com> <52647A5E-8808-4916-92AD-1674468C746C@tencent.com> <547d8c1c-ff2e-2e5b-1e0b-89e422fccbca@oracle.com>
Message-ID:

Hi Paul and Serguei,
    Thanks for your comments, here is the updated patch: http://cr.openjdk.java.net/~lzang/8251374/webrev02/

BRs,
Lin

On 2020/8/13, 12:55 AM, "serguei.spitsyn at oracle.com" wrote:

    Hi Lin,

    It looks good.
    Just one comment.

    +               System.err.println("Fail: invalid option: '" + subopt +"'");
    +               System.exit(1);

    Exit needs to be replaced with usage for consistency.

    Thanks,
    Serguei


    On 8/10/20 19:57, linzang(臧琳) wrote:
    > Here is the webrev: http://cr.openjdk.java.net/~lzang/8251374/webrev01/
    >
    > BRs,
    > Lin
    >
    > On 2020/8/11, 10:52 AM, "linzang(臧琳)" wrote:
    >
    >     Hi All,
    >         May I ask your help to review this tiny patch? It fixes an issue where jmap -dump could wrongly accept invalid options.
    >         Bugs: https://bugs.openjdk.java.net/browse/JDK-8251374
    >         Patch: (Can not connect to webrev ftp currently, will try it later, following are all code changes)
    >
    >     ################################
    >     --- old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:32.044567791 +0800
    >     +++ new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:31.876568681 +0800
    >     @@ -207,6 +207,11 @@
    >                          liveopt = "-live";
    >                      } else if (subopt.startsWith("file=")) {
    >                          filename = parseFileName(subopt);
    >     +            } else if (subopt.equals("format=b")) {
    >     +                // ignore format (not needed at this time)
    >     +            } else {
    >     +               System.err.println("Fail: invalid option: '" + subopt +"'");
    >     +               System.exit(1);
    >                      }
    >                  }
    >     ################################
    >
    >     Thanks,
    >     Lin
    >
    >

From serguei.spitsyn at oracle.com Thu Aug 13 01:18:07 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 12 Aug 2020 18:18:07 -0700
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To:
References: <77F863FB-3B42-4B2E-B48E-0113E2F4DA37@tencent.com> <52647A5E-8808-4916-92AD-1674468C746C@tencent.com> <547d8c1c-ff2e-2e5b-1e0b-89e422fccbca@oracle.com>
Message-ID: <903ce74d-4cb3-d229-49eb-e0b260f05931@oracle.com>

Hi Lin.

Thank you for the update.
It looks good.

Thanks,
Serguei


On 8/12/20 17:08, linzang(臧琳) wrote:
> Hi Paul and Serguei,
>     Thanks for your comments, here is the updated patch: http://cr.openjdk.java.net/~lzang/8251374/webrev02/
>
> BRs,
> Lin

From hohensee at amazon.com Thu Aug 13 14:06:01 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 13 Aug 2020 14:06:01 +0000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
Message-ID: <395ADCB4-AA3E-4273-8EC1-33CEB37569CE@amazon.com>

+1, except that the indentation for the final 'else' clause needs to be 4 spaces instead of 3. :)

Thanks,
Paul

On 8/12/20, 6:21 PM, "serguei.spitsyn at oracle.com" wrote:

    Hi Lin.

    Thank you for the update.
    It looks good.

    Thanks,
    Serguei

From linzang at tencent.com Thu Aug 13 14:08:15 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Thu, 13 Aug 2020 14:08:15 +0000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To: <395ADCB4-AA3E-4273-8EC1-33CEB37569CE@amazon.com>
References: <395ADCB4-AA3E-4273-8EC1-33CEB37569CE@amazon.com>
Message-ID: <94F5F276-1F98-462C-857D-48429B654261@tencent.com>

Thanks Paul!
May I ask your help to push it?

BRs,
Lin

> On Aug 13, 2020, at 10:06 PM, Hohensee, Paul wrote:
>
> +1, except that the indentation for the final 'else' clause needs to be 4 spaces instead of 3. :)
>
> Thanks,
> Paul

From hohensee at amazon.com Thu Aug 13 16:33:36 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 13 Aug 2020 16:33:36 +0000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
Message-ID: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com>

Will do.

On 8/13/20, 7:08 AM, "linzang(臧琳)" wrote:

    Thanks Paul!
    May I ask your help to push it?

    BRs,
    Lin

From hohensee at amazon.com Thu Aug 13 18:36:35 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 13 Aug 2020 18:36:35 +0000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com>
References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com>
Message-ID: <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com>

I mistakenly committed and pushed Lin's patch with myself as author. Would someone with repo access please change the author to 'lzang'? Or tell me how to do it myself?

https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469

Thanks,
Paul

On 8/13/20, 9:48 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote:

    Will do.

From daniel.daugherty at oracle.com Thu Aug 13 19:01:28 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Aug 2020 15:01:28 -0400
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To: <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com>
References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com> <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com>
Message-ID:

That's something that's very hard to do. It would involve black listing
the existing changeset and repushing a new changeset. Black listing a
changeset is very, very rarely done and in the past Ops has declined to
do that for something like an authorship error.

Two options:

1) Manually remember that this changeset should be credited to Lin
   as author.
2a) [BACKOUT] the changeset using a new bug ID.
2b) [REDO] the changeset with corrected author information with a new
    bug ID.

Dan

On 8/13/20 2:36 PM, Hohensee, Paul wrote:
> I mistakenly committed and pushed Lin's patch with myself as author. Would someone with repo access please change the author to 'lzang'? Or tell me how to do it myself?
>
> https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469
>
> Thanks,
> Paul

From serguei.spitsyn at oracle.com Thu Aug 13 19:25:50 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 13 Aug 2020 12:25:50 -0700
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To:
References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com> <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com>
Message-ID: <4359c23d-43d6-8d86-6547-d45612917188@oracle.com>

On 8/13/20 12:01, Daniel D. Daugherty wrote:
> That's something that's very hard to do. It would involve black listing
> the existing changeset and repushing a new changeset. Black listing a
> changeset is very, very rarely done and in the past Ops has declined to
> do that for something like an authorship error.
>
> Two options:
>
> 1) Manually remember that this changeset should be credited to Lin
>    as author.

If we choose this option then a comment can be added to the bug report
that the actual committer was Lin.

Thanks,
Serguei

> 2a) [BACKOUT] the changeset using a new bug ID.
> 2b) [REDO] the changeset with corrected author information with a new
>     bug ID.
>
> Dan

From daniel.daugherty at oracle.com Thu Aug 13 19:34:07 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Aug 2020 15:34:07 -0400
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To:
References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com> <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com>
Message-ID:

Paul,

Hold up on trying to fix this. I'm discussing another idea with Stefan K.

Dan

On 8/13/20 3:01 PM, Daniel D. Daugherty wrote:
> That's something that's very hard to do. It would involve black listing
> the existing changeset and repushing a new changeset. Black listing a
> changeset is very, very rarely done and in the past Ops has declined to
> do that for something like an authorship error.
>
> Two options:
>
> 1) Manually remember that this changeset should be credited to Lin
>    as author.
> 2a) [BACKOUT] the changeset using a new bug ID.
> 2b) [REDO] the changeset with corrected author information with a new
>     bug ID.
>
> Dan
>
> On 8/13/20 2:36 PM, Hohensee, Paul wrote:
>> I mistakenly committed and pushed Lin's patch with myself as author.
>> Would someone with repo access please change the author to 'lzang'?
>> Or tell me how to do it myself?
>>
>> https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469
>>
>> Thanks,
>> Paul
>>
>> On 8/13/20, 9:48 AM, "serviceability-dev on behalf of Hohensee, Paul" wrote:
>>
>>     Will do.
>>
On 8/13/20, 7:08 AM, "linzang(??)" wrote: >> >> ???????? Thanks Paul? >> ???????? May I ask your help to push it? >> >> ???????? BRs, >> ???????? Lin >> >> ???????? > On Aug 13, 2020, at 10:06 PM, Hohensee, Paul >> wrote: >> ???????? > >> ???????? > +1, except that the indentation for the final 'else' >> clause needs to be 4 spaces instead of 3. :) >> ???????? > >> ???????? > Thanks, >> ???????? > Paul >> ???????? > >> ???????? > On 8/12/20, 6:21 PM, "serguei.spitsyn at oracle.com" >> wrote: >> ???????? > >> ???????? >??? Hi Lin. >> ???????? > >> ???????? >??? Thank you for the update. >> ???????? >??? It looks good. >> ???????? > >> ???????? >??? Thanks, >> ???????? >??? Serguei >> ???????? > >> ???????? > >> ???????? >>??? On 8/12/20 17:08, linzang(??) wrote: >> ???????? >> Hi Paul and Serguei, >> ???????? >>????? Thanks for your comments, here is the updated patch: >> http://cr.openjdk.java.net/~lzang/8251374/webrev02/ >> ???????? >> >> ???????? >> BRs, >> ???????? >> Lin >> ???????? >> >> ???????? >> On 2020/8/13, 12:55 AM, "serguei.spitsyn at oracle.com" >> wrote: >> ???????? >> >> ???????? >>???? Hi Lin, >> ???????? >> >> ???????? >>???? It looks good. >> ???????? >>???? Just one comment. >> ???????? >> >> ???????? >>????????? + System.err.println("Fail: invalid option: '" + >> subopt +"'"); >> ???????? >>????????? +?????????????? System.exit(1); >> ???????? >> >> ???????? >>???? Exit needs to be replaced wit usage for consistency. >> ???????? >> >> ???????? >>???? Thanks, >> ???????? >>???? Serguei >> ???????? >> >> ???????? >> >> ???????? >>>???? On 8/10/20 19:57, linzang(??) wrote: >> ???????? >>> Here is the webrev: >> http://cr.openjdk.java.net/~lzang/8251374/webrev01/ >> ???????? >>> >> ???????? >>> BRs, >> ???????? >>> Lin >> ???????? >>> >> ???????? >>>> On 2020/8/11, 10:52 AM, "linzang(??)" >> wrote: >> ???????? >>> >> ???????? >>>???? Hi All, >> ???????? >>>????????? May I ask your help to review this tiny patch? 
>> It fix an issue that jmap -dump could wrongly accept invalid optioins. >> ???????? >>>????????? Bugs: >> https://bugs.openjdk.java.net/browse/JDK-8251374 >> ???????? >>>????????? Patch:? (Can not connect to webrev ftp >> currently, will try it later, following are all code changes) >> ???????? >>> >> ???????? >>>???? ################################ >> ???????? >>>???? --- >> old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >> 10:42:32.044567791 +0800 >> ???????? >>>???? +++ >> new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >> 10:42:31.876568681 +0800 >> ???????? >>>???? @@ -207,6 +207,11 @@ >> ???????? >>>????????????????????? liveopt = "-live"; >> ???????? >>>????????????????? } else if (subopt.startsWith("file=")) { >> ???????? >>>????????????????????? filename = parseFileName(subopt); >> ???????? >>>???? +??????????? } else if (subopt.equals("format=b")) { >> ???????? >>>???? +??????????????? // ignore format (not needed at >> this time) >> ???????? >>>???? +??????????? } else { >> ???????? >>>???? + System.err.println("Fail: invalid option: '" + >> subopt +"'"); >> ???????? >>>???? +?????????????? System.exit(1); >> ???????? >>>????????????????? } >> ???????? >>>????????????? } >> ???????? >>>???? ################################ >> ???????? >>> >> ???????? >>>???? Thanks, >> ???????? >>>???? Lin >> ???????? >>> >> ???????? >>> >> ???????? >> >> ???????? >> >> ???????? >> >> ???????? > >> ???????? > >> >> > From daniel.daugherty at oracle.com Thu Aug 13 19:51:36 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 13 Aug 2020 15:51:36 -0400 Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail) In-Reply-To: References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com> <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com> Message-ID: Stefan K's idea worked like a change. A corrected changeset has been created, merged and pushed. Dan On 8/13/20 3:34 PM, Daniel D. 
Daugherty wrote: > Paul, > > Hold up on trying to fix this. > > I'm discussing another idea with Stefan K. > > Dan > > > On 8/13/20 3:01 PM, Daniel D. Daugherty wrote: >> That's something that's very hard to do. It would involve black listing >> the existing changeset and repushing a new changeset. Black listing a >> changeset is very, very rarely done and in the past Ops has declined to >> do that for something like an authorship error. >> >> Two options: >> >> 1) Manually remember that this changeset should be credited to Lin >> ?? as author. >> 2a) [BACKOUT] the changeset using a new bug ID. >> 2b) [REDO] the changeset with corrected author information with a new >> bug ID. >> >> Dan >> >> On 8/13/20 2:36 PM, Hohensee, Paul wrote: >>> I mistakenly committed and pushed Lin's patch with myself as author. >>> Would someone with repo access please change the author to 'lzang'? >>> Or tell me how to do it myself? >>> >>> https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469 >>> >>> Thanks, >>> Paul >>> >>> ?On 8/13/20, 9:48 AM, "serviceability-dev on behalf of Hohensee, >>> Paul" >> hohensee at amazon.com> wrote: >>> >>> ???? Will do. >>> >>> ???? On 8/13/20, 7:08 AM, "linzang(??)" wrote: >>> >>> ???????? Thanks Paul? >>> ???????? May I ask your help to push it? >>> >>> ???????? BRs, >>> ???????? Lin >>> >>> ???????? > On Aug 13, 2020, at 10:06 PM, Hohensee, Paul >>> wrote: >>> ???????? > >>> ???????? > +1, except that the indentation for the final 'else' >>> clause needs to be 4 spaces instead of 3. :) >>> ???????? > >>> ???????? > Thanks, >>> ???????? > Paul >>> ???????? > >>> ???????? > On 8/12/20, 6:21 PM, "serguei.spitsyn at oracle.com" >>> wrote: >>> ???????? > >>> ???????? >??? Hi Lin. >>> ???????? > >>> ???????? >??? Thank you for the update. >>> ???????? >??? It looks good. >>> ???????? > >>> ???????? >??? Thanks, >>> ???????? >??? Serguei >>> ???????? > >>> ???????? > >>> ???????? >>??? On 8/12/20 17:08, linzang(??) wrote: >>> ???????? 
>> Hi Paul and Serguei, >>> ???????? >>????? Thanks for your comments, here is the updated >>> patch: http://cr.openjdk.java.net/~lzang/8251374/webrev02/ >>> ???????? >> >>> ???????? >> BRs, >>> ???????? >> Lin >>> ???????? >> >>> ???????? >> On 2020/8/13, 12:55 AM, "serguei.spitsyn at oracle.com" >>> wrote: >>> ???????? >> >>> ???????? >>???? Hi Lin, >>> ???????? >> >>> ???????? >>???? It looks good. >>> ???????? >>???? Just one comment. >>> ???????? >> >>> ???????? >>????????? + System.err.println("Fail: invalid option: '" >>> + subopt +"'"); >>> ???????? >>????????? +?????????????? System.exit(1); >>> ???????? >> >>> ???????? >>???? Exit needs to be replaced wit usage for consistency. >>> ???????? >> >>> ???????? >>???? Thanks, >>> ???????? >>???? Serguei >>> ???????? >> >>> ???????? >> >>> ???????? >>>???? On 8/10/20 19:57, linzang(??) wrote: >>> ???????? >>> Here is the webrev: >>> http://cr.openjdk.java.net/~lzang/8251374/webrev01/ >>> ???????? >>> >>> ???????? >>> BRs, >>> ???????? >>> Lin >>> ???????? >>> >>> ???????? >>>> On 2020/8/11, 10:52 AM, "linzang(??)" >>> wrote: >>> ???????? >>> >>> ???????? >>>???? Hi All, >>> ???????? >>>????????? May I ask your help to review this tiny patch? >>> It fix an issue that jmap -dump could wrongly accept invalid optioins. >>> ???????? >>>????????? Bugs: >>> https://bugs.openjdk.java.net/browse/JDK-8251374 >>> ???????? >>>????????? Patch:? (Can not connect to webrev ftp >>> currently, will try it later, following are all code changes) >>> ???????? >>> >>> ???????? >>>???? ################################ >>> ???????? >>>???? --- >>> old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >>> 10:42:32.044567791 +0800 >>> ???????? >>>???? +++ >>> new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >>> 10:42:31.876568681 +0800 >>> ???????? >>>???? @@ -207,6 +207,11 @@ >>> ???????? >>>????????????????????? liveopt = "-live"; >>> ???????? >>>????????????????? 
} else if (subopt.startsWith("file=")) { >>> ???????? >>>????????????????????? filename = parseFileName(subopt); >>> ???????? >>>???? +??????????? } else if (subopt.equals("format=b")) { >>> ???????? >>>???? +??????????????? // ignore format (not needed at >>> this time) >>> ???????? >>>???? +??????????? } else { >>> ???????? >>>???? + System.err.println("Fail: invalid option: '" + >>> subopt +"'"); >>> ???????? >>>???? +?????????????? System.exit(1); >>> ???????? >>>????????????????? } >>> ???????? >>>????????????? } >>> ???????? >>>???? ################################ >>> ???????? >>> >>> ???????? >>>???? Thanks, >>> ???????? >>>???? Lin >>> ???????? >>> >>> ???????? >>> >>> ???????? >> >>> ???????? >> >>> ???????? >> >>> ???????? > >>> ???????? > >>> >>> >> > From daniel.daugherty at oracle.com Thu Aug 13 19:52:11 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 13 Aug 2020 15:52:11 -0400 Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail) In-Reply-To: References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com> <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com> Message-ID: s/like a change/like a charm/ Typing too fast today... Dan On 8/13/20 3:51 PM, Daniel D. Daugherty wrote: > Stefan K's idea worked like a change. A corrected changeset has > been created, merged and pushed. > > Dan > > > On 8/13/20 3:34 PM, Daniel D. Daugherty wrote: >> Paul, >> >> Hold up on trying to fix this. >> >> I'm discussing another idea with Stefan K. >> >> Dan >> >> >> On 8/13/20 3:01 PM, Daniel D. Daugherty wrote: >>> That's something that's very hard to do. It would involve black listing >>> the existing changeset and repushing a new changeset. Black listing a >>> changeset is very, very rarely done and in the past Ops has declined to >>> do that for something like an authorship error. >>> >>> Two options: >>> >>> 1) Manually remember that this changeset should be credited to Lin >>> ?? as author. 
>>> 2a) [BACKOUT] the changeset using a new bug ID. >>> 2b) [REDO] the changeset with corrected author information with a >>> new bug ID. >>> >>> Dan >>> >>> On 8/13/20 2:36 PM, Hohensee, Paul wrote: >>>> I mistakenly committed and pushed Lin's patch with myself as >>>> author. Would someone with repo access please change the author to >>>> 'lzang'? Or tell me how to do it myself? >>>> >>>> https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469 >>>> >>>> Thanks, >>>> Paul >>>> >>>> ?On 8/13/20, 9:48 AM, "serviceability-dev on behalf of Hohensee, >>>> Paul" >>> hohensee at amazon.com> wrote: >>>> >>>> ???? Will do. >>>> >>>> ???? On 8/13/20, 7:08 AM, "linzang(??)" wrote: >>>> >>>> ???????? Thanks Paul? >>>> ???????? May I ask your help to push it? >>>> >>>> ???????? BRs, >>>> ???????? Lin >>>> >>>> ???????? > On Aug 13, 2020, at 10:06 PM, Hohensee, Paul >>>> wrote: >>>> ???????? > >>>> ???????? > +1, except that the indentation for the final 'else' >>>> clause needs to be 4 spaces instead of 3. :) >>>> ???????? > >>>> ???????? > Thanks, >>>> ???????? > Paul >>>> ???????? > >>>> ???????? > On 8/12/20, 6:21 PM, "serguei.spitsyn at oracle.com" >>>> wrote: >>>> ???????? > >>>> ???????? >??? Hi Lin. >>>> ???????? > >>>> ???????? >??? Thank you for the update. >>>> ???????? >??? It looks good. >>>> ???????? > >>>> ???????? >??? Thanks, >>>> ???????? >??? Serguei >>>> ???????? > >>>> ???????? > >>>> ???????? >>??? On 8/12/20 17:08, linzang(??) wrote: >>>> ???????? >> Hi Paul and Serguei, >>>> ???????? >>????? Thanks for your comments, here is the updated >>>> patch: http://cr.openjdk.java.net/~lzang/8251374/webrev02/ >>>> ???????? >> >>>> ???????? >> BRs, >>>> ???????? >> Lin >>>> ???????? >> >>>> ???????? >> On 2020/8/13, 12:55 AM, "serguei.spitsyn at oracle.com" >>>> wrote: >>>> ???????? >> >>>> ???????? >>???? Hi Lin, >>>> ???????? >> >>>> ???????? >>???? It looks good. >>>> ???????? >>???? Just one comment. >>>> ???????? >> >>>> ???????? >>????????? 
+ System.err.println("Fail: invalid option: '" >>>> + subopt +"'"); >>>> ???????? >>????????? +?????????????? System.exit(1); >>>> ???????? >> >>>> ???????? >>???? Exit needs to be replaced wit usage for consistency. >>>> ???????? >> >>>> ???????? >>???? Thanks, >>>> ???????? >>???? Serguei >>>> ???????? >> >>>> ???????? >> >>>> ???????? >>>???? On 8/10/20 19:57, linzang(??) wrote: >>>> ???????? >>> Here is the webrev: >>>> http://cr.openjdk.java.net/~lzang/8251374/webrev01/ >>>> ???????? >>> >>>> ???????? >>> BRs, >>>> ???????? >>> Lin >>>> ???????? >>> >>>> ???????? >>>> On 2020/8/11, 10:52 AM, "linzang(??)" >>>> wrote: >>>> ???????? >>> >>>> ???????? >>>???? Hi All, >>>> ???????? >>>????????? May I ask your help to review this tiny >>>> patch? It fix an issue that jmap -dump could wrongly accept invalid >>>> optioins. >>>> ???????? >>>????????? Bugs: >>>> https://bugs.openjdk.java.net/browse/JDK-8251374 >>>> ???????? >>>????????? Patch:? (Can not connect to webrev ftp >>>> currently, will try it later, following are all code changes) >>>> ???????? >>> >>>> ???????? >>>???? ################################ >>>> ???????? >>>???? --- >>>> old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >>>> 10:42:32.044567791 +0800 >>>> ???????? >>>???? +++ >>>> new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >>>> 10:42:31.876568681 +0800 >>>> ???????? >>>???? @@ -207,6 +207,11 @@ >>>> ???????? >>>????????????????????? liveopt = "-live"; >>>> ???????? >>>????????????????? } else if (subopt.startsWith("file=")) { >>>> ???????? >>>????????????????????? filename = parseFileName(subopt); >>>> ???????? >>>???? +??????????? } else if (subopt.equals("format=b")) { >>>> ???????? >>>???? +??????????????? // ignore format (not needed at >>>> this time) >>>> ???????? >>>???? +??????????? } else { >>>> ???????? >>>???? + System.err.println("Fail: invalid option: '" + >>>> subopt +"'"); >>>> ???????? >>>???? +?????????????? System.exit(1); >>>> ???????? 
>>>????????????????? } >>>> ???????? >>>????????????? } >>>> ???????? >>>???? ################################ >>>> ???????? >>> >>>> ???????? >>>???? Thanks, >>>> ???????? >>>???? Lin >>>> ???????? >>> >>>> ???????? >>> >>>> ???????? >> >>>> ???????? >> >>>> ???????? >> >>>> ???????? > >>>> ???????? > >>>> >>>> >>> >> > From hohensee at amazon.com Thu Aug 13 20:47:44 2020 From: hohensee at amazon.com (Hohensee, Paul) Date: Thu, 13 Aug 2020 20:47:44 +0000 Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail) Message-ID: <83C151AD-9CCF-4F45-8A47-00A8733A633B@amazon.com> Thanks, Dan and Stefan! Paul ?On 8/13/20, 12:55 PM, "Daniel D. Daugherty" wrote: s/like a change/like a charm/ Typing too fast today... Dan On 8/13/20 3:51 PM, Daniel D. Daugherty wrote: > Stefan K's idea worked like a change. A corrected changeset has > been created, merged and pushed. > > Dan > > > On 8/13/20 3:34 PM, Daniel D. Daugherty wrote: >> Paul, >> >> Hold up on trying to fix this. >> >> I'm discussing another idea with Stefan K. >> >> Dan >> >> >> On 8/13/20 3:01 PM, Daniel D. Daugherty wrote: >>> That's something that's very hard to do. It would involve black listing >>> the existing changeset and repushing a new changeset. Black listing a >>> changeset is very, very rarely done and in the past Ops has declined to >>> do that for something like an authorship error. >>> >>> Two options: >>> >>> 1) Manually remember that this changeset should be credited to Lin >>> as author. >>> 2a) [BACKOUT] the changeset using a new bug ID. >>> 2b) [REDO] the changeset with corrected author information with a >>> new bug ID. >>> >>> Dan >>> >>> On 8/13/20 2:36 PM, Hohensee, Paul wrote: >>>> I mistakenly committed and pushed Lin's patch with myself as >>>> author. Would someone with repo access please change the author to >>>> 'lzang'? Or tell me how to do it myself? 
>>>> >>>> https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469 >>>> >>>> Thanks, >>>> Paul >>>> >>>> On 8/13/20, 9:48 AM, "serviceability-dev on behalf of Hohensee, >>>> Paul" >>> hohensee at amazon.com> wrote: >>>> >>>> Will do. >>>> >>>> On 8/13/20, 7:08 AM, "linzang(??)" wrote: >>>> >>>> Thanks Paul? >>>> May I ask your help to push it? >>>> >>>> BRs, >>>> Lin >>>> >>>> > On Aug 13, 2020, at 10:06 PM, Hohensee, Paul >>>> wrote: >>>> > >>>> > +1, except that the indentation for the final 'else' >>>> clause needs to be 4 spaces instead of 3. :) >>>> > >>>> > Thanks, >>>> > Paul >>>> > >>>> > On 8/12/20, 6:21 PM, "serguei.spitsyn at oracle.com" >>>> wrote: >>>> > >>>> > Hi Lin. >>>> > >>>> > Thank you for the update. >>>> > It looks good. >>>> > >>>> > Thanks, >>>> > Serguei >>>> > >>>> > >>>> >> On 8/12/20 17:08, linzang(??) wrote: >>>> >> Hi Paul and Serguei, >>>> >> Thanks for your comments, here is the updated >>>> patch: http://cr.openjdk.java.net/~lzang/8251374/webrev02/ >>>> >> >>>> >> BRs, >>>> >> Lin >>>> >> >>>> >> On 2020/8/13, 12:55 AM, "serguei.spitsyn at oracle.com" >>>> wrote: >>>> >> >>>> >> Hi Lin, >>>> >> >>>> >> It looks good. >>>> >> Just one comment. >>>> >> >>>> >> + System.err.println("Fail: invalid option: '" >>>> + subopt +"'"); >>>> >> + System.exit(1); >>>> >> >>>> >> Exit needs to be replaced wit usage for consistency. >>>> >> >>>> >> Thanks, >>>> >> Serguei >>>> >> >>>> >> >>>> >>> On 8/10/20 19:57, linzang(??) wrote: >>>> >>> Here is the webrev: >>>> http://cr.openjdk.java.net/~lzang/8251374/webrev01/ >>>> >>> >>>> >>> BRs, >>>> >>> Lin >>>> >>> >>>> >>>> On 2020/8/11, 10:52 AM, "linzang(??)" >>>> wrote: >>>> >>> >>>> >>> Hi All, >>>> >>> May I ask your help to review this tiny >>>> patch? It fix an issue that jmap -dump could wrongly accept invalid >>>> optioins. 
>>>> >>> Bugs: >>>> https://bugs.openjdk.java.net/browse/JDK-8251374 >>>> >>> Patch: (Can not connect to webrev ftp >>>> currently, will try it later, following are all code changes) >>>> >>> >>>> >>> ################################ >>>> >>> --- >>>> old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >>>> 10:42:32.044567791 +0800 >>>> >>> +++ >>>> new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >>>> 10:42:31.876568681 +0800 >>>> >>> @@ -207,6 +207,11 @@ >>>> >>> liveopt = "-live"; >>>> >>> } else if (subopt.startsWith("file=")) { >>>> >>> filename = parseFileName(subopt); >>>> >>> + } else if (subopt.equals("format=b")) { >>>> >>> + // ignore format (not needed at >>>> this time) >>>> >>> + } else { >>>> >>> + System.err.println("Fail: invalid option: '" + >>>> subopt +"'"); >>>> >>> + System.exit(1); >>>> >>> } >>>> >>> } >>>> >>> ################################ >>>> >>> >>>> >>> Thanks, >>>> >>> Lin >>>> >>> >>>> >>> >>>> >> >>>> >> >>>> >> >>>> > >>>> > >>>> >>>> >>> >> > From linzang at tencent.com Thu Aug 13 23:29:39 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Thu, 13 Aug 2020 23:29:39 +0000 Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail) In-Reply-To: <83C151AD-9CCF-4F45-8A47-00A8733A633B@amazon.com> References: <83C151AD-9CCF-4F45-8A47-00A8733A633B@amazon.com> Message-ID: Dear Paul, Dan, Stefan and Serguei, Please accept my best thanks for your help! BRs, Lin ?On 2020/8/14, 4:48 AM, "Hohensee, Paul" wrote: Thanks, Dan and Stefan! Paul On 8/13/20, 12:55 PM, "Daniel D. Daugherty" wrote: s/like a change/like a charm/ Typing too fast today... Dan On 8/13/20 3:51 PM, Daniel D. Daugherty wrote: > Stefan K's idea worked like a change. A corrected changeset has > been created, merged and pushed. > > Dan > > > On 8/13/20 3:34 PM, Daniel D. Daugherty wrote: >> Paul, >> >> Hold up on trying to fix this. >> >> I'm discussing another idea with Stefan K. 
>> >> Dan >> >> >> On 8/13/20 3:01 PM, Daniel D. Daugherty wrote: >>> That's something that's very hard to do. It would involve black listing >>> the existing changeset and repushing a new changeset. Black listing a >>> changeset is very, very rarely done and in the past Ops has declined to >>> do that for something like an authorship error. >>> >>> Two options: >>> >>> 1) Manually remember that this changeset should be credited to Lin >>> as author. >>> 2a) [BACKOUT] the changeset using a new bug ID. >>> 2b) [REDO] the changeset with corrected author information with a >>> new bug ID. >>> >>> Dan >>> >>> On 8/13/20 2:36 PM, Hohensee, Paul wrote: >>>> I mistakenly committed and pushed Lin's patch with myself as >>>> author. Would someone with repo access please change the author to >>>> 'lzang'? Or tell me how to do it myself? >>>> >>>> https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469 >>>> >>>> Thanks, >>>> Paul >>>> >>>> On 8/13/20, 9:48 AM, "serviceability-dev on behalf of Hohensee, >>>> Paul" >>> hohensee at amazon.com> wrote: >>>> >>>> Will do. >>>> >>>> On 8/13/20, 7:08 AM, "linzang(??)" wrote: >>>> >>>> Thanks Paul? >>>> May I ask your help to push it? >>>> >>>> BRs, >>>> Lin >>>> >>>> > On Aug 13, 2020, at 10:06 PM, Hohensee, Paul >>>> wrote: >>>> > >>>> > +1, except that the indentation for the final 'else' >>>> clause needs to be 4 spaces instead of 3. :) >>>> > >>>> > Thanks, >>>> > Paul >>>> > >>>> > On 8/12/20, 6:21 PM, "serguei.spitsyn at oracle.com" >>>> wrote: >>>> > >>>> > Hi Lin. >>>> > >>>> > Thank you for the update. >>>> > It looks good. >>>> > >>>> > Thanks, >>>> > Serguei >>>> > >>>> > >>>> >> On 8/12/20 17:08, linzang(??) 
wrote: >>>> >> Hi Paul and Serguei, >>>> >> Thanks for your comments, here is the updated >>>> patch: http://cr.openjdk.java.net/~lzang/8251374/webrev02/ >>>> >> >>>> >> BRs, >>>> >> Lin >>>> >> >>>> >> On 2020/8/13, 12:55 AM, "serguei.spitsyn at oracle.com" >>>> wrote: >>>> >> >>>> >> Hi Lin, >>>> >> >>>> >> It looks good. >>>> >> Just one comment. >>>> >> >>>> >> + System.err.println("Fail: invalid option: '" >>>> + subopt +"'"); >>>> >> + System.exit(1); >>>> >> >>>> >> Exit needs to be replaced wit usage for consistency. >>>> >> >>>> >> Thanks, >>>> >> Serguei >>>> >> >>>> >> >>>> >>> On 8/10/20 19:57, linzang(??) wrote: >>>> >>> Here is the webrev: >>>> http://cr.openjdk.java.net/~lzang/8251374/webrev01/ >>>> >>> >>>> >>> BRs, >>>> >>> Lin >>>> >>> >>>> >>>> On 2020/8/11, 10:52 AM, "linzang(??)" >>>> wrote: >>>> >>> >>>> >>> Hi All, >>>> >>> May I ask your help to review this tiny >>>> patch? It fix an issue that jmap -dump could wrongly accept invalid >>>> optioins. >>>> >>> Bugs: >>>> https://bugs.openjdk.java.net/browse/JDK-8251374 >>>> >>> Patch: (Can not connect to webrev ftp >>>> currently, will try it later, following are all code changes) >>>> >>> >>>> >>> ################################ >>>> >>> --- >>>> old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >>>> 10:42:32.044567791 +0800 >>>> >>> +++ >>>> new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 >>>> 10:42:31.876568681 +0800 >>>> >>> @@ -207,6 +207,11 @@ >>>> >>> liveopt = "-live"; >>>> >>> } else if (subopt.startsWith("file=")) { >>>> >>> filename = parseFileName(subopt); >>>> >>> + } else if (subopt.equals("format=b")) { >>>> >>> + // ignore format (not needed at >>>> this time) >>>> >>> + } else { >>>> >>> + System.err.println("Fail: invalid option: '" + >>>> subopt +"'"); >>>> >>> + System.exit(1); >>>> >>> } >>>> >>> } >>>> >>> ################################ >>>> >>> >>>> >>> Thanks, >>>> >>> Lin >>>> >>> >>>> >>> >>>> >> >>>> >> >>>> >> >>>> > >>>> > >>>> 
>>>> >>> >> > From serguei.spitsyn at oracle.com Fri Aug 14 08:10:58 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 14 Aug 2020 01:10:58 -0700 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From richard.reingruber at sap.com Fri Aug 14 14:06:53 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 14 Aug 2020 14:06:53 +0000 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> Message-ID: Hi Serguei, thanks for the feedback. I have implemented your suggestions and created a new webrev: Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/ Please find my replies to your comments below. Best regards, Richard. > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html > 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it > ... > 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. > 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. > ... > 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. > ... 
> 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); > Could you, please, re-balance the lines above to make them shorter? Ok, done. > 90 int newTargetDepth = recursiveMethod(0, targetDepth); > 91 if (newTargetDepth < targetDepth) { > 92 msg("StackOverflowError during test."); > 93 msg("Old target depth: " + targetDepth); > 94 msg("Retry with new target depth: " + newTargetDepth); > 95 targetDepth = newTargetDepth; > 96 } > A comment is needed to explain why a StackOverflowError is not desired. > At least, it is not obvious initially. > 73 public int waitTimeInNativeAfterNotify; > This name is unreasonably long which makes the code less readable. > I'd suggest to reduce it to waitTime. Ok, done. > 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); > ... > 119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); > It is better to provide a short comment before each call explaining what it is doing. > For instance, it is not clear why the call at the line 103 is needed. > Why do we need to notify the agent to GetLocal for the second time? The test is repeated TEST_ITERATIONS times. In each iteration the agent calls GetLocal racing the target thread returning from the native call. The last call in line 103 ist the shutdown signal. > Can it be refactored into a separate native method? I've made the shutdown process more explicit with the new native method shutDown() which sets thest_state to ShutDown. > Then the the function name can be reduced to 'notifyAgentToGetLocal'. > This long name does not give enough context anyway. Ok, done. > 85 long iterations = 0; > 87 do { > ... > 97 iterations++; > ... > 102 } while (iterations < TEST_ITERATIONS); > Why a more explicit 'for' or 'while' loop is not used here? : > for (long iter = 0; iter < TEST_ITERATIONS; iter++) { I have converted the loop into a for loop. 
> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html > The indent in this file varies. It is better to keep it the same: 4 or 2. Yes, I noticed this. I have not corrected it yet, because I didn't want to pullute the incremental webrev with that change. Would you like me to fix the indentation now to 2 spaces or do it as a last step? > 60 AgentCallingGetLocalObject // The target thread waits for the agent to call > I'd suggest to rename the constant to 'AgentInGetLocal'. Ok, done. > 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) { > It is better rename the function to TestThreadLoop. Would AgentThreadLoop be ok too? > You can add a comment before to explain some basic about what it is doing. Ok, done. > 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n", > It is better to get rid of leading stars in all messages. Ok, done. > 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly > The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. Ok, done. --- From: serguei.spitsyn at oracle.com Sent: Freitag, 14. August 2020 10:11 To: Reingruber, Richard ; David Holmes ; serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it ... 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. 
55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again.
...
71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call.
...
89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify);

Could you, please, re-balance the lines above to make them shorter?

90 int newTargetDepth = recursiveMethod(0, targetDepth);
91 if (newTargetDepth < targetDepth) {
92     msg("StackOverflowError during test.");
93     msg("Old target depth: " + targetDepth);
94     msg("Retry with new target depth: " + newTargetDepth);
95     targetDepth = newTargetDepth;
96 }

A comment is needed to explain why a StackOverflowError is not desired. At least, it is not obvious initially.

73 public int waitTimeInNativeAfterNotify;

This name is unreasonably long which makes the code less readable. I'd suggest to reduce it to waitTime.

103 notifyAgentToGetLocalAndWaitShortly(-1, 1);
...
119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify);

It is better to provide a short comment before each call explaining what it is doing. For instance, it is not clear why the call at the line 103 is needed. Why do we need to notify the agent to GetLocal for the second time? Can it be refactored into a separate native method? Then the function name can be reduced to 'notifyAgentToGetLocal'. This long name does not give enough context anyway.

85 long iterations = 0;
87 do {
...
97 iterations++;
...
102 } while (iterations < TEST_ITERATIONS);

Why is a more explicit 'for' or 'while' loop not used here?

for (long iter = 0; iter < TEST_ITERATIONS; iter++) {

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html

The indent in this file varies. It is better to keep it the same: 4 or 2.

60 AgentCallingGetLocalObject // The target thread waits for the agent to call

I'd suggest to rename the constant to 'AgentInGetLocal'.

150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) {

It is better to rename the function to TestThreadLoop.
You can add a comment before to explain some basic about what it is doing.

167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n",

It is better to get rid of leading stars in all messages.

176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly

The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name.

I'm still reviewing the test native agent code.

Thanks,
Serguei


On 8/11/20 03:02, Reingruber, Richard wrote:

Hi David and Serguei,

On 11/08/2020 3:21 am, serguei.spitsyn at oracle.com wrote:

Hi Richard and David,

The implementation looks good to me.

But I do not understand what the test is doing with all these counters and recursions.

For instance, these fragments:

86 recursions = 0;
87 try {
88     recursiveMethod(1<<20);
89 } catch (StackOverflowError e) {
90     msg("Caught StackOverflowError as expected");
91 }
92 int target_depth = recursions-100; // spaces are missed around the '-' sign

It is not obvious that the 'recursion' is updated in the recursiveMethod.
I would suggest to make it more explicit:

recursiveMethod(M);
int target_depth = M - 100;

Then the variable 'recursions' can be removed or become local.

The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so:

target_depth = - 100;

Possibly recursiveMethod could return the actual recursions instead of using the global variables?

I've eliminated the static 'recursions' variable. recursiveMethod() now returns the depth at which the recursion was ended.
I hesitated doing this, because I had to handle the StackOverflowError with all those frames still on stack. But the handler is empty, so it should not cause problems.

This is the new webrev (as posted previously):

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/

Thanks, Richard.

-----Original Message-----
From: David Holmes <david.holmes at oracle.com>
Sent: Tuesday, 11 August 2020 04:00
To: serguei.spitsyn at oracle.com; Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Serguei,

On 11/08/2020 3:21 am, serguei.spitsyn at oracle.com wrote:

Hi Richard and David,

The implementation looks good to me.

But I do not understand what the test is doing with all these counters and recursions.

For instance, these fragments:

86 recursions = 0;
87 try {
88     recursiveMethod(1<<20);
89 } catch (StackOverflowError e) {
90     msg("Caught StackOverflowError as expected");
91 }
92 int target_depth = recursions-100; // spaces are missed around the '-' sign

It is not obvious that the 'recursion' is updated in the recursiveMethod.
I would suggest to make it more explicit:

   recursiveMethod(M);
   int target_depth = M - 100;

Then the variable 'recursions' can be removed or become local.

The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so:

target_depth = - 100;

Possibly recursiveMethod could return the actual recursions instead of using the global variables?

David
-----

This method will be:

47 private static final int M = 1 << 20;
...
121 public long recursiveMethod(int depth) {
123     if (depth == 0) {
124         notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify);
126     } else {
127         recursiveMethod(--depth);
128     }
129 }

At least, the test is missing the comments explaining all these.

Thanks,
Serguei


On 8/9/20 22:35, David Holmes wrote:

Hi Richard,

On 31/07/2020 5:28 pm, Reingruber, Richard wrote:

Hi,

I rebased the fix after JDK-8250042.

New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/

The general fix for this seems good. A minor nit:

 588     if (!is_assignable(signature, ob_k, Thread::current())) {

You know that the current thread is the VMThread so can use VMThread::vm_thread(). Similarly for this existing code:

 694     Thread* current_thread = Thread::current();

---

Looking at the test code ... I'm less clear on exactly what is happening and the use of spin-waits raises some red-flags for me in terms of test reliability on different platforms. The "while (--waitCycles > 0)" loop in particular offers no certainty that the agent thread is executing anything in particular. And the use of the spin_count as a guide to future waiting time seems somewhat arbitrary. In all seriousness I got a headache trying to work out how the test was expecting to operate. Some parts could be simplified using raw monitors, I think. But there's no sure way to know the agent thread is in the midst of the stackwalk when the target thread wants to leave the native code. So I understand what you are trying to achieve here, I'm just not sure how reliably it will actually achieve it.

test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp

 32 static volatile jlong spinn_count     = 0;

Using a 64-bit counter seems like it will be a problem on 32-bit systems.

Should be spin_count not spinn_count.

 36 // Agent thread waits for value != 0, then performas the JVMTI call to get local variable.

typo: performas

Thanks,
David
-----

Thanks, Richard.
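
The change discussed above, where recursiveMethod() returns the depth at which the recursion ended instead of updating a static counter, might look roughly like the following. This is only a minimal sketch and not the code from the webrev: the class name RecursionDepthSketch and the exact two-argument signature are assumptions made for illustration.

```java
final class RecursionDepthSketch {

    // Hypothetical stand-in for the test's recursive method. It descends until
    // maxDepth is reached or the stack overflows, and returns the depth at
    // which the recursion actually ended.
    static int recursiveMethod(int depth, int maxDepth) {
        if (depth >= maxDepth) {
            return depth; // reached the requested depth
        }
        try {
            return recursiveMethod(depth + 1, maxDepth);
        } catch (StackOverflowError e) {
            // The handler does nothing but return: all the caller frames are
            // still on the stack here, so it should need as little stack as possible.
            return depth;
        }
    }

    public static void main(String[] args) {
        int targetDepth = 1 << 20;
        int reached = recursiveMethod(0, targetDepth);
        if (reached < targetDepth) {
            // Continue with a depth that is known to fit on this thread's stack.
            targetDepth = reached;
        }
        System.out.println("target depth: " + targetDepth);
    }
}
```

Returning the depth keeps the state local to the call chain, which is the point raised in the review: the caller can see directly where the value comes from.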
-----Original Message-----
From: serviceability-dev <serviceability-dev-retn at openjdk.java.net> On Behalf Of Reingruber, Richard
Sent: Monday, 27 July 2020 09:45
To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Serguei,

   > I tested it on Linux and Windows but not yet on MacOS.

The test succeeded now on all platforms.

Thanks, Richard.

-----Original Message-----
From: Reingruber, Richard
Sent: Friday, 24 July 2020 15:04
To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
Subject: RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Serguei,

> The fix itself looks good to me.

thanks for looking at the fix.

> I still need another look at new test.
> Could you, please, convert the agent of new test to C++?
> It will make it a little bit simpler.

Sure, here is the new webrev.1 with a C++ version of the test agent:

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/

I tested it on Linux and Windows but not yet on MacOS.

Thanks, Richard.

-----Original Message-----
From: serguei.spitsyn at oracle.com <serguei.spitsyn at oracle.com>
Sent: Friday, 24 July 2020 00:00
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Richard,

Thank you for filing the CR and taking care about it!
The fix itself looks good to me.
I still need another look at new test.
Could you, please, convert the agent of new test to C++?
It will make it a little bit simpler.

Thanks,
Serguei


On 7/20/20 01:15, Reingruber, Richard wrote:

Hi,

please help review this fix for VM_GetOrSetLocal. It moves the unsafe stackwalk from the vm operation prologue before the safepoint into the doit() method executed at the safepoint.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html
Bug: https://bugs.openjdk.java.net/browse/JDK-8249293

According to the JVMTI spec on local variable access it is not required to suspend the target thread T [1]. The operation will simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing bytecodes. It will succeed though if T is blocked because of synchronization or executing some native code.

The issue is that in the latter case the stack walk in VM_GetOrSetLocal::doit_prologue() to prepare the access to the local variable is unsafe, because it is done before the safepoint and it races with T returning to execute bytecodes making its stack not walkable. The included test shows that this can crash the VM if T wins the race.

Manual testing:

    - new test test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java
    - test/hotspot/jtreg/vmTestbase/nsk/jvmti
    - test/hotspot/jtreg/serviceability/jvmti

Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms

Thanks, Richard.

[1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local

From stefan.karlsson at oracle.com Fri Aug 14 14:48:09 2020
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 14 Aug 2020 16:48:09 +0200
Subject: RFR: 8251835: 8251374 breaks jmap -dump:all
Message-ID: <06845148-6769-c560-8593-2c44abe1b92b@oracle.com>

Hi all,

Please review this patch to fix a recently introduced jmap bug.

https://cr.openjdk.java.net/~stefank/8251835/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8251835

I added the same kind of checks that we have in histo.

Testing:
- Tested locally with the failing test
- Tier1-tier5 on Linux x64

Paul posted a slightly more elaborate fix that makes dump more akin to histo:
http://cr.openjdk.java.net/~phh/8251835/webrev.00/

I don't know the testing status of that patch.
If this needs to be fixed ASAP, I propose my fix, and then add the rest of Paul's bits as a follow-up RFE. If we have time to run Paul's patch through testing, then I'm fine with that as well.

Thanks,
StefanK

From hohensee at amazon.com Fri Aug 14 16:39:06 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Fri, 14 Aug 2020 16:39:06 +0000
Subject: RFR: 8251835: 8251374 breaks jmap -dump:all
Message-ID: <34BDC278-5544-4D82-8BD9-1DCE2199B95A@amazon.com>

Makes sense to me to do a followup. I've filed https://bugs.openjdk.java.net/browse/JDK-8251848.

I ran TEST="test/jdk/sun/tools/jmap/BasicJMapTest.java" JTREG="JAVA_OPTIONS=-XX:+UseParallelGC -XX:ParallelGCThreads=100" successfully, including your patch for 8251570.

This 8251835 patch looks good to me.

Thanks,
Paul

On 8/14/20, 7:49 AM, "Stefan Karlsson" wrote:

    Hi all,

    Please review this patch to fix a recently introduced jmap bug.

    https://cr.openjdk.java.net/~stefank/8251835/webrev.01/
    https://bugs.openjdk.java.net/browse/JDK-8251835

    I added the same kind of checks that we have in histo.

    Testing:
    - Tested locally with the failing test
    - Tier1-tier5 on Linux x64

    Paul posted a slightly more elaborate fix that makes dump more akin to histo:
    http://cr.openjdk.java.net/~phh/8251835/webrev.00/

    I don't know the testing status of that patch. If this needs to be fixed ASAP, I propose my fix, and then add the rest of Paul's bits as a follow-up RFE. If we have time to run Paul's patch through testing, then I'm fine with that as well.

    Thanks,
    StefanK

From stefan.karlsson at oracle.com Fri Aug 14 16:47:44 2020
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 14 Aug 2020 18:47:44 +0200
Subject: RFR: 8251835: 8251374 breaks jmap -dump:all
In-Reply-To: <34BDC278-5544-4D82-8BD9-1DCE2199B95A@amazon.com>
References: <34BDC278-5544-4D82-8BD9-1DCE2199B95A@amazon.com>
Message-ID:

On 2020-08-14 18:39, Hohensee, Paul wrote:
> Makes sense to me to do a followup. I've filed https://bugs.openjdk.java.net/browse/JDK-8251848.

Great.

>
> I ran TEST="test/jdk/sun/tools/jmap/BasicJMapTest.java" JTREG="JAVA_OPTIONS=-XX:+UseParallelGC -XX:ParallelGCThreads=100" successfully, including your patch for 8251570.
>
> This 8251835 patch looks good to me.

Thanks!

StefanK

>
> Thanks,
> Paul
>
> On 8/14/20, 7:49 AM, "Stefan Karlsson" wrote:
>
>     Hi all,
>
>     Please review this patch to fix a recently introduced jmap bug.
>
>     https://cr.openjdk.java.net/~stefank/8251835/webrev.01/
>     https://bugs.openjdk.java.net/browse/JDK-8251835
>
>     I added the same kind of checks that we have in histo.
>
>     Testing:
>     - Tested locally with the failing test
>     - Tier1-tier5 on Linux x64
>
>     Paul posted a slightly more elaborate fix that makes dump more akin to
>     histo:
>     http://cr.openjdk.java.net/~phh/8251835/webrev.00/
>
>     I don't know the testing status of that patch. If this needs to be fixed
>     ASAP, I propose my fix, and then add the rest of Paul's bits as a
>     follow-up RFE. If we have time to run Paul's patch through testing, then
>     I'm fine with that as well.
>
>     Thanks,
>     StefanK
>

From serguei.spitsyn at oracle.com Fri Aug 14 22:01:27 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 14 Aug 2020 15:01:27 -0700
Subject: RFR: 8251835: 8251374 breaks jmap -dump:all
In-Reply-To:
References: <34BDC278-5544-4D82-8BD9-1DCE2199B95A@amazon.com>
Message-ID:

Hi Stefan and Paul,

Thank you for taking care and fixing this regression!

It seems, the fix from Paul is more complete and is better to push after testing.
It looks good to me.

Unfortunately, there is very limited test coverage for this.

Thanks,
Serguei


On 8/14/20 09:47, Stefan Karlsson wrote:
> On 2020-08-14 18:39, Hohensee, Paul wrote:
>> Makes sense to me to do a followup. I've filed
>> https://bugs.openjdk.java.net/browse/JDK-8251848.
>
> Great.
>
>>
>> I ran TEST="test/jdk/sun/tools/jmap/BasicJMapTest.java"
>> JTREG="JAVA_OPTIONS=-XX:+UseParallelGC -XX:ParallelGCThreads=100"
>> successfully, including your patch for 8251570.
>>
>> This 8251835 patch looks good to me.
>
> Thanks!
>
> StefanK
>
>>
>> Thanks,
>> Paul
>>
>> On 8/14/20, 7:49 AM, "Stefan Karlsson"
>> wrote:
>>
>>      Hi all,
>>
>>      Please review this patch to fix a recently introduced jmap bug.
>>
>>      https://cr.openjdk.java.net/~stefank/8251835/webrev.01/
>>      https://bugs.openjdk.java.net/browse/JDK-8251835
>>
>>      I added the same kind of checks that we have in histo.
>>
>>      Testing:
>>      - Tested locally with the failing test
>>      - Tier1-tier5 on Linux x64
>>
>>      Paul posted a slightly more elaborate fix that makes dump more
>> akin to
>>      histo:
>>      http://cr.openjdk.java.net/~phh/8251835/webrev.00/
>>
>>      I don't know the testing status of that patch. If this needs to
>> be fixed
>>      ASAP, I propose my fix, and then add the rest of Paul's bits as a
>>      follow-up RFE. If we have time to run Paul's patch through
>> testing, then
>>      I'm fine with that as well.
>>
>>      Thanks,
>>      StefanK
>>

From linzang at tencent.com Fri Aug 14 22:57:03 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Fri, 14 Aug 2020 22:57:03 +0000
Subject: RFR: 8251835: 8251374 breaks jmap -dump:all(Internet mail)
In-Reply-To:
References: <34BDC278-5544-4D82-8BD9-1DCE2199B95A@amazon.com> ,
Message-ID: <57A5EE89-AA63-4AC1-9D7F-B3276FF28F71@tencent.com>

Dear All,
Sorry for making incomplete patch. And thanks for help fixing it.

BRs,
Lin

> On Aug 15, 2020, at 6:04 AM, "serguei.spitsyn at oracle.com" wrote:
>
> Hi Stefan and Paul,
>
> Thank you for taking care and fixing this regression!
>
> It seems, the fix from Paul is more complete and is better to push after testing.
> It looks good to me.
>
> Unfortunately, there is very limited test coverage for this.
>
> Thanks,
> Serguei
>
>
>> On 8/14/20 09:47, Stefan Karlsson wrote:
>>> On 2020-08-14 18:39, Hohensee, Paul wrote:
>>> Makes sense to me to do a followup.
>>> I've filed https://bugs.openjdk.java.net/browse/JDK-8251848.
>>
>> Great.
>>
>>>
>>> I ran TEST="test/jdk/sun/tools/jmap/BasicJMapTest.java" JTREG="JAVA_OPTIONS=-XX:+UseParallelGC -XX:ParallelGCThreads=100" successfully, including your patch for 8251570.
>>>
>>> This 8251835 patch looks good to me.
>>
>> Thanks!
>>
>> StefanK
>>
>>>
>>> Thanks,
>>> Paul
>>>
>>> On 8/14/20, 7:49 AM, "Stefan Karlsson" wrote:
>>>
>>> Hi all,
>>>
>>> Please review this patch to fix a recently introduced jmap bug.
>>>
>>> https://cr.openjdk.java.net/~stefank/8251835/webrev.01/
>>> https://bugs.openjdk.java.net/browse/JDK-8251835
>>>
>>> I added the same kind of checks that we have in histo.
>>>
>>> Testing:
>>> - Tested locally with the failing test
>>> - Tier1-tier5 on Linux x64
>>>
>>> Paul posted a slightly more elaborate fix that makes dump more akin to
>>> histo:
>>> http://cr.openjdk.java.net/~phh/8251835/webrev.00/
>>>
>>> I don't know the testing status of that patch. If this needs to be fixed
>>> ASAP, I propose my fix, and then add the rest of Paul's bits as a
>>> follow-up RFE. If we have time to run Paul's patch through testing, then
>>> I'm fine with that as well.
>>>
>>> Thanks,
>>> StefanK
>>>
>
>

From david.holmes at oracle.com Sat Aug 15 05:07:14 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 15 Aug 2020 15:07:14 +1000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To:
References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com> <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com>
Message-ID:

So am I right in thinking that what you did was forge a Merge changeset that actually did a backout, and then recreated the changeset correctly and pushed under the same bug number?

I don't think that is a process we want to endorse. A "Merge" changeset should be exactly that.

Also the bug needs updating with a link to the second changeset.

David
-----

On 14/08/2020 5:52 am, Daniel D. Daugherty wrote:
> s/like a change/like a charm/
>
> Typing too fast today...
>
> Dan
>
>
> On 8/13/20 3:51 PM, Daniel D. Daugherty wrote:
>> Stefan K's idea worked like a change. A corrected changeset has
>> been created, merged and pushed.
>>
>> Dan
>>
>>
>> On 8/13/20 3:34 PM, Daniel D. Daugherty wrote:
>>> Paul,
>>>
>>> Hold up on trying to fix this.
>>>
>>> I'm discussing another idea with Stefan K.
>>>
>>> Dan
>>>
>>>
>>> On 8/13/20 3:01 PM, Daniel D. Daugherty wrote:
>>>> That's something that's very hard to do. It would involve black listing
>>>> the existing changeset and repushing a new changeset. Black listing a
>>>> changeset is very, very rarely done and in the past Ops has declined to
>>>> do that for something like an authorship error.
>>>>
>>>> Two options:
>>>>
>>>> 1) Manually remember that this changeset should be credited to Lin
>>>>    as author.
>>>> 2a) [BACKOUT] the changeset using a new bug ID.
>>>> 2b) [REDO] the changeset with corrected author information with a
>>>> new bug ID.
>>>>
>>>> Dan
>>>>
>>>> On 8/13/20 2:36 PM, Hohensee, Paul wrote:
>>>>> I mistakenly committed and pushed Lin's patch with myself as
>>>>> author. Would someone with repo access please change the author to
>>>>> 'lzang'? Or tell me how to do it myself?
>>>>>
>>>>> https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469
>>>>>
>>>>> Thanks,
>>>>> Paul
>>>>>
>>>>> On 8/13/20, 9:48 AM, "serviceability-dev on behalf of Hohensee,
>>>>> Paul" hohensee at amazon.com> wrote:
>>>>>
>>>>>      Will do.
>>>>>
>>>>>      On 8/13/20, 7:08 AM, "linzang(臧琳)" wrote:
>>>>>
>>>>>          Thanks Paul!
>>>>>          May I ask your help to push it?
>>>>>
>>>>>          BRs,
>>>>>          Lin
>>>>>
>>>>>          > On Aug 13, 2020, at 10:06 PM, Hohensee, Paul
>>>>> wrote:
>>>>>          >
>>>>>          > +1, except that the indentation for the final 'else'
>>>>> clause needs to be 4 spaces instead of 3. :)
>>>>>          >
>>>>>          > Thanks,
>>>>>          > Paul
>>>>>          >
>>>>>          > On 8/12/20, 6:21 PM, "serguei.spitsyn at oracle.com"
>>>>> wrote:
>>>>>          >
>>>>>          >    Hi Lin.
>>>>>          >
>>>>>          >    Thank you for the update.
>>>>>          >    It looks good.
>>>>>          >
>>>>>          >    Thanks,
>>>>>          >    Serguei
>>>>>          >
>>>>>          >
>>>>>          >>    On 8/12/20 17:08, linzang(臧琳) wrote:
>>>>>          >> Hi Paul and Serguei,
>>>>>          >>      Thanks for your comments, here is the updated
>>>>> patch: http://cr.openjdk.java.net/~lzang/8251374/webrev02/
>>>>>          >>
>>>>>          >> BRs,
>>>>>          >> Lin
>>>>>          >>
>>>>>          >> On 2020/8/13, 12:55 AM, "serguei.spitsyn at oracle.com"
>>>>> wrote:
>>>>>          >>
>>>>>          >>     Hi Lin,
>>>>>          >>
>>>>>          >>     It looks good.
>>>>>          >>     Just one comment.
>>>>>          >>
>>>>>          >>          + System.err.println("Fail: invalid option: '"
>>>>> + subopt +"'");
>>>>>          >>          +               System.exit(1);
>>>>>          >>
>>>>>          >>     Exit needs to be replaced with usage for consistency.
>>>>>          >>
>>>>>          >>     Thanks,
>>>>>          >>     Serguei
>>>>>          >>
>>>>>          >>
>>>>>          >>>     On 8/10/20 19:57, linzang(臧琳) wrote:
>>>>>          >>> Here is the webrev:
>>>>> http://cr.openjdk.java.net/~lzang/8251374/webrev01/
>>>>>          >>>
>>>>>          >>> BRs,
>>>>>          >>> Lin
>>>>>          >>>
>>>>>          >>>> On 2020/8/11, 10:52 AM, "linzang(臧琳)"
>>>>> wrote:
>>>>>          >>>
>>>>>          >>>     Hi All,
>>>>>          >>>          May I ask your help to review this tiny
>>>>> patch? It fixes an issue that jmap -dump could wrongly accept invalid
>>>>> options.
>>>>>          >>>          Bugs:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8251374
>>>>>          >>>          Patch:  (Can not connect to webrev ftp
>>>>> currently, will try it later, following are all code changes)
>>>>>          >>>
>>>>>          >>>     ################################
>>>>>          >>>     --- old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:32.044567791 +0800
>>>>>          >>>     +++ new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:31.876568681 +0800
>>>>>          >>>     @@ -207,6 +207,11 @@
>>>>>          >>>                      liveopt = "-live";
>>>>>          >>>                  } else if (subopt.startsWith("file=")) {
>>>>>          >>>                      filename = parseFileName(subopt);
>>>>>          >>>     +            } else if (subopt.equals("format=b")) {
>>>>>          >>>     +                // ignore format (not needed at this time)
>>>>>          >>>     +            } else {
>>>>>          >>>     +                System.err.println("Fail: invalid option: '" + subopt +"'");
>>>>>          >>>     +               System.exit(1);
>>>>>          >>>                  }
>>>>>          >>>              }
>>>>>          >>>     ################################
>>>>>          >>>
>>>>>          >>>     Thanks,
>>>>>          >>>     Lin
>>>>>          >>>
>>>>>          >>>
>>>>>          >>
>>>>>          >>
>>>>>          >>
>>>>>          >
>>>>>          >
>>>>>
>>>>>
>>>>
>>>
>>
>

From daniel.daugherty at oracle.com Sat Aug 15 13:57:00 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Sat, 15 Aug 2020 09:57:00 -0400
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To:
References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com> <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com>
Message-ID: <555f6455-4900-9af0-a3b6-89f1c33ab490@oracle.com>

First, Paul sent his request on the wrong email thread and I didn't
notice until just now that we should have been using this one:
    Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Fortunately, the correction was applied to the proper changeset.

The process that Stefan K. proposed and that I executed was this:

$ hg export 5036ca733469 > 8215624.patch


# Switch the repo to the parent changeset of 5036ca733469:
$ hg update 7b7be8c2b336

# Import the fixed patch as a sibling changeset:
$ hg import 8215624.patch

# Make sure it looks correct:
$ hg log -r tip
changeset:   60553:b1afb7c82d59
parent:      60550:7b7be8c2b336
user:        lzang
date:        Thu Aug 13 11:31:37 2020 -0700
summary:     8215624: Add parallel heap iteration for jmap -histo

# Switch the repo back to the main line:
$ hg update

# Merge the two branches:
$ hg merge

# Commit the "Merge":
$ hg commit


# Push the repair:
$ hg push


The result of this repair is almost identical to the same patch being backported to JDK-(N-1) after being pushed first to JDK-N. When JDK-(N-1) is sync'ed into JDK-N, we'll have two (almost identical) changesets in JDK-N for the same bug ID.

In the usual backport scenario, the only difference is the changeset ID. In this case, the only differences are the changeset ID and the author.

So no [BACKOUT] was involved at all and the "Merge" changeset is exactly that: a merge.

Paul added a link to the corrected changeset two days ago. I added a comment with a handcrafted "notification" comment this morning.

Dan


On 8/15/20 1:07 AM, David Holmes wrote:
> So am I right in thinking that what you did was forge a Merge
> changeset that actually did a backout, and then recreated the
> changeset correctly and pushed under the same bug number?
>
> I don't think that is a process we want to endorse. A "Merge"
> changeset should be exactly that.
>
> Also the bug needs updating with a link to the second changeset.
>
> David
> -----

From david.holmes at oracle.com Mon Aug 17 01:17:59 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 17 Aug 2020 11:17:59 +1000
Subject: RFR(S):8251374:jmap -dump should not accept invalid options(Internet mail)
In-Reply-To: <555f6455-4900-9af0-a3b6-89f1c33ab490@oracle.com>
References: <8CC023AF-FDA1-421B-86DD-1C192251565D@amazon.com> <799E79AA-515C-4047-924C-D50E8A31BF77@amazon.com> <555f6455-4900-9af0-a3b6-89f1c33ab490@oracle.com>
Message-ID:

Hi Dan,

On 15/08/2020 11:57 pm, Daniel D. Daugherty wrote:
> First, Paul sent his request on the wrong email thread and I didn't
> notice until just now that we should have been using this one:

Yes this has all been very confusing.

>     Subject: RFR(L): 8215624: add parallel heap inspection support for
> jmap histo(G1)(Internet mail)
>
> Fortunately, the correction was applied to the proper changeset.
>
> The process that Stefan K. proposed and that I executed was this:

Thanks for explaining. Now I can see the correct sequence of changes pertaining to 8215624.

David
-----

> $ hg export 5036ca733469 > 8215624.patch
>
>
> # Switch the repo to the parent changeset of 5036ca733469:
> $ hg update 7b7be8c2b336
>
> # Import the fixed patch as a sibling changeset:
> $ hg import 8215624.patch
>
> # Make sure it looks correct:
> $ hg log -r tip
> changeset:   60553:b1afb7c82d59
> parent:      60550:7b7be8c2b336
> user:        lzang
> date:        Thu Aug 13 11:31:37 2020 -0700
> summary:     8215624: Add parallel heap iteration for jmap -histo
>
> # Switch the repo back to the main line:
> $ hg update
>
> # Merge the two branches:
> $ hg merge
>
> # Commit the "Merge":
> $ hg commit
>
>
> # Push the repair:
> $ hg push
>
>
> The result of this repair is almost identical to the same patch being
> backported to JDK-(N-1) after being pushed first to JDK-N. When JDK-(N-1)
> is sync'ed into JDK-N, we'll have two (almost identical) changesets in
> JDK-N for the same bug ID.
>
> In the usual backport scenario, the only difference is the changeset ID.
> In this case, the only differences are the changeset ID and the author.
>
> So no [BACKOUT] was involved at all and the "Merge" changeset is exactly
> that: a merge.
>
> Paul added a link to the corrected changeset two days ago. I added a
> comment with a handcrafted "notification" comment this morning.
> > Dan > > > > > On 8/15/20 1:07 AM, David Holmes wrote: >> So am I right in thinking that what you did was forge a Merge >> changeset that actually did a backout, and then recreated the >> changeset correctly and pushed under the same bug number? >> >> I don't think that is a process we want to endorse. A "Merge" >> changeset should be exactly that. >> >> Also the bug needs updating with a link to the second changeset. >> >> David >> ----- >> >> On 14/08/2020 5:52 am, Daniel D. Daugherty wrote: >>> s/like a change/like a charm/ >>> >>> Typing too fast today... >>> >>> Dan >>> >>> >>> On 8/13/20 3:51 PM, Daniel D. Daugherty wrote: >>>> Stefan K's idea worked like a change. A corrected changeset has >>>> been created, merged and pushed. >>>> >>>> Dan >>>> >>>> >>>> On 8/13/20 3:34 PM, Daniel D. Daugherty wrote: >>>>> Paul, >>>>> >>>>> Hold up on trying to fix this. >>>>> >>>>> I'm discussing another idea with Stefan K. >>>>> >>>>> Dan >>>>> >>>>> >>>>> On 8/13/20 3:01 PM, Daniel D. Daugherty wrote: >>>>>> That's something that's very hard to do. It would involve black >>>>>> listing >>>>>> the existing changeset and repushing a new changeset. Black listing a >>>>>> changeset is very, very rarely done and in the past Ops has >>>>>> declined to >>>>>> do that for something like an authorship error. >>>>>> >>>>>> Two options: >>>>>> >>>>>> 1) Manually remember that this changeset should be credited to Lin >>>>>> ?? as author. >>>>>> 2a) [BACKOUT] the changeset using a new bug ID. >>>>>> 2b) [REDO] the changeset with corrected author information with a >>>>>> new bug ID. >>>>>> >>>>>> Dan >>>>>> >>>>>> On 8/13/20 2:36 PM, Hohensee, Paul wrote: >>>>>>> I mistakenly committed and pushed Lin's patch with myself as >>>>>>> author. Would someone with repo access please change the author >>>>>>> to 'lzang'? Or tell me how to do it myself? 
>>>>>>>
>>>>>>> https://hg.openjdk.java.net/jdk/jdk/rev/5036ca733469
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Paul
>>>>>>>
>>>>>>> On 8/13/20, 9:48 AM, "serviceability-dev on behalf of Hohensee, Paul" hohensee at amazon.com> wrote:
>>>>>>>
>>>>>>>     Will do.
>>>>>>>
>>>>>>>     On 8/13/20, 7:08 AM, "linzang(臧琳)" wrote:
>>>>>>>
>>>>>>>         Thanks Paul!
>>>>>>>         May I ask your help to push it?
>>>>>>>
>>>>>>>         BRs,
>>>>>>>         Lin
>>>>>>>
>>>>>>>         > On Aug 13, 2020, at 10:06 PM, Hohensee, Paul wrote:
>>>>>>>         >
>>>>>>>         > +1, except that the indentation for the final 'else' clause needs to be 4 spaces instead of 3. :)
>>>>>>>         >
>>>>>>>         > Thanks,
>>>>>>>         > Paul
>>>>>>>         >
>>>>>>>         > On 8/12/20, 6:21 PM, "serguei.spitsyn at oracle.com" wrote:
>>>>>>>         >
>>>>>>>         >    Hi Lin.
>>>>>>>         >
>>>>>>>         >    Thank you for the update.
>>>>>>>         >    It looks good.
>>>>>>>         >
>>>>>>>         >    Thanks,
>>>>>>>         >    Serguei
>>>>>>>         >
>>>>>>>         >>    On 8/12/20 17:08, linzang(臧琳) wrote:
>>>>>>>         >> Hi Paul and Serguei,
>>>>>>>         >>      Thanks for your comments, here is the updated patch: http://cr.openjdk.java.net/~lzang/8251374/webrev02/
>>>>>>>         >>
>>>>>>>         >> BRs,
>>>>>>>         >> Lin
>>>>>>>         >>
>>>>>>>         >> On 2020/8/13, 12:55 AM, "serguei.spitsyn at oracle.com" wrote:
>>>>>>>         >>
>>>>>>>         >>     Hi Lin,
>>>>>>>         >>
>>>>>>>         >>     It looks good.
>>>>>>>         >>     Just one comment.
>>>>>>>         >>
>>>>>>>         >>          + System.err.println("Fail: invalid option: '" + subopt +"'");
>>>>>>>         >>          + System.exit(1);
>>>>>>>         >>
>>>>>>>         >>     Exit needs to be replaced with usage for consistency.
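For readers following the review, the sub-option handling under discussion can be sketched in isolation. This is only an illustration, not the JMap source: the class `DumpOptionSketch`, the holder `DumpOptions`, the `"-all"` default, and the simplified `file=` handling are assumptions made for the example. Only the `live`, `file=` and `format=b` branches come from the patch quoted in this thread. The sketch reports an unknown sub-option by throwing, so that a caller could print usage instead of calling `System.exit` directly, which is the direction of the comment above.

```java
// Hypothetical, self-contained sketch of the -dump sub-option parsing discussed above.
class DumpOptionSketch {

    /** Hypothetical result holder; not a type from the JDK sources. */
    static final class DumpOptions {
        String liveOpt = "-all";   // assumed default for this sketch
        String filename = null;
    }

    /**
     * Parses a comma-separated sub-option string such as "live,file=heap.bin,format=b".
     * Unknown sub-options are rejected with an IllegalArgumentException so that the
     * caller can decide how to report the error (for example by printing usage).
     */
    static DumpOptions parse(String options) {
        DumpOptions result = new DumpOptions();
        for (String subopt : options.split(",")) {
            if (subopt.equals("live")) {
                result.liveOpt = "-live";
            } else if (subopt.startsWith("file=")) {
                // Simplified stand-in for the real file name parsing.
                result.filename = subopt.substring("file=".length());
            } else if (subopt.equals("format=b")) {
                // Accepted but ignored, as in the patch under review.
            } else {
                throw new IllegalArgumentException("invalid option: '" + subopt + "'");
            }
        }
        return result;
    }
}
```

A caller would wrap `DumpOptionSketch.parse(...)` in a try/catch and print its usage text when the exception is thrown.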
>>>>>>>         >>
>>>>>>>         >>     Thanks,
>>>>>>>         >>     Serguei
>>>>>>>         >>
>>>>>>>         >>>     On 8/10/20 19:57, linzang(臧琳) wrote:
>>>>>>>         >>> Here is the webrev: http://cr.openjdk.java.net/~lzang/8251374/webrev01/
>>>>>>>         >>>
>>>>>>>         >>> BRs,
>>>>>>>         >>> Lin
>>>>>>>         >>>
>>>>>>>         >>>> On 2020/8/11, 10:52 AM, "linzang(臧琳)" wrote:
>>>>>>>         >>>
>>>>>>>         >>>     Hi All,
>>>>>>>         >>>          May I ask your help to review this tiny patch? It fixes an issue that jmap -dump could wrongly accept invalid options.
>>>>>>>         >>>          Bugs: https://bugs.openjdk.java.net/browse/JDK-8251374
>>>>>>>         >>>          Patch:  (Can not connect to webrev ftp currently, will try it later, following are all code changes)
>>>>>>>         >>>
>>>>>>>         >>> ################################
>>>>>>>         >>>     --- old/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:32.044567791 +0800
>>>>>>>         >>>     +++ new/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java 2020-08-11 10:42:31.876568681 +0800
>>>>>>>         >>>     @@ -207,6 +207,11 @@
>>>>>>>         >>>                      liveopt = "-live";
>>>>>>>         >>>                  } else if (subopt.startsWith("file=")) {
>>>>>>>         >>>                      filename = parseFileName(subopt);
>>>>>>>         >>>     +            } else if (subopt.equals("format=b")) {
>>>>>>>         >>>     +                // ignore format (not needed at this time)
>>>>>>>         >>>     +            } else {
>>>>>>>         >>>     + System.err.println("Fail: invalid option: '" + subopt +"'");
>>>>>>>         >>>     + System.exit(1);
>>>>>>>         >>>                  }
>>>>>>>         >>>              }
>>>>>>>         >>> ################################
>>>>>>>         >>>
>>>>>>>         >>>     Thanks,
>>>>>>>         >>>     Lin
>>>>>>>         >>>

From dms at samersoff.net Mon Aug 17 07:21:52 2020
From: dms at samersoff.net (Dmitry Samersoff)
Date: Mon, 17 Aug 2020 10:21:52 +0300
Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux
Message-ID: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net>

Hello Everybody,

Please review the fix:

https://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.01/

Binding to IN6ADDR_ANY allows us to serve both IPv4 and IPv6 connections,
but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allows us to serve IPv4
connections only.

So make sure that IN6ADDR_ANY is preferred over mapped INADDR_ANY if
preferredAddressFamily is not AF_INET.

-Dmitry\S

From linzang at tencent.com Mon Aug 17 09:17:09 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Mon, 17 Aug 2020 09:17:09 +0000
Subject: [Discussion] Expected behavior of combining "all" and "live" options of jmap
Message-ID: <16D02038-2F95-4FE0-860F-D0CE134BE871@tencent.com>

Dear all,

We found that jmap's histo/dump command accepts the "live" and "all" options together, and the specification does not describe the expected behavior in that case.
I have observed that when these two options are used together, "live" takes effect, regardless of the order in which they appear on the command line.
IMO, it is a little confusing to use "live" and "all" together, and if it is allowed, the specification may need to be updated to state the behavior clearly.
Therefore may I ask for your suggestions on which of the following options is preferred:

(option 1.) disallow using these two options together. I think this is clearer, but I am not sure whether there is a backward compatibility risk.
(option 2.)
allow the combined use of "live" and "all", and update the specification to clearly describe that "live" takes effect in this case.

What do you think?

Thanks,
Lin

From fairoz.matte at oracle.com Mon Aug 17 12:46:37 2020
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Mon, 17 Aug 2020 05:46:37 -0700 (PDT)
Subject: RFR(s): 8248295: serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal
Message-ID:

Hi,

Please review this small test change to work with Graal.

Background:
Graal requires more code cache than c1/c2, but the test case always sets it to 20MB. This may not be sufficient when running Graal.
Default configuration for ReservedCodeCacheSize = 250MB
With Graal enabled, ReservedCodeCacheSize = 350MB

Either we can modify the framework to honor ReservedCodeCacheSize for Graal or just update the test case.
There are not many test cases that rely on ReservedCodeCacheSize or InitialCodeCacheSize, so the fix prefers the latter.

JBS - https://bugs.openjdk.java.net/browse/JDK-8248295
Webrev - http://cr.openjdk.java.net/~fmatte/8248295/webrev.00/

Thanks,
Fairoz

From jiefu at tencent.com Mon Aug 17 15:13:03 2020
From: jiefu at tencent.com (jiefu(傅杰))
Date: Mon, 17 Aug 2020 15:13:03 +0000
Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail)
Message-ID: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com>

Ping?

Any comments?

Thanks.
Best regards,
Jie

From: serviceability-dev on behalf of "jiefu(傅杰)"
Date: Friday, August 7, 2020 at 7:44 AM
To: "serviceability-dev at openjdk.java.net"
Subject: Re: RFR: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail)

FYI: This bug will lead to failures of the following tests on machines whose hostname starts with digits.
- test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java
- test/jdk/sun/tools/jstatd/TestJstatdPort.java
- test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java
- test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java

So it's worth fixing.

Testing:
- tier1-3 on Linux/x64

Thanks.
Best regards,
Jie

From: "jiefu(傅杰)"
Date: Wednesday, August 5, 2020 at 3:19 PM
To: "serviceability-dev at openjdk.java.net"
Subject: RFR: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits

Hi all,

JBS: https://bugs.openjdk.java.net/browse/JDK-8251155
Webrev: http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/

HostIdentifier fails to canonicalize hostname:port if the hostname starts with digits.
The current implementation ends up with "scheme = hostname", but a scheme must not start with a digit, which leads to this bug.

Thanks a lot.
Best regards,
Jie

From vladimir.kozlov at oracle.com Mon Aug 17 17:52:22 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 17 Aug 2020 10:52:22 -0700
Subject: RFR(s): 8248295: serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal
In-Reply-To:
References:
Message-ID: <1bdcbd35-097e-1681-3a0c-32f9709497a4@oracle.com>

Hi Fairoz,

How did you determine that +10Mb is enough with Graal?

Thanks,
Vladimir

On 8/17/20 5:46 AM, Fairoz Matte wrote:
> Hi,
>
> Please review this small test change to work with Graal.
>
> Background:
>
> Graal requires more code cache than c1/c2, but the test case always sets it to 20MB. This may not be sufficient when running Graal.
>
> Default configuration for ReservedCodeCacheSize = 250MB
>
> With Graal enabled, ReservedCodeCacheSize = 350MB
>
> Either we can modify the framework to honor ReservedCodeCacheSize for Graal or just update the test case.
>
> There are not many test cases that rely on ReservedCodeCacheSize or InitialCodeCacheSize, so the fix prefers the latter.
> > > > JBS - https://bugs.openjdk.java.net/browse/JDK-8248295 > > Webrev - http://cr.openjdk.java.net/~fmatte/8248295/webrev.00/ > > > > Thanks, > > Fairoz > > > From adityam at microsoft.com Mon Aug 17 19:07:22 2020 From: adityam at microsoft.com (Aditya Mandaleeka) Date: Mon, 17 Aug 2020 19:07:22 +0000 Subject: Protecting references from GC in JDI tests Message-ID: Hi serviceability-dev, I hope this is the right list for this topic, but feel free to redirect if not... It appears that there are jtreg tests that exercise JDI functionality without protecting target objects from being GC'd. An example of this is com/sun/jdi/VarargsTest.java, where references are acquired (with mirrorOf) and then used as args in invokeMethod. I haven't looked into all the other JDI tests to see if there are others with the same issue. This issue was exposed in our test runs with Shenandoah GC in aggressive heuristics mode (which does back to back GCs), but of course it can also be reproduced by inducing a GC in the target VM explicitly before the invoke. While this is unlikely to occur in practice when not using a special GC mode, it seems to me that the tests should not be relying on the fact that a GC _probably_ won't occur, and instead explicitly disable collection on the objects that are going to be used. After all, it is specified in the docs that ObjectReference values returned by JDI may be collected at any time the target VM is running unless disableCollection() is called on them [0], so the test code is implicitly relying on lifetime guarantees that are not provided. I think the test(s) could be improved by calling disableCollection on any references in the target VM prior to using them. 
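As a rough sketch of the proposal to pin mirrors before use: the helper below obtains a mirror from a caller-supplied factory, calls `ObjectReference.disableCollection()` on it, and tries again with a fresh mirror if the object was already collected. The class `PinnedMirrorSketch`, the method `pin`, and the retry count are hypothetical names invented for this illustration. It assumes that a collected object surfaces as `ObjectCollectedException` from `disableCollection()`, which the JDI documentation describes for `ObjectReference` methods in general.

```java
import com.sun.jdi.ObjectCollectedException;
import com.sun.jdi.ObjectReference;
import java.util.function.Supplier;

// Hypothetical helper for a JDI test; not part of the JDK test library.
class PinnedMirrorSketch {

    /**
     * Creates a mirror via mirrorFactory and disables collection on it.
     * If the target object was collected before it could be pinned,
     * a fresh mirror is requested, up to maxAttempts times in total.
     * The caller is responsible for calling enableCollection() when done.
     */
    static <R extends ObjectReference> R pin(Supplier<R> mirrorFactory, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ObjectCollectedException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            R ref = mirrorFactory.get();
            try {
                ref.disableCollection();   // pin the object in the target VM
                return ref;
            } catch (ObjectCollectedException e) {
                last = e;                  // collected before pinning; try a new mirror
            }
        }
        throw last;
    }
}
```

In a test this might be used as `StringReference s = PinnedMirrorSketch.pin(() -> vm.mirrorOf("abc"), 3);`, with a matching `s.enableCollection()` once the reference has been passed to `invokeMethod`.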
There is of course a chance that an object could be GC'd before the disableCollection call, but I inspected the code for this case and it appears the JDWP error code gets converted into an ObjectCollectedException surfaced to user code which we could handle in the test (and perhaps retry the operation a few times before giving up). Is my understanding correct here? What are your thoughts on this? Is there interest in these tests being fixed in this way? Thanks, Aditya [0]: https://docs.oracle.com/en/java/javase/11/docs/api/jdk.jdi/com/sun/jdi/ObjectReference.html#disableCollection() From daniel.daugherty at oracle.com Mon Aug 17 19:21:05 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 17 Aug 2020 15:21:05 -0400 Subject: Protecting references from GC in JDI tests In-Reply-To: References: Message-ID: Aditya, I think you've found the right alias... A similar observation was made here: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030635.html It looks like that conversation didn't go beyond Egor's original message and Chris P's reply. My recommendation would be to only use ObjectReference.DisableCollection() when you have observed a specific failure for a specific object in a test. I would not apply that function generally. However, Chris P. or another member of the Serviceability team may have other guidance... Dan On 8/17/20 3:07 PM, Aditya Mandaleeka wrote: > Hi serviceability-dev, > > I hope this is the right list for this topic, but feel free to redirect if not... > > It appears that there are jtreg tests that exercise JDI functionality without protecting target > objects from being GC'd. An example of this is com/sun/jdi/VarargsTest.java, where references are > acquired (with mirrorOf) and then used as args in invokeMethod. I haven't looked into all the > other JDI tests to see if there are others with the same issue. 
> > This issue was exposed in our test runs with Shenandoah GC in aggressive heuristics mode (which does > back to back GCs), but of course it can also be reproduced by inducing a GC in the target VM > explicitly before the invoke. > > While this is unlikely to occur in practice when not using a special GC mode, it seems to me that > the tests should not be relying on the fact that a GC _probably_ won't occur, and instead explicitly > disable collection on the objects that are going to be used. After all, it is specified in the docs > that ObjectReference values returned by JDI may be collected at any time the target VM is running > unless disableCollection() is called on them [0], so the test code is implicitly relying on lifetime > guarantees that are not provided. > > I think the test(s) could be improved by calling disableCollection on any references in the target > VM prior to using them. There is of course a chance that an object could be GC'd before the > disableCollection call, but I inspected the code for this case and it appears the JDWP error code > gets converted into an ObjectCollectedException surfaced to user code which we could handle in > the test (and perhaps retry the operation a few times before giving up). Is my understanding > correct here? > > What are your thoughts on this? Is there interest in these tests being fixed in this way? > > Thanks, > Aditya > > [0]: https://docs.oracle.com/en/java/javase/11/docs/api/jdk.jdi/com/sun/jdi/ObjectReference.html#disableCollection() > From alexey.menkov at oracle.com Mon Aug 17 21:19:20 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 17 Aug 2020 14:19:20 -0700 Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux In-Reply-To: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> References: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> Message-ID: <440537ce-f423-163e-a4aa-7fd006a8433f@oracle.com> Hi Dmitry, In general the fix looks good to me. 
Some notes: please update copyright year; I'd rename compareIPv6Addr to something like isEqualIPv6Addr or equalsIPv6Addr; Also both parameters should be const; 737 // Try to find bind address of preferred address familty first "familty" -> "family" --alex On 08/17/2020 00:21, Dmitry Samersoff wrote: > Hello Everybody, > > Please review the fix: > > https://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.01/ > > Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 connections, > but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow us to serve IPv4 > connections only. > > So make sure, that IN6ADDR_ANY is preferred over mapped INADDR_ANY if > preferredAddressFamily is not AF_INET > > > -Dmitry\S > From alexey.menkov at oracle.com Mon Aug 17 21:23:51 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 17 Aug 2020 14:23:51 -0700 Subject: Ping: RFR: JDK-8234808: jdb quoted option parsing broken In-Reply-To: References: Message-ID: <4528e3f0-a111-7a18-6abb-2250760c4daa@oracle.com> On 08/07/2020 15:09, Alex Menkov wrote: > Hi all, > > please review the fix for > https://bugs.openjdk.java.net/browse/JDK-8234808 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/ > > Some background: > when jdb launches debuggee process it passes java options from "options" > value for CommandLineLaunch connector and forward options specified > before command. > > The fix solves several discovered issues: > - proper handling of java options with spaces > - if both way are used to specify java options, forwarded options > override options from "options" value > > VMConnection class implements tricky logic for "options" field parsing > for JFR needs (handling of single and double quotes). I decided to keep > it as is to avoid massive test failures with JFR (there is no test > coverage for this functionality and I'm not sure I understand all > requirements). 
> > --alex From serguei.spitsyn at oracle.com Tue Aug 18 00:20:50 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 17 Aug 2020 17:20:50 -0700 Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux In-Reply-To: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> References: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> Message-ID: An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Tue Aug 18 05:20:07 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 18 Aug 2020 15:20:07 +1000 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> Message-ID: <4040c9b2-8bcb-65d3-9263-c23e9d7740f1@oracle.com> Hi Richard, The test seems a lot clearer to me now. I'll leave it to you are Serguei to iron out any last wrinkles as I am disappearing on vacation for a week after today. But you have my Review. Thanks, David On 15/08/2020 12:06 am, Reingruber, Richard wrote: > Hi Serguei, > > thanks for the feedback. I have implemented your suggestions and created a new > webrev: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4/ > Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/ > > Please find my replies to your comments below. > > Best regards, > Richard. > >> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html > >> 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it >> ... > >> 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. 
>> 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. >> ... > >> 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. >> ... > >> 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); > >> Could you, please, re-balance the lines above to make them shorter? > > Ok, done. > > >> 90 int newTargetDepth = recursiveMethod(0, targetDepth); >> 91 if (newTargetDepth < targetDepth) { >> 92 msg("StackOverflowError during test."); >> 93 msg("Old target depth: " + targetDepth); >> 94 msg("Retry with new target depth: " + newTargetDepth); >> 95 targetDepth = newTargetDepth; >> 96 } >> A comment is needed to explain why a StackOverflowError is not desired. >> At least, it is not obvious initially. >> 73 public int waitTimeInNativeAfterNotify; > >> This name is unreasonably long which makes the code less readable. >> I'd suggest to reduce it to waitTime. > > Ok, done. > >> 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); >> ... >> 119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); > >> It is better to provide a short comment before each call explaining what it is doing. >> For instance, it is not clear why the call at the line 103 is needed. >> Why do we need to notify the agent to GetLocal for the second time? > > The test is repeated TEST_ITERATIONS times. In each iteration the agent calls > GetLocal racing the target thread returning from the native call. The last call > in line 103 ist the shutdown signal. > >> Can it be refactored into a separate native method? > > I've made the shutdown process more explicit with the new native method > shutDown() which sets thest_state to ShutDown. > >> Then the the function name can be reduced to 'notifyAgentToGetLocal'. >> This long name does not give enough context anyway. > > Ok, done. 
> >> 85 long iterations = 0; >> 87 do { >> ... >> 97 iterations++; >> ... >> 102 } while (iterations < TEST_ITERATIONS); > >> Why a more explicit 'for' or 'while' loop is not used here? : >> for (long iter = 0; iter < TEST_ITERATIONS; iter++) { > > I have converted the loop into a for loop. > >> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html > >> The indent in this file varies. It is better to keep it the same: 4 or 2. > > Yes, I noticed this. I have not corrected it yet, because I didn't want to > pullute the incremental webrev with that change. Would you like me to fix the > indentation now to 2 spaces or do it as a last step? > >> 60 AgentCallingGetLocalObject // The target thread waits for the agent to call >> I'd suggest to rename the constant to 'AgentInGetLocal'. > > Ok, done. > >> 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) { > >> It is better rename the function to TestThreadLoop. > > Would AgentThreadLoop be ok too? > >> You can add a comment before to explain some basic about what it is doing. > > Ok, done. > >> 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n", >> It is better to get rid of leading stars in all messages. > > Ok, done. > >> 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly >> The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. > > Ok, done. > > --- > From: serguei.spitsyn at oracle.com > Sent: Freitag, 14. 
August 2020 10:11 > To: Reingruber, Richard ; David Holmes ; serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Richard, > > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html > > 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it > ... > > 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. > 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. > ... > > 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. > ... > > 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); > > Could you, please, re-balance the lines above to make them shorter? > > > ?90 int newTargetDepth = recursiveMethod(0, targetDepth); > 91 if (newTargetDepth < targetDepth) { > 92 msg("StackOverflowError during test."); > 93 msg("Old target depth: " + targetDepth); > 94 msg("Retry with new target depth: " + newTargetDepth); > 95 targetDepth = newTargetDepth; > 96 } > A comment is needed to explain why a StackOverflowError is not desired. > At least, it is not obvious initially. > 73 public int waitTimeInNativeAfterNotify; > > This name is unreasonably long which makes the code less readable. > I'd suggest to reduce it to waitTime. > > 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); > ... > 119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); > > It is better to provide a short comment before each call explaining what it is doing. > For instance, it is not clear why the call at the line 103 is needed. 
> Why do we need to notify the agent to GetLocal for the second time? > Can it be refactored into a separate native method? > Then the the function name can be reduced to 'notifyAgentToGetLocal'. > This long name does not give enough context anyway. > 85 long iterations = 0; > 87 do { > ... > 97 iterations++; > ... > 102 } while (iterations < TEST_ITERATIONS); > > Why a more explicit 'for' or 'while' loop is not used here? : > for (long iter = 0; iter < TEST_ITERATIONS; iter++) { > > > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html > > The indent in this file varies. It is better to keep it the same: 4 or 2. > > 60 AgentCallingGetLocalObject // The target thread waits for the agent to call > I'd suggest to rename the constant to 'AgentInGetLocal'. > 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) { > > It is better rename the function to TestThreadLoop. > You can add a comment before to explain some basic about what it is doing. > 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n", > It is better to get rid of leading stars in all messages. > 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly > The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. > > > I'm still reviewing the test native agent code. > > > Thanks, > Serguei > > > On 8/11/20 03:02, Reingruber, Richard wrote: > Hi David and Serguei, > > On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: > Hi Richard and David, > > The implementation looks good to me. > > But I do not understand what the test is doing with all this counters > and recursions. 
> > For instance, these fragments: > > 86 recursions = 0; > 87 try { > 88 recursiveMethod(1<<20); > 89 } catch (StackOverflowError e) { > 90 msg("Caught StackOverflowError as expected"); > 91 } > 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh > > It is not obvious that the 'recursion' is updated in the recursiveMethod. > I would suggestto make it more explicit: > recursiveMethod(M); > int target_depth = M - 100; > > Then the variable 'recursions' can be removed or become local. > > The recursiveMethod takes in the maximum recursions to try and updates > the recursions variable to record how many recursions were possible - so: > > target_depth = - 100; > > Possibly recursiveMethod could return the actual recursions instead of > using the global variables? > > I've eliminated the static 'recursions' variable. recursiveMethod() now returns > the depth at which the recursion was ended. I hesitated doing this, because I > had to handle the StackOverflowError with all those frames still on stack. But > the handler is empty, so it should not cause problems. > > This is the new webrev (as posted previously): > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3/ > Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/ > > Thanks, Richard. > > -----Original Message----- > From: David Holmes mailto:david.holmes at oracle.com > Sent: Dienstag, 11. August 2020 04:00 > To: mailto:serguei.spitsyn at oracle.com; Reingruber, Richard mailto:richard.reingruber at sap.com; mailto:serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > > On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: > Hi Richard and David, > > The implementation looks good to me. > > But I do not understand what the test is doing with all this counters > and recursions. 
> > For instance, these fragments: > > 86 recursions = 0; > 87 try { > 88 recursiveMethod(1<<20); > 89 } catch (StackOverflowError e) { > 90 msg("Caught StackOverflowError as expected"); > 91 } > 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh > > It is not obvious that the 'recursion' is updated in the recursiveMethod. > I would suggestto make it more explicit: > ?? recursiveMethod(M); > ?? int target_depth = M - 100; > > Then the variable 'recursions' can be removed or become local. > > The recursiveMethod takes in the maximum recursions to try and updates > the recursions variable to record how many recursions were possible - so: > > target_depth = - 100; > > Possibly recursiveMethod could return the actual recursions instead of > using the global variables? > > David > ----- > > This method will be: > > 47 private static final int M = 1 << 20; > ... > 121 public long recursiveMethod(int depth) { > 123 if (depth == 0) { > 124 notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify); > 126 } else { > 127 recursiveMethod(--depth); > 128 } > 129 } > > > At least, he test is missing the comments explaining all these. > > Thanks, > Serguei > > > > On 8/9/20 22:35, David Holmes wrote: > Hi Richard, > > On 31/07/2020 5:28 pm, Reingruber, Richard wrote: > Hi, > > I rebase the fix after JDK-8250042. > > New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/ > > The general fix for this seems good. A minor nit: > > ?588???? if (!is_assignable(signature, ob_k, Thread::current())) { > > You know that the current thread is the VMThread so can use > VMThread::vm_thread(). > > Similarly for this existing code: > > ?694???? Thread* current_thread = Thread::current(); > > --- > > Looking at the test code ... I'm less clear on exactly what is > happening and the use of spin-waits raises some red-flags for me in > terms of test reliability on different platforms. 
The "while > (--waitCycles > 0)" loop in particular offers no certainty that the > agent thread is executing anything in particular. And the use of the > spin_count as a guide to future waiting time seems somewhat arbitrary. > In all seriousness I got a headache trying to work out how the test > was expecting to operate. Some parts could be simplified using raw > monitors, I think. But there's no sure way to know the agent thread is > in the midst of the stackwalk when the target thread wants to leave > the native code. So I understand what you are trying to achieve here, > I'm just not sure how reliably it will actually achieve it. > > test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp > > > ?32 static volatile jlong spinn_count???? = 0; > > Using a 64-bit counter seems like it will be a problem on 32-bit systems. > > Should be spin_count not spinn_count. > > ?36 // Agent thread waits for value != 0, then performas the JVMTI > call to get local variable. > > typo: performas > > Thanks, > David > ----- > > Thanks, Richard. > > -----Original Message----- > From: serviceability-dev mailto:serviceability-dev-retn at openjdk.java.net > On Behalf Of Reingruber, Richard > Sent: Montag, 27. Juli 2020 09:45 > To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net > Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in > VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > > ?? > I tested it on Linux and Windows but not yet on MacOS. > > The test succeeded now on all platforms. > > Thanks, Richard. > > -----Original Message----- > From: Reingruber, Richard > Sent: Freitag, 24. Juli 2020 15:04 > To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net > Subject: RE: RFR(S) 8249293: Unsafe stackwalk in > VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > > The fix itself looks good to me. > > thanks for looking at the fix. > > I still need another look at new test. 
> Could you, please, convert the agent of new test to C++? > It will make it a little bit simpler. > > Sure, here is the new webrev.1 with a C++ version of the test agent: > > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/ > > I tested it on Linux and Windows but not yet on MacOS. > > Thanks, > Richard. > > -----Original Message----- > From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com > Sent: Freitag, 24. Juli 2020 00:00 > To: Reingruber, Richard mailto:richard.reingruber at sap.com; > mailto:serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8249293: Unsafe stackwalk in > VM_GetOrSetLocal::doit_prologue() > > Hi Richard, > > Thank you for filing the CR and taking care about it! > The fix itself looks good to me. > I still need another look at new test. > Could you, please, convert the agent of new test to C++? > It will make it a little bit simpler. > > Thanks, > Serguei > > > On 7/20/20 01:15, Reingruber, Richard wrote: > Hi, > > please help review this fix for VM_GetOrSetLocal. It moves the > unsafe stackwalk from the vm > operation prologue before the safepoint into the doit() method > executed at the safepoint. > > Webrev: > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html > Bug: https://bugs.openjdk.java.net/browse/JDK-8249293 > > According to the JVMTI spec on local variable access it is not > required to suspend the target thread > T [1]. The operation will simply fail with > JVMTI_ERROR_NO_MORE_FRAMES if T is executing > bytecodes. It will succeed though if T is blocked because of > synchronization or executing some native > code. > > The issue is that in the latter case the stack walk in > VM_GetOrSetLocal::doit_prologue() to prepare > the access to the local variable is unsafe, because it is done > before the safepoint and it races > with T returning to execute bytecodes making its stack not walkable. > The included test shows that > this can crash the VM if T wins the race. 
>
> Manual testing:
>
>     - new test
>       test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java
>     - test/hotspot/jtreg/vmTestbase/nsk/jvmti
>     - test/hotspot/jtreg/serviceability/jvmti
>
> Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode,
> SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with
> fastdebug and release builds on all platforms
>
> Thanks, Richard.
>
> [1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local
>

From richard.reingruber at sap.com  Tue Aug 18 06:02:22 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 18 Aug 2020 06:02:22 +0000
Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
In-Reply-To: <4040c9b2-8bcb-65d3-9263-c23e9d7740f1@oracle.com>
References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> <4040c9b2-8bcb-65d3-9263-c23e9d7740f1@oracle.com>
Message-ID:

Thanks David, have a good time!

Richard.

-----Original Message-----
From: David Holmes
Sent: Tuesday, 18 August 2020 07:20
To: Reingruber, Richard; serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Richard,

The test seems a lot clearer to me now. I'll leave it to you and Serguei to
iron out any last wrinkles as I am disappearing on vacation for a week after
today. But you have my Review.

Thanks,
David

On 15/08/2020 12:06 am, Reingruber, Richard wrote:
> Hi Serguei,
>
> thanks for the feedback. I have implemented your suggestions and created a new
> webrev:
>
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4/
> Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/
>
> Please find my replies to your comments below.
>
> Best regards,
> Richard.
> >> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html > >> 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it >> ... > >> 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. >> 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. >> ... > >> 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. >> ... > >> 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); > >> Could you, please, re-balance the lines above to make them shorter? > > Ok, done. > > >> 90 int newTargetDepth = recursiveMethod(0, targetDepth); >> 91 if (newTargetDepth < targetDepth) { >> 92 msg("StackOverflowError during test."); >> 93 msg("Old target depth: " + targetDepth); >> 94 msg("Retry with new target depth: " + newTargetDepth); >> 95 targetDepth = newTargetDepth; >> 96 } >> A comment is needed to explain why a StackOverflowError is not desired. >> At least, it is not obvious initially. >> 73 public int waitTimeInNativeAfterNotify; > >> This name is unreasonably long which makes the code less readable. >> I'd suggest to reduce it to waitTime. > > Ok, done. > >> 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); >> ... >> 119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); > >> It is better to provide a short comment before each call explaining what it is doing. >> For instance, it is not clear why the call at the line 103 is needed. >> Why do we need to notify the agent to GetLocal for the second time? > > The test is repeated TEST_ITERATIONS times. 
In each iteration the agent calls
> GetLocal, racing the target thread returning from the native call. The last call
> in line 103 is the shutdown signal.
>
>> Can it be refactored into a separate native method?
>
> I've made the shutdown process more explicit with the new native method
> shutDown(), which sets test_state to ShutDown.
>
>> Then the function name can be reduced to 'notifyAgentToGetLocal'.
>> This long name does not give enough context anyway.
>
> Ok, done.
>
>> 85 long iterations = 0;
>> 87 do {
>> ...
>> 97 iterations++;
>> ...
>> 102 } while (iterations < TEST_ITERATIONS);
>
>> Why is a more explicit 'for' or 'while' loop not used here? :
>> for (long iter = 0; iter < TEST_ITERATIONS; iter++) {
>
> I have converted the loop into a for loop.
>
>> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html
>
>> The indent in this file varies. It is better to keep it the same: 4 or 2.
>
> Yes, I noticed this. I have not corrected it yet, because I didn't want to
> pollute the incremental webrev with that change. Would you like me to fix the
> indentation now to 2 spaces or do it as a last step?
>
>> 60 AgentCallingGetLocalObject // The target thread waits for the agent to call
>> I'd suggest to rename the constant to 'AgentInGetLocal'.
>
> Ok, done.
>
>> 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) {
>
>> It is better to rename the function to TestThreadLoop.
>
> Would AgentThreadLoop be ok too?
>
>> You can add a comment before it to explain the basics of what it is doing.
>
> Ok, done.
>
>> 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n",
>> It is better to get rid of leading stars in all messages.
>
> Ok, done.
> >> 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly >> The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. > > Ok, done. > > --- > From: serguei.spitsyn at oracle.com > Sent: Freitag, 14. August 2020 10:11 > To: Reingruber, Richard ; David Holmes ; serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Richard, > > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html > > 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it > ... > > 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. > 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. > ... > > 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. > ... > > 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); > > Could you, please, re-balance the lines above to make them shorter? > > > ?90 int newTargetDepth = recursiveMethod(0, targetDepth); > 91 if (newTargetDepth < targetDepth) { > 92 msg("StackOverflowError during test."); > 93 msg("Old target depth: " + targetDepth); > 94 msg("Retry with new target depth: " + newTargetDepth); > 95 targetDepth = newTargetDepth; > 96 } > A comment is needed to explain why a StackOverflowError is not desired. > At least, it is not obvious initially. > 73 public int waitTimeInNativeAfterNotify; > > This name is unreasonably long which makes the code less readable. > I'd suggest to reduce it to waitTime. > > 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); > ... 
> 119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); > > It is better to provide a short comment before each call explaining what it is doing. > For instance, it is not clear why the call at the line 103 is needed. > Why do we need to notify the agent to GetLocal for the second time? > Can it be refactored into a separate native method? > Then the the function name can be reduced to 'notifyAgentToGetLocal'. > This long name does not give enough context anyway. > 85 long iterations = 0; > 87 do { > ... > 97 iterations++; > ... > 102 } while (iterations < TEST_ITERATIONS); > > Why a more explicit 'for' or 'while' loop is not used here? : > for (long iter = 0; iter < TEST_ITERATIONS; iter++) { > > > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html > > The indent in this file varies. It is better to keep it the same: 4 or 2. > > 60 AgentCallingGetLocalObject // The target thread waits for the agent to call > I'd suggest to rename the constant to 'AgentInGetLocal'. > 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) { > > It is better rename the function to TestThreadLoop. > You can add a comment before to explain some basic about what it is doing. > 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n", > It is better to get rid of leading stars in all messages. > 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly > The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. > > > I'm still reviewing the test native agent code. > > > Thanks, > Serguei > > > On 8/11/20 03:02, Reingruber, Richard wrote: > Hi David and Serguei, > > On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: > Hi Richard and David, > > The implementation looks good to me. 
> > But I do not understand what the test is doing with all this counters > and recursions. > > For instance, these fragments: > > 86 recursions = 0; > 87 try { > 88 recursiveMethod(1<<20); > 89 } catch (StackOverflowError e) { > 90 msg("Caught StackOverflowError as expected"); > 91 } > 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh > > It is not obvious that the 'recursion' is updated in the recursiveMethod. > I would suggestto make it more explicit: > recursiveMethod(M); > int target_depth = M - 100; > > Then the variable 'recursions' can be removed or become local. > > The recursiveMethod takes in the maximum recursions to try and updates > the recursions variable to record how many recursions were possible - so: > > target_depth = - 100; > > Possibly recursiveMethod could return the actual recursions instead of > using the global variables? > > I've eliminated the static 'recursions' variable. recursiveMethod() now returns > the depth at which the recursion was ended. I hesitated doing this, because I > had to handle the StackOverflowError with all those frames still on stack. But > the handler is empty, so it should not cause problems. > > This is the new webrev (as posted previously): > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3/ > Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/ > > Thanks, Richard. > > -----Original Message----- > From: David Holmes mailto:david.holmes at oracle.com > Sent: Dienstag, 11. August 2020 04:00 > To: mailto:serguei.spitsyn at oracle.com; Reingruber, Richard mailto:richard.reingruber at sap.com; mailto:serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > > On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: > Hi Richard and David, > > The implementation looks good to me. 
> > But I do not understand what the test is doing with all this counters > and recursions. > > For instance, these fragments: > > 86 recursions = 0; > 87 try { > 88 recursiveMethod(1<<20); > 89 } catch (StackOverflowError e) { > 90 msg("Caught StackOverflowError as expected"); > 91 } > 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh > > It is not obvious that the 'recursion' is updated in the recursiveMethod. > I would suggestto make it more explicit: > ?? recursiveMethod(M); > ?? int target_depth = M - 100; > > Then the variable 'recursions' can be removed or become local. > > The recursiveMethod takes in the maximum recursions to try and updates > the recursions variable to record how many recursions were possible - so: > > target_depth = - 100; > > Possibly recursiveMethod could return the actual recursions instead of > using the global variables? > > David > ----- > > This method will be: > > 47 private static final int M = 1 << 20; > ... > 121 public long recursiveMethod(int depth) { > 123 if (depth == 0) { > 124 notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify); > 126 } else { > 127 recursiveMethod(--depth); > 128 } > 129 } > > > At least, he test is missing the comments explaining all these. > > Thanks, > Serguei > > > > On 8/9/20 22:35, David Holmes wrote: > Hi Richard, > > On 31/07/2020 5:28 pm, Reingruber, Richard wrote: > Hi, > > I rebase the fix after JDK-8250042. > > New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/ > > The general fix for this seems good. A minor nit: > > ?588???? if (!is_assignable(signature, ob_k, Thread::current())) { > > You know that the current thread is the VMThread so can use > VMThread::vm_thread(). > > Similarly for this existing code: > > ?694???? Thread* current_thread = Thread::current(); > > --- > > Looking at the test code ... 
I'm less clear on exactly what is > happening and the use of spin-waits raises some red-flags for me in > terms of test reliability on different platforms. The "while > (--waitCycles > 0)" loop in particular offers no certainty that the > agent thread is executing anything in particular. And the use of the > spin_count as a guide to future waiting time seems somewhat arbitrary. > In all seriousness I got a headache trying to work out how the test > was expecting to operate. Some parts could be simplified using raw > monitors, I think. But there's no sure way to know the agent thread is > in the midst of the stackwalk when the target thread wants to leave > the native code. So I understand what you are trying to achieve here, > I'm just not sure how reliably it will actually achieve it. > > test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp > > > ?32 static volatile jlong spinn_count???? = 0; > > Using a 64-bit counter seems like it will be a problem on 32-bit systems. > > Should be spin_count not spinn_count. > > ?36 // Agent thread waits for value != 0, then performas the JVMTI > call to get local variable. > > typo: performas > > Thanks, > David > ----- > > Thanks, Richard. > > -----Original Message----- > From: serviceability-dev mailto:serviceability-dev-retn at openjdk.java.net > On Behalf Of Reingruber, Richard > Sent: Montag, 27. Juli 2020 09:45 > To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net > Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in > VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > > ?? > I tested it on Linux and Windows but not yet on MacOS. > > The test succeeded now on all platforms. > > Thanks, Richard. > > -----Original Message----- > From: Reingruber, Richard > Sent: Freitag, 24. 
Juli 2020 15:04 > To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net > Subject: RE: RFR(S) 8249293: Unsafe stackwalk in > VM_GetOrSetLocal::doit_prologue() > > Hi Serguei, > > The fix itself looks good to me. > > thanks for looking at the fix. > > I still need another look at new test. > Could you, please, convert the agent of new test to C++? > It will make it a little bit simpler. > > Sure, here is the new webrev.1 with a C++ version of the test agent: > > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/ > > I tested it on Linux and Windows but not yet on MacOS. > > Thanks, > Richard. > > -----Original Message----- > From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com > Sent: Freitag, 24. Juli 2020 00:00 > To: Reingruber, Richard mailto:richard.reingruber at sap.com; > mailto:serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8249293: Unsafe stackwalk in > VM_GetOrSetLocal::doit_prologue() > > Hi Richard, > > Thank you for filing the CR and taking care about it! > The fix itself looks good to me. > I still need another look at new test. > Could you, please, convert the agent of new test to C++? > It will make it a little bit simpler. > > Thanks, > Serguei > > > On 7/20/20 01:15, Reingruber, Richard wrote: > Hi, > > please help review this fix for VM_GetOrSetLocal. It moves the > unsafe stackwalk from the vm > operation prologue before the safepoint into the doit() method > executed at the safepoint. > > Webrev: > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html > Bug: https://bugs.openjdk.java.net/browse/JDK-8249293 > > According to the JVMTI spec on local variable access it is not > required to suspend the target thread > T [1]. The operation will simply fail with > JVMTI_ERROR_NO_MORE_FRAMES if T is executing > bytecodes. It will succeed though if T is blocked because of > synchronization or executing some native > code. 
>
> The issue is that in the latter case the stack walk in
> VM_GetOrSetLocal::doit_prologue() to prepare the access to the local
> variable is unsafe, because it is done before the safepoint and it races
> with T returning to execute bytecodes, making its stack not walkable.
> The included test shows that this can crash the VM if T wins the race.
>
> Manual testing:
>
>     - new test
>       test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java
>     - test/hotspot/jtreg/vmTestbase/nsk/jvmti
>     - test/hotspot/jtreg/serviceability/jvmti
>
> Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode,
> SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with
> fastdebug and release builds on all platforms
>
> Thanks, Richard.
>
> [1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local
>

From richard.reingruber at sap.com  Tue Aug 18 07:43:51 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 18 Aug 2020 07:43:51 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
Message-ID:

Hi Goetz,

I have collected the changes based on your feedback in a new webrev:

Webrev.7: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.7/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.7.inc/

Most of the changes are renamings, commenting, and reformatting. Besides that ...

- I converted the native agent of the test IterateHeapWithEscapeAnalysisEnabled
  from C to C++, because this seems to be preferred by serviceability
  developers. I also re-indented the file, but excluded this from the delta
  webrev.
- I had to adapt test/jdk/com/sun/jdi/EATests.java to the fact that background
  compilation (-Xbatch) cannot be reliably disabled for JVMCI compilers. E.g. the
  compile broker will compile in the background if JVMCI is not yet fully
  initialized. Therefore it is possible that test cases are executed before the
  main test method is compiled on the highest level, and then the test case
  fails. The higher the system load, the higher the probability for this to
  happen. In webrev.7 I skip the compilation level check if the VM is configured
  to use the JVMCI compiler.

I also answered you inline below.

Thanks, Richard.

-----Original Message-----
From: Lindenmaier, Goetz
Sent: Thursday, 23 July 2020 16:20
To: Reingruber, Richard; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

Thanks for your two further explanations in the other thread.
That made the points clear to me.

> > I was not that happy with the names saying not_global_escape
> > and similar. I now agree you have to use the terms of the escape
> > analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with
> > the 'not' in the term, I always try to expand the name to some
> > sentence with a negated verb, but it makes no sense.
> > For example, "has_not_global_escape_in_scope" expands to
> > "Hasn't a global escape in its scope." in my thinking, which makes
> > no sense. You probably mean
> > "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape}
> > in its scope."
>
> > C2 is using the word "non" in this context, e.g., here
> > alloc->is_non_escaping.
>
> There is also ConnectionGraph::not_global_escape()

That talks about a single node that represents a single Object.
An object has a single state wrt. ea. You use the term for a safepoint,
which tracks a set of objects.
Here, has_not_global_escape can mean
1. None of the several objects does escape globally.
2. There is at least one object that escapes globally.

> > non obviously negates the adjective 'global',
> > non-global or nonglobal even is an English term I find in the
> > net.
> > So what about "has_non_global_escape_in_scope?"
>
> And what about has_ea_local_in_scope?

That's good. Please document somewhere that
Ea_local == ArgEscape | NoEscape.
That's what it is, right?

> > Does jvmti specify that the same limits are used ...?
> > ok on your side.
>
> I don't know and didn't find anything in a quick search.

Ok, not your business.

>
> > jvmtiEnvBase.cpp ok
> > jvmtiImpl.h|cpp ok
> > jvmtiTagMap.cpp ok
> > whitebox.cpp ok
>
> > deoptimization.cpp
>
> > line 177: Please break line
> > line 246, 281: Please break line
> > 1578, 1583, 1589, 1632, 1649, 1651 Break line
>
> > 1651: You use 'non'-terms, too: non-escaping :)
>
> I know :) At least here it is wrong I'd say. "...has to be a not escaping obj..."
> sounds better
> (hopefully not only to my German ears).

I thought the term non-escaping makes it quite clear. I just wanted to point
out that using non above would be similar to the wording here.

> > IterateHeapWithEscapeAnalysisEnabled.java
>
> > line 415:
> > msg("wait until target thread has set testMethod_result");
> > while (testMethod_result == 0) {
> > Thread.sleep(50);
> > }
> > Might the test run into timeouts at this place?
> > The field is volatile, i.e. it will be reloaded
> > in each iteration. But will dontinline_testMethod
> > write it back to main memory in time?
>
> You mean, the test could hang in that loop for a couple of minutes? I don't
> think so. There are cache coherence protocols in place which will invalidate
> stale data very timely.

Ok, anyways, it would only be a hanging test.

>
> Ok. I've removed quite a lot of the occurrences.
>
> > Also, I like full sentences in comments.
> > Especially for me as a foreign speaker, this makes
> > things much more clear. I.e., I try to make it
> > a real sentence with articles, capitalized and a
> > dot at the end if there is a subject and a verb
> > in first place.
> > E.g., jvmtiEnvBase.cpp:1327
>
> Are you referring to the following?
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/src/hotspot/share/prims/jvmtiEnvBase.cpp.frames.html)
>
> 1326
> 1327 // If the frame is a compiled one, need to deoptimize it.
> 1328 if (vf->is_compiled_frame()) {
>
> This line 1327 is preexisting.

Sorry, wrong line number again. I think I meant

1333 // eagerly reallocate scalar replaced objects.

But I must admit, the subject is missing. It's one of these imperative
sentences where the subject is left out, which are used throughout
documentation. Bad example, but still a correct sentence, so it
qualifies for punctuation?

Best regards,
Goetz.

From fairoz.matte at oracle.com  Tue Aug 18 08:10:26 2020
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Tue, 18 Aug 2020 01:10:26 -0700 (PDT)
Subject: RFR(s): 8248295: serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal
In-Reply-To: <1bdcbd35-097e-1681-3a0c-32f9709497a4@oracle.com>
References: <1bdcbd35-097e-1681-3a0c-32f9709497a4@oracle.com>
Message-ID:

Hi Vladimir,

Thanks for looking into this.
This is an intermittent crash, and it is reproducible in a Windows debug build environment. Below is the testing performed.

1. Issues observed 7/100 runs, ReservedCodeCacheSize=20m with "-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler"
2.
Issues observed 0/300 runs, ReservedCodeCacheSize=30m with "-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler"

Thanks,
Fairoz

> -----Original Message-----
> From: Vladimir Kozlov
> Sent: Monday, August 17, 2020 11:22 PM
> To: Fairoz Matte; hotspot-compiler-dev at openjdk.java.net;
> serviceability-dev at openjdk.java.net
> Cc: Coleen Phillimore; Dean Long
> Subject: Re: RFR(s): 8248295:
> serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal
>
> Hi Fairoz,
>
> How did you determine that +10Mb is enough with Graal?
>
> Thanks,
> Vladimir
>
> On 8/17/20 5:46 AM, Fairoz Matte wrote:
> > Hi,
> >
> > Please review this small test change to work with Graal.
> >
> > Background:
> >
> > Graal requires more code cache compared to C1/C2, but the test case always
> > sets it to 20MB. This may not be sufficient when running Graal.
> >
> > Default configuration for ReservedCodeCacheSize = 250MB
> >
> > With Graal enabled, ReservedCodeCacheSize = 350MB
> >
> > Either we can modify the framework to honor ReservedCodeCacheSize for
> > Graal or just update the test case.
> >
> > There are not many test cases that rely on ReservedCodeCacheSize or
> > InitialCodeCacheSize, so the fix prefers the latter.
> >
> > JBS - https://bugs.openjdk.java.net/browse/JDK-8248295
> >
> > Webrev - http://cr.openjdk.java.net/~fmatte/8248295/webrev.00/
> >
> > Thanks,
> >
> > Fairoz
> >

From Roger.Riggs at oracle.com  Tue Aug 18 13:25:52 2020
From: Roger.Riggs at oracle.com (Roger Riggs)
Date: Tue, 18 Aug 2020 09:25:52 -0400
Subject: Protecting references from GC in JDI tests
In-Reply-To:
References:
Message-ID: <37042231-cdd7-716c-af0a-8b20ef9d6356@oracle.com>

Hi,

You may also find java.lang.ref.Reference.reachabilityFence(obj) [1] useful.
It is designed to prevent the compiler from optimizing away a reference.
It keeps the object referenced up to that point.
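A minimal sketch of the usage pattern described above, assuming a hypothetical
Resource class and sumWithFence() helper (neither name comes from the JDI
tests). Note that the fence only affects code running in the same VM as the
object, so it would help debuggee-side code; for debugger-side ObjectReference
mirrors, disableCollection() remains the documented mechanism.

```java
import java.lang.ref.Reference;

public class ReachabilityFenceSketch {

    // Hypothetical object whose lifetime we want to extend.
    static final class Resource {
        private final int[] data = {1, 2, 3};

        int sum() {
            int total = 0;
            for (int value : data) {
                total += value;
            }
            return total;
        }
    }

    // Uses the resource and keeps it strongly reachable until the work is done.
    static int sumWithFence() {
        Resource resource = new Resource();
        try {
            return resource.sum();
        } finally {
            // Without the fence, the JIT may treat 'resource' as dead after its
            // last use; the fence keeps it reachable at least up to this point.
            Reference.reachabilityFence(resource);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumWithFence());
    }
}
```

The try/finally form follows the pattern recommended in the method's Javadoc,
so the fence is reached even if the guarded code throws.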
[1] https://download.java.net/java/GA/jdk14/docs/api/java.base/java/lang/ref/Reference.html#reachabilityFence(java.lang.Object) Regards, Roger On 8/17/20 3:21 PM, Daniel D. Daugherty wrote: > Aditya, > > I think you've found the right alias... > > A similar observation was made here: > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030635.html > > > It looks like that conversation didn't go beyond Egor's original message > and Chris P's reply. > > My recommendation would be to only use > ObjectReference.DisableCollection() > when you have observed a specific failure for a specific object in a > test. > I would not apply that function generally. However, Chris P. or another > member of the Serviceability team may have other guidance... > > Dan > > > On 8/17/20 3:07 PM, Aditya Mandaleeka wrote: >> Hi serviceability-dev, >> >> I hope this is the right list for this topic, but feel free to >> redirect if not... >> >> It appears that there are jtreg tests that exercise JDI functionality >> without protecting target >> objects from being GC'd. An example of this is >> com/sun/jdi/VarargsTest.java, where references are >> acquired (with mirrorOf) and then used as args in invokeMethod. I >> haven't looked into all the >> other JDI tests to see if there are others with the same issue. >> >> This issue was exposed in our test runs with Shenandoah GC in >> aggressive heuristics mode (which does >> back to back GCs), but of course it can also be reproduced by >> inducing a GC in the target VM >> explicitly before the invoke. >> >> While this is unlikely to occur in practice when not using a special >> GC mode, it seems to me that >> the tests should not be relying on the fact that a GC _probably_ >> won't occur, and instead explicitly >> disable collection on the objects that are going to be used. 
After >> all, it is specified in the docs >> that ObjectReference values returned by JDI may be collected at any >> time the target VM is running >> unless disableCollection() is called on them [0], so the test code is >> implicitly relying on lifetime >> guarantees that are not provided. >> >> I think the test(s) could be improved by calling disableCollection on >> any references in the target >> VM prior to using them. There is of course a chance that an object >> could be GC'd before the >> disableCollection call, but I inspected the code for this case and it >> appears the JDWP error code >> gets converted into an ObjectCollectedException surfaced to user code >> which we could handle in >> the test (and perhaps retry the operation a few times before giving >> up). Is my understanding >> correct here? >> >> What are your thoughts on this? Is there interest in these tests >> being fixed in this way? >> >> Thanks, >> Aditya >> >> [0]: >> https://docs.oracle.com/en/java/javase/11/docs/api/jdk.jdi/com/sun/jdi/ObjectReference.html#disableCollection() >> > From vladimir.kozlov at oracle.com Tue Aug 18 19:14:01 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 18 Aug 2020 12:14:01 -0700 Subject: RFR(s): 8248295: serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal In-Reply-To: References: <1bdcbd35-097e-1681-3a0c-32f9709497a4@oracle.com> Message-ID: <7040f785-b871-9771-94a2-4c3472a6bf6d@oracle.com> I would suggest to run test with -XX:+PrintCodeCache flag which prints CodeCache usage on exit. Also add '-ea -esa' flags - some runs failed with them because they increase Graal's methods size. Running test with immediately caused OOM error on my local linux machine: '-server -ea -esa -XX:+TieredCompilation -XX:+PrintCodeCache -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler -Djvmci.Compiler=graal' With -XX:ReservedCodeCacheSize=30m I got: [11.217s][warning][codecache] CodeCache is full. 
Compiler has been disabled. [11.217s][warning][codecache] Try increasing the code cache size using -XX:ReservedCodeCacheSize= With -XX:ReservedCodeCacheSize=50m I got this output: CodeCache: size=51200Kb used=34401Kb max_used=34401Kb free=16798Kb May be you need to set it to 35m or better to 50m to be safe. Note, without Graal test uses only 5.5m: CodeCache: size=20480Kb used=5677Kb max_used=5688Kb free=14803Kb ----------------------------- I also forgot to ask you to update test's Copyright year. Regards, Vladimir K On 8/18/20 1:10 AM, Fairoz Matte wrote: > Hi Vladimir, > > Thanks for looking into. > This is intermittent crash, and is reproducible in windows debug build environment. Below is the testing performed. > > 1. Issues observed 7/100 runs, ReservedCodeCacheSize=20m with "-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler" > 2. Issues observed 0/300 runs, ReservedCodeCacheSize=30m with "-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler" > > Thanks, > Fairoz > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Monday, August 17, 2020 11:22 PM >> To: Fairoz Matte ; hotspot-compiler- >> dev at openjdk.java.net; serviceability-dev at openjdk.java.net >> Cc: Coleen Phillimore ; Dean Long >> >> Subject: Re: RFR(s): 8248295: >> serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal >> >> Hi Fairoz, >> >> How you determine that +10Mb is enough with Graal? >> >> Thanks, >> Vladimir >> >> On 8/17/20 5:46 AM, Fairoz Matte wrote: >>> Hi, >>> >>> >>> >>> Please review this small test change to work with Graal. >>> >>> >>> >>> Background: >>> >>> Graal require more code cache compared to c1/c2. but the test case always >> set it to 20MB. This may not be sufficient when running graal. 
>>> >>> Default configuration for ReservedCodeCacheSize = 250MB >>> >>> With graal enabled, ReservedCodeCacheSize = 350MB >>> >>> >>> >>> Either we can modify the framework to honor ReservedCodeCacheSize for >> graal or just update the testcase. >>> >>> There are not many test cases they rely on ReservedCodeCacheSize or >> InitialCodeCacheSize. So the fix prefer the later one. >>> >>> >>> >>> JBS - https://bugs.openjdk.java.net/browse/JDK-8248295 >>> >>> Webrev - http://cr.openjdk.java.net/~fmatte/8248295/webrev.00/ >>> >>> >>> >>> Thanks, >>> >>> Fairoz >>> >>> >>> From serguei.spitsyn at oracle.com Tue Aug 18 20:03:02 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 18 Aug 2020 13:03:02 -0700 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> Message-ID: An HTML attachment was scrubbed... URL: From igor.ignatyev at oracle.com Tue Aug 18 23:42:19 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 18 Aug 2020 16:42:19 -0700 Subject: RFR(T) : 8252005 : narrow disabling of allowSmartActionArgs in vmTestbase Message-ID: <4E6FECE6-9103-46ED-84B2-79DBA0123ED9@oracle.com> http://cr.openjdk.java.net/~iignatyev//8252005/webrev.00/ > 0 lines changed: 0 ins; 0 del; 0 mod; Hi all, could you please review this trivial (and apparently empty) patch which sets allowSmartActionArgs to false only in subdirectories of vmTestbase which currently use PropertyResolvingWrapper? 
(it's hard to tell from webrev or patch, but test/hotspot/jtreg/vmTestbase/TEST.properties is effectively removed) webrev: http://cr.openjdk.java.net/~iignatyev//8252005/webrev.00/ JBS: https://bugs.openjdk.java.net/browse/JDK-8252005 Thanks, -- Igor From claes.redestad at oracle.com Wed Aug 19 04:51:13 2020 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 19 Aug 2020 06:51:13 +0200 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> Message-ID: Hi, not sure I do, but a quick read of the relevant RFC suggests that since a URI scheme (protocol) must start with a letter[1] it seems safe to assume the string must be of the form hostname or hostname:port if the first character in the string is a digit. /Claes [1] https://tools.ietf.org/html/rfc3986#section-3.1 On 2020-08-18 22:03, serguei.spitsyn at oracle.com wrote: > Hi Jie, > > I've added Claes to the list as he may have an expertise in this area. > > 83 *

    > 84 *   • {@code } - transformed into "//localhost"
    > 85 *   • localhost - transformed into "//localhost"
    > 86 *   • hostname - transformed into "//hostname"
    > 87 *   • hostname:port - transformed into "//hostname:port"
    > 88 *   • proto:hostname - transformed into "proto://hostname"
    > 89 *   • proto:hostname:port - transformed into
    > 90 *     "proto://hostname:port"
    > 91 *   • proto://hostname:port
    > 92 *
    
    > > Is it worth to add an example to the list above? > > I wonder if this fix needs a CSR. > How did you check this fix does not introduce any regressions? > > Thanks, > Serguei > > > On 8/17/20 08:13, jiefu(??) wrote: >> >> Ping? >> >> Any comments? >> >> Thanks. >> >> Best regards, >> >> Jie >> >> *From: *serviceability-dev >> on behalf of "jiefu(??)" >> *Date: *Friday, August 7, 2020 at 7:44 AM >> *To: *"serviceability-dev at openjdk.java.net" >> >> *Subject: *Re: RFR: 8251155: HostIdentifier fails to canonicalize >> hostnames starting with digits(Internet mail) >> >> FYI: >> >> This bug will lead to failures of the following tests on machines >> with hostname starting from digits. >> >> - test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java >> >> - test/jdk/sun/tools/jstatd/TestJstatdPort.java >> >> - test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java >> >> - test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java >> >> So it's worth fixing it. >> >> Testing: >> >> - tier1-3 on Linux/x64 >> >> Thanks. >> >> Best regards, >> >> Jie >> >> *From: *"jiefu(??)" >> *Date: *Wednesday, August 5, 2020 at 3:19 PM >> *To: *"serviceability-dev at openjdk.java.net" >> >> *Subject: *RFR: 8251155: HostIdentifier fails to canonicalize >> hostnames starting with digits >> >> Hi all, >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8251155 >> >> Webrev: http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/ >> >> HostIdentifier fails to canonicalize hostname:port if the hostname >> starts with digits. >> >> The current implementation will get "scheme = hostname". >> >> But the scheme should not be started with digits, which leads to this bug. >> >> Thanks a lot.
    
    >> >> Best regards, >> >> Jie >> > From linzang at tencent.com Wed Aug 19 06:13:18 2020 From: linzang at tencent.com (=?iso-2022-jp?B?bGluemFuZygbJEJnSU5WGyhCKQ==?=) Date: Wed, 19 Aug 2020 06:13:18 +0000 Subject: [Discussion] Expected behavior of combining "all" and "live" options of jmap References: <16D02038-2F95-4FE0-860F-D0CE134BE871@tencent.com> Message-ID: Dear All, May I get some suggestions, so that I can work out a patch based on them? Or maybe it should not be treated as an issue? BRs, Lin On 17/08/2020 17:17, linzang(??) wrote: > Dear all, > we found that jmap's histo/dump command accepts the "live" and "all" options together, and the specification does not describe the expected behavior. > I have found that when these two options are used together, "live" takes effect, regardless of their order on the command line. > IMO, it is a little confusing to use "live" and "all" together, and if it is allowed, the specification may need to be updated to state the behavior clearly. > Therefore may I ask which of the following options is preferred: > (option 1.) disallow using these two options together; I think this is clearer, but I am not sure whether there is a backward compatibility risk. > (option 2.) allow the combined use of "live" and "all", and update the specification to clearly describe that "live" takes effect in this case. > What do you think? > > Thanks, > Lin > > > From jiefu at tencent.com Wed Aug 19 08:05:52 2020 From: jiefu at tencent.com (=?iso-2022-jp?B?amllZnUoGyRCUHxbPxsoQik=?=) Date: Wed, 19 Aug 2020 08:05:52 +0000 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com>, Message-ID: <8cee7938941048f4b007b1663fab4b95@tencent.com> Hi Serguei, Thanks for your review and help. Please see comments inline.
    
________________________________ From: serguei.spitsyn at oracle.com Sent: Wednesday, August 19, 2020 4:03 AM To: jiefu(??); serviceability-dev at openjdk.java.net; Claes Redestad Subject: Re: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) 83 *
    84 *   • {@code } - transformed into "//localhost"
    85 *   • localhost - transformed into "//localhost"
    86 *   • hostname - transformed into "//hostname"
    87 *   • hostname:port - transformed into "//hostname:port"
    88 *   • proto:hostname - transformed into "proto://hostname"
    89 *   • proto:hostname:port - transformed into
    90 *     "proto://hostname:port"
    91 *   • proto://hostname:port
    92 *
    
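    As a rough illustration of the transformation rules listed above, here is a sketch only. It is not the HostIdentifier code from the webrev; the class name CanonicalizeSketch, the helper names, and the handling of null or empty input are all assumptions made for this example. It applies the RFC 3986 rule that a scheme must start with a letter, so a string whose first character is a digit is treated as hostname or hostname:port.

    ```java
    public class CanonicalizeSketch {

        // RFC 3986, section 3.1: a scheme must begin with an ASCII letter.
        private static boolean isAsciiLetter(char c) {
            return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        }

        // True if s is a non-empty run of decimal digits, i.e. looks like a port.
        private static boolean looksLikePort(String s) {
            return !s.isEmpty() && s.chars().allMatch(c -> c >= '0' && c <= '9');
        }

        // Hypothetical canonicalization following the documented examples.
        static String canonicalize(String s) {
            if (s == null || s.isEmpty()) {
                return "//localhost";              // assumed default for missing input
            }
            if (s.contains("//")) {
                return s;                          // "//host" or "proto://host:port"
            }
            int colon = s.indexOf(':');
            if (colon < 0) {
                return "//" + s;                   // bare hostname
            }
            String rest = s.substring(colon + 1);
            // A leading digit cannot start a scheme, and "name:digits" is host:port.
            if (!isAsciiLetter(s.charAt(0)) || looksLikePort(rest)) {
                return "//" + s;
            }
            return s.substring(0, colon) + "://" + rest;   // proto:hostname[:port]
        }

        public static void main(String[] args) {
            System.out.println(canonicalize("1234host:1099"));   // //1234host:1099
            System.out.println(canonicalize("rmi:myhost:1099")); // rmi://myhost:1099
        }
    }
    ```

    With this shape, a hostname such as 1234host is never mistaken for a scheme, which is the case the bug report describes.
    
    
    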
>> Is it worth to add an example to the list above? Yes. It's really helpful for the review process. Thanks. >> I wander if this fix needs a CSR. I don't think so. This is just a bug fix which doesn't add/remove/change any feature of the tools. The original design has claimed to support hostname and hostname:port cases. But it fails to do so when the hostname starts with digits. It seems to be very common that the hostname will be started with digits in dockers. So I think it's worth to fix this bug. >> How did you check this fix does not introduce any regressions? In fact, Claes had helped me to answer this question here: https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-August/032691.html. Also, I've tested this patch on Linux/x64 with tier1 ~ tier3 (no regression). Thanks a lot. Best regards, Jie -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiefu at tencent.com Wed Aug 19 08:08:21 2020 From: jiefu at tencent.com (=?iso-2022-jp?B?amllZnUoGyRCUHxbPxsoQik=?=) Date: Wed, 19 Aug 2020 08:08:21 +0000 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> , Message-ID: <63a5ff854dab495d9bd289462b558a3f@tencent.com> Hi Claes, Thanks for your review and help. Best regards, Jie ________________________________ From: Claes Redestad Sent: Wednesday, August 19, 2020 12:51 PM To: serguei.spitsyn at oracle.com; jiefu(??); serviceability-dev at openjdk.java.net Subject: Re: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) Hi, not sure I do, but a quick read of the relevant RFC suggests that since a URI scheme (protocol) must start with a letter[1] it seems safe to assume the string must be of the form hostname or hostname:port if the first character in the string is a digit. 
/Claes [1] https://tools.ietf.org/html/rfc3986#section-3.1 On 2020-08-18 22:03, serguei.spitsyn at oracle.com wrote: > Hi Jie, > > I've added Claes to the list as he may have an expertise in this area. > > 83 *
    > 84 *   • {@code } - transformed into "//localhost"
    > 85 *   • localhost - transformed into "//localhost"
    > 86 *   • hostname - transformed into "//hostname"
    > 87 *   • hostname:port - transformed into "//hostname:port"
    > 88 *   • proto:hostname - transformed into "proto://hostname"
    > 89 *   • proto:hostname:port - transformed into
    > 90 *     "proto://hostname:port"
    > 91 *   • proto://hostname:port
    > 92 *
    
> > Is it worth to add an example to the list above? > > I wander if this fix needs a CSR. > How did you check this fix does not introduce any regressions? > > Thanks, > Serguei > > > On 8/17/20 08:13, jiefu(??) wrote: >> >> Ping? >> >> Any comments? >> >> Thanks. >> >> Best regards, >> >> Jie >> >> *From: *serviceability-dev >> on behalf of "jiefu(??)" >> *Date: *Friday, August 7, 2020 at 7:44 AM >> *To: *"serviceability-dev at openjdk.java.net" >> >> *Subject: *Re: RFR: 8251155: HostIdentifier fails to canonicalize >> hostnames starting with digits(Internet mail) >> >> FYI: >> >> This bug will lead to failures of the following tests on machines >> with hostname starting from digits. >> >> - test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java >> >> - test/jdk/sun/tools/jstatd/TestJstatdPort.java >> >> - test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java >> >> - test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java >> >> So it's worth fixing it. >> >> Testing: >> >> - tier1-3 on Linux/x64 >> >> Thanks. >> >> Best regards, >> >> Jie >> >> *From: *"jiefu(??)" >> *Date: *Wednesday, August 5, 2020 at 3:19 PM >> *To: *"serviceability-dev at openjdk.java.net" >> >> *Subject: *RFR: 8251155: HostIdentifier fails to canonicalize >> hostnames starting with digits >> >> Hi all, >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8251155 >> >> Webrev: http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/ >> >> HostIdentifier fails to canonicalize hostname:port if the hostname >> starts with digits. >> >> The current implementation will get "scheme = hostname". >> >> But the scheme should not be started with digits, which leads to this bug. >> >> Thanks a lot. >> >> Best regards, >> >> Jie >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fairoz.matte at oracle.com Wed Aug 19 12:30:47 2020 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Wed, 19 Aug 2020 05:30:47 -0700 (PDT) Subject: RFR(s): 8248295: serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal In-Reply-To: <7040f785-b871-9771-94a2-4c3472a6bf6d@oracle.com> References: <1bdcbd35-097e-1681-3a0c-32f9709497a4@oracle.com> <7040f785-b871-9771-94a2-4c3472a6bf6d@oracle.com> Message-ID: <59cd0914-5a61-463e-b46f-ebdc1496ab9f@default> Hi Vladimir, Thanks for the review. > I would suggest to run test with -XX:+PrintCodeCache flag which prints > CodeCache usage on exit. > > Also add '-ea -esa' flags - some runs failed with them because they increase > Graal's methods size. > > Running test with immediately caused OOM error on my local linux machine: > > '-server -ea -esa -XX:+TieredCompilation -XX:+PrintCodeCache - > XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - > XX:+UseJVMCICompiler -Djvmci.Compiler=graal' > > With -XX:ReservedCodeCacheSize=30m I got: > > [11.217s][warning][codecache] CodeCache is full. Compiler has been > disabled. > [11.217s][warning][codecache] Try increasing the code cache size using - > XX:ReservedCodeCacheSize= > > With -XX:ReservedCodeCacheSize=50m I got this output: Further testing with PrintCodeCache, ReservedCodeCacheSize = 50MB is the safe one to use. > > CodeCache: size=51200Kb used=34401Kb max_used=34401Kb free=16798Kb > > May be you need to set it to 35m or better to 50m to be safe. > > Note, without Graal test uses only 5.5m: > > CodeCache: size=20480Kb used=5677Kb max_used=5688Kb free=14803Kb > > ----------------------------- > > I also forgot to ask you to update test's Copyright year. I have updated the copyright year. Updated webrev for the reference - http://cr.openjdk.java.net/~fmatte/8248295/webrev.01/ Thanks, Fairoz > > Regards, > Vladimir K > > On 8/18/20 1:10 AM, Fairoz Matte wrote: > > Hi Vladimir, > > > > Thanks for looking into. 
> > This is intermittent crash, and is reproducible in windows debug build > environment. Below is the testing performed. > > > > 1. Issues observed 7/100 runs, ReservedCodeCacheSize=20m with "- > XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - > XX:+UseJVMCICompiler" > > 2. Issues observed 0/300 runs, ReservedCodeCacheSize=30m with "- > XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - > XX:+UseJVMCICompiler" > > > > Thanks, > > Fairoz > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Monday, August 17, 2020 11:22 PM > >> To: Fairoz Matte ; hotspot-compiler- > >> dev at openjdk.java.net; serviceability-dev at openjdk.java.net > >> Cc: Coleen Phillimore ; Dean Long > >> > >> Subject: Re: RFR(s): 8248295: > >> serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with > >> Graal > >> > >> Hi Fairoz, > >> > >> How you determine that +10Mb is enough with Graal? > >> > >> Thanks, > >> Vladimir > >> > >> On 8/17/20 5:46 AM, Fairoz Matte wrote: > >>> Hi, > >>> > >>> > >>> > >>> Please review this small test change to work with Graal. > >>> > >>> > >>> > >>> Background: > >>> > >>> Graal require more code cache compared to c1/c2. but the test case > >>> always > >> set it to 20MB. This may not be sufficient when running graal. > >>> > >>> Default configuration for ReservedCodeCacheSize = 250MB > >>> > >>> With graal enabled, ReservedCodeCacheSize = 350MB > >>> > >>> > >>> > >>> Either we can modify the framework to honor ReservedCodeCacheSize > >>> for > >> graal or just update the testcase. > >>> > >>> There are not many test cases they rely on ReservedCodeCacheSize or > >> InitialCodeCacheSize. So the fix prefer the later one. 
> >>> > >>> > >>> > >>> JBS - https://bugs.openjdk.java.net/browse/JDK-8248295 > >>> > >>> Webrev - http://cr.openjdk.java.net/~fmatte/8248295/webrev.00/ > >>> > >>> > >>> > >>> Thanks, > >>> > >>> Fairoz > >>> > >>> > >>> From hohensee at amazon.com Wed Aug 19 16:16:33 2020 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 19 Aug 2020 16:16:33 +0000 Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument Message-ID: <8B42FF96-373A-4B03-8B93-D4A47D765F3E@amazon.com> Please review this backport to jdk8u. I especially need a CSR review, since the CSR approval process can be a bottleneck. The patch significantly reduces fleet profiling overhead, and a version of it has been in production at Amazon for over 3 years. Original JBS issue: https://bugs.openjdk.java.net/browse/JDK-8185003 Original CSR: https://bugs.openjdk.java.net/browse/JDK-8185705 Original patch: http://hg.openjdk.java.net/jdk10/master/rev/68d46cb9be45 Backport JBS issue: https://bugs.openjdk.java.net/browse/JDK-8251494 Backport CSR: https://bugs.openjdk.java.net/browse/JDK-8251498 Backport JDK webrev: http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.05/ Backport Hotspot webrev: http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.05/ Details of the interface changes needed for the backport are in the Description of the Backport CSR 8251498. The actual functional changes are minimal and low risk. Passes the included (suitably modified) test, as well as the tests in jdk/test/java/lang/management/ThreadMXBean jdk/test/com/sun/management/ThreadMXBean Thanks, Paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vladimir.kozlov at oracle.com Wed Aug 19 16:38:24 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 19 Aug 2020 09:38:24 -0700 Subject: RFR(s): 8248295: serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal In-Reply-To: <59cd0914-5a61-463e-b46f-ebdc1496ab9f@default> References: <1bdcbd35-097e-1681-3a0c-32f9709497a4@oracle.com> <7040f785-b871-9771-94a2-4c3472a6bf6d@oracle.com> <59cd0914-5a61-463e-b46f-ebdc1496ab9f@default> Message-ID: <1b7f5767-7d1f-1f43-87bb-556801ef1c41@oracle.com> Looks good. Thanks, Vladimir K On 8/19/20 5:30 AM, Fairoz Matte wrote: > Hi Vladimir, > > Thanks for the review. > >> I would suggest to run test with -XX:+PrintCodeCache flag which prints >> CodeCache usage on exit. >> >> Also add '-ea -esa' flags - some runs failed with them because they increase >> Graal's methods size. >> >> Running test with immediately caused OOM error on my local linux machine: >> >> '-server -ea -esa -XX:+TieredCompilation -XX:+PrintCodeCache - >> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - >> XX:+UseJVMCICompiler -Djvmci.Compiler=graal' >> >> With -XX:ReservedCodeCacheSize=30m I got: >> >> [11.217s][warning][codecache] CodeCache is full. Compiler has been >> disabled. >> [11.217s][warning][codecache] Try increasing the code cache size using - >> XX:ReservedCodeCacheSize= >> >> With -XX:ReservedCodeCacheSize=50m I got this output: > > Further testing with PrintCodeCache, ReservedCodeCacheSize = 50MB is the safe one to use. > >> >> CodeCache: size=51200Kb used=34401Kb max_used=34401Kb free=16798Kb >> >> May be you need to set it to 35m or better to 50m to be safe. >> >> Note, without Graal test uses only 5.5m: >> >> CodeCache: size=20480Kb used=5677Kb max_used=5688Kb free=14803Kb >> >> ----------------------------- >> >> I also forgot to ask you to update test's Copyright year. > > I have updated the copyright year. 
> Updated webrev for the reference - http://cr.openjdk.java.net/~fmatte/8248295/webrev.01/ > > Thanks, > Fairoz >> >> Regards, >> Vladimir K >> >> On 8/18/20 1:10 AM, Fairoz Matte wrote: >>> Hi Vladimir, >>> >>> Thanks for looking into. >>> This is intermittent crash, and is reproducible in windows debug build >> environment. Below is the testing performed. >>> >>> 1. Issues observed 7/100 runs, ReservedCodeCacheSize=20m with "- >> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - >> XX:+UseJVMCICompiler" >>> 2. Issues observed 0/300 runs, ReservedCodeCacheSize=30m with "- >> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - >> XX:+UseJVMCICompiler" >>> >>> Thanks, >>> Fairoz >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Monday, August 17, 2020 11:22 PM >>>> To: Fairoz Matte ; hotspot-compiler- >>>> dev at openjdk.java.net; serviceability-dev at openjdk.java.net >>>> Cc: Coleen Phillimore ; Dean Long >>>> >>>> Subject: Re: RFR(s): 8248295: >>>> serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with >>>> Graal >>>> >>>> Hi Fairoz, >>>> >>>> How you determine that +10Mb is enough with Graal? >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 8/17/20 5:46 AM, Fairoz Matte wrote: >>>>> Hi, >>>>> >>>>> >>>>> >>>>> Please review this small test change to work with Graal. >>>>> >>>>> >>>>> >>>>> Background: >>>>> >>>>> Graal require more code cache compared to c1/c2. but the test case >>>>> always >>>> set it to 20MB. This may not be sufficient when running graal. >>>>> >>>>> Default configuration for ReservedCodeCacheSize = 250MB >>>>> >>>>> With graal enabled, ReservedCodeCacheSize = 350MB >>>>> >>>>> >>>>> >>>>> Either we can modify the framework to honor ReservedCodeCacheSize >>>>> for >>>> graal or just update the testcase. >>>>> >>>>> There are not many test cases they rely on ReservedCodeCacheSize or >>>> InitialCodeCacheSize. So the fix prefer the later one. 
>>>>> >>>>> >>>>> >>>>> JBS - https://bugs.openjdk.java.net/browse/JDK-8248295 >>>>> >>>>> Webrev - http://cr.openjdk.java.net/~fmatte/8248295/webrev.00/ >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Fairoz >>>>> >>>>> >>>>> From serguei.spitsyn at oracle.com Wed Aug 19 20:14:56 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Aug 2020 13:14:56 -0700 Subject: RFR(s): 8248295: serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal In-Reply-To: <1b7f5767-7d1f-1f43-87bb-556801ef1c41@oracle.com> References: <1bdcbd35-097e-1681-3a0c-32f9709497a4@oracle.com> <7040f785-b871-9771-94a2-4c3472a6bf6d@oracle.com> <59cd0914-5a61-463e-b46f-ebdc1496ab9f@default> <1b7f5767-7d1f-1f43-87bb-556801ef1c41@oracle.com> Message-ID: <6f104422-11cc-1bea-2ebf-a916a22f10fd@oracle.com> Hi Fairoz, LGTM++ Thanks, Serguei On 8/19/20 09:38, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir K > > On 8/19/20 5:30 AM, Fairoz Matte wrote: >> Hi Vladimir, >> >> Thanks for the review. >> >>> I would suggest to run test with -XX:+PrintCodeCache flag which prints >>> CodeCache usage on exit. >>> >>> Also add '-ea -esa' flags - some runs failed with them because they >>> increase >>> Graal's methods size. >>> >>> Running test with immediately caused OOM error on my local linux >>> machine: >>> >>> '-server -ea -esa -XX:+TieredCompilation -XX:+PrintCodeCache - >>> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - >>> XX:+UseJVMCICompiler -Djvmci.Compiler=graal' >>> >>> With -XX:ReservedCodeCacheSize=30m I got: >>> >>> [11.217s][warning][codecache] CodeCache is full. Compiler has been >>> disabled. >>> [11.217s][warning][codecache] Try increasing the code cache size >>> using - >>> XX:ReservedCodeCacheSize= >>> >>> With -XX:ReservedCodeCacheSize=50m I got this output: >> >> Further testing with PrintCodeCache, ReservedCodeCacheSize = 50MB is >> the safe one to use. 
>> >>> >>> CodeCache: size=51200Kb used=34401Kb max_used=34401Kb free=16798Kb >>> >>> May be you need to set it to 35m or better to 50m to be safe. >>> >>> Note, without Graal test uses only 5.5m: >>> >>> CodeCache: size=20480Kb used=5677Kb max_used=5688Kb free=14803Kb >>> >>> ----------------------------- >>> >>> I also forgot to ask you to update test's Copyright year. >> >> I have updated the copyright year. >> Updated webrev for the reference - >> http://cr.openjdk.java.net/~fmatte/8248295/webrev.01/ >> >> Thanks, >> Fairoz >>> >>> Regards, >>> Vladimir K >>> >>> On 8/18/20 1:10 AM, Fairoz Matte wrote: >>>> Hi Vladimir, >>>> >>>> Thanks for looking into. >>>> This is intermittent crash, and is reproducible in windows debug build >>> environment. Below is the testing performed. >>>> >>>> 1. Issues observed 7/100 runs, ReservedCodeCacheSize=20m with "- >>> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - >>> XX:+UseJVMCICompiler" >>>> 2. Issues observed 0/300 runs, ReservedCodeCacheSize=30m with "- >>> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - >>> XX:+UseJVMCICompiler" >>>> >>>> Thanks, >>>> Fairoz >>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov >>>>> Sent: Monday, August 17, 2020 11:22 PM >>>>> To: Fairoz Matte ; hotspot-compiler- >>>>> dev at openjdk.java.net; serviceability-dev at openjdk.java.net >>>>> Cc: Coleen Phillimore ; Dean Long >>>>> >>>>> Subject: Re: RFR(s): 8248295: >>>>> serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with >>>>> Graal >>>>> >>>>> Hi Fairoz, >>>>> >>>>> How you determine that +10Mb is enough with Graal? >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 8/17/20 5:46 AM, Fairoz Matte wrote: >>>>>> Hi, >>>>>> >>>>>> >>>>>> >>>>>> Please review this small test change to work with Graal. >>>>>> >>>>>> >>>>>> >>>>>> Background: >>>>>> >>>>>> Graal require more code cache compared to c1/c2. but the test case >>>>>> always >>>>> set it to 20MB. This may not be sufficient when running graal. 
>>>>>> >>>>>> Default configuration for ReservedCodeCacheSize = 250MB >>>>>> >>>>>> With graal enabled, ReservedCodeCacheSize = 350MB >>>>>> >>>>>> >>>>>> >>>>>> Either we can modify the framework to honor ReservedCodeCacheSize >>>>>> for >>>>> graal or just update the testcase. >>>>>> >>>>>> There are not many test cases they rely on ReservedCodeCacheSize or >>>>> InitialCodeCacheSize. So the fix prefer the later one. >>>>>> >>>>>> >>>>>> >>>>>> JBS - https://bugs.openjdk.java.net/browse/JDK-8248295 >>>>>> >>>>>> Webrev - http://cr.openjdk.java.net/~fmatte/8248295/webrev.00/ >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Fairoz >>>>>> >>>>>> >>>>>> From hohensee at amazon.com Wed Aug 19 20:18:01 2020 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 19 Aug 2020 20:18:01 +0000 Subject: [Discussion] Expected behavior of combining "all" and "live" options of jmap Message-ID: I prioritize compatibility, so would go with option 2. Thanks, Paul ?On 8/18/20, 11:17 PM, "serviceability-dev on behalf of linzang(??)" wrote: Dear All, May I get some suggestions? so that I can work out a patch base on that. Or may be it should not be treated as an issue? BRs, Lin On 17/08/2020 17:17, linzang(??) wrote: > Dear all, > we found the jmap?s histo/dump command could accept "live" and "all" options together, and the specification does not describe what is the expected behavior of it. > I have tried that when these two options used together, the "live" takes effect, no matter what sequences are they in commandline. > IMO, it is a little confused to use "live" and "all" together, and if it is allowed, the specification may need to be updated to state the behavior clearly. > Therefore may I ask your suggestion on which option of the following is prefered: > (option 1.) disallow using these two options together, I think this is more clear, but I am not sure whether there is backward compatibility risk. > (option 2.) 
allow the combination use of "live" and "all", and update the specification to clearly describe the behavior that "live" takes effect in this case. > What do you think? > > Thanks, > Lin > > > From serguei.spitsyn at oracle.com Wed Aug 19 20:58:19 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Aug 2020 13:58:19 -0700 Subject: RFR: JDK-8234808: jdb quoted option parsing broken In-Reply-To: References: Message-ID: <5264fc63-3ad4-ad13-768e-6c9183c52dda@oracle.com> An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Wed Aug 19 22:11:36 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 19 Aug 2020 15:11:36 -0700 Subject: RFR: JDK-8234808: jdb quoted option parsing broken In-Reply-To: <5264fc63-3ad4-ad13-768e-6c9183c52dda@oracle.com> References: <5264fc63-3ad4-ad13-768e-6c9183c52dda@oracle.com> Message-ID: Hi Serguei, thank you for the feedback. On 08/19/2020 13:58, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > Sorry, I've overlooked this request for review. > The fix looks good in general. > > > http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/VMConnection.java.frames.html > > 81 private Map > parseConnectorArgs(Connector connector, > 82 String argString, > 83 String extraOptions) { > > To make it more elegant I'd suggest to place the returned type on a > separate line like below: > private Map > parseConnectorArgs(Connector connector, String argString, String > extraOptions) { Do you mean second line indent should be the same as 1st? or make it 8 spaces: private Map parseConnectorArgs(Connector connector, String argString, String extraOptions) { > > 127 sb.append(extraOptions).append(" "); > 128 // set extraOptions to null to not set it again > 129 extraOptions = null; > > What about rewording the comment like below? : > ?? ? // set extraOptions to null to avoid appending it again ok. 
> > 165 if (extraOptions != null) { > 166 // there was no "option" specified in argString > 167 Connector.Argument argument = arguments.get("options"); > 168 if (argument != null) { > 169 argument.setValue(extraOptions); > 170 } > 171 } > > Should the "option" in the comment be replaced with "options"? right. > What if the argument at line 167 was set to null? > Will the extraOptions be ignored in such a case? extraOptions makes sense only for CommandLineLaunch connector which launches new VM (and only this connector has "options" argument). Other connectors (attach or listen) connect to existing VM and cannot set its options. > > > http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/test/jdk/com/sun/jdi/JdbOptions.java.html > > This line is probably not needed anymore: > > 157 //jdb.quit(); > will delete. --alex > > > > Thanks, > Serguei > > On 8/7/20 15:09, Alex Menkov wrote: >> Hi all, >> >> please review the fix for >> https://bugs.openjdk.java.net/browse/JDK-8234808 >> webrev: >> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/ >> >> Some background: >> when jdb launches debuggee process it passes java options from >> "options" value for CommandLineLaunch connector and forward options >> specified before command. >> >> The fix solves several discovered issues: >> - proper handling of java options with spaces >> - if both way are used to specify java options, forwarded options >> override options from "options" value >> >> VMConnection class implements tricky logic for "options" field parsing >> for JFR needs (handling of single and double quotes). I decided to >> keep it as is to avoid massive test failures with JFR (there is no >> test coverage for this functionality and I'm not sure I understand all >> requirements). 
>> >> --alex > From serguei.spitsyn at oracle.com Wed Aug 19 23:14:59 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Aug 2020 16:14:59 -0700 Subject: RFR: JDK-8234808: jdb quoted option parsing broken In-Reply-To: References: <5264fc63-3ad4-ad13-768e-6c9183c52dda@oracle.com> Message-ID: <67794350-6c08-3354-cefe-302b931cf8ce@oracle.com> On 8/19/20 15:11, Alex Menkov wrote: > Hi Serguei, > > thank you for the feedback. > > On 08/19/2020 13:58, serguei.spitsyn at oracle.com wrote: >> Hi Alex, >> >> Sorry, I've overlooked this request for review. >> The fix looks good in general. >> >> >> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/VMConnection.java.frames.html >> >> >> 81 private Map >> parseConnectorArgs(Connector connector, >> 82 String argString, >> 83 String extraOptions) { >> >> To make it more elegant I'd suggest to place the returned type on a >> separate line like below: >> private Map >> parseConnectorArgs(Connector connector, String argString, String >> extraOptions) { > > Do you mean second line indent should be the same as 1st? > or make it 8 spaces: > > private Map > ??????? parseConnectorArgs(Connector connector, String argString, > String extraOptions) { No indent is needed, I think. My suggestion is to use extra line for method return type instead of method arguments. >> >> 127 sb.append(extraOptions).append(" "); >> 128 // set extraOptions to null to not set it again >> 129 extraOptions = null; >> >> What about rewording the comment like below? : >> ??? ? // set extraOptions to null to avoid appending it again > > ok. > >> >> 165 if (extraOptions != null) { >> 166 // there was no "option" specified in argString >> 167 Connector.Argument argument = arguments.get("options"); >> 168 if (argument != null) { >> 169 argument.setValue(extraOptions); >> 170 } >> 171 } >> >> Should the "option" in the comment be replaced with "options"? > > right. 
> >> What if the argument at line 167 was set to null? >> Will the extraOptions be ignored in such a case? > > extraOptions makes sense only for CommandLineLaunch connector which > launches new VM (and only this connector has "options" argument). > Other connectors (attach or listen) connect to existing VM and cannot > set its options. Okay, thank you for explanation. Thanks, Serguei >> >> >> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/test/jdk/com/sun/jdi/JdbOptions.java.html >> >> >> This line is probably not needed anymore: >> >> ? 157???????????? //jdb.quit(); >> > > will delete. > > --alex > >> >> >> >> Thanks, >> Serguei >> >> On 8/7/20 15:09, Alex Menkov wrote: >>> Hi all, >>> >>> please review the fix for >>> https://bugs.openjdk.java.net/browse/JDK-8234808 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/ >>> >>> Some background: >>> when jdb launches debuggee process it passes java options from >>> "options" value for CommandLineLaunch connector and forward options >>> specified before command. >>> >>> The fix solves several discovered issues: >>> - proper handling of java options with spaces >>> - if both way are used to specify java options, forwarded options >>> override options from "options" value >>> >>> VMConnection class implements tricky logic for "options" field >>> parsing for JFR needs (handling of single and double quotes). I >>> decided to keep it as is to avoid massive test failures with JFR >>> (there is no test coverage for this functionality and I'm not sure I >>> understand all requirements). 
>>> >>> --alex >> From serguei.spitsyn at oracle.com Wed Aug 19 23:22:08 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Aug 2020 16:22:08 -0700 Subject: RFR(T) : 8252005 : narrow disabling of allowSmartActionArgs in vmTestbase In-Reply-To: <4E6FECE6-9103-46ED-84B2-79DBA0123ED9@oracle.com> References: <4E6FECE6-9103-46ED-84B2-79DBA0123ED9@oracle.com> Message-ID: <17a8369e-5f38-ebab-974b-28e083378aa2@oracle.com> Hi Igor, This looks reasonable. Thanks, Serguei On 8/18/20 16:42, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8252005/webrev.00/ >> 0 lines changed: 0 ins; 0 del; 0 mod; > Hi all, > > could you please review this trivial (and apparently empty) patch which sets allowSmartActionArgs to false only in subdirectories of vmTestbase which currently use PropertyResolvingWrapper? > > (it's hard to tell from webrev or patch, but test/hotspot/jtreg/vmTestbase/TEST.properties is effectively removed) > > webrev: http://cr.openjdk.java.net/~iignatyev//8252005/webrev.00/ > JBS: https://bugs.openjdk.java.net/browse/JDK-8252005 > > Thanks, > -- Igor > > From alexey.menkov at oracle.com Wed Aug 19 23:35:28 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 19 Aug 2020 16:35:28 -0700 Subject: RFR: JDK-8234808: jdb quoted option parsing broken In-Reply-To: <67794350-6c08-3354-cefe-302b931cf8ce@oracle.com> References: <5264fc63-3ad4-ad13-768e-6c9183c52dda@oracle.com> <67794350-6c08-3354-cefe-302b931cf8ce@oracle.com> Message-ID: <0b1a25ba-c044-1f49-ef2e-6b412c2cf601@oracle.com> Updated webrev: http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev.02/ --alex On 08/19/2020 16:14, serguei.spitsyn at oracle.com wrote: > On 8/19/20 15:11, Alex Menkov wrote: >> Hi Serguei, >> >> thank you for the feedback. >> >> On 08/19/2020 13:58, serguei.spitsyn at oracle.com wrote: >>> Hi Alex, >>> >>> Sorry, I've overlooked this request for review. >>> The fix looks good in general. 
>>> >>> >>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/VMConnection.java.frames.html >>> >>> >>> 81 private Map >>> parseConnectorArgs(Connector connector, >>> 82 String argString, >>> 83 String extraOptions) { >>> >>> To make it more elegant I'd suggest to place the returned type on a >>> separate line like below: >>> private Map >>> parseConnectorArgs(Connector connector, String argString, String >>> extraOptions) { >> >> Do you mean second line indent should be the same as 1st? >> or make it 8 spaces: >> >> private Map >> ??????? parseConnectorArgs(Connector connector, String argString, >> String extraOptions) { > > No indent is needed, I think. > My suggestion is to use extra line for method return type instead of > method arguments. > > >>> >>> 127 sb.append(extraOptions).append(" "); >>> 128 // set extraOptions to null to not set it again >>> 129 extraOptions = null; >>> >>> What about rewording the comment like below? : >>> ??? ? // set extraOptions to null to avoid appending it again >> >> ok. >> >>> >>> 165 if (extraOptions != null) { >>> 166 // there was no "option" specified in argString >>> 167 Connector.Argument argument = arguments.get("options"); >>> 168 if (argument != null) { >>> 169 argument.setValue(extraOptions); >>> 170 } >>> 171 } >>> >>> Should the "option" in the comment be replaced with "options"? >> >> right. >> >>> What if the argument at line 167 was set to null? >>> Will the extraOptions be ignored in such a case? >> >> extraOptions makes sense only for CommandLineLaunch connector which >> launches new VM (and only this connector has "options" argument). >> Other connectors (attach or listen) connect to existing VM and cannot >> set its options. > > Okay, thank you for explanation. 
> > Thanks, > Serguei > >>> >>> >>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/test/jdk/com/sun/jdi/JdbOptions.java.html >>> >>> >>> This line is probably not needed anymore: >>> >>> ? 157???????????? //jdb.quit(); >>> >> >> will delete. >> >> --alex >> >>> >>> >>> >>> Thanks, >>> Serguei >>> >>> On 8/7/20 15:09, Alex Menkov wrote: >>>> Hi all, >>>> >>>> please review the fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8234808 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/ >>>> >>>> Some background: >>>> when jdb launches debuggee process it passes java options from >>>> "options" value for CommandLineLaunch connector and forward options >>>> specified before command. >>>> >>>> The fix solves several discovered issues: >>>> - proper handling of java options with spaces >>>> - if both way are used to specify java options, forwarded options >>>> override options from "options" value >>>> >>>> VMConnection class implements tricky logic for "options" field >>>> parsing for JFR needs (handling of single and double quotes). I >>>> decided to keep it as is to avoid massive test failures with JFR >>>> (there is no test coverage for this functionality and I'm not sure I >>>> understand all requirements). >>>> >>>> --alex >>> > From serguei.spitsyn at oracle.com Wed Aug 19 23:46:03 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Aug 2020 16:46:03 -0700 Subject: RFR: JDK-8234808: jdb quoted option parsing broken In-Reply-To: <0b1a25ba-c044-1f49-ef2e-6b412c2cf601@oracle.com> References: <5264fc63-3ad4-ad13-768e-6c9183c52dda@oracle.com> <67794350-6c08-3354-cefe-302b931cf8ce@oracle.com> <0b1a25ba-c044-1f49-ef2e-6b412c2cf601@oracle.com> Message-ID: <89e03cfd-7ae1-1c88-26eb-73b8c6b8d79f@oracle.com> Thank you for the update, Alex! It looks good. 
Thanks, Serguei On 8/19/20 16:35, Alex Menkov wrote: > Updated webrev: > > http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev.02/ > > --alex > > On 08/19/2020 16:14, serguei.spitsyn at oracle.com wrote: >> On 8/19/20 15:11, Alex Menkov wrote: >>> Hi Serguei, >>> >>> thank you for the feedback. >>> >>> On 08/19/2020 13:58, serguei.spitsyn at oracle.com wrote: >>>> Hi Alex, >>>> >>>> Sorry, I've overlooked this request for review. >>>> The fix looks good in general. >>>> >>>> >>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/VMConnection.java.frames.html >>>> >>>> >>>> 81 private Map >>>> parseConnectorArgs(Connector connector, >>>> 82 String argString, >>>> 83 String extraOptions) { >>>> >>>> To make it more elegant I'd suggest to place the returned type on a >>>> separate line like below: >>>> private Map >>>> parseConnectorArgs(Connector connector, String argString, String >>>> extraOptions) { >>> >>> Do you mean second line indent should be the same as 1st? >>> or make it 8 spaces: >>> >>> private Map >>> ??????? parseConnectorArgs(Connector connector, String argString, >>> String extraOptions) { >> >> No indent is needed, I think. >> My suggestion is to use extra line for method return type instead of >> method arguments. >> >> >>>> >>>> 127 sb.append(extraOptions).append(" "); >>>> 128 // set extraOptions to null to not set it again >>>> 129 extraOptions = null; >>>> >>>> What about rewording the comment like below? : >>>> ??? ? // set extraOptions to null to avoid appending it again >>> >>> ok. >>> >>>> >>>> 165 if (extraOptions != null) { >>>> 166 // there was no "option" specified in argString >>>> 167 Connector.Argument argument = arguments.get("options"); >>>> 168 if (argument != null) { >>>> 169 argument.setValue(extraOptions); >>>> 170 } >>>> 171 } >>>> >>>> Should the "option" in the comment be replaced with "options"? >>> >>> right. 
>>> >>>> What if the argument at line 167 was set to null? >>>> Will the extraOptions be ignored in such a case? >>> >>> extraOptions makes sense only for CommandLineLaunch connector which >>> launches new VM (and only this connector has "options" argument). >>> Other connectors (attach or listen) connect to existing VM and >>> cannot set its options. >> >> Okay, thank you for explanation. >> >> Thanks, >> Serguei >> >>>> >>>> >>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/test/jdk/com/sun/jdi/JdbOptions.java.html >>>> >>>> >>>> This line is probably not needed anymore: >>>> >>>> ? 157???????????? //jdb.quit(); >>>> >>> >>> will delete. >>> >>> --alex >>> >>>> >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> On 8/7/20 15:09, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> please review the fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8234808 >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/ >>>>> >>>>> Some background: >>>>> when jdb launches debuggee process it passes java options from >>>>> "options" value for CommandLineLaunch connector and forward >>>>> options specified before command. >>>>> >>>>> The fix solves several discovered issues: >>>>> - proper handling of java options with spaces >>>>> - if both way are used to specify java options, forwarded options >>>>> override options from "options" value >>>>> >>>>> VMConnection class implements tricky logic for "options" field >>>>> parsing for JFR needs (handling of single and double quotes). I >>>>> decided to keep it as is to avoid massive test failures with JFR >>>>> (there is no test coverage for this functionality and I'm not sure >>>>> I understand all requirements). 
>>>>> >>>>> --alex >>>> >> From alexey.menkov at oracle.com Thu Aug 20 01:02:24 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 19 Aug 2020 18:02:24 -0700 Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM Message-ID: Hi all, please review the fix for https://bugs.openjdk.java.net/browse/JDK-8251384 webrev: http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/ The fix introduces new @requires option "vm.jvmti": test/lib/sun/hotspot/WhiteBox.java test/jtreg-ext/requires/VMProps.java src/hotspot/share/prims/whitebox.cpp test/hotspot/jtreg/TEST.ROOT and updates tests in test/hotspot/jtreg/serviceability/jvmti (the only change in all tests is added "@requires vm.jvmti") Other tests will be updated in the follow-ups. The From serguei.spitsyn at oracle.com Thu Aug 20 02:42:29 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Aug 2020 19:42:29 -0700 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> Message-ID: <8d945a55-aebf-a497-e793-2cd58c2f2392@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Aug 20 03:34:48 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Aug 2020 20:34:48 -0700 Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM In-Reply-To: References: Message-ID: <20a48d99-d0c9-be20-640f-0207114e662f@oracle.com> Hi Alex, It looks good to me. But there are more tests in test/hotspot/jtreg/runtime and test/hotspot/jtreg/compiler which use the Instrumentation API, and so, depend on the JVMTI. Examples are: ? test/hotspot/jtreg/compiler/jsr292 test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.runtime.test/src/jdk/vm/ci/runtime/test ? 
test/hotspot/jtreg/compiler/profiling/spectrapredefineclass ? test/hotspot/jtreg/runtime/Metaspace/DefineClass.java ? test/hotspot/jtreg/runtime/cds/appcds/cacheObject ? test/hotspot/jtreg/runtime/cds/appcds/javaldr ? test/hotspot/jtreg/runtime/cds/appcds/jvmti ? test/hotspot/jtreg/runtime/records ? test/hotspot/jtreg/runtime/sealedClasses I wonder if you are aware about these tests and have any plan for them. Thanks, Serguei On 8/19/20 18:02, Alex Menkov wrote: > Hi all, > > please review the fix for > https://bugs.openjdk.java.net/browse/JDK-8251384 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/ > > The fix introduces new @requires option "vm.jvmti": > test/lib/sun/hotspot/WhiteBox.java > test/jtreg-ext/requires/VMProps.java > src/hotspot/share/prims/whitebox.cpp > test/hotspot/jtreg/TEST.ROOT > > and updates tests in test/hotspot/jtreg/serviceability/jvmti (the only > change in all tests is added "@requires vm.jvmti") > Other tests will be updated in the follow-ups. > > The From fairoz.matte at oracle.com Thu Aug 20 03:39:51 2020 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Wed, 19 Aug 2020 20:39:51 -0700 (PDT) Subject: RFR(s): 8248295: serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal In-Reply-To: <6f104422-11cc-1bea-2ebf-a916a22f10fd@oracle.com> References: <1bdcbd35-097e-1681-3a0c-32f9709497a4@oracle.com> <7040f785-b871-9771-94a2-4c3472a6bf6d@oracle.com> <59cd0914-5a61-463e-b46f-ebdc1496ab9f@default> <1b7f5767-7d1f-1f43-87bb-556801ef1c41@oracle.com> <6f104422-11cc-1bea-2ebf-a916a22f10fd@oracle.com> Message-ID: <94f5c0a2-f324-4613-abbd-68c4d7df6f52@default> Thanks Vladimir and Serguei for the reviews. 
Thanks, Fairoz > -----Original Message----- > From: Serguei Spitsyn > Sent: Thursday, August 20, 2020 1:45 AM > To: Vladimir Kozlov ; Fairoz Matte > ; hotspot-compiler-dev at openjdk.java.net; > serviceability-dev at openjdk.java.net > Cc: Coleen Phillimore > Subject: Re: RFR(s): 8248295: > serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with Graal > > Hi Fairoz, > > LGTM++ > > Thanks, > Serguei > > > On 8/19/20 09:38, Vladimir Kozlov wrote: > > Looks good. > > > > Thanks, > > Vladimir K > > > > On 8/19/20 5:30 AM, Fairoz Matte wrote: > >> Hi Vladimir, > >> > >> Thanks for the review. > >> > >>> I would suggest to run test with -XX:+PrintCodeCache flag which > >>> prints CodeCache usage on exit. > >>> > >>> Also add '-ea -esa' flags - some runs failed with them because they > >>> increase Graal's methods size. > >>> > >>> Running test with immediately caused OOM error on my local linux > >>> machine: > >>> > >>> '-server -ea -esa -XX:+TieredCompilation -XX:+PrintCodeCache - > >>> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - > >>> XX:+UseJVMCICompiler -Djvmci.Compiler=graal' > >>> > >>> With -XX:ReservedCodeCacheSize=30m I got: > >>> > >>> [11.217s][warning][codecache] CodeCache is full. Compiler has been > >>> disabled. > >>> [11.217s][warning][codecache] Try increasing the code cache size > >>> using - XX:ReservedCodeCacheSize= > >>> > >>> With -XX:ReservedCodeCacheSize=50m I got this output: > >> > >> Further testing with PrintCodeCache, ReservedCodeCacheSize = 50MB is > >> the safe one to use. > >> > >>> > >>> CodeCache: size=51200Kb used=34401Kb max_used=34401Kb > free=16798Kb > >>> > >>> May be you need to set it to 35m or better to 50m to be safe. > >>> > >>> Note, without Graal test uses only 5.5m: > >>> > >>> CodeCache: size=20480Kb used=5677Kb max_used=5688Kb > free=14803Kb > >>> > >>> ----------------------------- > >>> > >>> I also forgot to ask you to update test's Copyright year. > >> > >> I have updated the copyright year. 
> >> Updated webrev for the reference - > >> http://cr.openjdk.java.net/~fmatte/8248295/webrev.01/ > >> > >> Thanks, > >> Fairoz > >>> > >>> Regards, > >>> Vladimir K > >>> > >>> On 8/18/20 1:10 AM, Fairoz Matte wrote: > >>>> Hi Vladimir, > >>>> > >>>> Thanks for looking into. > >>>> This is intermittent crash, and is reproducible in windows debug > >>>> build > >>> environment. Below is the testing performed. > >>>> > >>>> 1. Issues observed 7/100 runs, ReservedCodeCacheSize=20m with "- > >>> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - > >>> XX:+UseJVMCICompiler" > >>>> 2. Issues observed 0/300 runs, ReservedCodeCacheSize=30m with "- > >>> XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI - > >>> XX:+UseJVMCICompiler" > >>>> > >>>> Thanks, > >>>> Fairoz > >>>> > >>>>> -----Original Message----- > >>>>> From: Vladimir Kozlov > >>>>> Sent: Monday, August 17, 2020 11:22 PM > >>>>> To: Fairoz Matte ; hotspot-compiler- > >>>>> dev at openjdk.java.net; serviceability-dev at openjdk.java.net > >>>>> Cc: Coleen Phillimore ; Dean Long > >>>>> > >>>>> Subject: Re: RFR(s): 8248295: > >>>>> serviceability/jvmti/CompiledMethodLoad/Zombie.java failure with > >>>>> Graal > >>>>> > >>>>> Hi Fairoz, > >>>>> > >>>>> How you determine that +10Mb is enough with Graal? > >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>> On 8/17/20 5:46 AM, Fairoz Matte wrote: > >>>>>> Hi, > >>>>>> > >>>>>> > >>>>>> > >>>>>> Please review this small test change to work with Graal. > >>>>>> > >>>>>> > >>>>>> > >>>>>> Background: > >>>>>> > >>>>>> Graal require more code cache compared to c1/c2. but the test > >>>>>> case always > >>>>> set it to 20MB. This may not be sufficient when running graal. 
> >>>>>> > >>>>>> Default configuration for ReservedCodeCacheSize = 250MB > >>>>>> > >>>>>> With graal enabled, ReservedCodeCacheSize = 350MB > >>>>>> > >>>>>> > >>>>>> > >>>>>> Either we can modify the framework to honor > ReservedCodeCacheSize > >>>>>> for > >>>>> graal or just update the testcase. > >>>>>> > >>>>>> There are not many test cases they rely on ReservedCodeCacheSize > >>>>>> or > >>>>> InitialCodeCacheSize. So the fix prefer the later one. > >>>>>> > >>>>>> > >>>>>> > >>>>>> JBS - https://bugs.openjdk.java.net/browse/JDK-8248295 > >>>>>> > >>>>>> Webrev - http://cr.openjdk.java.net/~fmatte/8248295/webrev.00/ > >>>>>> > >>>>>> > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Fairoz > >>>>>> > >>>>>> > >>>>>> > From linzang at tencent.com Thu Aug 20 12:18:10 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Thu, 20 Aug 2020 12:18:10 +0000 Subject: [Discussion] Expected behavior of combining "all" and "live" options of jmap(Internet mail) References: Message-ID: <258fb4cee0564eefb66b83bd6a74a8ae@tencent.com> Thanks Paul! I have filed CSR and Bug: CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 Patch is under testing, will create RFR thread when it is ready. Thanks! Cheers, Lin On 20/08/2020 04:18, Hohensee, Paul wrote: > I prioritize compatibility, so would go with option 2. > > Thanks, > Paul > > ?On 8/18/20, 11:17 PM, "serviceability-dev on behalf of linzang(??)" wrote: > > Dear All, > May I get some suggestions? so that I can work out a patch > base on that. > Or may be it should not be treated as an issue? > BRs, > Lin > > On 17/08/2020 17:17, linzang(??) wrote: > > Dear all, > > we found the jmap?s histo/dump command could accept "live" and "all" options together, and the specification does not describe what is the expected behavior of it. 
> > I have tried that when these two options used together, the "live" takes effect, no matter what sequences are they in commandline. > > IMO, it is a little confused to use "live" and "all" together, and if it is allowed, the specification may need to be updated to state the behavior clearly. > > Therefore may I ask your suggestion on which option of the following is prefered: > > (option 1.) disallow using these two options together, I think this is more clear, but I am not sure whether there is backward compatibility risk. > > (option 2.) allow the combination use of "live" and "all", and update the specification to clearly describe the behavior that "live" takes effect in this case. > > What do you think? > > > > Thanks, > > Lin > > > > > > > > > From linzang at tencent.com Thu Aug 20 16:17:42 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Thu, 20 Aug 2020 16:17:42 +0000 Subject: RFR(s):8252101 Add specification of expected behavior of combining "all" and "live" options of jmap(Internet mail) Message-ID: Dear All, May I ask your help to review this change: Webrev: http://cr.openjdk.java.net/~lzang/8252101/webrev.00/ CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 This change adds the description of expected behavior for jmap -hiso/-dump to use "all" and "live" at the same time. With Paul's help, It also includes code refine of the dump() function in Jmap.java. which is based on Paul's change http://cr.openjdk.java.net/~phh/8251835/webrev.00/ BRs, Lin ?On 2020/8/20, 8:18 PM, "linzang(??)" wrote: Thanks Paul! I have filed CSR and Bug: CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 Patch is under testing, will create RFR thread when it is ready. Thanks! Cheers, Lin On 20/08/2020 04:18, Hohensee, Paul wrote: > I prioritize compatibility, so would go with option 2. 
> > Thanks, > Paul > > On 8/18/20, 11:17 PM, "serviceability-dev on behalf of linzang(??)" wrote: > > Dear All, > May I get some suggestions? so that I can work out a patch > base on that. > Or may be it should not be treated as an issue? > BRs, > Lin > > On 17/08/2020 17:17, linzang(??) wrote: > > Dear all, > > we found the jmap?s histo/dump command could accept "live" and "all" options together, and the specification does not describe what is the expected behavior of it. > > I have tried that when these two options used together, the "live" takes effect, no matter what sequences are they in commandline. > > IMO, it is a little confused to use "live" and "all" together, and if it is allowed, the specification may need to be updated to state the behavior clearly. > > Therefore may I ask your suggestion on which option of the following is prefered: > > (option 1.) disallow using these two options together, I think this is more clear, but I am not sure whether there is backward compatibility risk. > > (option 2.) allow the combination use of "live" and "all", and update the specification to clearly describe the behavior that "live" takes effect in this case. > > What do you think? > > > > Thanks, > > Lin > > > > > > > > > From igor.ignatyev at oracle.com Thu Aug 20 16:23:31 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 20 Aug 2020 09:23:31 -0700 Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM In-Reply-To: References: Message-ID: <80FFB92E-28EC-45D1-9E1D-9F189550270A@oracle.com> HI Alex, one minor nit: according to usual java coding conventions, isJVMTIIncluded should be spelled as isJvmtiIncluded. otherwise the fix looks good to me. > Other tests will be updated in the follow-ups. have you already identified all the tests which need this @requires? filed bugs/RFEs for them? 
Cheers, -- Igor > On Aug 19, 2020, at 6:02 PM, Alex Menkov wrote: > > Hi all, > > please review the fix for > https://bugs.openjdk.java.net/browse/JDK-8251384 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/ > > The fix introduces new @requires option "vm.jvmti": > test/lib/sun/hotspot/WhiteBox.java > test/jtreg-ext/requires/VMProps.java > src/hotspot/share/prims/whitebox.cpp > test/hotspot/jtreg/TEST.ROOT > > and updates tests in test/hotspot/jtreg/serviceability/jvmti (the only change in all tests is added "@requires vm.jvmti") > Other tests will be updated in the follow-ups. > > The From igor.ignatyev at oracle.com Thu Aug 20 17:16:31 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 20 Aug 2020 10:16:31 -0700 Subject: RFR(T) : 8252005 : narrow disabling of allowSmartActionArgs in vmTestbase In-Reply-To: <17a8369e-5f38-ebab-974b-28e083378aa2@oracle.com> References: <4E6FECE6-9103-46ED-84B2-79DBA0123ED9@oracle.com> <17a8369e-5f38-ebab-974b-28e083378aa2@oracle.com> Message-ID: Hi Serguei, thanks for your review. I've decided to slightly modify the patch and use the ids of subtasks in TEST.properties files (instead of main bug id) in order to avoid possible confusion in the future: - incremental: http://cr.openjdk.java.net/~iignatyev//8252005/webrev.0-1/index.html - whole: http://cr.openjdk.java.net/~iignatyev//8252005/webrev.01/index.html could you please re-review it? Thanks, -- Igor > On Aug 19, 2020, at 4:22 PM, serguei.spitsyn at oracle.com wrote: > > Hi Igor, > > This looks reasonable. > > Thanks, > Serguei > > > On 8/18/20 16:42, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8252005/webrev.00/ >>> 0 lines changed: 0 ins; 0 del; 0 mod; >> Hi all, >> >> could you please review this trivial (and apparently empty) patch which sets allowSmartActionArgs to false only in subdirectories of vmTestbase which currently use PropertyResolvingWrapper? 
>> >> (it's hard to tell from webrev or patch, but test/hotspot/jtreg/vmTestbase/TEST.properties is effectively removed) >> >> webrev: http://cr.openjdk.java.net/~iignatyev//8252005/webrev.00/ >> JBS: https://bugs.openjdk.java.net/browse/JDK-8252005 >> >> Thanks, >> -- Igor >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Aug 20 17:55:53 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 20 Aug 2020 10:55:53 -0700 Subject: RFR(T) : 8252005 : narrow disabling of allowSmartActionArgs in vmTestbase In-Reply-To: References: <4E6FECE6-9103-46ED-84B2-79DBA0123ED9@oracle.com> <17a8369e-5f38-ebab-974b-28e083378aa2@oracle.com> Message-ID: <8eb1187f-8030-2adf-b20d-d289bfa35198@oracle.com> An HTML attachment was scrubbed... URL: From igor.ignatyev at oracle.com Thu Aug 20 18:18:19 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 20 Aug 2020 11:18:19 -0700 Subject: RFR(T) : 8252005 : narrow disabling of allowSmartActionArgs in vmTestbase In-Reply-To: <8eb1187f-8030-2adf-b20d-d289bfa35198@oracle.com> References: <4E6FECE6-9103-46ED-84B2-79DBA0123ED9@oracle.com> <17a8369e-5f38-ebab-974b-28e083378aa2@oracle.com> <8eb1187f-8030-2adf-b20d-d289bfa35198@oracle.com> Message-ID: <3CB6B3FF-458B-4B76-872B-46A6D30B7A33@oracle.com> thanks Serguei, pushed. -- Igor > On Aug 20, 2020, at 10:55 AM, serguei.spitsyn at oracle.com wrote: > > Hi Igor, > > Still looks good to me. > The webrev is veeeeery slow. > > Thanks, > Serguei > > > On 8/20/20 10:16, Igor Ignatyev wrote: >> Hi Serguei, >> >> thanks for your review. 
>> I've decided to slightly modify the patch and use the ids of subtasks in TEST.properties files (instead of main bug id) in order to avoid possible confusion in the future:
>> - incremental: http://cr.openjdk.java.net/~iignatyev//8252005/webrev.0-1/index.html
>> - whole: http://cr.openjdk.java.net/~iignatyev//8252005/webrev.01/index.html
>>
>> could you please re-review it?
>>
>> Thanks,
>> -- Igor
>>
>>> On Aug 19, 2020, at 4:22 PM, serguei.spitsyn at oracle.com wrote:
>>>
>>> Hi Igor,
>>>
>>> This looks reasonable.
>>>
>>> Thanks,
>>> Serguei
>>>
>>> On 8/18/20 16:42, Igor Ignatyev wrote:
>>>> http://cr.openjdk.java.net/~iignatyev//8252005/webrev.00/
>>>>> 0 lines changed: 0 ins; 0 del; 0 mod;
>>>> Hi all,
>>>>
>>>> could you please review this trivial (and apparently empty) patch which sets allowSmartActionArgs to false only in subdirectories of vmTestbase which currently use PropertyResolvingWrapper?
>>>>
>>>> (it's hard to tell from webrev or patch, but test/hotspot/jtreg/vmTestbase/TEST.properties is effectively removed)
>>>>
>>>> webrev: http://cr.openjdk.java.net/~iignatyev//8252005/webrev.00/
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252005
>>>>
>>>> Thanks,
>>>> -- Igor

From volker.simonis at gmail.com Thu Aug 20 18:30:16 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 20 Aug 2020 20:30:16 +0200
Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument
In-Reply-To: <8B42FF96-373A-4B03-8B93-D4A47D765F3E@amazon.com>
References: <8B42FF96-373A-4B03-8B93-D4A47D765F3E@amazon.com>
Message-ID:

On Wed, Aug 19, 2020 at 6:17 PM Hohensee, Paul wrote:
>
> Please review this backport to jdk8u. I especially need a CSR review, since the CSR approval process can be a bottleneck.
> The patch significantly reduces fleet profiling overhead, and a version of it has been in production at Amazon for over 3 years.
>
> Original JBS issue: https://bugs.openjdk.java.net/browse/JDK-8185003
> Original CSR: https://bugs.openjdk.java.net/browse/JDK-8185705
> Original patch: http://hg.openjdk.java.net/jdk10/master/rev/68d46cb9be45
>
> Backport JBS issue: https://bugs.openjdk.java.net/browse/JDK-8251494
> Backport CSR: https://bugs.openjdk.java.net/browse/JDK-8251498
> Backport JDK webrev: http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.05/

JDK part looks good to me.

> Backport Hotspot webrev: http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.05/

HotSpot part looks good to me but see discussion below.

> Details of the interface changes needed for the backport are in the Description of the Backport CSR 8251498. The actual functional changes are minimal and low risk.

I've also reviewed the CSR yesterday which I think is fine. But now, when looking at the implementation, I'm a little concerned about changing JMM_VERSION from "0x20010203" to "0x20020000" in "jmm.h". This might be especially problematic in combination with the changes in "Management::get_jmm_interface()" which is called by JVM_GetManagement():

 void* Management::get_jmm_interface(int version) {
 #if INCLUDE_MANAGEMENT
-  if (version == JMM_VERSION_1_0) {
+  if (version == JMM_VERSION) {
     return (void*) &jmm_interface;
   }
 #endif // INCLUDE_MANAGEMENT
   return NULL;
 }

You've correctly fixed the single caller of "JVM_GetManagement()" in the JDK (in "JNI_OnLoad()" in "management.c"):

-  jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION_1_0);
+  jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION);

but I wonder if there are other monitoring/serviceability tools out there which use this interface and which will break after this change.
A quick search revealed at least two StackOverflow entries which recommend using "JVM_GetManagement(JMM_VERSION_1_0)" [1,2] and there's a talk and a blog entry doing the same [3, 4].

I'm not sure how relevant this is but I think a much safer and backwards-compatible way of doing this downport would be the following:

- don't change "Management::get_jmm_interface()" (i.e. still check for "JMM_VERSION_1_0") but return the "new" JMM_VERSION in "jmm_GetVersion()". This won't break anything but will make it possible for clients to detect the new version if they want.
- don't change the signature of "DumpThreads()". Instead add a new version (e.g. "DumpThreadsMaxDepth()/jmm_DumpThreadsMaxDepth()") to the "JmmInterface" struct and to "jmm_interface" in "management.cpp". You can do this in one of the first two reserved fields of "JmmInterface" so you won't break binary compatibility. "jmm_DumpThreads()" will then be a simple wrapper which calls "jmm_DumpThreadsMaxDepth()" with Integer.MAX_VALUE as depth.
- in the JDK you then simply call "DumpThreadsMaxDepth()" in "Java_sun_management_ThreadImpl_dumpThreads0()"

I think this way we can maintain full binary compatibility while still using the new feature.

What do you think?
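
The three points above could be sketched roughly as follows. This is only an illustration of the pattern (keep the legacy version check, report the new version, add the new entry in a formerly reserved slot, and turn the old entry into a wrapper). It is not the HotSpot code: `InterfaceSketch`, `DumpResult`, `get_interface` and the other names are hypothetical, and the value used for `kVersion1_0` is an assumed placeholder. Only `0x20020000` is taken from the mail.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical stand-ins; none of these are the real jmm.h definitions.
constexpr int kVersion1_0 = 0x20010000;  // assumed placeholder for the legacy handshake constant
constexpr int kVersionNew = 0x20020000;  // new version number quoted in the mail

// Result type kept trivial so the sketch is self-contained.
struct DumpResult {
  int max_depth;  // depth limit that was actually applied
};

// Function table handed out to clients.
struct InterfaceSketch {
  int        (*GetVersion)();
  DumpResult (*DumpThreads)();                       // old signature, unchanged
  DumpResult (*DumpThreadsMaxDepth)(int max_depth);  // new entry in a formerly reserved slot
};

// Reports the new version so that clients can detect the extra entry.
static int get_version() { return kVersionNew; }

// New implementation that honours the depth limit.
static DumpResult dump_threads_max_depth(int max_depth) {
  return DumpResult{max_depth};
}

// The old entry point becomes a thin wrapper asking for an unlimited depth.
static DumpResult dump_threads() {
  return dump_threads_max_depth(INT_MAX);
}

static InterfaceSketch g_interface = {
  get_version, dump_threads, dump_threads_max_depth
};

// Still keyed on the legacy constant, so existing callers keep working.
InterfaceSketch* get_interface(int requested_version) {
  if (requested_version == kVersion1_0) {
    return &g_interface;
  }
  return nullptr;
}
```

In this sketch a client that only knows the legacy constant still obtains the table through `get_interface(kVersion1_0)`, and a newer client can check `GetVersion()` before touching `DumpThreadsMaxDepth`.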
Best regards,
Volker

[1] https://stackoverflow.com/questions/23632653/generate-java-heap-dump-on-uncaught-exception
[2] https://stackoverflow.com/questions/60887816/jvm-options-printnmtstatistics-save-info-to-file
[3] https://sudonull.com/post/25841-JVM-TI-how-to-make-a-plugin-for-a-virtual-machine-Odnoklassniki-company-blog
[4] https://2019.jpoint.ru/talks/2o8scc5jbaurnqqlsydzxv/

> Passes the included (suitably modified) test, as well as the tests in
>
> jdk/test/java/lang/management/ThreadMXBean
> jdk/test/com/sun/management/ThreadMXBean
>
> Thanks,
> Paul

From alexey.menkov at oracle.com Thu Aug 20 18:50:53 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 20 Aug 2020 11:50:53 -0700
Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM
In-Reply-To: <20a48d99-d0c9-be20-640f-0207114e662f@oracle.com>
References: <20a48d99-d0c9-be20-640f-0207114e662f@oracle.com>
Message-ID: <842f818b-6805-78f8-ee0e-41cc60caf67c@oracle.com>

Hi Serguei,

On 08/19/2020 20:34, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> It looks good to me.
>
> But there are more tests in test/hotspot/jtreg/runtime and
> test/hotspot/jtreg/compiler which use the Instrumentation API, and so,
> depend on the JVMTI.
>
> Examples are:
>   test/hotspot/jtreg/compiler/jsr292
>   test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.runtime.test/src/jdk/vm/ci/runtime/test
>   test/hotspot/jtreg/compiler/profiling/spectrapredefineclass
>   test/hotspot/jtreg/runtime/Metaspace/DefineClass.java
>   test/hotspot/jtreg/runtime/cds/appcds/cacheObject
>   test/hotspot/jtreg/runtime/cds/appcds/javaldr
>   test/hotspot/jtreg/runtime/cds/appcds/jvmti
>   test/hotspot/jtreg/runtime/records
>   test/hotspot/jtreg/runtime/sealedClasses
>
> I wonder if you are aware about these tests and have any plan for them.
Looks like you missed this:

>> and updates tests in test/hotspot/jtreg/serviceability/jvmti (the only
>> change in all tests is added "@requires vm.jvmti")
>> Other tests will be updated in the follow-ups.

There are many tests which need to be updated in test/hotspot and test/jdk

--alex

>
> Thanks,
> Serguei
>
>
> On 8/19/20 18:02, Alex Menkov wrote:
>> Hi all,
>>
>> please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8251384
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/
>>
>> The fix introduces new @requires option "vm.jvmti":
>> test/lib/sun/hotspot/WhiteBox.java
>> test/jtreg-ext/requires/VMProps.java
>> src/hotspot/share/prims/whitebox.cpp
>> test/hotspot/jtreg/TEST.ROOT
>>
>> and updates tests in test/hotspot/jtreg/serviceability/jvmti (the only
>> change in all tests is added "@requires vm.jvmti")
>> Other tests will be updated in the follow-ups.
>>
>> The
>

From serguei.spitsyn at oracle.com Thu Aug 20 19:05:24 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 20 Aug 2020 12:05:24 -0700
Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM
In-Reply-To: <842f818b-6805-78f8-ee0e-41cc60caf67c@oracle.com>
References: <20a48d99-d0c9-be20-640f-0207114e662f@oracle.com> <842f818b-6805-78f8-ee0e-41cc60caf67c@oracle.com>
Message-ID:

Hi Alex,

Yes, of course, there are more tests depending on JVMTI in the vmTestbase, jdk_jdi and jdk_instrument.
Just wanted to point out that test/hotspot/jtreg/compiler and test/hotspot/jtreg/runtime also have such tests.
I wonder if there is any enhancement filed to cover all needs.

One question is:
Do we really want to update each nsk test depending on JVMTI, or is there a way to solve it on the test suite level?
Here we may need some recommendation from Igor.
Thanks, Serguei On 8/20/20 11:50, Alex Menkov wrote: > Hi Serguei, > > On 08/19/2020 20:34, serguei.spitsyn at oracle.com wrote: >> Hi Alex, >> >> It looks good to me. >> >> But there are more tests in test/hotspot/jtreg/runtime and >> test/hotspot/jtreg/compiler which use the Instrumentation API, and >> so, depend on the JVMTI. >> >> Examples are: >> ?? test/hotspot/jtreg/compiler/jsr292 >> test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.runtime.test/src/jdk/vm/ci/runtime/test >> >> ?? test/hotspot/jtreg/compiler/profiling/spectrapredefineclass >> ?? test/hotspot/jtreg/runtime/Metaspace/DefineClass.java >> ?? test/hotspot/jtreg/runtime/cds/appcds/cacheObject >> ?? test/hotspot/jtreg/runtime/cds/appcds/javaldr >> ?? test/hotspot/jtreg/runtime/cds/appcds/jvmti >> ?? test/hotspot/jtreg/runtime/records >> ?? test/hotspot/jtreg/runtime/sealedClasses >> >> I wonder if you are aware about these tests and have any plan for them. > > Looks like you missed this: > > >> and updates tests in test/hotspot/jtreg/serviceability/jvmti (the only > >> change in all tests is added "@requires vm.jvmti") > >> Other tests will be updated in the follow-ups. > > The are many tests which needs to be updated in test/hotspot and test/jdk > > --alex > >> >> Thanks, >> Serguei >> >> >> On 8/19/20 18:02, Alex Menkov wrote: >>> Hi all, >>> >>> please review the fix for >>> https://bugs.openjdk.java.net/browse/JDK-8251384 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/ >>> >>> The fix introduces new @requires option "vm.jvmti": >>> test/lib/sun/hotspot/WhiteBox.java >>> test/jtreg-ext/requires/VMProps.java >>> src/hotspot/share/prims/whitebox.cpp >>> test/hotspot/jtreg/TEST.ROOT >>> >>> and updates tests in test/hotspot/jtreg/serviceability/jvmti (the >>> only change in all tests is added "@requires vm.jvmti") >>> Other tests will be updated in the follow-ups. 
>>> >>> The >> From serguei.spitsyn at oracle.com Thu Aug 20 20:06:26 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 20 Aug 2020 13:06:26 -0700 Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument In-Reply-To: References: <8B42FF96-373A-4B03-8B93-D4A47D765F3E@amazon.com> Message-ID: <9c22ade8-67d3-0ddc-6f6f-4bf0108108b4@oracle.com> An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Thu Aug 20 20:54:05 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 20 Aug 2020 13:54:05 -0700 Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM In-Reply-To: <80FFB92E-28EC-45D1-9E1D-9F189550270A@oracle.com> References: <80FFB92E-28EC-45D1-9E1D-9F189550270A@oracle.com> Message-ID: Hi Igor, On 08/20/2020 09:23, Igor Ignatyev wrote: > HI Alex, > > one minor nit: according to usual java coding conventions, isJVMTIIncluded should be spelled as isJvmtiIncluded. otherwise the fix looks good to me. I tried to be consistent with other methods like isCDSIncludedInVmBuild, isJFRIncludedInVmBuild, isGCSupported, isGCSelected, etc. Maybe this should be isJVMTIIncludedInVmBuild.. > >> Other tests will be updated in the follow-ups. > have you already identified all the tests which need this @requires? filed bugs/RFEs for them? Not yet. I had problem with running all hotspot tests with minimal build (for some reason jtreg was not able to complete it), so I decided start from the tests mentioned in the jira issue and then test area-by-area, file and fix the tests in batches. 
--alex

>
> Cheers,
> -- Igor
>
>
>> On Aug 19, 2020, at 6:02 PM, Alex Menkov wrote:
>>
>> Hi all,
>>
>> please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8251384
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/
>>
>> The fix introduces new @requires option "vm.jvmti":
>> test/lib/sun/hotspot/WhiteBox.java
>> test/jtreg-ext/requires/VMProps.java
>> src/hotspot/share/prims/whitebox.cpp
>> test/hotspot/jtreg/TEST.ROOT
>>
>> and updates tests in test/hotspot/jtreg/serviceability/jvmti (the only change in all tests is added "@requires vm.jvmti")
>> Other tests will be updated in the follow-ups.
>>
>> The
>

From linzang at tencent.com Thu Aug 20 23:42:59 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Thu, 20 Aug 2020 23:42:59 +0000
Subject: RFR(s):8252101 Add specification of expected behavior of combining "all" and "live" options of jmap(Internet mail)
In-Reply-To:
References:
Message-ID: <060569EC-202B-4DE8-8549-FF82DCCD9157@tencent.com>

After discussing with Paul, it is not a good idea to combine two fixes together in one webrev. I will handle them separately.
Please help review the updated one. Thanks!

Webrev: http://cr.openjdk.java.net/~lzang/8252101/webrev.01/
CSR: https://bugs.openjdk.java.net/browse/JDK-8252102
Bug: https://bugs.openjdk.java.net/browse/JDK-8252101

BRs,
Lin

On 2020/8/21, 12:17 AM, "linzang(臧琳)" wrote:

    Dear All,
    May I ask your help to review this change:

    Webrev: http://cr.openjdk.java.net/~lzang/8252101/webrev.00/
    CSR: https://bugs.openjdk.java.net/browse/JDK-8252102
    Bug: https://bugs.openjdk.java.net/browse/JDK-8252101

    This change adds the description of expected behavior for jmap -histo/-dump when "all" and "live" are used at the same time.
    With Paul's help, it also includes a refinement of the dump() function in JMap.java, which is based on Paul's change http://cr.openjdk.java.net/~phh/8251835/webrev.00/

    BRs,
    Lin

    On 2020/8/20, 8:18 PM, "linzang(臧琳)" wrote:

        Thanks Paul!
        I have filed CSR and Bug:
        CSR: https://bugs.openjdk.java.net/browse/JDK-8252102
        Bug: https://bugs.openjdk.java.net/browse/JDK-8252101
        Patch is under testing, will create RFR thread when it is ready.
        Thanks!

        Cheers,
        Lin

        On 20/08/2020 04:18, Hohensee, Paul wrote:
        > I prioritize compatibility, so would go with option 2.
        >
        > Thanks,
        > Paul
        >
        > On 8/18/20, 11:17 PM, "serviceability-dev on behalf of linzang(臧琳)" wrote:
        >
        > Dear All,
        > May I get some suggestions, so that I can work out a patch
        > based on that?
        > Or maybe it should not be treated as an issue?
        > BRs,
        > Lin
        >
        > On 17/08/2020 17:17, linzang(臧琳) wrote:
        > > Dear all,
        > > we found that jmap's histo/dump command accepts the "live" and "all" options together, and the specification does not describe the expected behavior in that case.
        > > I have found that when these two options are used together, "live" takes effect, no matter in which order they appear on the command line.
        > > IMO, it is a little confusing to use "live" and "all" together, and if it is allowed, the specification may need to be updated to state the behavior clearly.
        > > Therefore may I ask for your suggestion on which of the following options is preferred:
        > > (option 1.) disallow using these two options together. I think this is clearer, but I am not sure whether there is a backward compatibility risk.
        > > (option 2.) allow the combined use of "live" and "all", and update the specification to clearly describe that "live" takes effect in this case.
        > > What do you think?
        > >
        > > Thanks,
        > > Lin
        > >
        > >
        > >
        >
        >

From linzang at tencent.com Fri Aug 21 00:01:55 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Fri, 21 Aug 2020 00:01:55 +0000
Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly
Message-ID:

Dear All,
Please help review the patch of 8251848, Thanks!
Webrev: http://cr.openjdk.java.net/~lzang/8251848/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8251848

BRs,
Lin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From richard.reingruber at sap.com Fri Aug 21 06:47:18 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 21 Aug 2020 06:47:18 +0000
Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
In-Reply-To: <8d945a55-aebf-a497-e793-2cd58c2f2392@oracle.com>
References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> <8d945a55-aebf-a497-e793-2cd58c2f2392@oracle.com>
Message-ID:

Hi Serguei,

> Sorry for the delay in reply and thank you for the update.
> I like it in general. There are some minor comments though.

Excellent, thanks :)

I've prepared webrev.5.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5.inc/

> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html
> 112 waitTime = (waitTime << 1); // double wait time
> 113 if (waitTime >= M || waitTime < 0) {
> 114 waitTime = 1; // reset when too long
> 115 }
> The M is too big for time.

"waitTime" is roughly the number of cycles spent in a spin wait. M ~= 10^6 cycles does not seem too long. Should I rename the variable to spinWaitCycles or something similar?

> What about something like this:
> waitTime = (waitTime << 1) % 32;
> or
> waitTime = (waitTime << 1) & 32;

I went for

    // Double wait time, but limit to roughly 10^6 cycles.
    waitTime = (waitTime << 1) & (M - 1);
    waitTime = waitTime == 0 ? 1 : waitTime;

Masking the waitTime with % 32 is too small. In my experiments with fastdebug builds I got the crash often with a waitTime of 8K on a Linux server and 256K on my Windows notebook.
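
To make the behaviour of the chosen update rule concrete, here is a small stand-alone sketch of it. It is only an illustration, not the test code itself: the test is written in Java, the helper names `next_wait_time` and `wait_time_sequence` are hypothetical, and `M` is assumed to be `1 << 20` as in the earlier fragments of the test. The mask works as a bound only because `M` is a power of two.

```cpp
#include <vector>

// M mirrors the constant of the test under review (assumed to be 1 << 20, roughly 10^6).
constexpr int M = 1 << 20;

// One update step: double the wait time, wrap it with the mask, never return 0.
int next_wait_time(int wait_time) {
  wait_time = (wait_time << 1) & (M - 1);
  return wait_time == 0 ? 1 : wait_time;
}

// Hypothetical helper: collect `count` successive wait times beginning with `start`.
std::vector<int> wait_time_sequence(int start, int count) {
  std::vector<int> values;
  int w = start;
  for (int i = 0; i < count; i++) {
    values.push_back(w);
    w = next_wait_time(w);
  }
  return values;
}
```

Starting from 1, the values run through the powers of two up to `1 << 19` and then wrap back to 1. For comparison, `(waitTime << 1) & 32` can only ever produce 0 or 32, and `(waitTime << 1) % 32` reaches 0 after 16 and stays there unless it is reset.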
> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.udiff.html > // - Wait for the target thread to either start a new test iteration or to > +// signal shutdown. > A suggestion to replace: "to either start" => "either to start". Ok, done. > +// Shutdown is signalled by setting test_state to ShutDown. The agent reacts > +// to it by changing test_state to Terminated and then it exits. > The second "it" is not needed: "then it exits" => "then exits". Ok, done. > +// ... It sets the shared variable test_state > +// to TargetInNative and then it uses the glws_monitor to send the > The second "it" is not needed. Ok, done. > + monitor_enter(jvmti, env, glws_monitor, AT_LINE); > + monitor_notify(jvmti, env, glws_monitor, AT_LINE); > + monitor_wait(jvmti, env, glws_monitor, AT_LINE); > + monitor_exit(jvmti, env, glws_monitor, AT_LINE); > + monitor_destroy(jvmti, env, glws_monitor, AT_LINE); > There is only one lock. > It'd be more simple to directly use it in the called functions and get rid of the parameter. > Just a suggestion, it is up to you to decide. Ok, done. > http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html > I'd suggest to refactor the lines 213 and 239-257 to a separate function test_GetLocalObject(jvmti, depth). > 240 jobject local_val; > Better to rename it to local_obj or just obj. Ok, done. > There are still problems with the indent. I reformatted the file using 2 space indentation like in other C++ sources. I didn't include the indentation change in the delta webrev. Thanks, Richard. ______________________ From: serguei.spitsyn at oracle.com Sent: Donnerstag, 20. 
August 2020 04:42 To: Reingruber, Richard ; David Holmes ; serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, Sorry for the delay in reply and thank you for the update. I like it in general. There are some minor comments though. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 112 waitTime = (waitTime << 1); // double wait time 113 if (waitTime >= M || waitTime < 0) { 114 waitTime = 1; // reset when too long 115 } The M is too big for time. What about something like this: ? waitTime = (waitTime << 1) % 32; or ? waitTime = (waitTime << 1) & 32; You can choose a better number instead of 32. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.udiff.html // - Wait for the target thread to either start a new test iteration or to +// signal shutdown. A suggestion to replace: "to either start" => "either to start". +// Shutdown is signalled by setting test_state to ShutDown. The agent reacts +// to it by changing test_state to Terminated and then it exits. The second "it" is not needed: "then it exits" => "then exits". +// ... It sets the shared variable test_state +// to TargetInNative and then it uses the glws_monitor to send the The second "it" is not needed. + monitor_enter(jvmti, env, glws_monitor, AT_LINE); + monitor_notify(jvmti, env, glws_monitor, AT_LINE); + monitor_wait(jvmti, env, glws_monitor, AT_LINE); + monitor_exit(jvmti, env, glws_monitor, AT_LINE); + monitor_destroy(jvmti, env, glws_monitor, AT_LINE); There is only one lock. It'd be more simple to directly use it in the called functions and get rid of the parameter. Just a suggestion, it is up to you to decide. 
http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html I'd suggest to refactor the lines 213 and 239-257 to a separate function test_GetLocalObject(jvmti, depth). 240 jobject local_val; Better to rename it to local_obj or just obj. There are still problems with the indent. The indent 4 is mostly used. However there are still fragments with the indent 2: 112 static void monitor_enter(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 113 jvmtiError err; 114 115 err = jvmti->RawMonitorEnter(mon); 116 if (err != JVMTI_ERROR_NONE) { 117 ShowErrorMessage(jvmti, err, loc); 118 env->FatalError(loc); 119 } 120 } 121 122 static void monitor_exit(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 123 jvmtiError err; 124 125 err = jvmti->RawMonitorExit(mon); 126 if (err != JVMTI_ERROR_NONE) { 127 ShowErrorMessage(jvmti, err, loc); 128 env->FatalError(loc); 129 } 130 } 131 132 static void monitor_wait(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 133 jvmtiError err; 134 135 err = jvmti->RawMonitorWait(mon, 0); 136 if (err != JVMTI_ERROR_NONE) { 137 ShowErrorMessage(jvmti, err, loc); 138 env->FatalError(loc); 139 } 140 } 141 142 static void monitor_notify(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 143 jvmtiError err; 144 145 err = jvmti->RawMonitorNotify(mon); 146 if (err != JVMTI_ERROR_NONE) { 147 ShowErrorMessage(jvmti, err, loc); 148 env->FatalError(loc); 149 } 150 } 151 152 static void monitor_destroy(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 153 jvmtiError err; 154 155 err = jvmti->DestroyRawMonitor(mon); 156 if (err != JVMTI_ERROR_NONE) { 157 ShowErrorMessage(jvmti, err, loc); 158 env->FatalError(loc); 159 } ... 160 } 196 while (target_thread == NULL) { 197 monitor_wait(jvmti, env, glws_monitor, AT_LINE); 198 } ... 
220 while (test_state != TargetInNative) { 221 if (test_state == ShutDown) { 222 test_state = Terminated; 223 monitor_notify(jvmti, env, glws_monitor, AT_LINE); 224 monitor_exit(jvmti, env, glws_monitor, AT_LINE); 225 return; 226 } 227 monitor_wait(jvmti, env, glws_monitor, AT_LINE); 228 } ... 263 // Called by target thread after building a large stack. 264 // By calling this native method, the thread's stack becomes walkable. 265 // It notifies the agent to do the GetLocalObject() call and then races 266 // it to make its stack not walkable by returning from the native call. 267 JNIEXPORT void JNICALL 268 Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocal(JNIEnv *env, jclass cls, jint depth, jlong waitCycles) { 269 jvmtiEnv* jvmti = jvmti_global; 270 271 monitor_enter(jvmti, env, glws_monitor, AT_LINE); 272 273 // Set depth_for_get_local and notify agent that the target thread is ready for the GetLocalObject() call 274 depth_for_get_local = depth; 275 test_state = TargetInNative; 276 277 monitor_notify(jvmti, env, glws_monitor, AT_LINE); 278 279 // Wait for agent thread to read depth_for_get_local and do the GetLocalObject() call 280 while (test_state != AgentInGetLocal) { 281 monitor_wait(jvmti, env, glws_monitor, AT_LINE); 282 } 283 284 // Reset state to Initial 285 test_state = Initial; 286 287 monitor_exit(jvmti, env, glws_monitor, AT_LINE); 288 289 // Wait a little until agent thread is in unsafe stack walk. 290 // This needs to be a spin wait or sleep because we cannot get a notification 291 // from there. 292 while (--waitCycles > 0) { 293 dummy_counter++; 294 } 295 } ... 
299 JNIEXPORT void JNICALL 300 Java_GetLocalWithoutSuspendTest_shutDown(JNIEnv *env, jclass cls) { 301 jvmtiEnv* jvmti = jvmti_global; 302 303 monitor_enter(jvmti, env, glws_monitor, AT_LINE); 304 305 // Notify agent thread to shut down 306 test_state = ShutDown; 307 monitor_notify(jvmti, env, glws_monitor, AT_LINE); 308 309 // Wait for agent to terminate 310 while (test_state != Terminated) { 311 monitor_wait(jvmti, env, glws_monitor, AT_LINE); 312 } 313 314 monitor_exit(jvmti, env, glws_monitor, AT_LINE); 315 316 // Destroy glws_monitor 317 monitor_destroy(jvmti, env, glws_monitor, AT_LINE); 318 } Thanks, Serguei On 8/14/20 07:06, Reingruber, Richard wrote: Hi Serguei, thanks for the feedback. I have implemented your suggestions and created a new webrev: Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/ Please find my replies to your comments below. Best regards, Richard. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it ... 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. ... 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. ... 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); Could you, please, re-balance the lines above to make them shorter? Ok, done. 
90     int newTargetDepth = recursiveMethod(0, targetDepth);
91     if (newTargetDepth < targetDepth) {
92         msg("StackOverflowError during test.");
93         msg("Old target depth: " + targetDepth);
94         msg("Retry with new target depth: " + newTargetDepth);
95         targetDepth = newTargetDepth;
96     }

A comment is needed to explain why a StackOverflowError is not desired.
At least, it is not obvious initially.

73 public int waitTimeInNativeAfterNotify;

This name is unreasonably long which makes the code less readable.
I'd suggest to reduce it to waitTime.

Ok, done.

103 notifyAgentToGetLocalAndWaitShortly(-1, 1);
...
119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify);

It is better to provide a short comment before each call explaining what it is doing.
For instance, it is not clear why the call at the line 103 is needed.
Why do we need to notify the agent to GetLocal for the second time?

The test is repeated TEST_ITERATIONS times. In each iteration the agent calls GetLocal racing the target thread returning from the native call. The last call in line 103 is the shutdown signal.

Can it be refactored into a separate native method?

I've made the shutdown process more explicit with the new native method shutDown() which sets test_state to ShutDown.

Then the function name can be reduced to 'notifyAgentToGetLocal'.
This long name does not give enough context anyway.

Ok, done.

85 long iterations = 0;
87 do {
...
97     iterations++;
...
102 } while (iterations < TEST_ITERATIONS);

Why a more explicit 'for' or 'while' loop is not used here? :
for (long iter = 0; iter < TEST_ITERATIONS; iter++) {

I have converted the loop into a for loop.

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html

The indent in this file varies.
It is better to keep it the same: 4 or 2.

Yes, I noticed this.
I have not corrected it yet, because I didn't want to pullute the incremental webrev with that change. Would you like me to fix the indentation now to 2 spaces or do it as a last step? 60 AgentCallingGetLocalObject // The target thread waits for the agent to call I'd suggest to rename the constant to 'AgentInGetLocal'. Ok, done. 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) { It is better rename the function to TestThreadLoop. Would AgentThreadLoop be ok too? You can add a comment before to explain some basic about what it is doing. Ok, done. 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n", It is better to get rid of leading stars in all messages. Ok, done. 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. Ok, done. --- From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com Sent: Freitag, 14. August 2020 10:11 To: Reingruber, Richard mailto:richard.reingruber at sap.com; David Holmes mailto:david.holmes at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it ... 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. ... 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. ... 
89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); Could you, please, re-balance the lines above to make them shorter? ?90 int newTargetDepth = recursiveMethod(0, targetDepth); 91 if (newTargetDepth < targetDepth) { 92 msg("StackOverflowError during test."); 93 msg("Old target depth: " + targetDepth); 94 msg("Retry with new target depth: " + newTargetDepth); 95 targetDepth = newTargetDepth; 96 } A comment is needed to explain why a StackOverflowError is not desired. At least, it is not obvious initially. 73 public int waitTimeInNativeAfterNotify; This name is unreasonably long which makes the code less readable. I'd suggest to reduce it to waitTime. 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); ... 119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); It is better to provide a short comment before each call explaining what it is doing. For instance, it is not clear why the call at the line 103 is needed. Why do we need to notify the agent to GetLocal for the second time? Can it be refactored into a separate native method? Then the the function name can be reduced to 'notifyAgentToGetLocal'. This long name does not give enough context anyway. 85 long iterations = 0; 87 do { ... 97 iterations++; ... 102 } while (iterations < TEST_ITERATIONS); Why a more explicit 'for' or 'while' loop is not used here? : for (long iter = 0; iter < TEST_ITERATIONS; iter++) { ? http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html The indent in this file varies. It is better to keep it the same: 4 or 2. 60 AgentCallingGetLocalObject // The target thread waits for the agent to call I'd suggest to rename the constant to 'AgentInGetLocal'. 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) { It is better rename the function to TestThreadLoop. 
You can add a comment before to explain some basic about what it is doing. 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n", It is better to get rid of leading stars in all messages. 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. I'm still reviewing the test native agent code. Thanks, Serguei On 8/11/20 03:02, Reingruber, Richard wrote: Hi David and Serguei, On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: Hi Richard and David, The implementation looks good to me. But I do not understand what the test is doing with all this counters and recursions. For instance, these fragments: 86 recursions = 0; 87 try { 88 recursiveMethod(1<<20); 89 } catch (StackOverflowError e) { 90 msg("Caught StackOverflowError as expected"); 91 } 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh It is not obvious that the 'recursion' is updated in the recursiveMethod. I would suggestto make it more explicit: recursiveMethod(M); int target_depth = M - 100; Then the variable 'recursions' can be removed or become local. The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so: target_depth = - 100; Possibly recursiveMethod could return the actual recursions instead of using the global variables? I've eliminated the static 'recursions' variable. recursiveMethod() now returns the depth at which the recursion was ended. I hesitated doing this, because I had to handle the StackOverflowError with all those frames still on stack. But the handler is empty, so it should not cause problems. 
This is the new webrev (as posted previously): Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/ Thanks, Richard. -----Original Message----- From: David Holmes mailto:david.holmes at oracle.com Sent: Dienstag, 11. August 2020 04:00 To: mailto:serguei.spitsyn at oracle.com; Reingruber, Richard mailto:richard.reingruber at sap.com; mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Serguei, On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: Hi Richard and David, The implementation looks good to me. But I do not understand what the test is doing with all this counters and recursions. For instance, these fragments: 86 recursions = 0; 87 try { 88 recursiveMethod(1<<20); 89 } catch (StackOverflowError e) { 90 msg("Caught StackOverflowError as expected"); 91 } 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh It is not obvious that the 'recursion' is updated in the recursiveMethod. I would suggestto make it more explicit: ?? recursiveMethod(M); ?? int target_depth = M - 100; Then the variable 'recursions' can be removed or become local. The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so: target_depth = - 100; Possibly recursiveMethod could return the actual recursions instead of using the global variables? David ----- This method will be: 47 private static final int M = 1 << 20; ... 121 public long recursiveMethod(int depth) { 123 if (depth == 0) { 124 notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify); 126 } else { 127 recursiveMethod(--depth); 128 } 129 } At least, he test is missing the comments explaining all these. 
Thanks, Serguei On 8/9/20 22:35, David Holmes wrote: Hi Richard, On 31/07/2020 5:28 pm, Reingruber, Richard wrote: Hi, I rebase the fix after JDK-8250042. New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/ The general fix for this seems good. A minor nit: ?588???? if (!is_assignable(signature, ob_k, Thread::current())) { You know that the current thread is the VMThread so can use VMThread::vm_thread(). Similarly for this existing code: ?694???? Thread* current_thread = Thread::current(); --- Looking at the test code ... I'm less clear on exactly what is happening and the use of spin-waits raises some red-flags for me in terms of test reliability on different platforms. The "while (--waitCycles > 0)" loop in particular offers no certainty that the agent thread is executing anything in particular. And the use of the spin_count as a guide to future waiting time seems somewhat arbitrary. In all seriousness I got a headache trying to work out how the test was expecting to operate. Some parts could be simplified using raw monitors, I think. But there's no sure way to know the agent thread is in the midst of the stackwalk when the target thread wants to leave the native code. So I understand what you are trying to achieve here, I'm just not sure how reliably it will actually achieve it. test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp ?32 static volatile jlong spinn_count???? = 0; Using a 64-bit counter seems like it will be a problem on 32-bit systems. Should be spin_count not spinn_count. ?36 // Agent thread waits for value != 0, then performas the JVMTI call to get local variable. typo: performas Thanks, David ----- Thanks, Richard. -----Original Message----- From: serviceability-dev mailto:serviceability-dev-retn at openjdk.java.net On Behalf Of Reingruber, Richard Sent: Montag, 27. 
Juli 2020 09:45 To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Serguei, ?? > I tested it on Linux and Windows but not yet on MacOS. The test succeeded now on all platforms. Thanks, Richard. -----Original Message----- From: Reingruber, Richard Sent: Freitag, 24. Juli 2020 15:04 To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Serguei, The fix itself looks good to me. thanks for looking at the fix. I still need another look at new test. Could you, please, convert the agent of new test to C++? It will make it a little bit simpler. Sure, here is the new webrev.1 with a C++ version of the test agent: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/ I tested it on Linux and Windows but not yet on MacOS. Thanks, Richard. -----Original Message----- From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com Sent: Freitag, 24. Juli 2020 00:00 To: Reingruber, Richard mailto:richard.reingruber at sap.com; mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, Thank you for filing the CR and taking care about it! The fix itself looks good to me. I still need another look at new test. Could you, please, convert the agent of new test to C++? It will make it a little bit simpler. Thanks, Serguei On 7/20/20 01:15, Reingruber, Richard wrote: Hi, please help review this fix for VM_GetOrSetLocal. It moves the unsafe stackwalk from the vm operation prologue before the safepoint into the doit() method executed at the safepoint. 
Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html Bug: https://bugs.openjdk.java.net/browse/JDK-8249293 According to the JVMTI spec on local variable access it is not required to suspend the target thread T [1]. The operation will simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing bytecodes. It will succeed though if T is blocked because of synchronization or executing some native code. The issue is that in the latter case the stack walk in VM_GetOrSetLocal::doit_prologue() to prepare the access to the local variable is unsafe, because it is done before the safepoint and it races with T returning to execute bytecodes making its stack not walkable. The included test shows that this can crash the VM if T wins the race. Manual testing: ??? - new test test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java ??? - test/hotspot/jtreg/vmTestbase/nsk/jvmti ??? - test/hotspot/jtreg/serviceability/jvmti Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms Thanks, Richard. [1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local From serguei.spitsyn at oracle.com Fri Aug 21 09:21:31 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 21 Aug 2020 02:21:31 -0700 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> <8d945a55-aebf-a497-e793-2cd58c2f2392@oracle.com> Message-ID: An HTML attachment was scrubbed... 
URL:

From richard.reingruber at sap.com Fri Aug 21 16:09:18 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 21 Aug 2020 16:09:18 +0000
Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
In-Reply-To:
References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> <8d945a55-aebf-a497-e793-2cd58c2f2392@oracle.com>
Message-ID:

Hi Serguei,

I have prepared a new webrev based on your suggestions.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.6/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.6.inc/

Thanks, Richard.

______
From: serguei.spitsyn at oracle.com
Sent: Friday, 21 August 2020 11:22
To: Reingruber, Richard; David Holmes; serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Richard,

Thank you for the update, it looks really nice.
Just several more minor comments though (I hope they are the last ones).

> Should I rename the variable to spinWaitCycles or something similar?

Yes, waitCycles would be better and more consistent.

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.udiff.html

81 * The wait time is given in cycles.
82 */
83 public int waitTime;
...
93 waitTime = 1;

This line 82 can be removed if you rename waitTime to waitCycles.
It is better to initialize waitCycles at definition and remove the line 93.

146 public static void msg(String m) {
147 System.out.println("### Java-Test: " + m);
148 }

One of the de-facto standard names for such methods is "log".

80 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call.
89 msg("Test how many frames fit on the stack by performing recursive calls until StackOverflowError is thrown");

Could you, please, rebalance the two long lines above?

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html

There are several spots that can be simplified a little bit:

95 jvmtiError result;
96
97 result = jvmti->GetErrorName(errCode, &errMsg);
==>
jvmtiError result = jvmti->GetErrorName(errCode, &errMsg);

The same is true for the following cases:

115 err = jvmti->RawMonitorEnter(glws_monitor);
125 err = jvmti->RawMonitorExit(glws_monitor);
135 err = jvmti->RawMonitorWait(glws_monitor, 0);
145 err = jvmti->RawMonitorNotify(glws_monitor);
155 err = jvmti->DestroyRawMonitor(glws_monitor);

173 if (errMsg != NULL) {

An extra space before NULL.

89 static jvmtiEnv* jvmti_global = NULL;
276 jvmtiEnv* jvmti = jvmti_global;
308 jvmtiEnv* jvmti = jvmti_global;
330 jvmtiEnv* jvmti = jvmti_global;
...
409 jvmtiEnv* jvmti;
419 res = jvm->GetEnv((void **) &jvmti, JVMTI_VERSION_9);
424 jvmti_global = jvmti;

Normal practice is to name the global "jvmti_global" simply "jvmti". Then there is no need to set it at the start of each function.

Thanks,
Serguei

On 8/20/20 23:47, Reingruber, Richard wrote:

Hi Serguei,

Sorry for the delay in reply and thank you for the update.
I like it in general. There are some minor comments though.

Excellent, thanks :)

I've prepared webrev.5.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5.inc/

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html

112 waitTime = (waitTime << 1); // double wait time
113 if (waitTime >= M || waitTime < 0) {
114 waitTime = 1; // reset when too long
115 }

The M is too big for time.

"waitTime" is roughly the number of cycles spent in a spin wait. M ~= 10^6 cycles does not seem too long. Should I rename the variable to spinWaitCycles or something similar?

What about something like this:
  waitTime = (waitTime << 1) % 32;
or
  waitTime = (waitTime << 1) & 32;

I went for

  // Double wait time, but limit to roughly 10^6 cycles.
  waitTime = (waitTime << 1) & (M - 1);
  waitTime = waitTime == 0 ? 1 : waitTime;

A limit of 32 is too small. In my experiments with fastdebug builds I often got the crash with a waitTime of 8K on a Linux server and 256K on my Windows notebook.

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.udiff.html

// - Wait for the target thread to either start a new test iteration or to
+// signal shutdown.

A suggestion to replace: "to either start" => "either to start".

Ok, done.

+// Shutdown is signalled by setting test_state to ShutDown. The agent reacts
+// to it by changing test_state to Terminated and then it exits.

The second "it" is not needed: "then it exits" => "then exits".

Ok, done.

+// ... It sets the shared variable test_state
+// to TargetInNative and then it uses the glws_monitor to send the

The second "it" is not needed.

Ok, done.

+ monitor_enter(jvmti, env, glws_monitor, AT_LINE);
+ monitor_notify(jvmti, env, glws_monitor, AT_LINE);
+ monitor_wait(jvmti, env, glws_monitor, AT_LINE);
+ monitor_exit(jvmti, env, glws_monitor, AT_LINE);
+ monitor_destroy(jvmti, env, glws_monitor, AT_LINE);

There is only one lock. It would be simpler to use it directly in the called functions and get rid of the parameter. Just a suggestion, it is up to you to decide.

Ok, done.
http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html I'd suggest to refactor the lines 213 and 239-257 to a separate function test_GetLocalObject(jvmti, depth). 240 jobject local_val; Better to rename it to local_obj or just obj. Ok, done. There are still problems with the indent. I reformatted the file using 2 space indentation like in other C++ sources. I didn't include the indentation change in the delta webrev. Thanks, Richard. ______________________ From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com Sent: Donnerstag, 20. August 2020 04:42 To: Reingruber, Richard mailto:richard.reingruber at sap.com; David Holmes mailto:david.holmes at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, Sorry for the delay in reply and thank you for the update. I like it in general. There are some minor comments though. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 112 waitTime = (waitTime << 1); // double wait time 113 if (waitTime >= M || waitTime < 0) { 114 waitTime = 1; // reset when too long 115 } The M is too big for time. What about something like this: ? waitTime = (waitTime << 1) % 32; or ? waitTime = (waitTime << 1) & 32; You can choose a better number instead of 32. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.udiff.html // - Wait for the target thread to either start a new test iteration or to +// signal shutdown. A suggestion to replace: "to either start" => "either to start". +// Shutdown is signalled by setting test_state to ShutDown. 
The agent reacts +// to it by changing test_state to Terminated and then it exits. The second "it" is not needed: "then it exits" => "then exits". +// ... It sets the shared variable test_state +// to TargetInNative and then it uses the glws_monitor to send the The second "it" is not needed. + monitor_enter(jvmti, env, glws_monitor, AT_LINE); + monitor_notify(jvmti, env, glws_monitor, AT_LINE); + monitor_wait(jvmti, env, glws_monitor, AT_LINE); + monitor_exit(jvmti, env, glws_monitor, AT_LINE); + monitor_destroy(jvmti, env, glws_monitor, AT_LINE); There is only one lock. It'd be more simple to directly use it in the called functions and get rid of the parameter. Just a suggestion, it is up to you to decide. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html I'd suggest to refactor the lines 213 and 239-257 to a separate function test_GetLocalObject(jvmti, depth). 240 jobject local_val; Better to rename it to local_obj or just obj. There are still problems with the indent. The indent 4 is mostly used. 
However there are still fragments with the indent 2:

112 static void monitor_enter(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) {
113   jvmtiError err;
114
115   err = jvmti->RawMonitorEnter(mon);
116   if (err != JVMTI_ERROR_NONE) {
117     ShowErrorMessage(jvmti, err, loc);
118     env->FatalError(loc);
119   }
120 }
121
122 static void monitor_exit(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) {
123   jvmtiError err;
124
125   err = jvmti->RawMonitorExit(mon);
126   if (err != JVMTI_ERROR_NONE) {
127     ShowErrorMessage(jvmti, err, loc);
128     env->FatalError(loc);
129   }
130 }
131
132 static void monitor_wait(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) {
133   jvmtiError err;
134
135   err = jvmti->RawMonitorWait(mon, 0);
136   if (err != JVMTI_ERROR_NONE) {
137     ShowErrorMessage(jvmti, err, loc);
138     env->FatalError(loc);
139   }
140 }
141
142 static void monitor_notify(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) {
143   jvmtiError err;
144
145   err = jvmti->RawMonitorNotify(mon);
146   if (err != JVMTI_ERROR_NONE) {
147     ShowErrorMessage(jvmti, err, loc);
148     env->FatalError(loc);
149   }
150 }
151
152 static void monitor_destroy(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) {
153   jvmtiError err;
154
155   err = jvmti->DestroyRawMonitor(mon);
156   if (err != JVMTI_ERROR_NONE) {
157     ShowErrorMessage(jvmti, err, loc);
158     env->FatalError(loc);
159   }
...
160 }

196   while (target_thread == NULL) {
197     monitor_wait(jvmti, env, glws_monitor, AT_LINE);
198   }
...
220   while (test_state != TargetInNative) {
221     if (test_state == ShutDown) {
222       test_state = Terminated;
223       monitor_notify(jvmti, env, glws_monitor, AT_LINE);
224       monitor_exit(jvmti, env, glws_monitor, AT_LINE);
225       return;
226     }
227     monitor_wait(jvmti, env, glws_monitor, AT_LINE);
228   }
...
263 // Called by target thread after building a large stack.
264 // By calling this native method, the thread's stack becomes walkable.
265 // It notifies the agent to do the GetLocalObject() call and then races
266 // it to make its stack not walkable by returning from the native call.
267 JNIEXPORT void JNICALL
268 Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocal(JNIEnv *env, jclass cls, jint depth, jlong waitCycles) {
269   jvmtiEnv* jvmti = jvmti_global;
270
271   monitor_enter(jvmti, env, glws_monitor, AT_LINE);
272
273   // Set depth_for_get_local and notify agent that the target thread is ready for the GetLocalObject() call
274   depth_for_get_local = depth;
275   test_state = TargetInNative;
276
277   monitor_notify(jvmti, env, glws_monitor, AT_LINE);
278
279   // Wait for agent thread to read depth_for_get_local and do the GetLocalObject() call
280   while (test_state != AgentInGetLocal) {
281     monitor_wait(jvmti, env, glws_monitor, AT_LINE);
282   }
283
284   // Reset state to Initial
285   test_state = Initial;
286
287   monitor_exit(jvmti, env, glws_monitor, AT_LINE);
288
289   // Wait a little until agent thread is in unsafe stack walk.
290   // This needs to be a spin wait or sleep because we cannot get a notification
291   // from there.
292   while (--waitCycles > 0) {
293     dummy_counter++;
294   }
295 }
...
299 JNIEXPORT void JNICALL
300 Java_GetLocalWithoutSuspendTest_shutDown(JNIEnv *env, jclass cls) {
301   jvmtiEnv* jvmti = jvmti_global;
302
303   monitor_enter(jvmti, env, glws_monitor, AT_LINE);
304
305   // Notify agent thread to shut down
306   test_state = ShutDown;
307   monitor_notify(jvmti, env, glws_monitor, AT_LINE);
308
309   // Wait for agent to terminate
310   while (test_state != Terminated) {
311     monitor_wait(jvmti, env, glws_monitor, AT_LINE);
312   }
313
314   monitor_exit(jvmti, env, glws_monitor, AT_LINE);
315
316   // Destroy glws_monitor
317   monitor_destroy(jvmti, env, glws_monitor, AT_LINE);
318 }

Thanks,
Serguei

On 8/14/20 07:06, Reingruber, Richard wrote:

Hi Serguei,

thanks for the feedback.
I have implemented your suggestions and created a new webrev: Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/ Please find my replies to your comments below. Best regards, Richard. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it ... 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. ... 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. ... 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); Could you, please, re-balance the lines above to make them shorter? Ok, done. 90 int newTargetDepth = recursiveMethod(0, targetDepth); 91 if (newTargetDepth < targetDepth) { 92 msg("StackOverflowError during test."); 93 msg("Old target depth: " + targetDepth); 94 msg("Retry with new target depth: " + newTargetDepth); 95 targetDepth = newTargetDepth; 96 } A comment is needed to explain why a StackOverflowError is not desired. At least, it is not obvious initially. 73 public int waitTimeInNativeAfterNotify; This name is unreasonably long which makes the code less readable. I'd suggest to reduce it to waitTime. Ok, done. 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); ... 119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); It is better to provide a short comment before each call explaining what it is doing. For instance, it is not clear why the call at the line 103 is needed. 
Why do we need to notify the agent to GetLocal for the second time?

The test is repeated TEST_ITERATIONS times. In each iteration the agent calls GetLocal, racing the target thread returning from the native call. The last call in line 103 is the shutdown signal.

Can it be refactored into a separate native method?

I've made the shutdown process more explicit with the new native method shutDown(), which sets test_state to ShutDown.

Then the function name can be reduced to 'notifyAgentToGetLocal'. This long name does not give enough context anyway.

Ok, done.

85 long iterations = 0;
87 do {
...
97 iterations++;
...
102 } while (iterations < TEST_ITERATIONS);

Why is a more explicit 'for' or 'while' loop not used here? :
for (long iter = 0; iter < TEST_ITERATIONS; iter++) {

I have converted the loop into a for loop.

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html

The indent in this file varies. It is better to keep it the same: 4 or 2.

Yes, I noticed this. I have not corrected it yet, because I didn't want to pollute the incremental webrev with that change. Would you like me to fix the indentation now to 2 spaces or do it as a last step?

60 AgentCallingGetLocalObject // The target thread waits for the agent to call

I'd suggest to rename the constant to 'AgentInGetLocal'.

Ok, done.

150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) {

It is better to rename the function to TestThreadLoop.

Would AgentThreadLoop be ok too?

You can add a comment before it to explain the basics of what it is doing.

Ok, done.

167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n",

It is better to get rid of leading stars in all messages.

Ok, done.
176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. Ok, done. --- From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com Sent: Freitag, 14. August 2020 10:11 To: Reingruber, Richard mailto:richard.reingruber at sap.com; David Holmes mailto:david.holmes at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it ... 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. ... 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. ... 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); Could you, please, re-balance the lines above to make them shorter? ?90 int newTargetDepth = recursiveMethod(0, targetDepth); 91 if (newTargetDepth < targetDepth) { 92 msg("StackOverflowError during test."); 93 msg("Old target depth: " + targetDepth); 94 msg("Retry with new target depth: " + newTargetDepth); 95 targetDepth = newTargetDepth; 96 } A comment is needed to explain why a StackOverflowError is not desired. At least, it is not obvious initially. 73 public int waitTimeInNativeAfterNotify; This name is unreasonably long which makes the code less readable. I'd suggest to reduce it to waitTime. 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); ... 
119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); It is better to provide a short comment before each call explaining what it is doing. For instance, it is not clear why the call at the line 103 is needed. Why do we need to notify the agent to GetLocal for the second time? Can it be refactored into a separate native method? Then the the function name can be reduced to 'notifyAgentToGetLocal'. This long name does not give enough context anyway. 85 long iterations = 0; 87 do { ... 97 iterations++; ... 102 } while (iterations < TEST_ITERATIONS); Why a more explicit 'for' or 'while' loop is not used here? : for (long iter = 0; iter < TEST_ITERATIONS; iter++) { ? http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html The indent in this file varies. It is better to keep it the same: 4 or 2. 60 AgentCallingGetLocalObject // The target thread waits for the agent to call I'd suggest to rename the constant to 'AgentInGetLocal'. 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) { It is better rename the function to TestThreadLoop. You can add a comment before to explain some basic about what it is doing. 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n", It is better to get rid of leading stars in all messages. 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. I'm still reviewing the test native agent code. Thanks, Serguei On 8/11/20 03:02, Reingruber, Richard wrote: Hi David and Serguei, On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: Hi Richard and David, The implementation looks good to me. But I do not understand what the test is doing with all this counters and recursions. 
For instance, these fragments: 86 recursions = 0; 87 try { 88 recursiveMethod(1<<20); 89 } catch (StackOverflowError e) { 90 msg("Caught StackOverflowError as expected"); 91 } 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh It is not obvious that the 'recursion' is updated in the recursiveMethod. I would suggestto make it more explicit: recursiveMethod(M); int target_depth = M - 100; Then the variable 'recursions' can be removed or become local. The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so: target_depth = - 100; Possibly recursiveMethod could return the actual recursions instead of using the global variables? I've eliminated the static 'recursions' variable. recursiveMethod() now returns the depth at which the recursion was ended. I hesitated doing this, because I had to handle the StackOverflowError with all those frames still on stack. But the handler is empty, so it should not cause problems. This is the new webrev (as posted previously): Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/ Thanks, Richard. -----Original Message----- From: David Holmes mailto:david.holmes at oracle.com Sent: Dienstag, 11. August 2020 04:00 To: mailto:serguei.spitsyn at oracle.com; Reingruber, Richard mailto:richard.reingruber at sap.com; mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Serguei, On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: Hi Richard and David, The implementation looks good to me. But I do not understand what the test is doing with all this counters and recursions. 
For instance, these fragments: 86 recursions = 0; 87 try { 88 recursiveMethod(1<<20); 89 } catch (StackOverflowError e) { 90 msg("Caught StackOverflowError as expected"); 91 } 92 int target_depth = recursions-100; // spaces are missed around the '-' sigh It is not obvious that the 'recursion' is updated in the recursiveMethod. I would suggestto make it more explicit: ?? recursiveMethod(M); ?? int target_depth = M - 100; Then the variable 'recursions' can be removed or become local. The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so: target_depth = - 100; Possibly recursiveMethod could return the actual recursions instead of using the global variables? David ----- This method will be: 47 private static final int M = 1 << 20; ... 121 public long recursiveMethod(int depth) { 123 if (depth == 0) { 124 notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify); 126 } else { 127 recursiveMethod(--depth); 128 } 129 } At least, he test is missing the comments explaining all these. Thanks, Serguei On 8/9/20 22:35, David Holmes wrote: Hi Richard, On 31/07/2020 5:28 pm, Reingruber, Richard wrote: Hi, I rebase the fix after JDK-8250042. New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/ The general fix for this seems good. A minor nit: ?588???? if (!is_assignable(signature, ob_k, Thread::current())) { You know that the current thread is the VMThread so can use VMThread::vm_thread(). Similarly for this existing code: ?694???? Thread* current_thread = Thread::current(); --- Looking at the test code ... I'm less clear on exactly what is happening and the use of spin-waits raises some red-flags for me in terms of test reliability on different platforms. The "while (--waitCycles > 0)" loop in particular offers no certainty that the agent thread is executing anything in particular. 
And the use of the spin_count as a guide to future waiting time seems somewhat arbitrary. In all seriousness I got a headache trying to work out how the test was expecting to operate. Some parts could be simplified using raw monitors, I think. But there's no sure way to know the agent thread is in the midst of the stackwalk when the target thread wants to leave the native code. So I understand what you are trying to achieve here, I'm just not sure how reliably it will actually achieve it. test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp ?32 static volatile jlong spinn_count???? = 0; Using a 64-bit counter seems like it will be a problem on 32-bit systems. Should be spin_count not spinn_count. ?36 // Agent thread waits for value != 0, then performas the JVMTI call to get local variable. typo: performas Thanks, David ----- Thanks, Richard. -----Original Message----- From: serviceability-dev mailto:serviceability-dev-retn at openjdk.java.net On Behalf Of Reingruber, Richard Sent: Montag, 27. Juli 2020 09:45 To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Serguei, ?? > I tested it on Linux and Windows but not yet on MacOS. The test succeeded now on all platforms. Thanks, Richard. -----Original Message----- From: Reingruber, Richard Sent: Freitag, 24. Juli 2020 15:04 To: mailto:serguei.spitsyn at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Serguei, The fix itself looks good to me. thanks for looking at the fix. I still need another look at new test. Could you, please, convert the agent of new test to C++? It will make it a little bit simpler. 

Sure, here is the new webrev.1 with a C++ version of the test agent:

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/

I tested it on Linux and Windows but not yet on MacOS.

Thanks, Richard.

-----Original Message-----
From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com
Sent: Friday, 24 July 2020 00:00
To: Reingruber, Richard mailto:richard.reingruber at sap.com; mailto:serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Richard,

Thank you for filing the CR and taking care about it!
The fix itself looks good to me.
I still need another look at new test.
Could you, please, convert the agent of new test to C++?
It will make it a little bit simpler.

Thanks,
Serguei


On 7/20/20 01:15, Reingruber, Richard wrote:

Hi,

please help review this fix for VM_GetOrSetLocal. It moves the unsafe stackwalk from the vm operation prologue before the safepoint into the doit() method executed at the safepoint.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html
Bug:    https://bugs.openjdk.java.net/browse/JDK-8249293

According to the JVMTI spec on local variable access it is not required to suspend the target thread T [1]. The operation will simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing bytecodes. It will succeed though if T is blocked because of synchronization or executing some native code.

The issue is that in the latter case the stack walk in VM_GetOrSetLocal::doit_prologue() to prepare the access to the local variable is unsafe, because it is done before the safepoint and it races with T returning to execute bytecodes making its stack not walkable. The included test shows that this can crash the VM if T wins the race.

Manual testing:

    - new test test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java
    - test/hotspot/jtreg/vmTestbase/nsk/jvmti
- test/hotspot/jtreg/serviceability/jvmti Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms Thanks, Richard. [1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local From serguei.spitsyn at oracle.com Fri Aug 21 16:47:38 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 21 Aug 2020 09:47:38 -0700 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> <8d945a55-aebf-a497-e793-2cd58c2f2392@oracle.com> Message-ID: <3ae72218-7551-125a-5d7b-3e7d83c189a2@oracle.com> An HTML attachment was scrubbed... URL: From volker.simonis at gmail.com Fri Aug 21 17:54:56 2020 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 21 Aug 2020 19:54:56 +0200 Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument In-Reply-To: <9c22ade8-67d3-0ddc-6f6f-4bf0108108b4@oracle.com> References: <8B42FF96-373A-4B03-8B93-D4A47D765F3E@amazon.com> <9c22ade8-67d3-0ddc-6f6f-4bf0108108b4@oracle.com> Message-ID: On Thu, Aug 20, 2020 at 10:06 PM serguei.spitsyn at oracle.com wrote: > > Hi Paul, > > I was also wondering if there is a compatibility risk involved with the JMM_VERSION change. > So, thanks to Volker for asking these questions. > > One more question. > I do not see a backport of the src/jdk.management/share/native/libmanagement_ext/management_ext.c change. > Is it intentional, and if so, what is the reason to skip this file? > "libmanagement_ext/management_ext.c" doesn't exist in jdk8. It was introduced with "8042901: Allow com.sun.management to be in a different module to java.lang.management" in jdk9. 
In jdk8 all the functionality is in "management/management.h" so there's no need to backport the changes from "management_ext.c" . [1] https://bugs.openjdk.java.net/browse/JDK-8042901 > Thanks, > Serguei > > > On 8/20/20 11:30, Volker Simonis wrote: > > On Wed, Aug 19, 2020 at 6:17 PM Hohensee, Paul wrote: > > Please review this backport to jdk8u. I especially need a CSR review, since the CSR approval process can be a bottleneck. The patch significantly reduces fleet profiling overhead, and a version of it has been in production at Amazon for over 3 years. > > > > Original JBS issue: https://bugs.openjdk.java.net/browse/JDK-8185003 > > Original CSR: https://bugs.openjdk.java.net/browse/JDK-8185705 > > Original patch: http://hg.openjdk.java.net/jdk10/master/rev/68d46cb9be45 > > > > Backport JBS issue: https://bugs.openjdk.java.net/browse/JDK-8251494 > > Backport CSR: https://bugs.openjdk.java.net/browse/JDK-8251498 > > Backport JDK webrev: http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.05/ > > JDK part looks good to me. > > Backport Hotspot webrev: http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.05/ > > HotSpot part looks good to me but see discussion below. > > > Details of the interface changes needed for the backport are in the Description of the Backport CSR 8251498. The actual functional changes are minimal and low risk. > > I've also reviewed the CSR yesterday which I think is fine. But now, > when looking at the implementation, I'm a little concerned about > changing JMM_VERSION from " 0x20010203" to "0x20020000" in "jmm.h". 
>
> This might be especially problematic in combination with the changes
> in "Management::get_jmm_interface()" which is called by
> JVM_GetManagement():
>
>   void* Management::get_jmm_interface(int version) {
>   #if INCLUDE_MANAGEMENT
> -   if (version == JMM_VERSION_1_0) {
> +   if (version == JMM_VERSION) {
>       return (void*) &jmm_interface;
>     }
>   #endif // INCLUDE_MANAGEMENT
>     return NULL;
>   }
>
> You've correctly fixed the single caller of "JVM_GetManagement()" in
> the JDK (in "JNI_OnLoad()" in "management.c"):
>
> -    jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION_1_0);
> +    jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION);
>
> but I wonder if there are other monitoring/serviceability tools out
> there which use this interface and which will break after this change.
> A quick search revealed at least two StackOverflow entries which
> recommend using "JVM_GetManagement(JMM_VERSION_1_0)" [1,2] and there's
> a talk and a blog entry doing the same [3, 4].
>
> I'm not sure how relevant this is but I think a much safer and
> backwards-compatible way of doing this downport would be the
> following:
>
> - don't change "Management::get_jmm_interface()" (i.e. still check for
> "JMM_VERSION_1_0") but return the "new" JMM_VERSION in
> "jmm_GetVersion()". This won't break anything but will make it
> possible for clients to detect the new version if they want.
>
> - don't change the signature of "DumpThreads()". Instead add a new
> version (e.g. "DumpThreadsMaxDepth()/jmm_DumpThreadsMaxDepth()") to
> the "JMMInterface" struct and to "jmm_interface" in "management.cpp".
> You can do this in one of the two first, reserved fields of
> "JMMInterface" so you won't break binary compatibility.
> "jmm_DumpThreads()" will then be a simple wrapper which calls
> "jmm_DumpThreadsMaxDepth()" with Integer.MAX_VALUE as depth.
> > - in the jdk you then simply call "DumpThreadsMaxDepth()" in > "Java_sun_management_ThreadImpl_dumpThreads0()" > > I think this way we can maintain full binary compatibility while still > using the new feature. What do you think? > > Best regards, > Volker > > [1] https://stackoverflow.com/questions/23632653/generate-java-heap-dump-on-uncaught-exception > [2] https://stackoverflow.com/questions/60887816/jvm-options-printnmtstatistics-save-info-to-file > [3] https://sudonull.com/post/25841-JVM-TI-how-to-make-a-plugin-for-a-virtual-machine-Odnoklassniki-company-blog > [4] https://2019.jpoint.ru/talks/2o8scc5jbaurnqqlsydzxv/ > > Passes the included (suitably modified) test, as well as the tests in > > > > jdk/test/java/lang/management/ThreadMXBean > > jdk/test/com/sun/management/ThreadMXBean > > > > Thanks, > > Paul > > From serguei.spitsyn at oracle.com Fri Aug 21 18:07:59 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 21 Aug 2020 11:07:59 -0700 Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument In-Reply-To: References: <8B42FF96-373A-4B03-8B93-D4A47D765F3E@amazon.com> <9c22ade8-67d3-0ddc-6f6f-4bf0108108b4@oracle.com> Message-ID: Hi Paul, Thank you for explanation. Thanks, Serguei On 8/21/20 10:54, Volker Simonis wrote: > On Thu, Aug 20, 2020 at 10:06 PM serguei.spitsyn at oracle.com > wrote: >> Hi Paul, >> >> I was also wondering if there is a compatibility risk involved with the JMM_VERSION change. >> So, thanks to Volker for asking these questions. >> >> One more question. >> I do not see a backport of the src/jdk.management/share/native/libmanagement_ext/management_ext.c change. >> Is it intentional, and if so, what is the reason to skip this file? >> > "libmanagement_ext/management_ext.c" doesn't exist in jdk8. It was > introduced with "8042901: Allow com.sun.management to be in a > different module to java.lang.management" in jdk9. 
In jdk8 all the > functionality is in "management/management.h" so there's no need to > backport the changes from "management_ext.c" . > > [1] https://bugs.openjdk.java.net/browse/JDK-8042901 > >> Thanks, >> Serguei >> >> >> On 8/20/20 11:30, Volker Simonis wrote: >> >> On Wed, Aug 19, 2020 at 6:17 PM Hohensee, Paul wrote: >> >> Please review this backport to jdk8u. I especially need a CSR review, since the CSR approval process can be a bottleneck. The patch significantly reduces fleet profiling overhead, and a version of it has been in production at Amazon for over 3 years. >> >> >> >> Original JBS issue: https://bugs.openjdk.java.net/browse/JDK-8185003 >> >> Original CSR: https://bugs.openjdk.java.net/browse/JDK-8185705 >> >> Original patch: http://hg.openjdk.java.net/jdk10/master/rev/68d46cb9be45 >> >> >> >> Backport JBS issue: https://bugs.openjdk.java.net/browse/JDK-8251494 >> >> Backport CSR: https://bugs.openjdk.java.net/browse/JDK-8251498 >> >> Backport JDK webrev: http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.05/ >> >> JDK part looks good to me. >> >> Backport Hotspot webrev: http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.05/ >> >> HotSpot part looks good to me but see discussion below. >> >> >> Details of the interface changes needed for the backport are in the Description of the Backport CSR 8251498. The actual functional changes are minimal and low risk. >> >> I've also reviewed the CSR yesterday which I think is fine. But now, >> when looking at the implementation, I'm a little concerned about >> changing JMM_VERSION from " 0x20010203" to "0x20020000" in "jmm.h". 
>> >> This might be especially problematic in combination with the changes >> in "Management::get_jmm_interface()" which is called by >> JVM_GetManagement(): >> >> void* Management::get_jmm_interface(int version) { >> #if INCLUDE_MANAGEMENT >> - if (version == JMM_VERSION_1_0) { >> + if (version == JMM_VERSION) { >> return (void*) &jmm_interface; >> } >> #endif // INCLUDE_MANAGEMENT >> return NULL; >> } >> >> You've correctly fixed the single caller of "JVM_GetManagement()" in >> the JDK (in "JNI_OnLoad()" in "management.c"): >> >> - jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION_1_0); >> + jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION); >> >> but I wonder if there are other monitoring/serviceability tools out >> there which use this interface and which will break after this change. >> A quick search revealed at least two StackOverflow entries which >> recommend using "JVM_GetManagement(JMM_VERSION_1_0)" [1,2] and there's >> a talk and a blog entry doing the same [3, 4]. >> >> I'm not sure how relevant this is but I think a much safer and >> backwards-compatible way of doing this downport would be the >> following: >> >> - don't change "Management::get_jmm_interface()" (i.e. still check for >> "JMM_VERSION_1_0") but return the "new" JMM_VERSION in >> "jmm_GetVersion()". This won't break anything but will make it >> possible for clients to detect the new version if they want. >> >> - don't change the signature of "DumpThreads()". Instead add a new >> version (e.g. "DumpThreadsMaxDepth()/jmm_DumpThreadsMaxDepth()") to >> the "JMMInterface" struct and to "jmm_interface" in "management.cpp". >> You can do this in one of the two first, reserved fields of >> "JMMInterface" so you won't break binary compatibility. >> "jmm_DumpThreads()" will then be a simple wrapper which calls >> "jmm_DumpThreadsMaxDepth()" with Integer.MAX_VALUE as depth. 
>> >> - in the jdk you then simply call "DumpThreadsMaxDepth()" in >> "Java_sun_management_ThreadImpl_dumpThreads0()" >> >> I think this way we can maintain full binary compatibility while still >> using the new feature. What do you think? >> >> Best regards, >> Volker >> >> [1] https://urldefense.com/v3/__https://stackoverflow.com/questions/23632653/generate-java-heap-dump-on-uncaught-exception__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-KqVsyaF$ >> [2] https://urldefense.com/v3/__https://stackoverflow.com/questions/60887816/jvm-options-printnmtstatistics-save-info-to-file__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-Ip7MAQ5$ >> [3] https://urldefense.com/v3/__https://sudonull.com/post/25841-JVM-TI-how-to-make-a-plugin-for-a-virtual-machine-Odnoklassniki-company-blog__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-ErSjPdD$ >> [4] https://urldefense.com/v3/__https://2019.jpoint.ru/talks/2o8scc5jbaurnqqlsydzxv/__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-Oxb5CQ-$ >> >> Passes the included (suitably modified) test, as well as the tests in >> >> >> >> jdk/test/java/lang/management/ThreadMXBean >> >> jdk/test/com/sun/management/ThreadMXBean >> >> >> >> Thanks, >> >> Paul >> >> From hohensee at amazon.com Fri Aug 21 20:32:51 2020 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 21 Aug 2020 20:32:51 +0000 Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly Message-ID: <19771BAD-6FFC-42A0-8488-1253FB42384C@amazon.com> Lgtm. Paul From: "linzang(??)" Date: Thursday, August 20, 2020 at 5:03 PM To: "serviceability-dev at openjdk.java.net" , "Hohensee, Paul" Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly Dear All, Please help review the patch of 8251848, Thanks! 
Webrev: http://cr.openjdk.java.net/~lzang/8251848/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8251848 BRs, Lin -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Fri Aug 21 20:36:03 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 21 Aug 2020 13:36:03 -0700 Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument In-Reply-To: References: <8B42FF96-373A-4B03-8B93-D4A47D765F3E@amazon.com> <9c22ade8-67d3-0ddc-6f6f-4bf0108108b4@oracle.com> Message-ID: <59b86bfa-ee37-7bee-a144-eb9cf7b6d79e@oracle.com> On 8/21/20 11:07, serguei.spitsyn at oracle.com wrote: > Hi Paul, Sorry, Volker, for using this "indirection". I hope, Paul redirected my "Hi" to you. :) Thanks, Serguei > > Thank you for explanation. > > Thanks, > Serguei > > > On 8/21/20 10:54, Volker Simonis wrote: >> On Thu, Aug 20, 2020 at 10:06 PM serguei.spitsyn at oracle.com >> wrote: >>> Hi Paul, >>> >>> I was also wondering if there is a compatibility risk involved with >>> the JMM_VERSION change. >>> So, thanks to Volker for asking these questions. >>> >>> One more question. >>> I do not see a backport of the >>> src/jdk.management/share/native/libmanagement_ext/management_ext.c >>> change. >>> Is it intentional, and if so, what is the reason to skip this file? >>> >> "libmanagement_ext/management_ext.c" doesn't exist in jdk8. It was >> introduced with "8042901: Allow com.sun.management to be in a >> different module to java.lang.management" in jdk9. In jdk8 all the >> functionality is in "management/management.h" so there's no need to >> backport the changes from "management_ext.c" . >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8042901 >> >>> Thanks, >>> Serguei >>> >>> >>> On 8/20/20 11:30, Volker Simonis wrote: >>> >>> On Wed, Aug 19, 2020 at 6:17 PM Hohensee, Paul >>> wrote: >>> >>> Please review this backport to jdk8u. 
I especially need a CSR >>> review, since the CSR approval process can be a bottleneck. The >>> patch significantly reduces fleet profiling overhead, and a version >>> of it has been in production at Amazon for over 3 years. >>> >>> >>> >>> Original JBS issue: https://bugs.openjdk.java.net/browse/JDK-8185003 >>> >>> Original CSR: https://bugs.openjdk.java.net/browse/JDK-8185705 >>> >>> Original patch: >>> http://hg.openjdk.java.net/jdk10/master/rev/68d46cb9be45 >>> >>> >>> >>> Backport JBS issue: https://bugs.openjdk.java.net/browse/JDK-8251494 >>> >>> Backport CSR: https://bugs.openjdk.java.net/browse/JDK-8251498 >>> >>> Backport JDK webrev: >>> http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.05/ >>> >>> JDK part looks good to me. >>> >>> Backport Hotspot webrev: >>> http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.05/ >>> >>> HotSpot part looks good to me but see discussion below. >>> >>> >>> Details of the interface changes needed for the backport are in the >>> Description of the Backport CSR 8251498. The actual functional >>> changes are minimal and low risk. >>> >>> I've also reviewed the CSR yesterday which I think is fine. But now, >>> when looking at the implementation, I'm a little concerned about >>> changing JMM_VERSION from " 0x20010203" to "0x20020000" in "jmm.h". >>> >>> This might be especially problematic in combination with the changes >>> in "Management::get_jmm_interface()" which is called by >>> JVM_GetManagement(): >>> >>> ? void* Management::get_jmm_interface(int version) { >>> ? #if INCLUDE_MANAGEMENT >>> -? if (version == JMM_VERSION_1_0) { >>> +? if (version == JMM_VERSION) { >>> ????? return (void*) &jmm_interface; >>> ??? } >>> ? #endif // INCLUDE_MANAGEMENT >>> ??? return NULL; >>> ? } >>> >>> You've correctly fixed the single caller of "JVM_GetManagement()" in >>> the JDK (in "JNI_OnLoad()" in "management.c"): >>> >>> -??? jmm_interface = (JmmInterface*) >>> JVM_GetManagement(JMM_VERSION_1_0); >>> +??? 
jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION); >>> >>> but I wonder if there are other monitoring/serviceability tools out >>> there which use this interface and which will break after this change. >>> A quick search revealed at least two StackOverflow entries which >>> recommend using "JVM_GetManagement(JMM_VERSION_1_0)" [1,2] and there's >>> a talk and a blog entry doing the same [3, 4]. >>> >>> I'm not sure how relevant this is but I think a much safer and >>> backwards-compatible way of doing this downport would be the >>> following: >>> >>> - don't change "Management::get_jmm_interface()" (i.e. still check for >>> "JMM_VERSION_1_0") but return the "new" JMM_VERSION in >>> "jmm_GetVersion()". This won't break anything but will make it >>> possible for clients to detect the new version if they want. >>> >>> - don't change the signature of "DumpThreads()". Instead add a new >>> version (e.g. "DumpThreadsMaxDepth()/jmm_DumpThreadsMaxDepth()") to >>> the "JMMInterface" struct and to "jmm_interface" in "management.cpp". >>> You can do this in one of? the two first, reserved fields of >>> "JMMInterface" so you won't break binary compatibility. >>> "jmm_DumpThreads()" will then be a simple wrapper which calls >>> "jmm_DumpThreadsMaxDepth()" with Integer.MAX_VALUE as depth. >>> >>> - in the jdk you then simply call "DumpThreadsMaxDepth()" in >>> "Java_sun_management_ThreadImpl_dumpThreads0()" >>> >>> I think this way we can maintain full binary compatibility while still >>> using the new feature. What do you think? 
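
As an aside for readers of this thread: the wrapper idea in the proposal above (keep the old entry point and let it delegate with Integer.MAX_VALUE as the depth) mirrors how the Java-level API is specified in JDK 10 and later, where dumpAllThreads(lockedMonitors, lockedSynchronizers) is documented as equivalent to passing Integer.MAX_VALUE as maxDepth. The following is only a minimal sketch of that relationship, assuming a JDK 10+ runtime. The class DumpDepthDemo and its methods are hypothetical names chosen for illustration; they are not part of the backport and say nothing about how the native JMM table is laid out.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustrative only: DumpDepthDemo is a hypothetical name, not part of the backport.
final class DumpDepthDemo {

    // Largest number of stack frames found in any ThreadInfo of the dump.
    static int maxFrames(ThreadInfo[] infos) {
        int max = 0;
        for (ThreadInfo info : infos) {
            if (info != null) {
                max = Math.max(max, info.getStackTrace().length);
            }
        }
        return max;
    }

    // Dump all live threads, keeping at most maxDepth frames per thread.
    static ThreadInfo[] dump(int maxDepth) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.dumpAllThreads(false, false, maxDepth);
    }

    // The "old" entry point expressed as a thin wrapper around the depth-limited one.
    static ThreadInfo[] dumpFull() {
        return dump(Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println("frames with maxDepth=1: " + maxFrames(dump(1)));
        System.out.println("frames with full dump:  " + maxFrames(dumpFull()));
    }
}
```

Existing callers of the full dump keep their behavior, while new callers can opt into a bounded depth, which is the compatibility property the proposal is after at the native level.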
>>> >>> Best regards, >>> Volker >>> >>> [1] >>> https://urldefense.com/v3/__https://stackoverflow.com/questions/23632653/generate-java-heap-dump-on-uncaught-exception__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-KqVsyaF$ >>> [2] >>> https://urldefense.com/v3/__https://stackoverflow.com/questions/60887816/jvm-options-printnmtstatistics-save-info-to-file__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-Ip7MAQ5$ >>> [3] >>> https://urldefense.com/v3/__https://sudonull.com/post/25841-JVM-TI-how-to-make-a-plugin-for-a-virtual-machine-Odnoklassniki-company-blog__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-ErSjPdD$ >>> [4] >>> https://urldefense.com/v3/__https://2019.jpoint.ru/talks/2o8scc5jbaurnqqlsydzxv/__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-Oxb5CQ-$ >>> >>> Passes the included (suitably modified) test, as well as the tests in >>> >>> >>> >>> jdk/test/java/lang/management/ThreadMXBean >>> >>> jdk/test/com/sun/management/ThreadMXBean >>> >>> >>> >>> Thanks, >>> >>> Paul >>> >>> > From daniel.daugherty at oracle.com Fri Aug 21 21:01:02 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 21 Aug 2020 17:01:02 -0400 Subject: RFR(s):8252101 Add specification of expected behavior of combining "all" and "live" options of jmap(Internet mail) In-Reply-To: <060569EC-202B-4DE8-8549-FF82DCCD9157@tencent.com> References: <060569EC-202B-4DE8-8549-FF82DCCD9157@tencent.com> Message-ID: On 8/20/20 7:42 PM, linzang(??) wrote: > After discuss with paul, it is not a good idea to combine two fix together in one webrev. I will handle them separately > Please help review the updated one. Thanks! > Webrev: http://cr.openjdk.java.net/~lzang/8252101/webrev.01/ src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java ??? No comments. Thumbs up. 

Dan

> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102
> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101
>
> BRs,
> Lin
>
> On 2020/8/21, 12:17 AM, "linzang(??)" wrote:
>
> Dear All,
> May I ask your help to review this change:
> Webrev: http://cr.openjdk.java.net/~lzang/8252101/webrev.00/
> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102
> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101
>
> This change adds a description of the expected behavior when jmap -histo/-dump is used with "all" and "live" at the same time.
> With Paul's help, it also includes a refinement of the dump() function in JMap.java, which is based on Paul's change http://cr.openjdk.java.net/~phh/8251835/webrev.00/
>
> BRs,
> Lin
>
> On 2020/8/20, 8:18 PM, "linzang(??)" wrote:
>
> Thanks Paul!
> I have filed CSR and Bug:
> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102
> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101
>
> Patch is under testing, will create RFR thread when it is ready.
> Thanks!
>
> Cheers,
> Lin
>
> On 20/08/2020 04:18, Hohensee, Paul wrote:
> > I prioritize compatibility, so would go with option 2.
> >
> > Thanks,
> > Paul
> >
> > On 8/18/20, 11:17 PM, "serviceability-dev on behalf of linzang(??)" wrote:
> >
> > Dear All,
> > May I get some suggestions, so that I can work out a patch
> > based on that?
> > Or maybe it should not be treated as an issue?
> > BRs,
> > Lin
> >
> > On 17/08/2020 17:17, linzang(??) wrote:
> > > Dear all,
> > > we found that jmap's histo/dump command accepts the "live" and "all" options together, and the specification does not describe the expected behavior.
> > > I have verified that when these two options are used together, "live" takes effect, regardless of their order on the command line.
> > > IMO, it is a little confusing to use "live" and "all" together, and if it is allowed, the specification may need to be updated to state the behavior clearly.
> > > Therefore may I ask your suggestion on which option of the following is prefered: > > > (option 1.) disallow using these two options together, I think this is more clear, but I am not sure whether there is backward compatibility risk. > > > (option 2.) allow the combination use of "live" and "all", and update the specification to clearly describe the behavior that "live" takes effect in this case. > > > What do you think? > > > > > > Thanks, > > > Lin > > > > > > > > > > > > > > > > > > From serguei.spitsyn at oracle.com Sat Aug 22 00:43:10 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 21 Aug 2020 17:43:10 -0700 Subject: RFR(s):8252101 Add specification of expected behavior of combining "all" and "live" options of jmap(Internet mail) In-Reply-To: References: <060569EC-202B-4DE8-8549-FF82DCCD9157@tencent.com> Message-ID: <6b6cc568-7ffb-b8aa-f521-b10c20aa8399@oracle.com> Hi Lin, LGTM++ Thanks, Serguei On 8/21/20 14:01, Daniel D. Daugherty wrote: > On 8/20/20 7:42 PM, linzang(??) wrote: >> After discuss with paul, it is not a good idea to combine two fix >> together in one webrev. I will handle them separately >> Please help review the updated one. Thanks! >> ????Webrev: http://cr.openjdk.java.net/~lzang/8252101/webrev.01/ > > src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java > ??? No comments. > > Thumbs up. > > Dan > > >> ???????????? CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >> ???????????? Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >> ? BRs, >> Lin >> >> ?On 2020/8/21, 12:17 AM, "linzang(??)" wrote: >> >> ???? Dear All, >> ???????????? May I ask your help to review this change: >> ???????????? Webrev: >> http://cr.openjdk.java.net/~lzang/8252101/webrev.00/ >> ???????????? CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >> ???????????? Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >> >> ???????????? 
This change adds the description of expected behavior >> for jmap -hiso/-dump to use "all" and "live" at the same time. >> ???????????? With Paul's help, It also includes code refine of the >> dump() function in Jmap.java. which is based on Paul's change >> http://cr.openjdk.java.net/~phh/8251835/webrev.00/ >> >> ???? BRs, >> ???? Lin >> >> ???? On 2020/8/20, 8:18 PM, "linzang(??)" wrote: >> >> ???????? Thanks Paul! >> ???????????? I have filed CSR and Bug: >> ???????????? CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >> ???????????? Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >> >> ???????????? Patch is under testing,? will create? RFR thread when it >> is ready. >> ???????? Thanks! >> >> ???????? Cheers, >> ???????? Lin >> >> ???????? On 20/08/2020 04:18, Hohensee, Paul wrote: >> ???????? > I prioritize compatibility, so would go with option 2. >> ???????? > >> ???????? > Thanks, >> ???????? > Paul >> ???????? > >> ???????? > On 8/18/20, 11:17 PM, "serviceability-dev on behalf of >> linzang(??)" > linzang at tencent.com> wrote: >> ???????? > >> ???????? >???? Dear All, >> ???????? >???????????? May I get some suggestions?? so that I can? >> work out a patch >> ???????? >???? base on that. >> ???????? >???????????? Or may be it should not be treated as an issue? >> ???????? >???? BRs, >> ???????? >???? Lin >> ???????? > >> ???????? >???? On 17/08/2020 17:17, linzang(??) wrote: >> ???????? >???? >? Dear all, >> ???????? >???? >?????????? we found the jmap?s histo/dump command >> could accept "live" and "all" options together, and the specification >> does not describe what is the expected behavior of it. >> ???????? >???? >?????????? I have tried that when these two options >> used together, the "live" takes effect, no matter what sequences are >> they in commandline. >> ???????? >???? >?????????? 
IMO, it is a little confused to use "live" >> and "all" together, and if it is allowed, the specification may need >> to be updated to state the behavior clearly. >> ???????? >???? >?????????? Therefore may I ask your suggestion on >> which option of the following is prefered: >> ???????? >???? >?????????? (option 1.)? disallow using these two >> options together, I think this is more clear, but I am not sure >> whether there is backward compatibility risk. >> ???????? >???? >?????????? (option 2.)? allow the combination use of >> "live" and "all", and update the specification to clearly describe >> the behavior that "live" takes effect in this case. >> ???????? >???? >?????????? What do you think? >> ???????? >???? > >> ???????? >???? > Thanks, >> ???????? >???? > Lin >> ???????? >???? > >> ???????? >???? > >> ???????? >???? > >> ???????? > >> ???????? > >> ???????? > >> >> >> > From goetz.lindenmaier at sap.com Sat Aug 22 05:45:40 2020 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Sat, 22 Aug 2020 05:45:40 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi Richard, I read through your change again. It looks good to me now. The new naming and additional comments make it easier to read I think, thank you. One small thing: deoptimization.cpp, l. 1503 You don't really need the brackets. Two lines below you don't use them either. (No webrev needed) Best regards, Goetz. -----Original Message----- From: Reingruber, Richard Sent: Dienstag, 18. 
August 2020 10:44 To: Lindenmaier, Goetz ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Goetz, I have collected the changes based on your feedback in a new webrev: Webrev.7: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.7/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.7.inc/ Most of the changes are renamings, commenting, and reformatting. Besides that ... - I converted the native agent of the test IterateHeapWithEscapeAnalysisEnabled from C to C++, because this seems to be preferred by serviceability developers. I also re-indented the file, but excluded this from the delta webrev. - I had to adapt test/jdk/com/sun/jdi/EATests.java to the fact that background compilation (-Xbatch) cannot be reliably disabled for JVMCI compilers. E.g. the compile broker will compile in the background if JVMCI is not yet fully initialized. Therefore it is possible that test cases are executed before the main test method is compiled on the highest level and then the test case fails. The higher the system load the higher the probability for this to happen. In webrev.7 I skip the compilation level check if the vm is configured to use the JVMCI compiler. I also answered you inline below. Thanks, Richard. -----Original Message----- From: Lindenmaier, Goetz Sent: Donnerstag, 23. Juli 2020 16:20 To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Richard, Thanks for your two further explanations in the other thread. That made the points clear to me. > > I was not that happy with the names saying not_global_escape > > and similar. 
I now agree you have to use the terms of the escape > > analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with > > the 'not' in the term, I always try to expand the name to some > > sentence with a negated verb, but it makes no sense. > > For example, "has_not_global_escape_in_scope" expands to > > "Hasn't a global escape in its scope." in my thinking, which makes > > no sense. You probably mean > > "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} > > in its scope." > > > C2 is using the word "non" in this context, e.g., here > > alloc->is_non_escaping. > > There is also ConnectionGraph::not_global_escape() That talks about a single node that represents a single Object. An object has a single state wrt. ea. You use the term for safepoint which tracks a set of objects. Here, has_not_global_escape can mean 1. None of the several objects does escape globally. 2. There is at least one object that escapes globally. > > non obviously negates the adjective 'global', > > non-global or nonglobal even is an English term I find in the > > net. > > So what about "has_non_global_escape_in_scope?" > > And what about has_ea_local_in_scope? That's good. Please document somewhere that Ea_local == ArgEscape | NoEscape. That's what it is, right? > > Does jvmti specify that the same limits are used ...? > > ok on your side. > > I don't know and didn't find anything in a quick search. Ok, not your business. > > > jvmtiEnvBase.cpp ok > > jvmtiImpl.h|cpp ok > > jvmtiTagMap.cpp ok > > whitebox.cpp ok > > > deoptimization.cpp > > > line 177: Please break line > > line 246, 281: Please break line > > 1578, 1583, 1589, 1632, 1649, 1651 Break line > > > 1651: You use 'non'-terms, too: non-escaping :) > > I know :) At least here it is wrong I'd say. "...has to be a not escaping obj..." > sounds better > (hopefully not only to my German ears). I thought the term non-escaping makes it quite clear.
I just wanted to point out that using non above would be similar to the wording here. > > IterateHeapWithEscapeAnalysisEnabled.java > > > line 415: > > msg("wait until target thread has set testMethod_result"); > > while (testMethod_result == 0) { > > Thread.sleep(50); > > } > > Might the test run into timeouts at this place? > > The field is volatile, i.e. it will be reloaded > > in each iteration. But will dontinline_testMethod > > write it back to main memory in time? > > You mean, the test could hang in that loop for a couple of minutes? I don't > think so. There are cache coherence protocols in place which will invalidate > stale data very timely. Ok, anyways, it would only be a hanging test. > > Ok. I've removed quite a lot of the occurrances. > > > Also, I like full sentences in comments. > > Especially for me as foreign speaker, this makes > > things much more clear. I.e., I try to make it > > a real sentence with articles, capitalized and a > > dot at the end if there is a subject and a verb > > in first place. > > E.g., jvmtiEnvBase.cpp:1327 > > Are you referring to the following? > (from > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/src/hots > pot/share/prims/jvmtiEnvBase.cpp.frames.html) > > 1326 > 1327 // If the frame is a compiled one, need to deoptimize it. > 1328 if (vf->is_compiled_frame()) { > > This line 1327 is preexisting. Sorry, wrong line number again. I think I meant 1333 // eagerly reallocate scalar replaced objects. But I must admit, the subject is missing. It's one of these imperative sentences where the subject is left out, which are used throughout documentation. Bad example, but still a correct sentence, so qualifies for punctuation? Best regards, Goetz. 
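The polling loop discussed above (a volatile field re-read on every iteration, with a short sleep in between) can be sketched with an explicit deadline, so that a writer that never publishes its result produces a failed wait rather than a hanging test. This is only an illustration under assumptions: the class, the field `result` (standing in for the test's testMethod_result) and the method `awaitResult` are hypothetical names, not code from the webrev.

```java
import java.util.concurrent.TimeUnit;

final class VolatilePollSketch {
    // Hypothetical stand-in for the test's volatile testMethod_result field.
    static volatile int result = 0;

    /**
     * Polls the volatile field until it becomes non-zero or the timeout
     * elapses. Returns true if a non-zero value was observed in time.
     */
    static boolean awaitResult(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (result == 0) {                 // volatile read on every iteration
            if (System.nanoTime() - deadline >= 0) {
                return false;                 // give up instead of hanging
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        Thread writer = new Thread(() -> result = 42);  // volatile write
        writer.start();
        boolean seen = awaitResult(5_000);
        writer.join();
        System.out.println("seen=" + seen + " result=" + result);
    }
}
```

The deadline only bounds the wait; visibility of the write itself relies on the field being volatile, as noted in the discussion.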
From linzang at tencent.com Mon Aug 24 01:58:02 2020 From: linzang at tencent.com (=?iso-2022-jp?B?bGluemFuZygbJEJnSU5WGyhCKQ==?=) Date: Mon, 24 Aug 2020 01:58:02 +0000 Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly(Internet mail) References: <19771BAD-6FFC-42A0-8488-1253FB42384C@amazon.com> Message-ID: Dear Paul, Thanks a lot! may I ask your help to push it if there is no need for more review? Cheers, Lin On 22/08/2020 04:33, Hohensee, Paul wrote: Lgtm. Paul From: "linzang(??)" Date: Thursday, August 20, 2020 at 5:03 PM To: "serviceability-dev at openjdk.java.net" , "Hohensee, Paul" Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly Dear All, Please help review the patch of 8251848, Thanks! Webrev: http://cr.openjdk.java.net/~lzang/8251848/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8251848 BRs, Lin -------------- next part -------------- An HTML attachment was scrubbed... URL: From suenaga at oss.nttdata.com Mon Aug 24 02:40:21 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Mon, 24 Aug 2020 11:40:21 +0900 Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes Message-ID: Hi all, I want to hear your opinions about the change for JDK-8242427. I'm trying to migrate following operations to direct handshake. - VM_UpdateForPopTopFrame - VM_SetFramePop - VM_GetCurrentLocation Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. webrev is here. 
It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ Thanks, Yasumasa From linzang at tencent.com Mon Aug 24 03:28:57 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Mon, 24 Aug 2020 03:28:57 +0000 Subject: RFR(s):8252101 Add specification of expected behavior of combining "all" and "live" options of jmap(Internet mail) References: <060569EC-202B-4DE8-8549-FF82DCCD9157@tencent.com> <6b6cc568-7ffb-b8aa-f521-b10c20aa8399@oracle.com> Message-ID: <6d5e68e97b1b45e8a4894bdfe13d3374@tencent.com> Hi Paul, Serguei and Dan, Thanks for help review it. The CSR is in "Finalized" status, I will wait for it to be approved and then may ask your help to push it. Cheers, Lin On 22/08/2020 08:46, serguei.spitsyn at oracle.com wrote: > Hi Lin, > > LGTM++ > > Thanks, > Serguei > > > On 8/21/20 14:01, Daniel D. Daugherty wrote: >> On 8/20/20 7:42 PM, linzang(??) wrote: >>> After discuss with paul, it is not a good idea to combine two fix >>> together in one webrev. I will handle them separately >>> Please help review the updated one. Thanks! >>> Webrev: http://cr.openjdk.java.net/~lzang/8252101/webrev.01/ >> src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java >> No comments. >> >> Thumbs up. >> >> Dan >> >> >>> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >>> BRs, >>> Lin >>> >>> ?On 2020/8/21, 12:17 AM, "linzang(??)" wrote: >>> >>> Dear All, >>> May I ask your help to review this change: >>> Webrev: >>> http://cr.openjdk.java.net/~lzang/8252101/webrev.00/ >>> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >>> >>> This change adds the description of expected behavior >>> for jmap -hiso/-dump to use "all" and "live" at the same time. >>> With Paul's help, It also includes code refine of the >>> dump() function in Jmap.java. 
which is based on Paul's change >>> http://cr.openjdk.java.net/~phh/8251835/webrev.00/ >>> >>> BRs, >>> Lin >>> >>> On 2020/8/20, 8:18 PM, "linzang(??)" wrote: >>> >>> Thanks Paul! >>> I have filed CSR and Bug: >>> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >>> >>> Patch is under testing, will create RFR thread when it >>> is ready. >>> Thanks! >>> >>> Cheers, >>> Lin >>> >>> On 20/08/2020 04:18, Hohensee, Paul wrote: >>> > I prioritize compatibility, so would go with option 2. >>> > >>> > Thanks, >>> > Paul >>> > >>> > On 8/18/20, 11:17 PM, "serviceability-dev on behalf of >>> linzang(??)" >> linzang at tencent.com> wrote: >>> > >>> > Dear All, >>> > May I get some suggestions? so that I can >>> work out a patch >>> > base on that. >>> > Or may be it should not be treated as an issue? >>> > BRs, >>> > Lin >>> > >>> > On 17/08/2020 17:17, linzang(??) wrote: >>> > > Dear all, >>> > > we found the jmap?s histo/dump command >>> could accept "live" and "all" options together, and the specification >>> does not describe what is the expected behavior of it. >>> > > I have tried that when these two options >>> used together, the "live" takes effect, no matter what sequences are >>> they in commandline. >>> > > IMO, it is a little confused to use "live" >>> and "all" together, and if it is allowed, the specification may need >>> to be updated to state the behavior clearly. >>> > > Therefore may I ask your suggestion on >>> which option of the following is prefered: >>> > > (option 1.) disallow using these two >>> options together, I think this is more clear, but I am not sure >>> whether there is backward compatibility risk. >>> > > (option 2.) allow the combination use of >>> "live" and "all", and update the specification to clearly describe >>> the behavior that "live" takes effect in this case. >>> > > What do you think? 
>>> > > >>> > > Thanks, >>> > > Lin >>> > > >>> > > >>> > > >>> > >>> > >>> > >>> >>> >>> > From jiefu at tencent.com Mon Aug 24 16:21:13 2020 From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=) Date: Mon, 24 Aug 2020 16:21:13 +0000 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: <8cee7938941048f4b007b1663fab4b95@tencent.com> References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> <8cee7938941048f4b007b1663fab4b95@tencent.com> Message-ID: Hi Serguei and Claes, I forget to mention that you can also verify this fix using the following tests: ---------------------------------------------------------- test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java test/jdk/sun/tools/jstatd/TestJstatdPort.java test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java ---------------------------------------------------------- Without the patch, All of them will fail if the hostname starting from digits. We've found that it seems very common that the hostname will start with digits in dockers. So it would be better to fix it. What do you think? Thanks. Best regards, Jie From: "jiefu(??)" Date: Wednesday, August 19, 2020 at 4:05 PM To: "serguei.spitsyn at oracle.com" , "serviceability-dev at openjdk.java.net" , Claes Redestad Subject: Re: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) Hi Serguei, Thanks for your review and help. Please see comments inline. ________________________________ From: serguei.spitsyn at oracle.com Sent: Wednesday, August 19, 2020 4:03 AM To: jiefu(??); serviceability-dev at openjdk.java.net; Claes Redestad Subject: Re: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) 83 *
  • {@code } - transformed into "//localhost"
  • localhost - transformed into "//localhost"
  • hostname - transformed into "//hostname"
  • hostname:port - transformed into "//hostname:port"
  • proto:hostname - transformed into "proto://hostname"
  • proto:hostname:port - transformed into "proto://hostname:port"
  • proto://hostname:port
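As a rough illustration of the transformations listed above, including the hostname-starting-with-a-digit case this thread is about, a minimal Java sketch might look as follows. It is not the HostIdentifier implementation and not the patch in the webrev; the class name and the `canonicalize` method are hypothetical. The digit check relies on the RFC 3986 rule that a URI scheme must begin with a letter, and treating an all-digit suffix after a single colon as a port is an assumption made for this sketch.

```java
final class HostIdCanonicalizerSketch {

    /** Hypothetical helper: maps a short host string to its "//host" URI form. */
    static String canonicalize(String s) {
        if (s == null || s.isEmpty() || s.equals("localhost")) {
            return "//localhost";
        }
        if (s.startsWith("//") || s.contains("://")) {
            return s;                          // already in URI form
        }
        int colon = s.indexOf(':');
        if (colon < 0) {
            return "//" + s;                   // plain hostname
        }
        char first = s.charAt(0);
        boolean startsWithLetter =
                (first >= 'a' && first <= 'z') || (first >= 'A' && first <= 'Z');
        if (!startsWithLetter) {
            // A scheme cannot start with a digit, so this is hostname:port.
            return "//" + s;
        }
        String rest = s.substring(colon + 1);
        if (!rest.isEmpty() && rest.chars().allMatch(Character::isDigit)) {
            return "//" + s;                   // hostname:port (assumed: digits are a port)
        }
        return s.substring(0, colon) + "://" + rest;   // proto:hostname[:port]
    }

    public static void main(String[] args) {
        String[] inputs = {
            "localhost", "hostname", "hostname:1099", "rmi:hostname",
            "rmi:hostname:1099", "rmi://hostname:1099", "1host", "1host:1099"
        };
        for (String in : inputs) {
            System.out.println(in + " -> " + canonicalize(in));
        }
    }
}
```

Checking the first character before looking for a scheme keeps the existing proto:hostname forms unchanged while letting digit-leading hostnames fall through to the hostname:port branch.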
>> Is it worth to add an example to the list above? Yes. It's really helpful for the review process. Thanks. >> I wander if this fix needs a CSR. I don't think so. This is just a bug fix which doesn't add/remove/change any feature of the tools. The original design has claimed to support hostname and hostname:port cases. But it fails to do so when the hostname starts with digits. It seems to be very common that the hostname will be started with digits in dockers. So I think it's worth to fix this bug. >> How did you check this fix does not introduce any regressions? In fact, Claes had helped me to answer this question here: https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-August/032691.html. Also, I've tested this patch on Linux/x64 with tier1 ~ tier3 (no regression). Thanks a lot. Best regards, Jie -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Aug 24 16:47:51 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 24 Aug 2020 09:47:51 -0700 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> Message-ID: Thank you for the comment, Claes! Serguei On 8/18/20 21:51, Claes Redestad wrote: > Hi, > > not sure I do, but a quick read of the relevant RFC suggests that since > a URI scheme (protocol) must start with a letter[1] it seems safe to > assume the string must be of the form hostname or hostname:port if the > first character in the string is a digit. > > /Claes > > [1] https://tools.ietf.org/html/rfc3986#section-3.1 > > On 2020-08-18 22:03, serguei.spitsyn at oracle.com wrote: >> Hi Jie, >> >> I've added Claes to the list as he may have an expertise in this area. >> >> ?? 83? *
>>   • {@code } - transformed into "//localhost"
>>   • localhost - transformed into "//localhost"
>>   • hostname - transformed into "//hostname"
>>   • hostname:port - transformed into "//hostname:port"
>>   • proto:hostname - transformed into "proto://hostname"
>>   • proto:hostname:port - transformed into "proto://hostname:port"
>>   • proto://hostname:port
>> >> Is it worth to add an example to the list above? >> >> I wander if this fix needs a CSR. >> How did you check this fix does not introduce any regressions? >> >> Thanks, >> Serguei >> >> >> On 8/17/20 08:13, jiefu(??) wrote: >>> >>> Ping? >>> >>> Any comments? >>> >>> Thanks. >>> >>> Best regards, >>> >>> Jie >>> >>> *From: *serviceability-dev >>> on behalf of "jiefu(??)" >>> >>> *Date: *Friday, August 7, 2020 at 7:44 AM >>> *To: *"serviceability-dev at openjdk.java.net" >>> >>> *Subject: *Re: RFR: 8251155: HostIdentifier fails to canonicalize >>> hostnames starting with digits(Internet mail) >>> >>> FYI: >>> >>> ? This bug will lead to failures of the following tests on machines >>> with hostname starting from digits. >>> >>> ??? - test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java >>> >>> ??? - test/jdk/sun/tools/jstatd/TestJstatdPort.java >>> >>> ??? - test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java >>> >>> ??? - test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java >>> >>> So it's worth fixing it. >>> >>> Testing: >>> >>> ? - tier1-3 on Linux/x64 >>> >>> Thanks. >>> >>> Best regards, >>> >>> Jie >>> >>> *From: *"jiefu(??)" >>> *Date: *Wednesday, August 5, 2020 at 3:19 PM >>> *To: *"serviceability-dev at openjdk.java.net" >>> >>> *Subject: *RFR: 8251155: HostIdentifier fails to canonicalize >>> hostnames starting with digits >>> >>> Hi all, >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8251155 >>> >>> Webrev: http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/ >>> >>> HostIdentifier fails to canonicalize hostname:port if the hostname >>> starts with digits. >>> >>> The current implementation will get "scheme = hostname". >>> >>> But the scheme should not be started with digits, which leads to >>> this bug. >>> >>> Thanks a lot. 
>>> >>> Best regards, >>> >>> Jie >>> >> From hohensee at amazon.com Mon Aug 24 18:26:30 2020 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 24 Aug 2020 18:26:30 +0000 Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly(Internet mail) Message-ID: <68B4CE59-C101-4489-8EAD-ECB049AC2A45@amazon.com> Needs another review. I can push it for you after that. Paul From: "linzang(??)" Date: Sunday, August 23, 2020 at 6:58 PM To: "Hohensee, Paul" , "serviceability-dev at openjdk.java.net" Subject: RE: [EXTERNAL] RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly(Internet mail) Dear Paul, Thanks a lot! may I ask your help to push it if there is no need for more review? Cheers, Lin On 22/08/2020 04:33, Hohensee, Paul wrote: Lgtm. Paul From: "linzang(??)" Date: Thursday, August 20, 2020 at 5:03 PM To: "serviceability-dev at openjdk.java.net" , "Hohensee, Paul" Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly Dear All, Please help review the patch of 8251848, Thanks! Webrev: http://cr.openjdk.java.net/~lzang/8251848/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8251848 BRs, Lin -------------- next part -------------- An HTML attachment was scrubbed... URL: From hohensee at amazon.com Mon Aug 24 18:30:13 2020 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 24 Aug 2020 18:30:13 +0000 Subject: RFR(s):8252101 Add specification of expected behavior of combining "all" and "live" options of jmap(Internet mail) Message-ID: <9D34568A-AEF4-4DA3-85D1-6362B5EF36F0@amazon.com> The CSR has been approved. I'll push for you as soon as the openjdk servers come back. Thanks, Paul ?On 8/23/20, 8:30 PM, "linzang(??)" wrote: Hi Paul, Serguei and Dan, Thanks for help review it. The CSR is in "Finalized" status, I will wait for it to be approved and then may ask your help to push it. 
Cheers, Lin On 22/08/2020 08:46, serguei.spitsyn at oracle.com wrote: > Hi Lin, > > LGTM++ > > Thanks, > Serguei > > > On 8/21/20 14:01, Daniel D. Daugherty wrote: >> On 8/20/20 7:42 PM, linzang(??) wrote: >>> After discuss with paul, it is not a good idea to combine two fix >>> together in one webrev. I will handle them separately >>> Please help review the updated one. Thanks! >>> Webrev: http://cr.openjdk.java.net/~lzang/8252101/webrev.01/ >> src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java >> No comments. >> >> Thumbs up. >> >> Dan >> >> >>> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >>> BRs, >>> Lin >>> >>> On 2020/8/21, 12:17 AM, "linzang(??)" wrote: >>> >>> Dear All, >>> May I ask your help to review this change: >>> Webrev: >>> http://cr.openjdk.java.net/~lzang/8252101/webrev.00/ >>> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >>> >>> This change adds the description of expected behavior >>> for jmap -hiso/-dump to use "all" and "live" at the same time. >>> With Paul's help, It also includes code refine of the >>> dump() function in Jmap.java. which is based on Paul's change >>> http://cr.openjdk.java.net/~phh/8251835/webrev.00/ >>> >>> BRs, >>> Lin >>> >>> On 2020/8/20, 8:18 PM, "linzang(??)" wrote: >>> >>> Thanks Paul! >>> I have filed CSR and Bug: >>> CSR: https://bugs.openjdk.java.net/browse/JDK-8252102 >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8252101 >>> >>> Patch is under testing, will create RFR thread when it >>> is ready. >>> Thanks! >>> >>> Cheers, >>> Lin >>> >>> On 20/08/2020 04:18, Hohensee, Paul wrote: >>> > I prioritize compatibility, so would go with option 2. >>> > >>> > Thanks, >>> > Paul >>> > >>> > On 8/18/20, 11:17 PM, "serviceability-dev on behalf of >>> linzang(??)" >> linzang at tencent.com> wrote: >>> > >>> > Dear All, >>> > May I get some suggestions? 
so that I can >>> work out a patch >>> > base on that. >>> > Or may be it should not be treated as an issue? >>> > BRs, >>> > Lin >>> > >>> > On 17/08/2020 17:17, linzang(??) wrote: >>> > > Dear all, >>> > > we found the jmap?s histo/dump command >>> could accept "live" and "all" options together, and the specification >>> does not describe what is the expected behavior of it. >>> > > I have tried that when these two options >>> used together, the "live" takes effect, no matter what sequences are >>> they in commandline. >>> > > IMO, it is a little confused to use "live" >>> and "all" together, and if it is allowed, the specification may need >>> to be updated to state the behavior clearly. >>> > > Therefore may I ask your suggestion on >>> which option of the following is prefered: >>> > > (option 1.) disallow using these two >>> options together, I think this is more clear, but I am not sure >>> whether there is backward compatibility risk. >>> > > (option 2.) allow the combination use of >>> "live" and "all", and update the specification to clearly describe >>> the behavior that "live" takes effect in this case. >>> > > What do you think? >>> > > >>> > > Thanks, >>> > > Lin >>> > > >>> > > >>> > > >>> > >>> > >>> > >>> >>> >>> > From serguei.spitsyn at oracle.com Mon Aug 24 18:44:03 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 24 Aug 2020 11:44:03 -0700 Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly(Internet mail) In-Reply-To: <68B4CE59-C101-4489-8EAD-ECB049AC2A45@amazon.com> References: <68B4CE59-C101-4489-8EAD-ECB049AC2A45@amazon.com> Message-ID: An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Mon Aug 24 20:17:27 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 24 Aug 2020 13:17:27 -0700 Subject: RFR(s):8251848: JMap.histo() and JMap.dump() should parse sub-arguments similarly(Internet mail) In-Reply-To: References: <68B4CE59-C101-4489-8EAD-ECB049AC2A45@amazon.com> Message-ID: <56a59a8f-8402-173e-dc84-13878d2c43eb@oracle.com> An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Mon Aug 24 20:50:03 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 24 Aug 2020 13:50:03 -0700 Subject: Ping: RFR: JDK-8234808: jdb quoted option parsing broken In-Reply-To: <89e03cfd-7ae1-1c88-26eb-73b8c6b8d79f@oracle.com> References: <5264fc63-3ad4-ad13-768e-6c9183c52dda@oracle.com> <67794350-6c08-3354-cefe-302b931cf8ce@oracle.com> <0b1a25ba-c044-1f49-ef2e-6b412c2cf601@oracle.com> <89e03cfd-7ae1-1c88-26eb-73b8c6b8d79f@oracle.com> Message-ID: ${subj} Need 2nd reviewer --alex On 08/19/2020 16:46, serguei.spitsyn at oracle.com wrote: > Thank you for the update, Alex! > It looks good. > > Thanks, > Serguei > > On 8/19/20 16:35, Alex Menkov wrote: >> Updated webrev: >> >> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev.02/ >> >> --alex >> >> On 08/19/2020 16:14, serguei.spitsyn at oracle.com wrote: >>> On 8/19/20 15:11, Alex Menkov wrote: >>>> Hi Serguei, >>>> >>>> thank you for the feedback. >>>> >>>> On 08/19/2020 13:58, serguei.spitsyn at oracle.com wrote: >>>>> Hi Alex, >>>>> >>>>> Sorry, I've overlooked this request for review. >>>>> The fix looks good in general. 
>>>>> >>>>> >>>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/src/jdk.jdi/share/classes/com/sun/tools/example/debug/tty/VMConnection.java.frames.html >>>>> >>>>> >>>>> 81 private Map >>>>> parseConnectorArgs(Connector connector, >>>>> 82 String argString, >>>>> 83 String extraOptions) { >>>>> >>>>> To make it more elegant I'd suggest to place the returned type on a >>>>> separate line like below: >>>>> private Map >>>>> parseConnectorArgs(Connector connector, String argString, String >>>>> extraOptions) { >>>> >>>> Do you mean second line indent should be the same as 1st? >>>> or make it 8 spaces: >>>> >>>> private Map >>>> ??????? parseConnectorArgs(Connector connector, String argString, >>>> String extraOptions) { >>> >>> No indent is needed, I think. >>> My suggestion is to use extra line for method return type instead of >>> method arguments. >>> >>> >>>>> >>>>> 127 sb.append(extraOptions).append(" "); >>>>> 128 // set extraOptions to null to not set it again >>>>> 129 extraOptions = null; >>>>> >>>>> What about rewording the comment like below? : >>>>> ??? ? // set extraOptions to null to avoid appending it again >>>> >>>> ok. >>>> >>>>> >>>>> 165 if (extraOptions != null) { >>>>> 166 // there was no "option" specified in argString >>>>> 167 Connector.Argument argument = arguments.get("options"); >>>>> 168 if (argument != null) { >>>>> 169 argument.setValue(extraOptions); >>>>> 170 } >>>>> 171 } >>>>> >>>>> Should the "option" in the comment be replaced with "options"? >>>> >>>> right. >>>> >>>>> What if the argument at line 167 was set to null? >>>>> Will the extraOptions be ignored in such a case? >>>> >>>> extraOptions makes sense only for CommandLineLaunch connector which >>>> launches new VM (and only this connector has "options" argument). >>>> Other connectors (attach or listen) connect to existing VM and >>>> cannot set its options. >>> >>> Okay, thank you for explanation. 
>>> >>> Thanks, >>> Serguei >>> >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/test/jdk/com/sun/jdi/JdbOptions.java.html >>>>> >>>>> >>>>> This line is probably not needed anymore: >>>>> >>>>> ? 157???????????? //jdb.quit(); >>>>> >>>> >>>> will delete. >>>> >>>> --alex >>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> On 8/7/20 15:09, Alex Menkov wrote: >>>>>> Hi all, >>>>>> >>>>>> please review the fix for >>>>>> https://bugs.openjdk.java.net/browse/JDK-8234808 >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk16/jdb_options/webrev/ >>>>>> >>>>>> Some background: >>>>>> when jdb launches debuggee process it passes java options from >>>>>> "options" value for CommandLineLaunch connector and forward >>>>>> options specified before command. >>>>>> >>>>>> The fix solves several discovered issues: >>>>>> - proper handling of java options with spaces >>>>>> - if both way are used to specify java options, forwarded options >>>>>> override options from "options" value >>>>>> >>>>>> VMConnection class implements tricky logic for "options" field >>>>>> parsing for JFR needs (handling of single and double quotes). I >>>>>> decided to keep it as is to avoid massive test failures with JFR >>>>>> (there is no test coverage for this functionality and I'm not sure >>>>>> I understand all requirements). >>>>>> >>>>>> --alex >>>>> >>> > From adityam at microsoft.com Mon Aug 24 23:06:44 2020 From: adityam at microsoft.com (Aditya Mandaleeka) Date: Mon, 24 Aug 2020 23:06:44 +0000 Subject: Protecting references from GC in JDI tests In-Reply-To: <37042231-cdd7-716c-af0a-8b20ef9d6356@oracle.com> References: <37042231-cdd7-716c-af0a-8b20ef9d6356@oracle.com> Message-ID: Thanks for the replies Dan and Roger. > My recommendation would be to only use > ObjectReference.DisableCollection() > when you have observed a specific failure for a specific object in a > test. OK. 
I will submit an RFR soon to guard the specific case that we've observed consistently failing (after ensuring that there aren't other failures that start occurring after that change). Part of me worries that this approach won't prevent future failures in case one of the GCs changes its behavior such that the collections affect other unguarded objects in these tests. -Aditya From serguei.spitsyn at oracle.com Tue Aug 25 00:38:00 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 24 Aug 2020 17:38:00 -0700 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> <8cee7938941048f4b007b1663fab4b95@tencent.com> Message-ID: <8dae62e2-22a6-0442-8c6a-0c666f9b1ad4@oracle.com> An HTML attachment was scrubbed... URL: From jiefu at tencent.com Tue Aug 25 02:12:43 2020 From: jiefu at tencent.com (=?iso-2022-jp?B?amllZnUoGyRCUHxbPxsoQik=?=) Date: Tue, 25 Aug 2020 02:12:43 +0000 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: <8dae62e2-22a6-0442-8c6a-0c666f9b1ad4@oracle.com> References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> <8cee7938941048f4b007b1663fab4b95@tencent.com> , <8dae62e2-22a6-0442-8c6a-0c666f9b1ad4@oracle.com> Message-ID: <09c53cd507e140d4897a5db9b55419d8@tencent.com> Thanks Serguei for your review. Claes, are you okay with this change: http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/ Thanks. Best regards, Jie ________________________________ From: serguei.spitsyn at oracle.com Sent: Tuesday, August 25, 2020 8:38 AM To: jiefu(??); serviceability-dev at openjdk.java.net; Claes Redestad Subject: Re: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) Hi Jie, I'm okay with the fix. Thanks, Serguei On 8/24/20 09:21, jiefu(??) 
wrote: Hi Serguei and Claes, I forget to mention that you can also verify this fix using the following tests: ---------------------------------------------------------- test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java test/jdk/sun/tools/jstatd/TestJstatdPort.java test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java ---------------------------------------------------------- Without the patch, All of them will fail if the hostname starting from digits. We've found that it seems very common that the hostname will start with digits in dockers. So it would be better to fix it. What do you think? Thanks. Best regards, Jie From: "jiefu(??)" Date: Wednesday, August 19, 2020 at 4:05 PM To: "serguei.spitsyn at oracle.com" , "serviceability-dev at openjdk.java.net" , Claes Redestad Subject: Re: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) Hi Serguei, Thanks for your review and help. Please see comments inline. ________________________________ From: serguei.spitsyn at oracle.com Sent: Wednesday, August 19, 2020 4:03 AM To: jiefu(??); serviceability-dev at openjdk.java.net; Claes Redestad Subject: Re: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) 83 *
  • {@code } - transformed into "//localhost"
  • localhost - transformed into "//localhost"
  • hostname - transformed into "//hostname"
  • hostname:port - transformed into "//hostname:port"
  • proto:hostname - transformed into "proto://hostname"
  • proto:hostname:port - transformed into "proto://hostname:port"
  • proto://hostname:port
>> Is it worth to add an example to the list above? Yes. It's really helpful for the review process. Thanks. >> I wander if this fix needs a CSR. I don't think so. This is just a bug fix which doesn't add/remove/change any feature of the tools. The original design has claimed to support hostname and hostname:port cases. But it fails to do so when the hostname starts with digits. It seems to be very common that the hostname will be started with digits in dockers. So I think it's worth to fix this bug. >> How did you check this fix does not introduce any regressions? In fact, Claes had helped me to answer this question here: https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-August/032691.html. Also, I've tested this patch on Linux/x64 with tier1 ~ tier3 (no regression). Thanks a lot. Best regards, Jie -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.reingruber at sap.com Tue Aug 25 07:34:24 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 25 Aug 2020 07:34:24 +0000 Subject: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() In-Reply-To: <3ae72218-7551-125a-5d7b-3e7d83c189a2@oracle.com> References: <54218a0f-5ea7-5a26-af7b-d13829bece14@oracle.com> <037ee9ad-615e-e70b-80f8-21b4fdc80f53@oracle.com> <0a112ad5-0771-9a83-32d8-08c87bed9b40@oracle.com> <8d945a55-aebf-a497-e793-2cd58c2f2392@oracle.com> <3ae72218-7551-125a-5d7b-3e7d83c189a2@oracle.com> Message-ID: Hi Serguei, thanks for your review. > These two lines can be removed: > 391 jvmtiEnv* jvmti; > 406 ::jvmti = jvmti; > No need in another webrev if you fix it. I've made this change and fixed the copyright year in jvmtiImpl.hpp. Unfortunately I got no results from the submit repo [1]. I've pushed again to submit. Will push to the jdk master repo when this job reports success. Thanks, Richard. 
[1] https://mail.openjdk.java.net/pipermail/jdk-submit-changes/2020-August/012091.html From: serguei.spitsyn at oracle.com Sent: Freitag, 21. August 2020 18:48 To: Reingruber, Richard ; David Holmes ; serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, It looks good, thank you for the update! These two lines can be removed: 391 jvmtiEnv* jvmti; 406 ::jvmti = jvmti; No need in another webrev if you fix it. Thanks, Serguei On 8/21/20 09:09, Reingruber, Richard wrote: Hi Serguei, I have prepared a new webrev based on your suggestions. Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.6/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.6.inc/ Thanks, Richard. ______ From: serguei.spitsyn at oracle.com Sent: Freitag, 21. August 2020 11:22 To: Reingruber, Richard ; David Holmes ; serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, Thank you for the update, it looks really nice. Just several more minor comments though (I hope, the last ones). Should I rename the variable to spinWaitCycles or something similar? Yes, waitCycles would be better and more consistent. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.udiff.html 81 * The wait time is given in cycles. 82 */ 83 public int waitTime; ... 93 waitTime = 1; This line 82 can be removed if you rename waitTime to waitCycles. It is better to initialize waitCycles at definition and remove the line 93. 146 public static void msg(String m) { 147 System.out.println("### Java-Test: " + m); 148 } One of the de-facto standard names for such methods is "log". 80 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. 
89 msg("Test how many frames fit on the stack by performing recursive calls until StackOverflowError is thrown");

Could you, please, rebalance the two long lines above?

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html

There are several spots that can be simplified a little bit:

95 jvmtiError result;
96
97 result = jvmti->GetErrorName(errCode, &errMsg);
==>
jvmtiError result = jvmti->GetErrorName(errCode, &errMsg);

The same is true for the following cases:

115 err = jvmti->RawMonitorEnter(glws_monitor);
125 err = jvmti->RawMonitorExit(glws_monitor);
135 err = jvmti->RawMonitorWait(glws_monitor, 0);
145 err = jvmti->RawMonitorNotify(glws_monitor);
155 err = jvmti->DestroyRawMonitor(glws_monitor);

173 if (errMsg != NULL) {

An extra space before NULL.

89 static jvmtiEnv* jvmti_global = NULL;
276 jvmtiEnv* jvmti = jvmti_global;
308 jvmtiEnv* jvmti = jvmti_global;
330 jvmtiEnv* jvmti = jvmti_global;
...
409 jvmtiEnv* jvmti;
419 res = jvm->GetEnv((void **) &jvmti, JVMTI_VERSION_9);
424 jvmti_global = jvmti;

Normal practice is to name the global "jvmti_global" simply "jvmti". Then there is no need to set it at the start of each function.

Thanks, Serguei

On 8/20/20 23:47, Reingruber, Richard wrote:

Hi Serguei,

Sorry for the delayed reply and thank you for the update. I like it in general. There are some minor comments though.

Excellent, thanks :) I've prepared webrev.5.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.5.inc/

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html

112 waitTime = (waitTime << 1); // double wait time
113 if (waitTime >= M || waitTime < 0) {
114 waitTime = 1; // reset when too long
115 }

The M is too big for time.
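The wait-time update being discussed here can be illustrated with a small Java sketch. It is only an approximation for readers of the thread: it assumes M = 1 << 20 (the value quoted from the test source elsewhere in this thread) and uses the masked doubling that the reply following this fragment settles on. The names WaitCyclesSketch, nextWaitCycles and moduloVariant are hypothetical and do not appear in the webrev.

```java
// Illustrative sketch only; not the code from the webrev.
final class WaitCyclesSketch {
    // Upper bound for the spin-wait length, roughly 10^6 cycles (assumed: M = 1 << 20).
    static final int M = 1 << 20;

    // Double the wait time. The mask wraps the value to 0 once it reaches M,
    // and 0 is then replaced by 1 so that the sequence restarts.
    static int nextWaitCycles(int waitCycles) {
        waitCycles = (waitCycles << 1) & (M - 1);
        return waitCycles == 0 ? 1 : waitCycles;
    }

    // The suggested "% 32" alternative: once the value becomes 0 it stays 0,
    // because 0 << 1 is still 0.
    static int moduloVariant(int waitCycles) {
        return (waitCycles << 1) % 32;
    }

    public static void main(String[] args) {
        int w = 1;
        for (int i = 0; i < 22; i++) {
            System.out.print(w + " ");
            w = nextWaitCycles(w);
        }
        System.out.println();
    }
}
```

With a power-of-two bound, the mask keeps the value below M without a branch, and the explicit reset to 1 avoids getting stuck at zero, which is what happens with a plain modulo or mask and no reset.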
"waitTime" is roughly the number of cycles spent in a spin wait. M ~= 10^6 cycles does not seem too long. Should I rename the variable to spinWaitCycles or something similar? What about something like this: waitTime = (waitTime << 1) % 32; or waitTime = (waitTime << 1) & 32; I went for // Double wait time, but limit to roughly 10^6 cycles. waitTime = (waitTime << 1) & (M - 1); waitTime = waitTime == 0 ? 1 : waitTime; Masking the waitTime with % 32 is too small. In my experiments with fastdebug builds I got the crash often with a waitTime of 8K on a Linux server and 256K on my Windows notebook. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.udiff.html // - Wait for the target thread to either start a new test iteration or to +// signal shutdown. A suggestion to replace: "to either start" => "either to start". Ok, done. +// Shutdown is signalled by setting test_state to ShutDown. The agent reacts +// to it by changing test_state to Terminated and then it exits. The second "it" is not needed: "then it exits" => "then exits". Ok, done. +// ... It sets the shared variable test_state +// to TargetInNative and then it uses the glws_monitor to send the The second "it" is not needed. Ok, done. + monitor_enter(jvmti, env, glws_monitor, AT_LINE); + monitor_notify(jvmti, env, glws_monitor, AT_LINE); + monitor_wait(jvmti, env, glws_monitor, AT_LINE); + monitor_exit(jvmti, env, glws_monitor, AT_LINE); + monitor_destroy(jvmti, env, glws_monitor, AT_LINE); There is only one lock. It'd be more simple to directly use it in the called functions and get rid of the parameter. Just a suggestion, it is up to you to decide. Ok, done. 
http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html I'd suggest to refactor the lines 213 and 239-257 to a separate function test_GetLocalObject(jvmti, depth). 240 jobject local_val; Better to rename it to local_obj or just obj. Ok, done. There are still problems with the indent. I reformatted the file using 2 space indentation like in other C++ sources. I didn't include the indentation change in the delta webrev. Thanks, Richard. ______________________ From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com Sent: Donnerstag, 20. August 2020 04:42 To: Reingruber, Richard mailto:richard.reingruber at sap.com; David Holmes mailto:david.holmes at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, Sorry for the delay in reply and thank you for the update. I like it in general. There are some minor comments though. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 112 waitTime = (waitTime << 1); // double wait time 113 if (waitTime >= M || waitTime < 0) { 114 waitTime = 1; // reset when too long 115 } The M is too big for time. What about something like this: waitTime = (waitTime << 1) % 32; or waitTime = (waitTime << 1) & 32; You can choose a better number instead of 32. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.udiff.html // - Wait for the target thread to either start a new test iteration or to +// signal shutdown. A suggestion to replace: "to either start" => "either to start". +// Shutdown is signalled by setting test_state to ShutDown. 
The agent reacts +// to it by changing test_state to Terminated and then it exits. The second "it" is not needed: "then it exits" => "then exits". +// ... It sets the shared variable test_state +// to TargetInNative and then it uses the glws_monitor to send the The second "it" is not needed. + monitor_enter(jvmti, env, glws_monitor, AT_LINE); + monitor_notify(jvmti, env, glws_monitor, AT_LINE); + monitor_wait(jvmti, env, glws_monitor, AT_LINE); + monitor_exit(jvmti, env, glws_monitor, AT_LINE); + monitor_destroy(jvmti, env, glws_monitor, AT_LINE); There is only one lock. It'd be more simple to directly use it in the called functions and get rid of the parameter. Just a suggestion, it is up to you to decide. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html I'd suggest to refactor the lines 213 and 239-257 to a separate function test_GetLocalObject(jvmti, depth). 240 jobject local_val; Better to rename it to local_obj or just obj. There are still problems with the indent. The indent 4 is mostly used. 
However there are still fragments with the indent 2: 112 static void monitor_enter(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 113 jvmtiError err; 114 115 err = jvmti->RawMonitorEnter(mon); 116 if (err != JVMTI_ERROR_NONE) { 117 ShowErrorMessage(jvmti, err, loc); 118 env->FatalError(loc); 119 } 120 } 121 122 static void monitor_exit(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 123 jvmtiError err; 124 125 err = jvmti->RawMonitorExit(mon); 126 if (err != JVMTI_ERROR_NONE) { 127 ShowErrorMessage(jvmti, err, loc); 128 env->FatalError(loc); 129 } 130 } 131 132 static void monitor_wait(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 133 jvmtiError err; 134 135 err = jvmti->RawMonitorWait(mon, 0); 136 if (err != JVMTI_ERROR_NONE) { 137 ShowErrorMessage(jvmti, err, loc); 138 env->FatalError(loc); 139 } 140 } 141 142 static void monitor_notify(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 143 jvmtiError err; 144 145 err = jvmti->RawMonitorNotify(mon); 146 if (err != JVMTI_ERROR_NONE) { 147 ShowErrorMessage(jvmti, err, loc); 148 env->FatalError(loc); 149 } 150 } 151 152 static void monitor_destroy(jvmtiEnv* jvmti, JNIEnv* env, jrawMonitorID mon, const char* loc) { 153 jvmtiError err; 154 155 err = jvmti->DestroyRawMonitor(mon); 156 if (err != JVMTI_ERROR_NONE) { 157 ShowErrorMessage(jvmti, err, loc); 158 env->FatalError(loc); 159 } ... 160 } 196 while (target_thread == NULL) { 197 monitor_wait(jvmti, env, glws_monitor, AT_LINE); 198 } ... 220 while (test_state != TargetInNative) { 221 if (test_state == ShutDown) { 222 test_state = Terminated; 223 monitor_notify(jvmti, env, glws_monitor, AT_LINE); 224 monitor_exit(jvmti, env, glws_monitor, AT_LINE); 225 return; 226 } 227 monitor_wait(jvmti, env, glws_monitor, AT_LINE); 228 } ... 263 // Called by target thread after building a large stack. 264 // By calling this native method, the thread's stack becomes walkable. 
265 // It notifies the agent to do the GetLocalObject() call and then races 266 // it to make its stack not walkable by returning from the native call. 267 JNIEXPORT void JNICALL 268 Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocal(JNIEnv *env, jclass cls, jint depth, jlong waitCycles) { 269 jvmtiEnv* jvmti = jvmti_global; 270 271 monitor_enter(jvmti, env, glws_monitor, AT_LINE); 272 273 // Set depth_for_get_local and notify agent that the target thread is ready for the GetLocalObject() call 274 depth_for_get_local = depth; 275 test_state = TargetInNative; 276 277 monitor_notify(jvmti, env, glws_monitor, AT_LINE); 278 279 // Wait for agent thread to read depth_for_get_local and do the GetLocalObject() call 280 while (test_state != AgentInGetLocal) { 281 monitor_wait(jvmti, env, glws_monitor, AT_LINE); 282 } 283 284 // Reset state to Initial 285 test_state = Initial; 286 287 monitor_exit(jvmti, env, glws_monitor, AT_LINE); 288 289 // Wait a little until agent thread is in unsafe stack walk. 290 // This needs to be a spin wait or sleep because we cannot get a notification 291 // from there. 292 while (--waitCycles > 0) { 293 dummy_counter++; 294 } 295 } ... 299 JNIEXPORT void JNICALL 300 Java_GetLocalWithoutSuspendTest_shutDown(JNIEnv *env, jclass cls) { 301 jvmtiEnv* jvmti = jvmti_global; 302 303 monitor_enter(jvmti, env, glws_monitor, AT_LINE); 304 305 // Notify agent thread to shut down 306 test_state = ShutDown; 307 monitor_notify(jvmti, env, glws_monitor, AT_LINE); 308 309 // Wait for agent to terminate 310 while (test_state != Terminated) { 311 monitor_wait(jvmti, env, glws_monitor, AT_LINE); 312 } 313 314 monitor_exit(jvmti, env, glws_monitor, AT_LINE); 315 316 // Destroy glws_monitor 317 monitor_destroy(jvmti, env, glws_monitor, AT_LINE); 318 } Thanks, Serguei On 8/14/20 07:06, Reingruber, Richard wrote: Hi Serguei, thanks for the feedback. 
I have implemented your suggestions and created a new webrev: Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.4.inc/ Please find my replies to your comments below. Best regards, Richard. http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it ... 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. ... 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. ... 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); Could you, please, re-balance the lines above to make them shorter? Ok, done. 90 int newTargetDepth = recursiveMethod(0, targetDepth); 91 if (newTargetDepth < targetDepth) { 92 msg("StackOverflowError during test."); 93 msg("Old target depth: " + targetDepth); 94 msg("Retry with new target depth: " + newTargetDepth); 95 targetDepth = newTargetDepth; 96 } A comment is needed to explain why a StackOverflowError is not desired. At least, it is not obvious initially. 73 public int waitTimeInNativeAfterNotify; This name is unreasonably long which makes the code less readable. I'd suggest to reduce it to waitTime. Ok, done. 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); ... 119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); It is better to provide a short comment before each call explaining what it is doing. For instance, it is not clear why the call at the line 103 is needed. 
Why do we need to notify the agent to GetLocal for the second time?

The test is repeated TEST_ITERATIONS times. In each iteration the agent calls GetLocal, racing the target thread returning from the native call. The last call in line 103 is the shutdown signal.

Can it be refactored into a separate native method?

I've made the shutdown process more explicit with the new native method shutDown(), which sets test_state to ShutDown.

Then the function name can be reduced to 'notifyAgentToGetLocal'. This long name does not give enough context anyway.

Ok, done.

85 long iterations = 0;
87 do {
...
97 iterations++;
...
102 } while (iterations < TEST_ITERATIONS);

Why is a more explicit 'for' or 'while' loop not used here? :
for (long iter = 0; iter < TEST_ITERATIONS; iter++) {

I have converted the loop into a for loop.

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html

The indent in this file varies. It is better to keep it the same: 4 or 2.

Yes, I noticed this. I have not corrected it yet, because I didn't want to pollute the incremental webrev with that change. Would you like me to fix the indentation now to 2 spaces or do it as a last step?

60 AgentCallingGetLocalObject // The target thread waits for the agent to call

I'd suggest to rename the constant to 'AgentInGetLocal'.

Ok, done.

150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) {

It is better to rename the function to TestThreadLoop.

Would AgentThreadLoop be ok too?

You can add a comment before it to explain the basics of what it is doing.

Ok, done.

167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n",

It is better to get rid of leading stars in all messages.

Ok, done.
176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. Ok, done. --- From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com Sent: Freitag, 14. August 2020 10:11 To: Reingruber, Richard mailto:richard.reingruber at sap.com; David Holmes mailto:david.holmes at oracle.com; mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue() Hi Richard, http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java.frames.html 33 * the stack walk. The target thread's stack is walkable while in native. After sending the notification it ... 54 * @param depth Depth of target frame for GetLocalObject() call. Should be large value to prolong the unsafe stack walk. 55 * @param waitTimeInNativeAfterNotify Time to wait after notify with walkable stack before returning an becoming unsafe again. ... 71 * Wait time in native, i.e. with walkable stack, after notifying agent thread to do GetLocalObject() call. ... 89 msg((now -start) + " ms Iteration : " + iterations + " waitTimeInNativeAfterNotify : " + waitTimeInNativeAfterNotify); Could you, please, re-balance the lines above to make them shorter? 90 int newTargetDepth = recursiveMethod(0, targetDepth); 91 if (newTargetDepth < targetDepth) { 92 msg("StackOverflowError during test."); 93 msg("Old target depth: " + targetDepth); 94 msg("Retry with new target depth: " + newTargetDepth); 95 targetDepth = newTargetDepth; 96 } A comment is needed to explain why a StackOverflowError is not desired. At least, it is not obvious initially. 73 public int waitTimeInNativeAfterNotify; This name is unreasonably long which makes the code less readable. I'd suggest to reduce it to waitTime. 103 notifyAgentToGetLocalAndWaitShortly(-1, 1); ... 
119 notifyAgentToGetLocalAndWaitShortly(depth - 100, waitTimeInNativeAfterNotify); It is better to provide a short comment before each call explaining what it is doing. For instance, it is not clear why the call at the line 103 is needed. Why do we need to notify the agent to GetLocal for the second time? Can it be refactored into a separate native method? Then the the function name can be reduced to 'notifyAgentToGetLocal'. This long name does not give enough context anyway. 85 long iterations = 0; 87 do { ... 97 iterations++; ... 102 } while (iterations < TEST_ITERATIONS); Why a more explicit 'for' or 'while' loop is not used here? : for (long iter = 0; iter < TEST_ITERATIONS; iter++) { http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp.frames.html The indent in this file varies. It is better to keep it the same: 4 or 2. 60 AgentCallingGetLocalObject // The target thread waits for the agent to call I'd suggest to rename the constant to 'AgentInGetLocal'. 150 GetLocalWithoutSuspendTestThreadLoop(jvmtiEnv * jvmti, JNIEnv* env, void * arg) { It is better rename the function to TestThreadLoop. You can add a comment before to explain some basic about what it is doing. 167 printf("*** AGENT: GetLocalWithoutSuspendTestThreadLoop thread started. Polling thread '%s' for local variables\n", It is better to get rid of leading stars in all messages. 176 // the native method Java_GetLocalWithoutSuspendTest_notifyAgentToGetLocalAndWaitShortly The part 'Java_GetLocalWithoutSuspendTest_' can be removed from the function name. I'm still reviewing the test native agent code. Thanks, Serguei On 8/11/20 03:02, Reingruber, Richard wrote: Hi David and Serguei, On 11/08/2020 3:21 am, mailto:serguei.spitsyn at oracle.com wrote: Hi Richard and David, The implementation looks good to me. But I do not understand what the test is doing with all this counters and recursions. 
For instance, these fragments:

86 recursions = 0;
87 try {
88 recursiveMethod(1<<20);
89 } catch (StackOverflowError e) {
90 msg("Caught StackOverflowError as expected");
91 }
92 int target_depth = recursions-100; // spaces are missing around the '-' sign

It is not obvious that 'recursions' is updated in recursiveMethod. I would suggest making it more explicit:

recursiveMethod(M);
int target_depth = M - 100;

Then the variable 'recursions' can be removed or become local.

The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so:

target_depth = - 100;

Possibly recursiveMethod could return the actual recursions instead of using the global variables?

I've eliminated the static 'recursions' variable. recursiveMethod() now returns the depth at which the recursion ended. I hesitated to do this, because I had to handle the StackOverflowError with all those frames still on the stack. But the handler is empty, so it should not cause problems.

This is the new webrev (as posted previously):

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.3.inc/

Thanks, Richard.

-----Original Message-----
From: David Holmes david.holmes at oracle.com
Sent: Tuesday, 11 August 2020 04:00
To: serguei.spitsyn at oracle.com; Reingruber, Richard richard.reingruber at sap.com; serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Serguei,

On 11/08/2020 3:21 am, serguei.spitsyn at oracle.com wrote:

Hi Richard and David,

The implementation looks good to me. But I do not understand what the test is doing with all these counters and recursions.
For instance, these fragments:

86 recursions = 0;
87 try {
88 recursiveMethod(1<<20);
89 } catch (StackOverflowError e) {
90 msg("Caught StackOverflowError as expected");
91 }
92 int target_depth = recursions-100; // spaces are missing around the '-' sign

It is not obvious that 'recursions' is updated in recursiveMethod. I would suggest making it more explicit:

recursiveMethod(M);
int target_depth = M - 100;

Then the variable 'recursions' can be removed or become local.

The recursiveMethod takes in the maximum recursions to try and updates the recursions variable to record how many recursions were possible - so:

target_depth = - 100;

Possibly recursiveMethod could return the actual recursions instead of using the global variables?

David
-----

This method will be:

47 private static final int M = 1 << 20;
...
121 public long recursiveMethod(int depth) {
123 if (depth == 0) {
124 notifyAgentToGetLocalAndWaitShortly(M - 100, waitTimeInNativeAfterNotify);
126 } else {
127 recursiveMethod(--depth);
128 }
129 }

At least, the test is missing the comments explaining all these.

Thanks, Serguei

On 8/9/20 22:35, David Holmes wrote:

Hi Richard,

On 31/07/2020 5:28 pm, Reingruber, Richard wrote:

Hi,

I rebased the fix after JDK-8250042.

New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/

The general fix for this seems good. A minor nit:

588 if (!is_assignable(signature, ob_k, Thread::current())) {

You know that the current thread is the VMThread so can use VMThread::vm_thread(). Similarly for this existing code:

694 Thread* current_thread = Thread::current();

---

Looking at the test code ... I'm less clear on exactly what is happening and the use of spin-waits raises some red-flags for me in terms of test reliability on different platforms. The "while (--waitCycles > 0)" loop in particular offers no certainty that the agent thread is executing anything in particular. And the use of the spin_count as a guide to future waiting time seems somewhat arbitrary.
In all seriousness I got a headache trying to work out how the test was expecting to operate. Some parts could be simplified using raw monitors, I think. But there's no sure way to know the agent thread is in the midst of the stackwalk when the target thread wants to leave the native code. So I understand what you are trying to achieve here, I'm just not sure how reliably it will actually achieve it.

test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp

32 static volatile jlong spinn_count = 0;

Using a 64-bit counter seems like it will be a problem on 32-bit systems. Should be spin_count not spinn_count.

36 // Agent thread waits for value != 0, then performas the JVMTI call to get local variable.

typo: performas

Thanks, David
-----

Thanks, Richard.

-----Original Message-----
From: serviceability-dev serviceability-dev-retn at openjdk.java.net On Behalf Of Reingruber, Richard
Sent: Monday, 27 July 2020 09:45
To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Serguei,

> I tested it on Linux and Windows but not yet on MacOS.

The test succeeded now on all platforms.

Thanks, Richard.

-----Original Message-----
From: Reingruber, Richard
Sent: Friday, 24 July 2020 15:04
To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
Subject: RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Serguei,

The fix itself looks good to me.

thanks for looking at the fix.

I still need another look at new test. Could you, please, convert the agent of new test to C++? It will make it a little bit simpler.

Sure, here is the new webrev.1 with a C++ version of the test agent:

http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/

I tested it on Linux and Windows but not yet on MacOS.

Thanks, Richard.
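The refactoring discussed earlier in this thread, where recursiveMethod() returns the depth it reached instead of updating a static counter, might look roughly like the Java sketch below. This is an assumption-laden illustration, not the test's code: the class name RecursionDepthSketch and the two-argument signature are hypothetical, and catching StackOverflowError in the frame that made the failing call is just one way to report the reached depth. The real test additionally notifies the agent at the target depth, which is omitted here.

```java
// Illustrative sketch only; not the code from the webrev.
final class RecursionDepthSketch {

    // Recurse until maxDepth is reached or the stack overflows,
    // and return the depth at which the recursion ended.
    static int recursiveMethod(int depth, int maxDepth) {
        if (depth >= maxDepth) {
            return depth;
        }
        try {
            return recursiveMethod(depth + 1, maxDepth);
        } catch (StackOverflowError e) {
            // The deeper call did not fit on the stack: report this frame's depth.
            return depth;
        }
    }

    public static void main(String[] args) {
        int targetDepth = 1 << 20;
        int reached = recursiveMethod(0, targetDepth);
        if (reached < targetDepth) {
            // Same idea as the quoted test fragment: retry with a smaller target depth.
            System.out.println("StackOverflowError at depth " + reached + ", retry with a smaller target");
        }
    }
}
```

Returning the depth makes the data flow visible at the call site, which addresses the remark that it was not obvious where the counter was updated.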
-----Original Message-----
From: serguei.spitsyn at oracle.com
Sent: Friday, 24 July 2020 00:00
To: Reingruber, Richard richard.reingruber at sap.com; serviceability-dev at openjdk.java.net
Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()

Hi Richard,

Thank you for filing the CR and taking care of it! The fix itself looks good to me. I still need another look at the new test. Could you, please, convert the agent of the new test to C++? It will make it a little bit simpler.

Thanks, Serguei

On 7/20/20 01:15, Reingruber, Richard wrote:

Hi,

please help review this fix for VM_GetOrSetLocal. It moves the unsafe stackwalk from the vm operation prologue before the safepoint into the doit() method executed at the safepoint.

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html
Bug: https://bugs.openjdk.java.net/browse/JDK-8249293

According to the JVMTI spec on local variable access it is not required to suspend the target thread T [1]. The operation will simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing bytecodes. It will succeed though if T is blocked because of synchronization or executing some native code.

The issue is that in the latter case the stack walk in VM_GetOrSetLocal::doit_prologue() to prepare the access to the local variable is unsafe, because it is done before the safepoint and it races with T returning to execute bytecodes, making its stack not walkable. The included test shows that this can crash the VM if T wins the race.

Manual testing:

- new test test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java
- test/hotspot/jtreg/vmTestbase/nsk/jvmti
- test/hotspot/jtreg/serviceability/jvmti

Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms

Thanks, Richard.
[1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local -------------- next part -------------- An HTML attachment was scrubbed... URL: From claes.redestad at oracle.com Tue Aug 25 11:23:31 2020 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 25 Aug 2020 13:23:31 +0200 Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail) In-Reply-To: <09c53cd507e140d4897a5db9b55419d8@tencent.com> References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> <8cee7938941048f4b007b1663fab4b95@tencent.com> <8dae62e2-22a6-0442-8c6a-0c666f9b1ad4@oracle.com> <09c53cd507e140d4897a5db9b55419d8@tencent.com> Message-ID: <97ec2ba7-405e-1e49-566c-ae157df1b2cd@oracle.com> Hi Jie, fix looks good to me! /Claes On 2020-08-25 04:12, jiefu(??) wrote: > Thanks Serguei for your review. > > Claes, are you okay with this change: > http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/ > > Thanks. > Best regards, > Jie > > > ------------------------------------------------------------------------ > *From:* serguei.spitsyn at oracle.com > *Sent:* Tuesday, August 25, 2020 8:38 AM > *To:* jiefu(??); serviceability-dev at openjdk.java.net; Claes Redestad > *Subject:* Re: 8251155: HostIdentifier fails to canonicalize hostnames > starting with digits(Internet mail) > Hi Jie, > > I'm okay with the fix. > > Thanks, > Serguei > > > On 8/24/20 09:21, jiefu(??) wrote: >> >> Hi Serguei and Claes, >> >> I forget to mention that you can also verify this fix using the >> following tests: >> >> ---------------------------------------------------------- >> >> test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java >> >> test/jdk/sun/tools/jstatd/TestJstatdPort.java >> >> test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java >> >> test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java >> >> ---------------------------------------------------------- >> >> Without the patch, All of them will fail if the hostname starting from >> digits. 
>>
>> We've found that it seems very common that the hostname will start
>> with digits in dockers.
>>
>> So it would be better to fix it.
>>
>> What do you think?
>>
>> Thanks.
>>
>> Best regards,
>>
>> Jie
>>
>> *From: *"jiefu(傅杰)"
>> *Date: *Wednesday, August 19, 2020 at 4:05 PM
>> *To: *"serguei.spitsyn at oracle.com" ,
>> "serviceability-dev at openjdk.java.net"
>> , Claes Redestad
>>
>> *Subject: *Re: 8251155: HostIdentifier fails to canonicalize hostnames
>> starting with digits(Internet mail)
>>
>> Hi Serguei,
>>
>> Thanks for your review and help.
>>
>> Please see comments inline.
>>
>> ------------------------------------------------------------------------
>>
>> *From:*serguei.spitsyn at oracle.com
>> *Sent:* Wednesday, August 19, 2020 4:03 AM
>> *To:* jiefu(傅杰); serviceability-dev at openjdk.java.net; Claes Redestad
>> *Subject:* Re: 8251155: HostIdentifier fails to canonicalize hostnames
>> starting with digits(Internet mail)
>>
>> 83 *
>> 84 *   {@code } - transformed into "//localhost"
>> 85 *   localhost - transformed into "//localhost"
>> 86 *   hostname - transformed into "//hostname"
>> 87 *   hostname:port - transformed into "//hostname:port"
>> 88 *   proto:hostname - transformed into "proto://hostname"
>> 89 *   proto:hostname:port - transformed into
>> 90 *          "proto://hostname:port"
>> 91 *   proto://hostname:port
>> 92 *
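The transformation rules in the quoted Javadoc list can be sketched in Java as follows. This is a hypothetical illustration and not the JDK's HostIdentifier code: the names HostIdSketch, canonicalize and isPort are invented here, treating a null or empty string like the first list entry is an assumption, and so is the rule that a second component consisting only of digits is a port. IPv6 literals are not handled. The point of the sketch is that a hostname starting with digits, such as the made-up name 1234abcd, goes through the same rules as any other hostname.

```java
// Illustrative sketch only; this is NOT the JDK's HostIdentifier implementation.
final class HostIdSketch {

    // True if s is a non-empty run of ASCII digits (treated here as a port number).
    static boolean isPort(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    // Apply the transformations from the quoted list to a host identifier string.
    static String canonicalize(String s) {
        if (s == null || s.isEmpty() || s.equals("localhost")) {
            return "//localhost";
        }
        if (s.contains("//")) {
            return s; // already of the form proto://hostname:port or //hostname
        }
        String[] parts = s.split(":", -1);
        switch (parts.length) {
            case 1:  // hostname
                return "//" + s;
            case 2:  // hostname:port or proto:hostname
                return isPort(parts[1]) ? "//" + s : parts[0] + "://" + parts[1];
            case 3:  // proto:hostname:port
                return parts[0] + "://" + parts[1] + ":" + parts[2];
            default:
                throw new IllegalArgumentException("unsupported host identifier: " + s);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "localhost", "1234abcd", "1234abcd:1099",
            "rmi:1234abcd", "rmi:1234abcd:1099", "rmi://1234abcd:1099"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + canonicalize(sample));
        }
    }
}
```

Deciding on the component after the colon, rather than on the first character of the string, is what keeps digit-leading hostnames from being mistaken for something else in this sketch.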
>>
>> >> Is it worth to add an example to the list above?
>>
>> Yes. It's really helpful for the review process. Thanks.
>>
>> >> I wonder if this fix needs a CSR.
>>
>> I don't think so.
>>
>> This is just a bug fix which doesn't add/remove/change any feature of
>> the tools.
>>
>> The original design has claimed to support hostname and hostname:port
>> cases.
>>
>> But it fails to do so when the hostname starts with digits.
>>
>> It seems to be very common that the hostname will be started with
>> digits in dockers.
>>
>> So I think it's worth to fix this bug.
>>
>> >> How did you check this fix does not introduce any regressions?
>>
>> In fact, Claes had helped me to answer this question here:
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-August/032691.html.
>>
>> Also, I've tested this patch on Linux/x64 with
>> tier1 ~ tier3 (no regression).
>>
>> Thanks a lot.
>>
>> Best regards,
>>
>> Jie
>>
>

From jiefu at tencent.com Tue Aug 25 12:12:13 2020
From: jiefu at tencent.com (jiefu(傅杰))
Date: Tue, 25 Aug 2020 12:12:13 +0000
Subject: 8251155: HostIdentifier fails to canonicalize hostnames starting with digits(Internet mail)
In-Reply-To: <97ec2ba7-405e-1e49-566c-ae157df1b2cd@oracle.com>
References: <1C8FB411-F73A-4197-A8C9-28B2BE3BFBA0@tencent.com> <8cee7938941048f4b007b1663fab4b95@tencent.com> <8dae62e2-22a6-0442-8c6a-0c666f9b1ad4@oracle.com> <09c53cd507e140d4897a5db9b55419d8@tencent.com> <97ec2ba7-405e-1e49-566c-ae157df1b2cd@oracle.com>
Message-ID:

Thanks Claes for your review.
Pushed.

Best regards,
Jie

On 2020/8/25, 7:26 PM, "Claes Redestad" wrote:

Hi Jie,

fix looks good to me!

/Claes

On 2020-08-25 04:12, jiefu(傅杰) wrote:
> Thanks Serguei for your review.
>
> Claes, are you okay with this change:
> http://cr.openjdk.java.net/~jiefu/8251155/webrev.00/
>
> Thanks.
> Best regards, > Jie > > > ------------------------------------------------------------------------ > *From:* serguei.spitsyn at oracle.com > *Sent:* Tuesday, August 25, 2020 8:38 AM > *To:* jiefu(??); serviceability-dev at openjdk.java.net; Claes Redestad > *Subject:* Re: 8251155: HostIdentifier fails to canonicalize hostnames > starting with digits(Internet mail) > Hi Jie, > > I'm okay with the fix. > > Thanks, > Serguei > > > On 8/24/20 09:21, jiefu(??) wrote: >> >> Hi Serguei and Claes, >> >> I forget to mention that you can also verify this fix using the >> following tests: >> >> ---------------------------------------------------------- >> >> test/jdk/sun/tools/jstatd/TestJstatdExternalRegistry.java >> >> test/jdk/sun/tools/jstatd/TestJstatdPort.java >> >> test/jdk/sun/tools/jstatd/TestJstatdPortAndServer.java >> >> test/jdk/sun/tools/jstatd/TestJstatdRmiPort.java >> >> ---------------------------------------------------------- >> >> Without the patch, All of them will fail if the hostname starting from >> digits. >> >> We've found that it seems very common that the hostname will start >> with digits in dockers. >> >> So it would be better to fix it. >> >> What do you think? >> >> Thanks. >> >> Best regards, >> >> Jie >> >> *From: *"jiefu(??)" >> *Date: *Wednesday, August 19, 2020 at 4:05 PM >> *To: *"serguei.spitsyn at oracle.com" , >> "serviceability-dev at openjdk.java.net" >> , Claes Redestad >> >> *Subject: *Re: 8251155: HostIdentifier fails to canonicalize hostnames >> starting with digits(Internet mail) >> >> Hi Serguei, >> >> Thanks for your review and help. >> >> Please see comments inline. >> >> ------------------------------------------------------------------------ >> >> *From:*serguei.spitsyn at oracle.com >> *Sent:* Wednesday, August 19, 2020 4:03 AM >> *To:* jiefu(??); serviceability-dev at openjdk.java.net; Claes Redestad >> *Subject:* Re: 8251155: HostIdentifier fails to canonicalize hostnames >> starting with digits(Internet mail) >> >> 83 *
>>   • {@code } - transformed into "//localhost"
>>   • localhost - transformed into "//localhost"
>>   • hostname - transformed into "//hostname"
>>   • hostname:port - transformed into "//hostname:port"
>>   • proto:hostname - transformed into "proto://hostname"
>>   • proto:hostname:port - transformed into "proto://hostname:port"
>>   • proto://hostname:port
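The transformation rules quoted above can be illustrated with a small standalone sketch. This is not the JDK's HostIdentifier code; the class name HostIdSketch and the helpers canonicalize and isPort are hypothetical and exist only for illustration. It assumes that a two-part "a:b" string is "hostname:port" when the second part is all digits and "proto:hostname" otherwise, which is one way to let hostnames that start with a digit pass through unchanged.

```java
// Hypothetical illustration of the documented rules; not the JDK implementation.
class HostIdSketch {

    // Returns a canonical "[proto:]//hostname[:port]" string for the input.
    static String canonicalize(String s) {
        if (s == null || s.equals("localhost")) {
            return "//localhost";
        }
        if (s.contains("//")) {
            // Already in "//host..." or "proto://host..." form: leave it alone.
            return s;
        }
        String[] parts = s.split(":");
        switch (parts.length) {
            case 1:
                // Bare hostname, which may start with a digit.
                return "//" + s;
            case 2:
                // Assumption: an all-digit second part is a port, otherwise
                // the first part is a protocol name.
                return isPort(parts[1]) ? "//" + s : parts[0] + "://" + parts[1];
            case 3:
                return parts[0] + "://" + parts[1] + ":" + parts[2];
            default:
                throw new IllegalArgumentException("unsupported form: " + s);
        }
    }

    // True when the string is non-empty and consists only of digits.
    static boolean isPort(String p) {
        return !p.isEmpty() && p.chars().allMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("1hostname"));      // //1hostname
        System.out.println(canonicalize("1hostname:2000")); // //1hostname:2000
        System.out.println(canonicalize("rmi:1hostname"));  // rmi://1hostname
    }
}
```

The digit-leading cases in main are the ones the bug report is about: a container hostname such as "1hostname" should be treated like any other hostname rather than rejected.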
>>
>> Is it worth adding an example to the list above?
>>
>> Yes. It's really helpful for the review process. Thanks.
>>
>> I wonder if this fix needs a CSR.
>>
>> I don't think so.
>>
>> This is just a bug fix which doesn't add/remove/change any feature of
>> the tools.
>>
>> The original design claims to support the hostname and hostname:port
>> cases.
>>
>> But it fails to do so when the hostname starts with digits.
>>
>> It seems to be very common for hostnames to start with digits in
>> Docker containers.
>>
>> So I think it's worth fixing this bug.
>>
>> How did you check this fix does not introduce any regressions?
>>
>> In fact, Claes had helped me to answer this question here:
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-August/032691.html.
>>
>> Also, I've tested this patch on Linux/x64 with
>> tier1 ~ tier3 (no regression).
>>
>> Thanks a lot.
>>
>> Best regards,
>>
>> Jie
>>
>

From hohensee at amazon.com Tue Aug 25 20:28:26 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 25 Aug 2020 20:28:26 +0000
Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument
Message-ID:

:)

New webrevs following Volker's suggestion.

http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.06/
http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.06/

Passes

jdk/test/com/sun/management
jdk/test/java/lang/management
jdk/test/sun/management
jdk/test/javax/management

Paul

On 8/21/20, 1:39 PM, "serguei.spitsyn at oracle.com" wrote:

    On 8/21/20 11:07, serguei.spitsyn at oracle.com wrote:
    > Hi Paul,

    Sorry, Volker, for using this "indirection".
    I hope Paul redirected my "Hi" to you. :)

    Thanks,
    Serguei

    >
    > Thank you for the explanation.
> > Thanks, > Serguei > > > On 8/21/20 10:54, Volker Simonis wrote: >> On Thu, Aug 20, 2020 at 10:06 PM serguei.spitsyn at oracle.com >> wrote: >>> Hi Paul, >>> >>> I was also wondering if there is a compatibility risk involved with >>> the JMM_VERSION change. >>> So, thanks to Volker for asking these questions. >>> >>> One more question. >>> I do not see a backport of the >>> src/jdk.management/share/native/libmanagement_ext/management_ext.c >>> change. >>> Is it intentional, and if so, what is the reason to skip this file? >>> >> "libmanagement_ext/management_ext.c" doesn't exist in jdk8. It was >> introduced with "8042901: Allow com.sun.management to be in a >> different module to java.lang.management" in jdk9. In jdk8 all the >> functionality is in "management/management.h" so there's no need to >> backport the changes from "management_ext.c" . >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8042901 >> >>> Thanks, >>> Serguei >>> >>> >>> On 8/20/20 11:30, Volker Simonis wrote: >>> >>> On Wed, Aug 19, 2020 at 6:17 PM Hohensee, Paul >>> wrote: >>> >>> Please review this backport to jdk8u. I especially need a CSR >>> review, since the CSR approval process can be a bottleneck. The >>> patch significantly reduces fleet profiling overhead, and a version >>> of it has been in production at Amazon for over 3 years. >>> >>> >>> >>> Original JBS issue: https://bugs.openjdk.java.net/browse/JDK-8185003 >>> >>> Original CSR: https://bugs.openjdk.java.net/browse/JDK-8185705 >>> >>> Original patch: >>> http://hg.openjdk.java.net/jdk10/master/rev/68d46cb9be45 >>> >>> >>> >>> Backport JBS issue: https://bugs.openjdk.java.net/browse/JDK-8251494 >>> >>> Backport CSR: https://bugs.openjdk.java.net/browse/JDK-8251498 >>> >>> Backport JDK webrev: >>> http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.05/ >>> >>> JDK part looks good to me. 
>>> >>> Backport Hotspot webrev: >>> http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.05/ >>> >>> HotSpot part looks good to me but see discussion below. >>> >>> >>> Details of the interface changes needed for the backport are in the >>> Description of the Backport CSR 8251498. The actual functional >>> changes are minimal and low risk. >>> >>> I've also reviewed the CSR yesterday which I think is fine. But now, >>> when looking at the implementation, I'm a little concerned about >>> changing JMM_VERSION from " 0x20010203" to "0x20020000" in "jmm.h". >>> >>> This might be especially problematic in combination with the changes >>> in "Management::get_jmm_interface()" which is called by >>> JVM_GetManagement(): >>> >>> void* Management::get_jmm_interface(int version) { >>> #if INCLUDE_MANAGEMENT >>> - if (version == JMM_VERSION_1_0) { >>> + if (version == JMM_VERSION) { >>> return (void*) &jmm_interface; >>> } >>> #endif // INCLUDE_MANAGEMENT >>> return NULL; >>> } >>> >>> You've correctly fixed the single caller of "JVM_GetManagement()" in >>> the JDK (in "JNI_OnLoad()" in "management.c"): >>> >>> - jmm_interface = (JmmInterface*) >>> JVM_GetManagement(JMM_VERSION_1_0); >>> + jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION); >>> >>> but I wonder if there are other monitoring/serviceability tools out >>> there which use this interface and which will break after this change. >>> A quick search revealed at least two StackOverflow entries which >>> recommend using "JVM_GetManagement(JMM_VERSION_1_0)" [1,2] and there's >>> a talk and a blog entry doing the same [3, 4]. >>> >>> I'm not sure how relevant this is but I think a much safer and >>> backwards-compatible way of doing this downport would be the >>> following: >>> >>> - don't change "Management::get_jmm_interface()" (i.e. still check for >>> "JMM_VERSION_1_0") but return the "new" JMM_VERSION in >>> "jmm_GetVersion()". 
This won't break anything but will make it >>> possible for clients to detect the new version if they want. >>> >>> - don't change the signature of "DumpThreads()". Instead add a new >>> version (e.g. "DumpThreadsMaxDepth()/jmm_DumpThreadsMaxDepth()") to >>> the "JMMInterface" struct and to "jmm_interface" in "management.cpp". >>> You can do this in one of the two first, reserved fields of >>> "JMMInterface" so you won't break binary compatibility. >>> "jmm_DumpThreads()" will then be a simple wrapper which calls >>> "jmm_DumpThreadsMaxDepth()" with Integer.MAX_VALUE as depth. >>> >>> - in the jdk you then simply call "DumpThreadsMaxDepth()" in >>> "Java_sun_management_ThreadImpl_dumpThreads0()" >>> >>> I think this way we can maintain full binary compatibility while still >>> using the new feature. What do you think? >>> >>> Best regards, >>> Volker >>> >>> [1] >>> https://urldefense.com/v3/__https://stackoverflow.com/questions/23632653/generate-java-heap-dump-on-uncaught-exception__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-KqVsyaF$ >>> [2] >>> https://urldefense.com/v3/__https://stackoverflow.com/questions/60887816/jvm-options-printnmtstatistics-save-info-to-file__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-Ip7MAQ5$ >>> [3] >>> https://urldefense.com/v3/__https://sudonull.com/post/25841-JVM-TI-how-to-make-a-plugin-for-a-virtual-machine-Odnoklassniki-company-blog__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-ErSjPdD$ >>> [4] >>> https://urldefense.com/v3/__https://2019.jpoint.ru/talks/2o8scc5jbaurnqqlsydzxv/__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-Oxb5CQ-$ >>> >>> Passes the included (suitably modified) test, as well as the tests in >>> >>> >>> >>> jdk/test/java/lang/management/ThreadMXBean >>> >>> jdk/test/com/sun/management/ThreadMXBean >>> >>> >>> >>> Thanks, >>> >>> Paul >>> >>> > From patricio.chilano.mateo at oracle.com 
Wed Aug 26 01:13:12 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Tue, 25 Aug 2020 22:13:12 -0300
Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To:
References:
Message-ID: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com>

Hi Yasumasa,

On 8/23/20 11:40 PM, Yasumasa Suenaga wrote:
> Hi all,
>
> I want to hear your opinions about the change for JDK-8242427.
>
> I'm trying to migrate following operations to direct handshake.
>
>     - VM_UpdateForPopTopFrame
>     - VM_SetFramePop
>     - VM_GetCurrentLocation
>
> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure)
> might be called at safepoint, so I want to use
> JavaThread::active_handshaker() in production VM to detect the process
> is in direct handshake or not.
>
> However this function is available in debug VM only, so I want to hear
> the reason why it is for debug VM only, and there are no problem to
> use it in production VM. Of course another solutions are welcome.

I added the _active_handshaker field to the HandshakeState class when
working on 8230594 to adjust some asserts, where instead of checking for
the VMThread we needed to check for the active handshaker of the target
JavaThread. Since there were no other users of it, there was no point in
declaring it and having to write to it for the release bits. There are no
issues with having it in production though so you could change that if
necessary.

> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti}
> serviceability/{jdwp,jvmti})
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/

Some comments on the proposed change.

src/hotspot/share/prims/jvmtiEnvThreadState.cpp,
src/hotspot/share/prims/jvmtiEventController.cpp

Why is the check to decide whether to call the handshake or execute the
operation with the current thread different for GetCurrentLocationClosure
vs EnterInterpOnlyModeClosure?

(GetCurrentLocationClosure)
if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) {
    op.do_thread(_thread);
} else {
    Handshake::execute_direct(&op, _thread);
}

vs

(EnterInterpOnlyModeClosure)
if (target->active_handshaker() != NULL) {
    hs.do_thread(target);
} else {
    Handshake::execute_direct(&hs, target);
}

If you change VM_SetFramePop to use handshakes then it seems you could
reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the
current thread being the target.

Also I think you want the second expression of that check to be
(target->active_handshaker() == Thread::current()). So either you are the
target or the current active_handshaker for that target. Otherwise
active_handshaker() could be not NULL because there is another JavaThread
handshaking the same target. Unless you are certain that it can never
happen, so if active_handshaker() is not NULL it is always the current
thread, but even in that case this way is safer.

src/hotspot/share/prims/jvmtiThreadState.cpp

The guarantee() statement exists in release builds too so the
"#ifdef ASSERT" directive should be removed, otherwise "current" will not
be declared.

Thanks!

Patricio

> Thanks,
>
> Yasumasa

From david.holmes at oracle.com Wed Aug 26 07:16:00 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 26 Aug 2020 17:16:00 +1000
Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To:
References:
Message-ID:

Hi Yasumasa,

On 24/08/2020 12:40 pm, Yasumasa Suenaga wrote:
> Hi all,
>
> I want to hear your opinions about the change for JDK-8242427.
>
> I'm trying to migrate following operations to direct handshake.
>
>     - VM_UpdateForPopTopFrame
>     - VM_SetFramePop
>     - VM_GetCurrentLocation
>
> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure)
> might be called at safepoint, so I want to use
> JavaThread::active_handshaker() in production VM to detect the process
> is in direct handshake or not.
>
> However this function is available in debug VM only, so I want to hear
> the reason why it is for debug VM only, and there are no problem to use
> it in production VM. Of course another solutions are welcome.

I don't think it should be necessary to use that function in general nor
safe in general - the only safe thing you can do is check if the current
thread is the target's active handshaker (via an assert). It is of no use
to you to know that the target thread is involved in a handshake with a
different thread for a different reason, as it can leave that handshake at
any time and its stack will not be walkable. You must perform a handshake
from the current thread to the target.

Cheers,
David
-----

> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti}
> serviceability/{jdwp,jvmti})
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/
>
>
> Thanks,
>
> Yasumasa

From suenaga at oss.nttdata.com Wed Aug 26 07:34:33 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 26 Aug 2020 16:34:33 +0900
Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com>
References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com>
Message-ID: <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com>

Hi Patricio, David,

Thanks for your comment!

I updated webrev which includes the fix which is commented by Patricio,
and it passed submit repo. So I switch this mail thread to RFR.

  JBS: https://bugs.openjdk.java.net/browse/JDK-8242427
  webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/

I understand David said same concerns as Patricio about active handshaker.
This webrev checks active handshaker is current thread or not. Cheers, Yasumasa On 2020/08/26 10:13, Patricio Chilano wrote: > Hi Yasumasa, > > On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >> Hi all, >> >> I want to hear your opinions about the change for JDK-8242427. >> >> I'm trying to migrate following operations to direct handshake. >> >> ??? - VM_UpdateForPopTopFrame >> ??? - VM_SetFramePop >> ??? - VM_GetCurrentLocation >> >> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >> >> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. > I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. > >> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ > Some comments on the proposed change. > > src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp > Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? > > (GetCurrentLocationClosure) > if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { > ???? op.do_thread(_thread); > } else { > ???? 
Handshake::execute_direct(&op, _thread); > } > > vs > > (EnterInterpOnlyModeClosure) > if (target->active_handshaker() != NULL) { > ??? hs.do_thread(target); > } else { > ??? Handshake::execute_direct(&hs, target); > } > > If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. > Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. > > src/hotspot/share/prims/jvmtiThreadState.cpp > The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. > > Thanks! > > Patricio >> Thanks, >> >> Yasumasa > From robbin.ehn at oracle.com Wed Aug 26 09:13:51 2020 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 26 Aug 2020 11:13:51 +0200 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> Message-ID: <4d6243a0-0883-472b-94be-104a595cb4a8@oracle.com> Hi Yasumasa, You cannot take the MutexLocker mu(JvmtiThreadState_lock) with safepoint checks inside a handshake. We are missing a NoSafepointVerifier for handshakes. (I have added this in my work in progress asynchronous handshake patch) Also this can deadlock with the handshake semaphore. (In my asynch handshake patch I have change the sema to a mutex, thus lock ranking works.) 
I solved this by just taking the mutex before the handshake. And removed the internal locking from set_frame_pop, etc... If there is an issue holding the JvmtiThreadState_lock to long, it should split to a per thread lock instead. (Since often the thread is suppose to be suspended, one could consider using the SR lock for serializing access to the per thread JvmtiThreadState instead.) Thanks, Robbin On 2020-08-26 09:34, Yasumasa Suenaga wrote: > Hi Patricio, David, > > Thanks for your comment! > > I updated webrev which includes the fix which is commented by Patricio, > and it passed submit repo. So I switch this mail thread to RFR. > > ? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 > ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ > > I understand David said same concerns as Patricio about active > handshaker. This webrev checks active handshaker is current thread or not. > > > Cheers, > > Yasumasa > > > On 2020/08/26 10:13, Patricio Chilano wrote: >> Hi Yasumasa, >> >> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> I want to hear your opinions about the change for JDK-8242427. >>> >>> I'm trying to migrate following operations to direct handshake. >>> >>> ??? - VM_UpdateForPopTopFrame >>> ??? - VM_SetFramePop >>> ??? - VM_GetCurrentLocation >>> >>> Some operations (VM_GetCurrentLocation and >>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want >>> to use JavaThread::active_handshaker() in production VM to detect the >>> process is in direct handshake or not. >>> >>> However this function is available in debug VM only, so I want to >>> hear the reason why it is for debug VM only, and there are no problem >>> to use it in production VM. Of course another solutions are welcome. 
>> I added the _active_handshaker field to the HandshakeState class when >> working on 8230594 to adjust some asserts, where instead of checking >> for the VMThread we needed to check for the active handshaker of the >> target JavaThread. Since there were no other users of it, there was no >> point in declaring it and having to write to it for the release bits. >> There are no issues with having it in production though so you could >> change that if necessary. >> >>> webrev is here. It passed jtreg tests >>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >> Some comments on the proposed change. >> >> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, >> src/hotspot/share/prims/jvmtiEventController.cpp >> Why is the check to decide whether to call the handshake or execute >> the operation with the current thread different for >> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >> >> (GetCurrentLocationClosure) >> if ((Thread::current() == _thread) || (_thread->active_handshaker() != >> NULL)) { >> ????? op.do_thread(_thread); >> } else { >> ????? Handshake::execute_direct(&op, _thread); >> } >> >> vs >> >> (EnterInterpOnlyModeClosure) >> if (target->active_handshaker() != NULL) { >> ???? hs.do_thread(target); >> } else { >> ???? Handshake::execute_direct(&hs, target); >> } >> >> If you change VM_SetFramePop to use handshakes then it seems you could >> reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the >> current thread being the target. >> Also I think you want the second expression of that check to be >> (target->active_handshaker() == Thread::current()). So either you are >> the target or the current active_handshaker for that target. Otherwise >> active_handshaker() could be not NULL because there is another >> JavaThread handshaking the same target. 
Unless you are certain that it >> can never happen, so if active_handshaker() is not NULL it is always >> the current thread, but even in that case this way is safer. >> >> src/hotspot/share/prims/jvmtiThreadState.cpp >> The guarantee() statement exists in release builds too so the "#ifdef >> ASSERT" directive should be removed, otherwise "current" will not be >> declared. >> >> Thanks! >> >> Patricio >>> Thanks, >>> >>> Yasumasa >> From suenaga at oss.nttdata.com Wed Aug 26 12:15:57 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Wed, 26 Aug 2020 21:15:57 +0900 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <4d6243a0-0883-472b-94be-104a595cb4a8@oracle.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <4d6243a0-0883-472b-94be-104a595cb4a8@oracle.com> Message-ID: <8643478b-5b2b-e575-52cf-e233c28c4644@oss.nttdata.com> Hi Robbin, Thanks for your comment! How about this change? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.01/ diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6 I still use JvmtiThreadState_lock because it has a different locking range from SR lock. Thanks, Yasumasa On 2020/08/26 18:13, Robbin Ehn wrote: > Hi Yasumasa, > > You cannot take the MutexLocker mu(JvmtiThreadState_lock) with safepoint checks inside a handshake. > We are missing a NoSafepointVerifier for handshakes. > (I have added this in my work in progress asynchronous handshake patch) > > Also this can deadlock with the handshake semaphore. > (In my asynch handshake patch I have change the sema to a mutex, thus lock ranking works.) > > I solved this by just taking the mutex before the handshake. > And removed the internal locking from set_frame_pop, etc... > If there is an issue holding the JvmtiThreadState_lock to long, it should split to a per thread lock instead. 
> (Since often the thread is suppose to be suspended, one could consider using the SR lock for serializing access to the per thread JvmtiThreadState instead.) > > Thanks, Robbin > > On 2020-08-26 09:34, Yasumasa Suenaga wrote: >> Hi Patricio, David, >> >> Thanks for your comment! >> >> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >> >> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >> >> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >> >> >> Cheers, >> >> Yasumasa >> >> >> On 2020/08/26 10:13, Patricio Chilano wrote: >>> Hi Yasumasa, >>> >>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> I want to hear your opinions about the change for JDK-8242427. >>>> >>>> I'm trying to migrate following operations to direct handshake. >>>> >>>> ??? - VM_UpdateForPopTopFrame >>>> ??? - VM_SetFramePop >>>> ??? - VM_GetCurrentLocation >>>> >>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>> >>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. 
There are no issues with having it in production though so you could change that if necessary. >>> >>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>> Some comments on the proposed change. >>> >>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>> >>> (GetCurrentLocationClosure) >>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>> ????? op.do_thread(_thread); >>> } else { >>> ????? Handshake::execute_direct(&op, _thread); >>> } >>> >>> vs >>> >>> (EnterInterpOnlyModeClosure) >>> if (target->active_handshaker() != NULL) { >>> ???? hs.do_thread(target); >>> } else { >>> ???? Handshake::execute_direct(&hs, target); >>> } >>> >>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>> >>> src/hotspot/share/prims/jvmtiThreadState.cpp >>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>> >>> Thanks! 
>>> >>> Patricio >>>> Thanks, >>>> >>>> Yasumasa >>> From robbin.ehn at oracle.com Wed Aug 26 12:32:06 2020 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 26 Aug 2020 14:32:06 +0200 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <8643478b-5b2b-e575-52cf-e233c28c4644@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <4d6243a0-0883-472b-94be-104a595cb4a8@oracle.com> <8643478b-5b2b-e575-52cf-e233c28c4644@oss.nttdata.com> Message-ID: Hi Yasumasa, Yes that should work. Can you please add assert where you removed the: - MutexLocker mu(JvmtiThreadState_lock); E.g. + // If we are in a handshake we only know that the requesting thread should have locked it. + assert(SafepointSynchronize::is_at_safepoint() || JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); Because I think you missing a MutexLocker in: jvmtiExport.cpp line ~1650: // remove the frame's entry ets->clear_frame_pop(cur_frame_number); In the method void JvmtiExport::post_method_exit(...). Thanks, Robbin On 2020-08-26 14:15, Yasumasa Suenaga wrote: > Hi Robbin, > > Thanks for your comment! > > How about this change? > > ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.01/ > ? diff from previous webrev: > http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6 > > I still use JvmtiThreadState_lock because it has a different locking > range from SR lock. > > > Thanks, > > Yasumasa > > > On 2020/08/26 18:13, Robbin Ehn wrote: >> Hi Yasumasa, >> >> You cannot take the MutexLocker mu(JvmtiThreadState_lock) with >> safepoint checks inside a handshake. >> We are missing a NoSafepointVerifier for handshakes. >> (I have added this in my work in progress asynchronous handshake patch) >> >> Also this can deadlock with the handshake semaphore. >> (In my asynch handshake patch I have change the sema to a mutex, thus >> lock ranking works.) 
>> >> I solved this by just taking the mutex before the handshake. >> And removed the internal locking from set_frame_pop, etc... >> If there is an issue holding the JvmtiThreadState_lock to long, it >> should split to a per thread lock instead. >> (Since often the thread is suppose to be suspended, one could consider >> using the SR lock for serializing access to the per thread >> JvmtiThreadState instead.) >> >> Thanks, Robbin >> >> On 2020-08-26 09:34, Yasumasa Suenaga wrote: >>> Hi Patricio, David, >>> >>> Thanks for your comment! >>> >>> I updated webrev which includes the fix which is commented by >>> Patricio, and it passed submit repo. So I switch this mail thread to >>> RFR. >>> >>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>> >>> I understand David said same concerns as Patricio about active >>> handshaker. This webrev checks active handshaker is current thread or >>> not. >>> >>> >>> Cheers, >>> >>> Yasumasa >>> >>> >>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>> Hi Yasumasa, >>>> >>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>> Hi all, >>>>> >>>>> I want to hear your opinions about the change for JDK-8242427. >>>>> >>>>> I'm trying to migrate following operations to direct handshake. >>>>> >>>>> ??? - VM_UpdateForPopTopFrame >>>>> ??? - VM_SetFramePop >>>>> ??? - VM_GetCurrentLocation >>>>> >>>>> Some operations (VM_GetCurrentLocation and >>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want >>>>> to use JavaThread::active_handshaker() in production VM to detect >>>>> the process is in direct handshake or not. >>>>> >>>>> However this function is available in debug VM only, so I want to >>>>> hear the reason why it is for debug VM only, and there are no >>>>> problem to use it in production VM. Of course another solutions are >>>>> welcome. 
>>>> I added the _active_handshaker field to the HandshakeState class >>>> when working on 8230594 to adjust some asserts, where instead of >>>> checking for the VMThread we needed to check for the active >>>> handshaker of the target JavaThread. Since there were no other users >>>> of it, there was no point in declaring it and having to write to it >>>> for the release bits. There are no issues with having it in >>>> production though so you could change that if necessary. >>>> >>>>> webrev is here. It passed jtreg tests >>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>> Some comments on the proposed change. >>>> >>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, >>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>> Why is the check to decide whether to call the handshake or execute >>>> the operation with the current thread different for >>>> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>> >>>> (GetCurrentLocationClosure) >>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() >>>> != NULL)) { >>>> ????? op.do_thread(_thread); >>>> } else { >>>> ????? Handshake::execute_direct(&op, _thread); >>>> } >>>> >>>> vs >>>> >>>> (EnterInterpOnlyModeClosure) >>>> if (target->active_handshaker() != NULL) { >>>> ???? hs.do_thread(target); >>>> } else { >>>> ???? Handshake::execute_direct(&hs, target); >>>> } >>>> >>>> If you change VM_SetFramePop to use handshakes then it seems you >>>> could reach JvmtiEventControllerPrivate::enter_interp_only_mode() >>>> with the current thread being the target. >>>> Also I think you want the second expression of that check to be >>>> (target->active_handshaker() == Thread::current()). So either you >>>> are the target or the current active_handshaker for that target. >>>> Otherwise active_handshaker() could be not NULL because there is >>>> another JavaThread handshaking the same target. 
Unless you are >>>> certain that it can never happen, so if active_handshaker() is not >>>> NULL it is always the current thread, but even in that case this way >>>> is safer. >>>> >>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>> The guarantee() statement exists in release builds too so the >>>> "#ifdef ASSERT" directive should be removed, otherwise "current" >>>> will not be declared. >>>> >>>> Thanks! >>>> >>>> Patricio >>>>> Thanks, >>>>> >>>>> Yasumasa >>>> From suenaga at oss.nttdata.com Wed Aug 26 14:33:18 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Wed, 26 Aug 2020 23:33:18 +0900 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <4d6243a0-0883-472b-94be-104a595cb4a8@oracle.com> <8643478b-5b2b-e575-52cf-e233c28c4644@oss.nttdata.com> Message-ID: Hi Robbin, I fixed them in new webrev. Could you review again? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.02/ diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/2ef7feb5681f It passed vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti} jtreg tests, so I think JVMTI functions works fine includes clear_frame_pop(). Thanks, Yasumasa On 2020/08/26 21:32, Robbin Ehn wrote: > Hi Yasumasa, > > Yes that should work. > > Can you please add assert where you removed the: > -? MutexLocker mu(JvmtiThreadState_lock); > E.g. > +? // If we are in a handshake we only know that the requesting thread should have locked it. > +? assert(SafepointSynchronize::is_at_safepoint() || JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); > > Because I think you missing a MutexLocker in: > jvmtiExport.cpp line ~1650: > > ??????? // remove the frame's entry > ??????? ets->clear_frame_pop(cur_frame_number); > > In the method void JvmtiExport::post_method_exit(...). 
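The arrangement Robbin describes in the quoted message above (the requester takes the lock before starting the handshake, and the callee only asserts that it is held) can be sketched outside HotSpot roughly as follows. This is a stand-alone model under stated assumptions, not HotSpot code: `TrackedLock`, `state_lock`, `at_safepoint`, `cleared_frames` and the two free functions are hypothetical stand-ins for `JvmtiThreadState_lock`, `SafepointSynchronize::is_at_safepoint()` and the JVMTI methods named in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a HotSpot Mutex that can report whether it is held.
class TrackedLock {
 public:
  void lock()   { _mutex.lock(); _locked.store(true); }
  void unlock() { _locked.store(false); _mutex.unlock(); }
  // True if some thread currently holds the lock (not necessarily the caller).
  bool is_locked() const { return _locked.load(); }
 private:
  std::mutex _mutex;
  std::atomic<bool> _locked{false};
};

static TrackedLock state_lock;     // stands in for JvmtiThreadState_lock (assumption)
static bool at_safepoint = false;  // stands in for SafepointSynchronize::is_at_safepoint()
static int cleared_frames = 0;     // call counter, for illustration only

// Callee: no internal locking; it only checks the precondition set up by the caller.
void clear_frame_pop(int frame_number) {
  assert((at_safepoint || state_lock.is_locked()) && "Safepoint or must be locked");
  (void)frame_number;
  ++cleared_frames;
}

// Caller: takes the lock first, then performs the operation.
void post_method_exit(int cur_frame_number) {
  std::lock_guard<TrackedLock> guard(state_lock);
  clear_frame_pop(cur_frame_number);
}
```

Calling `post_method_exit(3)` satisfies the assertion because the guard holds the lock for the duration of the call. Calling `clear_frame_pop()` directly without the lock, and outside a (modelled) safepoint, would trip the assertion in a debug build, which is the kind of missing `MutexLocker` the assert is meant to catch.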
> > Thanks, Robbin > > On 2020-08-26 14:15, Yasumasa Suenaga wrote: >> Hi Robbin, >> >> Thanks for your comment! >> >> How about this change? >> >> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.01/ >> ?? diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6 >> >> I still use JvmtiThreadState_lock because it has a different locking range from SR lock. >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/08/26 18:13, Robbin Ehn wrote: >>> Hi Yasumasa, >>> >>> You cannot take the MutexLocker mu(JvmtiThreadState_lock) with safepoint checks inside a handshake. >>> We are missing a NoSafepointVerifier for handshakes. >>> (I have added this in my work in progress asynchronous handshake patch) >>> >>> Also this can deadlock with the handshake semaphore. >>> (In my asynch handshake patch I have change the sema to a mutex, thus lock ranking works.) >>> >>> I solved this by just taking the mutex before the handshake. >>> And removed the internal locking from set_frame_pop, etc... >>> If there is an issue holding the JvmtiThreadState_lock to long, it should split to a per thread lock instead. >>> (Since often the thread is suppose to be suspended, one could consider using the SR lock for serializing access to the per thread JvmtiThreadState instead.) >>> >>> Thanks, Robbin >>> >>> On 2020-08-26 09:34, Yasumasa Suenaga wrote: >>>> Hi Patricio, David, >>>> >>>> Thanks for your comment! >>>> >>>> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >>>> >>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>> >>>> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. 

From robbin.ehn at oracle.com Wed Aug 26 15:08:56 2020
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 26 Aug 2020 17:08:56 +0200
Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To:
References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <4d6243a0-0883-472b-94be-104a595cb4a8@oracle.com> <8643478b-5b2b-e575-52cf-e233c28c4644@oss.nttdata.com>
Message-ID: <5871a9f0-9feb-25e6-ccd2-173dfd5e5c12@oracle.com>

Hi Yasumasa,

Thanks for fixing, seems good.
Note that there are JDK tests for JDI which also run this code under:
test/jdk/com/sun/jdi/

/Robbin

From volker.simonis at gmail.com Wed Aug 26 17:17:32 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 26 Aug 2020 19:17:32 +0200
Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument
In-Reply-To:
References:
Message-ID:

Hi Paul,

thanks for adapting your change. Please find my comments in-line below:

On Tue, Aug 25, 2020 at 10:28 PM Hohensee, Paul wrote:
>
> :)
>
> New webrevs following Volker's suggestion.
>
> http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.06/

Looks good except JNI_OnLoad() in "management.c" where I'd change the call to "JVM_GetManagement(JMM_VERSION)" back to "JVM_GetManagement(JMM_VERSION_1_0)". See discussion below...
> http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.06/
>

Looks good except Management::get_jmm_interface():

    if (version == JMM_VERSION) {
      return (void*) &jmm_interface;
    }

You still check for "JMM_VERSION" which is now "0x20020000" and thus incompatible with the old value of "JMM_VERSION_1 = 0x20010000". This will break compatibility with clients compiled against jmm.h before this change. It should therefore remain unchanged:

    if (version == JMM_VERSION_1_0) {
      return (void*) &jmm_interface;
    }

I think the variant "if (version == JMM_VERSION_1_0 || version == JMM_VERSION)", which we've briefly discussed, wouldn't work either, because a binary "JMM_VERSION_2" client would expect "DumpThreads" to have the additional "maxDepth" argument and would crash. So we can't have binary compatibility with both old jdk8 clients and new jdk11 clients.

Therefore, contrary to my previous mail, I'd also change "jmm_GetVersion()" to return the "old" JMM version (i.e. "0x20010203") because that's really the only one we're compatible with. In fact, this makes the whole addition of "JMM_VERSION_2" questionable, because after the proposed changes it wouldn't be used anymore. And after reasoning about it a little more, I think that's correct because we really only have binary compatibility with previous jdk8 clients and that's how it should be.

Thank you and best regards,
Volker

> Passes
>
> jdk/test/com/sun/management
> jdk/test/java/lang/management
> jdk/test/sun/management
> jdk/test/javax/management
>
> Paul
>
> On 8/21/20, 1:39 PM, "serguei.spitsyn at oracle.com" wrote:
>
> On 8/21/20 11:07, serguei.spitsyn at oracle.com wrote:
> > Hi Paul,
>
> Sorry, Volker, for using this "indirection".
> I hope, Paul redirected my "Hi" to you. :)
>
> Thanks,
> Serguei
>
> >
> > Thank you for explanation.
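A minimal sketch of the compatibility concern Volker raises in this message about the version gate in Management::get_jmm_interface(). It is a stand-alone model, not the HotSpot source: `JMM_VERSION_NEW`, the `JmmInterface` placeholder struct and the two `get_jmm_interface_*` functions are hypothetical names introduced for illustration, and it assumes `JMM_VERSION_1_0` has the value 0x20010000 that the message gives for `JMM_VERSION_1`.

```cpp
#include <cassert>

// Values quoted in the thread; names mirror jmm.h but this is only a model.
constexpr int JMM_VERSION_1_0 = 0x20010000;  // value existing jdk8 clients request (assumed)
constexpr int JMM_VERSION_NEW = 0x20020000;  // hypothetical name for the proposed new JMM_VERSION

struct JmmInterface { int placeholder; };    // placeholder for the real function table
static JmmInterface jmm_interface{0};

// Gate as in the reviewed webrev: only the new version value is accepted.
void* get_jmm_interface_changed(int version) {
  if (version == JMM_VERSION_NEW) {
    return &jmm_interface;
  }
  return nullptr;
}

// Gate as the message recommends: keep accepting the jdk8 value.
void* get_jmm_interface_unchanged(int version) {
  if (version == JMM_VERSION_1_0) {
    return &jmm_interface;
  }
  return nullptr;
}
```

In this model, a client built against the old header passes `JMM_VERSION_1_0`. It receives a null pointer from the changed gate and a valid pointer from the unchanged one, which is the breakage the message describes.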
> > > > Thanks, > > Serguei > > > > > > On 8/21/20 10:54, Volker Simonis wrote: > >> On Thu, Aug 20, 2020 at 10:06 PM serguei.spitsyn at oracle.com > >> wrote: > >>> Hi Paul, > >>> > >>> I was also wondering if there is a compatibility risk involved with > >>> the JMM_VERSION change. > >>> So, thanks to Volker for asking these questions. > >>> > >>> One more question. > >>> I do not see a backport of the > >>> src/jdk.management/share/native/libmanagement_ext/management_ext.c > >>> change. > >>> Is it intentional, and if so, what is the reason to skip this file? > >>> > >> "libmanagement_ext/management_ext.c" doesn't exist in jdk8. It was > >> introduced with "8042901: Allow com.sun.management to be in a > >> different module to java.lang.management" in jdk9. In jdk8 all the > >> functionality is in "management/management.h" so there's no need to > >> backport the changes from "management_ext.c" . > >> > >> [1] https://bugs.openjdk.java.net/browse/JDK-8042901 > >> > >>> Thanks, > >>> Serguei > >>> > >>> > >>> On 8/20/20 11:30, Volker Simonis wrote: > >>> > >>> On Wed, Aug 19, 2020 at 6:17 PM Hohensee, Paul > >>> wrote: > >>> > >>> Please review this backport to jdk8u. I especially need a CSR > >>> review, since the CSR approval process can be a bottleneck. The > >>> patch significantly reduces fleet profiling overhead, and a version > >>> of it has been in production at Amazon for over 3 years. > >>> > >>> > >>> > >>> Original JBS issue: https://bugs.openjdk.java.net/browse/JDK-8185003 > >>> > >>> Original CSR: https://bugs.openjdk.java.net/browse/JDK-8185705 > >>> > >>> Original patch: > >>> http://hg.openjdk.java.net/jdk10/master/rev/68d46cb9be45 > >>> > >>> > >>> > >>> Backport JBS issue: https://bugs.openjdk.java.net/browse/JDK-8251494 > >>> > >>> Backport CSR: https://bugs.openjdk.java.net/browse/JDK-8251498 > >>> > >>> Backport JDK webrev: > >>> http://cr.openjdk.java.net/~phh/8185003/webrev.8u.jdk.05/ > >>> > >>> JDK part looks good to me. 
> >>> > >>> Backport Hotspot webrev: > >>> http://cr.openjdk.java.net/~phh/8185003/webrev.8u.hotspot.05/ > >>> > >>> HotSpot part looks good to me but see discussion below. > >>> > >>> > >>> Details of the interface changes needed for the backport are in the > >>> Description of the Backport CSR 8251498. The actual functional > >>> changes are minimal and low risk. > >>> > >>> I've also reviewed the CSR yesterday which I think is fine. But now, > >>> when looking at the implementation, I'm a little concerned about > >>> changing JMM_VERSION from " 0x20010203" to "0x20020000" in "jmm.h". > >>> > >>> This might be especially problematic in combination with the changes > >>> in "Management::get_jmm_interface()" which is called by > >>> JVM_GetManagement(): > >>> > >>> void* Management::get_jmm_interface(int version) { > >>> #if INCLUDE_MANAGEMENT > >>> - if (version == JMM_VERSION_1_0) { > >>> + if (version == JMM_VERSION) { > >>> return (void*) &jmm_interface; > >>> } > >>> #endif // INCLUDE_MANAGEMENT > >>> return NULL; > >>> } > >>> > >>> You've correctly fixed the single caller of "JVM_GetManagement()" in > >>> the JDK (in "JNI_OnLoad()" in "management.c"): > >>> > >>> - jmm_interface = (JmmInterface*) > >>> JVM_GetManagement(JMM_VERSION_1_0); > >>> + jmm_interface = (JmmInterface*) JVM_GetManagement(JMM_VERSION); > >>> > >>> but I wonder if there are other monitoring/serviceability tools out > >>> there which use this interface and which will break after this change. > >>> A quick search revealed at least two StackOverflow entries which > >>> recommend using "JVM_GetManagement(JMM_VERSION_1_0)" [1,2] and there's > >>> a talk and a blog entry doing the same [3, 4]. > >>> > >>> I'm not sure how relevant this is but I think a much safer and > >>> backwards-compatible way of doing this downport would be the > >>> following: > >>> > >>> - don't change "Management::get_jmm_interface()" (i.e. 
still check for > >>> "JMM_VERSION_1_0") but return the "new" JMM_VERSION in > >>> "jmm_GetVersion()". This won't break anything but will make it > >>> possible for clients to detect the new version if they want. > >>> > >>> - don't change the signature of "DumpThreads()". Instead add a new > >>> version (e.g. "DumpThreadsMaxDepth()/jmm_DumpThreadsMaxDepth()") to > >>> the "JMMInterface" struct and to "jmm_interface" in "management.cpp". > >>> You can do this in one of the two first, reserved fields of > >>> "JMMInterface" so you won't break binary compatibility. > >>> "jmm_DumpThreads()" will then be a simple wrapper which calls > >>> "jmm_DumpThreadsMaxDepth()" with Integer.MAX_VALUE as depth. > >>> > >>> - in the jdk you then simply call "DumpThreadsMaxDepth()" in > >>> "Java_sun_management_ThreadImpl_dumpThreads0()" > >>> > >>> I think this way we can maintain full binary compatibility while still > >>> using the new feature. What do you think? > >>> > >>> Best regards, > >>> Volker > >>> > >>> [1] > >>> https://urldefense.com/v3/__https://stackoverflow.com/questions/23632653/generate-java-heap-dump-on-uncaught-exception__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-KqVsyaF$ > >>> [2] > >>> https://urldefense.com/v3/__https://stackoverflow.com/questions/60887816/jvm-options-printnmtstatistics-save-info-to-file__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-Ip7MAQ5$ > >>> [3] > >>> https://urldefense.com/v3/__https://sudonull.com/post/25841-JVM-TI-how-to-make-a-plugin-for-a-virtual-machine-Odnoklassniki-company-blog__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-ErSjPdD$ > >>> [4] > >>> https://urldefense.com/v3/__https://2019.jpoint.ru/talks/2o8scc5jbaurnqqlsydzxv/__;!!GqivPVa7Brio!LDD5rfKbGz6KCl0LqcAgrFq7kNLkkoDhhN0ZSgHMDvgGMY5bvKJdpoXIAK6N-Oxb5CQ-$ > >>> > >>> Passes the included (suitably modified) test, as well as the tests in > >>> > >>> > >>> > >>> 
jdk/test/java/lang/management/ThreadMXBean
> >>>
> >>> jdk/test/com/sun/management/ThreadMXBean
> >>>
> >>>
> >>>
> >>> Thanks,
> >>>
> >>> Paul
> >>>
> >>>
> >
> >

From hohensee at amazon.com Wed Aug 26 19:19:20 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 26 Aug 2020 19:19:20 +0000
Subject: RFR/RFA (M): 8185003: JMX: Add a version of ThreadMXBean.dumpAllThreads with a maxDepth argument
Message-ID:

+Joe for an opinion.

I agree. I've added a comment to the CSR (https://bugs.openjdk.java.net/browse/JDK-8251498) and moved it back to Draft.

"Volker Simonis has pointed out in https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-August/012557.html that when we backport a JMM feature, we're actually updating the existing JMM version specification rather than transitioning to a new one. There was a bit of recognition of this in https://bugs.openjdk.java.net/browse/JDK-8249101, where the @since javadoc tag was updated to 11.0.9 rather than 14.

Imo, Volker makes a good argument for leaving the JMM version alone when doing JMM backports. If we adopt this approach, the JDK 11 JMM version should also be reverted."

Thanks,
Paul

From richard.reingruber at sap.com Wed Aug 26 20:33:05 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 26 Aug 2020 20:33:05 +0000
Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To:
References:
Message-ID:

Hi Yasumasa,

Could
you explain a little bit the motivation to replace these vm operations with handshakes? Would be good, if you could add the goals as well to the JBS item. Thanks, Richard. -----Original Message----- From: serviceability-dev On Behalf Of Yasumasa Suenaga Sent: Montag, 24. August 2020 04:40 To: serviceability-dev Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes Hi all, I want to hear your opinions about the change for JDK-8242427. I'm trying to migrate following operations to direct handshake. - VM_UpdateForPopTopFrame - VM_SetFramePop - VM_GetCurrentLocation Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ Thanks, Yasumasa From patricio.chilano.mateo at oracle.com Wed Aug 26 22:50:51 2020 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Wed, 26 Aug 2020 19:50:51 -0300 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> Message-ID: Hi Yasumasa, On 8/26/20 4:34 AM, Yasumasa Suenaga wrote: > Hi Patricio, David, > > Thanks for your comment! > > I updated webrev which includes the fix which is commented by > Patricio, and it passed submit repo. So I switch this mail thread to RFR. > > ? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 > ? 
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ The changes look good to me, thanks for fixing them. Patricio > I understand David said same concerns as Patricio about active > handshaker. This webrev checks active handshaker is current thread or > not. > > > Cheers, > > Yasumasa > > > On 2020/08/26 10:13, Patricio Chilano wrote: >> Hi Yasumasa, >> >> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> I want to hear your opinions about the change for JDK-8242427. >>> >>> I'm trying to migrate following operations to direct handshake. >>> >>> ??? - VM_UpdateForPopTopFrame >>> ??? - VM_SetFramePop >>> ??? - VM_GetCurrentLocation >>> >>> Some operations (VM_GetCurrentLocation and >>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want >>> to use JavaThread::active_handshaker() in production VM to detect >>> the process is in direct handshake or not. >>> >>> However this function is available in debug VM only, so I want to >>> hear the reason why it is for debug VM only, and there are no >>> problem to use it in production VM. Of course another solutions are >>> welcome. >> I added the _active_handshaker field to the HandshakeState class when >> working on 8230594 to adjust some asserts, where instead of checking >> for the VMThread we needed to check for the active handshaker of the >> target JavaThread. Since there were no other users of it, there was >> no point in declaring it and having to write to it for the release >> bits. There are no issues with having it in production though so you >> could change that if necessary. >> >>> webrev is here. It passed jtreg tests >>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >> Some comments on the proposed change. 
>>
>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp,
>> src/hotspot/share/prims/jvmtiEventController.cpp
>> Why is the check to decide whether to call the handshake or execute
>> the operation with the current thread different for
>> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure?
>>
>> (GetCurrentLocationClosure)
>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) {
>>     op.do_thread(_thread);
>> } else {
>>     Handshake::execute_direct(&op, _thread);
>> }
>>
>> vs
>>
>> (EnterInterpOnlyModeClosure)
>> if (target->active_handshaker() != NULL) {
>>     hs.do_thread(target);
>> } else {
>>     Handshake::execute_direct(&hs, target);
>> }
>>
>> If you change VM_SetFramePop to use handshakes then it seems you
>> could reach JvmtiEventControllerPrivate::enter_interp_only_mode()
>> with the current thread being the target.
>> Also I think you want the second expression of that check to be
>> (target->active_handshaker() == Thread::current()). So either you are
>> the target or the current active_handshaker for that target.
>> Otherwise active_handshaker() could be not NULL because there is
>> another JavaThread handshaking the same target. Unless you are
>> certain that it can never happen, so if active_handshaker() is not
>> NULL it is always the current thread, but even in that case this way
>> is safer.
>>
>> src/hotspot/share/prims/jvmtiThreadState.cpp
>> The guarantee() statement exists in release builds too so the "#ifdef
>> ASSERT" directive should be removed, otherwise "current" will not be
>> declared.
>>
>> Thanks!
>> >> Patricio >>> Thanks, >>> >>> Yasumasa >> From david.holmes at oracle.com Wed Aug 26 23:09:41 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 27 Aug 2020 09:09:41 +1000 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> Message-ID: <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> Hi Yasumasa, On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: > Hi Patricio, David, > > Thanks for your comment! > > I updated webrev which includes the fix which is commented by Patricio, > and it passed submit repo. So I switch this mail thread to RFR. > > ? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 > ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ > > I understand David said same concerns as Patricio about active > handshaker. This webrev checks active handshaker is current thread or not. How can the current thread already be in a handshake with the target when you execute this code? David ----- > > Cheers, > > Yasumasa > > > On 2020/08/26 10:13, Patricio Chilano wrote: >> Hi Yasumasa, >> >> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> I want to hear your opinions about the change for JDK-8242427. >>> >>> I'm trying to migrate following operations to direct handshake. >>> >>> ??? - VM_UpdateForPopTopFrame >>> ??? - VM_SetFramePop >>> ??? - VM_GetCurrentLocation >>> >>> Some operations (VM_GetCurrentLocation and >>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want >>> to use JavaThread::active_handshaker() in production VM to detect the >>> process is in direct handshake or not. >>> >>> However this function is available in debug VM only, so I want to >>> hear the reason why it is for debug VM only, and there are no problem >>> to use it in production VM. 
Of course another solutions are welcome. >> I added the _active_handshaker field to the HandshakeState class when >> working on 8230594 to adjust some asserts, where instead of checking >> for the VMThread we needed to check for the active handshaker of the >> target JavaThread. Since there were no other users of it, there was no >> point in declaring it and having to write to it for the release bits. >> There are no issues with having it in production though so you could >> change that if necessary. >> >>> webrev is here. It passed jtreg tests >>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >> Some comments on the proposed change. >> >> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, >> src/hotspot/share/prims/jvmtiEventController.cpp >> Why is the check to decide whether to call the handshake or execute >> the operation with the current thread different for >> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >> >> (GetCurrentLocationClosure) >> if ((Thread::current() == _thread) || (_thread->active_handshaker() != >> NULL)) { >> ????? op.do_thread(_thread); >> } else { >> ????? Handshake::execute_direct(&op, _thread); >> } >> >> vs >> >> (EnterInterpOnlyModeClosure) >> if (target->active_handshaker() != NULL) { >> ???? hs.do_thread(target); >> } else { >> ???? Handshake::execute_direct(&hs, target); >> } >> >> If you change VM_SetFramePop to use handshakes then it seems you could >> reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the >> current thread being the target. >> Also I think you want the second expression of that check to be >> (target->active_handshaker() == Thread::current()). So either you are >> the target or the current active_handshaker for that target. Otherwise >> active_handshaker() could be not NULL because there is another >> JavaThread handshaking the same target. 
Unless you are certain that it >> can never happen, so if active_handshaker() is not NULL it is always >> the current thread, but even in that case this way is safer. >> >> src/hotspot/share/prims/jvmtiThreadState.cpp >> The guarantee() statement exists in release builds too so the "#ifdef >> ASSERT" directive should be removed, otherwise "current" will not be >> declared. >> >> Thanks! >> >> Patricio >>> Thanks, >>> >>> Yasumasa >> From suenaga at oss.nttdata.com Wed Aug 26 23:29:37 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 27 Aug 2020 08:29:37 +0900 Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: References: Message-ID: Hi Richard, I've described the motivation on JDK-8201641 (it is a parent task of JDK-8242427) ``` Many JVMTI functions uses VM Operation to get information. However some of them need to stop only one thread - they don't need to stop all threads. ``` I aimed to improve JVMTI monitor operation with TLS at first, but I found other JVMTI operations can be improved with same process. So I've tried to fix them. I proposed it to serviceability-dev [1], then Dan told me similar enhancement is already filed to JBS [2]. So I created subtasks in it. Thanks, Yasumasa [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030890.html [2] https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030897.html On 2020/08/27 5:33, Reingruber, Richard wrote: > Hi Yasumasa, > > Could you explain a little bit the motivation to replace these vm operations with handshakes? > Would be good, if you could add the goals as well to the JBS item. > > Thanks, Richard. > > -----Original Message----- > From: serviceability-dev On Behalf Of Yasumasa Suenaga > Sent: Montag, 24. August 2020 04:40 > To: serviceability-dev > Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes > > Hi all, > > I want to hear your opinions about the change for JDK-8242427. 
> > I'm trying to migrate following operations to direct handshake. > > - VM_UpdateForPopTopFrame > - VM_SetFramePop > - VM_GetCurrentLocation > > Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. > > However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. > > webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) > http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ > > > Thanks, > > Yasumasa > From suenaga at oss.nttdata.com Wed Aug 26 23:40:12 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 27 Aug 2020 08:40:12 +0900 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> Message-ID: <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> Hi David, On 2020/08/27 8:09, David Holmes wrote: > Hi Yasumasa, > > On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >> Hi Patricio, David, >> >> Thanks for your comment! >> >> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >> >> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >> >> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. > > How can the current thread already be in a handshake with the target when you execute this code? 
EnterInterpOnlyModeClosure might be called in handshake with UpdateForPopTopFrameClosure or with SetFramePopClosure. EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an alternative in VM_EnterInterpOnlyMode. VM_EnterInterpOnlyMode returned true in allow_nested_vm_operations(). Originally, it could have been called from other VM operations. Thanks, Yasumasa > David > ----- > >> >> Cheers, >> >> Yasumasa >> >> >> On 2020/08/26 10:13, Patricio Chilano wrote: >>> Hi Yasumasa, >>> >>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> I want to hear your opinions about the change for JDK-8242427. >>>> >>>> I'm trying to migrate following operations to direct handshake. >>>> >>>> ??? - VM_UpdateForPopTopFrame >>>> ??? - VM_SetFramePop >>>> ??? - VM_GetCurrentLocation >>>> >>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>> >>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>> >>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>> Some comments on the proposed change. 
>>> >>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>> >>> (GetCurrentLocationClosure) >>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>> ????? op.do_thread(_thread); >>> } else { >>> ????? Handshake::execute_direct(&op, _thread); >>> } >>> >>> vs >>> >>> (EnterInterpOnlyModeClosure) >>> if (target->active_handshaker() != NULL) { >>> ???? hs.do_thread(target); >>> } else { >>> ???? Handshake::execute_direct(&hs, target); >>> } >>> >>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>> >>> src/hotspot/share/prims/jvmtiThreadState.cpp >>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>> >>> Thanks! 
>>>
>>> Patricio
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>

From suenaga at oss.nttdata.com Wed Aug 26 23:54:18 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 27 Aug 2020 08:54:18 +0900
Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To: <5871a9f0-9feb-25e6-ccd2-173dfd5e5c12@oracle.com>
References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <4d6243a0-0883-472b-94be-104a595cb4a8@oracle.com> <8643478b-5b2b-e575-52cf-e233c28c4644@oss.nttdata.com> <5871a9f0-9feb-25e6-ccd2-173dfd5e5c12@oracle.com>
Message-ID: <01de89cd-0e43-e19f-9c8d-51b042ab76aa@oss.nttdata.com>

Hi Robbin,

On 2020/08/27 0:08, Robbin Ehn wrote:
> Hi Yasumasa,
>
> Thanks for fixing, seems good.
> Note that there are jdk tests for jdi which also runs this code under:
> test/jdk/com/sun/jdi/

webrev.02 passed test/jdk/com/sun/jdi/ on my Linux x64.

Thanks,

Yasumasa

> /Robbin
>
> On 2020-08-26 16:33, Yasumasa Suenaga wrote:
>> Hi Robbin,
>>
>> I fixed them in new webrev. Could you review again?
>>
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.02/
>>    diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/2ef7feb5681f
>>
>> It passed vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti} jtreg tests, so I think JVMTI functions works fine includes clear_frame_pop().
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/08/26 21:32, Robbin Ehn wrote:
>>> Hi Yasumasa,
>>>
>>> Yes that should work.
>>>
>>> Can you please add assert where you removed the:
>>> -  MutexLocker mu(JvmtiThreadState_lock);
>>> E.g.
>>> +  // If we are in a handshake we only know that the requesting thread should have locked it.
>>> +  assert(SafepointSynchronize::is_at_safepoint() || JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked");
>>>
>>> Because I think you missing a MutexLocker in:
>>> jvmtiExport.cpp line ~1650:
>>>
>>>          // remove the frame's entry
>>>          ets->clear_frame_pop(cur_frame_number);
>>>
>>> In the method void JvmtiExport::post_method_exit(...).
>>>
>>> Thanks, Robbin
>>>
>>> On 2020-08-26 14:15, Yasumasa Suenaga wrote:
>>>> Hi Robbin,
>>>>
>>>> Thanks for your comment!
>>>>
>>>> How about this change?
>>>>
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.01/
>>>>    diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6
>>>>
>>>> I still use JvmtiThreadState_lock because it has a different locking range from SR lock.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/08/26 18:13, Robbin Ehn wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> You cannot take the MutexLocker mu(JvmtiThreadState_lock) with safepoint checks inside a handshake.
>>>>> We are missing a NoSafepointVerifier for handshakes.
>>>>> (I have added this in my work in progress asynchronous handshake patch)
>>>>>
>>>>> Also this can deadlock with the handshake semaphore.
>>>>> (In my asynch handshake patch I have change the sema to a mutex, thus lock ranking works.)
>>>>>
>>>>> I solved this by just taking the mutex before the handshake.
>>>>> And removed the internal locking from set_frame_pop, etc...
>>>>> If there is an issue holding the JvmtiThreadState_lock to long, it should split to a per thread lock instead.
>>>>> (Since often the thread is suppose to be suspended, one could consider using the SR lock for serializing access to the per thread JvmtiThreadState instead.)
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>> On 2020-08-26 09:34, Yasumasa Suenaga wrote:
>>>>>> Hi Patricio, David,
>>>>>>
>>>>>> Thanks for your comment!
>>>>>>
>>>>>> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR.
>>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8242427
>>>>>>
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>>>> >>>>>> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >>>>>> >>>>>> >>>>>> Cheers, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>>> >>>>>>>> I'm trying to migrate following operations to direct handshake. >>>>>>>> >>>>>>>> ??? - VM_UpdateForPopTopFrame >>>>>>>> ??? - VM_SetFramePop >>>>>>>> ??? - VM_GetCurrentLocation >>>>>>>> >>>>>>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>>>>>> >>>>>>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >>>>>>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>>>>>> >>>>>>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>>> Some comments on the proposed change. 
>>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>>> >>>>>>> (GetCurrentLocationClosure) >>>>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>>>>>> ????? op.do_thread(_thread); >>>>>>> } else { >>>>>>> ????? Handshake::execute_direct(&op, _thread); >>>>>>> } >>>>>>> >>>>>>> vs >>>>>>> >>>>>>> (EnterInterpOnlyModeClosure) >>>>>>> if (target->active_handshaker() != NULL) { >>>>>>> ???? hs.do_thread(target); >>>>>>> } else { >>>>>>> ???? Handshake::execute_direct(&hs, target); >>>>>>> } >>>>>>> >>>>>>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>>>>>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>>>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>>>>>> >>>>>>> Thanks! 
>>>>>>> >>>>>>> Patricio >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>> From suenaga at oss.nttdata.com Wed Aug 26 23:57:30 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 27 Aug 2020 08:57:30 +0900 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> Message-ID: <70dd740f-9b77-c1ac-0acb-ebdc178abdce@oss.nttdata.com> Hi Patricio, Thanks for your review, but webrev.00 has been rotten. Can you review webrev.02? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.02/ diff between webrev.00 and webrev.01: http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6 diff between webrev.01 and webrev.02: http://hg.openjdk.java.net/jdk/submit/rev/2ef7feb5681f Thanks, Yasumasa On 2020/08/27 7:50, Patricio Chilano wrote: > Hi Yasumasa, > > On 8/26/20 4:34 AM, Yasumasa Suenaga wrote: >> Hi Patricio, David, >> >> Thanks for your comment! >> >> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >> >> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ > The changes look good to me, thanks for fixing them. > > Patricio >> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >> >> >> Cheers, >> >> Yasumasa >> >> >> On 2020/08/26 10:13, Patricio Chilano wrote: >>> Hi Yasumasa, >>> >>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> I want to hear your opinions about the change for JDK-8242427. >>>> >>>> I'm trying to migrate following operations to direct handshake. >>>> >>>> ??? - VM_UpdateForPopTopFrame >>>> ??? - VM_SetFramePop >>>> ??? 
- VM_GetCurrentLocation >>>> >>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>> >>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>> >>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>> Some comments on the proposed change. >>> >>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>> >>> (GetCurrentLocationClosure) >>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>> ????? op.do_thread(_thread); >>> } else { >>> ????? Handshake::execute_direct(&op, _thread); >>> } >>> >>> vs >>> >>> (EnterInterpOnlyModeClosure) >>> if (target->active_handshaker() != NULL) { >>> ???? hs.do_thread(target); >>> } else { >>> ???? 
Handshake::execute_direct(&hs, target); >>> } >>> >>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>> >>> src/hotspot/share/prims/jvmtiThreadState.cpp >>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>> >>> Thanks! >>> >>> Patricio >>>> Thanks, >>>> >>>> Yasumasa >>> > From igor.ignatyev at oracle.com Wed Aug 26 23:59:22 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 26 Aug 2020 16:59:22 -0700 Subject: RFR(T) : 8252401 : Introduce Utils.TEST_NATIVE_PATH Message-ID: <8E0C4E48-B435-4734-A86B-2C6745104BF7@oracle.com> http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 > 6 lines changed: 5 ins; 0 del; 1 mod; Hi all, could you please review this trivial patch which adds j.t.l.Utils.TEST_NATIVE_PATH static field to store the value of test.nativepath system property? 
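For illustration only, a minimal Java sketch of what such a field might look like and how a test could use it. The class name NativePathSketch, the "." fallback and the nativeFile helper are assumptions made for this example; they are not taken from the webrev.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for the proposed field; not the actual Utils source.
final class NativePathSketch {
    // Assumption: fall back to the current directory when the
    // test.nativepath system property is not set.
    static final String TEST_NATIVE_PATH =
            System.getProperty("test.nativepath", ".");

    // Resolves a native library file name against a native directory.
    static Path nativeFile(String nativeDir, String fileName) {
        return Paths.get(nativeDir).resolve(fileName);
    }

    public static void main(String[] args) {
        // System.mapLibraryName turns "alloc001" into the platform file
        // name, for example liballoc001.so on Linux.
        String fileName = System.mapLibraryName("alloc001");
        System.out.println(nativeFile(TEST_NATIVE_PATH, fileName));
    }
}
```

Reading the property once into a static field means every test refers to the same constant instead of repeating the property name as a string literal.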
JBS: https://bugs.openjdk.java.net/browse/JDK-8252401
webrev: http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00

Thanks,
-- Igor

From igor.ignatyev at oracle.com Wed Aug 26 23:59:23 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 26 Aug 2020 16:59:23 -0700
Subject: RFR(S) : 8252402 : rewrite vmTestbase/nsk/jvmti/Allocate/alloc001 shell test to Java
Message-ID: <1B17F111-EA9D-4BB6-93DB-663A5328A40F@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00
> 287 lines changed: 60 ins; 200 del; 27 mod;

Hi all,

could you please review the patch which removes the shell script from the alloc001 test?

There are two small differences compared to the original test:
- if we don't get OutOfMemory on mac or windows, the test will be reported as skipped (as opposed to passed-passed before)
- as changing DYLD_LIBRARY_PATH on mac is a bit cumbersome due to SIP, I decided to use '-agentpath:' instead of '-agentlib:'

The patch also moves alloc001.java closer to the other files (vmTestbase/nsk/jvmti/Allocate/alloc001), removes the TestDescription.java file, moves the jtreg test description into the test source code, and removes the printdump agent option, making trace messages in alloc001.cpp unconditional.
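To illustrate the '-agentlib:' versus '-agentpath:' point above, here is a small Java sketch of how the two options differ when a test builds its command line. The class and method names are hypothetical, and taking the native directory from test.nativepath is an assumption; the test in the webrev may build the option differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only; not code from the webrev.
final class AgentOptionSketch {
    // -agentlib: takes a bare library name; the JVM then searches the
    // platform library path (for example DYLD_LIBRARY_PATH on macOS).
    static String agentLib(String libName) {
        return "-agentlib:" + libName;
    }

    // -agentpath: takes the full path to the library file, so no
    // library-path environment variable has to be changed.
    static String agentPath(String nativeDir, String libName) {
        Path lib = Paths.get(nativeDir)
                        .toAbsolutePath()
                        .normalize()
                        .resolve(System.mapLibraryName(libName));
        return "-agentpath:" + lib;
    }

    public static void main(String[] args) {
        // Assumption: the native directory comes from test.nativepath.
        String nativeDir = System.getProperty("test.nativepath", ".");
        System.out.println(agentLib("alloc001"));
        System.out.println(agentPath(nativeDir, "alloc001"));
    }
}
```

With the full path in the option itself, the child JVM does not depend on an inherited library-path variable, which is the part that SIP makes awkward on macOS.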
webrev: http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 (depends on 8252401[1,2]) JBS: https://bugs.openjdk.java.net/browse/JDK-8252402 testing: vmTestbase/nsk/jvmti/Allocate/alloc001 on {linux,windows,macos}-x64 [1] https://bugs.openjdk.java.net/browse/JDK-8252401 [2] http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 Thanks, -- Igor From david.holmes at oracle.com Thu Aug 27 04:49:04 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 27 Aug 2020 14:49:04 +1000 Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM In-Reply-To: References: <80FFB92E-28EC-45D1-9E1D-9F189550270A@oracle.com> Message-ID: <2ad18332-506a-e2c7-10a6-3ae2ab72d5ed@oracle.com> Hi Alex, On 21/08/2020 6:54 am, Alex Menkov wrote: > Hi Igor, > > On 08/20/2020 09:23, Igor Ignatyev wrote: >> HI Alex, >> >> one minor nit: according to usual java coding conventions, >> isJVMTIIncluded should be spelled as isJvmtiIncluded. otherwise the >> fix looks good to me. > > I tried to be consistent with other methods like > isCDSIncludedInVmBuild, isJFRIncludedInVmBuild, isGCSupported, > isGCSelected, etc. Yes - when a name includes an acronym the use of camel-case is a secondary consideration. > Maybe this should be isJVMTIIncludedInVmBuild.. Yes that seems better and I would also prefer to see it implemented in the same style as: WB_ENTRY(jboolean, WB_IsJFRIncludedInVmBuild(JNIEnv* env)) #if INCLUDE_JFR return true; #else return false; #endif // INCLUDE_JFR WB_END to avoid implicit booleans and avoid the runtime condition check. Thanks, David ----- >> >>> Other tests will be updated in the follow-ups. >> have you already identified all the tests which need this @requires? >> filed bugs/RFEs for them? > > Not yet. 
> I had problem with running all hotspot tests with minimal build (for > some reason jtreg was not able to complete it), so I decided start from > the tests mentioned in the jira issue and then test area-by-area, file > and fix the tests in batches. > > --alex > >> >> Cheers, >> -- Igor >> >> >>> On Aug 19, 2020, at 6:02 PM, Alex Menkov >>> wrote: >>> >>> Hi all, >>> >>> please review the fix for >>> https://bugs.openjdk.java.net/browse/JDK-8251384 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/ >>> >>> The fix introduces new @requires option "vm.jvmti": >>> test/lib/sun/hotspot/WhiteBox.java >>> test/jtreg-ext/requires/VMProps.java >>> src/hotspot/share/prims/whitebox.cpp >>> test/hotspot/jtreg/TEST.ROOT >>> >>> and updates tests in test/hotspot/jtreg/serviceability/jvmti (the >>> only change in all tests is added "@requires vm.jvmti") >>> Other tests will be updated in the follow-ups. >>> >>> The >> From igor.ignatyev at oracle.com Thu Aug 27 05:14:24 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 26 Aug 2020 22:14:24 -0700 Subject: RFR(S) : 8252403 : rewrite serviceability/7170638/SDTProbesGNULinuxTest.sh to java Message-ID: <6D1980F9-850E-476E-A53B-FC194DEDF9C2@oracle.com> http://cr.openjdk.java.net/~iignatyev//8252403/webrev.00 > 76 lines changed: 8 ins; 0 del; 68 mod; Hi all, could you please review the patch which rewrites serviceability/7170638/SDTProbesGNULinuxTest.sh to java? 
JBS: https://bugs.openjdk.java.net/browse/JDK-8252403 webrev: http://cr.openjdk.java.net/~iignatyev//8252403/webrev.00 testing: serviceability/7170638 on linux-x64 w/ and w/o dtrace feature Thanks, -- Igor From patricio.chilano.mateo at oracle.com Thu Aug 27 06:20:06 2020 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Thu, 27 Aug 2020 03:20:06 -0300 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <70dd740f-9b77-c1ac-0acb-ebdc178abdce@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <70dd740f-9b77-c1ac-0acb-ebdc178abdce@oss.nttdata.com> Message-ID: Hi Yasumasa, On 8/26/20 8:57 PM, Yasumasa Suenaga wrote: > Hi Patricio, > > Thanks for your review, but webrev.00 has been rotten. > Can you review webrev.02? > > ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.02/ > ??? diff between webrev.00 and webrev.01: > http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6 > ??? diff between webrev.01 and webrev.02: > http://hg.openjdk.java.net/jdk/submit/rev/2ef7feb5681f Looks good to me. Can JvmtiEventController::set_frame_pop(), JvmtiEventController::clear_frame_pop() and JvmtiEventController::clear_to_frame_pop() still be called at a safepoint? Thanks, Patricio > Thanks, > > Yasumasa > > > On 2020/08/27 7:50, Patricio Chilano wrote: >> Hi Yasumasa, >> >> On 8/26/20 4:34 AM, Yasumasa Suenaga wrote: >>> Hi Patricio, David, >>> >>> Thanks for your comment! >>> >>> I updated webrev which includes the fix which is commented by >>> Patricio, and it passed submit repo. So I switch this mail thread to >>> RFR. >>> >>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >> The changes look good to me, thanks for fixing them. >> >> Patricio >>> I understand David said same concerns as Patricio about active >>> handshaker. 
This webrev checks active handshaker is current thread >>> or not. >>> >>> >>> Cheers, >>> >>> Yasumasa >>> >>> >>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>> Hi Yasumasa, >>>> >>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>> Hi all, >>>>> >>>>> I want to hear your opinions about the change for JDK-8242427. >>>>> >>>>> I'm trying to migrate following operations to direct handshake. >>>>> >>>>> ??? - VM_UpdateForPopTopFrame >>>>> ??? - VM_SetFramePop >>>>> ??? - VM_GetCurrentLocation >>>>> >>>>> Some operations (VM_GetCurrentLocation and >>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I >>>>> want to use JavaThread::active_handshaker() in production VM to >>>>> detect the process is in direct handshake or not. >>>>> >>>>> However this function is available in debug VM only, so I want to >>>>> hear the reason why it is for debug VM only, and there are no >>>>> problem to use it in production VM. Of course another solutions >>>>> are welcome. >>>> I added the _active_handshaker field to the HandshakeState class >>>> when working on 8230594 to adjust some asserts, where instead of >>>> checking for the VMThread we needed to check for the active >>>> handshaker of the target JavaThread. Since there were no other >>>> users of it, there was no point in declaring it and having to write >>>> to it for the release bits. There are no issues with having it in >>>> production though so you could change that if necessary. >>>> >>>>> webrev is here. It passed jtreg tests >>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>> Some comments on the proposed change. >>>> >>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, >>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>> Why is the check to decide whether to call the handshake or execute >>>> the operation with the current thread different for >>>> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? 
>>>>
>>>> (GetCurrentLocationClosure)
>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) {
>>>>       op.do_thread(_thread);
>>>> } else {
>>>>       Handshake::execute_direct(&op, _thread);
>>>> }
>>>>
>>>> vs
>>>>
>>>> (EnterInterpOnlyModeClosure)
>>>> if (target->active_handshaker() != NULL) {
>>>>      hs.do_thread(target);
>>>> } else {
>>>>      Handshake::execute_direct(&hs, target);
>>>> }
>>>>
>>>> If you change VM_SetFramePop to use handshakes then it seems you
>>>> could reach JvmtiEventControllerPrivate::enter_interp_only_mode()
>>>> with the current thread being the target.
>>>> Also I think you want the second expression of that check to be
>>>> (target->active_handshaker() == Thread::current()). So either you
>>>> are the target or the current active_handshaker for that target.
>>>> Otherwise active_handshaker() could be not NULL because there is
>>>> another JavaThread handshaking the same target. Unless you are
>>>> certain that it can never happen, so if active_handshaker() is not
>>>> NULL it is always the current thread, but even in that case this
>>>> way is safer.
>>>>
>>>> src/hotspot/share/prims/jvmtiThreadState.cpp
>>>> The guarantee() statement exists in release builds too so the
>>>> "#ifdef ASSERT" directive should be removed, otherwise "current"
>>>> will not be declared.
>>>>
>>>> Thanks!
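The difference between the two checks quoted above can be modelled outside the VM. The following is only a sketch: `Thread`, `JavaThread`, `Dispatch`, `choose_non_null` and `choose_strict` are hypothetical stand-ins invented for illustration, not HotSpot code, and the `active_handshaker` field merely imitates what `JavaThread::active_handshaker()` is described as returning in this thread.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are NOT the HotSpot classes.
struct Thread {};

struct JavaThread : Thread {
    // Thread currently running a handshake operation on this target,
    // or nullptr when no handshake is in progress (assumed semantics).
    const Thread* active_handshaker = nullptr;
};

enum class Dispatch { RunDirectly, RequestHandshake };

// Weaker form: any in-progress handshake on the target is treated as
// "we are already inside the handshake", even if another thread owns it.
inline Dispatch choose_non_null(const Thread* current, const JavaThread* target) {
    assert(current != nullptr && target != nullptr);
    if (current == target || target->active_handshaker != nullptr) {
        return Dispatch::RunDirectly;
    }
    return Dispatch::RequestHandshake;
}

// Stricter form suggested in the review: run directly only when the current
// thread is the target itself or is the thread handshaking the target.
inline Dispatch choose_strict(const Thread* current, const JavaThread* target) {
    assert(current != nullptr && target != nullptr);
    if (current == target || target->active_handshaker == current) {
        return Dispatch::RunDirectly;
    }
    return Dispatch::RequestHandshake;
}
```

With a target whose `active_handshaker` is some third thread, `choose_non_null` lets an unrelated requester run the operation directly, while `choose_strict` makes it request its own handshake. That is the case the review comment calls out.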
>>>> >>>> Patricio >>>>> Thanks, >>>>> >>>>> Yasumasa >>>> >> From david.holmes at oracle.com Thu Aug 27 06:34:05 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 27 Aug 2020 16:34:05 +1000 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> Message-ID: Hi Yasumasa, On 27/08/2020 9:40 am, Yasumasa Suenaga wrote: > Hi David, > > On 2020/08/27 8:09, David Holmes wrote: >> Hi Yasumasa, >> >> On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >>> Hi Patricio, David, >>> >>> Thanks for your comment! >>> >>> I updated webrev which includes the fix which is commented by >>> Patricio, and it passed submit repo. So I switch this mail thread to >>> RFR. >>> >>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>> >>> I understand David said same concerns as Patricio about active >>> handshaker. This webrev checks active handshaker is current thread or >>> not. >> >> How can the current thread already be in a handshake with the target >> when you execute this code? > > EnterInterpOnlyModeClosure might be called in handshake with > UpdateForPopTopFrameClosure or with SetFramePopClosure. > > EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an > alternative in VM_EnterInterpOnlyMode. > VM_EnterInterpOnlyMode returned true in allow_nested_vm_operations(). > Originally, it could have been called from other VM operations. I see. It is a pity that we have now lost that critical indicator that shows how this operation can be nested within another operation. The possibility of nesting is even more obscure with JvmtiEnvThreadState::reset_current_location. 
And the fact it is now up to the caller to handle that case explicitly raises some concern - what will happen if you call execute_direct whilst already in a handshake with the target thread? I can't help but feel that we need a more rigorous and automated way of dealing with nesting ... perhaps we don't even need to care and handshakes should always allow nested handshake requests? (Question more for Robbin and Patricio.) Further comments: src/hotspot/share/prims/jvmtiEnvThreadState.cpp 194 #ifdef ASSERT 195 Thread *current = Thread::current(); 196 #endif 197 assert(get_thread() == current || current == get_thread()->active_handshaker(), 198 "frame pop data only accessible from same thread or direct handshake"); Can you factor this out into a separate function so that it is not repeated so often. Seems to me that there should be a global function on Thread: assert_current_thread_or_handshaker() [yes unpleasant name but ...] that will allow us to stop repeating this code fragment across numerous files. A follow up RFE for that would be okay too (I see some guarantees that should probably just be asserts so they need a bit more checking). 331 Handshake::execute_direct(&op, _thread); You aren't checking the return value of execute_direct, but I can't tell where _thread was checked for still being alive ?? --- src/hotspot/share/prims/jvmtiEventController.cpp 340 Handshake::execute_direct(&hs, target); I know this is existing code but I have the same query as above - no return value check and no clear check that the JavaThread is still alive? --- Do we know if the existing tests actually test the nested cases? Thanks, David ----- > > Thanks, > > Yasumasa > > >> David >> ----- >> >>> >>> Cheers, >>> >>> Yasumasa >>> >>> >>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>> Hi Yasumasa, >>>> >>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>> Hi all, >>>>> >>>>> I want to hear your opinions about the change for JDK-8242427. 
>>>>> >>>>> I'm trying to migrate following operations to direct handshake. >>>>> >>>>> ??? - VM_UpdateForPopTopFrame >>>>> ??? - VM_SetFramePop >>>>> ??? - VM_GetCurrentLocation >>>>> >>>>> Some operations (VM_GetCurrentLocation and >>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want >>>>> to use JavaThread::active_handshaker() in production VM to detect >>>>> the process is in direct handshake or not. >>>>> >>>>> However this function is available in debug VM only, so I want to >>>>> hear the reason why it is for debug VM only, and there are no >>>>> problem to use it in production VM. Of course another solutions are >>>>> welcome. >>>> I added the _active_handshaker field to the HandshakeState class >>>> when working on 8230594 to adjust some asserts, where instead of >>>> checking for the VMThread we needed to check for the active >>>> handshaker of the target JavaThread. Since there were no other users >>>> of it, there was no point in declaring it and having to write to it >>>> for the release bits. There are no issues with having it in >>>> production though so you could change that if necessary. >>>> >>>>> webrev is here. It passed jtreg tests >>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>> Some comments on the proposed change. >>>> >>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, >>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>> Why is the check to decide whether to call the handshake or execute >>>> the operation with the current thread different for >>>> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>> >>>> (GetCurrentLocationClosure) >>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() >>>> != NULL)) { >>>> ????? op.do_thread(_thread); >>>> } else { >>>> ????? 
Handshake::execute_direct(&op, _thread); >>>> } >>>> >>>> vs >>>> >>>> (EnterInterpOnlyModeClosure) >>>> if (target->active_handshaker() != NULL) { >>>> ???? hs.do_thread(target); >>>> } else { >>>> ???? Handshake::execute_direct(&hs, target); >>>> } >>>> >>>> If you change VM_SetFramePop to use handshakes then it seems you >>>> could reach JvmtiEventControllerPrivate::enter_interp_only_mode() >>>> with the current thread being the target. >>>> Also I think you want the second expression of that check to be >>>> (target->active_handshaker() == Thread::current()). So either you >>>> are the target or the current active_handshaker for that target. >>>> Otherwise active_handshaker() could be not NULL because there is >>>> another JavaThread handshaking the same target. Unless you are >>>> certain that it can never happen, so if active_handshaker() is not >>>> NULL it is always the current thread, but even in that case this way >>>> is safer. >>>> >>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>> The guarantee() statement exists in release builds too so the >>>> "#ifdef ASSERT" directive should be removed, otherwise "current" >>>> will not be declared. >>>> >>>> Thanks! >>>> >>>> Patricio >>>>> Thanks, >>>>> >>>>> Yasumasa >>>> From david.holmes at oracle.com Thu Aug 27 06:49:26 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 27 Aug 2020 16:49:26 +1000 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> Message-ID: <92f6ff81-2a9a-e2d9-6e07-f19c97b63abd@oracle.com> Sorry I just realized I reviewed version 00 :( I have concerns with the added locking: MutexLocker mu(JvmtiThreadState_lock); Who else may be holding that lock? 
Could it be our target thread that we have already initiated a handshake with? (The lock ranking checks related to safepoints don't help us detect deadlocks between a target thread and its handshaker. :( ) It is far from clear now which functions are reachable from handshakes, which from safepoint VM_ops and which from both. ! assert(SafepointSynchronize::is_at_safepoint() || JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); This can be written as: assert_locked_or_safepoint(JvmtiThreadState_lock); or possibly the weak variant of that. ('m puzzled by the extra check in the strong version ... I think it is intended for the case of the VMThread executing a non-safepoint VMop.) Thanks, David ----- On 27/08/2020 4:34 pm, David Holmes wrote: > Hi Yasumasa, > > On 27/08/2020 9:40 am, Yasumasa Suenaga wrote: >> Hi David, >> >> On 2020/08/27 8:09, David Holmes wrote: >>> Hi Yasumasa, >>> >>> On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >>>> Hi Patricio, David, >>>> >>>> Thanks for your comment! >>>> >>>> I updated webrev which includes the fix which is commented by >>>> Patricio, and it passed submit repo. So I switch this mail thread to >>>> RFR. >>>> >>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>> >>>> I understand David said same concerns as Patricio about active >>>> handshaker. This webrev checks active handshaker is current thread >>>> or not. >>> >>> How can the current thread already be in a handshake with the target >>> when you execute this code? >> >> EnterInterpOnlyModeClosure might be called in handshake with >> UpdateForPopTopFrameClosure or with SetFramePopClosure. >> >> EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an >> alternative in VM_EnterInterpOnlyMode. >> VM_EnterInterpOnlyMode returned true in allow_nested_vm_operations(). >> Originally, it could have been called from other VM operations. > > I see. 
It is a pity that we have now lost that critical indicator that > shows how this operation can be nested within another operation. The > possibility of nesting is even more obscure with > JvmtiEnvThreadState::reset_current_location. And the fact it is now up > to the caller to handle that case explicitly raises some concern - what > will happen if you call execute_direct whilst already in a handshake > with the target thread? > > I can't help but feel that we need a more rigorous and automated way of > dealing with nesting ... perhaps we don't even need to care and > handshakes should always allow nested handshake requests? (Question more > for Robbin and Patricio.) > > Further comments: > > src/hotspot/share/prims/jvmtiEnvThreadState.cpp > > ?194 #ifdef ASSERT > ?195?? Thread *current = Thread::current(); > ?196 #endif > ?197?? assert(get_thread() == current || current == > get_thread()->active_handshaker(), > ?198????????? "frame pop data only accessible from same thread or > direct handshake"); > > Can you factor this out into a separate function so that it is not > repeated so often. Seems to me that there should be a global function on > Thread: assert_current_thread_or_handshaker()? [yes unpleasant name but > ...] that will allow us to stop repeating this code fragment across > numerous files. A follow up RFE for that would be okay too (I see some > guarantees that should probably just be asserts so they need a bit more > checking). > > ?331???????? Handshake::execute_direct(&op, _thread); > > You aren't checking the return value of execute_direct, but I can't tell > where _thread was checked for still being alive ?? > > --- > > src/hotspot/share/prims/jvmtiEventController.cpp > > ?340???? Handshake::execute_direct(&hs, target); > > I know this is existing code but I have the same query as above - no > return value check and no clear check that the JavaThread is still alive? > > --- > > Do we know if the existing tests actually test the nested cases? 
> > Thanks, > David > ----- > >> >> Thanks, >> >> Yasumasa >> >> >>> David >>> ----- >>> >>>> >>>> Cheers, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>> Hi Yasumasa, >>>>> >>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>> Hi all, >>>>>> >>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>> >>>>>> I'm trying to migrate following operations to direct handshake. >>>>>> >>>>>> ??? - VM_UpdateForPopTopFrame >>>>>> ??? - VM_SetFramePop >>>>>> ??? - VM_GetCurrentLocation >>>>>> >>>>>> Some operations (VM_GetCurrentLocation and >>>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I >>>>>> want to use JavaThread::active_handshaker() in production VM to >>>>>> detect the process is in direct handshake or not. >>>>>> >>>>>> However this function is available in debug VM only, so I want to >>>>>> hear the reason why it is for debug VM only, and there are no >>>>>> problem to use it in production VM. Of course another solutions >>>>>> are welcome. >>>>> I added the _active_handshaker field to the HandshakeState class >>>>> when working on 8230594 to adjust some asserts, where instead of >>>>> checking for the VMThread we needed to check for the active >>>>> handshaker of the target JavaThread. Since there were no other >>>>> users of it, there was no point in declaring it and having to write >>>>> to it for the release bits. There are no issues with having it in >>>>> production though so you could change that if necessary. >>>>> >>>>>> webrev is here. It passed jtreg tests >>>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>> Some comments on the proposed change. 
>>>>>
>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp,
>>>>> src/hotspot/share/prims/jvmtiEventController.cpp
>>>>> Why is the check to decide whether to call the handshake or execute
>>>>> the operation with the current thread different for
>>>>> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure?
>>>>>
>>>>> (GetCurrentLocationClosure)
>>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) {
>>>>>       op.do_thread(_thread);
>>>>> } else {
>>>>>       Handshake::execute_direct(&op, _thread);
>>>>> }
>>>>>
>>>>> vs
>>>>>
>>>>> (EnterInterpOnlyModeClosure)
>>>>> if (target->active_handshaker() != NULL) {
>>>>>      hs.do_thread(target);
>>>>> } else {
>>>>>      Handshake::execute_direct(&hs, target);
>>>>> }
>>>>>
>>>>> If you change VM_SetFramePop to use handshakes then it seems you
>>>>> could reach JvmtiEventControllerPrivate::enter_interp_only_mode()
>>>>> with the current thread being the target.
>>>>> Also I think you want the second expression of that check to be
>>>>> (target->active_handshaker() == Thread::current()). So either you
>>>>> are the target or the current active_handshaker for that target.
>>>>> Otherwise active_handshaker() could be not NULL because there is
>>>>> another JavaThread handshaking the same target. Unless you are
>>>>> certain that it can never happen, so if active_handshaker() is not
>>>>> NULL it is always the current thread, but even in that case this
>>>>> way is safer.
>>>>>
>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp
>>>>> The guarantee() statement exists in release builds too so the
>>>>> "#ifdef ASSERT" directive should be removed, otherwise "current"
>>>>> will not be declared.
>>>>>
>>>>> Thanks!
>>>>> >>>>> Patricio >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>> From richard.reingruber at sap.com Thu Aug 27 07:43:30 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Thu, 27 Aug 2020 07:43:30 +0000 Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: References: Message-ID: Hi Yasumasa, > I've described the motivation on JDK-8201641 (it is a parent task of JDK-8242427) > ``` > Many JVMTI functions uses VM Operation to get information. However some of them need to stop only one thread - they don't need to stop all threads. > ``` So the goal is better performance. For PopFrame IMHO it is not worth the effort, the future effort in maintaining the related code, and the risk. I think so because PopFrame is a hardly ever used. I honestly never used it (have you?). In IDEs it is well hidden. Graal does not even bother to support it. On the other side the change affects other operations that are commonly used. In the rare cases when a PopFrame is requested it will be in interactive sessions: someone found the well-hidden PopFrame button in the debugger and pressed it. Probably she won't do it again. At least not at a high frequency. So she will not notice the effect of the optimization. If you have a large cloud of JVMs where every second a PopFrame is executed, even then I would doubt that the resource savings are measurable. And I would also doubt that a cloud with PopFrames at that rate exists. I see there are rare events like full GCs that can do harm. But in the case of PopFrame I can't see a problem because the pause for the vm operation will be extremely short. Is there a scenario or a not too artificial benchmark that would show an improvement? Thanks, Richard. -----Original Message----- From: Yasumasa Suenaga Sent: Donnerstag, 27. 
August 2020 01:30 To: Reingruber, Richard ; serviceability-dev Subject: Re: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes Hi Richard, I've described the motivation on JDK-8201641 (it is a parent task of JDK-8242427) ``` Many JVMTI functions uses VM Operation to get information. However some of them need to stop only one thread - they don't need to stop all threads. ``` I aimed to improve JVMTI monitor operation with TLS at first, but I found other JVMTI operations can be improved with same process. So I've tried to fix them. I proposed it to serviceability-dev [1], then Dan told me similar enhancement is already filed to JBS [2]. So I created subtasks in it. Thanks, Yasumasa [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030890.html [2] https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030897.html On 2020/08/27 5:33, Reingruber, Richard wrote: > Hi Yasumasa, > > Could you explain a little bit the motivation to replace these vm operations with handshakes? > Would be good, if you could add the goals as well to the JBS item. > > Thanks, Richard. > > -----Original Message----- > From: serviceability-dev On Behalf Of Yasumasa Suenaga > Sent: Montag, 24. August 2020 04:40 > To: serviceability-dev > Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes > > Hi all, > > I want to hear your opinions about the change for JDK-8242427. > > I'm trying to migrate following operations to direct handshake. > > - VM_UpdateForPopTopFrame > - VM_SetFramePop > - VM_GetCurrentLocation > > Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. > > However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. 
Of course another solutions are welcome. > > webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) > http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ > > > Thanks, > > Yasumasa > From dms at samersoff.net Thu Aug 27 15:03:47 2020 From: dms at samersoff.net (Dmitry Samersoff) Date: Thu, 27 Aug 2020 18:03:47 +0300 Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux In-Reply-To: References: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> Message-ID: Hello Everybody, http://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.02/ Webrev is updated, all comments accepted and addressed. Except: > The error code from the inet_pton is not checked. inet_pton performs conversion of the constant value in our case and the only possible reason for it to fail is that the system doesn't support IPv6 at all. -Dmitry On 18.08.2020 3:20, serguei.spitsyn at oracle.com wrote: > Hi Dmitry, > > I agree with Alex, it is better to rename compareIPv6Addr to > isEqualIPv6Addr. > > 705 static int compareIPv6Addr(struct addrinfo *ai, struct in6_addr in6Addr) > 706 { > 707 > 708 if (ai->ai_addr->sa_family == AF_INET6) { > 709 const struct sockaddr_in6 sa = *((struct sockaddr_in6*) ai->ai_addr); > 710 return (memcmp(&sa.sin6_addr, &in6Addr, sizeof(in6Addr)) == 0); > 711 } > 712 > 713 return 0; > 714 } > > I think, the lines 707 and 712 are not needed. > > 725 struct in6_addr mappedAny; ... 761 inet_pton(AF_INET6, > "::ffff:0.0.0.0", &mappedAny); > > The error code from the inet_pton is not checked. > Also, it can be useful to pre-initialize the mappedAny. > > 737 // Try to find bind address of preferred address familty first A dot > at the end of comment is missed. > > 745 if (listenAddr == NULL) { > 746 // No address of preferred addres family found, grab the first one > 747 listenAddr = &(addrInfo[0]); > 748 } > > The indent has to be 4, not 3. 
> The () brackets are not actually needed but I do not object if you keep > them. > A dot at the end of comment is missed. > > 755 // Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 > connections, > 756 // but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow us to > serve IPv4 > 757 // connections only. So make sure, that IN6ADDR_ANY is preferred over > 758 // mapped INADDR_ANY if preferredAddressFamily is AF_INET6 or not set > > I'd suggest to replace "allow us" => "allows" in two places: > ? "IN6ADDR_ANY allow us" and "(::ffff:0.0.0.0) allow us". > Also, it'd better to replace: "So make sure,that" => "Make sure that". > A dot at the end of comment is missed. > > I don't know the network protocols well enough to comment on > > > Thanks, > Serguei > > > On 8/17/20 00:21, Dmitry Samersoff wrote: >> Hello Everybody, >> >> Please review the fix: >> >> https://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.01/ >> >> Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 >> connections, but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow >> us to serve IPv4 connections only. >> >> So make sure, that IN6ADDR_ANY is preferred over mapped INADDR_ANY if >> preferredAddressFamily is not AF_INET >> >> >> -Dmitry\S >> > From serguei.spitsyn at oracle.com Thu Aug 27 18:05:01 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 27 Aug 2020 11:05:01 -0700 Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux In-Reply-To: References: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> Message-ID: <87b7ffaf-2a05-0a76-6933-6d0ee3927f67@oracle.com> Hi Dmitry, Thank you for the update. The error code from the inet_pton() still is not checked. Thanks, Serguei On 8/27/20 08:03, Dmitry Samersoff wrote: > Hello Everybody, > > http://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.02/ > > Webrev is updated, all comments accepted and addressed. > > Except: > > The error code from the inet_pton is not checked. 
>
> inet_pton performs conversion of the constant value in our case and
> the only possible reason for it to fail is that the system doesn't
> support IPv6 at all.
>
> -Dmitry
>
>
> On 18.08.2020 3:20, serguei.spitsyn at oracle.com wrote:
>> Hi Dmitry,
>>
>> I agree with Alex, it is better to rename compareIPv6Addr to
>> isEqualIPv6Addr.
>>
>> 705 static int compareIPv6Addr(struct addrinfo *ai, struct in6_addr in6Addr)
>> 706 {
>> 707
>> 708     if (ai->ai_addr->sa_family == AF_INET6) {
>> 709         const struct sockaddr_in6 sa = *((struct sockaddr_in6*) ai->ai_addr);
>> 710         return (memcmp(&sa.sin6_addr, &in6Addr, sizeof(in6Addr)) == 0);
>> 711     }
>> 712
>> 713     return 0;
>> 714 }
>>
>> I think, the lines 707 and 712 are not needed.
>>
>> 725 struct in6_addr mappedAny;
>> ...
>> 761 inet_pton(AF_INET6, "::ffff:0.0.0.0", &mappedAny);
>>
>> The error code from the inet_pton is not checked.
>> Also, it can be useful to pre-initialize the mappedAny.
>>
>> 737 // Try to find bind address of preferred address familty first
>> A dot at the end of comment is missed.
>>
>> 745 if (listenAddr == NULL) {
>> 746    // No address of preferred addres family found, grab the first one
>> 747    listenAddr = &(addrInfo[0]);
>> 748 }
>>
>> The indent has to be 4, not 3.
>> The () brackets are not actually needed but I do not object if you
>> keep them.
>> A dot at the end of comment is missed.
>>
>> 755 // Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 connections,
>> 756 // but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow us to serve IPv4
>> 757 // connections only. So make sure, that IN6ADDR_ANY is preferred over
>> 758 // mapped INADDR_ANY if preferredAddressFamily is AF_INET6 or not set
>>
>> I'd suggest to replace "allow us" => "allows" in two places:
>>    "IN6ADDR_ANY allow us" and "(::ffff:0.0.0.0) allow us".
>> Also, it'd better to replace: "So make sure,that" => "Make sure that".
>> A dot at the end of comment is missed.
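The two review points quoted above (checking the result of inet_pton and pre-initializing mappedAny) can be sketched as follows. This is not the code from the webrev: `parseMappedAny` is a hypothetical helper invented for illustration, and `isEqualIPv6Addr` only borrows the name proposed in the review. The sketch assumes a POSIX system; inet_pton returns 1 on success, 0 when the string is not a valid address for the family, and -1 when the address family is not supported.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cassert>
#include <cstring>

// Hypothetical helper: parse the IPv4-mapped "any" address and report failure
// to the caller instead of silently leaving *out in an unknown state.
static bool parseMappedAny(struct in6_addr* out) {
    assert(out != nullptr);
    std::memset(out, 0, sizeof(*out));  // pre-initialize, as suggested in the review
    // Only a return value of 1 means the conversion succeeded.
    return inet_pton(AF_INET6, "::ffff:0.0.0.0", out) == 1;
}

// Returns true only if ai carries an IPv6 address equal to *in6Addr.
static bool isEqualIPv6Addr(const struct addrinfo* ai, const struct in6_addr* in6Addr) {
    if (ai == nullptr || ai->ai_addr == nullptr || ai->ai_addr->sa_family != AF_INET6) {
        return false;
    }
    const struct sockaddr_in6* sa =
        reinterpret_cast<const struct sockaddr_in6*>(ai->ai_addr);
    return std::memcmp(&sa->sin6_addr, in6Addr, sizeof(*in6Addr)) == 0;
}
```

A caller could fall back to skipping the mapped-address comparison when `parseMappedAny` returns false. Whether that fallback is appropriate for the transport is a design decision the thread leaves open.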
>> >> I don't know the network protocols well enough to comment on >> >> >> Thanks, >> Serguei >> >> >> On 8/17/20 00:21, Dmitry Samersoff wrote: >>> Hello Everybody, >>> >>> Please review the fix: >>> >>> https://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.01/ >>> >>> Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 >>> connections, but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow >>> us to serve IPv4 connections only. >>> >>> So make sure, that IN6ADDR_ANY is preferred over mapped INADDR_ANY >>> if preferredAddressFamily is not AF_INET >>> >>> >>> -Dmitry\S >>> >> > From alexey.menkov at oracle.com Thu Aug 27 20:29:48 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 27 Aug 2020 13:29:48 -0700 Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM In-Reply-To: <2ad18332-506a-e2c7-10a6-3ae2ab72d5ed@oracle.com> References: <80FFB92E-28EC-45D1-9E1D-9F189550270A@oracle.com> <2ad18332-506a-e2c7-10a6-3ae2ab72d5ed@oracle.com> Message-ID: <06c92d9c-7369-9969-56e0-c5ff67b50a68@oracle.com> Hi David, On 08/26/2020 21:49, David Holmes wrote: > Hi Alex, > > On 21/08/2020 6:54 am, Alex Menkov wrote: >> Hi Igor, >> >> On 08/20/2020 09:23, Igor Ignatyev wrote: >>> HI Alex, >>> >>> one minor nit: according to usual java coding conventions, >>> isJVMTIIncluded should be spelled as isJvmtiIncluded. otherwise the >>> fix looks good to me. >> >> I tried to be consistent with other methods like >> isCDSIncludedInVmBuild, isJFRIncludedInVmBuild, isGCSupported, >> isGCSelected, etc. > > Yes - when a name includes an acronym the use of camel-case is a > secondary consideration. > >> Maybe this should be isJVMTIIncludedInVmBuild.. > > Yes that seems better and I would also prefer to see it implemented in > the same style as: Sorry, the fix was pushed several days ago. Do you want to change the name to isJVMTIIncludedInVmBuild by follow-up? 
> > WB_ENTRY(jboolean, WB_IsJFRIncludedInVmBuild(JNIEnv* env)) > #if INCLUDE_JFR >   return true; > #else >   return false; > #endif // INCLUDE_JFR > WB_END > > to avoid implicit booleans and avoid the runtime condition check. I don't think true/false are correct here. This is jboolean, not bool. INCLUDE_JVMTI ? JNI_TRUE : JNI_FALSE #if INCLUDE_JVMTI return JNI_TRUE; #else return JNI_FALSE; #endif looks better imo. --alex > > Thanks, > David > ----- > >>> >>>> Other tests will be updated in the follow-ups. >>> have you already identified all the tests which need this @requires? >>> filed bugs/RFEs for them? >> >> Not yet. >> I had problem with running all hotspot tests with minimal build (for >> some reason jtreg was not able to complete it), so I decided start >> from the tests mentioned in the jira issue and then test area-by-area, >> file and fix the tests in batches. >> >> --alex >> >>> >>> Cheers, >>> -- Igor >>> >>> >>>> On Aug 19, 2020, at 6:02 PM, Alex Menkov >>>> wrote: >>>> >>>> Hi all, >>>> >>>> please review the fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8251384 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/ >>>> >>>> The fix introduces new @requires option "vm.jvmti": >>>> test/lib/sun/hotspot/WhiteBox.java >>>> test/jtreg-ext/requires/VMProps.java >>>> src/hotspot/share/prims/whitebox.cpp >>>> test/hotspot/jtreg/TEST.ROOT >>>> >>>> and updates tests in test/hotspot/jtreg/serviceability/jvmti (the >>>> only change in all tests is added "@requires vm.jvmti") >>>> Other tests will be updated in the follow-ups.
>>>> >>>> The >>> From richard.reingruber at sap.com Thu Aug 27 20:32:36 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Thu, 27 Aug 2020 20:32:36 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi Goetz, > I read through your change again. It looks good to me now. > The new naming and additional comments make it > easier to read I think, thank you. Thanks for all your input! > One small thing: > deoptimization.cpp, l. 1503 > You don't really need the brackets. Two lines below you don't use them either. > (No webrev needed) Thanks for providing the correct line off list. Fixed! I prepared a new webrev, because I had to rebase after JDK-8249293 [1] and because I wanted to make use of JDK-8251384 [2] Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/ The delta looks bigger than it is. Most of it is re-indentation of VM_GetOrSetLocal::deoptimize_objects(). You can see this if you look at http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html which does not include the whitespace change. Hope you are still ok with webrev.8. The changes are marginal. I've commented each below. Thanks, Richard. --- Details below --- src/hotspot/share/prims/jvmtiImpl.cpp @@ -425,11 +425,11 @@ , _depth(depth) , _index(index) , _type(type) , _jvf(NULL) , _set(false) - , _eb(NULL, NULL, false) // no references escape + , _eb(NULL, NULL, type == T_OBJECT) , _result(JVMTI_ERROR_NONE) Currently 'type' is never equal to T_OBJECT at this location, still I think it is better to check. The compiler will replace the compare with false. 
@@ -630,11 +630,11 @@ } // Revert optimizations based on escape analysis if this is an access to a local object bool VM_GetOrSetLocal::deoptimize_objects(javaVFrame* jvf) { #if COMPILER2_OR_JVMCI - if (NOT_JVMCI(DoEscapeAnalysis &&) _type == T_OBJECT) { + assert(_type == T_OBJECT, "EscapeBarrier should not be active if _type != T_OBJECT"); I removed the if from VM_GetOrSetLocal::deoptimize_objects(), because now it only gets called if the VM_GetOrSetLocal instance has an active EscapeBarrier which will be the case iff the local type is T_OBJECT and if either C2 escape analysis is enabled or Graal is used. src/hotspot/share/runtime/deoptimization.cpp You suggested to remove the braces. Done. src/hotspot/share/runtime/deoptimization.hpp Must provide definition of EscapeBarrier::barrier_active() for new call site in VM_GetOrSetLocal::doit_prologue() if building with COMPILER2_OR_JVMCI not defined. test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java Make use of [2] and pass test with minimal vm. [1] https://bugs.openjdk.java.net/browse/JDK-8249293 [2] https://bugs.openjdk.java.net/browse/JDK-8251384 -----Original Message----- From: Lindenmaier, Goetz Sent: Samstag, 22. August 2020 07:46 To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Richard, I read through your change again. It looks good to me now. The new naming and additional comments make it easier to read I think, thank you. One small thing: deoptimization.cpp, l. 1503 You don't really need the brackets. Two lines below you don't use them either. (No webrev needed) Best regards, Goetz. 
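The VM_GetOrSetLocal changes described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the HotSpot code: `BasicType`, `EscapeBarrier` and `LocalAccessOp` are cut-down, hypothetical stand-ins, the escape-analysis/Graal condition is folded into the `COMPILER2_OR_JVMCI` macro, and the `deoptimized()` accessor exists only so the behaviour can be observed.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the HotSpot definitions.
enum BasicType { T_INT, T_OBJECT };

#ifndef COMPILER2_OR_JVMCI
#define COMPILER2_OR_JVMCI 1
#endif

class EscapeBarrier {
  bool _barrier_active;
 public:
  explicit EscapeBarrier(bool barrier_active) : _barrier_active(barrier_active) {}
#if COMPILER2_OR_JVMCI
  bool barrier_active() const { return _barrier_active; }
#else
  // A definition is still needed so that callers compile in builds
  // without C2/JVMCI; the barrier is simply never active there.
  bool barrier_active() const { return false; }
#endif
};

// Simplified model of an operation that reads or writes a local variable.
class LocalAccessOp {
  BasicType _type;
  EscapeBarrier _eb;
  bool _deoptimized;
 public:
  explicit LocalAccessOp(BasicType type)
      : _type(type),
        _eb(type == T_OBJECT),  // barrier is only active for object locals
        _deoptimized(false) {}

  // Only reached when the barrier is active, hence an assert instead of an if.
  bool deoptimize_objects() {
    assert(_type == T_OBJECT && "EscapeBarrier should not be active if _type != T_OBJECT");
    _deoptimized = true;
    return true;
  }

  bool doit_prologue() {
    if (_eb.barrier_active()) {
      return deoptimize_objects();
    }
    return true;
  }

  bool deoptimized() const { return _deoptimized; }
};
```

In this model an operation on a non-object local never reaches `deoptimize_objects()`, which is why the former `if` on `_type` can become an assertion.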
From suenaga at oss.nttdata.com Fri Aug 28 01:18:18 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Fri, 28 Aug 2020 10:18:18 +0900 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <70dd740f-9b77-c1ac-0acb-ebdc178abdce@oss.nttdata.com> Message-ID: <0707776c-928d-5fde-353d-8bf3d4d419b9@oss.nttdata.com> Hi Patricio, On 2020/08/27 15:20, Patricio Chilano wrote: > Hi Yasumasa, > > On 8/26/20 8:57 PM, Yasumasa Suenaga wrote: >> Hi Patricio, >> >> Thanks for your review, but webrev.00 has been rotten. >> Can you review webrev.02? >> >>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.02/ >>     diff between webrev.00 and webrev.01: http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6 >>     diff between webrev.01 and webrev.02: http://hg.openjdk.java.net/jdk/submit/rev/2ef7feb5681f > Looks good to me. Can JvmtiEventController::set_frame_pop(), JvmtiEventController::clear_frame_pop() and JvmtiEventController::clear_to_frame_pop() still be called at a safepoint? No, and also I checked them with assert(JvmtiThreadState_lock->is_locked()). webrev is here: webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 Thanks, Yasumasa > Thanks, > Patricio >> Thanks, >> >> Yasumasa >> >> >> On 2020/08/27 7:50, Patricio Chilano wrote: >>> Hi Yasumasa, >>> >>> On 8/26/20 4:34 AM, Yasumasa Suenaga wrote: >>>> Hi Patricio, David, >>>> >>>> Thanks for your comment! >>>> >>>> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >>>> >>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>> The changes look good to me, thanks for fixing them. >>> >>> Patricio >>>> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >>>> >>>> >>>> Cheers, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>> Hi Yasumasa, >>>>> >>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>> Hi all, >>>>>> >>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>> >>>>>> I'm trying to migrate following operations to direct handshake. >>>>>> >>>>>>     - VM_UpdateForPopTopFrame >>>>>>     - VM_SetFramePop >>>>>>     - VM_GetCurrentLocation >>>>>> >>>>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>>>> >>>>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >>>>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>>>> >>>>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>> Some comments on the proposed change.
>>>>> >>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>>>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>> >>>>> (GetCurrentLocationClosure) >>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>>>>       op.do_thread(_thread); >>>>> } else { >>>>>       Handshake::execute_direct(&op, _thread); >>>>> } >>>>> >>>>> vs >>>>> >>>>> (EnterInterpOnlyModeClosure) >>>>> if (target->active_handshaker() != NULL) { >>>>>      hs.do_thread(target); >>>>> } else { >>>>>      Handshake::execute_direct(&hs, target); >>>>> } >>>>> >>>>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>>>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>>>> >>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>>>> >>>>> Thanks!
>>>>> >>>>> Patricio >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>> >>> > From suenaga at oss.nttdata.com Fri Aug 28 01:24:04 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Fri, 28 Aug 2020 10:24:04 +0900 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <92f6ff81-2a9a-e2d9-6e07-f19c97b63abd@oracle.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> <92f6ff81-2a9a-e2d9-6e07-f19c97b63abd@oracle.com> Message-ID: <781a164d-21e1-039a-afa9-77be25ba3809@oss.nttdata.com> Hi David, On 2020/08/27 15:49, David Holmes wrote: > Sorry I just realized I reviewed version 00 :( > > I have concerns with the added locking: > > MutexLocker mu(JvmtiThreadState_lock); > > Who else may be holding that lock? Could it be our target thread that we have already initiated a handshake with? (The lock ranking checks related to safepoints don't help us detect deadlocks between a target thread and its handshaker. :( ) I checked the source code again and could not find any point where the target thread has already locked JvmtiThreadState_lock at a direct handshake. > It is far from clear now which functions are reachable from handshakes, which from safepoint VM_ops and which from both. > > !   assert(SafepointSynchronize::is_at_safepoint() || JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); > > This can be written as: > > assert_locked_or_safepoint(JvmtiThreadState_lock); > > or possibly the weak variant of that. (I'm puzzled by the extra check in the strong version ... I think it is intended for the case of the VMThread executing a non-safepoint VMop.)
> JvmtiEventController::set_frame_pop(), JvmtiEventController::clear_frame_pop() and JvmtiEventController::clear_to_frame_pop() are no longer called at safepoint, so I remove safepoint check from assert() in new webrev. webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 Thanks, Yasumasa > Thanks, > David > ----- > > > On 27/08/2020 4:34 pm, David Holmes wrote: >> Hi Yasumasa, >> >> On 27/08/2020 9:40 am, Yasumasa Suenaga wrote: >>> Hi David, >>> >>> On 2020/08/27 8:09, David Holmes wrote: >>>> Hi Yasumasa, >>>> >>>> On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >>>>> Hi Patricio, David, >>>>> >>>>> Thanks for your comment! >>>>> >>>>> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >>>>> >>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>>> >>>>> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >>>> >>>> How can the current thread already be in a handshake with the target when you execute this code? >>> >>> EnterInterpOnlyModeClosure might be called in handshake with UpdateForPopTopFrameClosure or with SetFramePopClosure. >>> >>> EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an alternative in VM_EnterInterpOnlyMode. >>> VM_EnterInterpOnlyMode returned true in allow_nested_vm_operations(). Originally, it could have been called from other VM operations. >> >> I see. It is a pity that we have now lost that critical indicator that shows how this operation can be nested within another operation. The possibility of nesting is even more obscure with JvmtiEnvThreadState::reset_current_location.
And the fact it is now up to the caller to handle that case explicitly raises some concern - what will happen if you call execute_direct whilst already in a handshake with the target thread? >> >> I can't help but feel that we need a more rigorous and automated way of dealing with nesting ... perhaps we don't even need to care and handshakes should always allow nested handshake requests? (Question more for Robbin and Patricio.) >> >> Further comments: >> >> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >> >>   194 #ifdef ASSERT >>   195   Thread *current = Thread::current(); >>   196 #endif >>   197   assert(get_thread() == current || current == get_thread()->active_handshaker(), >>   198          "frame pop data only accessible from same thread or direct handshake"); >> >> Can you factor this out into a separate function so that it is not repeated so often. Seems to me that there should be a global function on Thread: assert_current_thread_or_handshaker()? [yes unpleasant name but ...] that will allow us to stop repeating this code fragment across numerous files. A follow up RFE for that would be okay too (I see some guarantees that should probably just be asserts so they need a bit more checking). >> >>   331         Handshake::execute_direct(&op, _thread); >> >> You aren't checking the return value of execute_direct, but I can't tell where _thread was checked for still being alive ?? >> >> --- >> >> src/hotspot/share/prims/jvmtiEventController.cpp >> >>   340     Handshake::execute_direct(&hs, target); >> >> I know this is existing code but I have the same query as above - no return value check and no clear check that the JavaThread is still alive? >> >> --- >> >> Do we know if the existing tests actually test the nested cases?
>> >> Thanks, >> David >> ----- >> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> David >>>> ----- >>>> >>>>> >>>>> Cheers, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>> >>>>>>> I'm trying to migrate following operations to direct handshake. >>>>>>> >>>>>>>     - VM_UpdateForPopTopFrame >>>>>>>     - VM_SetFramePop >>>>>>>     - VM_GetCurrentLocation >>>>>>> >>>>>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>>>>> >>>>>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >>>>>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>>>>> >>>>>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>> Some comments on the proposed change.
>>>>>> >>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>>>>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>> >>>>>> (GetCurrentLocationClosure) >>>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>>>>>       op.do_thread(_thread); >>>>>> } else { >>>>>>       Handshake::execute_direct(&op, _thread); >>>>>> } >>>>>> >>>>>> vs >>>>>> >>>>>> (EnterInterpOnlyModeClosure) >>>>>> if (target->active_handshaker() != NULL) { >>>>>>      hs.do_thread(target); >>>>>> } else { >>>>>>      Handshake::execute_direct(&hs, target); >>>>>> } >>>>>> >>>>>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>>>>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>>>>> >>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>>>>> >>>>>> Thanks!
>>>>>> >>>>>> Patricio >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>> From david.holmes at oracle.com Fri Aug 28 01:30:08 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 28 Aug 2020 11:30:08 +1000 Subject: RFR: JDK-8251384: [TESTBUG] jvmti tests should not be executed with minimal VM In-Reply-To: <06c92d9c-7369-9969-56e0-c5ff67b50a68@oracle.com> References: <80FFB92E-28EC-45D1-9E1D-9F189550270A@oracle.com> <2ad18332-506a-e2c7-10a6-3ae2ab72d5ed@oracle.com> <06c92d9c-7369-9969-56e0-c5ff67b50a68@oracle.com> Message-ID: <5a7bdc82-4a18-0a14-dbfe-c802c30e4815@oracle.com> Hi Alex, On 28/08/2020 6:29 am, Alex Menkov wrote: > Hi David, > > On 08/26/2020 21:49, David Holmes wrote: >> Hi Alex, >> >> On 21/08/2020 6:54 am, Alex Menkov wrote: >>> Hi Igor, >>> >>> On 08/20/2020 09:23, Igor Ignatyev wrote: >>>> HI Alex, >>>> >>>> one minor nit: according to usual java coding conventions, >>>> isJVMTIIncluded should be spelled as isJvmtiIncluded. otherwise the >>>> fix looks good to me. >>> >>> I tried to be consistent with other methods like >>> isCDSIncludedInVmBuild, isJFRIncludedInVmBuild, isGCSupported, >>> isGCSelected, etc. >> >> Yes - when a name includes an acronym the use of camel-case is a >> secondary consideration. >> >>> Maybe this should be isJVMTIIncludedInVmBuild.. >> >> Yes that seems better and I would also prefer to see it implemented in >> the same style as: > > Sorry, the fix was pushed several days ago. Sorry I was away and didn't notice the date. The review still seemed open as you seemed to be posing a query to Igor on the naming. > Do you want to change the name to isJVMTIIncludedInVmBuild by follow-up? I think we should go for consistency in naming for these type of feature tests, and also consistency in the implementation - as I said this doesn't need a runtime check at all (though I wonder if the C++ compiler is smart enough to elide it?). 
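For comparison, a minimal sketch of the two styles under discussion. It assumes stand-in definitions of `jboolean`, `JNI_TRUE`, `JNI_FALSE` and `INCLUDE_JVMTI` rather than the real ones from jni.h and the HotSpot build, and the function names are hypothetical, not the actual WhiteBox entry points:

```cpp
#include <cassert>

// Stand-ins for illustration only; the real definitions come from jni.h
// and from the HotSpot build configuration.
typedef unsigned char jboolean;
#define JNI_FALSE 0
#define JNI_TRUE 1
#ifndef INCLUDE_JVMTI
#define INCLUDE_JVMTI 1
#endif

// Style 1: a conditional expression on the feature macro. Syntactically
// this is a runtime check, although the condition is a constant.
static jboolean is_jvmti_included_ternary() {
  return INCLUDE_JVMTI ? JNI_TRUE : JNI_FALSE;
}

// Style 2: select the return statement in the preprocessor, so only one
// branch is ever compiled.
static jboolean is_jvmti_included_preprocessor() {
#if INCLUDE_JVMTI
  return JNI_TRUE;
#else
  return JNI_FALSE;
#endif // INCLUDE_JVMTI
}

// Both styles must agree for whichever value INCLUDE_JVMTI has.
static void check_styles_agree() {
  assert(is_jvmti_included_ternary() == is_jvmti_included_preprocessor());
}
```

Because the condition in the first style is a compile-time constant, an optimizing compiler would normally be expected to fold it away, but the second style does not depend on that.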
That said I would have preferred the existing checks to not have "InVmBuild" in the name as it seems redundant/unnecessary to me. But an RFE to get consistency here would be a good thing IMO. Thanks, David ----- >> >> WB_ENTRY(jboolean, WB_IsJFRIncludedInVmBuild(JNIEnv* env)) >> #if INCLUDE_JFR >>    return true; >> #else >>    return false; >> #endif // INCLUDE_JFR >> WB_END >> >> to avoid implicit booleans and avoid the runtime condition check. > > I don't think true/false are correct here. This is jboolean, not bool. > > INCLUDE_JVMTI ? JNI_TRUE : JNI_FALSE >  #if INCLUDE_JVMTI >     return JNI_TRUE; >  #else >     return JNI_FALSE; >  #endif > looks better imo. > > --alex > >> >> Thanks, >> David >> ----- >> >>>> >>>>> Other tests will be updated in the follow-ups. >>>> have you already identified all the tests which need this @requires? >>>> filed bugs/RFEs for them? >>> >>> Not yet. >>> I had problem with running all hotspot tests with minimal build (for >>> some reason jtreg was not able to complete it), so I decided start >>> from the tests mentioned in the jira issue and then test >>> area-by-area, file and fix the tests in batches. >>> >>> --alex >>> >>>> >>>> Cheers, >>>> -- Igor >>>> >>>> >>>>> On Aug 19, 2020, at 6:02 PM, Alex Menkov >>>>> wrote: >>>>> >>>>> Hi all, >>>>> >>>>> please review the fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8251384 >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/jdk16/minimal_jvmti/webrev/ >>>>> >>>>> The fix introduces new @requires option "vm.jvmti": >>>>> test/lib/sun/hotspot/WhiteBox.java >>>>> test/jtreg-ext/requires/VMProps.java >>>>> src/hotspot/share/prims/whitebox.cpp >>>>> test/hotspot/jtreg/TEST.ROOT >>>>> >>>>> and updates tests in test/hotspot/jtreg/serviceability/jvmti (the >>>>> only change in all tests is added "@requires vm.jvmti") >>>>> Other tests will be updated in the follow-ups.
>>>>> >>>>> The >>>> From suenaga at oss.nttdata.com Fri Aug 28 01:45:30 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Fri, 28 Aug 2020 10:45:30 +0900 Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: References: Message-ID: <99babb38-c4c0-c664-5a35-af51c2333e90@oss.nttdata.com> Hi Richard, Unfortunately I do not have any benchmark for this change, however I think it is worth to do it for consistency. All of VM operations which do not need global lock in JVMTI are replaced to direct handshake if this enhancement is merged. I think VM operations should be replaced to direct handshake if we can. VM operations should be just used for operations which needs global lock. It will help all of programmers who are interested in HotSpot when they try to know the operation. Thanks, Yasumasa On 2020/08/27 16:43, Reingruber, Richard wrote: > Hi Yasumasa, > >> I've described the motivation on JDK-8201641 (it is a parent task of JDK-8242427) > >> ``` >> Many JVMTI functions uses VM Operation to get information. However some of them need to stop only one thread - they don't need to stop all threads. >> ``` > > So the goal is better performance. For PopFrame IMHO it is not worth the effort, > the future effort in maintaining the related code, and the risk. > > I think so because PopFrame is a hardly ever used. I honestly never used it > (have you?). In IDEs it is well hidden. Graal does not even bother to support > it. On the other side the change affects other operations that are commonly > used. > > In the rare cases when a PopFrame is requested it will be in interactive > sessions: someone found the well-hidden PopFrame button in the debugger and > pressed it. Probably she won't do it again. At least not at a high frequency. So > she will not notice the effect of the optimization. 
> > If you have a large cloud of JVMs where every second a PopFrame is executed, > even then I would doubt that the resource savings are measurable. And I would > also doubt that a cloud with PopFrames at that rate exists. > > I see there are rare events like full GCs that can do harm. But in the case of > PopFrame I can't see a problem because the pause for the vm operation will be > extremely short. > > Is there a scenario or a not too artificial benchmark that would show an > improvement? > > Thanks, > Richard. > > -----Original Message----- > From: Yasumasa Suenaga > Sent: Donnerstag, 27. August 2020 01:30 > To: Reingruber, Richard ; serviceability-dev > Subject: Re: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes > > Hi Richard, > > I've described the motivation on JDK-8201641 (it is a parent task of JDK-8242427) > > ``` > Many JVMTI functions uses VM Operation to get information. However some of them need to stop only one thread - they don't need to stop all threads. > ``` > > I aimed to improve JVMTI monitor operation with TLS at first, but I found other JVMTI operations can be improved with same process. So I've tried to fix them. > > I proposed it to serviceability-dev [1], then Dan told me similar enhancement is already filed to JBS [2]. So I created subtasks in it. > > > Thanks, > > Yasumasa > > > [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030890.html > [2] https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030897.html > > > On 2020/08/27 5:33, Reingruber, Richard wrote: >> Hi Yasumasa, >> >> Could you explain a little bit the motivation to replace these vm operations with handshakes? >> Would be good, if you could add the goals as well to the JBS item. >> >> Thanks, Richard. >> >> -----Original Message----- >> From: serviceability-dev On Behalf Of Yasumasa Suenaga >> Sent: Montag, 24. 
August 2020 04:40 >> To: serviceability-dev >> Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes >> >> Hi all, >> >> I want to hear your opinions about the change for JDK-8242427. >> >> I'm trying to migrate following operations to direct handshake. >> >> - VM_UpdateForPopTopFrame >> - VM_SetFramePop >> - VM_GetCurrentLocation >> >> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >> >> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >> >> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >> >> >> Thanks, >> >> Yasumasa >> From david.holmes at oracle.com Fri Aug 28 01:53:07 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 28 Aug 2020 11:53:07 +1000 Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <99babb38-c4c0-c664-5a35-af51c2333e90@oss.nttdata.com> References: <99babb38-c4c0-c664-5a35-af51c2333e90@oss.nttdata.com> Message-ID: <47808938-a5d8-723a-bcb0-7c31d072af5c@oracle.com> On 28/08/2020 11:45 am, Yasumasa Suenaga wrote: > Hi Richard, > > Unfortunately I do not have any benchmark for this change, however I > think it is worth to do it for consistency. All of VM operations which > do not need global lock in JVMTI are replaced to direct handshake if > this enhancement is merged. > > I think VM operations should be replaced to direct handshake if we can. > VM operations should be just used for operations which needs global > lock. It will help all of programmers who are interested in HotSpot when > they try to know the operation. 
I agree with this motivation - we want to eradicate as many safepoint VM operations as possible, even if the usage would not really benefit from the lack of stop-the-world pauses. That said, of course this has to be tempered against the complexity of the change. But we are establishing a pattern for coding up JVMTI operations as direct handshakes, which should make things generally easier to understand. Cheers, David > > Thanks, > > Yasumasa > > > On 2020/08/27 16:43, Reingruber, Richard wrote: >> Hi Yasumasa, >> >>> I've described the motivation on JDK-8201641 (it is a parent task of >>> JDK-8242427) >> >>> ``` >>> Many JVMTI functions uses VM Operation to get information. However >>> some of them need to stop only one thread - they don't need to stop >>> all threads. >>> ``` >> >> So the goal is better performance. For PopFrame IMHO it is not worth >> the effort, >> the future effort in maintaining the related code, and the risk. >> >> I think so because PopFrame is a hardly ever used. I honestly never >> used it >> (have you?). In IDEs it is well hidden. Graal does not even bother to >> support >> it. On the other side the change affects other operations that are >> commonly >> used. >> >> In the rare cases when a PopFrame is requested it will be in interactive >> sessions: someone found the well-hidden PopFrame button in the >> debugger and >> pressed it. Probably she won't do it again. At least not at a high >> frequency. So >> she will not notice the effect of the optimization. >> >> If you have a large cloud of JVMs where every second a PopFrame is >> executed, >> even then I would doubt that the resource savings are measurable. And >> I would >> also doubt that a cloud with PopFrames at that rate exists. >> >> I see there are rare events like full GCs that can do harm. But in the >> case of >> PopFrame I can't see a problem because the pause for the vm operation >> will be >> extremely short.
>> >> Is there a scenario or a not too artificial benchmark that would show an >> improvement? >> >> Thanks, >> Richard. >> >> -----Original Message----- >> From: Yasumasa Suenaga >> Sent: Donnerstag, 27. August 2020 01:30 >> To: Reingruber, Richard ; >> serviceability-dev >> Subject: Re: 8242427: JVMTI frame pop operations should use >> Thread-Local Handshakes >> >> Hi Richard, >> >> I've described the motivation on JDK-8201641 (it is a parent task of >> JDK-8242427) >> >> ``` >> Many JVMTI functions uses VM Operation to get information. However >> some of them need to stop only one thread - they don't need to stop >> all threads. >> ``` >> >> I aimed to improve JVMTI monitor operation with TLS at first, but I >> found other JVMTI operations can be improved with same process. So >> I've tried to fix them. >> >> I proposed it to serviceability-dev [1], then Dan told me similar >> enhancement is already filed to JBS [2]. So I created subtasks in it. >> >> >> Thanks, >> >> Yasumasa >> >> >> [1] >> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030890.html >> >> [2] >> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030897.html >> >> >> >> On 2020/08/27 5:33, Reingruber, Richard wrote: >>> Hi Yasumasa, >>> >>> Could you explain a little bit the motivation to replace these vm >>> operations with handshakes? >>> Would be good, if you could add the goals as well to the JBS item. >>> >>> Thanks, Richard. >>> >>> -----Original Message----- >>> From: serviceability-dev >>> On Behalf Of Yasumasa Suenaga >>> Sent: Montag, 24. August 2020 04:40 >>> To: serviceability-dev >>> Subject: 8242427: JVMTI frame pop operations should use Thread-Local >>> Handshakes >>> >>> Hi all, >>> >>> I want to hear your opinions about the change for JDK-8242427. >>> >>> I'm trying to migrate following operations to direct handshake. >>> >>>        - VM_UpdateForPopTopFrame >>>        - VM_SetFramePop >>>
- VM_GetCurrentLocation >>> >>> Some operations (VM_GetCurrentLocation and >>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want >>> to use JavaThread::active_handshaker() in production VM to detect the >>> process is in direct handshake or not. >>> >>> However this function is available in debug VM only, so I want to >>> hear the reason why it is for debug VM only, and there are no problem >>> to use it in production VM. Of course another solutions are welcome. >>> >>> webrev is here. It passed jtreg tests >>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>> >>> >>> Thanks, >>> >>> Yasumasa >>> From david.holmes at oracle.com Fri Aug 28 02:04:04 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 28 Aug 2020 12:04:04 +1000 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <781a164d-21e1-039a-afa9-77be25ba3809@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> <92f6ff81-2a9a-e2d9-6e07-f19c97b63abd@oracle.com> <781a164d-21e1-039a-afa9-77be25ba3809@oss.nttdata.com> Message-ID: <6dd8b890-5e1f-2e6d-826c-96d439eb65c4@oracle.com> Hi Yasumasa, On 28/08/2020 11:24 am, Yasumasa Suenaga wrote: > Hi David, > > On 2020/08/27 15:49, David Holmes wrote: >> Sorry I just realized I reviewed version 00 :( Note that my comments on version 00 in my earlier email still apply. >> >> I have concerns with the added locking: >> >> MutexLocker mu(JvmtiThreadState_lock); >> >> Who else may be holding that lock? Could it be our target thread that >> we have already initiated a handshake with? (The lock ranking checks >> related to safepoints don't help us detect deadlocks between a target >> thread and its handshaker. 
:( ) > > I checked source code again, then I couldn't find the point that target > thread already locked JvmtiThreadState_lock at direct handshake. I'm very unclear exactly what state this lock guards and under what conditions. But looking at: src/hotspot/share/prims/jvmtiEnv.cpp Surely the lock is only needed in the direct-handshake case and not when operating on the current thread? Or is it there because you've removed the locking from the lower-level JvmtiEventController methods and so now you need to take the lock higher-up the call chain? (I find it hard to follow the call chains in the JVMTI code.) > >> It is far from clear now which functions are reachable from >> handshakes, which from safepoint VM_ops and which from both. >> >> ! assert(SafepointSynchronize::is_at_safepoint() || >> JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); >> >> This can be written as: >> >> assert_locked_or_safepoint(JvmtiThreadState_lock); >> >> or possibly the weak variant of that. (I'm puzzled by the extra check >> in the strong version ... I think it is intended for the case of the >> VMThread executing a non-safepoint VMop.) > >> JvmtiEventController::set_frame_pop(), >> JvmtiEventController::clear_frame_pop() and >> JvmtiEventController::clear_to_frame_pop() are no longer called at >> safepoint, so I remove safepoint check from assert() in new webrev. You should use assert_lock_strong for this. Thanks, David > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ > diff from previous webrev: > http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 > > > Thanks, > > Yasumasa > > >> Thanks, >> David >> ----- >> >> >> On 27/08/2020 4:34 pm, David Holmes wrote: >>> Hi Yasumasa, >>> >>> On 27/08/2020 9:40 am, Yasumasa Suenaga wrote: >>>> Hi David, >>>> >>>> On 2020/08/27 8:09, David Holmes wrote: >>>>> Hi Yasumasa, >>>>> >>>>> On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >>>>>> Hi Patricio, David, >>>>>> >>>>>> Thanks for your comment! 
>>>>>> >>>>>> I updated webrev which includes the fix which is commented by >>>>>> Patricio, and it passed submit repo. So I switch this mail thread >>>>>> to RFR. >>>>>> >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>>>> >>>>>> I understand David said same concerns as Patricio about active >>>>>> handshaker. This webrev checks active handshaker is current thread >>>>>> or not. >>>>> >>>>> How can the current thread already be in a handshake with the >>>>> target when you execute this code? >>>> >>>> EnterInterpOnlyModeClosure might be called in handshake with >>>> UpdateForPopTopFrameClosure or with SetFramePopClosure. >>>> >>>> EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an >>>> alternative in VM_EnterInterpOnlyMode. >>>> VM_EnterInterpOnlyMode returned true in >>>> allow_nested_vm_operations(). Originally, it could have been called >>>> from other VM operations. >>> >>> I see. It is a pity that we have now lost that critical indicator >>> that shows how this operation can be nested within another operation. >>> The possibility of nesting is even more obscure with >>> JvmtiEnvThreadState::reset_current_location. And the fact it is now >>> up to the caller to handle that case explicitly raises some concern - >>> what will happen if you call execute_direct whilst already in a >>> handshake with the target thread? >>> >>> I can't help but feel that we need a more rigorous and automated way >>> of dealing with nesting ... perhaps we don't even need to care and >>> handshakes should always allow nested handshake requests? (Question >>> more for Robbin and Patricio.) >>> >>> Further comments: >>> >>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >>> >>> 194 #ifdef ASSERT >>> 195 Thread *current = Thread::current(); >>> 196 #endif >>> 197 
assert(get_thread() == current || current == >>> get_thread()->active_handshaker(), >>> 198 "frame pop data only accessible from same thread or >>> direct handshake"); >>> >>> Can you factor this out into a separate function so that it is not >>> repeated so often. Seems to me that there should be a global function >>> on Thread: assert_current_thread_or_handshaker() [yes unpleasant >>> name but ...] that will allow us to stop repeating this code fragment >>> across numerous files. A follow up RFE for that would be okay too (I >>> see some guarantees that should probably just be asserts so they need >>> a bit more checking). >>> >>> 331 Handshake::execute_direct(&op, _thread); >>> >>> You aren't checking the return value of execute_direct, but I can't >>> tell where _thread was checked for still being alive ?? >>> >>> --- >>> >>> src/hotspot/share/prims/jvmtiEventController.cpp >>> >>> 340 Handshake::execute_direct(&hs, target); >>> >>> I know this is existing code but I have the same query as above - no >>> return value check and no clear check that the JavaThread is still >>> alive? >>> >>> --- >>> >>> Do we know if the existing tests actually test the nested cases? >>> >>> Thanks, >>> David >>> ----- >>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>> David >>>>> ----- >>>>> >>>>>> >>>>>> Cheers, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>>> >>>>>>>> I'm trying to migrate following operations to direct handshake. >>>>>>>> >>>>>>>> - VM_UpdateForPopTopFrame >>>>>>>> - VM_SetFramePop >>>>>>>> 
- VM_GetCurrentLocation >>>>>>>> >>>>>>>> Some operations (VM_GetCurrentLocation and >>>>>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I >>>>>>>> want to use JavaThread::active_handshaker() in production VM to >>>>>>>> detect the process is in direct handshake or not. >>>>>>>> >>>>>>>> However this function is available in debug VM only, so I want >>>>>>>> to hear the reason why it is for debug VM only, and there are no >>>>>>>> problem to use it in production VM. Of course another solutions >>>>>>>> are welcome. >>>>>>> I added the _active_handshaker field to the HandshakeState class >>>>>>> when working on 8230594 to adjust some asserts, where instead of >>>>>>> checking for the VMThread we needed to check for the active >>>>>>> handshaker of the target JavaThread. Since there were no other >>>>>>> users of it, there was no point in declaring it and having to >>>>>>> write to it for the release bits. There are no issues with having >>>>>>> it in production though so you could change that if necessary. >>>>>>> >>>>>>>> webrev is here. It passed jtreg tests >>>>>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>>> Some comments on the proposed change. >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, >>>>>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>> Why is the check to decide whether to call the handshake or >>>>>>> execute the operation with the current thread different for >>>>>>> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>>> >>>>>>> (GetCurrentLocationClosure) >>>>>>> if ((Thread::current() == _thread) || >>>>>>> (_thread->active_handshaker() != NULL)) { >>>>>>> op.do_thread(_thread); >>>>>>> } else { >>>>>>> Handshake::execute_direct(&op, _thread); >>>>>>> } >>>>>>> >>>>>>> vs >>>>>>> >>>>>>> (EnterInterpOnlyModeClosure) >>>>>>> if (target->active_handshaker() != NULL) { >>>>>>> 
hs.do_thread(target); >>>>>>> } else { >>>>>>> Handshake::execute_direct(&hs, target); >>>>>>> } >>>>>>> >>>>>>> If you change VM_SetFramePop to use handshakes then it seems you >>>>>>> could reach JvmtiEventControllerPrivate::enter_interp_only_mode() >>>>>>> with the current thread being the target. >>>>>>> Also I think you want the second expression of that check to be >>>>>>> (target->active_handshaker() == Thread::current()). So either you >>>>>>> are the target or the current active_handshaker for that target. >>>>>>> Otherwise active_handshaker() could be not NULL because there is >>>>>>> another JavaThread handshaking the same target. Unless you are >>>>>>> certain that it can never happen, so if active_handshaker() is >>>>>>> not NULL it is always the current thread, but even in that case >>>>>>> this way is safer. >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>>>> The guarantee() statement exists in release builds too so the >>>>>>> "#ifdef ASSERT" directive should be removed, otherwise "current" >>>>>>> will not be declared. >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> Patricio >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>> From igor.ignatyev at oracle.com Fri Aug 28 02:39:38 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 27 Aug 2020 19:39:38 -0700 Subject: RFR(S) : 8252477 : nsk/share/ArgumentParser should expect that jtreg "splits" an argument Message-ID: <26E3A312-A45E-489F-A5B1-F1E67CBE807A@oracle.com> http://cr.openjdk.java.net/~iignatyev//8252477/webrev.00/ > 99 lines changed: 19 ins; 20 del; 60 mod; Hi all, could you please review the patch which unblocks the rest of 8219140's (get rid of vmTestbase/PropertyResolvingWrapper) sub-tasks? background from JBS: > jtreg splits command line by space to get the list of arguments and there is no way to prevent that (nor thru escaping, nor by adding quotes). 
currently, PropertyResolvingWrapper handles that and joins multiple arguments within double quotes into one argument before passing it to the actual test class. the only place where it's needed is in the tests which use nsk/share/ArgumentParser (or more precisely nsk.share.jpda.DebugeeArgumentHandler and nsk/share/jdb/JdbArgumentHandler). > > in preparation for PropertyResolvingWrapper removal, ArgumentParser should be updated to handle the "split" argument on its own. I've also taken the liberty to slightly clean up ArgumentParser. JBS: https://bugs.openjdk.java.net/browse/JDK-8252477 webrev: http://cr.openjdk.java.net/~iignatyev//8252477/webrev.00/ testing: all the tests which use ArgumentParser (:vmTestbase_nsk_aod :vmTestbase_nsk_jdb :vmTestbase_nsk_jdi :vmTestbase_nsk_jdw ,:vmTestbase_nsk_jvmti :vmTestbase_vm_compiler :vmTestbase_vm_mlvm) on {windows,linux,macos}-x64 Thanks, -- Igor From suenaga at oss.nttdata.com Fri Aug 28 03:01:11 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Fri, 28 Aug 2020 12:01:11 +0900 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <6dd8b890-5e1f-2e6d-826c-96d439eb65c4@oracle.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> <92f6ff81-2a9a-e2d9-6e07-f19c97b63abd@oracle.com> <781a164d-21e1-039a-afa9-77be25ba3809@oss.nttdata.com> <6dd8b890-5e1f-2e6d-826c-96d439eb65c4@oracle.com> Message-ID: <3aeaff81-f6a3-e4c0-b65c-835785b7ebe2@oss.nttdata.com> Hi David, On 2020/08/28 11:04, David Holmes wrote: > Hi Yasumasa, > > On 28/08/2020 11:24 am, Yasumasa Suenaga wrote: >> Hi David, >> >> On 2020/08/27 15:49, David Holmes wrote: >>> Sorry I just realized I reviewed version 00 :( > > Note that my comments on version 00 in my earlier email still apply. I copied here your comment on webrev.00: >>>> I see. 
It is a pity that we have now lost that critical indicator that shows how this operation can be nested within another operation. The possibility of nesting is even more obscure with JvmtiEnvThreadState::reset_current_location. And the fact it is now up to the caller to handle that case explicitly raises some concern - what will happen if you call execute_direct whilst already in a handshake with the target thread? I heard that a deadlock would happen if execute_direct() is called while already in a direct handshake. Thus we need to use active_handshaker() in this change. >>>> Further comments: >>>> >>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >>>> >>>> 194 #ifdef ASSERT >>>> 195 Thread *current = Thread::current(); >>>> 196 #endif >>>> 197 assert(get_thread() == current || current == get_thread()->active_handshaker(), >>>> 198 "frame pop data only accessible from same thread or direct handshake"); >>>> >>>> Can you factor this out into a separate function so that it is not repeated so often. Seems to me that there should be a global function on Thread: assert_current_thread_or_handshaker() [yes unpleasant name but ...] that will allow us to stop repeating this code fragment across numerous files. A follow up RFE for that would be okay too (I see some guarantees that should probably just be asserts so they need a bit more checking). I filed it as another RFE: https://bugs.openjdk.java.net/browse/JDK-8252479 >>>> 331 Handshake::execute_direct(&op, _thread); >>>> >>>> You aren't checking the return value of execute_direct, but I can't tell where _thread was checked for still being alive ?? >>>> >>>> --- >>>> >>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>> >>>> 340 Handshake::execute_direct(&hs, target); >>>> >>>> I know this is existing code but I have the same query as above - no return value check and no clear check that the JavaThread is still alive? The existing code seems to assume that the target thread is alive; frame operations (e.g. 
PopFrame()) should be performed on a live thread. Also, the existing code does not set any JVMTI error and cannot propagate one to the caller, so I did not add a check for the thread state. >>>> Do we know if the existing tests actually test the nested cases? I saw some errors from the assertion on JvmtiThreadState_lock and safepoint in vmTestbase at first, so I guess the nested calls are tested, but I'm not sure. >>> I have concerns with the added locking: >>> >>> MutexLocker mu(JvmtiThreadState_lock); >>> >>> Who else may be holding that lock? Could it be our target thread that we have already initiated a handshake with? (The lock ranking checks related to safepoints don't help us detect deadlocks between a target thread and its handshaker. :( ) >> >> I checked source code again, then I couldn't find the point that target thread already locked JvmtiThreadState_lock at direct handshake. > > I'm very unclear exactly what state this lock guards and under what conditions. But looking at: > > src/hotspot/share/prims/jvmtiEnv.cpp > > Surely the lock is only needed in the direct-handshake case and not when operating on the current thread? Or is it there because you've removed the locking from the lower-level JvmtiEventController methods and so now you need to take the lock higher-up the call chain? (I find it hard to follow the call chains in the JVMTI code.) We need to take the lock higher up the call chain. It was suggested by Robbin, and it works fine. >>> It is far from clear now which functions are reachable from handshakes, which from safepoint VM_ops and which from both. >>> >>> ! assert(SafepointSynchronize::is_at_safepoint() || JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); >>> >>> This can be written as: >>> >>> assert_locked_or_safepoint(JvmtiThreadState_lock); >>> >>> or possibly the weak variant of that. (I'm puzzled by the extra check in the strong version ... I think it is intended for the case of the VMThread executing a non-safepoint VMop.) 
>> >>> JvmtiEventController::set_frame_pop(), JvmtiEventController::clear_frame_pop() and JvmtiEventController::clear_to_frame_pop() are no longer called at safepoint, so I remove safepoint check from assert() in new webrev. > > You should use assert_lock_strong for this. I will do that. Thanks, Yasumasa > Thanks, > David > >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ >> diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 >> >> >> Thanks, >> >> Yasumasa >> >> >>> Thanks, >>> David >>> ----- >>> >>> >>> On 27/08/2020 4:34 pm, David Holmes wrote: >>>> Hi Yasumasa, >>>> >>>> On 27/08/2020 9:40 am, Yasumasa Suenaga wrote: >>>>> Hi David, >>>>> >>>>> On 2020/08/27 8:09, David Holmes wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >>>>>>> Hi Patricio, David, >>>>>>> >>>>>>> Thanks for your comment! >>>>>>> >>>>>>> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >>>>>>> >>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>>>>> >>>>>>> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >>>>>> >>>>>> How can the current thread already be in a handshake with the target when you execute this code? >>>>> >>>>> EnterInterpOnlyModeClosure might be called in handshake with UpdateForPopTopFrameClosure or with SetFramePopClosure. >>>>> >>>>> EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an alternative in VM_EnterInterpOnlyMode. >>>>> VM_EnterInterpOnlyMode returned true in allow_nested_vm_operations(). Originally, it could have been called from other VM operations. >>>> >>>> I see. 
It is a pity that we have now lost that critical indicator that shows how this operation can be nested within another operation. The possibility of nesting is even more obscure with JvmtiEnvThreadState::reset_current_location. And the fact it is now up to the caller to handle that case explicitly raises some concern - what will happen if you call execute_direct whilst already in a handshake with the target thread? >>>> >>>> I can't help but feel that we need a more rigorous and automated way of dealing with nesting ... perhaps we don't even need to care and handshakes should always allow nested handshake requests? (Question more for Robbin and Patricio.) >>>> >>>> Further comments: >>>> >>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >>>> >>>> 194 #ifdef ASSERT >>>> 195 Thread *current = Thread::current(); >>>> 196 #endif >>>> 197 assert(get_thread() == current || current == get_thread()->active_handshaker(), >>>> 198 "frame pop data only accessible from same thread or direct handshake"); >>>> >>>> Can you factor this out into a separate function so that it is not repeated so often. Seems to me that there should be a global function on Thread: assert_current_thread_or_handshaker() [yes unpleasant name but ...] that will allow us to stop repeating this code fragment across numerous files. A follow up RFE for that would be okay too (I see some guarantees that should probably just be asserts so they need a bit more checking). >>>> >>>> 331 Handshake::execute_direct(&op, _thread); >>>> >>>> You aren't checking the return value of execute_direct, but I can't tell where _thread was checked for still being alive ?? >>>> >>>> --- >>>> >>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>> >>>> 340 Handshake::execute_direct(&hs, target); >>>> >>>> I know this is existing code but I have the same query as above - no return value check and no clear check that the JavaThread is still alive? 
>>>> >>>> --- >>>> >>>> Do we know if the existing tests actually test the nested cases? >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>>> >>>>>>> Cheers, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>>>> >>>>>>>>> I'm trying to migrate following operations to direct handshake. >>>>>>>>> >>>>>>>>> - VM_UpdateForPopTopFrame >>>>>>>>> - VM_SetFramePop >>>>>>>>> - VM_GetCurrentLocation >>>>>>>>> >>>>>>>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>>>>>>> >>>>>>>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >>>>>>>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>>>>>>> >>>>>>>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>>>> Some comments on the proposed change. 
>>>>>>>> >>>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>>>> >>>>>>>> (GetCurrentLocationClosure) >>>>>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>>>>>>> op.do_thread(_thread); >>>>>>>> } else { >>>>>>>> Handshake::execute_direct(&op, _thread); >>>>>>>> } >>>>>>>> >>>>>>>> vs >>>>>>>> >>>>>>>> (EnterInterpOnlyModeClosure) >>>>>>>> if (target->active_handshaker() != NULL) { >>>>>>>> hs.do_thread(target); >>>>>>>> } else { >>>>>>>> Handshake::execute_direct(&hs, target); >>>>>>>> } >>>>>>>> >>>>>>>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>>>>>>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>>>>>>> >>>>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>>>>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>>>>>>> >>>>>>>> Thanks! 
>>>>>>>> >>>>>>>> Patricio >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>> From david.holmes at oracle.com Fri Aug 28 04:09:13 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 28 Aug 2020 14:09:13 +1000 Subject: RFR(S) : 8252477 : nsk/share/ArgumentParser should expect that jtreg "splits" an argument In-Reply-To: <26E3A312-A45E-489F-A5B1-F1E67CBE807A@oracle.com> References: <26E3A312-A45E-489F-A5B1-F1E67CBE807A@oracle.com> Message-ID: Hi Igor, In case there may be a parsing error and the command-line is ill-formed, should you abort if you reach the end of the arg list without finding an even number of double-quotes? Or will parseArguments already handle that? Otherwise the changes seem good. Thanks, David ----- On 28/08/2020 12:39 pm, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8252477/webrev.00/ >> 99 lines changed: 19 ins; 20 del; 60 mod; > > Hi all, > > could you please review the patch which unblocks the rest of 8219140's (get rid of vmTestbase/PropertyResolvingWrapper) sub-tasks? > > background from JBS: >> jtreg splits command line by space to get the list of arguments and there is no way to prevent that (nor thru escaping, nor by adding quotes). currently, PropertyResolvingWrapper handles that and joins multiple arguments within double quotes into one argument before passing it to the actual test class. the only place where it's needed is in the tests which use nsk/share/ArgumentParser (or more precisely nsk.share.jpda.DebugeeArgumentHandler and nsk/share/jdb/JdbArgumentHandler). >> >> in preparation for PropertyResolvingWrapper removal, ArgumentParser should be updated to handle the "split" argument on its own. > > I've also taken the liberty to slightly clean up ArgumentParser. 
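For readers following the 8252477 discussion, the splitting problem and the unbalanced-quote question raised above can be illustrated with a minimal sketch. This is not the code from the webrev: the class name SplitArgJoiner, the method join, and the sample option names are hypothetical and chosen only for illustration. The sketch assumes that a quoted value arrives as several space-split tokens, re-joins them with single spaces, drops the quote characters, and fails if a quote is never closed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of nsk/share; for illustration only.
final class SplitArgJoiner {

    // Re-joins tokens that were split on spaces inside a double-quoted value.
    // Quote characters are dropped; an unclosed quote is reported as an error.
    static List<String> join(String[] args) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (String arg : args) {
            if (inQuotes) {
                // The previous token ended inside a quote, so restore the space
                // that the splitting removed.
                current.append(' ');
            }
            for (char c : arg.toCharArray()) {
                if (c == '"') {
                    inQuotes = !inQuotes; // toggle on every quote, drop the character
                } else {
                    current.append(c);
                }
            }
            if (!inQuotes) {
                result.add(current.toString());
                current.setLength(0);
            }
        }
        if (inQuotes) {
            // Odd number of double quotes: the command line is ill-formed.
            throw new IllegalArgumentException("unbalanced double quote in arguments");
        }
        return result;
    }

    public static void main(String[] ignored) {
        // Sample input as it might look after being split on spaces (assumed values).
        String[] split = {"-verbose", "-debugee.vmkeys=\"-Xmx64m", "-Xss1m\"", "-waittime=5"};
        System.out.println(join(split));
    }
}
```

Toggling a flag on every quote character, rather than only checking the first and last character of a token, lets the sketch handle quotes that open in the middle of a token such as an option=value pair.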
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8252477 > webrev: http://cr.openjdk.java.net/~iignatyev//8252477/webrev.00/ > testing: all the tests which use ArgumentParser (:vmTestbase_nsk_aod :vmTestbase_nsk_jdb :vmTestbase_nsk_jdi :vmTestbase_nsk_jdw ,:vmTestbase_nsk_jvmti :vmTestbase_vm_compiler :vmTestbase_vm_mlvm) on {windows,linux,macos}-x64 > > Thanks, > -- Igor > From goetz.lindenmaier at sap.com Fri Aug 28 06:37:39 2020 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 28 Aug 2020 06:37:39 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi Richard, Thanks for the new webrev. The small improvements are fine, too. Reviewed from my side. Best regards, Goetz. > -----Original Message----- > From: Reingruber, Richard > Sent: Thursday, August 27, 2020 10:33 PM > To: Lindenmaier, Goetz ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > runtime-dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Goetz, > > > I read through your change again. It looks good to me now. > > The new naming and additional comments make it > > easier to read I think, thank you. > > Thanks for all your input! > > > One small thing: > > deoptimization.cpp, l. 1503 > > You don't really need the brackets. Two lines below you don't use them > either. > > (No webrev needed) > > Thanks for providing the correct line off list. Fixed! 
> > I prepared a new webrev, because I had to rebase after JDK-8249293 [1] and > because I wanted to make use of JDK-8251384 [2] > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > Delta: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/ > > The delta looks bigger than it is. Most of it is re-indentation of > VM_GetOrSetLocal::deoptimize_objects(). You can see this if you look at > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/src/hotsp > ot/share/prims/jvmtiImpl.cpp.udiff.html > > which does not include the whitespace change. > > Hope you are still ok with webrev.8. The changes are marginal. I've > commented > each below. > > Thanks, Richard. > > --- Details below --- > > src/hotspot/share/prims/jvmtiImpl.cpp > > @@ -425,11 +425,11 @@ > , _depth(depth) > , _index(index) > , _type(type) > , _jvf(NULL) > , _set(false) > - , _eb(NULL, NULL, false) // no references escape > + , _eb(NULL, NULL, type == T_OBJECT) > , _result(JVMTI_ERROR_NONE) > > Currently 'type' is never equal to T_OBJECT at this location, still I think it > is better to check. The compiler will replace the compare with false. > > @@ -630,11 +630,11 @@ > } > > // Revert optimizations based on escape analysis if this is an access to a > local object > bool VM_GetOrSetLocal::deoptimize_objects(javaVFrame* jvf) { > #if COMPILER2_OR_JVMCI > - if (NOT_JVMCI(DoEscapeAnalysis &&) _type == T_OBJECT) { > + assert(_type == T_OBJECT, "EscapeBarrier should not be active if _type != > T_OBJECT"); > > I removed the if from VM_GetOrSetLocal::deoptimize_objects(), because > now it > only gets called if the VM_GetOrSetLocal instance has an active > EscapeBarrier > which will be the case iff the local type is T_OBJECT and if either C2 escape > analysis is enabled or Graal is used. > > src/hotspot/share/runtime/deoptimization.cpp > > You suggested to remove the braces. Done. 
> > src/hotspot/share/runtime/deoptimization.hpp > > Must provide definition of EscapeBarrier::barrier_active() for new call site in > VM_GetOrSetLocal::doit_prologue() if building with COMPILER2_OR_JVMCI > not > defined. > > test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysis > Enabled.java > > Make use of [2] and pass test with minimal vm. > > [1] https://bugs.openjdk.java.net/browse/JDK-8249293 > [2] https://bugs.openjdk.java.net/browse/JDK-8251384 > > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Samstag, 22. August 2020 07:46 > To: Reingruber, Richard ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > runtime-dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Richard, > > I read through your change again. It looks good to me now. > The new naming and additional comments make it > easier to read I think, thank you. > > One small thing: > deoptimization.cpp, l. 1503 > You don't really need the brackets. Two lines below you don't use them > either. > (No webrev needed) > > Best regards, > Goetz. From serguei.spitsyn at oracle.com Fri Aug 28 07:39:03 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 28 Aug 2020 00:39:03 -0700 Subject: RFR(T) : 8252401 : Introduce Utils.TEST_NATIVE_PATH In-Reply-To: <8E0C4E48-B435-4734-A86B-2C6745104BF7@oracle.com> References: <8E0C4E48-B435-4734-A86B-2C6745104BF7@oracle.com> Message-ID: Hi Igor, It looks good and trivial. Thanks, Serguei On 8/26/20 16:59, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 >> 6 lines changed: 5 ins; 0 del; 1 mod; > Hi all, > > could you please review this trivial patch which adds j.t.l.Utils.TEST_NATIVE_PATH static field to store the value of test.nativepath system property? 
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8252401
> webrev: http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00
>
> Thanks,
> -- Igor

From richard.reingruber at sap.com Fri Aug 28 07:41:02 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 28 Aug 2020 07:41:02 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
Message-ID:

Thanks a lot!

Richard.

From richard.reingruber at sap.com Fri Aug 28 08:11:31 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 28 Aug 2020 08:11:31 +0000
Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To: <47808938-a5d8-723a-bcb0-7c31d072af5c@oracle.com>
References: <99babb38-c4c0-c664-5a35-af51c2333e90@oss.nttdata.com> <47808938-a5d8-723a-bcb0-7c31d072af5c@oracle.com>
Message-ID:

Hi David, hi Yasumasa,

> Unfortunately I do not have any benchmark for this change, however I think it is
> worth to do it for consistency. All of VM operations which do not need global
> lock in JVMTI are replaced to direct handshake if this enhancement is merged.

VM_GetOrSetLocal can be replaced with a handshake too, I'd say.
> On 28/08/2020 11:45 am, Yasumasa Suenaga wrote: > > Hi Richard, > > > > Unfortunately I do not have any benchmark for this change, however I > > think it is worth to do it for consistency. All of VM operations which > > do not need global lock in JVMTI are replaced to direct handshake if > > this enhancement is merged. > > > > I think VM operations should be replaced to direct handshake if we can. > > VM operations should be just used for operations which needs global > > lock. It will help all of programmers who are interested in HotSpot when > > they try to know the operation. > I agree with this motivation - we want to eradicate as many safepoint VM > operations as possible, even if the usage would not really benefit from > the lack of stop-the-world pauses. That said, of course this has to be > tempered against the complexity of the change. But we are establishing a > pattern for coding up JVMTI operation as direct handshakes, which should > make things generally more easy to understand. I still don't see the point in optimizing the uncommon case making it more complex. But if it's just me... Cheers, Richard. -----Original Message----- From: David Holmes Sent: Freitag, 28. August 2020 03:53 To: Yasumasa Suenaga ; Reingruber, Richard ; serviceability-dev Subject: Re: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes On 28/08/2020 11:45 am, Yasumasa Suenaga wrote: > Hi Richard, > > Unfortunately I do not have any benchmark for this change, however I > think it is worth to do it for consistency. All of VM operations which > do not need global lock in JVMTI are replaced to direct handshake if > this enhancement is merged. > > I think VM operations should be replaced to direct handshake if we can. > VM operations should be just used for operations which needs global > lock. It will help all of programmers who are interested in HotSpot when > they try to know the operation. 
I agree with this motivation - we want to eradicate as many safepoint VM operations as possible, even if the usage would not really benefit from the lack of stop-the-world pauses. That said, of course this has to be tempered against the complexity of the change. But we are establishing a pattern for coding up JVMTI operation as direct handshakes, which should make things generally more easy to understand. Cheers, David > > Thanks, > > Yasumasa > > > On 2020/08/27 16:43, Reingruber, Richard wrote: >> Hi Yasumasa, >> >>> I've described the motivation on JDK-8201641 (it is a parent task of >>> JDK-8242427) >> >>> ``` >>> Many JVMTI functions uses VM Operation to get information. However >>> some of them need to stop only one thread - they don't need to stop >>> all threads. >>> ``` >> >> So the goal is better performance. For PopFrame IMHO it is not worth >> the effort, >> the future effort in maintaining the related code, and the risk. >> >> I think so because PopFrame is a hardly ever used. I honestly never >> used it >> (have you?). In IDEs it is well hidden. Graal does not even bother to >> support >> it. On the other side the change affects other operations that are >> commonly >> used. >> >> In the rare cases when a PopFrame is requested it will be in interactive >> sessions: someone found the well-hidden PopFrame button in the >> debugger and >> pressed it. Probably she won't do it again. At least not at a high >> frequency. So >> she will not notice the effect of the optimization. >> >> If you have a large cloud of JVMs where every second a PopFrame is >> executed, >> even then I would doubt that the resource savings are measurable. And >> I would >> also doubt that a cloud with PopFrames at that rate exists. >> >> I see there are rare events like full GCs that can do harm. But in the >> case of >> PopFrame I can't see a problem because the pause for the vm operation >> will be >> extremely short. 
>>
>> Is there a scenario or a not too artificial benchmark that would show an
>> improvement?
>>
>> Thanks,
>> Richard.
>>
>> -----Original Message-----
>> From: Yasumasa Suenaga
>> Sent: Donnerstag, 27. August 2020 01:30
>> To: Reingruber, Richard ;
>> serviceability-dev
>> Subject: Re: 8242427: JVMTI frame pop operations should use
>> Thread-Local Handshakes
>>
>> Hi Richard,
>>
>> I've described the motivation on JDK-8201641 (it is a parent task of
>> JDK-8242427)
>>
>> ```
>> Many JVMTI functions uses VM Operation to get information. However
>> some of them need to stop only one thread - they don't need to stop
>> all threads.
>> ```
>>
>> I aimed to improve JVMTI monitor operation with TLS at first, but I
>> found other JVMTI operations can be improved with same process. So
>> I've tried to fix them.
>>
>> I proposed it to serviceability-dev [1], then Dan told me similar
>> enhancement is already filed to JBS [2]. So I created subtasks in it.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1]
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030890.html
>>
>> [2]
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030897.html
>>
>>
>>
>> On 2020/08/27 5:33, Reingruber, Richard wrote:
>>> Hi Yasumasa,
>>>
>>> Could you explain a little bit the motivation to replace these vm
>>> operations with handshakes?
>>> Would be good, if you could add the goals as well to the JBS item.
>>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: serviceability-dev
>>> On Behalf Of Yasumasa Suenaga
>>> Sent: Montag, 24. August 2020 04:40
>>> To: serviceability-dev
>>> Subject: 8242427: JVMTI frame pop operations should use Thread-Local
>>> Handshakes
>>>
>>> Hi all,
>>>
>>> I want to hear your opinions about the change for JDK-8242427.
>>>
>>> I'm trying to migrate following operations to direct handshake.
>>>
>>>        - VM_UpdateForPopTopFrame
>>>        - VM_SetFramePop
>>>        - VM_GetCurrentLocation
>>>
>>> Some operations (VM_GetCurrentLocation and
>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want
>>> to use JavaThread::active_handshaker() in production VM to detect the
>>> process is in direct handshake or not.
>>>
>>> However this function is available in debug VM only, so I want to
>>> hear the reason why it is for debug VM only, and there are no problem
>>> to use it in production VM. Of course another solutions are welcome.
>>>
>>> webrev is here. It passed jtreg tests
>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti})
>>>      http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>

From dms at samersoff.net Fri Aug 28 08:22:24 2020
From: dms at samersoff.net (Dmitry Samersoff)
Date: Fri, 28 Aug 2020 11:22:24 +0300
Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux
In-Reply-To: <87b7ffaf-2a05-0a76-6933-6d0ee3927f67@oracle.com>
References: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> <87b7ffaf-2a05-0a76-6933-6d0ee3927f67@oracle.com>
Message-ID:

Hello Sergei,

I decided not to add extra check here, if IPv6 is not enabled in general, JDWP will not work at all and it's the only possibility for inet_pton to fail.

>> Except:
> The error code from the inet_pton is not checked.

inet_pton performs conversion of the constant value in our case and the only possible reason for it to fail is that the system doesn't support IPv6 at all.

-Dmitry

On 27.08.2020 21:05, serguei.spitsyn at oracle.com wrote:
> Hi Dmitry,
>
> Thank you for the update.
> The error code from the inet_pton() still is not checked.
>
> Thanks,
> Serguei
>
>
> On 8/27/20 08:03, Dmitry Samersoff wrote:
>> Hello Everybody,
>>
>> http://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.02/
>>
>> Webrev is updated, all comments accepted and addressed.
>>
>> Except:
>> > The error code from the inet_pton is not checked.
>>
>> inet_pton performs conversion of the constant value in our case and
>> the only possible reason for it to fail is that the system doesn't
>> support IPv6 at all.
>>
>> -Dmitry
>>
>>
>> On 18.08.2020 3:20, serguei.spitsyn at oracle.com wrote:
>>> Hi Dmitry,
>>>
>>> I agree with Alex, it is better to rename compareIPv6Addr to
>>> isEqualIPv6Addr.
>>>
>>> 705 static int compareIPv6Addr(struct addrinfo *ai, struct in6_addr
>>> in6Addr)
>>> 706 {
>>> 707
>>> 708 if (ai->ai_addr->sa_family == AF_INET6) {
>>> 709 const struct sockaddr_in6 sa = *((struct sockaddr_in6*)
>>> ai->ai_addr);
>>> 710 return (memcmp(&sa.sin6_addr, &in6Addr, sizeof(in6Addr)) == 0);
>>> 711 }
>>> 712
>>> 713 return 0;
>>> 714 }
>>>
>>> I think, the lines 707 and 712 are not needed.
>>>
>>> 725 struct in6_addr mappedAny; ... 761 inet_pton(AF_INET6,
>>> "::ffff:0.0.0.0", &mappedAny);
>>>
>>> The error code from the inet_pton is not checked.
>>> Also, it can be useful to pre-initialize the mappedAny.
>>>
>>> 737 // Try to find bind address of preferred address familty first
>>> A dot at the end of comment is missed.
>>>
>>> 745 if (listenAddr == NULL) {
>>> 746 // No address of preferred addres family found, grab the first one
>>> 747 listenAddr = &(addrInfo[0]);
>>>   748     }
>>>
>>> The indent has to be 4, not 3.
>>> The () brackets are not actually needed but I do not object if you
>>> keep them.
>>> A dot at the end of comment is missed.
>>>
>>> 755 // Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6
>>> connections,
>>> 756 // but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow us to
>>> serve IPv4
>>> 757 // connections only. So make sure, that IN6ADDR_ANY is preferred
>>> over
>>> 758 // mapped INADDR_ANY if preferredAddressFamily is AF_INET6 or not
>>> set
>>>
>>> I'd suggest to replace "allow us" => "allows" in two places:
>>>    "IN6ADDR_ANY allow us" and "(::ffff:0.0.0.0) allow us".
>>> Also, it'd better to replace: "So make sure,that" => "Make sure that".
>>> A dot at the end of comment is missed.
>>>
>>> I don't know the network protocols well enough to comment on
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 8/17/20 00:21, Dmitry Samersoff wrote:
>>>> Hello Everybody,
>>>>
>>>> Please review the fix:
>>>>
>>>> https://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.01/
>>>>
>>>> Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6
>>>> connections, but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow
>>>> us to serve IPv4 connections only.
>>>>
>>>> So make sure, that IN6ADDR_ANY is preferred over mapped INADDR_ANY
>>>> if preferredAddressFamily is not AF_INET
>>>>
>>>>
>>>> -Dmitry\S
>>>>
>>>
>>
>

From suenaga at oss.nttdata.com Fri Aug 28 08:41:34 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 28 Aug 2020 17:41:34 +0900
Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To:
References: <99babb38-c4c0-c664-5a35-af51c2333e90@oss.nttdata.com> <47808938-a5d8-723a-bcb0-7c31d072af5c@oracle.com>
Message-ID: <2fd7e3c9-d321-bd7b-2ce6-901cb8b77e5f@oss.nttdata.com>

Hi Richard,

On 2020/08/28 17:11, Reingruber, Richard wrote:
> Hi David, hi Yasumasa,
>
>> Unfortunately I do not have any benchmark for this change, however I think it is
>> worth to do it for consistency. All of VM operations which do not need global
>> lock in JVMTI are replaced to direct handshake if this enhancement is merged.
>
> VM_GetOrSetLocal can be replaced with a handshake too, I'd say.

VM_GetOrSetLocal::doit() might call Deoptimization::deoptimize_frame() - it would exec VM_DeoptimizeFrame. It is VMop in direct handshake. If it is safe, we can replace VM_GetOrSetLocal, but I'm not sure.


Thanks,

Yasumasa

From richard.reingruber at sap.com Fri Aug 28 08:54:16 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 28 Aug 2020 08:54:16 +0000
Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To: <2fd7e3c9-d321-bd7b-2ce6-901cb8b77e5f@oss.nttdata.com>
References: <99babb38-c4c0-c664-5a35-af51c2333e90@oss.nttdata.com> <47808938-a5d8-723a-bcb0-7c31d072af5c@oracle.com> <2fd7e3c9-d321-bd7b-2ce6-901cb8b77e5f@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

VM_DeoptimizeFrame can be replaced too I'd think.

Cheers, Richard.

From suenaga at oss.nttdata.com Fri Aug 28 13:14:33 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 28 Aug 2020 22:14:33 +0900
Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To:
References: <99babb38-c4c0-c664-5a35-af51c2333e90@oss.nttdata.com> <47808938-a5d8-723a-bcb0-7c31d072af5c@oracle.com> <2fd7e3c9-d321-bd7b-2ce6-901cb8b77e5f@oss.nttdata.com>
Message-ID:

Hi Richard,

On 2020/08/28 17:54, Reingruber, Richard wrote:
> Hi Yasumasa,
>
> VM_DeoptimizeFrame can be replaced too I'd think.

The scope of this change is JVMTI, so I don't want to change VM_DeoptimizeFrame now.
Of course it would be nice if other VM operations (includes VM_DeoptimizeFrame) are replaced to direct handshake in future, but I think it is another RFE.


Cheers,

Yasumasa


> Cheers, Richard.
> > -----Original Message----- > From: Yasumasa Suenaga > Sent: Freitag, 28. August 2020 10:42 > To: Reingruber, Richard ; David Holmes ; serviceability-dev > Subject: Re: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes > > Hi Richard, > > On 2020/08/28 17:11, Reingruber, Richard wrote: >> Hi David, hi Yasumasa, >> >>> Unfortunately I do not have any benchmark for this change, however I think it is >>> worth to do it for consistency. All of VM operations which do not need global >>> lock in JVMTI are replaced to direct handshake if this enhancement is merged. >> >> VM_GetOrSetLocal can be replaced with a handshake too, I'd say. > > VM_GetOrSetLocal::doit() might call Deoptimization::deoptimize_frame() - it would exec VM_DeoptimizeFrame. It is VMop in direct handshake. If it is safe, we can replace VM_GetOrSetLocal, but I'm not sure. > > > Thanks, > > Yasumasa > > >>> On 28/08/2020 11:45 am, Yasumasa Suenaga wrote: >>>> Hi Richard, >>>> >>>> Unfortunately I do not have any benchmark for this change, however I >>>> think it is worth to do it for consistency. All of VM operations which >>>> do not need global lock in JVMTI are replaced to direct handshake if >>>> this enhancement is merged. >>>> >>>> I think VM operations should be replaced to direct handshake if we can. >>>> VM operations should be just used for operations which needs global >>>> lock. It will help all of programmers who are interested in HotSpot when >>>> they try to know the operation. >> >>> I agree with this motivation - we want to eradicate as many safepoint VM >>> operations as possible, even if the usage would not really benefit from >>> the lack of stop-the-world pauses. That said, of course this has to be >>> tempered against the complexity of the change. But we are establishing a >>> pattern for coding up JVMTI operation as direct handshakes, which should >>> make things generally more easy to understand. 
>> >> I still don't see the point in optimizing the uncommon case making it more >> complex. But if it's just me... >> >> Cheers, Richard. >> >> -----Original Message----- >> From: David Holmes >> Sent: Freitag, 28. August 2020 03:53 >> To: Yasumasa Suenaga ; Reingruber, Richard ; serviceability-dev >> Subject: Re: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes >> >> On 28/08/2020 11:45 am, Yasumasa Suenaga wrote: >>> Hi Richard, >>> >>> Unfortunately I do not have any benchmark for this change, however I >>> think it is worth to do it for consistency. All of VM operations which >>> do not need global lock in JVMTI are replaced to direct handshake if >>> this enhancement is merged. >>> >>> I think VM operations should be replaced to direct handshake if we can. >>> VM operations should be just used for operations which needs global >>> lock. It will help all of programmers who are interested in HotSpot when >>> they try to know the operation. >> >> I agree with this motivation - we want to eradicate as many safepoint VM >> operations as possible, even if the usage would not really benefit from >> the lack of stop-the-world pauses. That said, of course this has to be >> tempered against the complexity of the change. But we are establishing a >> pattern for coding up JVMTI operation as direct handshakes, which should >> make things generally more easy to understand. >> >> Cheers, >> David >> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/08/27 16:43, Reingruber, Richard wrote: >>>> Hi Yasumasa, >>>> >>>>> I've described the motivation on JDK-8201641 (it is a parent task of >>>>> JDK-8242427) >>>> >>>>> ``` >>>>> Many JVMTI functions uses VM Operation to get information. However >>>>> some of them need to stop only one thread - they don't need to stop >>>>> all threads. >>>>> ``` >>>> >>>> So the goal is better performance. 
For PopFrame IMHO it is not worth >>>> the effort, >>>> the future effort in maintaining the related code, and the risk. >>>> >>>> I think so because PopFrame is a hardly ever used. I honestly never >>>> used it >>>> (have you?). In IDEs it is well hidden. Graal does not even bother to >>>> support >>>> it. On the other side the change affects other operations that are >>>> commonly >>>> used. >>>> >>>> In the rare cases when a PopFrame is requested it will be in interactive >>>> sessions: someone found the well-hidden PopFrame button in the >>>> debugger and >>>> pressed it. Probably she won't do it again. At least not at a high >>>> frequency. So >>>> she will not notice the effect of the optimization. >>>> >>>> If you have a large cloud of JVMs where every second a PopFrame is >>>> executed, >>>> even then I would doubt that the resource savings are measurable. And >>>> I would >>>> also doubt that a cloud with PopFrames at that rate exists. >>>> >>>> I see there are rare events like full GCs that can do harm. But in the >>>> case of >>>> PopFrame I can't see a problem because the pause for the vm operation >>>> will be >>>> extremely short. >>>> >>>> Is there a scenario or a not too artificial benchmark that would show an >>>> improvement? >>>> >>>> Thanks, >>>> Richard. >>>> >>>> -----Original Message----- >>>> From: Yasumasa Suenaga >>>> Sent: Donnerstag, 27. August 2020 01:30 >>>> To: Reingruber, Richard ; >>>> serviceability-dev >>>> Subject: Re: 8242427: JVMTI frame pop operations should use >>>> Thread-Local Handshakes >>>> >>>> Hi Richard, >>>> >>>> I've described the motivation on JDK-8201641 (it is a parent task of >>>> JDK-8242427) >>>> >>>> ``` >>>> Many JVMTI functions uses VM Operation to get information. However >>>> some of them need to stop only one thread - they don't need to stop >>>> all threads. 
>>>> ``` >>>> >>>> I aimed to improve JVMTI monitor operation with TLS at first, but I >>>> found other JVMTI operations can be improved with same process. So >>>> I've tried to fix them. >>>> >>>> I proposed it to serviceability-dev [1], then Dan told me similar >>>> enhancement is already filed to JBS [2]. So I created subtasks in it. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> [1] >>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030890.html >>>> >>>> [2] >>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030897.html >>>> >>>> >>>> >>>> On 2020/08/27 5:33, Reingruber, Richard wrote: >>>>> Hi Yasumasa, >>>>> >>>>> Could you explain a little bit the motivation to replace these vm >>>>> operations with handshakes? >>>>> Would be good, if you could add the goals as well to the JBS item. >>>>> >>>>> Thanks, Richard. >>>>> >>>>> -----Original Message----- >>>>> From: serviceability-dev >>>>> On Behalf Of Yasumasa Suenaga >>>>> Sent: Montag, 24. August 2020 04:40 >>>>> To: serviceability-dev >>>>> Subject: 8242427: JVMTI frame pop operations should use Thread-Local >>>>> Handshakes >>>>> >>>>> Hi all, >>>>> >>>>> I want to hear your opinions about the change for JDK-8242427. >>>>> >>>>> I'm trying to migrate following operations to direct handshake. >>>>> >>>>> ?????? - VM_UpdateForPopTopFrame >>>>> ?????? - VM_SetFramePop >>>>> ?????? - VM_GetCurrentLocation >>>>> >>>>> Some operations (VM_GetCurrentLocation and >>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want >>>>> to use JavaThread::active_handshaker() in production VM to detect the >>>>> process is in direct handshake or not. >>>>> >>>>> However this function is available in debug VM only, so I want to >>>>> hear the reason why it is for debug VM only, and there are no problem >>>>> to use it in production VM. Of course another solutions are welcome. >>>>> >>>>> webrev is here. 
It passed jtreg tests >>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>> ???? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> From richard.reingruber at sap.com Fri Aug 28 13:45:47 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 28 Aug 2020 13:45:47 +0000 Subject: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: References: <99babb38-c4c0-c664-5a35-af51c2333e90@oss.nttdata.com> <47808938-a5d8-723a-bcb0-7c31d072af5c@oracle.com> <2fd7e3c9-d321-bd7b-2ce6-901cb8b77e5f@oss.nttdata.com> Message-ID: Hi Yasumasa, then this was a misunderstanding. I thought you were saying you covered all vm operations in the JVMTI subsystem that can be replaced with handshakes. I wanted to state that I think that local variable access does not require global synchronization (i.e. a safepoint) and that it is feasible to use handshakes for it. Cheers, Richard. -----Original Message----- From: Yasumasa Suenaga Sent: Freitag, 28. August 2020 15:15 To: Reingruber, Richard ; David Holmes ; serviceability-dev Subject: Re: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes Hi Richard, On 2020/08/28 17:54, Reingruber, Richard wrote: > Hi Yasumasa, > > VM_DeoptimizeFrame can be replaced too I'd think. The scope of this change is JVMTI, so I don't want to change VM_DeoptimizeFrame now. Of course it would be nice if other VM operations (includes VM_DeoptimizeFrame) are replaced to direct handshake in future, but I think it is another RFE. Cheers, Yasumasa > Cheers, Richard. > > -----Original Message----- > From: Yasumasa Suenaga > Sent: Freitag, 28. 
August 2020 10:42 > To: Reingruber, Richard ; David Holmes ; serviceability-dev > Subject: Re: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes > > Hi Richard, > > On 2020/08/28 17:11, Reingruber, Richard wrote: >> Hi David, hi Yasumasa, >> >>> Unfortunately I do not have any benchmark for this change, however I think it is >>> worth to do it for consistency. All of VM operations which do not need global >>> lock in JVMTI are replaced to direct handshake if this enhancement is merged. >> >> VM_GetOrSetLocal can be replaced with a handshake too, I'd say. > > VM_GetOrSetLocal::doit() might call Deoptimization::deoptimize_frame() - it would exec VM_DeoptimizeFrame. It is VMop in direct handshake. If it is safe, we can replace VM_GetOrSetLocal, but I'm not sure. > > > Thanks, > > Yasumasa > > >>> On 28/08/2020 11:45 am, Yasumasa Suenaga wrote: >>>> Hi Richard, >>>> >>>> Unfortunately I do not have any benchmark for this change, however I >>>> think it is worth to do it for consistency. All of VM operations which >>>> do not need global lock in JVMTI are replaced to direct handshake if >>>> this enhancement is merged. >>>> >>>> I think VM operations should be replaced to direct handshake if we can. >>>> VM operations should be just used for operations which needs global >>>> lock. It will help all of programmers who are interested in HotSpot when >>>> they try to know the operation. >> >>> I agree with this motivation - we want to eradicate as many safepoint VM >>> operations as possible, even if the usage would not really benefit from >>> the lack of stop-the-world pauses. That said, of course this has to be >>> tempered against the complexity of the change. But we are establishing a >>> pattern for coding up JVMTI operation as direct handshakes, which should >>> make things generally more easy to understand. >> >> I still don't see the point in optimizing the uncommon case making it more >> complex. But if it's just me... >> >> Cheers, Richard. 
>> >> -----Original Message----- >> From: David Holmes >> Sent: Freitag, 28. August 2020 03:53 >> To: Yasumasa Suenaga ; Reingruber, Richard ; serviceability-dev >> Subject: Re: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes >> >> On 28/08/2020 11:45 am, Yasumasa Suenaga wrote: >>> Hi Richard, >>> >>> Unfortunately I do not have any benchmark for this change, however I >>> think it is worth to do it for consistency. All of VM operations which >>> do not need global lock in JVMTI are replaced to direct handshake if >>> this enhancement is merged. >>> >>> I think VM operations should be replaced to direct handshake if we can. >>> VM operations should be just used for operations which needs global >>> lock. It will help all of programmers who are interested in HotSpot when >>> they try to know the operation. >> >> I agree with this motivation - we want to eradicate as many safepoint VM >> operations as possible, even if the usage would not really benefit from >> the lack of stop-the-world pauses. That said, of course this has to be >> tempered against the complexity of the change. But we are establishing a >> pattern for coding up JVMTI operation as direct handshakes, which should >> make things generally more easy to understand. >> >> Cheers, >> David >> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/08/27 16:43, Reingruber, Richard wrote: >>>> Hi Yasumasa, >>>> >>>>> I've described the motivation on JDK-8201641 (it is a parent task of >>>>> JDK-8242427) >>>> >>>>> ``` >>>>> Many JVMTI functions uses VM Operation to get information. However >>>>> some of them need to stop only one thread - they don't need to stop >>>>> all threads. >>>>> ``` >>>> >>>> So the goal is better performance. For PopFrame IMHO it is not worth >>>> the effort, >>>> the future effort in maintaining the related code, and the risk. >>>> >>>> I think so because PopFrame is a hardly ever used. I honestly never >>>> used it >>>> (have you?). In IDEs it is well hidden. 
Graal does not even bother to >>>> support >>>> it. On the other side the change affects other operations that are >>>> commonly >>>> used. >>>> >>>> In the rare cases when a PopFrame is requested it will be in interactive >>>> sessions: someone found the well-hidden PopFrame button in the >>>> debugger and >>>> pressed it. Probably she won't do it again. At least not at a high >>>> frequency. So >>>> she will not notice the effect of the optimization. >>>> >>>> If you have a large cloud of JVMs where every second a PopFrame is >>>> executed, >>>> even then I would doubt that the resource savings are measurable. And >>>> I would >>>> also doubt that a cloud with PopFrames at that rate exists. >>>> >>>> I see there are rare events like full GCs that can do harm. But in the >>>> case of >>>> PopFrame I can't see a problem because the pause for the vm operation >>>> will be >>>> extremely short. >>>> >>>> Is there a scenario or a not too artificial benchmark that would show an >>>> improvement? >>>> >>>> Thanks, >>>> Richard. >>>> >>>> -----Original Message----- >>>> From: Yasumasa Suenaga >>>> Sent: Donnerstag, 27. August 2020 01:30 >>>> To: Reingruber, Richard ; >>>> serviceability-dev >>>> Subject: Re: 8242427: JVMTI frame pop operations should use >>>> Thread-Local Handshakes >>>> >>>> Hi Richard, >>>> >>>> I've described the motivation on JDK-8201641 (it is a parent task of >>>> JDK-8242427) >>>> >>>> ``` >>>> Many JVMTI functions uses VM Operation to get information. However >>>> some of them need to stop only one thread - they don't need to stop >>>> all threads. >>>> ``` >>>> >>>> I aimed to improve JVMTI monitor operation with TLS at first, but I >>>> found other JVMTI operations can be improved with same process. So >>>> I've tried to fix them. >>>> >>>> I proposed it to serviceability-dev [1], then Dan told me similar >>>> enhancement is already filed to JBS [2]. So I created subtasks in it. 
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> [1] >>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030890.html >>>> >>>> [2] >>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-March/030897.html >>>> >>>> >>>> >>>> On 2020/08/27 5:33, Reingruber, Richard wrote: >>>>> Hi Yasumasa, >>>>> >>>>> Could you explain a little bit the motivation to replace these vm >>>>> operations with handshakes? >>>>> Would be good, if you could add the goals as well to the JBS item. >>>>> >>>>> Thanks, Richard. >>>>> >>>>> -----Original Message----- >>>>> From: serviceability-dev >>>>> On Behalf Of Yasumasa Suenaga >>>>> Sent: Montag, 24. August 2020 04:40 >>>>> To: serviceability-dev >>>>> Subject: 8242427: JVMTI frame pop operations should use Thread-Local >>>>> Handshakes >>>>> >>>>> Hi all, >>>>> >>>>> I want to hear your opinions about the change for JDK-8242427. >>>>> >>>>> I'm trying to migrate following operations to direct handshake. >>>>> >>>>> ?????? - VM_UpdateForPopTopFrame >>>>> ?????? - VM_SetFramePop >>>>> ?????? - VM_GetCurrentLocation >>>>> >>>>> Some operations (VM_GetCurrentLocation and >>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I want >>>>> to use JavaThread::active_handshaker() in production VM to detect the >>>>> process is in direct handshake or not. >>>>> >>>>> However this function is available in debug VM only, so I want to >>>>> hear the reason why it is for debug VM only, and there are no problem >>>>> to use it in production VM. Of course another solutions are welcome. >>>>> >>>>> webrev is here. It passed jtreg tests >>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>> ???? 
http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> From igor.ignatyev at oracle.com Fri Aug 28 17:29:16 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 28 Aug 2020 10:29:16 -0700 Subject: RFR(T) : 8252401 : Introduce Utils.TEST_NATIVE_PATH In-Reply-To: References: <8E0C4E48-B435-4734-A86B-2C6745104BF7@oracle.com> Message-ID: thanks Serguei, pushed. -- Igor > On Aug 28, 2020, at 12:39 AM, serguei.spitsyn at oracle.com wrote: > > Hi Igor, > > It looks good and trivial. > > Thanks, > Serguei > > > On 8/26/20 16:59, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 >>> 6 lines changed: 5 ins; 0 del; 1 mod; >> Hi all, >> >> could you please review this trivial patch which adds j.t.l.Utils.TEST_NATIVE_PATH static field to store the value of test.nativepath system property? >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8252401 >> webrev: http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 >> >> Thanks, >> -- Igor > From martin.doerr at sap.com Fri Aug 28 17:53:22 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 28 Aug 2020 17:53:22 +0000 Subject: Fatal errors when running JCK tests with JDK15/16 debug build Message-ID: Hi, we have seen the following fatal error more than 50 times since 2020-05-25 in various JCK tests vm/jvmti. fatal error: String conversion failure: [check] ExitLock destroyed --> [check] ExitLock exited (followed by garbage output) 8166358: Re-enable String verification in java_lang_String::create_from_str() was pushed at that date which introduced the call to fatal. 
Stack (example from linuxppc64le, but also observed on x86 and aarch64):

V [libjvm.so+0xee242c] java_lang_String::create_from_str(char const*, Thread*) [clone .part.158]+0x51c
V [libjvm.so+0xee2530] java_lang_String::create_oop_from_str(char const*, Thread*)+0x40
V [libjvm.so+0x1026a30] jni_NewStringUTF+0x1e0
C [libjckjvmti.so+0x3ce4c] logWrite+0x5c
C [libjckjvmti.so+0x3cd20] lprintf+0x170
C [libjckjvmti.so+0x485b8] gast00104_agent_proc+0x254
V [libjvm.so+0x1218f0c] JvmtiAgentThread::call_start_function()+0x24c
V [libjvm.so+0x193a8fc] JavaThread::thread_main_inner()+0x32c
V [libjvm.so+0x19418a0] Thread::call_run()+0x160
V [libjvm.so+0x15c9d0c] thread_native_entry(Thread*)+0x18c
C [libpthread.so.0+0x9b48] start_thread+0x108

(Problem could have been there before but without this fatal message.)

The messages are generated by:
tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c

This looks like a race condition. The message changes while the VM creates a String object from it.

Has anybody seen this before? Is it a test problem? I'm not familiar with the lprintf calls in the test.

Best regards,
Martin

From igor.ignatyev at oracle.com Fri Aug 28 18:18:54 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 28 Aug 2020 11:18:54 -0700
Subject: RFR(S) : 8252477 : nsk/share/ArgumentParser should expect that jtreg "splits" an argument
In-Reply-To:
References: <26E3A312-A45E-489F-A5B1-F1E67CBE807A@oracle.com>
Message-ID: <848EB368-B244-4376-91D0-770FF0141477@oracle.com>

Hi David,

good point, parseArguments (or rather checkOption) does indeed validate that passed option is valid and has a valid value, yet for many options all values are treated as valid, so ill-formed command lines like `-debugee.vmkeys="${test.vm.opts} ${test.java.opts} -transport.address=dynamic` won't be spotted by ArgumentParser and its sub-classes, and I'm afraid in some cases might change tests' behavior unnoticeably. thus I've decided to add the check that we always have even number of double quotes:

> diff -r 83f273f313aa test/hotspot/jtreg/vmTestbase/nsk/share/ArgumentParser.java
> --- a/test/hotspot/jtreg/vmTestbase/nsk/share/ArgumentParser.java Thu Aug 27 19:37:51 2020 -0700
> +++ b/test/hotspot/jtreg/vmTestbase/nsk/share/ArgumentParser.java Fri Aug 28 11:16:24 2020 -0700
> @@ -156,6 +156,10 @@
>                  arg.append(" ").append(args[++i]);
>                  doubleQuotes += numberOfDoubleQuotes(args[i]);
>              }
> +            if (doubleQuotes % 2 != 0) {
> +                throw new TestBug("command-line has odd number of double quotes:" + String.join(" ", args));
> +            }
> +
>              list.add(arg.toString());
>          }
>          setRawArguments(list.toArray(String[]::new));

Thanks,
-- Igor

> On Aug 27, 2020, at 9:09 PM, David Holmes wrote:
>
> Hi Igor,
>
> In case there may be a parsing error and the command-line is ill-formed, should you abort if you reach the end of the arg list without finding an even number of double-quotes? Or will parseArguments already handle that?
>
> Otherwise the changes seem good.
> > Thanks, > David > ----- > > On 28/08/2020 12:39 pm, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8252477/webrev.00/ >>> 99 lines changed: 19 ins; 20 del; 60 mod; >> Hi all, >> could you please review the patch which unblocks the rest of 8219140's (get rid of vmTestbase/PropertyResolvingWrapper) sub-tasks? >> background from JBS: >>> jtreg splits command line by space to get the list of arguments and there is no way to prevent that (nor thru escaping, nor by adding quotes). currently, PropertyResolvingWrapper handles that and joins multiple arguments within double quotes into one argument before passing it to the actual test class. the only place where it's needed is in the tests which use nsk/share/ArgumentParser (or more precisely nsk.share.jpda.DebugeeArgumentHandler and nsk/share/jdb/JdbArgumentHandler). >>> >>> in preparation for PropertyResolvingWrapper removal, ArgumentParser should be updated to handle the "split" argument on its own. >> I've also taken the liberty to slightly clean up ArgumentParser. >> JBS: https://bugs.openjdk.java.net/browse/JDK-8252477 >> webrev: http://cr.openjdk.java.net/~iignatyev//8252477/webrev.00/ >> testing: all the tests which use ArgumentParser (:vmTestbase_nsk_aod :vmTestbase_nsk_jdb :vmTestbase_nsk_jdi :vmTestbase_nsk_jdw ,:vmTestbase_nsk_jvmti :vmTestbase_vm_compiler :vmTestbase_vm_mlvm) on {windows,linux,macos}-x64 >> Thanks, >> -- Igor From alexey.menkov at oracle.com Fri Aug 28 20:10:25 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 28 Aug 2020 13:10:25 -0700 Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux In-Reply-To: References: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> Message-ID: Hi Dmitry, Looks good to me. 
2 minor nits (no new webrev required):
- copyright year
- indentation in lines 763 and 766

--alex

On 08/27/2020 08:03, Dmitry Samersoff wrote:
> Hello Everybody,
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.02/
>
> Webrev is updated, all comments accepted and addressed.
>
> Except:
> > The error code from the inet_pton is not checked.
>
> inet_pton performs conversion of the constant value in our case and the
> only possible reason for it to fail is that the system doesn't support
> IPv6 at all.
>
> -Dmitry
>
>
> On 18.08.2020 3:20, serguei.spitsyn at oracle.com wrote:
>> Hi Dmitry,
>>
>> I agree with Alex, it is better to rename compareIPv6Addr to
>> isEqualIPv6Addr.
>>
>> 705 static int compareIPv6Addr(struct addrinfo *ai, struct in6_addr in6Addr)
>> 706 {
>> 707
>> 708 if (ai->ai_addr->sa_family == AF_INET6) {
>> 709 const struct sockaddr_in6 sa = *((struct sockaddr_in6*) ai->ai_addr);
>> 710 return (memcmp(&sa.sin6_addr, &in6Addr, sizeof(in6Addr)) == 0);
>> 711 }
>> 712
>> 713 return 0;
>> 714 }
>>
>> I think, the lines 707 and 712 are not needed.
>>
>> 725 struct in6_addr mappedAny;
>> ...
>> 761 inet_pton(AF_INET6, "::ffff:0.0.0.0", &mappedAny);
>>
>> The error code from the inet_pton is not checked.
>> Also, it can be useful to pre-initialize the mappedAny.
>>
>> 737 // Try to find bind address of preferred address familty first
>> A dot at the end of comment is missed.
>>
>> 745 if (listenAddr == NULL) {
>> 746 // No address of preferred addres family found, grab the first one
>> 747 listenAddr = &(addrInfo[0]);
>> 748 }
>>
>> The indent has to be 4, not 3.
>> The () brackets are not actually needed but I do not object if you
>> keep them.
>> A dot at the end of comment is missed.
>>
>> 755 // Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 connections,
>> 756 // but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow us to serve IPv4
>> 757 // connections only. So make sure, that IN6ADDR_ANY is preferred over
>> 758 // mapped INADDR_ANY if preferredAddressFamily is AF_INET6 or not set
>>
>> I'd suggest to replace "allow us" => "allows" in two places:
>>   "IN6ADDR_ANY allow us" and "(::ffff:0.0.0.0) allow us".
>> Also, it'd better to replace: "So make sure,that" => "Make sure that".
>> A dot at the end of comment is missed.
>>
>> I don't know the network protocols well enough to comment on
>>
>>
>> Thanks,
>> Serguei
>>
>>
>> On 8/17/20 00:21, Dmitry Samersoff wrote:
>>> Hello Everybody,
>>>
>>> Please review the fix:
>>>
>>> https://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.01/
>>>
>>> Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6
>>> connections, but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow
>>> us to serve IPv4 connections only.
>>>
>>> So make sure, that IN6ADDR_ANY is preferred over mapped INADDR_ANY if
>>> preferredAddressFamily is not AF_INET
>>>
>>>
>>> -Dmitry\S
>>>
>>
>

From serguei.spitsyn at oracle.com Fri Aug 28 20:25:46 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 28 Aug 2020 13:25:46 -0700
Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux
In-Reply-To:
References: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net>
Message-ID: <9e3556f7-a187-95a3-6be4-6faf45260234@oracle.com>

Hi Dmitry,

LGTM++

Thanks,
Serguei


On 8/28/20 13:10, Alex Menkov wrote:
> Hi Dmitry,
>
> Looks good to me.
> 2 minor nits (no new webrev required):
> - copyright year
> - indentation in lines 763 and 766
>
> --alex
>
> On 08/27/2020 08:03, Dmitry Samersoff wrote:
>> Hello Everybody,
>>
>> http://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.02/
>>
>> Webrev is updated, all comments accepted and addressed.
>>
>> Except:
>> > The error code from the inet_pton is not checked.
>>
>> inet_pton performs conversion of the constant value in our case and
>> the only possible reason for it to fail is that the system doesn't
>> support IPv6 at all.
>> >> -Dmitry >> >> >> On 18.08.2020 3:20, serguei.spitsyn at oracle.com wrote: >>> Hi Dmitry, >>> >>> I agree with Alex, it is better to rename compareIPv6Addr to >>> isEqualIPv6Addr. >>> >>> 705 static int compareIPv6Addr(struct addrinfo *ai, struct in6_addr >>> in6Addr) >>> 706 { >>> 707 >>> 708 if (ai->ai_addr->sa_family == AF_INET6) { >>> 709 const struct sockaddr_in6 sa = *((struct sockaddr_in6*) >>> ai->ai_addr); >>> 710 return (memcmp(&sa.sin6_addr, &in6Addr, sizeof(in6Addr)) == 0); >>> 711 } >>> 712 >>> 713 return 0; >>> 714 } >>> >>> I think, the lines 707 and 712 are not needed. >>> >>> 725 struct in6_addr mappedAny; ... 761 inet_pton(AF_INET6, >>> "::ffff:0.0.0.0", &mappedAny); >>> >>> The error code from the inet_pton is not checked. >>> Also, it can be useful to pre-initialize the mappedAny. >>> >>> 737 // Try to find bind address of preferred address familty first A >>> dot at the end of comment is missed. >>> >>> 745 if (listenAddr == NULL) { >>> 746 // No address of preferred addres family found, grab the first one >>> 747 listenAddr = &(addrInfo[0]); >>> ? 748???? } >>> >>> The indent has to be 4, not 3. >>> The () brackets are not actually needed but I do not object if you >>> keep them. >>> A dot at the end of comment is missed. >>> >>> 755 // Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 >>> connections, >>> 756 // but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow us to >>> serve IPv4 >>> 757 // connections only. So make sure, that IN6ADDR_ANY is preferred >>> over >>> 758 // mapped INADDR_ANY if preferredAddressFamily is AF_INET6 or >>> not set >>> >>> I'd suggest to replace "allow us" => "allows" in two places: >>> ?? "IN6ADDR_ANY allow us" and "(::ffff:0.0.0.0) allow us". >>> Also, it'd better to replace: "So make sure,that" => "Make sure that". >>> A dot at the end of comment is missed. 
>>> >>> I don't know the network protocols well enough to comment on >>> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 8/17/20 00:21, Dmitry Samersoff wrote: >>>> Hello Everybody, >>>> >>>> Please review the fix: >>>> >>>> https://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.01/ >>>> >>>> Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 >>>> connections, but binding to mapped INADDR_ANY (::ffff:0.0.0.0) >>>> allow us to serve IPv4 connections only. >>>> >>>> So make sure, that IN6ADDR_ANY is preferred over mapped INADDR_ANY >>>> if preferredAddressFamily is not AF_INET >>>> >>>> >>>> -Dmitry\S >>>> >>> >> From patricio.chilano.mateo at oracle.com Fri Aug 28 20:54:23 2020 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Fri, 28 Aug 2020 17:54:23 -0300 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <0707776c-928d-5fde-353d-8bf3d4d419b9@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <70dd740f-9b77-c1ac-0acb-ebdc178abdce@oss.nttdata.com> <0707776c-928d-5fde-353d-8bf3d4d419b9@oss.nttdata.com> Message-ID: <0348ca47-fcc2-45c6-494c-b92e7f2c317d@oracle.com> Hi Yasumasa, On 8/27/20 10:18 PM, Yasumasa Suenaga wrote: > Hi Patricio, > > On 2020/08/27 15:20, Patricio Chilano wrote: >> Hi Yasumasa, >> >> On 8/26/20 8:57 PM, Yasumasa Suenaga wrote: >>> Hi Patricio, >>> >>> Thanks for your review, but webrev.00 has been rotten. >>> Can you review webrev.02? >>> >>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.02/ >>> ??? diff between webrev.00 and webrev.01: >>> http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6 >>> ??? diff between webrev.01 and webrev.02: >>> http://hg.openjdk.java.net/jdk/submit/rev/2ef7feb5681f >> Looks good to me. 
Can JvmtiEventController::set_frame_pop(), >> JvmtiEventController::clear_frame_pop() and >> JvmtiEventController::clear_to_frame_pop() still be called at a >> safepoint? > > No, and also I checked them with > assert(JvmtiThreadState_lock->is_locked()). > webrev is here: > > ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ > ??? diff from previous webrev: > http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 I see that you will change them with assert_lock_strong() as David suggested which I think is good. Thanks, Patricio > Thanks, > > Yasumasa > > >> Thanks, >> Patricio >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/08/27 7:50, Patricio Chilano wrote: >>>> Hi Yasumasa, >>>> >>>> On 8/26/20 4:34 AM, Yasumasa Suenaga wrote: >>>>> Hi Patricio, David, >>>>> >>>>> Thanks for your comment! >>>>> >>>>> I updated webrev which includes the fix which is commented by >>>>> Patricio, and it passed submit repo. So I switch this mail thread >>>>> to RFR. >>>>> >>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>> The changes look good to me, thanks for fixing them. >>>> >>>> Patricio >>>>> I understand David said same concerns as Patricio about active >>>>> handshaker. This webrev checks active handshaker is current thread >>>>> or not. >>>>> >>>>> >>>>> Cheers, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>> >>>>>>> I'm trying to migrate following operations to direct handshake. >>>>>>> >>>>>>> ??? - VM_UpdateForPopTopFrame >>>>>>> ??? - VM_SetFramePop >>>>>>> ??? 
- VM_GetCurrentLocation >>>>>>> >>>>>>> Some operations (VM_GetCurrentLocation and >>>>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I >>>>>>> want to use JavaThread::active_handshaker() in production VM to >>>>>>> detect the process is in direct handshake or not. >>>>>>> >>>>>>> However this function is available in debug VM only, so I want >>>>>>> to hear the reason why it is for debug VM only, and there are no >>>>>>> problem to use it in production VM. Of course another solutions >>>>>>> are welcome. >>>>>> I added the _active_handshaker field to the HandshakeState class >>>>>> when working on 8230594 to adjust some asserts, where instead of >>>>>> checking for the VMThread we needed to check for the active >>>>>> handshaker of the target JavaThread. Since there were no other >>>>>> users of it, there was no point in declaring it and having to >>>>>> write to it for the release bits. There are no issues with having >>>>>> it in production though so you could change that if necessary. >>>>>> >>>>>>> webrev is here. It passed jtreg tests >>>>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>> Some comments on the proposed change. >>>>>> >>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, >>>>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>>>> Why is the check to decide whether to call the handshake or >>>>>> execute the operation with the current thread different for >>>>>> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>> >>>>>> (GetCurrentLocationClosure) >>>>>> if ((Thread::current() == _thread) || >>>>>> (_thread->active_handshaker() != NULL)) { >>>>>>       op.do_thread(_thread); >>>>>> } else { >>>>>>       Handshake::execute_direct(&op, _thread); >>>>>> } >>>>>> >>>>>> vs >>>>>> >>>>>> (EnterInterpOnlyModeClosure) >>>>>> if (target->active_handshaker() != NULL) { >>>>>>      hs.do_thread(target); >>>>>> } else { >>>>>>      
Handshake::execute_direct(&hs, target); >>>>>> } >>>>>> >>>>>> If you change VM_SetFramePop to use handshakes then it seems you >>>>>> could reach JvmtiEventControllerPrivate::enter_interp_only_mode() >>>>>> with the current thread being the target. >>>>>> Also I think you want the second expression of that check to be >>>>>> (target->active_handshaker() == Thread::current()). So either you >>>>>> are the target or the current active_handshaker for that target. >>>>>> Otherwise active_handshaker() could be not NULL because there is >>>>>> another JavaThread handshaking the same target. Unless you are >>>>>> certain that it can never happen, so if active_handshaker() is >>>>>> not NULL it is always the current thread, but even in that case >>>>>> this way is safer. >>>>>> >>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>>> The guarantee() statement exists in release builds too so the >>>>>> "#ifdef ASSERT" directive should be removed, otherwise "current" >>>>>> will not be declared. >>>>>> >>>>>> Thanks! 
>>>>>> >>>>>> Patricio >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>> >>>> >> From suenaga at oss.nttdata.com Sat Aug 29 02:31:44 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sat, 29 Aug 2020 11:31:44 +0900 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <0348ca47-fcc2-45c6-494c-b92e7f2c317d@oracle.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <70dd740f-9b77-c1ac-0acb-ebdc178abdce@oss.nttdata.com> <0707776c-928d-5fde-353d-8bf3d4d419b9@oss.nttdata.com> <0348ca47-fcc2-45c6-494c-b92e7f2c317d@oracle.com> Message-ID: On 2020/08/29 5:54, Patricio Chilano wrote: > Hi Yasumasa, > > On 8/27/20 10:18 PM, Yasumasa Suenaga wrote: >> Hi Patricio, >> >> On 2020/08/27 15:20, Patricio Chilano wrote: >>> Hi Yasumasa, >>> >>> On 8/26/20 8:57 PM, Yasumasa Suenaga wrote: >>>> Hi Patricio, >>>> >>>> Thanks for your review, but webrev.00 has been rotten. >>>> Can you review webrev.02? >>>> >>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.02/ >>>>     diff between webrev.00 and webrev.01: http://hg.openjdk.java.net/jdk/submit/rev/7facd1dd39d6 >>>>     diff between webrev.01 and webrev.02: http://hg.openjdk.java.net/jdk/submit/rev/2ef7feb5681f >>> Looks good to me. Can JvmtiEventController::set_frame_pop(), JvmtiEventController::clear_frame_pop() and JvmtiEventController::clear_to_frame_pop() still be called at a safepoint? >> >> No, and also I checked them with assert(JvmtiThreadState_lock->is_locked()). >> webrev is here: >> >>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ >>     diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 > I see that you will change them with assert_lock_strong() as David suggested which I think is good. Thanks Patricio! I will fix that in next webrev. 
Yasumasa > Thanks, > Patricio >> Thanks, >> >> Yasumasa >> >> >>> Thanks, >>> Patricio >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/08/27 7:50, Patricio Chilano wrote: >>>>> Hi Yasumasa, >>>>> >>>>> On 8/26/20 4:34 AM, Yasumasa Suenaga wrote: >>>>>> Hi Patricio, David, >>>>>> >>>>>> Thanks for your comment! >>>>>> >>>>>> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >>>>>> >>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>>> The changes look good to me, thanks for fixing them. >>>>> >>>>> Patricio >>>>>> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >>>>>> >>>>>> >>>>>> Cheers, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>>> >>>>>>>> I'm trying to migrate following operations to direct handshake. >>>>>>>> >>>>>>>>     - VM_UpdateForPopTopFrame >>>>>>>>     - VM_SetFramePop >>>>>>>>     - VM_GetCurrentLocation >>>>>>>> >>>>>>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>>>>>> >>>>>>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. 
>>>>>>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>>>>>> >>>>>>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>>> Some comments on the proposed change. >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>>> >>>>>>> (GetCurrentLocationClosure) >>>>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>>>>>>       op.do_thread(_thread); >>>>>>> } else { >>>>>>>       Handshake::execute_direct(&op, _thread); >>>>>>> } >>>>>>> >>>>>>> vs >>>>>>> >>>>>>> (EnterInterpOnlyModeClosure) >>>>>>> if (target->active_handshaker() != NULL) { >>>>>>>      hs.do_thread(target); >>>>>>> } else { >>>>>>>      Handshake::execute_direct(&hs, target); >>>>>>> } >>>>>>> >>>>>>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>>>>>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. 
Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>>>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> Patricio >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>> >>>>> >>> > From igor.ignatyev at oracle.com Sat Aug 29 03:52:19 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 28 Aug 2020 20:52:19 -0700 Subject: RFR(T) : 8252532 : use Utils.TEST_NATIVE_PATH instead of System.getProperty("test.nativepath") Message-ID: http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 > 145 lines changed: 28 ins; 22 del; 95 mod; Hi all, could you please review this trivial clean up which replaces System.getProperty("test.nativepath") w/ Utils.TEST_NATIVE_PATH where appropriate? while updating these files, I've also cleaned them up a bit, removed unneeded imports, added/removed spaces, etc testing: runtime, serviceability and vmTestbase/nsk/jvmti/ tests on {linux,windows,macos}-x64 JBS: https://bugs.openjdk.java.net/browse/JDK-8252532 webrev: http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 Thanks, -- Igor From dms at samersoff.net Sat Aug 29 08:08:44 2020 From: dms at samersoff.net (Dmitry Samersoff) Date: Sat, 29 Aug 2020 11:08:44 +0300 Subject: RFR (S): JDK-8250630 JdwpListenTest.java fails on Alpine Linux In-Reply-To: References: <3be5b9aa-2d4d-0806-24f3-8207de1baf35@samersoff.net> Message-ID: Hello Alex, Thank you. Nits fixed in-place. -Dmitry On 28.08.2020 23:10, Alex Menkov wrote: > Hi Dmitry, > > Looks good to me. 
> 2 minor nits (no new webrev required): > - copyright year > - indentation in lines 763 and 766 > > --alex > > On 08/27/2020 08:03, Dmitry Samersoff wrote: >> Hello Everybody, >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.02/ >> >> Webrev is updated, all comments accepted and addressed. >> >> Except: >>  > The error code from the inet_pton is not checked. >> >> inet_pton performs conversion of the constant value in our case and >> the only possible reason for it to fail is that the system doesn't >> support IPv6 at all. >> >> -Dmitry >> >> >> On 18.08.2020 3:20, serguei.spitsyn at oracle.com wrote: >>> Hi Dmitry, >>> >>> I agree with Alex, it is better to rename compareIPv6Addr to >>> isEqualIPv6Addr. >>> >>> 705 static int compareIPv6Addr(struct addrinfo *ai, struct in6_addr >>> in6Addr) >>> 706 { >>> 707 >>> 708 if (ai->ai_addr->sa_family == AF_INET6) { >>> 709 const struct sockaddr_in6 sa = *((struct sockaddr_in6*) >>> ai->ai_addr); >>> 710 return (memcmp(&sa.sin6_addr, &in6Addr, sizeof(in6Addr)) == 0); >>> 711 } >>> 712 >>> 713 return 0; >>> 714 } >>> >>> I think, the lines 707 and 712 are not needed. >>> >>> 725 struct in6_addr mappedAny; ... 761 inet_pton(AF_INET6, >>> "::ffff:0.0.0.0", &mappedAny); >>> >>> The error code from the inet_pton is not checked. >>> Also, it can be useful to pre-initialize the mappedAny. >>> >>> 737 // Try to find bind address of preferred address familty first A >>> dot at the end of comment is missed. >>> >>> 745 if (listenAddr == NULL) { >>> 746 // No address of preferred addres family found, grab the first one >>> 747 listenAddr = &(addrInfo[0]); >>>   748     } >>> >>> The indent has to be 4, not 3. >>> The () brackets are not actually needed but I do not object if you >>> keep them. >>> A dot at the end of comment is missed. 
>>> >>> 755 // Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 >>> connections, >>> 756 // but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow us to >>> serve IPv4 >>> 757 // connections only. So make sure, that IN6ADDR_ANY is preferred >>> over >>> 758 // mapped INADDR_ANY if preferredAddressFamily is AF_INET6 or not >>> set >>> >>> I'd suggest to replace "allow us" => "allows" in two places: >>>    "IN6ADDR_ANY allow us" and "(::ffff:0.0.0.0) allow us". >>> Also, it'd better to replace: "So make sure,that" => "Make sure that". >>> A dot at the end of comment is missed. >>> >>> I don't know the network protocols well enough to comment on >>> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 8/17/20 00:21, Dmitry Samersoff wrote: >>>> Hello Everybody, >>>> >>>> Please review the fix: >>>> >>>> https://cr.openjdk.java.net/~dsamersoff/JDK-8250630/webrev.01/ >>>> >>>> Binding to IN6ADDR_ANY allow us to serve both IPv4 and IPv6 >>>> connections, but binding to mapped INADDR_ANY (::ffff:0.0.0.0) allow >>>> us to serve IPv4 connections only. >>>> >>>> So make sure, that IN6ADDR_ANY is preferred over mapped INADDR_ANY >>>> if preferredAddressFamily is not AF_INET >>>> >>>> >>>> -Dmitry\S >>>> >>> >> From daniel.daugherty at oracle.com Sun Aug 30 16:18:26 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Sun, 30 Aug 2020 12:18:26 -0400 Subject: URGENT RFR(T): 8252551: JDK-8250630 causes build error on Win* Message-ID: <45f93bc0-81db-7395-ea44-84208f39e9b6@oracle.com> Greetings, I have a trivial fix for the following build breaking bug:     JDK-8252551 JDK-8250630 causes build error on Win*     
https://bugs.openjdk.java.net/browse/JDK-8252551 Here's the context diffs for the fix: $ hg diff diff -r d7707af10c98 src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c --- a/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c Sun Aug 30 15:48:16 2020 +0300 +++ b/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c Sun Aug 30 11:37:39 2020 -0400 @@ -716,7 +716,6 @@                                 char** actualAddress)  {      int err; -    int pass;      struct addrinfo *addrInfo = NULL;      struct addrinfo *listenAddr = NULL;      struct addrinfo *ai = NULL; The fix is being tested by a Mach5 Tier3 job set. Thanks, in advance, for any comments, questions or suggestions. Dan From dms at samersoff.net Sun Aug 30 16:29:59 2020 From: dms at samersoff.net (Dmitry Samersoff) Date: Sun, 30 Aug 2020 19:29:59 +0300 Subject: URGENT RFR(T): 8252551: JDK-8250630 causes build error on Win* In-Reply-To: <45f93bc0-81db-7395-ea44-84208f39e9b6@oracle.com> References: <45f93bc0-81db-7395-ea44-84208f39e9b6@oracle.com> Message-ID: <2595895b-6f68-04ea-7bfd-32ba04d413b5@samersoff.net> Hello Dan, Looks good to me. Thank you for fixing it. -Dmitry On 8/30/2020 7:18 PM, Daniel D. Daugherty wrote: > Greetings, > > I have a trivial fix for the following build breaking bug: > >     JDK-8252551 JDK-8250630 causes build error on Win* >     https://bugs.openjdk.java.net/browse/JDK-8252551 > > Here's the context diffs for the fix: > > $ hg diff > diff -r d7707af10c98 > src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c > --- a/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c Sun > Aug 30 15:48:16 2020 +0300 > +++ b/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c Sun > Aug 30 11:37:39 2020 -0400 > @@ -716,7 +716,6 @@ >                                 char** actualAddress) >  { >      int err; > -    int pass; >      struct addrinfo *addrInfo = NULL; >      struct addrinfo *listenAddr = NULL; >      
struct addrinfo *ai = NULL; > > > The fix is being tested by a Mach5 Tier3 job set. > > Thanks, in advance, for any comments, questions or suggestions. > > Dan > From daniel.daugherty at oracle.com Sun Aug 30 16:46:40 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Sun, 30 Aug 2020 12:46:40 -0400 Subject: URGENT RFR(T): 8252551: JDK-8250630 causes build error on Win* In-Reply-To: <2595895b-6f68-04ea-7bfd-32ba04d413b5@samersoff.net> References: <45f93bc0-81db-7395-ea44-84208f39e9b6@oracle.com> <2595895b-6f68-04ea-7bfd-32ba04d413b5@samersoff.net> Message-ID: <0dc3c267-f2b6-08ba-65b9-b7d81e658cc0@oracle.com> Thanks for the fast review! Pushed. Dan On 8/30/20 12:29 PM, Dmitry Samersoff wrote: > Hello Dan, > > Looks good to me. > > Thank you for fixing it. > > -Dmitry > > On 8/30/2020 7:18 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a trivial fix for the following build breaking bug: >> >>      JDK-8252551 JDK-8250630 causes build error on Win* >>      https://bugs.openjdk.java.net/browse/JDK-8252551 >> >> Here's the context diffs for the fix: >> >> $ hg diff >> diff -r d7707af10c98 >> src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c >> --- a/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c >> Sun Aug 30 15:48:16 2020 +0300 >> +++ b/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c >> Sun Aug 30 11:37:39 2020 -0400 >> @@ -716,7 +716,6 @@ >>                                  char** actualAddress) >>   { >>       int err; >> -    int pass; >>       struct addrinfo *addrInfo = NULL; >>       struct addrinfo *listenAddr = NULL; >>       struct addrinfo *ai = NULL; >> >> >> The fix is being tested by a Mach5 Tier3 job set. >> >> Thanks, in advance, for any comments, questions or suggestions. 
>> >> Dan >> > From david.holmes at oracle.com Mon Aug 31 02:33:43 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 31 Aug 2020 12:33:43 +1000 Subject: Fatal errors when running JCK tests with JDK15/16 debug build In-Reply-To: References: Message-ID: Hi Martin, On 29/08/2020 3:53 am, Doerr, Martin wrote: > Hi, > > we have seen the following fatal error more than 50 times since > 2020-05-25 in various JCK tests vm/jvmti. > > fatal error: String conversion failure: [check] ExitLock destroyed > > -->    [check] ExitLock exited > > (followed by garbage output) > > 8166358: Re-enable String verification in > java_lang_String::create_from_str() > > was pushed at that date which introduced the call to fatal. > > Stack (example from linuxppc64le, but also observed on x86 and aarch64): > V  [libjvm.so+0xee242c]  java_lang_String::create_from_str(char const*, > Thread*) [clone .part.158]+0x51c > V  [libjvm.so+0xee2530]  java_lang_String::create_oop_from_str(char > const*, Thread*)+0x40 > V  [libjvm.so+0x1026a30]  jni_NewStringUTF+0x1e0 > C  [libjckjvmti.so+0x3ce4c]  logWrite+0x5c > C  [libjckjvmti.so+0x3cd20]  lprintf+0x170 > C  [libjckjvmti.so+0x485b8]  gast00104_agent_proc+0x254 > V  [libjvm.so+0x1218f0c]  JvmtiAgentThread::call_start_function()+0x24c > V  [libjvm.so+0x193a8fc]  JavaThread::thread_main_inner()+0x32c > V  [libjvm.so+0x19418a0]  Thread::call_run()+0x160 > V  [libjvm.so+0x15c9d0c]  thread_native_entry(Thread*)+0x18c > C  [libpthread.so.0+0x9b48]  start_thread+0x108 > > (Problem could have been there before but without this fatal message.) > > The messages are generated by: > > tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c > > This looks like a race condition. The message changes while the VM > creates a String object from it. Has anybody seen this before? No but ... > Is it a test problem? I'm not familiar with the lprintf calls in the test. ... 
the lprintf is part of the JCK support library (support.c if you have access to sources) and it uses a static buffer for the log messages and so it is not thread-safe. This test creates a thread and both it and the main thread call lprintf concurrently. So this is a JCK test/test-library bug that appears to be exposed by the changes made in 8166358. Cheers, David ----- > Best regards, > > Martin > From david.holmes at oracle.com Mon Aug 31 04:18:18 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 31 Aug 2020 14:18:18 +1000 Subject: RFR(S) : 8252477 : nsk/share/ArgumentParser should expect that jtreg "splits" an argument In-Reply-To: <848EB368-B244-4376-91D0-770FF0141477@oracle.com> References: <26E3A312-A45E-489F-A5B1-F1E67CBE807A@oracle.com> <848EB368-B244-4376-91D0-770FF0141477@oracle.com> Message-ID: Hi Igor, Update looks good. Thanks, David On 29/08/2020 4:18 am, Igor Ignatyev wrote: > Hi David, > > good point, parseArguments (or rather checkOption) does indeed validate that passed option is valid and has a valid value, yet for many options all values are treated as valid, so ill-formed command lines like `-debugee.vmkeys="${test.vm.opts} ${test.java.opts} -transport.address=dynamic` won't be spotted by ArgumentParser and its sub-classes, and I'm afraid in some cases might change tests' behavior unnoticeably. 
thus I've decided to add the check that we always have even number of double quotes: > >> diff -r 83f273f313aa test/hotspot/jtreg/vmTestbase/nsk/share/ArgumentParser.java >> --- a/test/hotspot/jtreg/vmTestbase/nsk/share/ArgumentParser.java Thu Aug 27 19:37:51 2020 -0700 >> +++ b/test/hotspot/jtreg/vmTestbase/nsk/share/ArgumentParser.java Fri Aug 28 11:16:24 2020 -0700 >> @@ -156,6 +156,10 @@ >> arg.append(" ").append(args[++i]); >> doubleQuotes += numberOfDoubleQuotes(args[i]); >> } >> + if (doubleQuotes % 2 != 0) { >> + throw new TestBug("command-line has odd number of double quotes:" + String.join(" ", args)); >> + } >> + >> list.add(arg.toString()); >> } >> setRawArguments(list.toArray(String[]::new)); > > > Thanks, > -- Igor > > >> On Aug 27, 2020, at 9:09 PM, David Holmes wrote: >> >> Hi Igor, >> >> In case there may be a parsing error and the command-line is ill-formed, should you abort if you reach the end of the arg list without finding an even number of double-quotes? Or will parseArguments already handle that? >> >> Otherwise the changes seem good. >> >> Thanks, >> David >> ----- >> >> On 28/08/2020 12:39 pm, Igor Ignatyev wrote: >>> http://cr.openjdk.java.net/~iignatyev//8252477/webrev.00/ >>>> 99 lines changed: 19 ins; 20 del; 60 mod; >>> Hi all, >>> could you please review the patch which unblocks the rest of 8219140's (get rid of vmTestbase/PropertyResolvingWrapper) sub-tasks? >>> background from JBS: >>>> jtreg splits command line by space to get the list of arguments and there is no way to prevent that (nor thru escaping, nor by adding quotes). currently, PropertyResolvingWrapper handles that and joins multiple arguments within double quotes into one argument before passing it to the actual test class. the only place where it's needed is in the tests which use nsk/share/ArgumentParser (or more precisely nsk.share.jpda.DebugeeArgumentHandler and nsk/share/jdb/JdbArgumentHandler). 
>>>> >>>> in preparation for PropertyResolvingWrapper removal, ArgumentParser should be updated to handle the "split" argument on its own. >>> I've also taken the liberty to slightly clean up ArgumentParser. >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252477 >>> webrev: http://cr.openjdk.java.net/~iignatyev//8252477/webrev.00/ >>> testing: all the tests which use ArgumentParser (:vmTestbase_nsk_aod :vmTestbase_nsk_jdb :vmTestbase_nsk_jdi :vmTestbase_nsk_jdw ,:vmTestbase_nsk_jvmti :vmTestbase_vm_compiler :vmTestbase_vm_mlvm) on {windows,linux,macos}-x64 >>> Thanks, >>> -- Igor > From david.holmes at oracle.com Mon Aug 31 04:53:09 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 31 Aug 2020 14:53:09 +1000 Subject: RFR(T) : 8252532 : use Utils.TEST_NATIVE_PATH instead of System.getProperty("test.nativepath") In-Reply-To: References: Message-ID: <93076ed6-f112-dd27-15d3-13f67cdf5de0@oracle.com> Hi Igor, On 29/08/2020 1:52 pm, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 >> 145 lines changed: 28 ins; 22 del; 95 mod; > > > Hi all, > > could you please review this trivial clean up which replaces System.getProperty("test.nativepath") w/ Utils.TEST_NATIVE_PATH where appropriate? > > while updating these files, I've also cleaned them up a bit, removed unneeded imports, added/removed spaces, etc > > testing: runtime, serviceability and vmTestbase/nsk/jvmti/ tests on {linux,windows,macos}-x64 > JBS: https://bugs.openjdk.java.net/browse/JDK-8252532 > webrev: http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 Generally seems fine (though the fact the patch file contained a series of changesets threw me initially!) test/hotspot/jtreg/runtime/signal/SigTestDriver.java // add test specific arguments w/o signame cmd.addAll(Arrays.asList(args) - .subList(1, args.length)); + .subList(1, args.length)); Your changed line doesn't have the right indent. 
Can this just be put on one line anyway: // add test specific arguments w/o signame cmd.addAll(Arrays.asList(args).subList(1, args.length)); that seems better to me as the fact there is only one argument seems clearer. Though for greater clarity perhaps: // add test specific arguments w/o signame var argList = Arrays.asList(args).subList(1, args.length); cmd.addAll(argList); -- + Arrays.stream(Utils.JAVA_OPTIONS.split(" "))) + .filter(s -> !s.isEmpty()) + .filter(s -> s.startsWith("-X")) + .flatMap(arg -> Stream.of("-vmopt", arg)) + .collect(Collectors.toList()); The preferred/common style for chained stream operations is to align the dots: Arrays.stream(Utils.JAVA_OPTIONS.split(" "))) .filter(s -> !s.isEmpty()) .filter(s -> s.startsWith("-X")) .flatMap(arg -> Stream.of("-vmopt", arg)) .collect(Collectors.toList()); --- test/lib/jdk/test/lib/process/ProcessTools.java - System.out.println("\t" + t + - " stack: (length = " + stack.length + ")"); + System.out.println("\t" + t + + " stack: (length = " + stack.length + ")"); The original code is more stylistically correct - when breaking arguments across lines the indent should align with the start of the arguments. Similarly here: + return String.format("--- ProcessLog ---%n" + + "cmd: %s%n" + + "exitvalue: %s%n" + + "stderr: %s%n" + + "stdout: %s%n", + getCommandLine(pb), exitValue, stderr, stdout); should be: + return String.format("--- ProcessLog ---%n" + + "cmd: %s%n" + + "exitvalue: %s%n" + + "stderr: %s%n" + + "stdout: %s%n", + getCommandLine(pb), exitValue, stderr, stdout); and here: + String executable = Paths.get(Utils.TEST_NATIVE_PATH, executableName) + .toAbsolutePath() + .toString(); indentation again. 
Thanks, David ----- > Thanks, > -- Igor > From david.holmes at oracle.com Mon Aug 31 05:43:35 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 31 Aug 2020 15:43:35 +1000 Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes In-Reply-To: <3aeaff81-f6a3-e4c0-b65c-835785b7ebe2@oss.nttdata.com> References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> <92f6ff81-2a9a-e2d9-6e07-f19c97b63abd@oracle.com> <781a164d-21e1-039a-afa9-77be25ba3809@oss.nttdata.com> <6dd8b890-5e1f-2e6d-826c-96d439eb65c4@oracle.com> <3aeaff81-f6a3-e4c0-b65c-835785b7ebe2@oss.nttdata.com> Message-ID: <626efb8a-2285-fc2e-6170-3d8bf406181d@oracle.com> Hi Yasumasa, On 28/08/2020 1:01 pm, Yasumasa Suenaga wrote: > Hi David, > > On 2020/08/28 11:04, David Holmes wrote: >> Hi Yasumasa, >> >> On 28/08/2020 11:24 am, Yasumasa Suenaga wrote: >>> Hi David, >>> >>> On 2020/08/27 15:49, David Holmes wrote: >>>> Sorry I just realized I reviewed version 00 :( >> >> Note that my comments on version 00 in my earlier email still apply. > > I copied here your comment on webrev.00: > >>>>> I see. It is a pity that we have now lost that critical indicator >>>>> that shows how this operation can be nested within another >>>>> operation. The possibility of nesting is even more obscure with >>>>> JvmtiEnvThreadState::reset_current_location. And the fact it is now >>>>> up to the caller to handle that case explicitly raises some concern >>>>> - what will happen if you call execute_direct whilst already in a >>>>> handshake with the target thread? > > I heard deadlock would be happen if execute_direct() calls in direct > handshake. Thus we need to use active_handshaker() in this change. Okay. This is something we need to clarify with direct handshake usage information. 
I think it would be preferable if this was handled in execute_direct rather than the caller ... though it may also be the case that we need the writer of the handshake operation to give due consideration to nesting ... > >>>>> Further comments: >>>>> >>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >>>>> >>>>>   194 #ifdef ASSERT >>>>>   195    Thread *current = Thread::current(); >>>>>   196 #endif >>>>>   197    assert(get_thread() == current || current == >>>>> get_thread()->active_handshaker(), >>>>>   198           "frame pop data only accessible from same thread or >>>>> direct handshake"); >>>>> >>>>> Can you factor this out into a separate function so that it is not >>>>> repeated so often. Seems to me that there should be a global >>>>> function on Thread: assert_current_thread_or_handshaker()? [yes >>>>> unpleasant name but ...] that will allow us to stop repeating this >>>>> code fragment across numerous files. A follow up RFE for that would >>>>> be okay too (I see some guarantees that should probably just be >>>>> asserts so they need a bit more checking). > > I filed it as another RFE: >   https://bugs.openjdk.java.net/browse/JDK-8252479 Thanks. > >>>>>   331         Handshake::execute_direct(&op, _thread); >>>>> >>>>> You aren't checking the return value of execute_direct, but I can't >>>>> tell where _thread was checked for still being alive ?? >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>>> >>>>>   340      Handshake::execute_direct(&hs, target); >>>>> >>>>> I know this is existing code but I have the same query as above - >>>>> no return value check and no clear check that the JavaThread is >>>>> still alive? > > Existing code seems to assume that target thread is alive, frame > operations (e.g. PopFrame()) should be performed on live thread. And > also existing code would not set any JVMTI error and cannot propagate it > to caller. So I do not add the check for thread state. Okay. 
But note that for PopFrame the tests for isAlive and is-suspended have already been performed before we do the execute_direct; so in that case we could simply assert that execute_direct returns true. Similarly for other cases. >>>>> Do we know if the existing tests actually test the nested cases? > > I saw some error with assertion for JvmtiThreadState_lock and safepoint > in vmTestbase at first, so I guess nested call would be tested, but I'm > not sure. > > >>>> I have concerns with the added locking: >>>> >>>> MutexLocker mu(JvmtiThreadState_lock); >>>> >>>> Who else may be holding that lock? Could it be our target thread >>>> that we have already initiated a handshake with? (The lock ranking >>>> checks related to safepoints don't help us detect deadlocks between >>>> a target thread and its handshaker. :( ) >>> >>> I checked source code again, then I couldn't find the point that >>> target thread already locked JvmtiThreadState_lock at direct handshake. >> >> I'm very unclear exactly what state this lock guards and under what >> conditions. But looking at: >> >> src/hotspot/share/prims/jvmtiEnv.cpp >> >> Surely the lock is only needed in the direct-handshake case and not >> when operating on the current thread? Or is it there because you've >> removed the locking from the lower-level JvmtiEventController methods >> and so now you need to take the lock higher-up the call chain? (I find >> it hard to follow the call chains in the JVMTI code.) > > We need to take the lock higher-up the call chain. It is suggested by > Robbin, and works fine. Okay. It seems reasonably safe in this context as there is little additional work done while holding the lock. > >>>> It is far from clear now which functions are reachable from >>>> handshakes, which from safepoint VM_ops and which from both. >>>> >>>> !   
assert(SafepointSynchronize::is_at_safepoint() || >>>> JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); >>>> >>>> This can be written as: >>>> >>>> assert_locked_or_safepoint(JvmtiThreadState_lock); >>>> >>>> or possibly the weak variant of that. ('m puzzled by the extra check >>>> in the strong version ... I think it is intended for the case of the >>>> VMThread executing a non-safepoint VMop.) >>> >>>> JvmtiEventController::set_frame_pop(), >>>> JvmtiEventController::clear_frame_pop() and >>>> JvmtiEventController::clear_to_frame_pop() are no longer called at >>>> safepoint, so I remove safepoint check from assert() in new webrev. >> >> You should use assert_lock_strong for this. > > I will do that. Thanks, David ----- > > Thanks, > > Yasumasa > > >> Thanks, >> David >> >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ >>> ???? diff from previous webrev: >>> http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>> >>>> On 27/08/2020 4:34 pm, David Holmes wrote: >>>>> Hi Yasumasa, >>>>> >>>>> On 27/08/2020 9:40 am, Yasumasa Suenaga wrote: >>>>>> Hi David, >>>>>> >>>>>> On 2020/08/27 8:09, David Holmes wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >>>>>>>> Hi Patricio, David, >>>>>>>> >>>>>>>> Thanks for your comment! >>>>>>>> >>>>>>>> I updated webrev which includes the fix which is commented by >>>>>>>> Patricio, and it passed submit repo. So I switch this mail >>>>>>>> thread to RFR. >>>>>>>> >>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>>>>> ?? webrev: >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>>>>>> >>>>>>>> I understand David said same concerns as Patricio about active >>>>>>>> handshaker. This webrev checks active handshaker is current >>>>>>>> thread or not. 
>>>>>>> >>>>>>> How can the current thread already be in a handshake with the >>>>>>> target when you execute this code? >>>>>> >>>>>> EnterInterpOnlyModeClosure might be called in handshake with >>>>>> UpdateForPopTopFrameClosure or with SetFramePopClosure. >>>>>> >>>>>> EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an >>>>>> alternative in VM_EnterInterpOnlyMode. >>>>>> VM_EnterInterpOnlyMode returned true in >>>>>> allow_nested_vm_operations(). Originally, it could have been >>>>>> called from other VM operations. >>>>> >>>>> I see. It is a pity that we have now lost that critical indicator >>>>> that shows how this operation can be nested within another >>>>> operation. The possibility of nesting is even more obscure with >>>>> JvmtiEnvThreadState::reset_current_location. And the fact it is now >>>>> up to the caller to handle that case explicitly raises some concern >>>>> - what will happen if you call execute_direct whilst already in a >>>>> handshake with the target thread? >>>>> >>>>> I can't help but feel that we need a more rigorous and automated >>>>> way of dealing with nesting ... perhaps we don't even need to care >>>>> and handshakes should always allow nested handshake requests? >>>>> (Question more for Robbin and Patricio.) >>>>> >>>>> Further comments: >>>>> >>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >>>>> >>>>> ??194 #ifdef ASSERT >>>>> ??195?? Thread *current = Thread::current(); >>>>> ??196 #endif >>>>> ??197?? assert(get_thread() == current || current == >>>>> get_thread()->active_handshaker(), >>>>> ??198????????? "frame pop data only accessible from same thread or >>>>> direct handshake"); >>>>> >>>>> Can you factor this out into a separate function so that it is not >>>>> repeated so often. Seems to me that there should be a global >>>>> function on Thread: assert_current_thread_or_handshaker()? [yes >>>>> unpleasant name but ...] that will allow us to stop repeating this >>>>> code fragment across numerous files. 
A follow up RFE for that would >>>>> be okay too (I see some guarantees that should probably just be >>>>> asserts so they need a bit more checking). >>>>> >>>>> ??331???????? Handshake::execute_direct(&op, _thread); >>>>> >>>>> You aren't checking the return value of execute_direct, but I can't >>>>> tell where _thread was checked for still being alive ?? >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>>> >>>>> ??340???? Handshake::execute_direct(&hs, target); >>>>> >>>>> I know this is existing code but I have the same query as above - >>>>> no return value check and no clear check that the JavaThread is >>>>> still alive? >>>>> >>>>> --- >>>>> >>>>> Do we know if the existing tests actually test the nested cases? >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> >>>>>>>> Cheers, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>>>>> >>>>>>>>>> I'm trying to migrate following operations to direct handshake. >>>>>>>>>> >>>>>>>>>> ??? - VM_UpdateForPopTopFrame >>>>>>>>>> ??? - VM_SetFramePop >>>>>>>>>> ??? - VM_GetCurrentLocation >>>>>>>>>> >>>>>>>>>> Some operations (VM_GetCurrentLocation and >>>>>>>>>> EnterInterpOnlyModeClosure) might be called at safepoint, so I >>>>>>>>>> want to use JavaThread::active_handshaker() in production VM >>>>>>>>>> to detect the process is in direct handshake or not. >>>>>>>>>> >>>>>>>>>> However this function is available in debug VM only, so I want >>>>>>>>>> to hear the reason why it is for debug VM only, and there are >>>>>>>>>> no problem to use it in production VM. Of course another >>>>>>>>>> solutions are welcome. 
>>>>>>>>> I added the _active_handshaker field to the HandshakeState >>>>>>>>> class when working on 8230594 to adjust some asserts, where >>>>>>>>> instead of checking for the VMThread we needed to check for the >>>>>>>>> active handshaker of the target JavaThread. Since there were no >>>>>>>>> other users of it, there was no point in declaring it and >>>>>>>>> having to write to it for the release bits. There are no issues >>>>>>>>> with having it in production though so you could change that if >>>>>>>>> necessary. >>>>>>>>> >>>>>>>>>> webrev is here. It passed jtreg tests >>>>>>>>>> (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>>>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>>>>> Some comments on the proposed change. >>>>>>>>> >>>>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, >>>>>>>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>>>> Why is the check to decide whether to call the handshake or >>>>>>>>> execute the operation with the current thread different for >>>>>>>>> GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>>>>> >>>>>>>>> (GetCurrentLocationClosure) >>>>>>>>> if ((Thread::current() == _thread) || >>>>>>>>> (_thread->active_handshaker() != NULL)) { >>>>>>>>> ????? op.do_thread(_thread); >>>>>>>>> } else { >>>>>>>>> ????? Handshake::execute_direct(&op, _thread); >>>>>>>>> } >>>>>>>>> >>>>>>>>> vs >>>>>>>>> >>>>>>>>> (EnterInterpOnlyModeClosure) >>>>>>>>> if (target->active_handshaker() != NULL) { >>>>>>>>> ???? hs.do_thread(target); >>>>>>>>> } else { >>>>>>>>> ???? Handshake::execute_direct(&hs, target); >>>>>>>>> } >>>>>>>>> >>>>>>>>> If you change VM_SetFramePop to use handshakes then it seems >>>>>>>>> you could reach >>>>>>>>> JvmtiEventControllerPrivate::enter_interp_only_mode() with the >>>>>>>>> current thread being the target. >>>>>>>>> Also I think you want the second expression of that check to be >>>>>>>>> (target->active_handshaker() == Thread::current()). 
So either
>>>>>>>>> you are the target or the current active_handshaker for that
>>>>>>>>> target. Otherwise active_handshaker() could be not NULL because
>>>>>>>>> there is another JavaThread handshaking the same target. Unless
>>>>>>>>> you are certain that it can never happen, so if
>>>>>>>>> active_handshaker() is not NULL it is always the current
>>>>>>>>> thread, but even in that case this way is safer.
>>>>>>>>>
>>>>>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp
>>>>>>>>> The guarantee() statement exists in release builds too so the
>>>>>>>>> "#ifdef ASSERT" directive should be removed, otherwise
>>>>>>>>> "current" will not be declared.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Patricio
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>

From suenaga at oss.nttdata.com Mon Aug 31 06:22:03 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 31 Aug 2020 15:22:03 +0900
Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To: <626efb8a-2285-fc2e-6170-3d8bf406181d@oracle.com>
References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> <92f6ff81-2a9a-e2d9-6e07-f19c97b63abd@oracle.com> <781a164d-21e1-039a-afa9-77be25ba3809@oss.nttdata.com> <6dd8b890-5e1f-2e6d-826c-96d439eb65c4@oracle.com> <3aeaff81-f6a3-e4c0-b65c-835785b7ebe2@oss.nttdata.com> <626efb8a-2285-fc2e-6170-3d8bf406181d@oracle.com>
Message-ID: <1d9e2af5-0f0a-0735-0cc6-41af50262928@oss.nttdata.com>

Hi David,

On 2020/08/31 14:43, David Holmes wrote:
> Hi Yasumasa,
>
> On 28/08/2020 1:01 pm, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> On 2020/08/28 11:04, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> On 28/08/2020 11:24 am, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>> On 2020/08/27 15:49, David Holmes wrote:
>>>>> Sorry I just realized I reviewed version 00 :(
>>>
>>> Note that my comments on version 00 in my earlier email still apply.
>>
>> I copied here your comment on webrev.00:
>>
>>>>>> I see. It is a pity that we have now lost that critical indicator that shows how this operation can be nested within another operation. The possibility of nesting is even more obscure with JvmtiEnvThreadState::reset_current_location. And the fact it is now up to the caller to handle that case explicitly raises some concern - what will happen if you call execute_direct whilst already in a handshake with the target thread?
>>
>> I heard deadlock would be happen if execute_direct() calls in direct handshake. Thus we need to use active_handshaker() in this change.
>
> Okay. This is something we need to clarify with direct handshake usage information. I think it would be preferable if this was handled in execute_direct rather than the caller ... though it may also be the case that we need the writer of the handshake operation to give due consideration to nesting ...

Agree, I also prefer to check whether caller is in direct handshake in execute_direct().
But I think this is another enhancement because we need to change the behavior of execute_direct().


>>>>>> Further comments:
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp
>>>>>>
>>>>>>   194 #ifdef ASSERT
>>>>>>   195   Thread *current = Thread::current();
>>>>>>   196 #endif
>>>>>>   197   assert(get_thread() == current || current == get_thread()->active_handshaker(),
>>>>>>   198          "frame pop data only accessible from same thread or direct handshake");
>>>>>>
>>>>>> Can you factor this out into a separate function so that it is not repeated so often. Seems to me that there should be a global function on Thread: assert_current_thread_or_handshaker()? [yes unpleasant name but ...] that will allow us to stop repeating this code fragment across numerous files. A follow up RFE for that would be okay too (I see some guarantees that should probably just be asserts so they need a bit more checking).
>>
>> I filed it as another RFE:
>>    https://bugs.openjdk.java.net/browse/JDK-8252479
>
> Thanks.
>
>>
>>>>>>   331         Handshake::execute_direct(&op, _thread);
>>>>>>
>>>>>> You aren't checking the return value of execute_direct, but I can't tell where _thread was checked for still being alive ??
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiEventController.cpp
>>>>>>
>>>>>>   340     Handshake::execute_direct(&hs, target);
>>>>>>
>>>>>> I know this is existing code but I have the same query as above - no return value check and no clear check that the JavaThread is still alive?
>>
>> Existing code seems to assume that target thread is alive, frame operations (e.g. PopFrame()) should be performed on live thread. And also existing code would not set any JVMTI error and cannot propagate it to caller. So I do not add the check for thread state.
>
> Okay. But note that for PopFrame the tests for isAlive and is-suspended have already been performed before we do the execute_direct; so in that case we could simply assert that execute_direct returns true. Similarly for other cases.

Ok, I will change as following in next webrev:

```
bool result = Handshake::execute_direct(&hs, target);
guarantee(result, "Direct handshake failed. Target thread is still alive?");
```


Thanks,

Yasumasa


>>>>>> Do we know if the existing tests actually test the nested cases?
>>
>> I saw some error with assertion for JvmtiThreadState_lock and safepoint in vmTestbase at first, so I guess nested call would be tested, but I'm not sure.
>>
>>
>>>>> I have concerns with the added locking:
>>>>>
>>>>> MutexLocker mu(JvmtiThreadState_lock);
>>>>>
>>>>> Who else may be holding that lock? Could it be our target thread that we have already initiated a handshake with?
(The lock ranking checks related to safepoints don't help us detect deadlocks between a target thread and its handshaker. :( ) >>>> >>>> I checked source code again, then I couldn't find the point that target thread already locked JvmtiThreadState_lock at direct handshake. >>> >>> I'm very unclear exactly what state this lock guards and under what conditions. But looking at: >>> >>> src/hotspot/share/prims/jvmtiEnv.cpp >>> >>> Surely the lock is only needed in the direct-handshake case and not when operating on the current thread? Or is it there because you've removed the locking from the lower-level JvmtiEventController methods and so now you need to take the lock higher-up the call chain? (I find it hard to follow the call chains in the JVMTI code.) >> >> We need to take the lock higher-up the call chain. It is suggested by Robbin, and works fine. > > Okay. It seems reasonably safe in this context as there is little additional work done while holding the lock. > >> >>>>> It is far from clear now which functions are reachable from handshakes, which from safepoint VM_ops and which from both. >>>>> >>>>> !?? assert(SafepointSynchronize::is_at_safepoint() || JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); >>>>> >>>>> This can be written as: >>>>> >>>>> assert_locked_or_safepoint(JvmtiThreadState_lock); >>>>> >>>>> or possibly the weak variant of that. ('m puzzled by the extra check in the strong version ... I think it is intended for the case of the VMThread executing a non-safepoint VMop.) >>>> >>>>> JvmtiEventController::set_frame_pop(), JvmtiEventController::clear_frame_pop() and JvmtiEventController::clear_to_frame_pop() are no longer called at safepoint, so I remove safepoint check from assert() in new webrev. >>> >>> You should use assert_lock_strong for this. >> >> I will do that. > > Thanks, > David > ----- > >> >> Thanks, >> >> Yasumasa >> >> >>> Thanks, >>> David >>> >>>> ?? 
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ >>>> ???? diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>> >>>>> On 27/08/2020 4:34 pm, David Holmes wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> On 27/08/2020 9:40 am, Yasumasa Suenaga wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> On 2020/08/27 8:09, David Holmes wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >>>>>>>>> Hi Patricio, David, >>>>>>>>> >>>>>>>>> Thanks for your comment! >>>>>>>>> >>>>>>>>> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >>>>>>>>> >>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>>>>>>> >>>>>>>>> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >>>>>>>> >>>>>>>> How can the current thread already be in a handshake with the target when you execute this code? >>>>>>> >>>>>>> EnterInterpOnlyModeClosure might be called in handshake with UpdateForPopTopFrameClosure or with SetFramePopClosure. >>>>>>> >>>>>>> EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an alternative in VM_EnterInterpOnlyMode. >>>>>>> VM_EnterInterpOnlyMode returned true in allow_nested_vm_operations(). Originally, it could have been called from other VM operations. >>>>>> >>>>>> I see. It is a pity that we have now lost that critical indicator that shows how this operation can be nested within another operation. The possibility of nesting is even more obscure with JvmtiEnvThreadState::reset_current_location. 
And the fact it is now up to the caller to handle that case explicitly raises some concern - what will happen if you call execute_direct whilst already in a handshake with the target thread? >>>>>> >>>>>> I can't help but feel that we need a more rigorous and automated way of dealing with nesting ... perhaps we don't even need to care and handshakes should always allow nested handshake requests? (Question more for Robbin and Patricio.) >>>>>> >>>>>> Further comments: >>>>>> >>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >>>>>> >>>>>> ??194 #ifdef ASSERT >>>>>> ??195?? Thread *current = Thread::current(); >>>>>> ??196 #endif >>>>>> ??197?? assert(get_thread() == current || current == get_thread()->active_handshaker(), >>>>>> ??198????????? "frame pop data only accessible from same thread or direct handshake"); >>>>>> >>>>>> Can you factor this out into a separate function so that it is not repeated so often. Seems to me that there should be a global function on Thread: assert_current_thread_or_handshaker()? [yes unpleasant name but ...] that will allow us to stop repeating this code fragment across numerous files. A follow up RFE for that would be okay too (I see some guarantees that should probably just be asserts so they need a bit more checking). >>>>>> >>>>>> ??331???????? Handshake::execute_direct(&op, _thread); >>>>>> >>>>>> You aren't checking the return value of execute_direct, but I can't tell where _thread was checked for still being alive ?? >>>>>> >>>>>> --- >>>>>> >>>>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>>>> >>>>>> ??340???? Handshake::execute_direct(&hs, target); >>>>>> >>>>>> I know this is existing code but I have the same query as above - no return value check and no clear check that the JavaThread is still alive? >>>>>> >>>>>> --- >>>>>> >>>>>> Do we know if the existing tests actually test the nested cases? 
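The query above about the unchecked return value of execute_direct can be illustrated with a small stand-alone sketch. This is not HotSpot code: MockThread, mock_execute_direct, OpStatus and run_checked are hypothetical names, and the sketch assumes, as the review comments imply, that a false return means the operation did not run (for example because the target thread had already exited).

```cpp
#include <cassert>

// Hypothetical stand-in for a JavaThread; not the HotSpot class.
struct MockThread {
  bool alive = true;   // whether the target can still run operations
  int  ops_run = 0;    // how many operations were executed on it
};

// Hypothetical stand-in for Handshake::execute_direct().
// Assumption: returns false when the operation was NOT executed.
inline bool mock_execute_direct(MockThread* target) {
  assert(target != nullptr);
  if (!target->alive) {
    return false;
  }
  target->ops_run++;
  return true;
}

enum class OpStatus { Executed, TargetNotAlive };

// A caller that propagates the result instead of silently ignoring it.
inline OpStatus run_checked(MockThread* target) {
  return mock_execute_direct(target) ? OpStatus::Executed
                                     : OpStatus::TargetNotAlive;
}
```

Where the caller has already verified that the target is alive and suspended, the returned status could simply be asserted rather than propagated.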
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>>>>>> >>>>>>>>>>> I'm trying to migrate following operations to direct handshake. >>>>>>>>>>> >>>>>>>>>>> ??? - VM_UpdateForPopTopFrame >>>>>>>>>>> ??? - VM_SetFramePop >>>>>>>>>>> ??? - VM_GetCurrentLocation >>>>>>>>>>> >>>>>>>>>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at safepoint, so I want to use JavaThread::active_handshaker() in production VM to detect the process is in direct handshake or not. >>>>>>>>>>> >>>>>>>>>>> However this function is available in debug VM only, so I want to hear the reason why it is for debug VM only, and there are no problem to use it in production VM. Of course another solutions are welcome. >>>>>>>>>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>>>>>>>>> >>>>>>>>>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>>>>>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>>>>>> Some comments on the proposed change. 
>>>>>>>>>> >>>>>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>>>>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>>>>>> >>>>>>>>>> (GetCurrentLocationClosure) >>>>>>>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>>>>>>>>> ????? op.do_thread(_thread); >>>>>>>>>> } else { >>>>>>>>>> ????? Handshake::execute_direct(&op, _thread); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> vs >>>>>>>>>> >>>>>>>>>> (EnterInterpOnlyModeClosure) >>>>>>>>>> if (target->active_handshaker() != NULL) { >>>>>>>>>> ???? hs.do_thread(target); >>>>>>>>>> } else { >>>>>>>>>> ???? Handshake::execute_direct(&hs, target); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>>>>>>>>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>>>>>>>>> >>>>>>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>>>>>>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>>>>>>>>> >>>>>>>>>> Thanks! 
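The difference between the two checks discussed above can be sketched in isolation. This is a minimal model, not HotSpot code: MockThread, Path, choose_path and choose_path_loose are hypothetical names introduced only for illustration. The stricter form runs the operation in place only when the current thread is the target or is that target's active handshaker.

```cpp
#include <cassert>

// Hypothetical stand-in for a JavaThread; not the HotSpot class.
struct MockThread {
  // Thread currently executing a handshake on this thread, if any.
  const MockThread* active_handshaker = nullptr;
};

enum class Path { RunInPlace, RequestHandshake };

// Stricter check: run in place only if we are the target, or we are
// the thread that is already handshaking the target.
inline Path choose_path(const MockThread* current, const MockThread* target) {
  if (current == target || target->active_handshaker == current) {
    return Path::RunInPlace;
  }
  return Path::RequestHandshake;
}

// Looser check (active_handshaker != NULL): also runs in place when
// some OTHER thread is handshaking the target.
inline Path choose_path_loose(const MockThread* current, const MockThread* target) {
  if (current == target || target->active_handshaker != nullptr) {
    return Path::RunInPlace;
  }
  return Path::RequestHandshake;
}
```

With a third thread as the caller while another thread is handshaking the target, the two functions disagree, which is the case the stricter comparison against the current thread is meant to exclude.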
>>>>>>>>>>
>>>>>>>>>> Patricio
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>

From suenaga at oss.nttdata.com Mon Aug 31 09:10:33 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 31 Aug 2020 18:10:33 +0900
Subject: RFR: 8242427: JVMTI frame pop operations should use Thread-Local Handshakes
In-Reply-To: <1d9e2af5-0f0a-0735-0cc6-41af50262928@oss.nttdata.com>
References: <11793902-5fca-3251-ce57-ae0832c3a2c4@oracle.com> <2764d68f-da86-865c-d8b0-e79d1fd9ee25@oss.nttdata.com> <6fa3b231-fc27-78f7-6e64-5d580b875772@oracle.com> <39af59d5-9701-97e9-670b-e71e9b2c6484@oss.nttdata.com> <92f6ff81-2a9a-e2d9-6e07-f19c97b63abd@oracle.com> <781a164d-21e1-039a-afa9-77be25ba3809@oss.nttdata.com> <6dd8b890-5e1f-2e6d-826c-96d439eb65c4@oracle.com> <3aeaff81-f6a3-e4c0-b65c-835785b7ebe2@oss.nttdata.com> <626efb8a-2285-fc2e-6170-3d8bf406181d@oracle.com> <1d9e2af5-0f0a-0735-0cc6-41af50262928@oss.nttdata.com>
Message-ID: <41b91a4d-4eed-766e-48e1-87e08c1c8611@oss.nttdata.com>

Hi David,

I uploaded new webrev. Could you review again?

  http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.04/

This webrev includes two changes:

  1. Use assert_lock_strong() for JvmtiThreadState_lock
     http://hg.openjdk.java.net/jdk/submit/rev/c85f93d2042d

  2. Check return value from execute_direct() with assert()
     http://hg.openjdk.java.net/jdk/submit/rev/8746e1651343


Thanks,

Yasumasa


On 2020/08/31 15:22, Yasumasa Suenaga wrote:
> Hi David,
>
> On 2020/08/31 14:43, David Holmes wrote:
>> Hi Yasumasa,
>>
>> On 28/08/2020 1:01 pm, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> On 2020/08/28 11:04, David Holmes wrote:
>>>> Hi Yasumasa,
>>>>
>>>> On 28/08/2020 11:24 am, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> On 2020/08/27 15:49, David Holmes wrote:
>>>>>> Sorry I just realized I reviewed version 00 :(
>>>>
>>>> Note that my comments on version 00 in my earlier email still apply.
>>>
>>> I copied here your comment on webrev.00:
>>>
>>>>>>> I see.
It is a pity that we have now lost that critical indicator that shows how this operation can be nested within another operation. The possibility of nesting is even more obscure with JvmtiEnvThreadState::reset_current_location. And the fact it is now up to the caller to handle that case explicitly raises some concern - what will happen if you call execute_direct whilst already in a handshake with the target thread? >>> >>> I heard deadlock would be happen if execute_direct() calls in direct handshake. Thus we need to use active_handshaker() in this change. >> >> Okay. This is something we need to clarify with direct handshake usage information. I think it would be preferable if this was handled in execute_direct rather than the caller ... though it may also be the case that we need the writer of the handshake operation to give due consideration to nesting ... > > Agree, I also prefer to check whether caller is in direct handshake in execute_direct(). > But I think this is another enhancement because we need to change the behavior of execute_direct(). > > >>>>>>> Further comments: >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >>>>>>> >>>>>>> ? 194 #ifdef ASSERT >>>>>>> ? 195?? Thread *current = Thread::current(); >>>>>>> ? 196 #endif >>>>>>> ? 197?? assert(get_thread() == current || current == get_thread()->active_handshaker(), >>>>>>> ? 198????????? "frame pop data only accessible from same thread or direct handshake"); >>>>>>> >>>>>>> Can you factor this out into a separate function so that it is not repeated so often. Seems to me that there should be a global function on Thread: assert_current_thread_or_handshaker()? [yes unpleasant name but ...] that will allow us to stop repeating this code fragment across numerous files. A follow up RFE for that would be okay too (I see some guarantees that should probably just be asserts so they need a bit more checking). >>> >>> I filed it as another RFE: >>> ?? 
https://bugs.openjdk.java.net/browse/JDK-8252479 >> >> Thanks. >> >>> >>>>>>> ? 331???????? Handshake::execute_direct(&op, _thread); >>>>>>> >>>>>>> You aren't checking the return value of execute_direct, but I can't tell where _thread was checked for still being alive ?? >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>> >>>>>>> ? 340???? Handshake::execute_direct(&hs, target); >>>>>>> >>>>>>> I know this is existing code but I have the same query as above - no return value check and no clear check that the JavaThread is still alive? >>> >>> Existing code seems to assume that target thread is alive, frame operations (e.g. PopFrame()) should be performed on live thread. And also existing code would not set any JVMTI error and cannot propagate it to caller. So I do not add the check for thread state. >> >> Okay. But note that for PopFrame the tests for isAlive and is-suspended have already been performed before we do the execute_direct; so in that case we could simply assert that execute_direct returns true. Similarly for other cases. > > Ok, I will change as following in next webrev: > > ``` > bool result = Handshake::execute_direct(&hs, target); > guarantee(result, "Direct handshake failed. Target thread is still alive?"); > ``` > > > Thanks, > > Yasumasa > > >>>>>>> Do we know if the existing tests actually test the nested cases? >>> >>> I saw some error with assertion for JvmtiThreadState_lock and safepoint in vmTestbase at first, so I guess nested call would be tested, but I'm not sure. >>> >>> >>>>>> I have concerns with the added locking: >>>>>> >>>>>> MutexLocker mu(JvmtiThreadState_lock); >>>>>> >>>>>> Who else may be holding that lock? Could it be our target thread that we have already initiated a handshake with? (The lock ranking checks related to safepoints don't help us detect deadlocks between a target thread and its handshaker. 
:( ) >>>>> >>>>> I checked source code again, then I couldn't find the point that target thread already locked JvmtiThreadState_lock at direct handshake. >>>> >>>> I'm very unclear exactly what state this lock guards and under what conditions. But looking at: >>>> >>>> src/hotspot/share/prims/jvmtiEnv.cpp >>>> >>>> Surely the lock is only needed in the direct-handshake case and not when operating on the current thread? Or is it there because you've removed the locking from the lower-level JvmtiEventController methods and so now you need to take the lock higher-up the call chain? (I find it hard to follow the call chains in the JVMTI code.) >>> >>> We need to take the lock higher-up the call chain. It is suggested by Robbin, and works fine. >> >> Okay. It seems reasonably safe in this context as there is little additional work done while holding the lock. >> >>> >>>>>> It is far from clear now which functions are reachable from handshakes, which from safepoint VM_ops and which from both. >>>>>> >>>>>> !?? assert(SafepointSynchronize::is_at_safepoint() || JvmtiThreadState_lock->is_locked(), "Safepoint or must be locked"); >>>>>> >>>>>> This can be written as: >>>>>> >>>>>> assert_locked_or_safepoint(JvmtiThreadState_lock); >>>>>> >>>>>> or possibly the weak variant of that. ('m puzzled by the extra check in the strong version ... I think it is intended for the case of the VMThread executing a non-safepoint VMop.) >>>>> >>>>>> JvmtiEventController::set_frame_pop(), JvmtiEventController::clear_frame_pop() and JvmtiEventController::clear_to_frame_pop() are no longer called at safepoint, so I remove safepoint check from assert() in new webrev. >>>> >>>> You should use assert_lock_strong for this. >>> >>> I will do that. >> >> Thanks, >> David >> ----- >> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> Thanks, >>>> David >>>> >>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.03/ >>>>> ???? 
diff from previous webrev: http://hg.openjdk.java.net/jdk/submit/rev/2a2c02ada080 >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>> >>>>>> On 27/08/2020 4:34 pm, David Holmes wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> On 27/08/2020 9:40 am, Yasumasa Suenaga wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> On 2020/08/27 8:09, David Holmes wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> On 26/08/2020 5:34 pm, Yasumasa Suenaga wrote: >>>>>>>>>> Hi Patricio, David, >>>>>>>>>> >>>>>>>>>> Thanks for your comment! >>>>>>>>>> >>>>>>>>>> I updated webrev which includes the fix which is commented by Patricio, and it passed submit repo. So I switch this mail thread to RFR. >>>>>>>>>> >>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8242427 >>>>>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/webrev.00/ >>>>>>>>>> >>>>>>>>>> I understand David said same concerns as Patricio about active handshaker. This webrev checks active handshaker is current thread or not. >>>>>>>>> >>>>>>>>> How can the current thread already be in a handshake with the target when you execute this code? >>>>>>>> >>>>>>>> EnterInterpOnlyModeClosure might be called in handshake with UpdateForPopTopFrameClosure or with SetFramePopClosure. >>>>>>>> >>>>>>>> EnterInterpOnlyModeClosure is introduced in JDK-8238585 as an alternative in VM_EnterInterpOnlyMode. >>>>>>>> VM_EnterInterpOnlyMode returned true in allow_nested_vm_operations(). Originally, it could have been called from other VM operations. >>>>>>> >>>>>>> I see. It is a pity that we have now lost that critical indicator that shows how this operation can be nested within another operation. The possibility of nesting is even more obscure with JvmtiEnvThreadState::reset_current_location. 
And the fact it is now up to the caller to handle that case explicitly raises some concern - what will happen if you call execute_direct whilst already in a handshake with the target thread? >>>>>>> >>>>>>> I can't help but feel that we need a more rigorous and automated way of dealing with nesting ... perhaps we don't even need to care and handshakes should always allow nested handshake requests? (Question more for Robbin and Patricio.) >>>>>>> >>>>>>> Further comments: >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp >>>>>>> >>>>>>> 194 #ifdef ASSERT >>>>>>> 195 Thread *current = Thread::current(); >>>>>>> 196 #endif >>>>>>> 197 assert(get_thread() == current || current == get_thread()->active_handshaker(), >>>>>>> 198 "frame pop data only accessible from same thread or direct handshake"); >>>>>>> >>>>>>> Can you factor this out into a separate function so that it is not repeated so often? Seems to me that there should be a global function on Thread: assert_current_thread_or_handshaker() [yes, unpleasant name but ...] that will allow us to stop repeating this code fragment across numerous files. A follow-up RFE for that would be okay too (I see some guarantees that should probably just be asserts so they need a bit more checking). >>>>>>> >>>>>>> 331 Handshake::execute_direct(&op, _thread); >>>>>>> >>>>>>> You aren't checking the return value of execute_direct, but I can't tell where _thread was checked for still being alive? >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>> >>>>>>> 340 Handshake::execute_direct(&hs, target); >>>>>>> >>>>>>> I know this is existing code but I have the same query as above - no return value check and no clear check that the JavaThread is still alive? >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> Do we know if the existing tests actually test the nested cases? 
>>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2020/08/26 10:13, Patricio Chilano wrote: >>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>> >>>>>>>>>>> On 8/23/20 11:40 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> I want to hear your opinions about the change for JDK-8242427. >>>>>>>>>>>> >>>>>>>>>>>> I'm trying to migrate the following operations to direct handshakes. >>>>>>>>>>>> >>>>>>>>>>>> - VM_UpdateForPopTopFrame >>>>>>>>>>>> - VM_SetFramePop >>>>>>>>>>>> - VM_GetCurrentLocation >>>>>>>>>>>> >>>>>>>>>>>> Some operations (VM_GetCurrentLocation and EnterInterpOnlyModeClosure) might be called at a safepoint, so I want to use JavaThread::active_handshaker() in the production VM to detect whether the process is in a direct handshake. >>>>>>>>>>>> >>>>>>>>>>>> However, this function is available in the debug VM only, so I want to hear why it is debug-only and whether there is any problem with using it in the production VM. Of course, other solutions are welcome. >>>>>>>>>>> I added the _active_handshaker field to the HandshakeState class when working on 8230594 to adjust some asserts, where instead of checking for the VMThread we needed to check for the active handshaker of the target JavaThread. Since there were no other users of it, there was no point in declaring it and having to write to it for the release bits. There are no issues with having it in production though so you could change that if necessary. >>>>>>>>>>> >>>>>>>>>>>> webrev is here. It passed jtreg tests (vmTestbase/nsk/{jdi,jdwp,jvmti} serviceability/{jdwp,jvmti}) >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8242427/proposal/ >>>>>>>>>>> Some comments on the proposed change. 
>>>>>>>>>>> >>>>>>>>>>> src/hotspot/share/prims/jvmtiEnvThreadState.cpp, src/hotspot/share/prims/jvmtiEventController.cpp >>>>>>>>>>> Why is the check to decide whether to call the handshake or execute the operation with the current thread different for GetCurrentLocationClosure vs EnterInterpOnlyModeClosure? >>>>>>>>>>> >>>>>>>>>>> (GetCurrentLocationClosure) >>>>>>>>>>> if ((Thread::current() == _thread) || (_thread->active_handshaker() != NULL)) { >>>>>>>>>>> op.do_thread(_thread); >>>>>>>>>>> } else { >>>>>>>>>>> Handshake::execute_direct(&op, _thread); >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> vs >>>>>>>>>>> >>>>>>>>>>> (EnterInterpOnlyModeClosure) >>>>>>>>>>> if (target->active_handshaker() != NULL) { >>>>>>>>>>> hs.do_thread(target); >>>>>>>>>>> } else { >>>>>>>>>>> Handshake::execute_direct(&hs, target); >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> If you change VM_SetFramePop to use handshakes then it seems you could reach JvmtiEventControllerPrivate::enter_interp_only_mode() with the current thread being the target. >>>>>>>>>>> Also I think you want the second expression of that check to be (target->active_handshaker() == Thread::current()). So either you are the target or the current active_handshaker for that target. Otherwise active_handshaker() could be not NULL because there is another JavaThread handshaking the same target. Unless you are certain that it can never happen, so if active_handshaker() is not NULL it is always the current thread, but even in that case this way is safer. >>>>>>>>>>> >>>>>>>>>>> src/hotspot/share/prims/jvmtiThreadState.cpp >>>>>>>>>>> The guarantee() statement exists in release builds too so the "#ifdef ASSERT" directive should be removed, otherwise "current" will not be declared. >>>>>>>>>>> >>>>>>>>>>> Thanks! 
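[Editorial sketch] The difference between the two checks discussed above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `ToyThread`, `weak_check` and `may_run_directly` are hypothetical names invented for illustration and are not HotSpot types or functions. Only the shape of the two conditions is taken from the review comments.

```cpp
#include <cassert>

// Hypothetical stand-in for JavaThread; this is NOT HotSpot code.
struct ToyThread {
  // Thread currently running a handshake operation against this thread,
  // or nullptr when no handshake is in progress.
  const ToyThread* active_handshaker = nullptr;
};

// Shape of the check quoted for EnterInterpOnlyModeClosure: true as soon as
// *any* thread is handshaking the target.
inline bool weak_check(const ToyThread* target) {
  assert(target != nullptr);
  return target->active_handshaker != nullptr;
}

// Shape of the check suggested in the review: the caller may run the closure
// body itself only if it is the target, or if it is the thread that is
// currently the target's active handshaker.
inline bool may_run_directly(const ToyThread* current, const ToyThread* target) {
  assert(current != nullptr && target != nullptr);
  return current == target || target->active_handshaker == current;
}
```

In this model, when a third thread is handshaking the target, `weak_check` is true while `may_run_directly` is false for the caller, which appears to be the situation the review comment warns about.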
>>>>>>>>>>> >>>>>>>>>>> Patricio >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>> From martin.doerr at sap.com Mon Aug 31 17:00:10 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 31 Aug 2020 17:00:10 +0000 Subject: Fatal errors when running JCK tests with JDK15/16 debug build In-Reply-To: References: Message-ID: Hi David, thanks for analyzing it. We need to exclude the test for now. Best regards, Martin > -----Original Message----- > From: David Holmes > Sent: Monday, 31 August 2020 04:34 > To: Doerr, Martin ; serviceability-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug build > > Hi Martin, > > On 29/08/2020 3:53 am, Doerr, Martin wrote: > > Hi, > > > > we have seen the following fatal error more than 50 times since > > 2020-05-25 in various JCK tests vm/jvmti. > > > > fatal error: String conversion failure: [check] ExitLock destroyed > > > > --> [check] ExitLock exited > > > > (followed by garbage output) > > > > 8166358: Re-enable String verification in > > java_lang_String::create_from_str() > > > > was pushed at that date which introduced the call to fatal. > > > > Stack (example from linuxppc64le, but also observed on x86 and aarch64): > > V [libjvm.so+0xee242c] java_lang_String::create_from_str(char const*, > > Thread*) [clone .part.158]+0x51c > > V [libjvm.so+0xee2530] java_lang_String::create_oop_from_str(char > > const*, Thread*)+0x40 > > V [libjvm.so+0x1026a30] jni_NewStringUTF+0x1e0 > > C [libjckjvmti.so+0x3ce4c] logWrite+0x5c > > C [libjckjvmti.so+0x3cd20] lprintf+0x170 > > C [libjckjvmti.so+0x485b8] gast00104_agent_proc+0x254 > > V [libjvm.so+0x1218f0c] JvmtiAgentThread::call_start_function()+0x24c > > V [libjvm.so+0x193a8fc] JavaThread::thread_main_inner()+0x32c > > V [libjvm.so+0x19418a0] Thread::call_run()+0x160 > > V [libjvm.so+0x15c9d0c] thread_native_entry(Thread*)+0x18c > > C 
[libpthread.so.0+0x9b48] start_thread+0x108 > > > > (Problem could have been there before but without this fatal message.) > > > > The messages are generated by: > > > > tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c > > > > This looks like a race condition. The message changes while the VM > > creates a String object from it. Has anybody seen this before? > > No but ... > > > Is it a test problem? I'm not familiar with the lprintf calls in the test. > > ... the lprintf is part of the JCK support library (support.c if you > have access to sources) and it uses a static buffer for the log messages > and so it is not thread-safe. This test creates a thread and both it and > the main thread call lprintf concurrently. > > So this is a JCK test/test-library bug that appears to be exposed by the > changes made in 8166358. > > Cheers, > David > ----- > > > Best regards, > > > > Martin > > From igor.ignatyev at oracle.com Mon Aug 31 19:03:25 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 31 Aug 2020 12:03:25 -0700 Subject: RFR(S) : 8252402 : rewrite vmTestbase/nsk/jvmti/Allocate/alloc001 shell test to Java In-Reply-To: <1B17F111-EA9D-4BB6-93DB-663A5328A40F@oracle.com> References: <1B17F111-EA9D-4BB6-93DB-663A5328A40F@oracle.com> Message-ID: <9E518796-8336-449C-AB5A-5B2BF387AA9B@oracle.com> ping? > On Aug 26, 2020, at 4:59 PM, Igor Ignatyev wrote: > > http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 >> 287 lines changed: 60 ins; 200 del; 27 mod; > > Hi all, > > could you please review the patch which removes shell script from alloc001 test? 
> there are two small differences compared to the original test: > - if we don't get OutOfMemory on mac or windows, the test will be reported as skipped (as opposed to passed-passed before) > - as changing DYLD_LIBRARY_PATH on mac is a bit cumbersome due to SIP, I decided to use '-agentpath:' instead of '-agentlib:' > > the patch also moves alloc001.java closer to the other files (vmTestbase/nsk/jvmti/Allocate/alloc001), removes the TestDescription.java file, moves the jtreg test description into the test source code and removes the printdump agent option, making trace messages in alloc001.cpp unconditional. > > webrev: http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 (depends on 8252401[1,2]) > JBS: https://bugs.openjdk.java.net/browse/JDK-8252402 > testing: vmTestbase/nsk/jvmti/Allocate/alloc001 on {linux,windows,macos}-x64 > > [1] https://bugs.openjdk.java.net/browse/JDK-8252401 > [2] http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 > > Thanks, > -- Igor > From igor.ignatyev at oracle.com Mon Aug 31 19:03:36 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 31 Aug 2020 12:03:36 -0700 Subject: RFR(S) : 8252403 : rewrite serviceability/7170638/SDTProbesGNULinuxTest.sh to java In-Reply-To: <6D1980F9-850E-476E-A53B-FC194DEDF9C2@oracle.com> References: <6D1980F9-850E-476E-A53B-FC194DEDF9C2@oracle.com> Message-ID: <5871CCB2-4AEB-4D8E-8A23-463394BCAC43@oracle.com> ping? > On Aug 26, 2020, at 10:14 PM, Igor Ignatyev wrote: > > http://cr.openjdk.java.net/~iignatyev//8252403/webrev.00 >> 76 lines changed: 8 ins; 0 del; 68 mod; > > Hi all, > > could you please review the patch which rewrites serviceability/7170638/SDTProbesGNULinuxTest.sh to java? 
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8252403 > webrev: http://cr.openjdk.java.net/~iignatyev//8252403/webrev.00 > testing: serviceability/7170638 on linux-x64 w/ and w/o dtrace feature > > Thanks, > -- Igor From alexey.menkov at oracle.com Mon Aug 31 21:54:47 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 31 Aug 2020 14:54:47 -0700 Subject: RFR(S) : 8252402 : rewrite vmTestbase/nsk/jvmti/Allocate/alloc001 shell test to Java In-Reply-To: <9E518796-8336-449C-AB5A-5B2BF387AA9B@oracle.com> References: <1B17F111-EA9D-4BB6-93DB-663A5328A40F@oracle.com> <9E518796-8336-449C-AB5A-5B2BF387AA9B@oracle.com> Message-ID: <146ac950-2b0b-b4ac-d957-9b6a9f01afcd@oracle.com> Hi Igor, Looks good in general. One question. As far as I can see, the old test didn't run "sh ulimit" on Windows. So now the test requires cygwin to run? --alex On 08/31/2020 12:03, Igor Ignatyev wrote: > ping? > >> On Aug 26, 2020, at 4:59 PM, Igor Ignatyev wrote: >> >> http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 >>> 287 lines changed: 60 ins; 200 del; 27 mod; >> >> Hi all, >> >> could you please review the patch which removes shell script from alloc001 test? >> there are two small differences compared to the original test: >> - if we don't get OutOfMemory on mac or windows, the test will be reported as skipped (as opposed to passed-passed before) >> - as changing DYLD_LIBRARY_PATH on mac is a bit cumbersome due to SIP, I decided to use '-agentpath:' instead of '-agentlib:' >> >> the patch also moves alloc001.java closer to the other files (vmTestbase/nsk/jvmti/Allocate/alloc001), removes the TestDescription.java file, moves the jtreg test description into the test source code and removes the printdump agent option, making trace messages in alloc001.cpp unconditional. 
>> >> webrev: http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 (depends on 8252401[1,2]) >> JBS: https://bugs.openjdk.java.net/browse/JDK-8252402 >> testing: vmTestbase/nsk/jvmti/Allocate/alloc001 on {linux,windows,macos}-x64 >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8252401 >> [2] http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 >> >> Thanks, >> -- Igor >> > From igor.ignatyev at oracle.com Mon Aug 31 22:03:26 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 31 Aug 2020 15:03:26 -0700 Subject: RFR(S) : 8252402 : rewrite vmTestbase/nsk/jvmti/Allocate/alloc001 shell test to Java In-Reply-To: <146ac950-2b0b-b4ac-d957-9b6a9f01afcd@oracle.com> References: <1B17F111-EA9D-4BB6-93DB-663A5328A40F@oracle.com> <9E518796-8336-449C-AB5A-5B2BF387AA9B@oracle.com> <146ac950-2b0b-b4ac-d957-9b6a9f01afcd@oracle.com> Message-ID: <9B0DB297-83B5-4098-BBD7-42370C8D3396@oracle.com> Hi Alex, AFAIK, cygwin always was and still is a requirement for both building and testing on windows, so it should not be a problem for anyone who is developing/testing OpenJDK on windows. Cheers, -- Igor > On Aug 31, 2020, at 2:54 PM, Alex Menkov wrote: > > Hi Igor, > > Looks good in general. > One question. > As far as I see old test didn't run "sh ulimit" on Windows. > So now the test requires cygwin to run? > > --alex > > On 08/31/2020 12:03, Igor Ignatyev wrote: >> ping? >>> On Aug 26, 2020, at 4:59 PM, Igor Ignatyev wrote: >>> >>> http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 >>>> 287 lines changed: 60 ins; 200 del; 27 mod; >>> >>> Hi all, >>> >>> could you please review the patch which removes shell script from alloc001 test? 
>>> there are two small differences compared to the original test: >>> - if we don't get OutOfMemory on mac or windows, the test will be reported as skipped (as opposed to passed-passed before) >>> - as changing DYLD_LIBRARY_PATH on mac is a bit cumbersome due to SIP, I decided to use '-agentpath:' instead of '-agentlib:' >>> >>> the patch also moves alloc001.java closer to the other files (vmTestbase/nsk/jvmti/Allocate/alloc001), removes the TestDescription.java file, moves the jtreg test description into the test source code and removes the printdump agent option, making trace messages in alloc001.cpp unconditional. >>> >>> webrev: http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 (depends on 8252401[1,2]) >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252402 >>> testing: vmTestbase/nsk/jvmti/Allocate/alloc001 on {linux,windows,macos}-x64 >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8252401 >>> [2] http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 >>> >>> Thanks, >>> -- Igor >>> From alexey.menkov at oracle.com Mon Aug 31 22:29:10 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 31 Aug 2020 15:29:10 -0700 Subject: RFR(S) : 8252403 : rewrite serviceability/7170638/SDTProbesGNULinuxTest.sh to java In-Reply-To: <5871CCB2-4AEB-4D8E-8A23-463394BCAC43@oracle.com> References: <6D1980F9-850E-476E-A53B-FC194DEDF9C2@oracle.com> <5871CCB2-4AEB-4D8E-8A23-463394BCAC43@oracle.com> Message-ID: Hi Igor, LGTM --alex On 08/31/2020 12:03, Igor Ignatyev wrote: > ping? > >> On Aug 26, 2020, at 10:14 PM, Igor Ignatyev wrote: >> >> http://cr.openjdk.java.net/~iignatyev//8252403/webrev.00 >>> 76 lines changed: 8 ins; 0 del; 68 mod; >> >> Hi all, >> >> could you please review the patch which rewrites serviceability/7170638/SDTProbesGNULinuxTest.sh to java? 
>> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8252403 >> webrev: http://cr.openjdk.java.net/~iignatyev//8252403/webrev.00 >> testing: serviceability/7170638 on linux-x64 w/ and w/o dtrace feature >> >> Thanks, >> -- Igor > From david.holmes at oracle.com Mon Aug 31 22:37:11 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 1 Sep 2020 08:37:11 +1000 Subject: Fatal errors when running JCK tests with JDK15/16 debug build In-Reply-To: References: Message-ID: <80c5c750-2a02-9bd9-d0b4-628481c71264@oracle.com> On 1/09/2020 3:00 am, Doerr, Martin wrote: > Hi David, > > thanks for analyzing it. We need to exclude the test for now. Can you file a JCK bug? I can file one on our internal JCK Jira but I'm not sure what the right process is in this case. Thanks, David > Best regards, > Martin > > >> -----Original Message----- >> From: David Holmes >> Sent: Monday, 31 August 2020 04:34 >> To: Doerr, Martin ; serviceability-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net >> Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug build >> >> Hi Martin, >> >> On 29/08/2020 3:53 am, Doerr, Martin wrote: >>> Hi, >>> >>> we have seen the following fatal error more than 50 times since >>> 2020-05-25 in various JCK tests vm/jvmti. >>> >>> fatal error: String conversion failure: [check] ExitLock destroyed >>> >>> --> [check] ExitLock exited >>> >>> (followed by garbage output) >>> >>> 8166358: Re-enable String verification in >>> java_lang_String::create_from_str() >>> >>> was pushed at that date which introduced the call to fatal. >>> >>> Stack (example from linuxppc64le, but also observed on x86 and aarch64): >>> V [libjvm.so+0xee242c] java_lang_String::create_from_str(char const*, >>> Thread*) [clone .part.158]+0x51c >>> V [libjvm.so+0xee2530] java_lang_String::create_oop_from_str(char >>> const*, Thread*)+0x40 >>> V [libjvm.so+0x1026a30] jni_NewStringUTF+0x1e0 >>> C [libjckjvmti.so+0x3ce4c] logWrite+0x5c >>> C 
[libjckjvmti.so+0x3cd20] lprintf+0x170 >>> C [libjckjvmti.so+0x485b8] gast00104_agent_proc+0x254 >>> V [libjvm.so+0x1218f0c] JvmtiAgentThread::call_start_function()+0x24c >>> V [libjvm.so+0x193a8fc] JavaThread::thread_main_inner()+0x32c >>> V [libjvm.so+0x19418a0] Thread::call_run()+0x160 >>> V [libjvm.so+0x15c9d0c] thread_native_entry(Thread*)+0x18c >>> C [libpthread.so.0+0x9b48] start_thread+0x108 >>> >>> (Problem could have been there before but without this fatal message.) >>> >>> The messages are generated by: >>> >>> tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c >>> >>> This looks like a race condition. The message changes while the VM >>> creates a String object from it. Has anybody seen this before? >> >> No but ... >> >>> Is it a test problem? I'm not familiar with the lprintf calls in the test. >> >> ... the lprintf is part of the JCK support library (support.c if you >> have access to sources) and it uses a static buffer for the log messages >> and so it is not thread-safe. This test creates a thread and both it and >> the main thread call lprintf concurrently. >> >> So this is a JCK test/test-library bug that appears to be exposed by the >> changes made in 8166358. >> >> Cheers, >> David >> ----- >> >>> Best regards, >>> >>> Martin >>> From alexey.menkov at oracle.com Mon Aug 31 22:37:17 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 31 Aug 2020 15:37:17 -0700 Subject: RFR(S) : 8252402 : rewrite vmTestbase/nsk/jvmti/Allocate/alloc001 shell test to Java In-Reply-To: <9B0DB297-83B5-4098-BBD7-42370C8D3396@oracle.com> References: <1B17F111-EA9D-4BB6-93DB-663A5328A40F@oracle.com> <9E518796-8336-449C-AB5A-5B2BF387AA9B@oracle.com> <146ac950-2b0b-b4ac-d957-9b6a9f01afcd@oracle.com> <9B0DB297-83B5-4098-BBD7-42370C8D3396@oracle.com> Message-ID: <87aa2d79-d411-4787-45b8-9c1824834a10@oracle.com> I suppose with WSL cygwin is not required. But it should be OK as well. LGTM. 
--alex On 08/31/2020 15:03, Igor Ignatyev wrote: > Hi Alex, > > AFAIK, cygwin always was and still is a requirement for both building and testing on windows, so it should not be a problem for anyone who is developing/testing OpenJDK on windows. > > Cheers, > -- Igor > >> On Aug 31, 2020, at 2:54 PM, Alex Menkov wrote: >> >> Hi Igor, >> >> Looks good in general. >> One question. >> As far as I can see, the old test didn't run "sh ulimit" on Windows. >> So now the test requires cygwin to run? >> >> --alex >> >> On 08/31/2020 12:03, Igor Ignatyev wrote: >>> ping? >>>> On Aug 26, 2020, at 4:59 PM, Igor Ignatyev wrote: >>>> >>>> http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 >>>>> 287 lines changed: 60 ins; 200 del; 27 mod; >>>> >>>> Hi all, >>>> >>>> could you please review the patch which removes shell script from alloc001 test? >>>> there are two small differences compared to the original test: >>>> - if we don't get OutOfMemory on mac or windows, the test will be reported as skipped (as opposed to passed-passed before) >>>> - as changing DYLD_LIBRARY_PATH on mac is a bit cumbersome due to SIP, I decided to use '-agentpath:' instead of '-agentlib:' >>>> >>>> the patch also moves alloc001.java closer to the other files (vmTestbase/nsk/jvmti/Allocate/alloc001), removes the TestDescription.java file, moves the jtreg test description into the test source code and removes the printdump agent option, making trace messages in alloc001.cpp unconditional. 
>>>> >>>> webrev: http://cr.openjdk.java.net/~iignatyev//8252402/webrev.00 (depends on 8252401[1,2]) >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252402 >>>> testing: vmTestbase/nsk/jvmti/Allocate/alloc001 on {linux,windows,macos}-x64 >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8252401 >>>> [2] http://cr.openjdk.java.net/~iignatyev//8252401/webrev.00 >>>> >>>> Thanks, >>>> -- Igor >>>> > From jai.forums2013 at gmail.com Fri Aug 21 14:05:24 2020 From: jai.forums2013 at gmail.com (Jaikiran Pai) Date: Fri, 21 Aug 2020 14:05:24 -0000 Subject: Survey : On the jinfo, jmap, jstack serviceability tools In-Reply-To: <260b8d05-43f0-f8e8-107f-5ac4784d62ab@oracle.com> References: <260b8d05-43f0-f8e8-107f-5ac4784d62ab@oracle.com> Message-ID: <0024e934-2784-71d0-f732-736b492e844d@gmail.com> Are the results of this survey now available? -Jaikiran On 16/06/20 1:12 am, Stephen Fitch wrote: > Hello: > > We are considering deprecation and (eventual) removal of the jinfo, > jmap, jstack - (aka "j* tools") and building out a future foundation > for some aspect of serviceability on jcmd, however we don't have a lot > of data about how these tools are used in practice, especially > outside of Oracle. > > Therefore, we have created a survey [1] to gather more information and > help us evaluate and understand how others are using these tools in > the JDK. If you have used, or have (support) processes that utilize > these j* commands, then we would definitely appreciate a completed survey. > > We are specifically interested in your use-cases and how these tools > are effective for you in resolving JVM issues. > > The survey will remain open through July 15 2020. The results of the > survey will be made public after the survey closes. > > Thank you very much for your time and support. > > [1] https://www.questionpro.com/t/AQk5jZhiww