From hohensee at amazon.com Sun Sep 1 00:05:24 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Sun, 1 Sep 2019 00:05:24 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com>
Message-ID: <03A2509C-5587-448A-82F8-9240EA040326@amazon.com>

Thanks, Mandy. I've finalized the CSR. New webrev at http://cr.openjdk.java.net/~phh/8207266/webrev.04/.

In management.cpp, I now have

    if (THREAD->is_Java_thread()) {
      return ((JavaThread*)THREAD)->cooked_allocated_bytes();
    }

In ThreadImpl.java, using requireNonNull would produce a different and less informative message, so I'd like to leave it as is. I changed throwIfNullThreadIds to ensureNonNullThreadIds, and throwIfThreadAllocatedMemoryNotSupported to ensureThreadAllocatedMemorySupported.

I dropped the "java.lang." prefix from all uses of UnsupportedOperationException in both c.s.m.ThreadMXBean.java and j.l.m.ThreadMXBean.java, and did the same with SecurityException. "@since 14" added to c.s.m.ThreadMXBean.java and the CSR.

Do I need another reviewer?

Paul

From: Mandy Chung
Date: Friday, August 30, 2019 at 4:26 PM
To: "Hohensee, Paul"
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

CSR reviewed.

management.cpp

2083   java_thread = (JavaThread*)THREAD;
2084   if (java_thread->is_Java_thread()) {
2085     return java_thread->cooked_allocated_bytes();
2086   }

The cast should be done after is_Java_thread() test.

ThreadImpl.java

162     private void throwIfNullThreadIds(long[] ids) {

Even better: simply use Objects::requireNonNull and this method can be removed.
This suggests a positive naming alternative to throwIfThreadAllocatedMemoryNotSupported: "ensureThreadAllocatedMemorySupported" (sorry, I should have suggested that).

ThreadMXBean.java

130      * @throws java.lang.UnsupportedOperationException if the Java virtual

Nit: "java.lang." can be dropped.

@since 14 is missing.

Mandy

On 8/30/19 3:33 PM, Hohensee, Paul wrote:

Thanks for your review, Mandy. Revised webrev at http://cr.openjdk.java.net/~phh/8207266/webrev.02/.

I updated the CSR with your suggested javadoc for getCurrentThreadAllocatedBytes. It now matches that for getCurrentThreadUserTime and getCurrentThreadCpuTime. I also fixed the "convenient" -> "convenience" typos in j.l.m.ThreadMXBean.java.

I meant GetOneThreads to be the possessive, but don't feel strongly either way so I'm fine with GetOneThread.

I updated ThreadImpl.java as you suggested, though in getThreadAllocatedBytes(long[] ids) I had to add a redundant-in-the-not-length-1-case check for a null ids reference.

Would someone take a look at the Hotspot side and the test please?

Paul

From: Mandy Chung
Date: Friday, August 30, 2019 at 10:22 AM
To: "Hohensee, Paul"
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

OK. That's better. Some review comments:

The javadoc of getCurrentThreadAllocatedBytes() can simply say:

"Returns an approximation of the total amount of memory, in bytes, allocated in heap memory for the current thread.

This is a convenient method for local management use and is equivalent to calling getThreadAllocatedBytes(Thread.currentThread().getId())."
src/hotspot/share/include/jmm.h

GetOneThreadsAllocatedMemory: s/OneThreads/OneThread/

sun/management/ThreadImpl.java

43     private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED =
44         "Thread allocated memory measurement is not supported.";

    if (!isThreadAllocatedMemorySupported()) {
        throw new UnsupportedOperationException(THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED);
    }

Perhaps the above can be refactored as throwIfAllocatedMemoryUnsupported() method.

391         if (ids.length == 1) {
392             sizes[0] = -1;
    :
398         if (ids.length == 1) {
399             long id = ids[0];
400             sizes[0] = getThreadAllocatedMemory0(
401                 Thread.currentThread().getId() == id ? 0 : id);
402         } else {

It seems cleaner to handle the 1-element array case at the beginning of this method:

    if (ids.length == 1) {
        long size = getThreadAllocatedBytes(ids[0]);
        return new long[] { size };
    }

I didn't review the hotspot implementation and the test.

Mandy

On 8/29/19 10:01 AM, Hohensee, Paul wrote:

My bad, Mandy. The webrev puts getCurrentThreadAllocatedBytes in com.sun.management.ThreadMXBean along with the current two getThreadAllocatedBytes methods for the reasons you list. I've updated the CSR to specify com.sun.management and added a rationale. AllocatedBytes is currently enabled by Hotspot by default because the overhead of recording TLAB occupancy is negligible.

There's no new GC code, nor will there be, so imo we don't have to involve the GC folks. I.e., the new JMM method GetOneThreadsAllocatedBytes uses the existing cooked_allocated_bytes JavaThread method, and getCurrentThreadAllocatedBytes is the same as getThreadAllocatedBytes: it just bypasses the thread lookup code.
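
[Editor's sketch] The single-element fast path and the "id 0 means current thread" convention discussed in this thread could be modelled roughly as below. This is only an illustration under stated assumptions: the class AllocatedBytesFastPathSketch and its nativeLookup function are hypothetical stand-ins for the real native call (getThreadAllocatedMemory0 in the review), and the loop in the multi-id path merely stands in for whatever bulk lookup the real code performs.

```java
import java.util.Objects;
import java.util.function.LongUnaryOperator;

// Illustrative model only; this is not the actual sun.management.ThreadImpl code.
final class AllocatedBytesFastPathSketch {
    // Hypothetical stand-in for the native call. Following the convention
    // described in the thread, an id of 0 selects the calling thread.
    private final LongUnaryOperator nativeLookup;

    AllocatedBytesFastPathSketch(LongUnaryOperator nativeLookup) {
        this.nativeLookup = Objects.requireNonNull(nativeLookup);
    }

    // Single-id query: map the caller's own id to 0 so the lookup can be skipped.
    long getThreadAllocatedBytes(long id) {
        boolean self = (id == Thread.currentThread().getId());
        return nativeLookup.applyAsLong(self ? 0 : id);
    }

    // Array query: handle the 1-element case up front by delegating.
    long[] getThreadAllocatedBytes(long[] ids) {
        Objects.requireNonNull(ids);
        if (ids.length == 1) {
            return new long[] { getThreadAllocatedBytes(ids[0]) };
        }
        long[] sizes = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            sizes[i] = nativeLookup.applyAsLong(ids[i]);
        }
        return sizes;
    }

    // Demo with a fake lookup that returns 111 for id 0 and 222 for any other id.
    static long[] demo() {
        AllocatedBytesFastPathSketch sketch =
            new AllocatedBytesFastPathSketch(id -> id == 0 ? 111L : 222L);
        long self = Thread.currentThread().getId();
        long viaFastPath = sketch.getThreadAllocatedBytes(new long[] { self })[0];
        long viaLookup = sketch.getThreadAllocatedBytes(new long[] { self + 1 })[0];
        return new long[] { viaFastPath, viaLookup };
    }
}
```

In the demo, a query for the caller's own id reaches the fake lookup as 0, while any other id is passed through unchanged.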
I hadn't tracked down what happens when getCurrentThreadUserTime and getCurrentThreadCpuTime are called before, but if I'm not mistaken, the code in jcmd() in attachListener.cpp will call GetThreadCpuTimeWithKind in management.cpp, and it will ultimately use Thread::current() as the subject of the call, see os::current_thread_cpu_time in os_linux.cpp. That means that the CurrentThread methods should work remotely the same way they do locally. GetOneThreadsAllocatedBytes in management.cpp uses THREAD as its subject when called on behalf of getCurrentThreadAllocatedBytes, so it will also use the current remote Java thread.

Even if these methods only worked locally, there are many setups where apps are self-monitoring that could use the performance improvement.

Thanks,

Paul

From: Mandy Chung
Date: Wednesday, August 28, 2019 at 3:59 PM
To: "Hohensee, Paul"
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

Hi Paul,

The CSR proposes this method in java.lang.management.ThreadMXBean as a Java SE feature. Has this been discussed with the GC team to commit measuring current thread's allocated bytes as a Java SE feature? Can this be supported by all JVM implementations? What is the overhead if this is enabled by default? Does it need to be disabled? This metric is from TLAB, which might be okay. This needs advice/discussion with GC experts.

I see that the CSR mentions it can be disabled and links to the isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() methods, but these methods are defined in com.sun.management.ThreadMXBean.

As Alan points out, current thread makes sense only in local VM management. When this is monitored from a JMX client (e.g. jconsole connected to a running JVM), is the "currentThreadAllocatedBytes" attribute that of the current thread in the jconsole process, the one invoking Thread::currentThread?

Mandy

On 8/28/19 12:22 PM, Hohensee, Paul wrote:

Please review a performance improvement for ThreadMXBean.getThreadAllocatedBytes and the addition of getCurrentThreadAllocatedBytes.

JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
CSR: https://bugs.openjdk.java.net/browse/JDK-8230311

Previous email threads:
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html

The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd be great for someone to review it.

I took Mandy's advice and put the fast paths in the library code. I added a new JMM method GetOneThreadsAllocatedBytes that works the same as GetThreadCpuTime: it uses a thread_id value of zero to distinguish the current thread. On my Mac laptop, the result runs 47x faster for the current thread than the old implementation.

The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I added code to ThreadAllocatedMemory.java to test getCurrentThreadAllocatedBytes as well as variations on getThreadAllocatedBytes(id). A submit repo job is in progress.

Thanks,

Paul

From david.holmes at oracle.com Tue Sep 3 09:14:40 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 3 Sep 2019 19:14:40 +1000
Subject: RFR: 8230466: check malloc/calloc results in jdk.hotspot.agent

Hi Matthias,

Re-directing to serviceability-dev.

David

On 3/09/2019 5:42 pm, Baesken, Matthias wrote:
> Hello, please review the following small fix.
>
> In jdk.hotspot.agent native code (linux / macosx) we miss to check the result of malloc/calloc a few times.
> This should be adjusted.
> Additionally I added initialization to the symtab array in symtab.c (by calling memset to make sure we have a defined state).
>
> One question (was not really sure about this one so I did not change it so far):
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
>
> 359 void destroy_symtab(symtab_t* symtab) {
> 360   if (!symtab) return;
> 361   free(symtab->strs);
> 362   free(symtab->symbols);
> 363   free(symtab);
> 364 }
>
> Here we miss to close symtab->hash_table (opened by dbopen), is it needed (haven't used dbopen much - maybe someone can comment on this)?
>
> bug/webrev:
>
> https://bugs.openjdk.java.net/browse/JDK-8230466
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/
>
> Thanks and best regards, Matthias
>

From hohensee at amazon.com Tue Sep 3 19:38:08 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 3 Sep 2019 19:38:08 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <03A2509C-5587-448A-82F8-9240EA040326@amazon.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com>

Minor update in new webrev http://cr.openjdk.java.net/~phh/8207266/webrev.05/. I removed ensureNonNullThreadIds() in favor of Objects.requireNonNull(ids).

Thanks, Mandy, for your thorough reviews. May I get another reviewer to weigh in?

Paul

On 8/31/19, 5:06 PM, "hotspot-gc-dev on behalf of Hohensee, Paul" wrote:

Thanks, Mandy. I've finalized the CSR. New webrev at http://cr.openjdk.java.net/~phh/8207266/webrev.04/.

In management.cpp, I now have

    if (THREAD->is_Java_thread()) {
      return ((JavaThread*)THREAD)->cooked_allocated_bytes();
    }

In ThreadImpl.java, using requireNonNull would produce a different and less informative message, so I'd like to leave it as is.
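
[Editor's sketch] The trade-off mentioned here, a hand-written null check with its own message versus Objects.requireNonNull, might be illustrated as follows. The class name NullIdsCheckSketch and the message text are hypothetical and are not quoted from ThreadImpl.java; the two-argument Objects.requireNonNull(obj, message) overload is a standard java.util.Objects method that keeps a custom message.

```java
import java.util.Objects;

// Illustration only; the message text is a hypothetical placeholder.
final class NullIdsCheckSketch {
    static final String MESSAGE = "Null ids parameter.";

    // Hand-written check that carries its own message.
    static void ensureNonNullThreadIds(long[] ids) {
        if (ids == null) {
            throw new NullPointerException(MESSAGE);
        }
    }

    // Library check without a message argument: no custom message is attached.
    static void viaRequireNonNull(long[] ids) {
        Objects.requireNonNull(ids);
    }

    // Library check that keeps the custom message.
    static void viaRequireNonNullWithMessage(long[] ids) {
        Objects.requireNonNull(ids, MESSAGE);
    }

    // Runs a check and reports the message of the NullPointerException it throws.
    static String messageOf(Runnable check) {
        try {
            check.run();
            return "no exception";
        } catch (NullPointerException e) {
            return String.valueOf(e.getMessage());
        }
    }

    static String handWrittenMessage() {
        return messageOf(() -> ensureNonNullThreadIds(null));
    }

    static String plainLibraryMessage() {
        return messageOf(() -> viaRequireNonNull(null));
    }

    static String libraryWithMessage() {
        return messageOf(() -> viaRequireNonNullWithMessage(null));
    }
}
```

Which variant is preferable depends on how much the exact message matters to callers; the thread above records both positions.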

Thanks,

Paul

From chris.plummer at oracle.com Wed Sep 4 00:12:02 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 3 Sep 2019 17:12:02 -0700
Subject: RFR: 8230466: check malloc/calloc results in jdk.hotspot.agent

Sorry, I don't have an answer to your symtab->hash_table question. But the rest of the changes look good to me.

thanks,

Chris

On 9/3/19 2:14 AM, David Holmes wrote:
> Hi Matthias,
>
> Re-directing to serviceability-dev.
>
> David
>
> On 3/09/2019 5:42 pm, Baesken, Matthias wrote:
>> Hello, please review the following small fix.
>>
>> In jdk.hotspot.agent native code (linux / macosx) we miss to
>> check the result of malloc/calloc a few times.
>> This should be adjusted.
>> Additionally I added initialization to the symtab array in
>> symtab.c (by calling memset to make sure we have a defined state).
>>
>> One question (was not really sure about this one so I did not change
>> it so far):
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
>>
>> 359 void destroy_symtab(symtab_t* symtab) {
>> 360   if (!symtab) return;
>> 361   free(symtab->strs);
>> 362   free(symtab->symbols);
>> 363   free(symtab);
>> 364 }
>>
>> Here we miss to close symtab->hash_table (opened by dbopen), is
>> it needed (haven't used dbopen much - maybe someone can comment on
>> this)?
>>
>> bug/webrev:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8230466
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/
>>
>> Thanks and best regards, Matthias
>>

From yasuenag at gmail.com Wed Sep 4 05:59:24 2019
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 4 Sep 2019 14:59:24 +0900
Subject: RFR: 8230466: check malloc/calloc results in jdk.hotspot.agent
Message-ID: <6b6000a6-ca58-dc65-8996-5862a6e3763a@gmail.com>

Hi Matthias,

src/jdk.hotspot.agent/linux/native/libsaproc/symtab.c:
```
405   // guarantee(symtab == NULL, "multiple symtab");
406   symtab = (struct symtab*)calloc(1, sizeof(struct symtab));
407   if (symtab == NULL) {
408     goto quit;
409   }
410   memset(symtab, 0, sizeof(struct symtab));
```

Why do you call memset() to clear symtab in L410?
symtab is allocated via calloc() in L406, so symtab would already be cleared.

Thanks,

Yasumasa (ysuenaga)

On 2019/09/03 18:14, David Holmes wrote:
> Hi Matthias,
>
> Re-directing to serviceability-dev.
>
> David
>
> On 3/09/2019 5:42 pm, Baesken, Matthias wrote:
>> Hello, please review the following small fix.
>>
>> In jdk.hotspot.agent native code (linux / macosx) we miss to check the result of malloc/calloc a few times.
>> This should be adjusted.
>> Additionally I added initialization to the symtab array in symtab.c (by calling memset to make sure we have a defined state).
>>
>> One question (was not really sure about this one so I did not change it so far):
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
>>
>> 359 void destroy_symtab(symtab_t* symtab) {
>> 360   if (!symtab) return;
>> 361   free(symtab->strs);
>> 362   free(symtab->symbols);
>> 363   free(symtab);
>> 364 }
>>
>> Here we miss to close symtab->hash_table (opened by dbopen), is it needed (haven't used dbopen much - maybe someone can comment on this)?
>>
>> bug/webrev:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8230466
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/
>>
>> Thanks and best regards, Matthias
>>

From matthias.baesken at sap.com Wed Sep 4 07:28:26 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Wed, 4 Sep 2019 07:28:26 +0000
Subject: RFR: 8230466: check malloc/calloc results in jdk.hotspot.agent
In-Reply-To: <6b6000a6-ca58-dc65-8996-5862a6e3763a@gmail.com>
References: <6b6000a6-ca58-dc65-8996-5862a6e3763a@gmail.com>

Hello Yasumasa and Chris, thanks for your input.
Here is a new webrev, without the unneeded memset calls after calloc.

http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/

Hope everyone is happy with this now!

Best regards, Matthias

> Hi Matthias,
>
> src/jdk.hotspot.agent/linux/native/libsaproc/symtab.c:
> ```
> 405   // guarantee(symtab == NULL, "multiple symtab");
> 406   symtab = (struct symtab*)calloc(1, sizeof(struct symtab));
> 407   if (symtab == NULL) {
> 408     goto quit;
> 409   }
> 410   memset(symtab, 0, sizeof(struct symtab));
> ```
>
> Why do you call memset() to clear symtab in L410?
> symtab is allocated via calloc() in L406, so symtab would already be cleared.
>
> Thanks,
>
> Yasumasa (ysuenaga)
>
> On 2019/09/03 18:14, David Holmes wrote:
> > Hi Matthias,
> >
> > Re-directing to serviceability-dev.
> >
> > David
> >
> > On 3/09/2019 5:42 pm, Baesken, Matthias wrote:
> >> Hello, please review the following small fix.
> >>
> >> In jdk.hotspot.agent native code (linux / macosx) we miss to check the result of malloc/calloc a few times.
> >> This should be adjusted.
> >> Additionally I added initialization to the symtab array in symtab.c (by calling memset to make sure we have a defined state).
> >>
> >> One question (was not really sure about this one so I did not change it so far):
> >>
> >> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
> >>
> >> 359 void destroy_symtab(symtab_t* symtab) {
> >> 360   if (!symtab) return;
> >> 361   free(symtab->strs);
> >> 362   free(symtab->symbols);
> >> 363   free(symtab);
> >> 364 }
> >>
> >> Here we miss to close symtab->hash_table (opened by dbopen), is it needed (haven't used dbopen much - maybe someone can comment on this)?
> >>
> >> bug/webrev:
> >>
> >> https://bugs.openjdk.java.net/browse/JDK-8230466
> >>
> >> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/
> >>
> >> Thanks and best regards, Matthias
> >>

From yasuenag at gmail.com Wed Sep 4 07:37:48 2019
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 4 Sep 2019 16:37:48 +0900
Subject: RFR: 8230466: check malloc/calloc results in jdk.hotspot.agent
References: <6b6000a6-ca58-dc65-8996-5862a6e3763a@gmail.com>

Looks good!

Yasumasa (ysuenaga)

On 2019/09/04 16:28, Baesken, Matthias wrote:
> Hello Yasumasa and Chris, thanks for your input.
> Here is a new webrev, without the unneeded memset calls after calloc.
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/
>
> Hope everyone is happy with this now!
>
> Best regards, Matthias
>
>> Hi Matthias,
>>
>> src/jdk.hotspot.agent/linux/native/libsaproc/symtab.c:
>> ```
>> 405   // guarantee(symtab == NULL, "multiple symtab");
>> 406   symtab = (struct symtab*)calloc(1, sizeof(struct symtab));
>> 407   if (symtab == NULL) {
>> 408     goto quit;
>> 409   }
>> 410   memset(symtab, 0, sizeof(struct symtab));
>> ```
>>
>> Why do you call memset() to clear symtab in L410?
>> symtab is allocated via calloc() in L406, so symtab would already be cleared.
>>
>> Thanks,
>>
>> Yasumasa (ysuenaga)
>>
>> On 2019/09/03 18:14, David Holmes wrote:
>>> Hi Matthias,
>>>
>>> Re-directing to serviceability-dev.
>>>
>>> David
>>>
>>> On 3/09/2019 5:42 pm, Baesken, Matthias wrote:
>>>> Hello, please review the following small fix.
>>>>
>>>> In jdk.hotspot.agent native code (linux / macosx) we miss to check the result of malloc/calloc a few times.
>>>> This should be adjusted.
>>>> Additionally I added initialization to the symtab array in symtab.c (by calling memset to make sure we have a defined state).
>>>>
>>>> One question (was not really sure about this one so I did not change it so far):
>>>>
>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
>>>>
>>>> 359 void destroy_symtab(symtab_t* symtab) {
>>>> 360   if (!symtab) return;
>>>> 361   free(symtab->strs);
>>>> 362   free(symtab->symbols);
>>>> 363   free(symtab);
>>>> 364 }
>>>>
>>>> Here we miss to close symtab->hash_table (opened by dbopen), is it needed (haven't used dbopen much - maybe someone can comment on this)?
>>>>
>>>> bug/webrev:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8230466
>>>>
>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/
>>>>
>>>> Thanks and best regards, Matthias
>>>>

From alexey.menkov at oracle.com Wed Sep 4 19:19:29 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 4 Sep 2019 12:19:29 -0700
Subject: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException

Hi all,

Please review the fix for BadHandshakeTest test.
The problem is the test connects to the server twice and if debuggee hasn't yet handled disconnection, the next connect gets "connection refused" error.
Instead of adding delay before 2nd connect (we never know "good" value for the delay and big delay can cause "accept timeout"), the test re-tries connect in case of ConnectException.
Also improved/simplified the test slightly - debuggee is now run with auto port assignment (used lib.jdb.Debuggee test class which implements required functionality).
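
[Editor's sketch] The retry-on-ConnectException idea described above could be sketched as below. This is not the code from the webrev: ConnectRetrySketch, connectWithRetry, and the attempt and pause parameters are hypothetical names and values chosen for illustration, and the connect action is abstracted as a Callable so the sketch does not depend on a real debuggee.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Illustrative retry helper; names and limits are assumptions, not the test's code.
final class ConnectRetrySketch {
    // Calls connect until it succeeds, retrying only on ConnectException.
    // Rethrows the last ConnectException once maxAttempts is exhausted.
    static <T> T connectWithRetry(Callable<T> connect, int maxAttempts, long pauseMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConnectException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connect.call();
            } catch (ConnectException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // short pause before the next try
                }
            }
        }
        throw last;
    }

    // Demo: a fake connect action that is refused twice and then succeeds.
    // Returns the number of attempts that were needed.
    static int demoAttempts() {
        int[] calls = {0};
        try {
            String result = connectWithRetry(() -> {
                if (++calls[0] < 3) {
                    throw new ConnectException("Connection refused");
                }
                return "connected";
            }, 5, 10L);
            if (!"connected".equals(result)) {
                throw new IllegalStateException(result);
            }
            return calls[0];
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Compared with a fixed delay before the second connect, a bounded retry loop only waits as long as the other side actually needs, which is the motivation given in the message above.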

jira:
  https://bugs.openjdk.java.net/browse/JDK-8192057
webrev:
  http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev/

--alex

From serguei.spitsyn at oracle.com Wed Sep 4 19:45:27 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 4 Sep 2019 12:45:27 -0700
Subject: RFR: 8230466: check malloc/calloc results in jdk.hotspot.agent
References: <6b6000a6-ca58-dc65-8996-5862a6e3763a@gmail.com>
Message-ID: <796065c6-a4ee-92b2-0881-451537f2a4b9@oracle.com>

Hi Matthias,

It looks good in general but I have some minor comments below.

http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/linux/native/libsaproc/symtab.c.frames.html

 279 build_id_to_debug_filename (size_t size, unsigned char *data)
 280 {
 . . .
 283   filename = malloc(strlen (debug_file_directory) + (sizeof "/.build-id/" - 1) + 1
 284                     + 2 * size + (sizeof ".debug" - 1) + 1);
 285   if (filename == NULL) {
 286     return NULL;
 287   }
 . . .
 312   char *filename
 313     = (build_id_to_debug_filename (note->n_descsz, bytes));
 314   if (filename == NULL) {
 315     return NULL;
 316   }

There is no need to check filename for NULL at the line 314 as the function build_id_to_debug_filename with new check at the line 285 never returns NULL.

http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m.frames.html

 354   array = (*env)->NewByteArray(env, numBytes);
 . . .
 376   if (pages == NULL) {
 377     return NULL;
 378   }
 379   mapped = calloc(pageCount, sizeof(int));
 380   if (mapped == NULL) {
 381     free(pages);
 382     return NULL;
 383   }

Just a question:
  We do not release the array allocated at line 354 because this local reference
  will be auto-released when returning to Java. Is this correct?

http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html

  69     if (is_debug()) {
  70       DBT rkey, rvalue;
  71       char* tmp = (char *)malloc(strlen(symtab->symbols[i].name) + 1);
  72       if (tmp != NULL) {
  73         strcpy(tmp, symtab->symbols[i].name);
  74         rkey.data = tmp;
  75         rkey.size = strlen(tmp) + 1;
  76         (*symtab->hash_table->get)(symtab->hash_table, &rkey, &rvalue, 0);
  77         // we may get a copy back so compare contents
  78         symtab_symbol *res = (symtab_symbol *)rvalue.data;
  79         if (strcmp(res->name, symtab->symbols[i].name) ||
  80           res->offset != symtab->symbols[i].offset ||
  81           res->size != symtab->symbols[i].size) {
  82             print_debug("error to get hash_table value!\n");
  83         }
  84         free(tmp);
  85       }

If malloc returns NULL then this debugging part will be silently skipped.
In other such cases there is an attempt to print a debug message. For instance:

 140   symtab = (symtab_t *)malloc(sizeof(symtab_t));
 141   if (symtab == NULL) {
 142     print_debug("out of memory: allocating symtab\n");
 143     return NULL;
 144   }

I understand that print_debug can fail with out of memory as well.
But it depends on its implementation.

Thanks,
Serguei

On 9/4/19 00:28, Baesken, Matthias wrote:
> Hello Yasumasa and Chris, thanks for your input.
> Here is a new webrev, without the unneeded memset calls after calloc.
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/
>
> Hope everyone is happy with this now!
>
> Best regards, Matthias
>
>> Hi Matthias,
>>
>> src/jdk.hotspot.agent/linux/native/libsaproc/symtab.c:
>> ```
>> 405   // guarantee(symtab == NULL, "multiple symtab");
>> 406   symtab = (struct symtab*)calloc(1, sizeof(struct symtab));
>> 407   if (symtab == NULL) {
>> 408     goto quit;
>> 409   }
>> 410   memset(symtab, 0, sizeof(struct symtab));
>> ```
>>
>> Why do you call memset() to clear symtab in L410?
>> symtab is allocated via calloc() in L406, so symtab would already be cleared.
>>
>> Thanks,
>>
>> Yasumasa (ysuenaga)
>>
>> On 2019/09/03 18:14, David Holmes wrote:
>>> Hi Matthias,
>>>
>>> Re-directing to serviceability-dev.
>>>
>>> David
>>>
>>> On 3/09/2019 5:42 pm, Baesken, Matthias wrote:
>>>> Hello, please review the following small fix.
>>>>
>>>> In jdk.hotspot.agent native code (linux / macosx) we miss to check the result of malloc/calloc a few times.
>>>> This should be adjusted.
>>>> Additionally I added initialization to the symtab array in symtab.c (by calling memset to make sure we have a defined state).
>>>>
>>>> One question (was not really sure about this one so I did not change it so far):
>>>>
>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
>>>>
>>>> 359 void destroy_symtab(symtab_t* symtab) {
>>>> 360   if (!symtab) return;
>>>> 361   free(symtab->strs);
>>>> 362   free(symtab->symbols);
>>>> 363   free(symtab);
>>>> 364 }
>>>>
>>>> Here we miss to close symtab->hash_table (opened by dbopen), is it needed (haven't used dbopen much - maybe someone can comment on this)?
>>>>
>>>> bug/webrev:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8230466
>>>>
>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.0/
>>>>
>>>> Thanks and best regards, Matthias
>>>>

From serguei.spitsyn at oracle.com Wed Sep 4 20:11:07 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 4 Sep 2019 13:11:07 -0700
Subject: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException
Message-ID: <985bd047-9ade-afbd-cde5-c29ba9bc0bd4@oracle.com>

Hi Alex,

The fix looks good.
Good simplification!

Thanks,
Serguei

On 9/4/19 12:19, Alex Menkov wrote:
> Hi all,
>
> Please review the fix for BadHandshakeTest test.
> The problem is the test connects to the server twice and if debuggee
> hasn't yet handled disconnection, the next connect gets "connection
> refused" error.
> Instead of adding delay before 2nd connect (we never know "good" value
> for the delay and big delay can cause "accept timeout"), the test
> re-tries connect in case of ConnectException.
> Also improved/simplified the test slightly - debuggee is now run with
> auto port assignment (used lib.jdb.Debuggee test class which
> implements required functionality).
>
> jira:
>   https://bugs.openjdk.java.net/browse/JDK-8192057
> webrev:
>   http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev/
>
> --alex

From david.holmes at oracle.com Thu Sep 5 06:43:16 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 5 Sep 2019 16:43:16 +1000
Subject: RFR (S): 8227563: jvmti/scenarios/contention/TC05/tc05t001 fails ...
Message-ID: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8227563
webrev: http://cr.openjdk.java.net/~dholmes/8227563/webrev/

See bug report for gory details. Basically on Windows thread-cpu-time
may only get updated at the resolution of the timer interrupt, which may
be as long as 16ms. The test checks that the elapsed cpu time is < 10ms
and so would fail if the first and second calls to get the time
straddled a timer interrupt (and normally it passes because both calls
return zero).

Fix is to bump the allowed time to 16ms on Windows.

I also corrected a comment as to why part of the test is already
disabled on Windows, and enabled verbose logging so that if this test
fails again it will be much easier to see why.
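
The failure mode described above can be illustrated with a small sketch. This is not code from the tc05t001 test: the function names are hypothetical, and the 15.625 ms tick is an assumed value chosen only to sit inside the "as long as 16ms" range mentioned in the message. It models a clock that only advances on timer ticks, so two readings taken 1 ms apart can differ by a whole tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NANOS_PER_MILLI 1000000LL

/* CPU time as a coarse clock would report it: rounded down to the last tick.
 * Hypothetical helper, used only to model the behaviour described above. */
static int64_t observed_cpu_nanos(int64_t true_nanos, int64_t tick_nanos) {
    assert(tick_nanos > 0);
    return (true_nanos / tick_nanos) * tick_nanos;
}

/* Assumed limits for illustration: 16 ms on Windows, 10 ms elsewhere. */
static int64_t allowed_delta_nanos(bool is_windows) {
    return (is_windows ? 16 : 10) * NANOS_PER_MILLI;
}

/* True if the difference between two readings is under the platform limit. */
static bool delta_ok(int64_t first, int64_t second, bool is_windows) {
    return (second - first) < allowed_delta_nanos(is_windows);
}
```

With a 15.625 ms tick, true CPU times of 15 ms and 16 ms are observed as 0 and 15.625 ms. That delta exceeds a 10 ms limit but stays under a 16 ms limit, which matches the reasoning for raising the limit on Windows.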
Testing: itself, on Windows numerous times

Thanks,
David

From matthias.baesken at sap.com Thu Sep 5 08:19:05 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 5 Sep 2019 08:19:05 +0000
Subject: RFR: 8230466: check malloc/calloc results in jdk.hotspot.agent
In-Reply-To: <796065c6-a4ee-92b2-0881-451537f2a4b9@oracle.com>
References: <6b6000a6-ca58-dc65-8996-5862a6e3763a@gmail.com> <796065c6-a4ee-92b2-0881-451537f2a4b9@oracle.com>
Message-ID:

Hello Serguei, thanks for the comments.

>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/linux/native/libsaproc/symtab.c.frames.html
>
>  279 build_id_to_debug_filename (size_t size, unsigned char *data)
>  280 {
>  . . .
>  283   filename = malloc(strlen (debug_file_directory) + (sizeof "/.build-id/" - 1) + 1
>  284                     + 2 * size + (sizeof ".debug" - 1) + 1);
>  285   if (filename == NULL) {
>  286     return NULL;
>  287   }
>  . . .
>  312   char *filename
>  313     = (build_id_to_debug_filename (note->n_descsz, bytes));
>  314   if (filename == NULL) {
>  315     return NULL;
>  316   }
>
> There is no need to check filename for NULL at the line 314 as the function
> build_id_to_debug_filename with new check at the line 285 never returns
> NULL.
>

At line 286 of build_id_to_debug_filename we now return NULL (in case malloc cannot allocate memory). So we should check this also at line 314, or am I missing something?

>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m.frames.html
>
>  354   array = (*env)->NewByteArray(env, numBytes);
>  . . .
>  376   if (pages == NULL) {
>  377     return NULL;
>  378   }
>  379   mapped = calloc(pageCount, sizeof(int));
>  380   if (mapped == NULL) {
>  381     free(pages);
>  382     return NULL;
>  383   }
>
> Just a question:
>   We do not release the array allocated at line 354 because this local reference
>
will be auto-released when returning to Java. Is this correct?
>

Good point, I think we had better add DeleteLocalRef here for the new early returns.

>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
>
>   69     if (is_debug()) {
>   70       DBT rkey, rvalue;
>   71       char* tmp = (char *)malloc(strlen(symtab->symbols[i].name) + 1);
>   72       if (tmp != NULL) {
>   73         strcpy(tmp, symtab->symbols[i].name);
>   74         rkey.data = tmp;
>   75         rkey.size = strlen(tmp) + 1;
>   76         (*symtab->hash_table->get)(symtab->hash_table, &rkey, &rvalue, 0);
>   77         // we may get a copy back so compare contents
>   78         symtab_symbol *res = (symtab_symbol *)rvalue.data;
>   79         if (strcmp(res->name, symtab->symbols[i].name) ||
>   80           res->offset != symtab->symbols[i].offset ||
>   81           res->size != symtab->symbols[i].size) {
>   82             print_debug("error to get hash_table value!\n");
>   83         }
>   84         free(tmp);
>   85       }
>
> If malloc returns NULL then this debugging part will be silently skipped.
> In other such cases there is an attempt to print a debug message.
> For instance:
>
>  140   symtab = (symtab_t *)malloc(sizeof(symtab_t));
>  141   if (symtab == NULL) {
>  142     print_debug("out of memory: allocating symtab\n");
>  143     return NULL;
>  144   }
>
> I understand that print_debug can fail with out of memory as well.
> But it depends on its implementation.
>
> Thanks,
> Serguei
>
>

That's a good idea. I added a message.
See new webrev: http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.2/

Thanks, Matthias

From serguei.spitsyn at oracle.com Thu Sep 5 08:33:29 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Sep 2019 01:33:29 -0700
Subject: RFR: 8230466: check malloc/calloc results in jdk.hotspot.agent
In-Reply-To:
References: <6b6000a6-ca58-dc65-8996-5862a6e3763a@gmail.com> <796065c6-a4ee-92b2-0881-451537f2a4b9@oracle.com>
Message-ID:

Hi Matthias,

On 9/5/19 01:19, Baesken, Matthias wrote:
> Hello Serguei, thanks for the comments.
>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/linux/native/libsaproc/symtab.c.frames.html
>>
>>  279 build_id_to_debug_filename (size_t size, unsigned char *data)
>>  280 {
>>  . . .
>>  283   filename = malloc(strlen (debug_file_directory) + (sizeof "/.build-id/" - 1) + 1
>>  284                     + 2 * size + (sizeof ".debug" - 1) + 1);
>>  285   if (filename == NULL) {
>>  286     return NULL;
>>  287   }
>>  . . .
>>  312   char *filename
>>  313     = (build_id_to_debug_filename (note->n_descsz, bytes));
>>  314   if (filename == NULL) {
>>  315     return NULL;
>>  316   }
>>
>> There is no need to check filename for NULL at the line 314 as the function
>> build_id_to_debug_filename with new check at the line 285 never returns
>> NULL.
>>
> At line 286 of build_id_to_debug_filename we now return NULL (in case malloc cannot allocate memory). So we should check this also
> at line 314, or am I missing something?

Oh, sorry.
You are right, thanks!

>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/macosx/native/libsaproc/MacosxDebuggerLocal.m.frames.html
>>
>>  354   array = (*env)->NewByteArray(env, numBytes);
>>  . . .
>>  376   if (pages == NULL) {
>>  377     return NULL;
>>  378   }
>>  379   mapped = calloc(pageCount, sizeof(int));
>>  380   if (mapped == NULL) {
>>  381     free(pages);
>>  382     return NULL;
>>  383
}
>>
>> Just a question:
>>   We do not release the array allocated at line 354 because this local reference
>>   will be auto-released when returning to Java. Is this correct?
>>
> Good point, I think we had better add DeleteLocalRef here for the new early returns.
>
>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.1/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
>>
>>   69     if (is_debug()) {
>>   70       DBT rkey, rvalue;
>>   71       char* tmp = (char *)malloc(strlen(symtab->symbols[i].name) + 1);
>>   72       if (tmp != NULL) {
>>   73         strcpy(tmp, symtab->symbols[i].name);
>>   74         rkey.data = tmp;
>>   75         rkey.size = strlen(tmp) + 1;
>>   76         (*symtab->hash_table->get)(symtab->hash_table, &rkey, &rvalue, 0);
>>   77         // we may get a copy back so compare contents
>>   78         symtab_symbol *res = (symtab_symbol *)rvalue.data;
>>   79         if (strcmp(res->name, symtab->symbols[i].name) ||
>>   80           res->offset != symtab->symbols[i].offset ||
>>   81           res->size != symtab->symbols[i].size) {
>>   82             print_debug("error to get hash_table value!\n");
>>   83         }
>>   84         free(tmp);
>>   85       }
>>
>> If malloc returns NULL then this debugging part will be silently skipped.
>> In other such cases there is an attempt to print a debug message.
>> For instance:
>>
>>  140   symtab = (symtab_t *)malloc(sizeof(symtab_t));
>>  141   if (symtab == NULL) {
>>  142     print_debug("out of memory: allocating symtab\n");
>>  143     return NULL;
>>  144   }
>>
>> I understand that print_debug can fail with out of memory as well.
>> But it depends on its implementation.
>>
>> Thanks,
>> Serguei
>>
>>
> That's a good idea. I added a message.
>
> See new webrev: http://cr.openjdk.java.net/~mbaesken/webrevs/8230466.2/

Thank you for the update!
It looks good to me.
Thanks,
Serguei

>
>
> Thanks, Matthias
>

From christoph.langer at sap.com Thu Sep 5 15:01:16 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Thu, 5 Sep 2019 15:01:16 +0000
Subject: RFR(S): 8230666: Exclude serviceability/sa/TestInstanceKlassSize.java on linuxppc64 and linuxppc64le
Message-ID:

Hi,

please review exclusion of serviceability/sa/TestInstanceKlassSize.java for linux on the ppc platforms.

The test has probably always been failing on these platforms. Martin has done some initial analysis of the problem. I've opened https://bugs.openjdk.java.net/browse/JDK-8230664 to track resolution (@Martin: can you please add some more technical detail to the bug? Thanks.)

The resolution will probably take some time and the platforms are not tier1, so we shall exempt the test from being executed for the time being.

This is the bug: https://bugs.openjdk.java.net/browse/JDK-8230666

This is the patch:

diff -r 397b97fb989c test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt Thu Sep 05 16:50:18 2019 +0200
+++ b/test/hotspot/jtreg/ProblemList.txt Thu Sep 05 16:54:36 2019 +0200
@@ -128,7 +128,7 @@
 serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
 serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
 serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
-serviceability/sa/TestInstanceKlassSize.java 8193639 solaris-all
+serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
 serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639 solaris-all
 serviceability/sa/TestIntConstant.java 8193639,8211767 solaris-all,linux-ppc64le,linux-ppc64
 serviceability/sa/TestJhsdbJstackLock.java 8193639 solaris-all

Thanks
Christoph
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From daniel.daugherty at oracle.com Thu Sep 5 15:49:21 2019
From: daniel.daugherty at oracle.com (Daniel D.
Daugherty)
Date: Thu, 5 Sep 2019 11:49:21 -0400
Subject: RFR(S): 8230666: Exclude serviceability/sa/TestInstanceKlassSize.java on linuxppc64 and linuxppc64le
In-Reply-To:
References:
Message-ID:

Thumbs up. This is a trivial fix so only requires a single (R)eviewer.

Dan

On 9/5/19 11:01 AM, Langer, Christoph wrote:
>
> Hi,
>
> please review exclusion of
> serviceability/sa/TestInstanceKlassSize.java for linux on the ppc
> platforms.
>
> The test has probably always been failing on these platforms. Martin has done
> some initial analysis of the problem. I've opened
> https://bugs.openjdk.java.net/browse/JDK-8230664 to track resolution
> (@Martin: can you please add some more technical detail to the bug?
> Thanks.)
>
> The resolution will probably take some time and the platforms are not
> tier1, so we shall exempt the test from being executed for the time being.
>
> This is the bug: https://bugs.openjdk.java.net/browse/JDK-8230666
>
>
> This is the patch:
>
> diff -r 397b97fb989c test/hotspot/jtreg/ProblemList.txt
>
> --- a/test/hotspot/jtreg/ProblemList.txt        Thu Sep 05 16:50:18 2019 +0200
>
> +++ b/test/hotspot/jtreg/ProblemList.txt        Thu Sep 05 16:54:36 2019 +0200
>
> @@ -128,7 +128,7 @@
>
> serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
>
> serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
>
> serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
>
> -serviceability/sa/TestInstanceKlassSize.java 8193639 solaris-all
>
> +serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>
> serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639 solaris-all
>
> serviceability/sa/TestIntConstant.java 8193639,8211767 solaris-all,linux-ppc64le,linux-ppc64
>
> serviceability/sa/TestJhsdbJstackLock.java 8193639 solaris-all
>
> Thanks
>
> Christoph
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From alexey.menkov at oracle.com Thu Sep 5 17:36:29 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 5 Sep 2019 10:36:29 -0700 Subject: RFR (S): 8227563: jvmti/scenarios/contention/TC05/tc05t001 fails ... In-Reply-To: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com> References: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com> Message-ID: <879069b3-5fa5-0264-d1b3-f21291e40aa9@oracle.com> Looks reasonable. --alex On 09/04/2019 23:43, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8227563 > webrev: http://cr.openjdk.java.net/~dholmes/8227563/webrev/ > > See bug report for gory details. Basically on Windows thread-cpu-time > may only get updated at the resolution of the timer interrupt, which may > be as long as 16ms. The test checks that the elapsed cpu time is < 10ms > and so would fail if the first and second calls to get the time > straddled a timer interrupt (and normally it passes because both calls > return zero). > > Fix is to bump the allowed time to 16ms on Windows. > > I also corrected a comment as to why part of the test is already > disabled on Windows, and enabled verbose logging so that if this test > fails again it will be much easier to see why. > > Testing: itself, on Windows numerous times > > Thanks, > David From alexey.menkov at oracle.com Thu Sep 5 18:20:19 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 5 Sep 2019 11:20:19 -0700 Subject: RFR: JDK-8186825: some memory leak issues in the transport_startTransport Message-ID: Hi all, Please review the fix for https://bugs.openjdk.java.net/browse/JDK-8186825 webrev: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp_memory_leak/webrev/ TransportInfo structure is used to pass data from main thread (transport_startTransport function) to acceptThread/attachThread and should be released by acceptThread/attachThread. 
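
The ownership rule stated above (the starting thread allocates the structure, the accept/attach thread releases it) can be sketched as follows. This is a minimal model under stated assumptions, not the JDWP agent's code: `transport_info_t`, `start_transport`, `spawn_fn` and the `live_infos` counter are hypothetical names, and thread creation is replaced by a plain callback so the example needs no threading library. The sketch also assumes that the caller must free the structure itself when the thread could not be started; the message does not say how the actual fix handles that path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the TransportInfo structure discussed above. */
typedef struct {
    char *name;
    char *address;
} transport_info_t;

/* Counts structures that are allocated but not yet freed (illustration only). */
static int live_infos = 0;

static char *dup_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) {
        memcpy(p, s, n);
    }
    return p;
}

static void free_info(transport_info_t *info) {
    if (info == NULL) {
        return;
    }
    free(info->name);
    free(info->address);
    free(info);
    live_infos--;
}

static transport_info_t *new_info(const char *name, const char *address) {
    transport_info_t *info = calloc(1, sizeof(*info));
    if (info == NULL) {
        return NULL;
    }
    live_infos++;
    info->name = dup_string(name);
    info->address = dup_string(address);
    if (info->name == NULL || info->address == NULL) {
        free_info(info);   /* partial allocation: release what we got */
        return NULL;
    }
    return info;
}

/* Thread entry point: it owns arg and releases it when done. */
static void accept_thread(void *arg) {
    transport_info_t *info = arg;
    assert(info != NULL);
    /* ... the real thread would use info->name and info->address here ... */
    free_info(info);
}

/* Hypothetical thread starter: returns 0 if the entry function was started. */
typedef int (*spawn_fn)(void (*entry)(void *), void *arg);

static int start_transport(const char *name, const char *address, spawn_fn spawn) {
    transport_info_t *info = new_info(name, address);
    if (info == NULL) {
        return -1;
    }
    if (spawn(accept_thread, info) != 0) {
        free_info(info);   /* thread never ran, so ownership stayed here */
        return -1;
    }
    return 0;              /* ownership was handed to the thread */
}

/* Two trivial starters used to exercise both paths. */
static int spawn_inline(void (*entry)(void *), void *arg) {
    entry(arg);
    return 0;
}

static int spawn_fail(void (*entry)(void *), void *arg) {
    (void)entry;
    (void)arg;
    return -1;
}
```

Calling `start_transport` with `spawn_inline` and then with `spawn_fail` leaves `live_infos` at zero in both cases, which is the property a leak fix of this kind is after.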
--alex From serguei.spitsyn at oracle.com Thu Sep 5 18:48:38 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 5 Sep 2019 11:48:38 -0700 Subject: RFR (S): 8227563: jvmti/scenarios/contention/TC05/tc05t001 fails ... In-Reply-To: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com> References: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com> Message-ID: Hi David, It looks good. Thank you for taking care about this! Thanks, Serguei On 9/4/19 23:43, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8227563 > webrev: http://cr.openjdk.java.net/~dholmes/8227563/webrev/ > > See bug report for gory details. Basically on Windows thread-cpu-time > may only get updated at the resolution of the timer interrupt, which > may be as long as 16ms. The test checks that the elapsed cpu time is < > 10ms and so would fail if the first and second calls to get the time > straddled a timer interrupt (and normally it passes because both calls > return zero). > > Fix is to bump the allowed time to 16ms on Windows. > > I also corrected a comment as to why part of the test is already > disabled on Windows, and enabled verbose logging so that if this test > fails again it will be much easier to see why. > > Testing: itself, on Windows numerous times > > Thanks, > David From serguei.spitsyn at oracle.com Thu Sep 5 19:08:19 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 5 Sep 2019 12:08:19 -0700 Subject: RFR: JDK-8186825: some memory leak issues in the transport_startTransport In-Reply-To: References: Message-ID: <9531fb93-e9e0-5897-a466-8422244477ca@oracle.com> Hi Alex, Looks good to me. Thanks, Serguei On 9/5/19 11:20, Alex Menkov wrote: > Hi all, > > Please review the fix for > ? 
https://bugs.openjdk.java.net/browse/JDK-8186825 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp_memory_leak/webrev/ > > TransportInfo structure is used to pass data from main thread > (transport_startTransport function) to acceptThread/attachThread and > should be released by acceptThread/attachThread. > > --alex From chris.plummer at oracle.com Thu Sep 5 20:53:10 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 5 Sep 2019 13:53:10 -0700 Subject: RFR (S): 8227563: jvmti/scenarios/contention/TC05/tc05t001 fails ... In-Reply-To: References: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com> Message-ID: <5a6c9757-166c-bf9e-d125-6782d5a863f1@oracle.com> +1 Chris On 9/5/19 11:48 AM, serguei.spitsyn at oracle.com wrote: > Hi David, > > It looks good. > Thank you for taking care about this! > > Thanks, > Serguei > > > On 9/4/19 23:43, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8227563 >> webrev: http://cr.openjdk.java.net/~dholmes/8227563/webrev/ >> >> See bug report for gory details. Basically on Windows thread-cpu-time >> may only get updated at the resolution of the timer interrupt, which >> may be as long as 16ms. The test checks that the elapsed cpu time is >> < 10ms and so would fail if the first and second calls to get the >> time straddled a timer interrupt (and normally it passes because both >> calls return zero). >> >> Fix is to bump the allowed time to 16ms on Windows. >> >> I also corrected a comment as to why part of the test is already >> disabled on Windows, and enabled verbose logging so that if this test >> fails again it will be much easier to see why. >> >> Testing: itself, on Windows numerous times >> >> Thanks, >> David > From david.holmes at oracle.com Thu Sep 5 22:00:57 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Sep 2019 08:00:57 +1000 Subject: RFR (S): 8227563: jvmti/scenarios/contention/TC05/tc05t001 fails ... 
In-Reply-To: <879069b3-5fa5-0264-d1b3-f21291e40aa9@oracle.com> References: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com> <879069b3-5fa5-0264-d1b3-f21291e40aa9@oracle.com> Message-ID: Thanks Alex. david On 6/09/2019 3:36 am, Alex Menkov wrote: > Looks reasonable. > > --alex > > On 09/04/2019 23:43, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8227563 >> webrev: http://cr.openjdk.java.net/~dholmes/8227563/webrev/ >> >> See bug report for gory details. Basically on Windows thread-cpu-time >> may only get updated at the resolution of the timer interrupt, which >> may be as long as 16ms. The test checks that the elapsed cpu time is < >> 10ms and so would fail if the first and second calls to get the time >> straddled a timer interrupt (and normally it passes because both calls >> return zero). >> >> Fix is to bump the allowed time to 16ms on Windows. >> >> I also corrected a comment as to why part of the test is already >> disabled on Windows, and enabled verbose logging so that if this test >> fails again it will be much easier to see why. >> >> Testing: itself, on Windows numerous times >> >> Thanks, >> David From david.holmes at oracle.com Thu Sep 5 22:02:30 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Sep 2019 08:02:30 +1000 Subject: RFR (S): 8227563: jvmti/scenarios/contention/TC05/tc05t001 fails ... In-Reply-To: References: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com> Message-ID: <59729749-dafb-4f2a-c76d-2e4acf260f7b@oracle.com> On 6/09/2019 4:48 am, serguei.spitsyn at oracle.com wrote: > Hi David, > > It looks good. Thanks serguei. > Thank you for taking care about this! It's going to fail more often after my changes in JDK-6313903 (not really certain why though). Cheers, David > Thanks, > Serguei > > > On 9/4/19 23:43, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8227563 >> webrev: http://cr.openjdk.java.net/~dholmes/8227563/webrev/ >> >> See bug report for gory details. 
Basically on Windows thread-cpu-time >> may only get updated at the resolution of the timer interrupt, which >> may be as long as 16ms. The test checks that the elapsed cpu time is < >> 10ms and so would fail if the first and second calls to get the time >> straddled a timer interrupt (and normally it passes because both calls >> return zero). >> >> Fix is to bump the allowed time to 16ms on Windows. >> >> I also corrected a comment as to why part of the test is already >> disabled on Windows, and enabled verbose logging so that if this test >> fails again it will be much easier to see why. >> >> Testing: itself, on Windows numerous times >> >> Thanks, >> David > From david.holmes at oracle.com Thu Sep 5 22:02:53 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Sep 2019 08:02:53 +1000 Subject: RFR (S): 8227563: jvmti/scenarios/contention/TC05/tc05t001 fails ... In-Reply-To: <5a6c9757-166c-bf9e-d125-6782d5a863f1@oracle.com> References: <57caaba9-3d9d-6db7-d4af-79f48dd8db1c@oracle.com> <5a6c9757-166c-bf9e-d125-6782d5a863f1@oracle.com> Message-ID: <3384cc69-19d0-137c-bb5c-b97adbb85d87@oracle.com> Thanks Chris! David On 6/09/2019 6:53 am, Chris Plummer wrote: > +1 > > Chris > > On 9/5/19 11:48 AM, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> It looks good. >> Thank you for taking care about this! >> >> Thanks, >> Serguei >> >> >> On 9/4/19 23:43, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227563 >>> webrev: http://cr.openjdk.java.net/~dholmes/8227563/webrev/ >>> >>> See bug report for gory details. Basically on Windows thread-cpu-time >>> may only get updated at the resolution of the timer interrupt, which >>> may be as long as 16ms. The test checks that the elapsed cpu time is >>> < 10ms and so would fail if the first and second calls to get the >>> time straddled a timer interrupt (and normally it passes because both >>> calls return zero). >>> >>> Fix is to bump the allowed time to 16ms on Windows. 
>>>
>>> I also corrected a comment as to why part of the test is already
>>> disabled on Windows, and enabled verbose logging so that if this test
>>> fails again it will be much easier to see why.
>>>
>>> Testing: itself, on Windows numerous times
>>>
>>> Thanks,
>>> David
>>
>

From ioi.lam at oracle.com Fri Sep 6 02:27:38 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 5 Sep 2019 19:27:38 -0700
Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects
Message-ID:

https://bugs.openjdk.java.net/browse/JDK-8230674
http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01

Please review this small fix:

When CDS is in use, archived objects are memory-mapped into the heap
(currently G1GC only). These objects are partitioned into "subgraphs".
Some of these subgraphs may not be loaded (e.g., those related to
jdk.internal.math.FDBigInteger) at the time a heap dump is requested.

When a subgraph is not loaded, some of the objects in this subgraph may
belong to a class that's not yet loaded.

The bug happens when such a "dormant" object is dumped, but its class
is not dumped because the class is not in the system dictionary.

There is already code in DumperSupport::dump_instance() that tries to
handle dormant objects, but it needs to be extended to cover arrays, as
well as references from non-dormant objects/arrays to dormant ones.

Thanks
- Ioi

From david.holmes at oracle.com Fri Sep 6 03:18:52 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Sep 2019 13:18:52 +1000
Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects
In-Reply-To:
References:
Message-ID:

Hi Ioi,

On 6/09/2019 12:27 pm, Ioi Lam wrote:
> https://bugs.openjdk.java.net/browse/JDK-8230674
> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01
>
>
> Please review this small fix:
>
> When CDS is in use, archived objects are memory-mapped into the heap
> (currently G1GC only).
These objects are partitioned into
> "subgraphs". Some of these subgraphs may not be loaded (e.g., those
> related to jdk.internal.math.FDBigInteger) at the time a heap dump is
> requested.
>
> When a subgraph is not loaded, some of the objects in this subgraph may
> belong to a class that's not yet loaded.
>
> The bug happens when such a "dormant" object is dumped, but its class
> is not dumped because the class is not in the system dictionary.
>
> There is already code in DumperSupport::dump_instance() that tries to
> handle dormant objects, but it needs to be extended to cover arrays, as well as
> references from non-dormant objects/arrays to dormant ones.

I have to confess I did not pay any attention to the CDS archived objects work, so I don't have a firm grasp of how you have implemented things. But I'm wondering how you can have a reference to a dormant object from a non-dormant one? Shouldn't the act of becoming non-dormant automatically cause the subgraph from that object to also become non-dormant? Or do you have "read barriers" to perform the changes on demand?

That aside, the code changes seem reasonable: you moved the check out of DumperSupport::dump_instance and into the higher-level HeapObjectDumper::do_object so that it catches instances and arrays, plus you added a check for array elements.
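
The shape of the change summarized above can be sketched with a toy model. This is not HotSpot code: `toy_obj`, `mask_dormant` and `dump_object` are hypothetical names, and "class loaded" is reduced to a boolean flag. The sketch only shows the two checks being discussed: one on the object visited by the heap walk, and one on each reference that object holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a heap object; all names here are hypothetical. */
typedef struct toy_obj {
    bool class_loaded;          /* false models a dormant archived object */
    struct toy_obj *elems[4];   /* outgoing references (fields or array elements) */
    size_t n_elems;
} toy_obj;

/* Replace a reference to a dormant object with NULL; pass others through. */
static toy_obj *mask_dormant(toy_obj *o) {
    return (o != NULL && !o->class_loaded) ? NULL : o;
}

/* Dump one object found by a linear heap walk.
 * Returns -1 if the object itself is dormant and is skipped entirely,
 * otherwise the number of non-NULL references written to out_refs. */
static int dump_object(toy_obj *o, toy_obj **out_refs) {
    assert(o != NULL);
    if (mask_dormant(o) == NULL) {
        return -1;                               /* check at the top level */
    }
    int written = 0;
    for (size_t i = 0; i < o->n_elems; i++) {
        toy_obj *ref = mask_dormant(o->elems[i]); /* check on each element */
        out_refs[i] = ref;
        if (ref != NULL) {
            written++;
        }
    }
    return written;
}
```

In this model a live array that refers to one dormant and one live element is dumped with a NULL in place of the dormant element, and a dormant object reached directly by the walk is not dumped at all.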
Thanks,
David

>
>
> Thanks
> - Ioi

From ioi.lam at oracle.com Fri Sep 6 03:39:51 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 5 Sep 2019 20:39:51 -0700
Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects
In-Reply-To:
References:
Message-ID: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com>

On 9/5/19 8:18 PM, David Holmes wrote:
> Hi Ioi,
>
> On 6/09/2019 12:27 pm, Ioi Lam wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8230674
>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01
>>
>>
>> Please review this small fix:
>>
>> When CDS is in use, archived objects are memory-mapped into the heap
>> (currently G1GC only). These objects are partitioned into
>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those
>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is
>> requested.
>
>> When a subgraph is not loaded, some of the objects in this subgraph
>> may belong to a class that's not yet loaded.
>>
>> The bug happens when such a "dormant" object is dumped, but its class
>> is not dumped because the class is not in the system dictionary.
>>
>> There is already code in DumperSupport::dump_instance() that tries to
>> handle dormant objects, but it needs to be extended to cover arrays,
>> as well as references from non-dormant objects/arrays to dormant
>> ones.
>
> I have to confess I did not pay any attention to the CDS archived
> objects work, so I don't have a firm grasp of how you have implemented
> things. But I'm wondering how you can have a reference to a dormant
> object from a non-dormant one? Shouldn't the act of becoming
> non-dormant automatically cause the subgraph from that object to also
> become non-dormant? Or do you have "read barriers" to perform the
> changes on demand?
>

Hi David,

Thanks for the review.

The dormant objects are not reachable via the GC roots.
They become non-dormant via explicit calls to JVM_InitializeFromArchive, after which they become reachable via the static fields of loaded classes.

The only issue here is that the heap dump is done by scanning all objects in the heap, including unreachable ones:

  HeapObjectDumper obj_dumper(this, writer());
  Universe::heap()->safe_object_iterate(&obj_dumper);

That's how these dormant objects are discovered during heap dump.

> That aside, the code changes seem reasonable: you moved the check out
> of DumperSupport::dump_instance and into the higher-level
> HeapObjectDumper::do_object so that it catches instances and arrays,
> plus you added a check for array elements.
>

I am debating whether I should put the masking code in here:

void DumpWriter::write_objectID(oop o) {
  o = mask_dormant_archived_object(o);  /// <---- add
  address a = (address)o;
#ifdef _LP64
  write_u8((u8)a);
#else
  write_u4((u4)a);
#endif
}

That way, even if a dormant object (unintentionally) becomes reachable via the GC roots, we won't write an invalid reference to it (the object "body" will not be written, so the ID will not point to anything valid).

But this seems a little too aggressive to me. What do you think?

Thanks
- Ioi

From david.holmes at oracle.com Fri Sep 6 06:11:19 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Sep 2019 16:11:19 +1000
Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects
In-Reply-To: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com>
References: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com>
Message-ID:

On 6/09/2019 1:39 pm, Ioi Lam wrote:
> On 9/5/19 8:18 PM, David Holmes wrote:
>> Hi Ioi,
>>
>> On 6/09/2019 12:27 pm, Ioi Lam wrote:
>>> https://bugs.openjdk.java.net/browse/JDK-8230674
>>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01
>>>
>>>
>>> Please review this small fix:
>>>
>>> When CDS is in use, archived objects are memory-mapped into the heap
>>> (currently G1GC only).
These objects are partitioned into
>>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those
>>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is
>>> requested.
>
>>> When a subgraph is not loaded, some of the objects in this subgraph
>>> may belong to a class that's not yet loaded.
>>>
>>> The bug happens when such a "dormant" object is dumped, but its class
>>> is not dumped because the class is not in the system dictionary.
>>>
>>> There is already code in DumperSupport::dump_instance() that tries to
>>> handle dormant objects, but it needs to be extended to cover arrays,
>>> as well as references from non-dormant objects/arrays to dormant
>>> ones.
>>
>> I have to confess I did not pay any attention to the CDS archived
>> objects work, so I don't have a firm grasp of how you have implemented
>> things. But I'm wondering how you can have a reference to a dormant
>> object from a non-dormant one? Shouldn't the act of becoming
>> non-dormant automatically cause the subgraph from that object to also
>> become non-dormant? Or do you have "read barriers" to perform the
>> changes on demand?
>>
>
> Hi David,
>
> Thanks for the review.
>
> The dormant objects are not reachable via the GC roots. They become
> non-dormant via explicit calls to JVM_InitializeFromArchive, after which
> they become reachable via the static fields of loaded classes.

Right, so is there a distinction between non-dormant and reachable at the time an object becomes non-dormant? I'm still unclear how a dormant array becomes non-dormant but still contains elements that refer to dormant objects.

> The only issue here is that the heap dump is done by scanning all objects in the
> heap, including unreachable ones:
>
>   HeapObjectDumper obj_dumper(this, writer());
>   Universe::heap()->safe_object_iterate(&obj_dumper);
>
> That's how these dormant objects are discovered during heap dump.
>
>> That aside, the code changes seem reasonable: you moved the check out
>> of DumperSupport::dump_instance and into the higher-level
>> HeapObjectDumper::do_object so that it catches instances and arrays,
>> plus you added a check for array elements.
>>
>
> I am debating whether I should put the masking code in here:
>
> void DumpWriter::write_objectID(oop o) {
>   o = mask_dormant_archived_object(o);  /// <---- add
>   address a = (address)o;
> #ifdef _LP64
>   write_u8((u8)a);
> #else
>   write_u4((u4)a);
> #endif
> }
>
>
> That way, even if a dormant object (unintentionally) becomes reachable
> via the GC roots, we won't write an invalid reference to it (the object
> "body" will not be written, so the ID will not point to anything valid).
>
> But this seems a little too aggressive to me. What do you think?

It does seem a little aggressive as it seems to introduce the dormancy check into a lot of places that don't need it. But as I said I don't know this code so I'm really not the right person to ask.

Cheers,
David
-----

> Thanks
> - Ioi
>

From christoph.langer at sap.com Fri Sep 6 13:15:26 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Fri, 6 Sep 2019 13:15:26 +0000
Subject: RFR(S): 8230666: Exclude serviceability/sa/TestInstanceKlassSize.java on linuxppc64 and linuxppc64le
In-Reply-To:
References:
Message-ID:

Thanks, Dan. Pushed.

From: Daniel D. Daugherty
Sent: Thursday, 5 September 2019 17:49
To: Langer, Christoph ; OpenJDK Serviceability
Subject: Re: RFR(S): 8230666: Exclude serviceability/sa/TestInstanceKlassSize.java on linuxppc64 and linuxppc64le

Thumbs up. This is a trivial fix so only requires a single (R)eviewer.

Dan

On 9/5/19 11:01 AM, Langer, Christoph wrote:

Hi,

please review exclusion of serviceability/sa/TestInstanceKlassSize.java for linux on the ppc platforms.

The test has probably always been failing on these platforms. Martin has done some initial analysis of the problem.
I've opened https://bugs.openjdk.java.net/browse/JDK-8230664 to track resolution (@Martin: can you please add some more technical detail to the bug? Thanks.) The resolution will probably take some time and the platforms are not tier1, so we shall exempt the test from being executed for the time being. This is the bug: https://bugs.openjdk.java.net/browse/JDK-8230666 This is the patch:

diff -r 397b97fb989c test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt Thu Sep 05 16:50:18 2019 +0200
+++ b/test/hotspot/jtreg/ProblemList.txt Thu Sep 05 16:54:36 2019 +0200
@@ -128,7 +128,7 @@
 serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
 serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
 serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
-serviceability/sa/TestInstanceKlassSize.java 8193639 solaris-all
+serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
 serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639 solaris-all
 serviceability/sa/TestIntConstant.java 8193639,8211767 solaris-all,linux-ppc64le,linux-ppc64
 serviceability/sa/TestJhsdbJstackLock.java 8193639 solaris-all

Thanks Christoph From richard.reingruber at sap.com Fri Sep 6 14:24:00 2019 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 6 Sep 2019 14:24:00 +0000 Subject: RFR(S) 8230677: Should disable Escape Analysis if JVMTI capability can_get_owned_monitor_info was taken Message-ID: Hi, could I please get reviews for Webrev: http://cr.openjdk.java.net/~rrich/webrevs/2019/8230677/webrev.0/ Bug: https://bugs.openjdk.java.net/browse/JDK-8230677 The JVMTI functions GetOwnedMonitorInfo() and GetOwnedMonitorStackDepthInfo() can be used to retrieve objects locked by a thread. In terms of escape analysis those references escape and optimizations like scalar replacement become invalid.
The runtime currently cannot cope with objects escaping through JVMTI (try included tests). Therefore escape analysis should be disabled if an agent requests the capabilities can_get_owned_monitor_info or can_get_owned_monitor_stack_depth_info. This was taken out of JDK-8227745 [1] to make it smaller. With JDK-8227745 there's no need to disable escape analysis, instead optimizations based on escape analysis will be reverted just before objects escape through JVMTI. I've run tier1 tests. Thanks, Richard. [1] https://bugs.openjdk.java.net/browse/JDK-8227745 From jianglizhou at google.com Fri Sep 6 15:18:37 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Fri, 6 Sep 2019 08:18:37 -0700 Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects In-Reply-To: References: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com> Message-ID: Just to answer David's questions. Those are all very good questions! A dormant object is an unreachable object in the archived Java heap region. At the moment when the VM maps in the 'Open' archive heap regions, all objects within the regions are considered as dormant objects. The state of an archived object changes when its reachability changes. Any archived Java object that is reachable by a live (non-dormant) object is effectively a non-dormant/live object. An archived object state is changed explicitly when it is 'installed'. HeapShared::materialize_archived_object() is called for that particular object to make GC aware of the object. For example, when a shared class is loaded, the corresponding archived mirror object is 'installed' in the shared klass. That's when the mirror object becomes alive. Another example is an archived object for static field value and all reachable objects from it. When the VM 'installs' an archived static field value back to the field, the object becomes alive explicitly. All reachable objects via the entry object also become non-dormant/alive implicitly.
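To make the state model above a bit more concrete, here is a toy sketch in Java. It is not HotSpot code: ArchivedObject, State and install() are hypothetical names invented for this illustration, with install() loosely standing in for the 'install' step described above. It only shows the idea that an entry object becomes live explicitly and everything reachable from it becomes live implicitly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: these types are hypothetical and do not mirror HotSpot's data structures.
public class DormantStateSketch {
    enum State { DORMANT, LIVE }

    static final class ArchivedObject {
        final String name;
        final List<ArchivedObject> refs = new ArrayList<>();
        State state = State.DORMANT;   // every archived object starts out dormant

        ArchivedObject(String name) { this.name = name; }
    }

    // "Installing" an entry object makes it live explicitly; every object
    // reachable from it becomes live implicitly.
    static void install(ArchivedObject entry) {
        Deque<ArchivedObject> work = new ArrayDeque<>();
        work.push(entry);
        while (!work.isEmpty()) {
            ArchivedObject o = work.pop();
            if (o.state == State.LIVE) {
                continue;              // already visited, also guards against cycles
            }
            o.state = State.LIVE;
            for (ArchivedObject r : o.refs) {
                work.push(r);
            }
        }
    }

    public static void main(String[] args) {
        ArchivedObject mirror = new ArchivedObject("mirror");
        ArchivedObject field = new ArchivedObject("staticFieldValue");
        ArchivedObject other = new ArchivedObject("otherSubgraphEntry");
        mirror.refs.add(field);

        install(mirror);

        // prints "LIVE LIVE DORMANT": the untouched subgraph stays dormant
        System.out.println(mirror.state + " " + field.state + " " + other.state);
    }
}
```

In this model a heap walker that visits every object regardless of reachability would still see "otherSubgraphEntry", which is the situation the heap dump fix has to deal with.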
Please see more info in https://wiki.openjdk.java.net/display/HotSpot/Caching+Java+Heap+Objects. Will send review separately. Best, Jiangli On Thu, Sep 5, 2019 at 11:12 PM David Holmes wrote: > > On 6/09/2019 1:39 pm, Ioi Lam wrote: > > On 9/5/19 8:18 PM, David Holmes wrote: > >> Hi Ioi, > >> > >> On 6/09/2019 12:27 pm, Ioi Lam wrote: > >>> https://bugs.openjdk.java.net/browse/JDK-8230674 > >>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01 > >>> > >>> > >>> Please review this small fix: > >>> > >>> When CDS is in use, archived objects are memory-mapped into the heap > >>> (currently G1GC only). These objects are partitioned into > >>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those > >>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is > >>> requested. > > >>> When a subgraph is not loaded, some of the objects in this subgraph > >>> may belong to a class that's not yet loaded. > >>> > >>> The bug happens when such a "dormant" object is dumped, but its class > >>> is not dumped because the class is not in the system dictionary. > >>> > >>> There is already code in DumperSupport::dump_instance() that tries to > >>> handle dormant objects, but it needs to be extended to cover arrays, > >>> as well as references from non-dormant objects/arrays to dormant > >>> ones. > >> > >> I have to confess I did not pay any attention to the CDS archived > >> objects work, so I don't have a firm grasp of how you have implemented > >> things. But I'm wondering how can you have a reference to a dormant > >> object from a non-dormant one? Shouldn't the act of becoming > >> non-dormant automatically cause the subgraph from that object to also > >> become non-dormant? Or do you have "read barriers" to perform the > >> changes on demand? > >> > > > > Hi David, > > > > Thanks for the review. > > > > The dormant objects are not reachable via the GC roots.
They become > > non-dormant via explicit calls to JVM_InitializeFromArchive, after which > > they become reachable via the static fields of loaded classes. > > Right, so is there a distinction between non-dormant and reachable at > the time an object becomes non-dormant? I'm still unclear how a dormant > array becomes non-dormant but still contains elements that refer to > dormant objects. > > > The only issue here is heap dump is done by scanning all objects in the > > heap, including unreachable ones > > > > HeapObjectDumper obj_dumper(this, writer()); > > Universe::heap()->safe_object_iterate(&obj_dumper); > > > > that's how these dormant objects are discovered during heap dump. > > > >> That aside the code changes seem reasonable, you moved the check out > >> of DumperSupport::dump_instance and into the higher-level > >> HeapObjectDumper::do_object so that it catches instances and arrays, > >> plus you added a check for array elements. > >> > > > > I am debating whether I should put the masking code in here: > > > > void DumpWriter::write_objectID(oop o) { > > o = mask_dormant_archived_object(o); /// <---- add > > address a = (address)o; > > #ifdef _LP64 > > write_u8((u8)a); > > #else > > write_u4((u4)a); > > #endif > > } > > > > > > That way, even if a dormant object (unintentionally) becomes reachable > > via the GC roots, we won't write an invalid reference to it (the object > > "body" will not be written, so the ID will not point to anything valid). > > > > But this seems a little too aggressive to me. What do you think? > > It does seem a little aggressive as it seems to introduce the dormancy > check into a lot of places that don't need it. But as I said I don't > know this code so I'm really not the right person to ask.
> > Cheers, > David > ----- > > > Thanks > > - Ioi > > From richard.reingruber at sap.com Fri Sep 6 16:28:39 2019 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 6 Sep 2019 16:28:39 +0000 Subject: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException In-Reply-To: <985bd047-9ade-afbd-cde5-c29ba9bc0bd4@oracle.com> References: <985bd047-9ade-afbd-cde5-c29ba9bc0bd4@oracle.com> Message-ID: Hi Alex, that's a good fix for the issue. One minor thing: 89 Exception error = null; 90 for (int retry = 0; retry < 5; retry++) { 91 try { 92 log("retry: " + retry); 93 s = new Socket("localhost", port); 94 error = null; 95 s.getOutputStream().write("JDWP-".getBytes("UTF-8")); 96 break; 97 } catch (ConnectException ex) { 98 log("got exception: " + ex.toString()); 99 error = ex; 100 } 101 } 102 if (error != null) { 103 throw error; 104 } Is there a reason to clear the local variable error in line 94 instead of clearing it in line 91 where each new attempt begins? Cheers, Richard. -----Original Message----- From: serviceability-dev On Behalf Of serguei.spitsyn at oracle.com Sent: Wednesday, September 4, 2019 22:11 To: Alex Menkov ; OpenJDK Serviceability Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException Hi Alex, The fix looks good. Good simplification! Thanks, Serguei On 9/4/19 12:19, Alex Menkov wrote: > Hi all, > > Please review the fix for BadHandshakeTest test. > The problem is the test connects to the server twice and if debuggee > hasn't yet handled disconnection, the next connect gets "connection > refused" error. > Instead of adding delay before 2nd connect (we never know "good" value > for the delay and big delay can cause "accept timeout"), the test > re-tries connect in case of ConnectException. > Also improved/simplified the test slightly - debuggee is now run with > auto port assignment (used lib.jdb.Debuggee test class which > implements required functionality).
> > jira: > https://bugs.openjdk.java.net/browse/JDK-8192057 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev/ > > --alex From ioi.lam at oracle.com Fri Sep 6 16:43:23 2019 From: ioi.lam at oracle.com (Ioi Lam) Date: Fri, 6 Sep 2019 09:43:23 -0700 Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects of unloaded classes In-Reply-To: References: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com> Message-ID: <96a98c4d-6313-3176-538f-08043f807389@oracle.com> On 9/5/19 11:11 PM, David Holmes wrote: > On 6/09/2019 1:39 pm, Ioi Lam wrote: >> On 9/5/19 8:18 PM, David Holmes wrote: >>> Hi Ioi, >>> >>> On 6/09/2019 12:27 pm, Ioi Lam wrote: >>>> https://bugs.openjdk.java.net/browse/JDK-8230674 >>>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01 >>>> >>>> >>>> Please review this small fix: >>>> >>>> When CDS is in use, archived objects are memory-mapped into the >>>> heap (currently G1GC only). These objects are partitioned into >>>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those >>>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is >>>> requested. > >>>> When a subgraph is not loaded, some of the objects in this subgraph >>>> may belong to a class that's not yet loaded. >>>> >>>> The bug happens when such a "dormant" object is dumped, but its class >>>> is not dumped because the class is not in the system dictionary. >>>> >>>> There is already code in DumperSupport::dump_instance() that tries >>>> to handle dormant objects, but it needs to be extended to cover >>>> arrays, as well as references from non-dormant objects/arrays to >>>> dormant ones. >>> >>> I have to confess I did not pay any attention to the CDS archived >>> objects work, so I don't have a firm grasp of how you have >>> implemented things. But I'm wondering how can you have a reference >>> to a dormant object from a non-dormant one?
Shouldn't the act of >>> becoming non-dormant automatically cause the subgraph from that >>> object to also become non-dormant? Or do you have "read barriers" to >>> perform the changes on demand? >>> Ah -- my bug title is not correct. I changed the bug title (and this e-mail subject) to Heap dumps should exclude dormant CDS archived objects **of unloaded classes** During the heap dump, we scan all objects in the heap, regardless of reachability. There's no way to decide reachability in HeapObjectDumper::do_object(), unless we perform an actual GC. But it's OK to include unreachable objects in the heap dump. (I guess it's useful to see how much garbage you have in the heap. There's an option to run a collection before dumping the heap.) There are 2 kinds of unreachable objects -- garbage: those that were once reachable but no longer, dormant: the archived objects that have never been reachable. Anyway, it's OK to dump dormant objects as long as their class has been loaded. The problem happens only when we dump a dormant object whose class is not yet loaded (Eclipse MAT gets confused when it sees an object whose class ID is invalid). So to answer your question, we can have a case with a dormant array (that contains a dormant object) like this: Object[] array = {new ClassNotYetLoaded()}; After my fix, the array will be dumped (we have no easy way of not doing that), but its contents become this in the .hprof file: Object[] array = {null}; Thanks - Ioi >> >> Hi David, >> >> Thanks for the review. >> >> The dormant objects are not reachable via the GC roots. They become >> non-dormant via explicit calls to JVM_InitializeFromArchive, after >> which they become reachable via the static fields of loaded classes. > > Right, so is there a distinction between non-dormant and reachable at > the time an object becomes non-dormant? I'm still unclear how a dormant > array becomes non-dormant but still contains elements that refer to > dormant objects.
> >> The only issue here is heap dump is done by scanning all objects in >> the heap, including unreachable ones >> >> HeapObjectDumper obj_dumper(this, writer()); >> Universe::heap()->safe_object_iterate(&obj_dumper); >> >> that's how these dormant objects are discovered during heap dump. >> >>> That aside the code changes seem reasonable, you moved the check out >>> of DumperSupport::dump_instance and into the higher-level >>> HeapObjectDumper::do_object so that it catches instances and arrays, >>> plus you added a check for array elements. >>> >> >> I am debating whether I should put the masking code in here: >> >> void DumpWriter::write_objectID(oop o) { >> o = mask_dormant_archived_object(o); /// <---- add >> address a = (address)o; >> #ifdef _LP64 >> write_u8((u8)a); >> #else >> write_u4((u4)a); >> #endif >> } >> >> >> That way, even if a dormant object (unintentionally) becomes >> reachable via the GC roots, we won't write an invalid reference to it >> (the object "body" will not be written, so the ID will not point to >> anything valid). >> >> But this seems a little too aggressive to me. What do you think? > > It does seem a little aggressive as it seems to introduce the dormancy > check into a lot of places that don't need it. But as I said I don't > know this code so I'm really not the right person to ask. > > Cheers, > David > ----- > >> Thanks >> - Ioi >> From hohensee at amazon.com Fri Sep 6 18:08:06 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 6 Sep 2019 18:08:06 +0000 Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread In-Reply-To: References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> Message-ID: <23ED3BD7-3B7E-43DE-84B8-DB52D44D362B@amazon.com> Ping. Anyone?
( Thanks, On 9/3/19, 12:39 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote: Minor update in new webrev http://cr.openjdk.java.net/~phh/8207266/webrev.05/. I removed ensureNonNullThreadIds() in favor of Objects.requireNonNull(ids). Thanks, Mandy, for your thorough reviews. May I get another reviewer to weigh in? Paul On 8/31/19, 5:06 PM, "hotspot-gc-dev on behalf of Hohensee, Paul" wrote: Thanks, Mandy. I've finalized the CSR. New webrev at http://cr.openjdk.java.net/~phh/8207266/webrev.04/. In management.cpp, I now have if (THREAD->is_Java_thread()) { return ((JavaThread*)THREAD)->cooked_allocated_bytes(); } In ThreadImpl.java, using requireNonNull would produce a different and less informative message, so I'd like to leave it as is. I changed throwIfNullThreadIds to ensureNonNullThreadIds, and throwIfThreadAllocatedMemoryNotSupported to ensureThreadAllocatedMemorySupported. I dropped the "java.lang." prefix from all uses of UnsupportedOperationException in both c.s.m.ThreadMXBean.java and j.l.m.ThreadMXBean.java, and did the same with SecurityException. "@since 14" added to c.s.m.ThreadMXBean.java and the CSR. Do I need another reviewer? Paul From: Mandy Chung Date: Friday, August 30, 2019 at 4:26 PM To: "Hohensee, Paul" Cc: OpenJDK Serviceability , "hotspot-gc-dev at openjdk.java.net" Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread CSR reviewed. management.cpp 2083 java_thread = (JavaThread*)THREAD; 2084 if (java_thread->is_Java_thread()) { 2085 return java_thread->cooked_allocated_bytes(); 2086 } The cast should be done after is_Java_thread() test. ThreadImpl.java 162 private void throwIfNullThreadIds(long[] ids) { Even better: simply use Objects::requireNonNull and this method can be removed.
This suggests positive naming alternative to throwIfThreadAllocatedMemoryNotSupported - "ensureThreadAllocatedMemorySupported" (sorry I should have suggested that) ThreadMXBean.java 130 * @throws java.lang.UnsupportedOperationException if the Java virtual Nit: "java.lang." can be dropped. @since 14 is missing. Mandy On 8/30/19 3:33 PM, Hohensee, Paul wrote: Thanks for your review, Mandy. Revised webrev at http://cr.openjdk.java.net/~phh/8207266/webrev.02/. I updated the CSR with your suggested javadoc for getCurrentThreadAllocatedBytes. It now matches that for getCurrentThreadUserTime and getCurrentThreadCpuTime. I also fixed the "convenient" -> "convenience" typos in j.l.m.ThreadMXBean.java. I meant GetOneThreads to be the possessive, but don't feel strongly either way so I'm fine with GetOneThread. I updated ThreadImpl.java as you suggested, though in getThreadAllocatedBytes(long[] ids) I had to add a redundant-in-the-not-length-1-case check for a null ids reference. Would someone take a look at the Hotspot side and the test please? Paul From: Mandy Chung Date: Friday, August 30, 2019 at 10:22 AM To: "Hohensee, Paul" Cc: OpenJDK Serviceability , "hotspot-gc-dev at openjdk.java.net" Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread OK. That's better. Some review comments: The javadoc of getCurrentThreadAllocatedBytes() can simply say: "Returns an approximation of the total amount of memory, in bytes, allocated in heap memory for the current thread. This is a convenient method for local management use and is equivalent to calling getThreadAllocatedBytes(Thread.currentThread().getId()).
src/hotspot/share/include/jmm.h GetOneThreadsAllocatedMemory: s/OneThreads/OneThread/ sun/management/ThreadImpl.java 43 private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED = 44 "Thread allocated memory measurement is not supported."; if (!isThreadAllocatedMemorySupported()) { throw new UnsupportedOperationException(THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED); } Perhaps the above can be refactored as throwIfAllocatedMemoryUnsupported() method. 391 if (ids.length == 1) { 392 sizes[0] = -1; : 398 if (ids.length == 1) { 399 long id = ids[0]; 400 sizes[0] = getThreadAllocatedMemory0( 401 Thread.currentThread().getId() == id ? 0 : id); 402 } else { It seems cleaner to handle the 1-element array case at the beginning of this method: if (ids.length == 1) { long size = getThreadAllocatedBytes(ids[0]); return new long[] { size }; } I didn't review the hotspot implementation and the test. Mandy On 8/29/19 10:01 AM, Hohensee, Paul wrote: My bad, Mandy. The webrev puts getCurrentThreadAllocatedBytes in com.sun.management.ThreadMXBean along with the current two getThreadAllocatedBytes methods for the reasons you list. I've updated the CSR to specify com.sun.management and added a rationale. AllocatedBytes is currently enabled by Hotspot by default because the overhead of recording TLAB occupancy is negligible. There's no new GC code, nor will there be, so imo we don't have to involve the GC folks. I.e., the new JMM method GetOneThreadsAllocatedBytes uses the existing cooked_allocated_bytes JavaThread method, and getCurrentThreadAllocatedBytes is the same as getThreadAllocatedBytes: it just bypasses the thread lookup code.
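For readers who have not used this API, a minimal sketch of reading the calling thread's allocated bytes follows. It deliberately uses only the existing com.sun.management.ThreadMXBean.getThreadAllocatedBytes(long) call with the current thread's id; the getCurrentThreadAllocatedBytes method discussed in this thread is the proposed shortcut for the same query. The class and helper names (AllocatedBytesSketch, selfAllocatedBytes) are made up for the example, and returning -1 when the feature is unavailable is a choice of this sketch.

```java
import java.lang.management.ManagementFactory;

public class AllocatedBytesSketch {
    // Returns an approximation of the bytes allocated so far by the calling thread,
    // or -1 if the platform bean is not the com.sun.management variant or the
    // measurement is unsupported or disabled.
    static long selfAllocatedBytes() {
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) base;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        // Querying by the caller's own id; this is the case the review wants to speed up.
        return bean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        long before = selfAllocatedBytes();
        byte[][] junk = new byte[64][];
        for (int i = 0; i < junk.length; i++) {
            junk[i] = new byte[16 * 1024];   // allocate something measurable
        }
        long after = selfAllocatedBytes();
        System.out.println("allocated roughly " + (after - before)
                + " bytes while creating " + junk.length + " arrays");
    }
}
```

The reported figure is an approximation, so the delta printed by main() should be read as a rough indication rather than an exact byte count.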
I hadn't tracked down what happens when getCurrentThreadUserTime and getCurrentThreadCpuTime are called before, but if I'm not mistaken, the code in jcmd() in attachListener.cpp will call GetThreadCpuTimeWithKind in management.cpp, and it will ultimately use Thread::current() as the subject of the call, see os::current_thread_cpu_time in os_linux.cpp. That means that the CurrentThread methods should work remotely the same way they do locally. GetOneThreadsAllocatedBytes in management.cpp uses THREAD as its subject when called on behalf of getCurrentThreadAllocatedBytes, so it will also use the current remote Java thread. Even if these methods only worked locally, there are many setups where apps are self-monitoring that could use the performance improvement. Thanks, Paul From: Mandy Chung Date: Wednesday, August 28, 2019 at 3:59 PM To: "Hohensee, Paul" Cc: OpenJDK Serviceability , "hotspot-gc-dev at openjdk.java.net" Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread Hi Paul, The CSR proposes this method in java.lang.management.ThreadMXBean as a Java SE feature. Has this been discussed with the GC team to commit measuring current thread's allocated bytes as Java SE feature? Can this be supported by all JVM implementations? What is the overhead if this is enabled by default? Does it need to be disabled? This metric is from TLAB that might be okay. This needs advice/discussion with GC experts. I see that CSR mentions it can be disabled and link to isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() methods but these methods are defined in com.sun.management.ThreadMXBean. As Alan points out, current thread makes sense only in local VM management. When this is monitored from a JMX client (e.g. jconsole connecting to a running JVM), the "currentThreadAllocatedBytes" attribute is the current thread in the jconsole process which is invoking Thread::currentThread?
Mandy On 8/28/19 12:22 PM, Hohensee, Paul wrote: Please review a performance improvement for ThreadMXBean.getThreadAllocatedBytes and the addition of getCurrentThreadAllocatedBytes. JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266 Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/ CSR: https://bugs.openjdk.java.net/browse/JDK-8230311 Previous email threads: https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd be great for someone to review it. I took Mandy's advice and put the fast paths in the library code. I added a new JMM method GetOneThreadsAllocatedBytes that works the same as GetThreadCpuTime: it uses a thread_id value of zero to distinguish the current thread. On my Mac laptop, the result runs 47x faster for the current thread than the old implementation. The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I added code to ThreadAllocatedMemory.java to test getCurrentThreadAllocatedBytes as well as variations on getThreadAllocatedBytes(id). A submit repo job is in progress.
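The "thread_id value of zero means the current thread" convention can be sketched as below. This is a hypothetical Java model, not the JDK or HotSpot implementation: COUNTERS stands in for the thread lookup, SELF for direct access to the caller's own counter, and all names (SelfFastPathSketch, allocatedBytes0, recordAllocation) are invented for the illustration. It relies only on the fact that Thread ids are positive, so 0 is free to act as a sentinel.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the "0 means current thread" convention; not JDK code.
public class SelfFastPathSketch {
    // Stand-in for the lookup that a query by arbitrary thread id has to perform.
    static final Map<Long, AtomicLong> COUNTERS = new ConcurrentHashMap<>();

    // Stand-in for direct access to the calling thread's own counter.
    static final ThreadLocal<AtomicLong> SELF = ThreadLocal.withInitial(() -> {
        AtomicLong counter = new AtomicLong();
        COUNTERS.put(Thread.currentThread().getId(), counter);
        return counter;
    });

    static void recordAllocation(long bytes) {
        SELF.get().addAndGet(bytes);
    }

    // Low-level entry point: id 0 selects the caller and skips the lookup.
    // Unknown ids yield -1.
    static long allocatedBytes0(long id) {
        if (id == 0) {
            return SELF.get().get();
        }
        AtomicLong counter = COUNTERS.get(id);
        return counter == null ? -1 : counter.get();
    }

    // Library-side wrapper: map "my own id" to the sentinel before calling down.
    static long allocatedBytes(long id) {
        return allocatedBytes0(Thread.currentThread().getId() == id ? 0 : id);
    }

    public static void main(String[] args) {
        recordAllocation(128);
        long self = Thread.currentThread().getId();
        System.out.println("via own id:   " + allocatedBytes(self));
        System.out.println("via sentinel: " + allocatedBytes0(0));
    }
}
```

Both calls in main() report the same counter; the difference is only that the sentinel path never touches the map, which is the kind of saving the message above is describing.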
Thanks, Paul From jianglizhou at google.com Fri Sep 6 18:48:39 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Fri, 6 Sep 2019 11:48:39 -0700 Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects of unloaded classes In-Reply-To: <96a98c4d-6313-3176-538f-08043f807389@oracle.com> References: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com> <96a98c4d-6313-3176-538f-08043f807389@oracle.com> Message-ID: On Fri, Sep 6, 2019 at 9:43 AM Ioi Lam wrote: > > > > On 9/5/19 11:11 PM, David Holmes wrote: > > On 6/09/2019 1:39 pm, Ioi Lam wrote: > >> On 9/5/19 8:18 PM, David Holmes wrote: > >>> Hi Ioi, > >>> > >>> On 6/09/2019 12:27 pm, Ioi Lam wrote: > >>>> https://bugs.openjdk.java.net/browse/JDK-8230674 > >>>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01 > >>>> > >>>> > >>>> Please review this small fix: > >>>> > >>>> When CDS is in use, archived objects are memory-mapped into the > >>>> heap (currently G1GC only). These objects are partitioned into > >>>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those > >>>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is > >>>> requested. > > >>>> When a subgraph is not loaded, some of the objects in this subgraph > >>>> may belong to a class that's not yet loaded. > >>>> > >>>> The bug happens when such a "dormant" object is dumped, but its class > >>>> is not dumped because the class is not in the system dictionary. > >>>> > >>>> There is already code in DumperSupport::dump_instance() that tries > >>>> to handle dormant objects, but it needs to be extended to cover > >>>> arrays, as well as references from non-dormant objects/arrays to > >>>> dormant ones. > >>> > >>> I have to confess I did not pay any attention to the CDS archived > >>> objects work, so I don't have a firm grasp of how you have > >>> implemented things. But I'm wondering how can you have a reference > >>> to a dormant object from a non-dormant one?
Shouldn't the act of > >>> becoming non-dormant automatically cause the subgraph from that > >>> object to also become non-dormant? Or do you have "read barriers" to > >>> perform the changes on demand? > >>> > > Ah -- my bug title is not correct. > > I changed the bug title (and this e-mail subject) to > > Heap dumps should exclude dormant CDS archived objects **of unloaded > classes** > > During the heap dump, we scan all objects in the heap, regardless of > reachability. There's no way to decide reachability in > HeapObjectDumper::do_object(), unless we perform an actual GC. > > But it's OK to include unreachable objects in the heap dump. (I guess > it's useful to see how much garbage you have in the heap. There's an > option to run a collection before dumping the heap.) > > There are 2 kinds of unreachable objects -- garbage: those that were > once reachable but no longer, dormant: the archived objects that have > never been reachable. Currently the Java object archiving framework only supports one directional state change: dormant -> live. An archived object can become a live object from dormant state, but it cannot go back to the dormant state. Need to investigate thoroughly for all cases before the 'live -> dormant' transition can be supported. All objects in the 'Open' archive heap region are associated with the builtin class loaders and their classes are not unloaded. The existing static fields for archiving within the JDK classes are selected and the associated objects do not become garbage once 'installed'. > > Anyway, it's OK to dump dormant objects as long as their class has been > loaded. The problem happens only when we dump a dormant object whose class > is not yet loaded (Eclipse MAT gets confused when it sees an object > whose class ID is invalid). Yes. That's a scenario that needs to be handled for a tool that iterates the Java heap.
A dormant object in the 'Open' archive heap region may have an 'invalid' klass since the klass may not be loaded yet at the moment. Your webrev looks reasonable to me at a high level, pending information on the following questions. Can you please give more details on the dormant objects referenced from the arrays? What specific arrays are those? Regards, Jiangli > > So to answer your question, we can have a case with a dormant array > (that contains a dormant object) like this: > > Object[] array = {new ClassNotYetLoaded()}; > > After my fix, the array will be dumped (we have no easy way of not doing > that), but its contents become this in the .hprof file: > > Object[] array = {null}; > > Thanks > - Ioi > > > > >> > >> Hi David, > >> > >> Thanks for the review. > >> > >> The dormant objects are not reachable via the GC roots. They become > >> non-dormant via explicit calls to JVM_InitializeFromArchive, after > >> which they become reachable via the static fields of loaded classes. > > > > Right, so is there a distinction between non-dormant and reachable at > > the time an object becomes non-dormant? I'm still unclear how a dormant > > array becomes non-dormant but still contains elements that refer to > > dormant objects. > > > >> The only issue here is heap dump is done by scanning all objects in > >> the heap, including unreachable ones > >> > >> HeapObjectDumper obj_dumper(this, writer()); > >> Universe::heap()->safe_object_iterate(&obj_dumper); > >> > >> that's how these dormant objects are discovered during heap dump. > >> > >>> That aside the code changes seem reasonable, you moved the check out > >>> of DumperSupport::dump_instance and into the higher-level > >>> HeapObjectDumper::do_object so that it catches instances and arrays, > >>> plus you added a check for array elements.
> >>> > >> > >> I am debating whether I should put the masking code in here: > >> > >> void DumpWriter::write_objectID(oop o) { > >> o = mask_dormant_archived_object(o); /// <---- add > >> address a = (address)o; > >> #ifdef _LP64 > >> write_u8((u8)a); > >> #else > >> write_u4((u4)a); > >> #endif > >> } > >> > >> > >> That way, even if a dormant object (unintentionally) becomes > >> reachable via the GC roots, we won't write an invalid reference to it > >> (the object "body" will not be written, so the ID will not point to > >> anything valid). > >> > >> But this seems a little too aggressive to me. What do you think? > > > > It does seem a little aggressive as it seems to introduce the dormancy > > check into a lot of places that don't need it. But as I said I don't > > know this code so I'm really not the right person to ask. > > > > Cheers, > > David > > ----- > > > >> Thanks > >> - Ioi > >> > From alexey.menkov at oracle.com Fri Sep 6 19:56:42 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 6 Sep 2019 12:56:42 -0700 Subject: RFR: JDK-8230516: invalid html in jdwp-protocol.html Message-ID: Hi all, Please review the fix for https://bugs.openjdk.java.net/browse/JDK-8230516 webrev: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/webrev/ The fix moves "anchor" spans to inside the cells. generated docs (no visual changes): old: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/0/jdwp-protocol.html new: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/1/jdwp-protocol.html --alex From daniel.daugherty at oracle.com Fri Sep 6 20:50:10 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 6 Sep 2019 16:50:10 -0400 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> Message-ID: <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com> Hi David, I've finally gotten back to this email thread... > FYI testing to date: > - tiers 1-3 all platforms > - hotspot: serviceability/jvmti > /jdwp > vmTestbase/nsk/jvmti > /jdwp > - JDK: com/sun/jdi You should also add: open/test/hotspot/jtreg/vmTestbase/nsk/jdb open/test/hotspot/jtreg/vmTestbase/nsk/jdi open/test/jdk/java/lang/instrument I took a quick look through the preliminary webrev and I don't see anything that worries me. Re: Thread.interrupt() and raw_wait() It would be good to see if that semantic is being tested via the JCK test suite for JVM/TI. I also very much like/appreciate the decoupling of JvmtiRawMonitors from ObjectMonitors... Thanks for tackling this crazy task. Dan On 8/15/19 2:22 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 > > Preliminary webrev (still has rough edges): > http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ > > Background: > > We've had this comment for a long time: > > // The raw monitor subsystem is entirely distinct from normal > // java-synchronization or jni-synchronization. raw monitors are not > // associated with objects. They can be implemented in any manner > // that makes sense. The original implementors decided to piggy-back > // the raw-monitor implementation on the existing Java objectMonitor > mechanism. > // This flaw needs to fixed. We should reimplement raw monitors as > sui-generis. > // Specifically, we should not implement raw monitors via java monitors.
>  // Time permitting, we should disentangle and deconvolve the two implementations
>  // and move the resulting raw monitor implementation over to the JVMTI directories.
>  // Ideally, the raw monitor implementation would be built on top of
>  // park-unpark and nothing else.
>
> This is an attempt to do that disentangling so that we can then
> consider changes to ObjectMonitor without having to worry about
> JvmtiRawMonitors. But rather than building on low-level park/unpark
> (which would require the same manual queue management and much of the
> same complex code as exists in ObjectMonitor) I decided to try and do
> this on top of PlatformMonitor.
>
> The reason this is just a RFC rather than RFR is that I overlooked a
> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as
> implemented by ObjectMonitor) they interact with the Thread.interrupt
> mechanism. This is not clearly stated in the JVM TI specification [1]
> but only in passing by the possible errors for RawMonitorWait:
>
> JVMTI_ERROR_INTERRUPT    Wait was interrupted, try again
>
> As I explain in the bug report there is no way to build in proper
> interrupt support using PlatformMonitor as there is no way we can
> "interrupt" the low-level pthread_cond_wait. But we can approximate
> it. What I've done in this preliminary version is just check interrupt
> state before and after the actual "wait" but we won't get woken by the
> interrupt once we have actually blocked. Alternatively we could use a
> periodic polling approach and wakeup every Nms to check for interruption.
>
> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not
> affected by this choice as that code ignores the interrupt until the
> real action it was waiting for has occurred. The interrupt is then
> reposted later.
>
> But more generally there could be users of JvmtiRawMonitors that
> expect/require that RawMonitorWait is responsive to Thread.interrupt
> in a manner similar to Object.wait.
> And if any of them are reading
> this then I'd like to know - hence this RFC :)
>
> FYI testing to date:
>  - tiers 1 -3 all platforms
>  - hotspot: serviceability/jvmti
>                           /jdwp
>             vmTestbase/nsk/jvmti
>                           /jdwp
>  - JDK: com/sun/jdi
>
> Comments/opinions appreciated.
>
> Thanks,
> David
>
> [1]
> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait

From alexey.menkov at oracle.com Fri Sep 6 20:51:57 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Fri, 6 Sep 2019 13:51:57 -0700
Subject: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException
In-Reply-To:
References: <985bd047-9ade-afbd-cde5-c29ba9bc0bd4@oracle.com>
Message-ID:

Hi Richard,

On 09/06/2019 09:28, Reingruber, Richard wrote:
> Hi Alex,
>
> that's a good fix for the issue.
>
> One minor thing:
>
>   89         Exception error = null;
>   90         for (int retry = 0; retry < 5; retry++) {
>   91             try {
>   92                 log("retry: " + retry);
>   93                 s = new Socket("localhost", port);
>   94                 error = null;
>   95                 s.getOutputStream().write("JDWP-".getBytes("UTF-8"));
>   96                 break;
>   97             } catch (ConnectException ex) {
>   98                 log("got exception: " + ex.toString());
>   99                 error = ex;
>  100             }
>  101         }
>  102         if (error != null) {
>  103             throw error;
>  104         }
>
> Is there a reason to clear the local variable error in line 94 instead of clearing it
> in line 91 where each new attempt begins?

The logic here is:
The cycle has 2 exits:
- error (max retry attempts reached, error is set by last "catch")
- success ("break" statement at line 96, error should be null)
So error is cleared only after the socket is connected (this is the
problematic operation which can cause ConnectException).
Of course error can be cleared before each try - there is no functional
difference.

--alex

>
> Cheers, Richard.
>
> -----Original Message-----
> From: serviceability-dev On Behalf Of serguei.spitsyn at oracle.com
> Sent: Wednesday, 4 September 2019 22:11
> To: Alex Menkov; OpenJDK Serviceability
> Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException
>
> Hi Alex,
>
> The fix looks good.
> Good simplification!
>
> Thanks,
> Serguei
>
>
> On 9/4/19 12:19, Alex Menkov wrote:
>> Hi all,
>>
>> Please review the fix for BadHandshakeTest test.
>> The problem is the test connects to the server twice and if debuggee
>> hasn't yet handled disconnection, the next connect gets "connection
>> refused" error.
>> Instead of adding delay before 2nd connect (we never know "good" value
>> for the delay and big delay can cause "accept timeout"), the test
>> re-tries connect in case of ConnectException.
>> Also improved/simplified the test slightly - debuggee is now run with
>> auto port assignment (used lib.jdb.Debuggee test class which
>> implements required functionality).
>>
>> jira:
>>   https://bugs.openjdk.java.net/browse/JDK-8192057
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev/
>>
>> --alex
>

From daniil.x.titov at oracle.com Fri Sep 6 21:14:12 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 06 Sep 2019 14:14:12 -0700
Subject: RFR: JDK-8230516: invalid html in jdwp-protocol.html
In-Reply-To: <0E3B211D-41F1-429F-8AFC-0BE3A5750B2C@oracle.com>
References: <0E3B211D-41F1-429F-8AFC-0BE3A5750B2C@oracle.com>
Message-ID:

Hi Alex,

The change looks good to me.

Thanks!

-Daniil

On 9/6/19, 12:59 PM, "serviceability-dev on behalf of Alex Menkov" wrote:

    Hi all,

    Please review the fix for
    https://bugs.openjdk.java.net/browse/JDK-8230516
    webrev:
    http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/webrev/

    The fix moves "anchor" spans to inside the cells.
    generated docs (no visual changes):
    old:
    http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/0/jdwp-protocol.html
    new:
    http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/1/jdwp-protocol.html

    --alex

From ioi.lam at oracle.com Fri Sep 6 22:17:13 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 6 Sep 2019 15:17:13 -0700
Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects of unloaded classes
In-Reply-To:
References: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com> <96a98c4d-6313-3176-538f-08043f807389@oracle.com>
Message-ID: <3a510f34-7d34-cf02-f834-7522dadb8caa@oracle.com>

On 9/6/19 11:48 AM, Jiangli Zhou wrote:
> On Fri, Sep 6, 2019 at 9:43 AM Ioi Lam wrote:
>>
>>
>> On 9/5/19 11:11 PM, David Holmes wrote:
>>> On 6/09/2019 1:39 pm, Ioi Lam wrote:
>>>> On 9/5/19 8:18 PM, David Holmes wrote:
>>>>> Hi Ioi,
>>>>>
>>>>> On 6/09/2019 12:27 pm, Ioi Lam wrote:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230674
>>>>>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01
>>>>>>
>>>>>>
>>>>>> Please review this small fix:
>>>>>>
>>>>>> When CDS is in use, archived objects are memory-mapped into the
>>>>>> heap (currently G1GC only). These objects are partitioned into
>>>>>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those
>>>>>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is
>>>>>> requested.
>>>>>>
>>>>>> When a subgraph is not loaded, some of the objects in this subgraph
>>>>>> may belong to a class that's not yet loaded.
>>>>>>
>>>>>> The bug happens when such a "dormant" object is dumped, but its class
>>>>>> is not dumped because the class is not in the system dictionary.
>>>>>>
>>>>>> There is already code in DumperSupport::dump_instance() that tries
>>>>>> to handle dormant objects, but it needs to be extended to cover
>>>>>> arrays, as well as references from non-dormant object/arrays to
>>>>>> dormant ones.
>>>>> I have to confess I did not pay any attention to the CDS archived
>>>>> objects work, so I don't have a firm grasp of how you have
>>>>> implemented things. But I'm wondering how can you have a reference
>>>>> to a dormant object from a non-dormant one? Shouldn't the act of
>>>>> becoming non-dormant automatically cause the subgraph from that
>>>>> object to also become non-dormant? Or do you have "read barriers" to
>>>>> perform the changes on demand?
>>>>>
>> Ah -- my bug title is not correct.
>>
>> I changed the bug title (and this e-mail subject) to
>>
>> Heap dumps should exclude dormant CDS archived objects **of unloaded
>> classes**
>>
>> During the heap dump, we scan all objects in the heap, regardless of
>> reachability. There's no way to decide reachability in
>> HeapObjectDumper::do_object(), unless we perform an actual GC.
>>
>> But it's OK to include unreachable objects in the heap dump. (I guess
>> it's useful to see how much garbage you have in the heap. There's an
>> option to run a collection before dumping the heap.)
>>
>> There are 2 kinds of unreachable objects -- garbage: those that were
>> once reachable but no longer, dormant: the archived objects that have
>> never been reachable.
> Currently Java object archiving framework only supports one
> directional state change: dormant -> live. An archived object can
> become a live object from dormant state, but it cannot go back to the
> dormant state. Need to investigate thoroughly for all cases before the
> 'live -> dormant' transition can be supported. All objects in the
> 'Open' archive heap region are associated with the builtin class
> loaders and their classes are not unloaded. The existing static fields
> for archiving within the JDK classes are selected and the associated
> objects do not become garbage once 'installed'.
>
>> Anyway, it's OK to dump dormant objects as long as their class has been
>> loaded.
>> The problem happens only when we dump a dormant object whose class
>> is not yet loaded (Eclipse MAT gets confused when it sees an object
>> whose class ID is invalid).
> Yes. That's a scenario that needs to be handled for a tool that iterates
> the Java heap. A dormant object in the 'Open' archive heap region may
> have an 'invalid' klass since the klass may not be loaded yet at the
> moment.
>
> Your webrev looks reasonable to me on high level pending information
> for following questions. Can you please give more details on the
> dormant objects referenced from the arrays? What specific arrays are
> those?

Hi Jiangli,

Thanks for the review. I added the following code:

  // [id]* elements
  for (int index = 0; index < length; index++) {
    oop o = array->obj_at(index);
>>
    if (o != NULL && mask_dormant_archived_object(o) == NULL) {
      ResourceMark rm;
      tty->print_cr("%s array contains %s object",
                    array->klass()->external_name(), o->klass()->external_name());
    }
>>
    o = mask_dormant_archived_object(o);
    writer->write_objectID(o);
  }

and the output is:

$ java -cp . -XX:+HeapDumpAfterFullGC HelloGC
Dumping heap to java_pid20956.hprof ...
[Ljava.lang.Object; array contains java.util.jar.Attributes$Name object
[Ljava.lang.Object; array contains java.util.jar.Attributes$Name object
(repeated about 20 times)

It comes from java/util/jar/Attributes$Name::KNOWN_NAMES. This class is
not loaded because my program doesn't use JAR files in the classpath.

Thanks
- Ioi

> Regards,
> Jiangli
>
>
>
>> So to answer your question, we can have a case with a dormant array
>> (that contains a dormant object) like this:
>>
>> Object[] array = {new ClassNotYetLoaded()}
>>
>> After my fix, the array will be dumped (we have no easy way of not doing
>> that), but its contents become this in the .hprof file:
>>
>> Object[] array = {null}
>>
>> Thanks
>> - Ioi
>>
>>
>>
>>>> Hi David,
>>>>
>>>> Thanks for the review.
>>>>
>>>> The dormant objects are not reachable via the GC roots. They become
>>>> non-dormant via explicit calls to JVM_InitializeFromArchive, after
>>>> which they become reachable via the static fields of loaded classes.
>>> Right, so is there a distinction between non-dormant and reachable at
>>> the time an object becomes non-dormant? I'm still unclear how a dormant
>>> array becomes non-dormant but still contains elements that refer to
>>> dormant objects.
>>>
>>>> The only issue here is heap dump is done by scanning all objects in
>>>> the heap, including unreachable ones
>>>>
>>>> HeapObjectDumper obj_dumper(this, writer());
>>>> Universe::heap()->safe_object_iterate(&obj_dumper);
>>>>
>>>> that's how these dormant objects are discovered during heap dump.
>>>>
>>>>> That aside the code changes seem reasonable, you moved the check out
>>>>> of DumperSupport::dump_instance and into the higher-level
>>>>> HeapObjectDumper::do_object so that it catches instances and arrays,
>>>>> plus you added a check for array elements.
>>>>>
>>>> I am debating whether I should put the masking code in here:
>>>>
>>>> void DumpWriter::write_objectID(oop o) {
>>>>   o = mask_dormant_archived_object(o); /// <---- add
>>>>   address a = (address)o;
>>>> #ifdef _LP64
>>>>   write_u8((u8)a);
>>>> #else
>>>>   write_u4((u4)a);
>>>> #endif
>>>> }
>>>>
>>>>
>>>> That way, even if a dormant object (unintentionally) becomes
>>>> reachable via the GC roots, we won't write an invalid reference to it
>>>> (the object "body" will not be written, so the ID will not point to
>>>> anything valid).
>>>>
>>>> But this seems a little too aggressive to me. What do you think?
>>> It does seem a little aggressive as it seems to introduce the dormancy
>>> check into a lot of places that don't need it. But as I said I don't
>>> know this code so I'm really not the right person to ask.
>>>
>>> Cheers,
>>> David
>>> -----
>>>
>>>> Thanks
>>>> - Ioi
>>>>

From jianglizhou at google.com Fri Sep 6 23:06:15 2019
From: jianglizhou at google.com (Jiangli Zhou)
Date: Fri, 6 Sep 2019 16:06:15 -0700
Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects of unloaded classes
In-Reply-To: <3a510f34-7d34-cf02-f834-7522dadb8caa@oracle.com>
References: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com> <96a98c4d-6313-3176-538f-08043f807389@oracle.com> <3a510f34-7d34-cf02-f834-7522dadb8caa@oracle.com>
Message-ID:

On Fri, Sep 6, 2019 at 3:17 PM Ioi Lam wrote:
>
> On 9/6/19 11:48 AM, Jiangli Zhou wrote:
> > On Fri, Sep 6, 2019 at 9:43 AM Ioi Lam wrote:
> >>
> >>
> >> On 9/5/19 11:11 PM, David Holmes wrote:
> >>> On 6/09/2019 1:39 pm, Ioi Lam wrote:
> >>>> On 9/5/19 8:18 PM, David Holmes wrote:
> >>>>> Hi Ioi,
> >>>>>
> >>>>> On 6/09/2019 12:27 pm, Ioi Lam wrote:
> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8230674
> >>>>>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01
> >>>>>>
> >>>>>>
> >>>>>> Please review this small fix:
> >>>>>>
> >>>>>> When CDS is in use, archived objects are memory-mapped into the
> >>>>>> heap (currently G1GC only). These objects are partitioned into
> >>>>>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those
> >>>>>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is
> >>>>>> requested.
> >>>>>>
> >>>>>> When a subgraph is not loaded, some of the objects in this subgraph
> >>>>>> may belong to a class that's not yet loaded.
> >>>>>>
> >>>>>> The bug happens when such a "dormant" object is dumped, but its class
> >>>>>> is not dumped because the class is not in the system dictionary.
> >>>>>>
> >>>>>> There is already code in DumperSupport::dump_instance() that tries
> >>>>>> to handle dormant objects, but it needs to be extended to cover
> >>>>>> arrays, as well as references from non-dormant object/arrays to
> >>>>>> dormant ones.
> >> The problem happens only when we dump a dormant object whose class
> >> is not yet loaded (Eclipse MAT gets confused when it sees an object
> >> whose class ID is invalid).
> > Yes. That's a scenario that needs to be handled for a tool that iterates
> > the Java heap. A dormant object in the 'Open' archive heap region may
> > have an 'invalid' klass since the klass may not be loaded yet at the
> > moment.
> >
> > Your webrev looks reasonable to me on high level pending information
> > for following questions. Can you please give more details on the
> > dormant objects referenced from the arrays? What specific arrays are
> > those?
> Hi Jiangli,
>
> Thanks for the review. I added the following code:
>
>   // [id]* elements
>   for (int index = 0; index < length; index++) {
>     oop o = array->obj_at(index);
> >>
>     if (o != NULL && mask_dormant_archived_object(o) == NULL) {
>       ResourceMark rm;
>       tty->print_cr("%s array contains %s object",
>                     array->klass()->external_name(), o->klass()->external_name());
>     }
> >>
>     o = mask_dormant_archived_object(o);
>     writer->write_objectID(o);
>   }
>
> and the output is:
>
> $ java -cp . -XX:+HeapDumpAfterFullGC HelloGC
> Dumping heap to java_pid20956.hprof ...
> [Ljava.lang.Object; array contains java.util.jar.Attributes$Name object
> [Ljava.lang.Object; array contains java.util.jar.Attributes$Name object
> (repeated about 20 times)
>
> It comes from java/util/jar/Attributes$Name::KNOWN_NAMES. This class is
> not loaded because my program doesn't use JAR files in the classpath.

The above looks right and is expected. At this point, we would not see
any archive object that first becomes live then becomes dormant again.

That however will change in the future when we make the object
archiving framework general enough for other JDK and application class
usages. We need to work out the GC details for the live -> dormant
transition when that happens.
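
For readers following the array case, the element masking discussed above can be sketched in plain Java. This is only an illustration under stated assumptions, not the HotSpot code: the class DormantMaskSketch, its Obj type, markLoaded() and the loadedClasses set are hypothetical stand-ins for oops, klass lookups and the set of loaded classes, and maskDormant() merely mimics the role that mask_dormant_archived_object() plays in this thread. An ID of 0 stands for a null reference.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DormantMaskSketch {
    /** Hypothetical stand-in for a heap object: an object ID plus its class name. */
    static final class Obj {
        final long id;
        final String className;

        Obj(long id, String className) {
            this.id = id;
            this.className = className;
        }
    }

    /** Hypothetical stand-in for the set of classes that have been loaded. */
    private final Set<String> loadedClasses = new HashSet<>();

    void markLoaded(String className) {
        loadedClasses.add(className);
    }

    /** Returns null for an object whose class is not loaded, otherwise the object itself. */
    Obj maskDormant(Obj o) {
        if (o == null || !loadedClasses.contains(o.className)) {
            return null;
        }
        return o;
    }

    /** Collects the element IDs that would be written for an object array; 0 stands for null. */
    List<Long> dumpArrayElements(Obj[] array) {
        List<Long> ids = new ArrayList<>();
        for (Obj element : array) {
            Obj masked = maskDormant(element);
            ids.add(masked == null ? 0L : masked.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        DormantMaskSketch dumper = new DormantMaskSketch();
        dumper.markLoaded("java.lang.String");

        Obj[] elements = {
            new Obj(0x10L, "java.lang.String"),
            new Obj(0x20L, "java.util.jar.Attributes$Name"), // class not loaded in this sketch
            null
        };
        // prints [16, 0, 0]
        System.out.println(dumper.dumpArrayElements(elements));
    }
}
```

The array itself is still "dumped" in this sketch; only the reference to the object of the unloaded class is replaced, which matches the {null} outcome described earlier in the thread.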

Thanks,
Jiangli

From serguei.spitsyn at oracle.com Sat Sep 7 01:08:53 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Sep 2019 18:08:53 -0700
Subject: RFR(S) 8230677: Should disable Escape Analysis if JVMTI capability can_get_owned_monitor_info was taken
In-Reply-To:
References:
Message-ID: <4f7232e2-4a3b-f3a1-965a-830e357f7534@oracle.com>

Hi Richard,

This looks good to me.

One question though.
Suppose some methods got compiled/optimized with the escape analysis.
Then nothing prevents to enable the JVMTI capabilities later and use the JVMTI
functions GetOwnedMonitorInfo() and GetOwnedMonitorStackDepthInfo().
Should enabling the capabilities cause deoptimization of the already optimized compiled frames?
Is this considered to be a part of the JDK-8227745?

Thanks,
Serguei

On 9/6/19 7:24 AM, Reingruber, Richard wrote:
> Hi,
>
> could I please get reviews for
>
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/2019/8230677/webrev.0/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8230677
>
> The JVMTI functions GetOwnedMonitorInfo() and GetOwnedMonitorStackDepthInfo() can be used to
> retrieve objects locked by a thread. In terms of escape analysis those references escape and
> optimizations like scalar replacement become invalid.
>
> The runtime currently cannot cope with objects escaping through JVMTI (try included
> tests). Therefore escape analysis should be disabled if an agent requests the capabilities
> can_get_owned_monitor_info or can_get_owned_monitor_stack_depth_info.
>
> This was taken out of JDK-8227745 [1] to make it smaller. With JDK-8227745 there's no need to
> disable escape analysis, instead optimizations based on escape analysis will be reverted just before
> objects escape through JVMTI.
>
> I've run tier1 tests.
>
> Thanks, Richard.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8227745

From serguei.spitsyn at oracle.com Sun Sep 8 07:12:06 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 8 Sep 2019 00:12:06 -0700
Subject: RFR: JDK-8230516: invalid html in jdwp-protocol.html
In-Reply-To:
References: <0E3B211D-41F1-429F-8AFC-0BE3A5750B2C@oracle.com>
Message-ID:

Hi Alex,

+1

Thanks,
Serguei

On 9/6/19 14:14, Daniil Titov wrote:
> Hi Alex,
>
> The change looks good to me.
>
> Thanks!
>
> -Daniil
>
> On 9/6/19, 12:59 PM, "serviceability-dev on behalf of Alex Menkov" wrote:
>
> Hi all,
>
> Please review the fix for
> https://bugs.openjdk.java.net/browse/JDK-8230516
> webrev:
>
> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/webrev/
>
> The fix moves "anchor" spans to inside the cells.
> > generated docs (no visual changes): > old: > > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/0/jdwp-protocol.html > new: > > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_table_span/1/jdwp-protocol.html > > --alex > > > > From kirk.pepperdine at gmail.com Sun Sep 8 16:21:37 2019 From: kirk.pepperdine at gmail.com (Kirk Pepperdine) Date: Sun, 8 Sep 2019 09:21:37 -0700 Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects of unloaded classes In-Reply-To: References: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com> <96a98c4d-6313-3176-538f-08043f807389@oracle.com> <3a510f34-7d34-cf02-f834-7522dadb8caa@oracle.com> Message-ID: Hi, Might I add a diagnostic twist to this request. It is sometimes useful to try to determine where unreferenced objects live in heap because this can help you solve questions of nepotism. Kind regards, Kirk > On Sep 6, 2019, at 4:06 PM, Jiangli Zhou wrote: > > On Fri, Sep 6, 2019 at 3:17 PM Ioi Lam wrote: >> >> On 9/6/19 11:48 AM, Jiangli Zhou wrote: >>> On Fri, Sep 6, 2019 at 9:43 AM Ioi Lam wrote: >>>> >>>> >>>> On 9/5/19 11:11 PM, David Holmes wrote: >>>>> On 6/09/2019 1:39 pm, Ioi Lam wrote: >>>>>> On 9/5/19 8:18 PM, David Holmes wrote: >>>>>>> Hi Ioi, >>>>>>> >>>>>>> On 6/09/2019 12:27 pm, Ioi Lam wrote: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230674 >>>>>>>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01 >>>>>>>> >>>>>>>> >>>>>>>> Please review this small fix: >>>>>>>> >>>>>>>> When CDS is in use, archived objects are memory-mapped into the >>>>>>>> heap (currently G1GC only). These objects are partitioned into >>>>>>>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those >>>>>>>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is >>>>>>>> requested. > >>>>>>>> When a subgraph is not loaded, some of the objects in this subgraph >>>>>>>> may belong to a class that's not yet loaded. 
>>>>>>>> >>>>>>>> The bug happens when such an "dormant" object is dumped, but its class >>>>>>>> is not dumped because the class is not in the system dictionary. >>>>>>>> >>>>>>>> There is already code in DumperSupport::dump_instance() that tries >>>>>>>> to handle dormant objects, but it needs to be extended to cover >>>>>>>> arrays, as well as and references from non-dormant object/arrays to >>>>>>>> dormant ones. >>>>>>> I have to confess I did not pay any attention to the CDS archived >>>>>>> objects work, so I don't have a firm grasp of how you have >>>>>>> implemented things. But I'm wondering how can you have a reference >>>>>>> to a dormant object from a non-dormant one? Shouldn't the act of >>>>>>> becoming non-dormant automatically cause the subgraph from that >>>>>>> object to also become non-dormant? Or do you have "read barriers" to >>>>>>> perform the changes on demand? >>>>>>> >>>> Ah -- my bug title is not correct. >>>> >>>> I changed the bug title (and this e-mail subject) to >>>> >>>> Heap dumps should exclude dormant CDS archived objects **of unloaded >>>> classes** >>>> >>>> During the heap dump, we scan all objects in the heap, regardless of >>>> reachability. There's no way to decide reachability in >>>> HeapObjectDumper::do_object(), unless we perform an actual GC. >>>> >>>> But it's OK to include unreachable objects in the heap dump. (I guess >>>> it's useful to see how much garbage you have in the heap. There's an >>>> option to run a collection before dumping the heap.) >>>> >>>> There are 2 kinds of unreachable objects -- garbage: those that were >>>> once reachable but no longer, dormant: the archived objects that have >>>> never been reachable. >>> Currently Java object archiving framework only supports one >>> directional state change: dormant -> live. An archived object can >>> become a live object from dormant state, but it cannot go back to the >>> dormant state. 
Need to investigate thoroughly for all cases before the >>> 'live -> dormant' transition can be supported. All objects in the >>> 'Open' archive heap region are associated with the builtin class >>> loaders and their classes are not unloaded. The existing static fields >>> for archiving within the JDK classes are selected and the associated >>> objects do not become garbage once 'installed'. >>> >>>> Anyway, it's OK to dump dormant objects as long as their class has been >>>> loaded. The problem happens only when we dump a dormant object who class >>>> is not yet loaded (Eclipase MAT get confused when it sees an object >>>> whose class ID is invalid). >>> Yes. That's a scenario needs to be handled for a tool that iterates >>> the Java heap. A dormant object in the 'Open' archive heap region may >>> have a 'invalid' klass since the klass may not be loaded yet at the >>> moment. >>> >>> Your webrev looks reasonable to me on high level pending information >>> for following questions. Can you please give more details on the >>> dormant objects referenced from the arrays? What specific arrays are >>> those? >> Hi Jiangli, >> >> Thanks for the review. I add the following code: >> >> // [id]* elements >> for (int index = 0; index < length; index++) { >> oop o = array->obj_at(index); >>>> >> if (o != NULL && mask_dormant_archived_object(o) == NULL) { >> ResourceMark rm; >> tty->print_cr("%s array contains %s object", >> array->klass()->external_name(), o->klass()->external_name()); >> } >>>> >> o = mask_dormant_archived_object(o); >> writer->write_objectID(o); >> } >> >> and the output is: >> >> $ java -cp . -XX:+HeapDumpAfterFullGC HelloGC >> Dumping heap to java_pid20956.hprof ... >> [Ljava.lang.Object; array contains java.util.jar.Attributes$Name object >> [Ljava.lang.Object; array contains java.util.jar.Attributes$Name object >> (repeated about 20 times) >> >> It comes from java/util/jar/Attributes$Name::KNOWN_NAMES. 
This class is >> not loaded because my program doesn't use JAR files in the classpath: > > The above looks right and is expected. At this point, we would not see > any archive object that first becomes live then becomes dormant again. > > That however will change in the future when we make the object > archiving framework general enough for other JDK and application class > usages. We need to work out the GC details for the live -> dormant > transition when that happens. > > Thanks, > Jiangli > >> >> Thanks >> - Ioi >> >> >>> Regards, >>> Jiangli >>> >>> >>> >>>> So to answer your question, we can have a case with a dormant array >>>> (that contains a dormant object) like this: >>>> >>>> Object[] array = {new ClassNotYetLoaded();} >>>> >>>> After my fix, the array will be dumped (we have no easy way of not doing >>>> that), but its contents becomes this in the .hprof file: >>>> >>>> Object[] array = {null} >>>> >>>> Thanks >>>> - Ioi >>>> >>>> >>>> >>>>>> Hi David, >>>>>> >>>>>> Thanks for the review. >>>>>> >>>>>> The dormant objects are not reachable via the GC roots. They become >>>>>> non-dormant via explicit calls to JVM_InitializeFromArchive, after >>>>>> which they become reachable via the static fields of loaded classes. >>>>> Right, so is there a distinction between non-dormant and reachable at >>>>> the time an object becomes non-dormant? I'm still unclear how a drmant >>>>> array becomes non-dormant but still contains elements that refer to >>>>> dormant objects. >>>>> >>>>>> The only issue here is heap dump is done by scanning all objects in >>>>>> the heap, including unreachable ones >>>>>> >>>>>> HeapObjectDumper obj_dumper(this, writer()); >>>>>> Universe::heap()->safe_object_iterate(&obj_dumper); >>>>>> >>>>>> that's how these dormant objects are discovered during heap dump. 
>>>>>>
>>>>>>> That aside, the code changes seem reasonable: you moved the check out
>>>>>>> of DumperSupport::dump_instance and into the higher-level
>>>>>>> HeapObjectDumper::do_object so that it catches instances and arrays,
>>>>>>> plus you added a check for array elements.
>>>>>>>
>>>>>> I am debating whether I should put the masking code in here:
>>>>>>
>>>>>> void DumpWriter::write_objectID(oop o) {
>>>>>>   o = mask_dormant_archived_object(o); /// <---- add
>>>>>>   address a = (address)o;
>>>>>> #ifdef _LP64
>>>>>>   write_u8((u8)a);
>>>>>> #else
>>>>>>   write_u4((u4)a);
>>>>>> #endif
>>>>>> }
>>>>>>
>>>>>>
>>>>>> That way, even if a dormant object (unintentionally) becomes
>>>>>> reachable via the GC roots, we won't write an invalid reference to it
>>>>>> (the object "body" will not be written, so the ID will not point to
>>>>>> anything valid).
>>>>>>
>>>>>> But this seems a little too aggressive to me. What do you think?
>>>>> It does seem a little aggressive as it seems to introduce the dormancy
>>>>> check into a lot of places that don't need it. But as I said I don't
>>>>> know this code so I'm really not the right person to ask.
>>>>>
>>>>> Cheers,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>>
>>

From david.holmes at oracle.com Mon Sep 9 02:15:38 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 9 Sep 2019 12:15:38 +1000
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To: <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com>
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com>
Message-ID:

Hi Dan,

On 7/09/2019 6:50 am, Daniel D. Daugherty wrote:
> Hi David,
>
> I've finally gotten back to this email thread...

Thanks.

>> FYI testing to date:
>>  - tiers 1 -3 all platforms
>>  - hotspot: serviceability/jvmti
>>                           /jdwp
>>             vmTestbase/nsk/jvmti
>>
/jdwp
>>  - JDK: com/sun/jdi
>
> You should also add:
>
> open/test/hotspot/jtreg/vmTestbase/nsk/jdb
> open/test/hotspot/jtreg/vmTestbase/nsk/jdi
> open/test/jdk/java/lang/instrument

Okay - in progress. Though I can't see any use of RawMonitors in any
of these tests.

> I took a quick look through the preliminary webrev and I don't see
> anything that worries me.

Thanks. I'll prepare a more polished webrev soon.

> Re: Thread.interrupt() and raw_wait()
>
> It would be good to see if that semantic is being tested via the
> JCK test suite for JVM/TI.

It isn't. The only thing directly tested for RawMonitorWait is normal
successful operation and reporting "not owner" when not the owner. No
check for JVMTI_ERROR_INTERRUPT exists other than as input for the
GetErrorName function.

There's only one test in the whole test base that checks for the
interrupt and that is
vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/. In that test if we
are not interrupted before the RawMonitorWait we will wait until the
full timeout elapses - which is 2 minutes by default - then return and
report the interrupt. Hence the test still passes. (If it was an
untimed wait that would be different of course).

The more I try to convince people this change should be okay, the more
uncomfortable I get with my own arguments. :) I think I'm going to
implement the polling approach for checking interrupts - say 500ms.

> I also very much like/appreciate the decoupling of JvmtiRawMonitors
> from ObjectMonitors... Thanks for tackling this crazy task.

Thanks :)

David

> Dan
>
>
>
> On 8/15/19 2:22 AM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160
>>
>> Preliminary webrev (still has rough edges):
>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/
>>
>> Background:
>>
>> We've had this comment for a long time:
>>
>>  // The raw monitor subsystem is entirely distinct from normal
>>  // java-synchronization or jni-synchronization.
raw monitors are not
>>  // associated with objects.  They can be implemented in any manner
>>  // that makes sense.  The original implementors decided to piggy-back
>>  // the raw-monitor implementation on the existing Java objectMonitor mechanism.
>>  // This flaw needs to fixed.  We should reimplement raw monitors as sui-generis.
>>  // Specifically, we should not implement raw monitors via java monitors.
>>  // Time permitting, we should disentangle and deconvolve the two implementations
>>  // and move the resulting raw monitor implementation over to the JVMTI directories.
>>  // Ideally, the raw monitor implementation would be built on top of
>>  // park-unpark and nothing else.
>>
>> This is an attempt to do that disentangling so that we can then
>> consider changes to ObjectMonitor without having to worry about
>> JvmtiRawMonitors. But rather than building on low-level park/unpark
>> (which would require the same manual queue management and much of the
>> same complex code as exists in ObjectMonitor) I decided to try and do
>> this on top of PlatformMonitor.
>>
>> The reason this is just a RFC rather than RFR is that I overlooked a
>> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as
>> implemented by ObjectMonitor) they interact with the Thread.interrupt
>> mechanism. This is not clearly stated in the JVM TI specification [1]
>> but only in passing by the possible errors for RawMonitorWait:
>>
>> JVMTI_ERROR_INTERRUPT    Wait was interrupted, try again
>>
>> As I explain in the bug report there is no way to build in proper
>> interrupt support using PlatformMonitor as there is no way we can
>> "interrupt" the low-level pthread_cond_wait. But we can approximate
>> it. What I've done in this preliminary version is just check interrupt
>> state before and after the actual "wait" but we won't get woken by the
>> interrupt once we have actually blocked.
Alternatively we could use a
>> periodic polling approach and wakeup every Nms to check for interruption.
>>
>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not
>> affected by this choice as that code ignores the interrupt until the
>> real action it was waiting for has occurred. The interrupt is then
>> reposted later.
>>
>> But more generally there could be users of JvmtiRawMonitors that
>> expect/require that RawMonitorWait is responsive to Thread.interrupt
>> in a manner similar to Object.wait. And if any of them are reading
>> this then I'd like to know - hence this RFC :)
>>
>> FYI testing to date:
>>  - tiers 1 -3 all platforms
>>  - hotspot: serviceability/jvmti
>>                           /jdwp
>>             vmTestbase/nsk/jvmti
>>                           /jdwp
>>  - JDK: com/sun/jdi
>>
>> Comments/opinions appreciated.
>>
>> Thanks,
>> David
>>
>> [1]
>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait
>>
>

From richard.reingruber at sap.com Mon Sep 9 09:07:44 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 9 Sep 2019 09:07:44 +0000
Subject: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException
In-Reply-To:
References: <985bd047-9ade-afbd-cde5-c29ba9bc0bd4@oracle.com>
Message-ID:

Hi Alex,

> Of course error can be cleared before each try - there is no functional
> difference.

It is just a little confusing, as you can get an exception in L. 95, too. But I'm ok with it,
if you prefer it like this.

I would suggest, though, to sleep some ms before a retry and double the sleep time in each
following retry.

Best regards, Richard.

-----Original Message-----
From: Alex Menkov
Sent: Friday, 6.
September 2019 22:52
To: Reingruber, Richard ; serguei.spitsyn at oracle.com; OpenJDK Serviceability
Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException

Hi Richard,

On 09/06/2019 09:28, Reingruber, Richard wrote:
> Hi Alex,
>
> that's a good fix for the issue.
>
> One minor thing:
>
>    89         Exception error = null;
>    90         for (int retry = 0; retry < 5; retry++) {
>    91             try {
>    92                 log("retry: " + retry);
>    93                 s = new Socket("localhost", port);
>    94                 error = null;
>    95                 s.getOutputStream().write("JDWP-".getBytes("UTF-8"));
>    96                 break;
>    97             } catch (ConnectException ex) {
>    98                 log("got exception: " + ex.toString());
>    99                 error = ex;
>   100             }
>   101         }
>   102         if (error != null) {
>   103             throw error;
>   104         }
>
> Is there a reason to clear the local variable error in line 94 instead of clearing it
> in line 91 where each new attempt begins?

The logic here is:
The cycle has 2 exits:
- error (max retry attempts reached, error is set by last "catch")
- success ("break" statement at line 96, error should be null)
So error is cleared only after the socket is connected (this is the
problematic operation which can cause ConnectException).
Of course error can be cleared before each try - there is no functional
difference.

--alex

>
> Cheers, Richard.
>
> -----Original Message-----
> From: serviceability-dev On Behalf Of serguei.spitsyn at oracle.com
> Sent: Wednesday, 4. September 2019 22:11
> To: Alex Menkov ; OpenJDK Serviceability
> Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException
>
> Hi Alex,
>
> The fix looks good.
> Good simplification!
>
> Thanks,
> Serguei
>
>
> On 9/4/19 12:19, Alex Menkov wrote:
>> Hi all,
>>
>> Please review the fix for BadHandshakeTest test.
>> The problem is the test connects to the server twice and if debuggee
>> hasn't yet handled disconnection, the next connect gets "connection
>> refused" error.
>> Instead of adding delay before 2nd connect (we never know "good" value
>> for the delay and big delay can cause "accept timeout"), the test
>> re-tries connect in case of ConnectException.
>> Also improved/simplified the test slightly - debuggee is now run with
>> auto port assignment (used lib.jdb.Debuggee test class which
>> implements required functionality).
>>
>> jira:
>>   https://bugs.openjdk.java.net/browse/JDK-8192057
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev/
>>
>> --alex
>

From daniel.daugherty at oracle.com Mon Sep 9 15:29:59 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 9 Sep 2019 11:29:59 -0400
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To:
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com>
Message-ID: <0dd42545-847f-1bc8-3066-f5751cb0f57e@oracle.com>

On 9/8/19 10:15 PM, David Holmes wrote:
> Hi Dan,
>
> On 7/09/2019 6:50 am, Daniel D. Daugherty wrote:
>> Hi David,
>>
>> I've finally gotten back to this email thread...
>
> Thanks.
>
>>> FYI testing to date:
>>>  - tiers 1 -3 all platforms
>>>  - hotspot: serviceability/jvmti
>>>                           /jdwp
>>>             vmTestbase/nsk/jvmti
>>>                           /jdwp
>>>  - JDK: com/sun/jdi
>>
>> You should also add:
>>
>> open/test/hotspot/jtreg/vmTestbase/nsk/jdb
>> open/test/hotspot/jtreg/vmTestbase/nsk/jdi
>> open/test/jdk/java/lang/instrument
>
> Okay - in progress. Though I can't see any use of RawMonitors in any
> of these tests.

I wouldn't expect direct use. I would expect built-on-top-of use.
In particular, there are scenario tests in nsk/jdi that use JDI in
different ways than straight up API tests and shake out bugs.

>
>> I took a quick look through the preliminary webrev and I don't see
>> anything that worries me.
>
> Thanks. I'll prepare a more polished webrev soon.
>
>> Re: Thread.interrupt() and raw_wait()
>>
>> It would be good to see if that semantic is being tested via the
>> JCK test suite for JVM/TI.
>
> It isn't. The only thing directly tested for RawMonitorWait is normal
> successful operation and reporting "not owner" when not the owner. No
> check for JVMTI_ERROR_INTERRUPT exists other than as input for the
> GetErrorName function.
>
> There's only one test in the whole test base that checks for the
> interrupt and that is
> vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/. In that test if we
> are not interrupted before the RawMonitorWait we will wait until the
> full timeout elapses - which is 2 minutes by default - then return and
> report the interrupt. Hence the test still passes. (If it was an
> untimed wait that would be different of course).
>
> The more I try to convince people this change should be okay, the more
> uncomfortable I get with my own arguments. :) I think I'm going to
> implement the polling approach for checking interrupts - say 500ms.

I'll keep an eye open for the update...

Dan

>
>> I also very much like/appreciate the decoupling of JvmtiRawMonitors
>> from ObjectMonitors... Thanks for tackling this crazy task.
>
> Thanks :)
>
> David
>
>> Dan
>>
>>
>>
>> On 8/15/19 2:22 AM, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160
>>>
>>> Preliminary webrev (still has rough edges):
>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/
>>>
>>> Background:
>>>
>>> We've had this comment for a long time:
>>>
>>>  // The raw monitor subsystem is entirely distinct from normal
>>>  // java-synchronization or jni-synchronization.  raw monitors are not
>>>  // associated with objects.  They can be implemented in any manner
>>>  // that makes sense.  The original implementors decided to piggy-back
>>>  // the raw-monitor implementation on the existing Java objectMonitor mechanism.
>>>  // This flaw needs to fixed.
We should reimplement raw monitors as sui-generis.
>>>  // Specifically, we should not implement raw monitors via java monitors.
>>>  // Time permitting, we should disentangle and deconvolve the two implementations
>>>  // and move the resulting raw monitor implementation over to the JVMTI directories.
>>>  // Ideally, the raw monitor implementation would be built on top of
>>>  // park-unpark and nothing else.
>>>
>>> This is an attempt to do that disentangling so that we can then
>>> consider changes to ObjectMonitor without having to worry about
>>> JvmtiRawMonitors. But rather than building on low-level park/unpark
>>> (which would require the same manual queue management and much of
>>> the same complex code as exists in ObjectMonitor) I decided to try
>>> and do this on top of PlatformMonitor.
>>>
>>> The reason this is just a RFC rather than RFR is that I overlooked a
>>> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as
>>> implemented by ObjectMonitor) they interact with the
>>> Thread.interrupt mechanism. This is not clearly stated in the JVM TI
>>> specification [1] but only in passing by the possible errors for
>>> RawMonitorWait:
>>>
>>> JVMTI_ERROR_INTERRUPT    Wait was interrupted, try again
>>>
>>> As I explain in the bug report there is no way to build in proper
>>> interrupt support using PlatformMonitor as there is no way we can
>>> "interrupt" the low-level pthread_cond_wait. But we can approximate
>>> it. What I've done in this preliminary version is just check
>>> interrupt state before and after the actual "wait" but we won't get
>>> woken by the interrupt once we have actually blocked. Alternatively
>>> we could use a periodic polling approach and wakeup every Nms to
>>> check for interruption.
>>>
>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not
>>> affected by this choice as that code ignores the interrupt until the
>>> real action it was waiting for has occurred.
The interrupt is then
>>> reposted later.
>>>
>>> But more generally there could be users of JvmtiRawMonitors that
>>> expect/require that RawMonitorWait is responsive to Thread.interrupt
>>> in a manner similar to Object.wait. And if any of them are reading
>>> this then I'd like to know - hence this RFC :)
>>>
>>> FYI testing to date:
>>>  - tiers 1 -3 all platforms
>>>  - hotspot: serviceability/jvmti
>>>                           /jdwp
>>>             vmTestbase/nsk/jvmti
>>>                           /jdwp
>>>  - JDK: com/sun/jdi
>>>
>>> Comments/opinions appreciated.
>>>
>>> Thanks,
>>> David
>>>
>>> [1]
>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait
>>>
>>

From adam.farley at uk.ibm.com Mon Sep 9 16:53:50 2019
From: adam.farley at uk.ibm.com (Adam Farley8)
Date: Mon, 9 Sep 2019 17:53:50 +0100
Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow
In-Reply-To: <3ea2e6ad-a32d-603c-258c-985da4e2f50a@oracle.com>
References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <0c01c46b-3ff9-3016-8791-868019459d13@oracle.com> <3ea2e6ad-a32d-603c-258c-985da4e2f50a@oracle.com>
Message-ID:

Hi Serguei,

Apologies for the delay.

The errors have all been fixed, and the requested tests mostly passed
on Windows and Linux. No test group had more failures post-fix than
pre-fix, so I'm calling that a win.

The new webrev can be found here:
http://cr.openjdk.java.net/~afarley/8229378.2/webrev

Best Regards

Adam Farley
IBM Runtimes

"serguei.spitsyn at oracle.com" wrote on 29/08/2019 19:38:02:

> From: "serguei.spitsyn at oracle.com"
> To: Adam Farley8
> Cc: Chris Plummer ,
> daniel.daugherty at oracle.com, serviceability-dev at openjdk.java.net
> Date: 29/08/2019 19:38
> Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c
> quietly truncates on buffer overflow
>
> Hi Adam,
>
> Okay, thanks!
> Serguei
>
>
> On 8/29/19 06:26, Adam Farley8 wrote:
> Hi Serguei,
>
> I haven't actually run a fastdebug build before. Will do that now
> and address the issues.
>
> Once done, I'll re-run the tests I ran, and also the tests you've
> listed below.
>
> Can you advise on how "good coverage" is determined, so I know for
> future bug fixes?
>
> As for the up-to-date-ness, I'll update the build before doing the above.
>
> Expect a webrev once all of this is complete.
>
> Best Regards
>
> Adam Farley
> IBM Runtimes
>
>
> "serguei.spitsyn at oracle.com" wrote on
> 29/08/2019 03:54:56:
>
> > From: "serguei.spitsyn at oracle.com"
> > To: Adam Farley8
> > Cc: Chris Plummer ,
> > daniel.daugherty at oracle.com, serviceability-dev at openjdk.java.net
> > Date: 29/08/2019 04:23
> > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c
> > quietly truncates on buffer overflow
> >
> > Hi Adam,
> >
> > Sorry for the latency.
> > I was in the process of building, testing and pushing your fix and got the
> > fastdebug build errors below.
> >
> > So, my question is if you've ever built the fastdebug version.
> > This change is in the system-dependent code, so it has to be tested
> > on both Unix and Windows.
> >
> > > My testing was limited to the bug specific test case I mentioned, and the following jdwp tests:
> > >
> > > test/jdk/com/sun/jdi/Jdwp*
> > > test/hotspot/jtreg/serviceability/jdwp
> >
> > This set of tests does not provide good coverage.
> > To make sure nothing is broken you need to run the test/jdk/com/sun/jdi
> > and also the following vmTestbase tests:
> >
> > test/hotspot/jtreg/vmTestbase/nsk/jdi
> > test/hotspot/jtreg/vmTestbase/nsk/jdb
> > test/hotspot/jtreg/vmTestbase/nsk/jdwp
> >
> > BTW, your current webrev is not up-to-date:
> > http://cr.openjdk.java.net/~afarley/8229378/webrev/
> >
> > I guess, the change in the src/hotspot/share/runtime/os.cpp became
> > obsolete after your previous fix that was already pushed.
> >
> > Thanks,
> > Serguei
> >
> > .
. .
> > In file included from /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:37:0:
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c: In function 'dll_build_name':
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:23: error: 'Do' undeclared (first use in this function)
> > #define strdup(p) Do not use this interface.
> > ^
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro 'strdup'
> > paths_copy = strdup(paths);
> > ^
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:23: note: each undeclared identifier is reported only once for each function it appears in
> > #define strdup(p) Do not use this interface.
> > ^
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro 'strdup'
> > paths_copy = strdup(paths);
> > ^
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:26: error: expected ';' before 'not'
> > #define strdup(p) Do not use this interface.
> > ^
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro 'strdup'
> > paths_copy = strdup(paths);
> > ^
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:38:24: error: expected ';' before 'not'
> > #define free(p) Do not use this interface.
> > ^
> > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:71:5: note: in expansion of macro 'free'
> > free(paths_copy);
> > ^
> > gmake[3]: *** [/scratch/sspitsyn/jdk14.1/build/linux-x86_64-server-fastdebug/support/native/jdk.jdwp.agent/libjdwp/linker_md.o] Error 1
> > gmake[2]: *** [jdk.jdwp.agent-libs] Error 1
> > gmake[2]: *** Waiting for unfinished jobs....
> >
> > ERROR: Build failed for target 'images' in configuration 'linux-x86_64-server-fastdebug' (exit code 2)
> >
> >
> >
> > On 8/13/19 09:28, Adam Farley8 wrote:
> > Hi Serguei, Daniel,
> >
> > My testing was limited to the bug specific test case I mentioned,
> > and the following jdwp tests:
> >
> > test/jdk/com/sun/jdi/Jdwp*
> > test/hotspot/jtreg/serviceability/jdwp
> >
> > Best Regards
> >
> > Adam Farley
> > IBM Runtimes
> >
> >
> > "serguei.spitsyn at oracle.com" wrote on
> > 13/08/2019 17:04:43:
> >
> > > From: "serguei.spitsyn at oracle.com"
> > > To: daniel.daugherty at oracle.com, Adam Farley8
> > > , Chris Plummer
> > > Cc: serviceability-dev at openjdk.java.net
> > > Date: 13/08/2019 17:08
> > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c
> > > quietly truncates on buffer overflow
> > >
> > > Hi Adam,
> > >
> > > I'm looking at your fix.
> > > Also interested about your testing.
> > >
> > > Thanks,
> > > Serguei
> > >
> > > On 8/13/19 08:48, Daniel D. Daugherty wrote:
> > > I don't see any information about how this change was tested...
> > > Is there something on another email thread?
> > >
> > > Dan
> > >
> >
> > > On 8/13/19 11:41 AM, Adam Farley8 wrote:
> > > Hi Chris,
> > >
> > > Thanks!
> > >
> > > I understand we need a second reviewer/sponsor to get this change
> > > in. Any volunteers?
> > >
> > > Best Regards
> > >
> > > Adam Farley
> > > IBM Runtimes
> > >
> > >
> > > Chris Plummer wrote on 12/08/2019 21:35:06:
> > >
> > > > From: Chris Plummer
> > > > To: Adam Farley8 , serviceability-dev at openjdk.java.net
> > > > Date: 12/08/2019 21:35
> > > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c
> > > > quietly truncates on buffer overflow
> > > >
> > > > Hi Adam,
> > > >
> > > > It looks good to me.
> > > >
> > > > thanks,
> > > >
> > > > Chris
> > > >
> > > > On 8/12/19 7:34 AM, Adam Farley8 wrote:
> > > > Hi All,
> > > >
> > > > This is a known bug, mentioned in a code comment.
> > > >
> > > > Here is the fix for that bug.
> > > >
> > > > Reviewers and sponsors requested.
> > > >
> > > > Short version: if you set sun.boot.library.path to
> > > > something beyond a system's max path length, the
> > > > current code will return an empty string (rather than
> > > > printing a useful error message and shutting down).
> > > >
> > > > This is also a problem if you've specified multiple
> > > > paths with a separator, as this code seems to wrongly
> > > > assess whether the *total* length exceeds max path
> > > > length. So two 200 char paths on windows will cause
> > > > failure, as the total length is 400 (which is beyond
> > > > max length for windows).
> > > >
> > > > Note that the os.cpp bit of the webrev will not be included
> > > > in the final webrev, it just makes this change trivially
> > > > testable.
> > > >
> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378
> > > > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/
> > > >
> > > >
> > > > Best Regards
> > > >
> > > > Adam Farley
> > > > IBM Runtimes
> > > >
> > > > Unless stated otherwise above:
> > > > IBM United Kingdom Limited - Registered in England and Wales with
> > > > number 741598.
> > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with
> number 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

From serguei.spitsyn at oracle.com Mon Sep 9 18:26:37 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 9 Sep 2019 11:26:37 -0700
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To:
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From david.holmes at oracle.com Mon Sep 9 22:49:27 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Sep 2019 08:49:27 +1000
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To:
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com>
Message-ID:

Hi Serguei,

On 10/09/2019 4:26 am, serguei.spitsyn at oracle.com wrote:
> Hi David,
>
> On 9/8/19 19:15, David Holmes wrote:
>> Hi Dan,
>>
>> On 7/09/2019 6:50 am, Daniel D. Daugherty wrote:
>>> Hi David,
>>>
>>> I've finally gotten back to this email thread...
>>
>> Thanks.
>>
>>>> FYI testing to date:
>>>>  - tiers 1 -3 all platforms
>>>>  - hotspot: serviceability/jvmti
>>>>                           /jdwp
>>>>             vmTestbase/nsk/jvmti
>>>>                           /jdwp
>>>>  - JDK: com/sun/jdi
>>>
>>> You should also add:
>>>
>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdb
>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdi
>>> open/test/jdk/java/lang/instrument
>>
>> Okay - in progress. Though I can't see any use of RawMonitors in any
>> of these tests.
>>
>>> I took a quick look through the preliminary webrev and I don't see
>>> anything that worries me.
>>
>> Thanks. I'll prepare a more polished webrev soon.
>>
>>> Re: Thread.interrupt() and raw_wait()
>>>
>>> It would be good to see if that semantic is being tested via the
>>> JCK test suite for JVM/TI.
>>
>> It isn't. The only thing directly tested for RawMonitorWait is normal
>> successful operation and reporting "not owner" when not the owner. No
>> check for JVMTI_ERROR_INTERRUPT exists other than as input for the
>> GetErrorName function.
>
> This is most likely true.
> My only concern is if RawMonitors can be used in the JCK test libraries
> (low probability).
> I've asked Leonid Kuskov (JCK) to double check this (added to the
> mailing list).
>
>>
>> There's only one test in the whole test base that checks for the
>> interrupt and that is
>> vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/. In that test if we
>> are not interrupted before the RawMonitorWait we will wait until the
>> full timeout elapses - which is 2 minutes by default - then return and
>> report the interrupt. Hence the test still passes. (If it was an
>> untimed wait that would be different of course).
>
> I figured the same last Friday.
> One more place to care about is the NSK test libraries that are located here:
>   test/hotspot/jtreg/vmTestbase/nsk/share
>
> There are a couple of places where the RawMonitor is used:
>   jvmti/hotswap/HotSwap.cpp:    if (!NSK_JVMTI_VERIFY(jvmti->RawMonitorWait(waitLock, millis)))
>   jvmti/jvmti_tools.cpp:    jvmtiError error = env->RawMonitorWait(monitor, millis);
>
> The use in HotSwap.cpp is local.
> The jvmti_tools.cpp defines this:
>   void rawMonitorWait(jvmtiEnv *env, jrawMonitorID monitor, jlong millis) {
>       jvmtiError error = env->RawMonitorWait(monitor, millis);
>       exitOnError(error);
>   }
>
> which is used in the jvmti/agent_tools.cpp but does not depend on
> interrupting of RawMonitors (as far as I can see).
> One more place to mention is:
>
jvmti/DataDumpRequest/datadumpreq001/datadumpreq001.cpp
>
> But I see no problems there as well.
>
>
>
> The JDWP implementation is using RawMonitors.
> Please, see functions debugMonitorWait()/debugMonitorTimedWait() in
> src/jdk.jdwp.agent/share/native/libjdwp/util.c.
>
> It expects the JVMTI_ERROR_INTERRUPT but never makes a call to the JVMTI
> InterruptThread().
> So, it looks like it does not depend on interrupting of RawMonitors in
> any way.
>
>>
>> The more I try to convince people this change should be okay, the more
>> uncomfortable I get with my own arguments. :) I think I'm going to
>> implement the polling approach for checking interrupts - say 500ms.
>
> The JVMTI spec says that the JVMTI_ERROR_INTERRUPT can be returned from
> the RawMonitorWait:
> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait

Yes it does and that is the only thing that implies a connection to
Thread.interrupt.

> which means that RawMonitorWait can be interrupted with the
> Thread.interrupt()
> or JVMTI InterruptThread():
> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#InterruptThread

That's one way to interpret the fact that RawMonitorWait can return
JVMTI_ERROR_INTERRUPT, but the actual interaction between
Thread.interrupt and RawMonitorWait is not explicitly stated. Arguably
you can just check for interruption before and after the wait, to see
whether to return JVMTI_ERROR_INTERRUPT, without necessarily being able
to break out of the wait itself. That's been the whole premise of this
change proposal - that responsiveness to interrupts is more a
quality-of-implementation issue. But in any case I've decided to try the
polling approach so that we won't wait forever if interrupted but not
notified.

Thanks,
David
-----

>
>
> Thanks,
> Serguei
>
>>> I also very much like/appreciate the decoupling of JvmtiRawMonitors
>>> from ObjectMonitors... Thanks for tackling this crazy task.
>>
>> Thanks :)
>>
>> David
>>
>>> Dan
>>>
>>>
>>>
>>> On 8/15/19 2:22 AM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160
>>>>
>>>> Preliminary webrev (still has rough edges):
>>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/
>>>>
>>>> Background:
>>>>
>>>> We've had this comment for a long time:
>>>>
>>>>  // The raw monitor subsystem is entirely distinct from normal
>>>>  // java-synchronization or jni-synchronization.  raw monitors are not
>>>>  // associated with objects.  They can be implemented in any manner
>>>>  // that makes sense.  The original implementors decided to piggy-back
>>>>  // the raw-monitor implementation on the existing Java objectMonitor mechanism.
>>>>  // This flaw needs to fixed.  We should reimplement raw monitors as sui-generis.
>>>>  // Specifically, we should not implement raw monitors via java monitors.
>>>>  // Time permitting, we should disentangle and deconvolve the two implementations
>>>>  // and move the resulting raw monitor implementation over to the JVMTI directories.
>>>>  // Ideally, the raw monitor implementation would be built on top of
>>>>  // park-unpark and nothing else.
>>>>
>>>> This is an attempt to do that disentangling so that we can then
>>>> consider changes to ObjectMonitor without having to worry about
>>>> JvmtiRawMonitors. But rather than building on low-level park/unpark
>>>> (which would require the same manual queue management and much of
>>>> the same complex code as exists in ObjectMonitor) I decided to try
>>>> and do this on top of PlatformMonitor.
>>>>
>>>> The reason this is just a RFC rather than RFR is that I overlooked a
>>>> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as
>>>> implemented by ObjectMonitor) they interact with the
>>>> Thread.interrupt mechanism.
This is not clearly stated in the JVM TI
>>>> specification [1] but only in passing by the possible errors for
>>>> RawMonitorWait:
>>>>
>>>> JVMTI_ERROR_INTERRUPT    Wait was interrupted, try again
>>>>
>>>> As I explain in the bug report there is no way to build in proper
>>>> interrupt support using PlatformMonitor as there is no way we can
>>>> "interrupt" the low-level pthread_cond_wait. But we can approximate
>>>> it. What I've done in this preliminary version is just check
>>>> interrupt state before and after the actual "wait" but we won't get
>>>> woken by the interrupt once we have actually blocked. Alternatively
>>>> we could use a periodic polling approach and wakeup every Nms to
>>>> check for interruption.
>>>>
>>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not
>>>> affected by this choice as that code ignores the interrupt until the
>>>> real action it was waiting for has occurred. The interrupt is then
>>>> reposted later.
>>>>
>>>> But more generally there could be users of JvmtiRawMonitors that
>>>> expect/require that RawMonitorWait is responsive to Thread.interrupt
>>>> in a manner similar to Object.wait. And if any of them are reading
>>>> this then I'd like to know - hence this RFC :)
>>>>
>>>> FYI testing to date:
>>>>  - tiers 1 -3 all platforms
>>>>  - hotspot: serviceability/jvmti
>>>>                           /jdwp
>>>>             vmTestbase/nsk/jvmti
>>>>                           /jdwp
>>>>  - JDK: com/sun/jdi
>>>>
>>>> Comments/opinions appreciated.
>>>> >>>> Thanks, >>>> David >>>> >>>> [1] >>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait >>>> >>> > From serguei.spitsyn at oracle.com Mon Sep 9 23:11:49 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 9 Sep 2019 16:11:49 -0700 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com> Message-ID: <79d37aa9-bd28-a8ab-6d4d-95facce87261@oracle.com> Hi David, On 9/9/19 15:49, David Holmes wrote: > Hi Serguei, . . . >> >> The JDWP implementation is using RawMonitor's. >> Please, see functions debugMonitorWait()/debugMonitorTimedWait() in >> src/jdk.jdwp.agent/share/native/libjdwp/util.c. >> >> It expects the JVMTI_ERROR_INTERRUPT but never makes a call to the >> JVMTI ThreadInterrupt(). >> So, it looks like it does not depend on interrupting of RawMonitor's >> in any way. >> >>> >>> The more I try to convince people this change should be okay, the >>> more uncomfortable I get with my own arguments. :) I think I'm going >>> to implement the polling approach for checking interrupts - say 500ms. >> >> The JVMTI spec tells that the JVMTI_ERROR_INTERRUPT can be returned >> from the RawMonitorWait: >> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait >> > > Yes it does and that is the only thing that implies a connection to > Thread.interrupt. > >> which means that RawMonitorWait can be interrupted with the >> Thread.Interrupt() >> or JVMTI InterruptThread(): >> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#InterruptThread >> > > That's one way to interpret the fact that RawMonitorWait can return > JVMTI_ERROR_INTERRUPT, but the actual interaction between > Thread.interrupt and RawMonitorWait is not explicitly stated. Yes, it is true. The same is true about the actual interaction with the JVMTI InterruptThread(). 
But it seems this function is expected to actually interrupt a thread
waiting in a RawMonitorWait() call or in any other wait() call.

> Arguably you can just check for interruption before and after the
> wait, to see whether to return JVMTI_ERROR_INTERRUPT, without
> necessarily being able to break out of the wait itself. That's been
> the whole premise of this change proposal - that responsiveness to
> interrupts is more a quality-of-implementation issue.

Right, it is another way to interpret it.

>
> But in any case I've decided to try the polling approach so that we
> won't wait forever if interrupted but not notified.

It sounds better.

Thanks,
Serguei

>
> Thanks,
> David
> -----
>
>> Thanks,
>> Serguei

From ivan.gerasimov at oracle.com Tue Sep 10 03:41:58 2019
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Mon, 9 Sep 2019 20:41:58 -0700
Subject: RFR 8230303 : JDB hangs when running monitor command
Message-ID:

Hello!

The jdb utility has a command 'monitor <command>', which executes the
specified command every time the debuggee stops.

If the <command> modifies the list of installed monitors (the simplest
example is 'monitor unmonitor 1'), then jdb hits a
ConcurrentModificationException and hangs the debuggee.

While it is debatable whether modifying the monitor list from within a
monitor command needs to be handled in some specific way, it seems
sensible to at least prevent a hard failure.

The simplest fix appears to be to use CopyOnWriteArrayList, so that an
immutable snapshot of the list is traversed.
Even if the list is modified, the change does not affect any iterator
that exists at that moment.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8230303
WEBREV: http://cr.openjdk.java.net/~igerasim/8230303/00/webrev/

Mach5 control build was successfully run with hs-tier7-rt, which
includes the new test and other jdb related tests.

Would you please help review?
Thanks in advance!
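
To illustrate the failure mode and the proposed fix, here is a minimal
sketch. It is not the jdb source: the class MonitorListDemo and the
methods runAll and failsWithCme are hypothetical names made up for this
example, and the monitor list is modelled as a plain list of command
strings. It only shows that removing an entry while traversing an
ArrayList trips the fail-fast iterator, whereas a CopyOnWriteArrayList
iterator keeps walking the snapshot it started with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for jdb's monitor handling; not the real code.
class MonitorListDemo {

    // "Executes" every monitor command in order. The command
    // "unmonitor 1" removes the first entry while the traversal is
    // still in progress, as in the bug report.
    static int runAll(List<String> monitors) {
        int executed = 0;
        for (String cmd : monitors) {
            executed++;
            if (cmd.equals("unmonitor 1")) {
                monitors.remove(0);
            }
        }
        return executed;
    }

    // Returns true if traversing the list fails with a
    // ConcurrentModificationException.
    static boolean failsWithCme(List<String> monitors) {
        try {
            runAll(monitors);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> commands = Arrays.asList("unmonitor 1", "where", "locals");

        // Fail-fast iterator: the structural change is detected on the
        // next call to the iterator.
        List<String> plain = new ArrayList<>(commands);
        System.out.println("ArrayList fails: " + failsWithCme(plain));

        // Snapshot iterator: the traversal sees the original contents,
        // while the list itself reflects the removal.
        List<String> cow = new CopyOnWriteArrayList<>(commands);
        System.out.println("CopyOnWriteArrayList executed: " + runAll(cow));
        System.out.println("Remaining monitors: " + cow);
    }
}
```

The trade-off is that every mutation of a CopyOnWriteArrayList copies
the backing array, which should be acceptable for a small, rarely
modified list such as a set of monitor commands.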
--
With kind regards,
Ivan Gerasimov

From serguei.spitsyn at oracle.com Tue Sep 10 04:14:40 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 9 Sep 2019 21:14:40 -0700
Subject: RFR 8230303 : JDB hangs when running monitor command
In-Reply-To:
References:
Message-ID: <18791c15-e6bb-04c0-a1bd-fe22b175560b@oracle.com>

Hi Ivan,

Thank you for filing the shadow bug and fixing this issue!
I've targeted the bug to 14 (please correct me if it is wrong).

The fix looks good to me.

Do you have any plans to backport it to the earlier releases?

Thanks,
Serguei

On 9/9/19 20:41, Ivan Gerasimov wrote:
> Hello!
>
> jdb utility has a command 'monitor <command>', which allows to execute
> the specified command every time the debuggee stops.
>
> If the <command> modifies the list of installed monitors (the simplest
> example is 'monitor unmonitor 1'), then jdb hits a
> ConcurrentModificationException, and hangs the debuggee.
>
> While it is questionable, if modifying the monitor list has to be
> implemented in some specific way, it seems sensible to at least
> prevent a hard failure.
>
> The simplest fix appears to be to use CopyOnWriteArrayList, so that an
> immutable snapshot of the list will be traversed.
> Even if the list is modified, it wouldn't affect any iterator that
> might exist at that moment of time.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8230303
> WEBREV: http://cr.openjdk.java.net/~igerasim/8230303/00/webrev/
>
> Mach5 control build was successfully run with hs-tier7-rt, which
> includes the new test and other jdb related tests.
>
> Would you please help review?
> Thanks in advance!
> From ivan.gerasimov at oracle.com Tue Sep 10 05:07:51 2019 From: ivan.gerasimov at oracle.com (Ivan Gerasimov) Date: Mon, 9 Sep 2019 22:07:51 -0700 Subject: RFR 8230303 : JDB hangs when running monitor command In-Reply-To: <18791c15-e6bb-04c0-a1bd-fe22b175560b@oracle.com> References: <18791c15-e6bb-04c0-a1bd-fe22b175560b@oracle.com> Message-ID: Thank you Serguei for review! On 9/9/19 9:14 PM, serguei.spitsyn at oracle.com wrote: > Hi Ivan, > > Thank you for filing the shadow bug and fixing this issue! > I've targeted the bug to 14 (please, fix me if it is wrong). > Yes, this is correct, thanks. > The fix looks good to me. > > Do you have any plans to backport it to the earlier releases? > Yes, I'm planning to request a backport to the JDK 13 shortly. With kind regards, Ivan From ioi.lam at oracle.com Tue Sep 10 06:51:17 2019 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 9 Sep 2019 23:51:17 -0700 Subject: RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects of unloaded classes In-Reply-To: References: <1c6d0620-d0ce-cd62-1108-ce2ccdb692d8@oracle.com> <96a98c4d-6313-3176-538f-08043f807389@oracle.com> <3a510f34-7d34-cf02-f834-7522dadb8caa@oracle.com> Message-ID: <8e88644f-e9d9-d483-83a4-4736fcb4fcba@oracle.com> On 9/6/19 4:06 PM, Jiangli Zhou wrote: > On Fri, Sep 6, 2019 at 3:17 PM Ioi Lam wrote: >> On 9/6/19 11:48 AM, Jiangli Zhou wrote: >>> On Fri, Sep 6, 2019 at 9:43 AM Ioi Lam wrote: >>>> >>>> On 9/5/19 11:11 PM, David Holmes wrote: >>>>> On 6/09/2019 1:39 pm, Ioi Lam wrote: >>>>>> On 9/5/19 8:18 PM, David Holmes wrote: >>>>>>> Hi Ioi, >>>>>>> >>>>>>> On 6/09/2019 12:27 pm, Ioi Lam wrote: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230674 >>>>>>>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01 >>>>>>>> >>>>>>>> >>>>>>>> Please review this small fix: >>>>>>>> >>>>>>>> When CDS is in use, archived objects are memory-mapped into the >>>>>>>> heap (currently G1GC only). 
These objects are partitioned into >>>>>>>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those >>>>>>>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is >>>>>>>> requested. > >>>>>>>> When a subgraph is not loaded, some of the objects in this subgraph >>>>>>>> may belong to a class that's not yet loaded. >>>>>>>> >>>>>>>> The bug happens when such an "dormant" object is dumped, but its class >>>>>>>> is not dumped because the class is not in the system dictionary. >>>>>>>> >>>>>>>> There is already code in DumperSupport::dump_instance() that tries >>>>>>>> to handle dormant objects, but it needs to be extended to cover >>>>>>>> arrays, as well as and references from non-dormant object/arrays to >>>>>>>> dormant ones. >>>>>>> I have to confess I did not pay any attention to the CDS archived >>>>>>> objects work, so I don't have a firm grasp of how you have >>>>>>> implemented things. But I'm wondering how can you have a reference >>>>>>> to a dormant object from a non-dormant one? Shouldn't the act of >>>>>>> becoming non-dormant automatically cause the subgraph from that >>>>>>> object to also become non-dormant? Or do you have "read barriers" to >>>>>>> perform the changes on demand? >>>>>>> >>>> Ah -- my bug title is not correct. >>>> >>>> I changed the bug title (and this e-mail subject) to >>>> >>>> Heap dumps should exclude dormant CDS archived objects **of unloaded >>>> classes** >>>> >>>> During the heap dump, we scan all objects in the heap, regardless of >>>> reachability. There's no way to decide reachability in >>>> HeapObjectDumper::do_object(), unless we perform an actual GC. >>>> >>>> But it's OK to include unreachable objects in the heap dump. (I guess >>>> it's useful to see how much garbage you have in the heap. There's an >>>> option to run a collection before dumping the heap.) 
>>>>
>>>> There are 2 kinds of unreachable objects -- garbage: those that were
>>>> once reachable but no longer are; dormant: the archived objects that
>>>> have never been reachable.
>>> Currently the Java object archiving framework supports only a
>>> one-directional state change: dormant -> live. An archived object can
>>> become a live object from the dormant state, but it cannot go back to
>>> the dormant state. All cases need to be investigated thoroughly before
>>> the 'live -> dormant' transition can be supported. All objects in the
>>> 'Open' archive heap region are associated with the builtin class
>>> loaders and their classes are not unloaded. The existing static fields
>>> for archiving within the JDK classes are selected and the associated
>>> objects do not become garbage once 'installed'.
>>>
>>>> Anyway, it's OK to dump dormant objects as long as their class has been
>>>> loaded. The problem happens only when we dump a dormant object whose
>>>> class is not yet loaded (Eclipse MAT gets confused when it sees an
>>>> object whose class ID is invalid).
>>> Yes. That's a scenario that needs to be handled for a tool that
>>> iterates the Java heap. A dormant object in the 'Open' archive heap
>>> region may have an 'invalid' klass since the klass may not be loaded
>>> yet at the moment.
>>>
>>> Your webrev looks reasonable to me at a high level, pending information
>>> on the following questions. Can you please give more details on the
>>> dormant objects referenced from the arrays? What specific arrays are
>>> those?
>> Hi Jiangli,
>>
>> Thanks for the review.
I add the following code: >> >> // [id]* elements >> for (int index = 0; index < length; index++) { >> oop o = array->obj_at(index); >> >> >> if (o != NULL && mask_dormant_archived_object(o) == NULL) { >> ResourceMark rm; >> tty->print_cr("%s array contains %s object", >> array->klass()->external_name(), o->klass()->external_name()); >> } >> >> >> o = mask_dormant_archived_object(o); >> writer->write_objectID(o); >> } >> >> and the output is: >> >> $ java -cp . -XX:+HeapDumpAfterFullGC HelloGC >> Dumping heap to java_pid20956.hprof ... >> [Ljava.lang.Object; array contains java.util.jar.Attributes$Name object >> [Ljava.lang.Object; array contains java.util.jar.Attributes$Name object >> (repeated about 20 times) >> >> It comes from java/util/jar/Attributes$Name::KNOWN_NAMES. This class is >> not loaded because my program doesn't use JAR files in the classpath: > The above looks right and is expected. At this point, we would not see > any archive object that first becomes live then becomes dormant again. > > That however will change in the future when we make the object > archiving framework general enough for other JDK and application class > usages. We need to work out the GC details for the live -> dormant > transition when that happens. Hi Jiangli, Sure, we should look into that when we expand the scope of object archiving. 
I updated the webrev to log the skipped objects with -Xlog:cds+heap=debug:

http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v02/

Thanks
- Ioi

> Thanks,
> Jiangli
>
>> Thanks
>> - Ioi
>>
>>
>>> Regards,
>>> Jiangli
>>>
>>>
>>>
>>>> So to answer your question, we can have a case with a dormant array
>>>> (that contains a dormant object) like this:
>>>>
>>>> Object[] array = {new ClassNotYetLoaded()};
>>>>
>>>> After my fix, the array will be dumped (we have no easy way of not doing
>>>> that), but its contents become this in the .hprof file:
>>>>
>>>> Object[] array = {null};
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>
>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> Thanks for the review.
>>>>>>
>>>>>> The dormant objects are not reachable via the GC roots. They become
>>>>>> non-dormant via explicit calls to JVM_InitializeFromArchive, after
>>>>>> which they become reachable via the static fields of loaded classes.
>>>>> Right, so is there a distinction between non-dormant and reachable at
>>>>> the time an object becomes non-dormant? I'm still unclear how a dormant
>>>>> array becomes non-dormant but still contains elements that refer to
>>>>> dormant objects.
>>>>>
>>>>>> The only issue here is that the heap dump is done by scanning all
>>>>>> objects in the heap, including unreachable ones
>>>>>>
>>>>>> HeapObjectDumper obj_dumper(this, writer());
>>>>>> Universe::heap()->safe_object_iterate(&obj_dumper);
>>>>>>
>>>>>> that's how these dormant objects are discovered during heap dump.
>>>>>>
>>>>>>> That aside the code changes seem reasonable, you moved the check out
>>>>>>> of DumperSupport::dump_instance and into the higher-level
>>>>>>> HeapObjectDumper::do_object so that it catches instances and arrays,
>>>>>>> plus you added a check for array elements.
>>>>>>> >>>>>> I am debating whether I should put the masking code in here: >>>>>> >>>>>> void DumpWriter::write_objectID(oop o) { >>>>>> o = mask_dormant_archived_object(o); /// <---- add >>>>>> address a = (address)o; >>>>>> #ifdef _LP64 >>>>>> write_u8((u8)a); >>>>>> #else >>>>>> write_u4((u4)a); >>>>>> #endif >>>>>> } >>>>>> >>>>>> >>>>>> That way, even if a dormant object (unintentionally) becomes >>>>>> reachable via the GC roots, we won't write an invalid reference to it >>>>>> (the object "body" will not be written, so the ID will not point to >>>>>> anything valid). >>>>>> >>>>>> But this seems a little too aggressive to me. What do you think? >>>>> It does seem a little aggressive as it seems to introduce the dormancy >>>>> check into a lot of places that don't need it. But as I said I don't >>>>> know this code so I'm really not the right person to ask. >>>>> >>>>> Cheers, >>>>> David >>>>> ----- >>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> From david.holmes at oracle.com Tue Sep 10 07:54:54 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Sep 2019 17:54:54 +1000 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com> Message-ID: <5fe4ee14-f406-850e-34b4-e44485b39349@oracle.com> It turns out that polling for interrupts is actually very difficult to do correctly. There is an inherent race with timing out and being signalled that can result in lost signals. Trying to account for that without introducing other problems would lead to a very complex synchronization mechanism (which would need to track waiters, notifications and a "generation" count). The end result is that we'd break the tie to ObjectMonitor code, but would replace it with new complex untried code. 
:( David On 10/09/2019 8:49 am, David Holmes wrote: > Hi Serguei, > > On 10/09/2019 4:26 am, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> On 9/8/19 19:15, David Holmes wrote: >>> Hi Dan, >>> >>> On 7/09/2019 6:50 am, Daniel D. Daugherty wrote: >>>> Hi David, >>>> >>>> I've finally gotten back to this email thread... >>> >>> Thanks. >>> >>>>> FYI testing to date: >>>>> ?- tiers 1 -3 all platforms >>>>> ?- hotspot: serviceability/jvmti >>>>> ????????????????????????? /jdwp >>>>> ??????????? vmTestbase/nsk/jvmti >>>>> ????????????????????????? /jdwp >>>>> ?- JDK: com/sun/jdi >>>> >>>> You should also add: >>>> >>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdb >>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdi >>>> open/test/jdk/java/lang/instrument >>> >>> Okay - in progress. Though I can't see any use of RawMonitors in any >>> of these tests. >>> >>>> I took a quick look through the preliminary webrev and I don't see >>>> anything that worries me. >>> >>> Thanks. I'll prepare a more polished webrev soon. >>> >>>> Re: Thread.interrupt() and raw_wait() >>>> >>>> It would be good to see if that semantic is being tested via the >>>> JCK test suite for JVM/TI. >>> >>> It isn't. The only thing directly tested for RawMonitorWait is normal >>> successful operation and reporting "not owner" when not the owner. No >>> check for JVMTI_ERROR_INTERRUPT exists other than as input for the >>> GetErrorName function. >> >> This is most likely true. >> My only concern is if RawMonitor's can be used in the JCK test >> libraries (low probability). >> I've asked Leonid Kuskov (JCK) to double check this (added to the >> mailing list). >> >>> >>> There's only one test in the whole test base that checks for the >>> interrupt and that is >>> vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/. In that test if we >>> are not interrupted before the RawMonitorWait we will wait until the >>> full timeout elapses - which is 2 minutes by default - then return >>> and report the interrupt. 
Hence the test still passes. (If it was an >>> untimed wait that would be different of course). >> >> I figured the same last Friday. >> One more place to care about are NSK tests libraries that are located >> here: >> ?? test/hotspot/jtreg/vmTestbase/nsk/share >> >> There are a couple of places where the RawMonitor is used: >> ?? jvmti/hotswap/HotSwap.cpp:??? if >> (!NSK_JVMTI_VERIFY(jvmti->RawMonitorWait(waitLock, millis))) >> ?? jvmti/jvmti_tools.cpp:??? jvmtiError error = >> env->RawMonitorWait(monitor, millis); >> >> The use in HotSwap.cpp is local. >> The jvmti_tools.cpp defines this: >> ?? void rawMonitorWait(jvmtiEnv *env, jrawMonitorID monitor, jlong >> millis) { >> ??? ?? jvmtiError error = env->RawMonitorWait(monitor, millis); >> ??? ?? exitOnError(error); >> ?? } >> >> which is used in the jvmti/agent_tools.cpp but does not depend on >> interrupting of RawMonitor's (as I can see). >> One more place to mention is: >> ?? jvmti/DataDumpRequest/datadumpreq001/datadumpreq001.cpp >> >> But I see no problems there as well. >> >> >> >> The JDWP implementation is using RawMonitor's. >> Please, see functions debugMonitorWait()/debugMonitorTimedWait() in >> src/jdk.jdwp.agent/share/native/libjdwp/util.c. >> >> It expects the JVMTI_ERROR_INTERRUPT but never makes a call to the >> JVMTI ThreadInterrupt(). >> So, it looks like it does not depend on interrupting of RawMonitor's >> in any way. >> >>> >>> The more I try to convince people this change should be okay, the >>> more uncomfortable I get with my own arguments. :) I think I'm going >>> to implement the polling approach for checking interrupts - say 500ms. >> >> The JVMTI spec tells that the JVMTI_ERROR_INTERRUPT can be returned >> from the RawMonitorWait: >> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait >> > > Yes it does and that is the only thing that implies a connection to > Thread.interrupt. 
> >> which means that RawMonitorWait can be interrupted with the >> Thread.Interrupt() >> or JVMTI InterruptThread(): >> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#InterruptThread >> > > That's one way to interpret the fact that RawMonitorWait can return > JVMTI_ERROR_INTERRUPT, but the actual interaction between > Thread.interrupt and RawMonitorWait is not explicitly stated. Arguably > you can just check for interruption before and after the wait, to see > whether to return JVMTI_ERROR_INTERRUPT, without necessarily being able > to break out of the wait itself. That's been the whole premise of this > change proposal - that responsiveness to interrupts is more a > quality-of-implementation issue. > > But in any case I've decided to try the polling approach so that we > won't wait forever if interrupted but not notified. > > Thanks, > David > ----- > >> >> >> Thanks, >> Serguei >> >>>> I also very much like/appreciate the decoupling of JvmtiRawMonitors >>>> from ObjectMonitors... Thanks for tackling this crazy task. >>> >>> Thanks :) >>> >>> David >>> >>>> Dan >>>> >>>> >>>> >>>> On 8/15/19 2:22 AM, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 >>>>> >>>>> Preliminary webrev (still has rough edges): >>>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ >>>>> >>>>> Background: >>>>> >>>>> We've had this comment for a long time: >>>>> >>>>> ?// The raw monitor subsystem is entirely distinct from normal >>>>> ?// java-synchronization or jni-synchronization.? raw monitors are not >>>>> ?// associated with objects.? They can be implemented in any manner >>>>> ?// that makes sense.? The original implementors decided to piggy-back >>>>> ?// the raw-monitor implementation on the existing Java >>>>> objectMonitor mechanism. >>>>> ?// This flaw needs to fixed.? We should reimplement raw monitors >>>>> as sui-generis. >>>>> ?// Specifically, we should not implement raw monitors via java >>>>> monitors. 
>>>>> ?// Time permitting, we should disentangle and deconvolve the two >>>>> implementations >>>>> ?// and move the resulting raw monitor implementation over to the >>>>> JVMTI directories. >>>>> ?// Ideally, the raw monitor implementation would be built on top of >>>>> ?// park-unpark and nothing else. >>>>> >>>>> This is an attempt to do that disentangling so that we can then >>>>> consider changes to ObjectMonitor without having to worry about >>>>> JvmtiRawMonitors. But rather than building on low-level park/unpark >>>>> (which would require the same manual queue management and much of >>>>> the same complex code as exists in ObjectMonitor) I decided to try >>>>> and do this on top of PlatformMonitor. >>>>> >>>>> The reason this is just a RFC rather than RFR is that I overlooked >>>>> a non-trivial aspect of JvmtiRawMonitors: like Java monitors (as >>>>> implemented by ObjectMonitor) they interact with the >>>>> Thread.interrupt mechanism. This is not clearly stated in the JVM >>>>> TI specification [1] but only in passing by the possible errors for >>>>> RawMonitorWait: >>>>> >>>>> JVMTI_ERROR_INTERRUPT??? Wait was interrupted, try again >>>>> >>>>> As I explain in the bug report there is no way to build in proper >>>>> interrupt support using PlatformMonitor as there is no way we can >>>>> "interrupt" the low-level pthread_cond_wait. But we can approximate >>>>> it. What I've done in this preliminary version is just check >>>>> interrupt state before and after the actual "wait" but we won't get >>>>> woken by the interrupt once we have actually blocked. Alternatively >>>>> we could use a periodic polling approach and wakeup every Nms to >>>>> check for interruption. >>>>> >>>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not >>>>> affected by this choice as that code ignores the interrupt until >>>>> the real action it was waiting for has occurred. The interrupt is >>>>> then reposted later. 
>>>>> >>>>> But more generally there could be users of JvmtiRawMonitors that >>>>> expect/require that RawMonitorWait is responsive to >>>>> Thread.interrupt in a manner similar to Object.wait. And if any of >>>>> them are reading this then I'd like to know - hence this RFC :) >>>>> >>>>> FYI testing to date: >>>>> ?- tiers 1 -3 all platforms >>>>> ?- hotspot: serviceability/jvmti >>>>> ????????????????????????? /jdwp >>>>> ??????????? vmTestbase/nsk/jvmti >>>>> ????????????????????????? /jdwp >>>>> ?- JDK: com/sun/jdi >>>>> >>>>> Comments/opinions appreciated. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> [1] >>>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait >>>>> >>>> >> From david.holmes at oracle.com Tue Sep 10 08:52:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Sep 2019 18:52:00 +1000 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <5fe4ee14-f406-850e-34b4-e44485b39349@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <75acc5c5-b45f-a764-55d6-692dd7ca0a1a@oracle.com> <5fe4ee14-f406-850e-34b4-e44485b39349@oracle.com> Message-ID: <774fb8b3-6c00-fcc7-0c25-858933b65f4b@oracle.com> On 10/09/2019 5:54 pm, David Holmes wrote: > It turns out that polling for interrupts is actually very difficult to > do correctly. There is an inherent race with timing out and being > signalled that can result in lost signals. Trying to account for that > without introducing other problems would lead to a very complex > synchronization mechanism (which would need to track waiters, > notifications and a "generation" count). The end result is that we'd > break the tie to ObjectMonitor code, but would replace it with new > complex untried code. :( Or ... 
rather than the initial "unresponsive to interrupts" approach I could take it the other way and have every raw monitor wait limited to, say 500ms, at which point it will return as a "spurious wakeup" and check the interrupt state on the way out ... David > David > > On 10/09/2019 8:49 am, David Holmes wrote: >> Hi Serguei, >> >> On 10/09/2019 4:26 am, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> On 9/8/19 19:15, David Holmes wrote: >>>> Hi Dan, >>>> >>>> On 7/09/2019 6:50 am, Daniel D. Daugherty wrote: >>>>> Hi David, >>>>> >>>>> I've finally gotten back to this email thread... >>>> >>>> Thanks. >>>> >>>>>> FYI testing to date: >>>>>> ?- tiers 1 -3 all platforms >>>>>> ?- hotspot: serviceability/jvmti >>>>>> ????????????????????????? /jdwp >>>>>> ??????????? vmTestbase/nsk/jvmti >>>>>> ????????????????????????? /jdwp >>>>>> ?- JDK: com/sun/jdi >>>>> >>>>> You should also add: >>>>> >>>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdb >>>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdi >>>>> open/test/jdk/java/lang/instrument >>>> >>>> Okay - in progress. Though I can't see any use of RawMonitors in any >>>> of these tests. >>>> >>>>> I took a quick look through the preliminary webrev and I don't see >>>>> anything that worries me. >>>> >>>> Thanks. I'll prepare a more polished webrev soon. >>>> >>>>> Re: Thread.interrupt() and raw_wait() >>>>> >>>>> It would be good to see if that semantic is being tested via the >>>>> JCK test suite for JVM/TI. >>>> >>>> It isn't. The only thing directly tested for RawMonitorWait is >>>> normal successful operation and reporting "not owner" when not the >>>> owner. No check for JVMTI_ERROR_INTERRUPT exists other than as input >>>> for the GetErrorName function. >>> >>> This is most likely true. >>> My only concern is if RawMonitor's can be used in the JCK test >>> libraries (low probability). >>> I've asked Leonid Kuskov (JCK) to double check this (added to the >>> mailing list). 
>>> >>>> >>>> There's only one test in the whole test base that checks for the >>>> interrupt and that is >>>> vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/. In that test if >>>> we are not interrupted before the RawMonitorWait we will wait until >>>> the full timeout elapses - which is 2 minutes by default - then >>>> return and report the interrupt. Hence the test still passes. (If it >>>> was an untimed wait that would be different of course). >>> >>> I figured the same last Friday. >>> One more place to care about are NSK tests libraries that are located >>> here: >>> ?? test/hotspot/jtreg/vmTestbase/nsk/share >>> >>> There are a couple of places where the RawMonitor is used: >>> ?? jvmti/hotswap/HotSwap.cpp:??? if >>> (!NSK_JVMTI_VERIFY(jvmti->RawMonitorWait(waitLock, millis))) >>> ?? jvmti/jvmti_tools.cpp:??? jvmtiError error = >>> env->RawMonitorWait(monitor, millis); >>> >>> The use in HotSwap.cpp is local. >>> The jvmti_tools.cpp defines this: >>> ?? void rawMonitorWait(jvmtiEnv *env, jrawMonitorID monitor, jlong >>> millis) { >>> ??? ?? jvmtiError error = env->RawMonitorWait(monitor, millis); >>> ??? ?? exitOnError(error); >>> ?? } >>> >>> which is used in the jvmti/agent_tools.cpp but does not depend on >>> interrupting of RawMonitor's (as I can see). >>> One more place to mention is: >>> ?? jvmti/DataDumpRequest/datadumpreq001/datadumpreq001.cpp >>> >>> But I see no problems there as well. >>> >>> >>> >>> The JDWP implementation is using RawMonitor's. >>> Please, see functions debugMonitorWait()/debugMonitorTimedWait() in >>> src/jdk.jdwp.agent/share/native/libjdwp/util.c. >>> >>> It expects the JVMTI_ERROR_INTERRUPT but never makes a call to the >>> JVMTI ThreadInterrupt(). >>> So, it looks like it does not depend on interrupting of RawMonitor's >>> in any way. >>> >>>> >>>> The more I try to convince people this change should be okay, the >>>> more uncomfortable I get with my own arguments. 
:) I think I'm going >>>> to implement the polling approach for checking interrupts - say 500ms. >>> >>> The JVMTI spec tells that the JVMTI_ERROR_INTERRUPT can be returned >>> from the RawMonitorWait: >>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait >>> >> >> Yes it does and that is the only thing that implies a connection to >> Thread.interrupt. >> >>> which means that RawMonitorWait can be interrupted with the >>> Thread.Interrupt() >>> or JVMTI InterruptThread(): >>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#InterruptThread >>> >> >> That's one way to interpret the fact that RawMonitorWait can return >> JVMTI_ERROR_INTERRUPT, but the actual interaction between >> Thread.interrupt and RawMonitorWait is not explicitly stated. Arguably >> you can just check for interruption before and after the wait, to see >> whether to return JVMTI_ERROR_INTERRUPT, without necessarily being >> able to break out of the wait itself. That's been the whole premise of >> this change proposal - that responsiveness to interrupts is more a >> quality-of-implementation issue. >> >> But in any case I've decided to try the polling approach so that we >> won't wait forever if interrupted but not notified. >> >> Thanks, >> David >> ----- >> >>> >>> >>> Thanks, >>> Serguei >>> >>>>> I also very much like/appreciate the decoupling of JvmtiRawMonitors >>>>> from ObjectMonitors... Thanks for tackling this crazy task. >>>> >>>> Thanks :) >>>> >>>> David >>>> >>>>> Dan >>>>> >>>>> >>>>> >>>>> On 8/15/19 2:22 AM, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 >>>>>> >>>>>> Preliminary webrev (still has rough edges): >>>>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ >>>>>> >>>>>> Background: >>>>>> >>>>>> We've had this comment for a long time: >>>>>> >>>>>> ?// The raw monitor subsystem is entirely distinct from normal >>>>>> ?// java-synchronization or jni-synchronization.? 
raw monitors are >>>>>> not >>>>>> // associated with objects. They can be implemented in any manner >>>>>> // that makes sense. The original implementors decided to >>>>>> piggy-back >>>>>> // the raw-monitor implementation on the existing Java >>>>>> objectMonitor mechanism. >>>>>> // This flaw needs to fixed. We should reimplement raw monitors >>>>>> as sui-generis. >>>>>> // Specifically, we should not implement raw monitors via java >>>>>> monitors. >>>>>> // Time permitting, we should disentangle and deconvolve the two >>>>>> implementations >>>>>> // and move the resulting raw monitor implementation over to the >>>>>> JVMTI directories. >>>>>> // Ideally, the raw monitor implementation would be built on top of >>>>>> // park-unpark and nothing else. >>>>>> >>>>>> This is an attempt to do that disentangling so that we can then >>>>>> consider changes to ObjectMonitor without having to worry about >>>>>> JvmtiRawMonitors. But rather than building on low-level >>>>>> park/unpark (which would require the same manual queue management >>>>>> and much of the same complex code as exists in ObjectMonitor) I >>>>>> decided to try and do this on top of PlatformMonitor. >>>>>> >>>>>> The reason this is just a RFC rather than RFR is that I overlooked >>>>>> a non-trivial aspect of JvmtiRawMonitors: like Java monitors (as >>>>>> implemented by ObjectMonitor) they interact with the >>>>>> Thread.interrupt mechanism. This is not clearly stated in the JVM >>>>>> TI specification [1] but only in passing by the possible errors >>>>>> for RawMonitorWait: >>>>>> >>>>>> JVMTI_ERROR_INTERRUPT Wait was interrupted, try again >>>>>> >>>>>> As I explain in the bug report there is no way to build in proper >>>>>> interrupt support using PlatformMonitor as there is no way we can >>>>>> "interrupt" the low-level pthread_cond_wait. But we can >>>>>> approximate it.
What I've done in this preliminary version is just >>>>>> check interrupt state before and after the actual "wait" but we >>>>>> won't get woken by the interrupt once we have actually blocked. >>>>>> Alternatively we could use a periodic polling approach and wakeup >>>>>> every Nms to check for interruption. >>>>>> >>>>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is >>>>>> not affected by this choice as that code ignores the interrupt >>>>>> until the real action it was waiting for has occurred. The >>>>>> interrupt is then reposted later. >>>>>> >>>>>> But more generally there could be users of JvmtiRawMonitors that >>>>>> expect/require that RawMonitorWait is responsive to >>>>>> Thread.interrupt in a manner similar to Object.wait. And if any of >>>>>> them are reading this then I'd like to know - hence this RFC :) >>>>>> >>>>>> FYI testing to date: >>>>>> ?- tiers 1 -3 all platforms >>>>>> ?- hotspot: serviceability/jvmti >>>>>> ????????????????????????? /jdwp >>>>>> ??????????? vmTestbase/nsk/jvmti >>>>>> ????????????????????????? /jdwp >>>>>> ?- JDK: com/sun/jdi >>>>>> >>>>>> Comments/opinions appreciated. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> [1] >>>>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait >>>>>> >>>>> >>> From alexey.menkov at oracle.com Tue Sep 10 20:33:29 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 10 Sep 2019 13:33:29 -0700 Subject: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException In-Reply-To: References: <985bd047-9ade-afbd-cde5-c29ba9bc0bd4@oracle.com> Message-ID: Hi Richard, Updated webrev: http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev.2/ added delay between retries & moved error clearing to the beginning of the cycle. --alex On 09/09/2019 02:07, Reingruber, Richard wrote: > Hi Alex, > > > Of course error can be cleared before each try - there is not functional > > difference. 
> > It is just a little confusing, as you can get an exception in L. 95, too. But I'm ok with it, if you > prefer it like this. > > I would suggest, though, to sleep some ms before a retry and double the sleep time in each > following retry. > > Best regards, > Richard. > > -----Original Message----- > From: Alex Menkov > Sent: Freitag, 6. September 2019 22:52 > To: Reingruber, Richard ; serguei.spitsyn at oracle.com; OpenJDK Serviceability > Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException > > Hi Richard, > > On 09/06/2019 09:28, Reingruber, Richard wrote: >> Hi Alex, >> >> that's a good fix for the issue. >> >> One minor thing: >> >> 89 Exception error = null; >> 90 for (int retry = 0; retry < 5; retry++) { >> 91 try { >> 92 log("retry: " + retry); >> 93 s = new Socket("localhost", port); >> 94 error = null; >> 95 s.getOutputStream().write("JDWP-".getBytes("UTF-8")); >> 96 break; >> 97 } catch (ConnectException ex) { >> 98 log("got exception: " + ex.toString()); >> 99 error = ex; >> 100 } >> 101 } >> 102 if (error != null) { >> 103 throw error; >> 104 } >> >> Is there a reason to clear the local variable error in line 94 instead of clearing it >> in line 91 where each new attempt begins? > > The logic here is: > The cycle has 2 exits: > - error (max retry attempts reached, error is set by last "catch") > - success ("break" statement at line 96, error should be null) > So error is cleared only after the socket is connected (this is the > problematic operation which can cause ConnectException). > > Of course error can be cleared before each try - there is not functional > difference. > > --alex > >> >> Cheers, Richard. >> >> -----Original Message----- >> From: serviceability-dev On Behalf Of serguei.spitsyn at oracle.com >> Sent: Mittwoch, 4. 
September 2019 22:11 >> To: Alex Menkov ; OpenJDK Serviceability >> Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException >> >> Hi Alex, >> >> The fix looks good. >> Good simplification! >> >> Thanks, >> Serguei >> >> >> On 9/4/19 12:19, Alex Menkov wrote: >>> Hi all, >>> >>> Please review the fix for BadHandshakeTest test. >>> The problem is the test connects to the server twice and if debuggee >>> hasn't yet handled disconnection, the next connect gets "connection >>> refused" error. >>> Instead of adding delay before 2nd connect (we never know "good" value >>> for the delay and big delay can cause "accept timeout"), the test >>> re-tries connect in case of ConnectException. >>> Also improved/simplified the test slightly - debuggee is now run with >>> auto port assignment (used lib.jdb.Debuggee test class which >>> implements required functionality). >>> >>> jira: >>> ? https://bugs.openjdk.java.net/browse/JDK-8192057 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev/ >>> >>> --alex >> From serguei.spitsyn at oracle.com Tue Sep 10 20:51:27 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Sep 2019 13:51:27 -0700 Subject: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException In-Reply-To: References: <985bd047-9ade-afbd-cde5-c29ba9bc0bd4@oracle.com> Message-ID: Hi Alex, It looks good. Thanks, Serguei On 9/10/19 1:33 PM, Alex Menkov wrote: > Hi Richard, > > Updated webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev.2/ > > added delay between retries & moved error clearing to the beginning of > the cycle. > > --alex > > On 09/09/2019 02:07, Reingruber, Richard wrote: >> Hi Alex, >> >> ?? > Of course error can be cleared before each try - there is not >> functional >> ?? > difference. >> >> It is just a little confusing, as you can get an exception in L. 95, >> too. 
But I'm ok with it, if you >> prefer it like this. >> >> I would suggest, though, to sleep some ms before a retry and double >> the sleep time in each >> following retry. >> >> Best regards, >> Richard. >> >> -----Original Message----- >> From: Alex Menkov >> Sent: Friday, 6 September 2019 22:52 >> To: Reingruber, Richard ; >> serguei.spitsyn at oracle.com; OpenJDK Serviceability >> >> Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java >> fails with java.net.ConnectException >> >> Hi Richard, >> >> On 09/06/2019 09:28, Reingruber, Richard wrote: >>> Hi Alex, >>> >>> that's a good fix for the issue. >>> >>> One minor thing: >>> >>> 89 Exception error = null; >>> 90 for (int retry = 0; retry < 5; retry++) { >>> 91 try { >>> 92 log("retry: " + retry); >>> 93 s = new Socket("localhost", port); >>> 94 error = null; >>> 95 s.getOutputStream().write("JDWP-".getBytes("UTF-8")); >>> 96 break; >>> 97 } catch (ConnectException ex) { >>> 98 log("got exception: " + ex.toString()); >>> 99 error = ex; >>> 100 } >>> 101 } >>> 102 if (error != null) { >>> 103 throw error; >>> 104 } >>> >>> Is there a reason to clear the local variable error in line 94 >>> instead of clearing it >>> in line 91 where each new attempt begins? >> >> The logic here is: >> The cycle has 2 exits: >> - error (max retry attempts reached, error is set by last "catch") >> - success ("break" statement at line 96, error should be null) >> So error is cleared only after the socket is connected (this is the >> problematic operation which can cause ConnectException). >> >> Of course error can be cleared before each try - there is not functional >> difference. >> >> --alex >> >>> >>> Cheers, Richard.
>>> >>> -----Original Message----- >>> From: serviceability-dev >>> On Behalf Of >>> serguei.spitsyn at oracle.com >>> Sent: Mittwoch, 4. September 2019 22:11 >>> To: Alex Menkov ; OpenJDK Serviceability >>> >>> Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java >>> fails with java.net.ConnectException >>> >>> Hi Alex, >>> >>> The fix looks good. >>> Good simplification! >>> >>> Thanks, >>> Serguei >>> >>> >>> On 9/4/19 12:19, Alex Menkov wrote: >>>> Hi all, >>>> >>>> Please review the fix for BadHandshakeTest test. >>>> The problem is the test connects to the server twice and if debuggee >>>> hasn't yet handled disconnection, the next connect gets "connection >>>> refused" error. >>>> Instead of adding delay before 2nd connect (we never know "good" value >>>> for the delay and big delay can cause "accept timeout"), the test >>>> re-tries connect in case of ConnectException. >>>> Also improved/simplified the test slightly - debuggee is now run with >>>> auto port assignment (used lib.jdb.Debuggee test class which >>>> implements required functionality). >>>> >>>> jira: >>>> ? ? https://bugs.openjdk.java.net/browse/JDK-8192057 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev/ >>>> >>>> --alex >>> From serguei.spitsyn at oracle.com Tue Sep 10 20:55:57 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Sep 2019 13:55:57 -0700 Subject: RFR 8230303 : JDB hangs when running monitor command In-Reply-To: References: <18791c15-e6bb-04c0-a1bd-fe22b175560b@oracle.com> Message-ID: Hi Ivan, Thank you for clarifications! Serguei On 9/9/19 10:07 PM, Ivan Gerasimov wrote: > Thank you Serguei for review! > > On 9/9/19 9:14 PM, serguei.spitsyn at oracle.com wrote: >> Hi Ivan, >> >> Thank you for filing the shadow bug and fixing this issue! >> I've targeted the bug to 14 (please, fix me if it is wrong). >> > Yes, this is correct, thanks. > > >> The fix looks good to me. 
>> >> Do you have any plans to backport it to the earlier releases? >> > Yes, I'm planning to request a backport to the JDK 13 shortly. > > With kind regards, > > Ivan > > From leonid.mesnik at oracle.com Wed Sep 11 02:03:53 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 10 Sep 2019 19:03:53 -0700 Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print() Message-ID: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> Hi Could you please review following tiny fix which just add ResourceMark in JvmtiSuspendControl::print() method. The method jvmtiSuspendControl::print() might used in custom builds only for debugging purposes. So I don't know when it was used last time. I found that it crashes when I tried to use it locally. webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8230830 Leonid -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Wed Sep 11 05:03:08 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Sep 2019 15:03:08 +1000 Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print() In-Reply-To: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> References: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> Message-ID: <7cb9162f-1d39-b22d-8927-4a781df0c40e@oracle.com> Hi Leonid, On 11/09/2019 12:03 pm, Leonid Mesnik wrote: > Hi > > Could you please review following tiny fix which just add ResourceMark > in JvmtiSuspendControl::print() method. Looks fine. > The method jvmtiSuspendControl::print() might used in custom builds only > for debugging purposes. So I don't know when it was used last time. I > found that it crashes when I tried to use it locally. The only caller is JvmtiEnv::NotifyFramePop, under TraceJVMTICalls, and it already has a ResourceMark. So existing use is fine. 
Please ensure you test with TraceJVMTICalls enabled. Thanks, David > webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8230830 > > Leonid > > From leonid.mesnik at oracle.com Wed Sep 11 05:06:52 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 10 Sep 2019 22:06:52 -0700 Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print() In-Reply-To: <7cb9162f-1d39-b22d-8927-4a781df0c40e@oracle.com> References: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> <7cb9162f-1d39-b22d-8927-4a781df0c40e@oracle.com> Message-ID: <07735389-F665-463D-8C4B-8A5734234357@oracle.com> Hi Thank you for feedback. > On Sep 10, 2019, at 10:03 PM, David Holmes wrote: > > Hi Leonid, > > On 11/09/2019 12:03 pm, Leonid Mesnik wrote: >> Hi >> Could you please review following tiny fix which just add ResourceMark in JvmtiSuspendControl::print() method. > > Looks fine. > >> The method jvmtiSuspendControl::print() might used in custom builds only for debugging purposes. So I don't know when it was used last time. I found that it crashes when I tried to use it locally. > > The only caller is JvmtiEnv::NotifyFramePop, under TraceJVMTICalls, and it already has a ResourceMark. So existing use is fine. It explains why it works. I used it to track status in suspend resume investigating https://bugs.openjdk.java.net/browse/JDK-8230459 . > > Please ensure you test with TraceJVMTICalls enabled. Thanks, I have tested it with this macro enabled. Leonid > Thanks, > David > > >> webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8230830 >> Leonid -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.plummer at oracle.com Wed Sep 11 05:12:38 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 10 Sep 2019 22:12:38 -0700 Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print() In-Reply-To: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> References: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> Message-ID: <5a41e489-cade-c556-517c-5739c40d8726@oracle.com> Looks good. Chris On 9/10/19 7:03 PM, Leonid Mesnik wrote: > Hi > > Could you please review following tiny fix which just add ResourceMark > in JvmtiSuspendControl::print() method. > > The method jvmtiSuspendControl::print() might used in custom builds > only for debugging purposes. So I don't know when it was used last > time. I found that it crashes when I tried to use it locally. > > webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8230830 > > Leonid > > From serguei.spitsyn at oracle.com Wed Sep 11 06:51:03 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Sep 2019 23:51:03 -0700 Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print() In-Reply-To: <5a41e489-cade-c556-517c-5739c40d8726@oracle.com> References: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> <5a41e489-cade-c556-517c-5739c40d8726@oracle.com> Message-ID: <2fc71d64-4c88-643e-a6fc-3ace291dbb58@oracle.com> Hi Leonid, +1 Thanks, Serguei On 9/10/19 22:12, Chris Plummer wrote: > Looks good. > > Chris > > On 9/10/19 7:03 PM, Leonid Mesnik wrote: >> Hi >> >> Could you please review following tiny fix which just add >> ResourceMark in JvmtiSuspendControl::print() method. >> >> The method jvmtiSuspendControl::print() might used in custom builds >> only for debugging purposes. So I don't know when it was used last >> time. I found that it crashes when I tried to use it locally. 
>> >> webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8230830 >> >> Leonid >> >> > From serguei.spitsyn at oracle.com Wed Sep 11 07:13:47 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Sep 2019 00:13:47 -0700 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <0c01c46b-3ff9-3016-8791-868019459d13@oracle.com> <3ea2e6ad-a32d-603c-258c-985da4e2f50a@oracle.com> Message-ID: <11c16477-c096-b807-46fd-acab1c484587@oracle.com> An HTML attachment was scrubbed... URL: From richard.reingruber at sap.com Wed Sep 11 09:19:50 2019 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 11 Sep 2019 09:19:50 +0000 Subject: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException In-Reply-To: References: <985bd047-9ade-afbd-cde5-c29ba9bc0bd4@oracle.com> Message-ID: Hi Alex, thanks, the change looks good. Best regards, Richard. -----Original Message----- From: Alex Menkov Sent: Dienstag, 10. September 2019 22:33 To: Reingruber, Richard ; serguei.spitsyn at oracle.com; OpenJDK Serviceability Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException Hi Richard, Updated webrev: http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev.2/ added delay between retries & moved error clearing to the beginning of the cycle. --alex On 09/09/2019 02:07, Reingruber, Richard wrote: > Hi Alex, > > > Of course error can be cleared before each try - there is not functional > > difference. > > It is just a little confusing, as you can get an exception in L. 95, too. But I'm ok with it, if you > prefer it like this. > > I would suggest, though, to sleep some ms before a retry and double the sleep time in each > following retry. > > Best regards, > Richard. 
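The suggestion quoted above (sleep before a retry and double the sleep time on each following retry) could be sketched roughly as follows. This is only an illustration and not the code in the webrev: the class name RetryConnect, the method withBackoff and its parameters are made up for this example, and it retries only on ConnectException, as the test loop quoted in this thread does.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; names and parameters are assumptions.
final class RetryConnect {

    // Runs the action, retrying on ConnectException up to maxAttempts times.
    // Sleeps between attempts, doubling the delay after each failure.
    static <T> T withBackoff(Callable<T> action, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        ConnectException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.call();          // success: leave immediately
            } catch (ConnectException ex) {
                last = ex;                     // remember the most recent failure
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(delayMs);     // no sleep after the final attempt
                    delayMs *= 2;
                }
            }
        }
        throw last;                            // every attempt was refused
    }

    public static void main(String[] args) throws Exception {
        // Simulated connect: refused twice, then accepted.
        int[] calls = {0};
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new ConnectException("connection refused (simulated)");
            }
            return "connected";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the test this would wrap the connect step, e.g. withBackoff(() -> new Socket("localhost", port), 5, 100); the attempt count and initial delay used here are arbitrary. Any exception other than ConnectException propagates on the first occurrence.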
> > -----Original Message----- > From: Alex Menkov > Sent: Freitag, 6. September 2019 22:52 > To: Reingruber, Richard ; serguei.spitsyn at oracle.com; OpenJDK Serviceability > Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException > > Hi Richard, > > On 09/06/2019 09:28, Reingruber, Richard wrote: >> Hi Alex, >> >> that's a good fix for the issue. >> >> One minor thing: >> >> 89 Exception error = null; >> 90 for (int retry = 0; retry < 5; retry++) { >> 91 try { >> 92 log("retry: " + retry); >> 93 s = new Socket("localhost", port); >> 94 error = null; >> 95 s.getOutputStream().write("JDWP-".getBytes("UTF-8")); >> 96 break; >> 97 } catch (ConnectException ex) { >> 98 log("got exception: " + ex.toString()); >> 99 error = ex; >> 100 } >> 101 } >> 102 if (error != null) { >> 103 throw error; >> 104 } >> >> Is there a reason to clear the local variable error in line 94 instead of clearing it >> in line 91 where each new attempt begins? > > The logic here is: > The cycle has 2 exits: > - error (max retry attempts reached, error is set by last "catch") > - success ("break" statement at line 96, error should be null) > So error is cleared only after the socket is connected (this is the > problematic operation which can cause ConnectException). > > Of course error can be cleared before each try - there is not functional > difference. > > --alex > >> >> Cheers, Richard. >> >> -----Original Message----- >> From: serviceability-dev On Behalf Of serguei.spitsyn at oracle.com >> Sent: Mittwoch, 4. September 2019 22:11 >> To: Alex Menkov ; OpenJDK Serviceability >> Subject: Re: RFR: JDK-8192057: com/sun/jdi/BadHandshakeTest.java fails with java.net.ConnectException >> >> Hi Alex, >> >> The fix looks good. >> Good simplification! >> >> Thanks, >> Serguei >> >> >> On 9/4/19 12:19, Alex Menkov wrote: >>> Hi all, >>> >>> Please review the fix for BadHandshakeTest test. 
>>> The problem is the test connects to the server twice and if debuggee >>> hasn't yet handled disconnection, the next connect gets "connection >>> refused" error. >>> Instead of adding delay before 2nd connect (we never know "good" value >>> for the delay and big delay can cause "accept timeout"), the test >>> re-tries connect in case of ConnectException. >>> Also improved/simplified the test slightly - debuggee is now run with >>> auto port assignment (used lib.jdb.Debuggee test class which >>> implements required functionality). >>> >>> jira: >>> ? https://bugs.openjdk.java.net/browse/JDK-8192057 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk14/BadHandshakeTest/webrev/ >>> >>> --alex >> From adam.farley at uk.ibm.com Wed Sep 11 11:18:30 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Wed, 11 Sep 2019 12:18:30 +0100 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: <11c16477-c096-b807-46fd-acab1c484587@oracle.com> References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <0c01c46b-3ff9-3016-8791-868019459d13@oracle.com> <3ea2e6ad-a32d-603c-258c-985da4e2f50a@oracle.com> <11c16477-c096-b807-46fd-acab1c484587@oracle.com> Message-ID: Hi Serguei, If you're happy with the fix, then here's a webrev without the os.cpp bit. http://cr.openjdk.java.net/~afarley/8229378.3/webrev/ That was only included to make direct testing possible, and I want to make sure it doesn't get included by accident. Best Regards Adam Farley IBM Runtimes "serguei.spitsyn at oracle.com" wrote on 11/09/2019 08:13:47: > From: "serguei.spitsyn at oracle.com" > To: Adam Farley8 > Cc: Chris Plummer , > daniel.daugherty at oracle.com, serviceability-dev at openjdk.java.net > Date: 11/09/2019 08:14 > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > quietly truncates on buffer overflow > > Hi Adam, > > I'm Okay with this fix. 
> If nobody else have comments then I'll build it, test a little bit > and push to the jdk/jdk repo. > > Thanks, > Serguei > > > On 9/9/19 09:53, Adam Farley8 wrote: > Hi Serguei, > > Apologies for the delay. > > The errors have all been fixed, and the requested tests mostly > passed, windows and linux. > > No test group had more failures post-fix than pre-fix, so I'm > calling that a win. > > The new webrev can be found here: > > http://cr.openjdk.java.net/~afarley/8229378.2/webrev > > Best Regards > > Adam Farley > IBM Runtimes > > > "serguei.spitsyn at oracle.com" wrote on > 29/08/2019 19:38:02: > > > From: "serguei.spitsyn at oracle.com" > > To: Adam Farley8 > > Cc: Chris Plummer , > > daniel.daugherty at oracle.com, serviceability-dev at openjdk.java.net > > Date: 29/08/2019 19:38 > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > quietly truncates on buffer overflow > > > > Hi Adam, > > > > Okay, thanks! > > Serguei > > > > > > On 8/29/19 06:26, Adam Farley8 wrote: > > Hi Serguei, > > > > I haven't actually run a fastdebug build before. Will do that now > > and address the issues. > > > > Once done, I'll re-run the tests I ran, and also the tests you've > > listed below. > > > > Can you advise on how "good coverage" is determined, so I know for > > future bug fixes? > > > > As for the up-to-date-ness, I'll update the build before doing the above. > > > > Expect a webrev once all of this is complete. > > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > > > "serguei.spitsyn at oracle.com" wrote on > > 29/08/2019 03:54:56: > > > > > From: "serguei.spitsyn at oracle.com" > > > To: Adam Farley8 > > > Cc: Chris Plummer , > > > daniel.daugherty at oracle.com, serviceability-dev at openjdk.java.net > > > Date: 29/08/2019 04:23 > > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > > quietly truncates on buffer overflow > > > > > > Hi Adam, > > > > > > Sorry for the latency. 
> > > I was in the process of building, testing and pushing your fix and got the > > > fastdebug build errors below. > > > > > > So, my question is if you've ever built the fastdebug version. > > > This change is in the system-dependent code, so it has to be tested > > > on both Unix and Windows. > > > > > > > My testing was limited to the bug specific test case I mentioned, > > > and the following jdwp tests: > > > > > > > > test/jdk/com/sun/jdi/Jdwp* > > > > test/hotspot/jtreg/serviceability/jdwp > > > > > > This set of tests does not provide good coverage. > > > To make sure nothing is broken you need to run the test/jdk/com/sun/jdi > > > and also the following vmTestbase tests: > > > > > > test/hotspot/jtreg/vmTestbase/nsk/jdi > > > test/hotspot/jtreg/vmTestbase/nsk/jdb > > > test/hotspot/jtreg/vmTestbase/nsk/jdwp > > > > > > BTW, your current webrev is not up-to-date: > > > http://cr.openjdk.java.net/~afarley/8229378/webrev/ > > > > > > I guess, the change in the src/hotspot/share/runtime/os.cpp became > > > obsolete after your previous fix that was already pushed. > > > > > > Thanks, > > > Serguei > > > > > > . . . > > > In file included from /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:37:0: > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c: In function 'dll_build_name': > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:23: error: 'Do' undeclared (first use in this function) > > > #define strdup(p) Do not use this interface. > > > ^ > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro 'strdup'
> > > paths_copy = strdup(paths); > > > ^ > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:23: note: each undeclared identifier is reported > > > only once for each function it appears in > > > #define strdup(p) Do not use this interface. > > > ^ > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro 'strdup' > > > paths_copy = strdup(paths); > > > ^ > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:26: error: expected ';' before 'not' > > > #define strdup(p) Do not use this interface. > > > ^ > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro 'strdup' > > > paths_copy = strdup(paths); > > > ^ > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:38:24: error: expected ';' before 'not' > > > #define free(p) Do not use this interface. > > > ^ > > > /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:71:5: note: in expansion of macro 'free' > > > free(paths_copy); > > > ^ > > > gmake[3]: *** [/scratch/sspitsyn/jdk14.1/build/linux-x86_64-server-fastdebug/support/native/jdk.jdwp.agent/libjdwp/linker_md.o] Error 1 > > > gmake[2]: *** [jdk.jdwp.agent-libs] Error 1 > > > gmake[2]: *** Waiting for unfinished jobs....
> > > > > > ERROR: Build failed for target 'images' in configuration 'linux- > > > x86_64-server-fastdebug' (exit code 2) > > > > > > > > > > > > On 8/13/19 09:28, Adam Farley8 wrote: > > > Hi Serguei, Daniel, > > > > > > My testing was limited to the bug specific test case I mentioned, > > > and the following jdwp tests: > > > > > > test/jdk/com/sun/jdi/Jdwp* > > > test/hotspot/jtreg/serviceability/jdwp > > > > > > Best Regards > > > > > > Adam Farley > > > IBM Runtimes > > > > > > > > > "serguei.spitsyn at oracle.com" wrote on > > > 13/08/2019 17:04:43: > > > > > > > From: "serguei.spitsyn at oracle.com" > > > > To: daniel.daugherty at oracle.com, Adam Farley8 > > > > , Chris Plummer > > > > Cc: serviceability-dev at openjdk.java.net > > > > Date: 13/08/2019 17:08 > > > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > > > quietly truncates on buffer overflow > > > > > > > > Hi Adam, > > > > > > > > I'm looking at your fix. > > > > Also interested about your testing. > > > > > > > > Thanks, > > > > Serguei > > > > > > > > On 8/13/19 08:48, Daniel D. Daugherty wrote: > > > > I don't see any information about how this change was tested... > > > > Is there something on another email thread? > > > > > > > > Dan > > > > > > > > > > > On 8/13/19 11:41 AM, Adam Farley8 wrote: > > > > Hi Chris, > > > > > > > > Thanks! > > > > > > > > I understand we need a second reviewer/sponsor to get this change > > > > in. Any volunteers? > > > > > > > > Best Regards > > > > > > > > Adam Farley > > > > IBM Runtimes > > > > > > > > > > > > Chris Plummer wrote on 12/08/2019 21:35:06: > > > > > > > > > From: Chris Plummer > > > > > To: Adam Farley8 , serviceability- > > > > dev at openjdk.java.net > > > > > Date: 12/08/2019 21:35 > > > > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > > > > quietly truncates on buffer overflow > > > > > > > > > > Hi Adam, > > > > > > > > > > It looks good to me. 
> > > > > > > > > > thanks, > > > > > > > > > > Chris > > > > > > > > > > On 8/12/19 7:34 AM, Adam Farley8 wrote: > > > > > Hi All, > > > > > > > > > > This is a known bug, mentioned in a code comment. > > > > > > > > > > Here is the fix for that bug. > > > > > > > > > > Reviewers and sponsors requested. > > > > > > > > > > Short version: if you set sun.boot.library.path to > > > > > something beyond a system's max path length, the > > > > > current code will return an empty string (rather than > > > > > printing a useful error message and shutting down). > > > > > > > > > > This is also a problem if you've specified multiple > > > > > paths with a separator, as this code seems to wrongly > > > > > assess whether the *total* length exceeds max path > > > > > length. So two 200 char paths on windows will cause > > > > > failure, as the total length is 400 (which is beyond > > > > > max length for windows). > > > > > > > > > > Note that the os.cpp bit of the webrev will not be included > > > > > in the final webrev, it just makes this change trivially > > > > > testable. > > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 > > > > > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ > > > > > > > > > > > > > > > Best Regards > > > > > > > > > > Adam Farley > > > > > IBM Runtimes > > > > > > > > > > Unless stated otherwise above: > > > > > IBM United Kingdom Limited - Registered in England and Wales with > > > > > number 741598. > > > > > Registered office: PO Box 41, North Harbour, Portsmouth, > > Hampshire PO6 3AU > > > > Unless stated otherwise above: > > > > IBM United Kingdom Limited - Registered in England and Wales with > > > > number 741598. > > > > Registered office: PO Box 41, North Harbour, Portsmouth, > Hampshire PO6 3AU > > > Unless stated otherwise above: > > > IBM United Kingdom Limited - Registered in England and Wales with > > > number 741598. 
> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> > Unless stated otherwise above:
> > IBM United Kingdom Limited - Registered in England and Wales with
> > number 741598.
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with
> number 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From christoph.langer at sap.com Wed Sep 11 12:37:45 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Wed, 11 Sep 2019 12:37:45 +0000
Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
Message-ID: 

Hi,

please review this change for test sun/tools/jcmd/TestProcessHelper.java to make it more robust.

Bug: https://bugs.openjdk.java.net/browse/JDK-8230850
Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230850.0/

This Linux only test is starting several Java processes and then tries to figure out the main class by invoking jdk.jcmd's linux specific ProcessHelper implementation which parses the contents of /proc//cmdline.
Under some circumstances, the test already attempts to read /proc//cmdline before it actually exists or is filled with data. This can be fixed with some sleeps/retries to wait for that data to be ready.
In the actual jcmd tool, such behavior of ProcessHelper.getMainClass should not be an issue because it is handled in ProcessArgumentMatcher [0].
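
For illustration only, here is a minimal sketch of the sleep/retry idea described above. It is not the code from the webrev: the class name RetrySketch, the helper retryUntilNonNull, the attempt count and the doubling sleep are all assumptions, and the Supplier is a hypothetical stand-in for the real main-class lookup.

```java
import java.util.function.Supplier;

// Hypothetical sketch, not the webrev code: retry a lookup that may
// return null while the data it reads is not ready yet.
final class RetrySketch {

    /**
     * Calls lookup until it yields a non-null value or maxAttempts is reached.
     * The sleep between attempts starts at initialSleepMillis and doubles each time.
     * Returns null if every attempt returned null or the thread was interrupted.
     */
    public static String retryUntilNonNull(Supplier<String> lookup,
                                           int maxAttempts,
                                           long initialSleepMillis) {
        long sleepMillis = initialSleepMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String result = lookup.get();
            if (result != null) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt status and give up early.
                    Thread.currentThread().interrupt();
                    return null;
                }
                sleepMillis *= 2;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in for the real lookup: not ready on the first two calls.
        int[] calls = {0};
        Supplier<String> fakeLookup = () -> ++calls[0] < 3 ? null : "TestProcess";
        String mainClass = retryUntilNonNull(fakeLookup, 5, 1L);
        System.out.println(mainClass + " after " + calls[0] + " calls");
    }
}
```

The attempts are bounded so that a process whose data never becomes available makes the test fail instead of hang.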
Thanks Christoph [0] http://hg.openjdk.java.net/jdk/jdk/file/8b08eaf9a0eb/src/jdk.jcmd/share/classes/sun/tools/common/ProcessArgumentMatcher.java#l86 -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Wed Sep 11 12:56:53 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 11 Sep 2019 08:56:53 -0400 Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print() In-Reply-To: <07735389-F665-463D-8C4B-8A5734234357@oracle.com> References: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> <7cb9162f-1d39-b22d-8927-4a781df0c40e@oracle.com> <07735389-F665-463D-8C4B-8A5734234357@oracle.com> Message-ID: <44459eb0-3b23-09d5-d8d3-0958486c223e@oracle.com> On 9/11/19 1:06 AM, Leonid Mesnik wrote: > Hi > > Thank you for feedback. > >> On Sep 10, 2019, at 10:03 PM, David Holmes > > wrote: >> >> Hi Leonid, >> >> On 11/09/2019 12:03 pm, Leonid Mesnik wrote: >>> Hi >>> Could you please review following tiny fix which just add >>> ResourceMark in JvmtiSuspendControl::print() method. >> >> Looks fine. >> >>> The method jvmtiSuspendControl::print() might used in custom builds >>> only for debugging purposes. So I don't know when it was used last >>> time. I found that it crashes when I tried to use it locally. >> >> The only caller is JvmtiEnv::NotifyFramePop, under TraceJVMTICalls, >> and it already has a ResourceMark. So existing use is fine. > > It explains why it works. So the question that comes to my mind is whether the ResourceMark that is in JvmtiEnv::NotifyFramePop() is needed for something other than the JvmtiSuspendControl::print() call? If not, then removing the one in JvmtiEnv::NotifyFramePop() in favor of the one you added in JvmtiSuspendControl::print() is a good idea. Dan > I used it to track status in suspend resume investigating > https://bugs.openjdk.java.net/browse/JDK-8230459. 
> >> >> Please ensure you test with TraceJVMTICalls enabled. > > Thanks, I have tested it with this macro enabled. > > Leonid >> Thanks, >> David >> >> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8230830 >>> Leonid > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.stuefe at gmail.com Wed Sep 11 13:59:07 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 11 Sep 2019 15:59:07 +0200 Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently In-Reply-To: References: Message-ID: Hi Christoph, in general I think this is fine. The increase-by-pow2 sleep time is odd but okay :) The whole things seems rather fragile and has a lot of question marks but I think your fix does not make it worse. One fun error now is that with a follow up java test reusing the PID we could get a wrong main class but I think the chances are astronomically low. Only remark, you fix this in the platform shared code, if this is a Linux only issue maybe it should be fixed in /shared/projects/openjdk/jdk-jdk/source/src/jdk.jcmd/linux/classes/sun/tools/ProcessHelper.java instead? If not, I would remove at least the /proc//cmdline comment since this is quite platform specific. Cheers, Thomas On Wed, Sep 11, 2019 at 2:39 PM Langer, Christoph wrote: > Hi, > > > > please review this change for test sun/tools/jcmd/TestProcessHelper.java > to make it more robust. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8230850 > > Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230850.0/ > > > > This Linux only test is starting several Java processes and then tries to > figure out the main class by invoking jdk.jcmd's linux specific > ProcessHelper implementation which parses the contents of > /proc//cmdline. 
> > Under some circumstances, the test already attempts to read > /proc//cmdline before it actually exists or is filled with data. This > can be fixed with some sleeps/retries to wait for that data to be ready. > > In the actual jcmd tool, such behavior of ProcessHelper. getMainClass > should not be an issue because it is handled in ProcessArgumentMatcher [0]. > > > > Thanks > > Christoph > > > > [0] > http://hg.openjdk.java.net/jdk/jdk/file/8b08eaf9a0eb/src/jdk.jcmd/share/classes/sun/tools/common/ProcessArgumentMatcher.java#l86 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Sep 11 16:44:48 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Sep 2019 09:44:48 -0700 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <0c01c46b-3ff9-3016-8791-868019459d13@oracle.com> <3ea2e6ad-a32d-603c-258c-985da4e2f50a@oracle.com> <11c16477-c096-b807-46fd-acab1c484587@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Wed Sep 11 17:21:26 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 11 Sep 2019 10:21:26 -0700 Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From christoph.langer at sap.com Wed Sep 11 21:37:57 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 11 Sep 2019 21:37:57 +0000 Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently In-Reply-To: References: Message-ID: Hi Chris, Thomas, thanks for looking at this. I was also wondering whether a fix in ProcessHelper would be appropriate. But I think introducing retries and delays in that code can do more harm than help. 
For this special test case, aiming to test the ProcessHelper functionality (on Linux) only, the observed problem is that the /proc//cmdline file is not ready yet when it gets evaluated because the test can be quicker than the spawned processes. But in real life usage of jcmd this seems rather unlikely. One will probably use jcmd quite some time after a java process was started and /proc//cmdline should be ready. If then there are problems reading it, there are likely other issues which won't go away by waiting. And for these cases the fallback is to use the attach framework, as implemented in ProcessArgumentMatcher, which provides some chance to be working still. And this fallback should also cover the exotic case when jcmd is issued too early.

After all, ProcessHelper::getMainClass also documents that its result can be null.

@Thomas, as for your other points:

* PID reuse: Hm, maybe one can construct cases. However, I'd think the /proc/pid files should be gone after a process ends. Or at least be reconstructed if there were orphans and a new process reusing an old pid gets started. But who knows what can happen - we'll maybe see.
* Comment for Linux only issue: The test is in fact a Linux only test. See line 55: * @requires os.family == "linux". So, if we'll eventually see implementations for ProcessHelper::getMainClass on other platforms, this comment might have to be adapted. But for the time being I guess it's fine at its current place.

Would you agree?

Best regards
Christoph

From: Chris Plummer
Sent: Wednesday, September 11, 2019 19:21
To: Thomas Stüfe; Langer, Christoph
Cc: OpenJDK Serviceability
Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently

It does seem that the fix should be in ProcessHelper.java in getMainClass(), or maybe even getCommandLine(). Fixing it in the test implies that every user of getMainClass() should be doing something similar. But then also note what ProcessArgumentMatcher.check() is doing.
It also deals with getMainClass() returning null.

thanks,

Chris

On 9/11/19 6:59 AM, Thomas Stüfe wrote:

Hi Christoph,

in general I think this is fine. The increase-by-pow2 sleep time is odd but okay :)

The whole things seems rather fragile and has a lot of question marks but I think your fix does not make it worse. One fun error now is that with a follow up java test reusing the PID we could get a wrong main class but I think the chances are astronomically low.

Only remark, you fix this in the platform shared code, if this is a Linux only issue maybe it should be fixed in /shared/projects/openjdk/jdk-jdk/source/src/jdk.jcmd/linux/classes/sun/tools/ProcessHelper.java instead? If not, I would remove at least the /proc//cmdline comment since this is quite platform specific.

Cheers, Thomas

On Wed, Sep 11, 2019 at 2:39 PM Langer, Christoph wrote:

Hi,

please review this change for test sun/tools/jcmd/TestProcessHelper.java to make it more robust.

Bug: https://bugs.openjdk.java.net/browse/JDK-8230850
Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230850.0/

This Linux only test is starting several Java processes and then tries to figure out the main class by invoking jdk.jcmd's linux specific ProcessHelper implementation which parses the contents of /proc//cmdline.
Under some circumstances, the test already attempts to read /proc//cmdline before it actually exists or is filled with data. This can be fixed with some sleeps/retries to wait for that data to be ready.
In the actual jcmd tool, such behavior of ProcessHelper.getMainClass should not be an issue because it is handled in ProcessArgumentMatcher [0].

Thanks
Christoph

[0] http://hg.openjdk.java.net/jdk/jdk/file/8b08eaf9a0eb/src/jdk.jcmd/share/classes/sun/tools/common/ProcessArgumentMatcher.java#l86
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From christoph.langer at sap.com Wed Sep 11 22:05:40 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Wed, 11 Sep 2019 22:05:40 +0000
Subject: RFR: 8230857: Avoid reflection in sun.tools.common.ProcessHelper
Message-ID: 

Hi,

please review an enhancement which I've identified when working with ProcessHelper for JDK-8230850.

I noticed that ProcessHelper is an interface in common code with a static method that would look up the actual platform implementation via reflection. This seems a little cumbersome since we can have a common dummy for ProcessHelper and override it with the platform specific implementation, leveraging the build system. The only drawback is that the test "TestProcessHelper" gets a little more sophisticated when using only package private methods in ProcessHelper, because the methods need to be made accessible for the test. But I guess that's ok, given the actual tool improvement.

Bug: https://bugs.openjdk.java.net/browse/JDK-8230857
Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230857.0/

Thanks
Christoph
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From david.holmes at oracle.com Wed Sep 11 22:29:13 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 12 Sep 2019 08:29:13 +1000
Subject: RFR: 8230857: Avoid reflection in sun.tools.common.ProcessHelper
In-Reply-To: 
References: 
Message-ID: <555a2cf2-e15e-abb6-5c0a-fb3ff4c0716f@oracle.com>

Hi Christoph,

On 12/09/2019 8:05 am, Langer, Christoph wrote:
> Hi,
>
> please review an enhancement which I've identified when working with
> Processhelper for JDK-8230850.
>
> I noticed that ProcessHelper is an interface in common code with a
> static method that would lookup the actual platform implementation via
> reflection. This seems a little cumbersome since we can have a common
> dummy for ProcessHelper and override it with the platform specific
> implementation, leveraging the build system.

I don't see you leveraging the build system.
You have two source files that compile to the same destination class file. What is ensuring the platform specific version is compiled after the generic one?

Service-provider patterns use reflection to instantiate the service implementation. I don't see any problem here that needs solving.

David
-----

> The only drawback is that
> the test "TestProcessHelper" gets a little more sophisticated when using
> only package private methods in ProcessHelper, because the methods need
> to be made accessible for the test. But I guess that's ok, given the
> actual tool improvement.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8230857
>
> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230857.0/
>
> Thanks
>
> Christoph
>

From leonid.mesnik at oracle.com Wed Sep 11 22:45:26 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 11 Sep 2019 15:45:26 -0700
Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print()
In-Reply-To: <44459eb0-3b23-09d5-d8d3-0958486c223e@oracle.com>
References: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> <7cb9162f-1d39-b22d-8927-4a781df0c40e@oracle.com> <07735389-F665-463D-8C4B-8A5734234357@oracle.com> <44459eb0-3b23-09d5-d8d3-0958486c223e@oracle.com>
Message-ID: 

Hi

It is still needed for
vframe *vf = vframeFor(java_thread, depth);

Leonid

> On Sep 11, 2019, at 5:56 AM, Daniel D. Daugherty wrote:
>
> On 9/11/19 1:06 AM, Leonid Mesnik wrote:
>> Hi
>>
>> Thank you for feedback.
>>
>>> On Sep 10, 2019, at 10:03 PM, David Holmes wrote:
>>>
>>> Hi Leonid,
>>>
>>> On 11/09/2019 12:03 pm, Leonid Mesnik wrote:
>>>> Hi
>>>> Could you please review following tiny fix which just add ResourceMark in JvmtiSuspendControl::print() method.
>>>
>>> Looks fine.
>>>
>>>> The method jvmtiSuspendControl::print() might used in custom builds only for debugging purposes. So I don't know when it was used last time. I found that it crashes when I tried to use it locally.
>>> >>> The only caller is JvmtiEnv::NotifyFramePop, under TraceJVMTICalls, and it already has a ResourceMark. So existing use is fine. >> >> It explains why it works. > > So the question that comes to my mind is whether the ResourceMark > that is in JvmtiEnv::NotifyFramePop() is needed for something > other than the JvmtiSuspendControl::print() call? If not, then > removing the one in JvmtiEnv::NotifyFramePop() in favor of the > one you added in JvmtiSuspendControl::print() is a good idea. > > Dan > > >> I used it to track status in suspend resume investigating https://bugs.openjdk.java.net/browse/JDK-8230459 . >> >>> >>> Please ensure you test with TraceJVMTICalls enabled. >> >> Thanks, I have tested it with this macro enabled. >> >> Leonid >>> Thanks, >>> David >>> >>> >>>> webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/ >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8230830 >>>> Leonid >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Wed Sep 11 23:38:32 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 11 Sep 2019 19:38:32 -0400 Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print() In-Reply-To: References: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> <7cb9162f-1d39-b22d-8927-4a781df0c40e@oracle.com> <07735389-F665-463D-8C4B-8A5734234357@oracle.com> <44459eb0-3b23-09d5-d8d3-0958486c223e@oracle.com> Message-ID: <76e54e33-3b5c-d6c6-b559-b98ba05149cb@oracle.com> Thanks for chasing that down. I'm good with your change. Dan On 9/11/19 6:45 PM, Leonid Mesnik wrote: > Hi > > It is still needed for > vframe *vf = vframeFor(java_thread, depth); > > Leonid > >> On Sep 11, 2019, at 5:56 AM, Daniel D. Daugherty >> > wrote: >> >> On 9/11/19 1:06 AM, Leonid Mesnik wrote: >>> Hi >>> >>> Thank you for feedback. 
>>> >>>> On Sep 10, 2019, at 10:03 PM, David Holmes >>> > wrote: >>>> >>>> Hi Leonid, >>>> >>>> On 11/09/2019 12:03 pm, Leonid Mesnik wrote: >>>>> Hi >>>>> Could you please review following tiny fix which just add >>>>> ResourceMark in JvmtiSuspendControl::print() method. >>>> >>>> Looks fine. >>>> >>>>> The method jvmtiSuspendControl::print() might used in custom >>>>> builds only for debugging purposes. So I don't know when it was >>>>> used last time. I found that it crashes when I tried to use it >>>>> locally. >>>> >>>> The only caller is JvmtiEnv::NotifyFramePop, under TraceJVMTICalls, >>>> and it already has a ResourceMark. So existing use is fine. >>> >>> It explains why it works. >> >> So the question that comes to my mind is whether the ResourceMark >> that is in JvmtiEnv::NotifyFramePop() is needed for something >> other than the JvmtiSuspendControl::print() call? If not, then >> removing the one in JvmtiEnv::NotifyFramePop() in favor of the >> one you added in JvmtiSuspendControl::print() is a good idea. >> >> Dan >> >> >>> I used it to track status in suspend resume investigating >>> https://bugs.openjdk.java.net/browse/JDK-8230459. >>> >>>> >>>> Please ensure you test with TraceJVMTICalls enabled. >>> >>> Thanks, I have tested it with this macro enabled. >>> >>> Leonid >>>> Thanks, >>>> David >>>> >>>> >>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/ >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8230830 >>>>> Leonid >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From leonid.mesnik at oracle.com Wed Sep 11 23:50:45 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Wed, 11 Sep 2019 16:50:45 -0700 Subject: RFR: 8230830: No required ResourceMark in src/hotspot/share/prims/jvmtiImpl.cpp:JvmtiSuspendControl::print() In-Reply-To: <76e54e33-3b5c-d6c6-b559-b98ba05149cb@oracle.com> References: <7D331CDC-55A4-499D-B2D7-59C58CEF22BA@oracle.com> <7cb9162f-1d39-b22d-8927-4a781df0c40e@oracle.com> <07735389-F665-463D-8C4B-8A5734234357@oracle.com> <44459eb0-3b23-09d5-d8d3-0958486c223e@oracle.com> <76e54e33-3b5c-d6c6-b559-b98ba05149cb@oracle.com> Message-ID: David, Daniel, Serguei, Chris Thank you for review. Leonid > On Sep 11, 2019, at 4:38 PM, Daniel D. Daugherty wrote: > > Thanks for chasing that down. I'm good with your change. > > Dan > > > On 9/11/19 6:45 PM, Leonid Mesnik wrote: >> Hi >> >> It is still needed for >> vframe *vf = vframeFor(java_thread, depth); >> >> Leonid >> >>> On Sep 11, 2019, at 5:56 AM, Daniel D. Daugherty > wrote: >>> >>> On 9/11/19 1:06 AM, Leonid Mesnik wrote: >>>> Hi >>>> >>>> Thank you for feedback. >>>> >>>>> On Sep 10, 2019, at 10:03 PM, David Holmes > wrote: >>>>> >>>>> Hi Leonid, >>>>> >>>>> On 11/09/2019 12:03 pm, Leonid Mesnik wrote: >>>>>> Hi >>>>>> Could you please review following tiny fix which just add ResourceMark in JvmtiSuspendControl::print() method. >>>>> >>>>> Looks fine. >>>>> >>>>>> The method jvmtiSuspendControl::print() might used in custom builds only for debugging purposes. So I don't know when it was used last time. I found that it crashes when I tried to use it locally. >>>>> >>>>> The only caller is JvmtiEnv::NotifyFramePop, under TraceJVMTICalls, and it already has a ResourceMark. So existing use is fine. >>>> >>>> It explains why it works. >>> >>> So the question that comes to my mind is whether the ResourceMark >>> that is in JvmtiEnv::NotifyFramePop() is needed for something >>> other than the JvmtiSuspendControl::print() call? 
>>> If not, then
>>> removing the one in JvmtiEnv::NotifyFramePop() in favor of the
>>> one you added in JvmtiSuspendControl::print() is a good idea.
>>>
>>> Dan
>>>
>>>
>>>> I used it to track status in suspend resume investigating https://bugs.openjdk.java.net/browse/JDK-8230459 .
>>>>
>>>>>
>>>>> Please ensure you test with TraceJVMTICalls enabled.
>>>>
>>>> Thanks, I have tested it with this macro enabled.
>>>>
>>>> Leonid
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8230830/webrev.00/
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8230830
>>>>>> Leonid
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From thomas.stuefe at gmail.com Thu Sep 12 08:10:32 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 12 Sep 2019 10:10:32 +0200
Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
In-Reply-To: 
References: 
Message-ID: 

I'm fine with the patch if you would reshape the platform dependent comment. Proposal:

----
- // Depending on hw/os, process helper can return null here
- // because /proc//cmdline is not ready yet. To cover that case,
// give it some retries.
->
+ getMainClass() may return NULL, e.g. due to timing issues. Attempt some limited retries.
----

I do not need another webrev.

Cheers, Thomas

On Wed, Sep 11, 2019 at 11:37 PM Langer, Christoph wrote:

> Hi Chris, Thomas,
>
> thanks for looking at this. I was also wondering whether a fix in
> ProcessHelper would be appropriate. But I think introducing retries and
> delays in that code can do more harm than help.
>
> For this special test case, aiming to test the ProcessHelper functionality
> (on Linux) only, the observed problem is that the /proc//cmdline file
> is not ready yet when it gets evaluated because the test can be quicker
> than the spawned processes. But in real life usage of jcmd this seems
> rather unlikely.
> One will probably use jcmd quite some time after a java
> process was started and /proc//cmdline should be ready. If then there
> are problems reading it, there are likely other issues which won't go away
> by waiting. And for these cases the fallback is to use the attach
> framework, as implemented in ProcessArgumentMatcher, which provides some
> chance to be working still. And this fallback should also cover the exotic
> case when jcmd is issued too early.
>
> After all, ProcessHelper::getMainClass also documents that its result can
> be null.
>
> @Thomas, as for your other points:
>
> - PID reuse: Hm, maybe one can construct cases. However, I'd think
> the /proc/pid files should be gone after a process ends. Or at least be
> reconstructed if there were orphans and a new process reusing an old pid
> gets started. But who knows what can happen - we'll maybe see.
> - Comment for Linux only issue: The test is in fact a Linux only test.
> See line 55: * @requires os.family == "*linux*". So, if we'll
> eventually see implementations for ProcessHelper::getMainClass on other
> platforms, this comment might have to be adapted. But for the time being I
> guess it's fine at its current place.
>
> Would you agree?
>
> Best regards
>
> Christoph
>
> *From:* Chris Plummer
> *Sent:* Wednesday, September 11, 2019 19:21
> *To:* Thomas Stüfe ; Langer, Christoph <
> christoph.langer at sap.com>
> *Cc:* OpenJDK Serviceability
> *Subject:* Re: RFR (S): 8230850: Test
> sun/tools/jcmd/TestProcessHelper.java fails intermittently
>
> It does seem that the fix should be in ProcessHelper.java in
> getMainClass(), or maybe even getCommandLine(). Fixing it in the test
> implies that every user of getMainClass() should be doing something
> similar. But then also note what ProcessArgumentMatcher.check() is doing. It
> also deals with getMainClass() returning null.
>
> thanks,
>
> Chris
>
> On 9/11/19 6:59 AM, Thomas Stüfe wrote:
>
> Hi Christoph,
>
> in general I think this is fine. The increase-by-pow2 sleep time is odd
> but okay :)
>
> The whole things seems rather fragile and has a lot of question marks but
> I think your fix does not make it worse. One fun error now is that with a
> follow up java test reusing the PID we could get a wrong main class but I
> think the chances are astronomically low.
>
> Only remark, you fix this in the platform shared code, if this is a Linux
> only issue maybe it should be fixed
> in /shared/projects/openjdk/jdk-jdk/source/src/jdk.jcmd/linux/classes/sun/tools/ProcessHelper.java
> instead? If not, I would remove at least the /proc//cmdline comment
> since this is quite platform specific.
>
> Cheers, Thomas
>
> On Wed, Sep 11, 2019 at 2:39 PM Langer, Christoph <
> christoph.langer at sap.com> wrote:
>
> Hi,
>
> please review this change for test sun/tools/jcmd/TestProcessHelper.java
> to make it more robust.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8230850
>
> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230850.0/
>
> This Linux only test is starting several Java processes and then tries to
> figure out the main class by invoking jdk.jcmd's linux specific
> ProcessHelper implementation which parses the contents of
> /proc//cmdline.
>
> Under some circumstances, the test already attempts to read
> /proc//cmdline before it actually exists or is filled with data. This
> can be fixed with some sleeps/retries to wait for that data to be ready.
>
> In the actual jcmd tool, such behavior of ProcessHelper.getMainClass
> should not be an issue because it is handled in ProcessArgumentMatcher [0].
>
> Thanks
>
> Christoph
>
> [0]
> http://hg.openjdk.java.net/jdk/jdk/file/8b08eaf9a0eb/src/jdk.jcmd/share/classes/sun/tools/common/ProcessArgumentMatcher.java#l86
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From christoph.langer at sap.com Thu Sep 12 08:12:49 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Thu, 12 Sep 2019 08:12:49 +0000
Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
In-Reply-To: 
References: 
Message-ID: 

Hi Thomas,

sounds reasonable, will do.

Thanks
Christoph

From: Thomas Stüfe
Sent: Thursday, September 12, 2019 10:11
To: Langer, Christoph
Cc: Chris Plummer ; OpenJDK Serviceability
Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently

I'm fine with the patch if you would reshape the platform dependent comment. Proposal:

----
- // Depending on hw/os, process helper can return null here
- // because /proc//cmdline is not ready yet. To cover that case,
// give it some retries.
->
+ getMainClass() may return NULL, e.g. due to timing issues. Attempt some limited retries.
----

I do not need another webrev.

Cheers, Thomas

On Wed, Sep 11, 2019 at 11:37 PM Langer, Christoph wrote:

Hi Chris, Thomas,

thanks for looking at this. I was also wondering whether a fix in ProcessHelper would be appropriate. But I think introducing retries and delays in that code can do more harm than help.

For this special test case, aiming to test the ProcessHelper functionality (on Linux) only, the observed problem is that the /proc//cmdline file is not ready yet when it gets evaluated because the test can be quicker than the spawned processes. But in real life usage of jcmd this seems rather unlikely. One will probably use jcmd quite some time after a java process was started and /proc//cmdline should be ready. If then there are problems reading it, there are likely other issues which won't go away by waiting.
And for these cases the fallback is to use the attach framework, as implemented in ProcessArgumentMatcher, which provides some chance to be working still. And this fallback should also cover the exotic case when jcmd is issued too early.

After all, ProcessHelper::getMainClass also documents that its result can be null.

@Thomas, as for your other points:

* PID reuse: Hm, maybe one can construct cases. However, I'd think the /proc/pid files should be gone after a process ends. Or at least be reconstructed if there were orphans and a new process reusing an old pid gets started. But who knows what can happen - we'll maybe see.
* Comment for Linux only issue: The test is in fact a Linux only test. See line 55: * @requires os.family == "linux". So, if we'll eventually see implementations for ProcessHelper::getMainClass on other platforms, this comment might have to be adapted. But for the time being I guess it's fine at its current place.

Would you agree?

Best regards
Christoph

From: Chris Plummer
Sent: Wednesday, September 11, 2019 19:21
To: Thomas Stüfe; Langer, Christoph
Cc: OpenJDK Serviceability
Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently

It does seem that the fix should be in ProcessHelper.java in getMainClass(), or maybe even getCommandLine(). Fixing it in the test implies that every user of getMainClass() should be doing something similar. But then also note what ProcessArgumentMatcher.check() is doing. It also deals with getMainClass() returning null.

thanks,

Chris

On 9/11/19 6:59 AM, Thomas Stüfe wrote:

Hi Christoph,

in general I think this is fine. The increase-by-pow2 sleep time is odd but okay :)

The whole things seems rather fragile and has a lot of question marks but I think your fix does not make it worse. One fun error now is that with a follow up java test reusing the PID we could get a wrong main class but I think the chances are astronomically low.
Only remark, you fix this in the platform shared code, if this is a Linux only issue maybe it should be fixed in /shared/projects/openjdk/jdk-jdk/source/src/jdk.jcmd/linux/classes/sun/tools/ProcessHelper.java instead? If not, I would remove at least the /proc//cmdline comment since this is quite platform specific.

Cheers, Thomas

On Wed, Sep 11, 2019 at 2:39 PM Langer, Christoph wrote:

Hi,

please review this change for test sun/tools/jcmd/TestProcessHelper.java to make it more robust.

Bug: https://bugs.openjdk.java.net/browse/JDK-8230850
Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230850.0/

This Linux only test is starting several Java processes and then tries to figure out the main class by invoking jdk.jcmd's linux specific ProcessHelper implementation which parses the contents of /proc//cmdline.
Under some circumstances, the test already attempts to read /proc//cmdline before it actually exists or is filled with data. This can be fixed with some sleeps/retries to wait for that data to be ready.
In the actual jcmd tool, such behavior of ProcessHelper.getMainClass should not be an issue because it is handled in ProcessArgumentMatcher [0].

Thanks
Christoph

[0] http://hg.openjdk.java.net/jdk/jdk/file/8b08eaf9a0eb/src/jdk.jcmd/share/classes/sun/tools/common/ProcessArgumentMatcher.java#l86
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sgehwolf at redhat.com Thu Sep 12 09:00:52 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Thu, 12 Sep 2019 11:00:52 +0200
Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
In-Reply-To: 
References: 
Message-ID: <538fbd9d96e88ef3179f6ad7c410e4311e87c420.camel@redhat.com>

Hi Christoph,

Have you considered waiting for TestProcess (the spawned process) to print this on stdout:

"The process started, pid: XXX"

Once that's ready on stdout, checking the main class should always pass.
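
A rough sketch of that suggestion, under stated assumptions: the marker text is taken from the message above, while the class name StartupLineSketch and the helper awaitStartedPid are hypothetical and not part of TestProcessHelper.java. The demo feeds the helper from a StringReader; a real test would presumably wrap the child's Process.getInputStream() instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: block until the child has announced itself on stdout.
final class StartupLineSketch {

    // Marker text as quoted in the mail; the pid follows it on the same line.
    static final String MARKER = "The process started, pid: ";

    /**
     * Reads lines until one starts with MARKER and returns the pid printed after it.
     * Returns -1 if the stream ends before the marker shows up.
     */
    public static long awaitStartedPid(BufferedReader childStdout) {
        try {
            String line;
            while ((line = childStdout.readLine()) != null) {
                if (line.startsWith(MARKER)) {
                    return Long.parseLong(line.substring(MARKER.length()).trim());
                }
            }
            return -1L;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulated child output; a real test would read from the spawned process.
        String simulated = "some startup noise\nThe process started, pid: 4242\n";
        long pid = awaitStartedPid(new BufferedReader(new StringReader(simulated)));
        System.out.println("child reported pid " + pid);
    }
}
```

Note that readLine() blocks, so a real test would still want an overall timeout in case the child never prints the marker.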
I believe the p.isAlive() check which is currently done is insufficient.

Thanks,
Severin

On Thu, 2019-09-12 at 08:12 +0000, Langer, Christoph wrote:
> Hi Thomas,
>
> sounds reasonable, will do.
>
> Thanks
> Christoph
>
> From: Thomas Stüfe
> Sent: Thursday, September 12, 2019 10:11
> To: Langer, Christoph
> Cc: Chris Plummer; OpenJDK Serviceability
> Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
>
> I'm fine with the patch if you would reshape the platform dependent comment. Proposal:
>
> ----
> - // Depending on hw/os, process helper can return null here
> - // because /proc/<pid>/cmdline is not ready yet. To cover that case,
> // give it some retries.
> ->
> + getMainClass() may return NULL, e.g. due to timing issues. Attempt some limited retries.
> ----
> I do not need another webrev.
> Cheers, Thomas

From christoph.langer at sap.com Thu Sep 12 09:30:41 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Thu, 12 Sep 2019 09:30:41 +0000
Subject: RFR: 8230857: Avoid reflection in sun.tools.common.ProcessHelper
In-Reply-To: <555a2cf2-e15e-abb6-5c0a-fb3ff4c0716f@oracle.com>
References: <555a2cf2-e15e-abb6-5c0a-fb3ff4c0716f@oracle.com>

Hi David,

> > please review an enhancement which I've identified when working with
> > Processhelper for JDK-8230850.
> >
> > I noticed that ProcessHelper is an interface in common code with a
> > static method that would lookup the actual platform implementation via
> > reflection.
> > This seems a little cumbersome since we can have a common
> > dummy for ProcessHelper and override it with the platform specific
> > implementation, leveraging the build system.
>
> I don't see you leveraging the build system. You have two source files
> that compile to the same destination class file. What is ensuring the
> platform specific version is compiled after the generic one?
>
> Service-provider patterns use reflection to instantiate the service
> implementation. I don't see any problem here that needs solving.

TL;DR: There are two source files, one in share/classes and one in linux/classes. The build system overrides the share/classes implementation with the linux/classes implementation in the linux build. This is not by coincidence and only one class is contained in the generated jdk.jcmd module. Then there won't be a need for having a service interface and a service implementation that is looked up via reflection (which is not a bad pattern by itself). I agree that it's not a big problem to be solved but still not "no problem".

Here is a longer explanation of how the build system prefers specific implementations of classes and filters generic duplicates:

The SetupJavaCompilation function from JavaCompilation.gmk [0] is used to compile the java sources for JDK modules. In its documentation, for argument SRC [1], it claims: "one or more directories to search for sources. The order of the source roots is significant. The first found file of a certain name has priority". In its implementation the found files are first ordered [3] and duplicates filtered out [4]. The potential source files are handed to SetupJavaCompilation in CompileJavaModules.gmk [5] and were collected by a call to FindModuleSrcDirs [6]. FindModuleSrcDirs iterates over all potential source dirs for Java classes in the module [7]. The evaluated subdirs are (in that order) $(OPENJDK_TARGET_OS)/classes, $(OPENJDK_TARGET_OS_TYPE)/classes and share/classes, as per [8].
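The precedence rule described above ("the first found file of a certain name has priority") can be modeled in a few lines of Java. This is only an illustrative sketch, not the makefile logic itself: the class SourceRootPrioritySketch, its resolve method, and the file names are hypothetical, and the roots are assumed to be supplied in priority order (OS-specific root first, share/classes last).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of "first source root wins" de-duplication.
final class SourceRootPrioritySketch {

    /**
     * For every relative source path, remember the first root that provides it.
     * The map passed in must iterate in priority order (e.g. a LinkedHashMap).
     */
    static Map<String, String> resolve(Map<String, List<String>> rootsInPriorityOrder) {
        Map<String, String> chosenRootByFile = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> root : rootsInPriorityOrder.entrySet()) {
            for (String relativePath : root.getValue()) {
                // putIfAbsent keeps the entry from the earlier (higher priority) root.
                chosenRootByFile.putIfAbsent(relativePath, root.getKey());
            }
        }
        return chosenRootByFile;
    }

    public static void main(String[] args) {
        // Illustrative file names only; the real layout may differ.
        Map<String, List<String>> roots = new LinkedHashMap<>();
        roots.put("linux/classes", List.of("sun/tools/common/ProcessHelper.java"));
        roots.put("unix/classes", List.of());
        roots.put("share/classes", List.of(
                "sun/tools/common/ProcessHelper.java",
                "sun/tools/common/ProcessArgumentMatcher.java"));
        resolve(roots).forEach((file, root) -> System.out.println(root + "/" + file));
    }
}
```

In this model the share/classes copy of a file is only compiled when no higher-priority root provides a file with the same relative path, which is the effect the message relies on.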
Hope that explains what I'm trying to leverage here.

I've uploaded an updated webrev which contains some cleanup to the Test changes: http://cr.openjdk.java.net/~clanger/webrevs/8230857.1/

Thanks
Christoph

[0] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l185
[1] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l157
[3] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l225
[4] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l257
[5] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/CompileJavaModules.gmk#l603
[6] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/CompileJavaModules.gmk#l555
[7] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/Modules.gmk#l300
[8] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/Modules.gmk#l243

From matthias.baesken at sap.com Thu Sep 12 10:11:04 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 12 Sep 2019 10:11:04 +0000
Subject: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code

Hello, please review this small change.

It adds ReleaseStringUTFChars calls at some places in early return cases.
(In src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp, THROW_NEW_DEBUGGER_EXCEPTION contains a return, see the macro declaration

39 #define THROW_NEW_DEBUGGER_EXCEPTION(str) { throwNewDebuggerException(env, str); return;}

)

Bug/webrev:

https://bugs.openjdk.java.net/browse/JDK-8230901

http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.0/

Thanks, Matthias

From thomas.stuefe at gmail.com Thu Sep 12 10:21:49 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 12 Sep 2019 12:21:49 +0200
Subject: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code

Hi Matthias,

your changes look good.

an additional bug:

http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.0/src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp.frames.html

698 #ifndef _LP64
699 atoi(cmdLine_cstr);
700 if (errno) {

Behaviour of atoi() in error case is undefined. errno values are not defined. See: https://pubs.opengroup.org/onlinepubs/009695399/functions/atoi.html

And even if atoi would set errno, this is still not enough since errno may contain a stale value. One would have to set errno=0 before the function call.

If you want to fix this too I'd suggest replacing this call with strtol().

Cheers, Thomas

From matthias.baesken at sap.com Thu Sep 12 11:52:18 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 12 Sep 2019 11:52:18 +0000
Subject: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code

Hi Thomas, thanks for the review. You are correct about atoi.
New webrev: http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/

I had 2 additional observations:

1. With OJDK on solaris 32bit gone for quite some time, we might be able to kick out the whole non _LP64 code because we are always 64 bit (maybe someone could comment if this is a safe assumption, there might be old 32bit solaris core files flying around for some reason even these days?)

http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp.frames.html

696 // some older versions of libproc.so crash when trying to attach 32 bit
697 // debugger to 64 bit core file. check and throw error.
698 #ifndef _LP64
...

2. The usage of atoi is commented here: https://docs.oracle.com/cd/E86824_01/html/E54766/atoi-3c.html

"However, applications should not use the atoi(), atol(), or atoll() functions unless they know the value represented by the argument will be in range for the corresponding result type"

And here: https://pubs.opengroup.org/onlinepubs/009695399/functions/atoi.html

"If the number is not known to be in range, strtol() should be used because atoi() is not required to perform any error checking"

However we have a number of usages in the code where atoi is called without knowing that the argument is in the allowed range. Some examples:

src/hotspot/share/runtime/arguments.cpp-382- if (match_option(option, "-Dsun.java.launcher.pid=", &tail)) {
src/hotspot/share/runtime/arguments.cpp:383: _sun_java_launcher_pid = atoi(tail);
src/hotspot/share/runtime/arguments.cpp-384- continue;

src/java.desktop/unix/native/libawt_xawt/xawt/XToolkit.c
455 value = getenv("_AWT_MAX_POLL_TIMEOUT");
456 if (value != NULL) {
457 AWT_MAX_POLL_TIMEOUT = atoi(value);

src/java.desktop/unix/native/common/awt/X11Color.c-781- if (getenv("CMAPSIZE") != 0) {
src/java.desktop/unix/native/common/awt/X11Color.c:782: cmapsize = atoi(getenv("CMAPSIZE"));

Should I open a bug for these?
Best regards, Matthias

From hohensee at amazon.com Thu Sep 12 15:13:56 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 12 Sep 2019 15:13:56 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <23ED3BD7-3B7E-43DE-84B8-DB52D44D362B@amazon.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <23ED3BD7-3B7E-43DE-84B8-DB52D44D362B@amazon.com>

Ping once more :) Need a confirmatory review to push this. If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review.

Thanks,
Paul

On 9/6/19, 11:08 AM, "hotspot-gc-dev on behalf of Hohensee, Paul" wrote:

Ping. Anyone?

Thanks,

On 9/3/19, 12:39 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote:

Minor update in new webrev http://cr.openjdk.java.net/~phh/8207266/webrev.05/. I removed ensureNonNullThreadIds() in favor of Objects.requireNonNull(ids).

Thanks, Mandy, for your thorough reviews. May I get another reviewer to weigh in?

Paul

From leonid.mesnik at oracle.com Thu Sep 12 16:56:44 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 12 Sep 2019 09:56:44 -0700
Subject: RFR: 8230881: serviceability/sa/TestJmapCore tests fail with java.lang.RuntimeException: Could not find dump file

Hi

Could you please verify the following fix, which updates core filename patterns. Some hosts are configured to compress core files during core dump.
In such cases tests should throw a skipped exception instead of failing while reading compressed cores.
Support for unpacking core.pid.gz files in all tests will be handled in a separate RFE.

Verified that the test passes if a plain core is generated and is skipped otherwise.

webrev: http://cr.openjdk.java.net/~lmesnik/8230881/webrev.00/
bug: https://bugs.openjdk.java.net/browse/JDK-8230881

Leonid

From mandy.chung at oracle.com Thu Sep 12 17:06:48 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Thu, 12 Sep 2019 10:06:48 -0700
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com>
Message-ID: <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com>

On 9/3/19 12:38 PM, Hohensee, Paul wrote:
> Minor update in new webrev http://cr.openjdk.java.net/~phh/8207266/webrev.05/.

I only reviewed the library side implementation, which looks good. I expect the serviceability team to review the test and hotspot change.

> Need a confirmatory review to push this. If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review.

You need another reviewer to advise on the following because I was not close to the ThreadsList work.

2087 ThreadsListHandle tlh;
2088 JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id);
2089
2090 if (java_thread != NULL) {
2091 return java_thread->cooked_allocated_bytes();
2092 }

This looks right to me.
test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java

- "ThreadAllocatedMemory is expected to be disabled");
+ "TEST FAILED: ThreadAllocatedMemory is expected to be disabled");

Prepending "TEST FAILED" to the exception message (in several places) seems redundant since a thrown RuntimeException already signals a test failure.

+ // back-to-back calls shouldn't allocate any memory
+ size = mbean.getThreadAllocatedBytes(id);
+ size1 = mbean.getThreadAllocatedBytes(id);
+ if (size1 != size) {

Is there anything the test can do to help guarantee this?

I didn't closely review this test. The main thing I advise is to improve the reliability of this test. Put another way, we want to ensure that this test change will pass all the time in various test configurations.

Mandy

From christoph.langer at sap.com Thu Sep 12 19:53:18 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Thu, 12 Sep 2019 19:53:18 +0000
Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
In-Reply-To: <538fbd9d96e88ef3179f6ad7c410e4311e87c420.camel@redhat.com>
References: <538fbd9d96e88ef3179f6ad7c410e4311e87c420.camel@redhat.com>

Hi Severin,

that seems an interesting idea for an elegant solution. However, after trying this on a decently fast linux x86 box by leveraging one of these ProcessTools::startProcess methods that would wait for a certain output to appear in the child before returning, I found that the elapsed runtime of the test increases from about 3 seconds to 3 minutes. It evidently takes a lot longer to bootstrap a JVM and get the results of a first println than just forking a process and immediately accessing its proc filesystem. So I think we don't want to do that.

I would like to go with this version (changed the comment to Thomas' suggestion): http://cr.openjdk.java.net/~clanger/webrevs/8230850.1/

Chris, are you ok with it?
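For readers following the thread, the "limited retries with a growing sleep" approach discussed here can be sketched roughly as follows. This is not the code from the webrev: the class RetrySketch, the method retryUntilNonNull, and its parameters are hypothetical, and the lookup is assumed to be any call that may return null for a short while, such as a main-class lookup for a freshly spawned process.

```java
import java.util.function.Supplier;

// Hypothetical helper: retry a lookup that may transiently return null.
final class RetrySketch {

    /**
     * Calls lookup up to maxAttempts times, sleeping between attempts and
     * doubling the sleep each time. Returns the first non-null result, or
     * null if every attempt returned null or the thread was interrupted.
     */
    static <T> T retryUntilNonNull(Supplier<T> lookup, int maxAttempts, long initialSleepMillis) {
        long sleepMillis = initialSleepMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = lookup.get();
            if (result != null) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt status and give up early.
                    Thread.currentThread().interrupt();
                    return null;
                }
                sleepMillis *= 2; // back off: 1x, 2x, 4x, ...
            }
        }
        return null;
    }
}
```

A caller would still have to treat a final null as a legitimate outcome, in line with the point made earlier in the thread that getMainClass is documented to possibly return null.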
Thanks
Christoph

From david.holmes at oracle.com Thu Sep 12 23:46:50 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Sep 2019 09:46:50 +1000
Subject: RFR: 8230881: serviceability/sa/TestJmapCore tests fail with java.lang.RuntimeException: Could not find dump file

Looks good! Thanks for fixing.

David

On 13/09/2019 2:56 am, Leonid Mesnik wrote:
> Hi
>
> Could you please verify following fix which update core filename
> patterns. Some hosts are configured to compress core files during core
> dump. In such cases tests should throw skipped exception instead of
> failing reading compressed cores.
> Fixing all tests to support unpacking core.pid.gz files is going to be
> fixed in separate RFE.
>
>
> Verified that test pass if plain core is generated and skipped in other
> case.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8230881/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8230881
>
> Leonid

From hohensee at amazon.com Fri Sep 13 00:29:49 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Fri, 13 Sep 2019 00:29:49 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com>
Message-ID:

Thanks for clarifying the review rules. Would someone from the serviceability team please review? New webrev at

http://cr.openjdk.java.net/~phh/8207266/webrev.07/

I didn't disturb the existing checks in the test, just added code to check the result of getThreadAllocatedBytes(long) on a non-current thread, plus the back-to-back no-allocation checks. The former wasn't needed before because getThreadAllocatedBytes(long) was just a wrapper around getThreadAllocatedBytes(long[]). This patch changes that, so I added a separate test. The latter is supposed to fail if there's object allocation on calls to getCurrentThreadAllocatedBytes and getThreadAllocatedBytes(long). I.e., a feature, not a bug, because accumulation of transient small objects can be a performance problem. Thanks to your review, I noticed that the back-to-back check on the current thread was using getThreadAllocatedBytes(long) instead of getCurrentThreadAllocatedBytes and fixed it. I also removed all instances of "TEST FAILED: ".

Paul

From: Mandy Chung
Date: Thursday, September 12, 2019 at 10:09 AM
To: "Hohensee, Paul"
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

On 9/3/19 12:38 PM, Hohensee, Paul wrote:

Minor update in new webrev http://cr.openjdk.java.net/~phh/8207266/webrev.05/.

I only reviewed the library side implementation that looks good. I expect the serviceability team to review the test and hotspot change.

Need a confirmatory review to push this. If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review.

You need another reviewer to advise on the following because I was not close to the ThreadsList work.

    ThreadsListHandle tlh;
    JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id);

    if (java_thread != NULL) {
      return java_thread->cooked_allocated_bytes();
    }

This looks right to me.

test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java

-                "ThreadAllocatedMemory is expected to be disabled");
+                "TEST FAILED: ThreadAllocatedMemory is expected to be disabled");

Prepending "TEST FAILED" to the exception message (in several places) seems redundant, since a RuntimeException is thrown and that is already expected to be a test failure.

+        // back-to-back calls shouldn't allocate any memory
+        size = mbean.getThreadAllocatedBytes(id);
+        size1 = mbean.getThreadAllocatedBytes(id);
+        if (size1 != size) {

Is there anything the test can do to help guarantee this? I didn't closely review this test. The main thing I advise is to improve the reliability of this test. Put another way, we want to ensure that this test change will pass all the time in various test configurations.

Mandy

From chris.plummer at oracle.com Fri Sep 13 00:53:06 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Sep 2019 17:53:06 -0700
Subject: RFR: 8230881: serviceability/sa/TestJmapCore tests fail with java.lang.RuntimeException: Could not find dump file
In-Reply-To:
References:
Message-ID: <6d007e8f-7696-841e-0637-033631262248@oracle.com>

Looks good.

Chris

On 9/12/19 9:56 AM, Leonid Mesnik wrote:
> Hi
>
> Could you please verify the following fix, which updates the core filename patterns. Some hosts are configured to compress core files during core dump. In such cases tests should throw a skipped exception instead of failing while reading compressed cores.
> Adding support for unpacking core.pid.gz files to all tests is going to be handled in a separate RFE.
>
> Verified that the test passes if a plain core is generated and is skipped in the other case.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8230881/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8230881
>
> Leonid

From hohensee at amazon.com Fri Sep 13 01:52:50 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Fri, 13 Sep 2019 01:52:50 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To:
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com>
Message-ID:

And of course the back-to-back check is more or less accurate depending on the accuracy of the underlying Hotspot mechanism. So it's possible (indeed likely with the current TLAB refill interval update) that it'll return false negatives, but imo better to keep it anyway.
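
For readers following the thread, the back-to-back check under discussion can be sketched as a standalone program. This is a minimal illustration, not the code from the webrev: the class name BackToBackAllocationProbe and the method readTwice() are hypothetical, and the sketch assumes a HotSpot-based JDK where the platform ThreadMXBean can be cast to com.sun.management.ThreadMXBean. It uses getThreadAllocatedBytes(long) rather than getCurrentThreadAllocatedBytes() so that it also compiles on releases before 14.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper, not part of the webrev: reads the counter twice in a row. */
final class BackToBackAllocationProbe {

    /**
     * Returns two consecutive readings of the current thread's allocated-bytes
     * counter, or null if the measurement is unsupported or disabled.
     */
    static long[] readTwice() {
        // Assumption: the platform MXBean implements the com.sun.management extension.
        com.sun.management.ThreadMXBean mbean =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        if (!mbean.isThreadAllocatedMemorySupported()
                || !mbean.isThreadAllocatedMemoryEnabled()) {
            return null;
        }
        long id = Thread.currentThread().getId();
        long first = mbean.getThreadAllocatedBytes(id);
        long second = mbean.getThreadAllocatedBytes(id);
        return new long[] { first, second };
    }

    public static void main(String[] args) {
        long[] readings = readTwice();
        if (readings == null) {
            System.out.println("thread allocated memory measurement not available");
            return;
        }
        // A non-zero delta means something allocated between the two calls.
        System.out.println("delta = " + (readings[1] - readings[0]) + " bytes");
    }
}
```

Whether the delta is zero depends on the JDK release and on what else the thread happens to do between the two calls, which is exactly the accuracy question raised above.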

From david.holmes at oracle.com Fri Sep 13 07:50:27 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Sep 2019 17:50:27 +1000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To:
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com>
Message-ID: <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com>

Hi Paul,

On 13/09/2019 10:29 am, Hohensee, Paul wrote:
> Thanks for clarifying the review rules. Would someone from the serviceability team please review?
> New webrev at
>
> http://cr.openjdk.java.net/~phh/8207266/webrev.07/

One aspect of the functional change needs clarification for me - and apologies if this has been covered in the past. It seems to me that currently we only check isThreadAllocatedMemorySupported for these operations, but if I read things correctly the updated code additionally checks isThreadAllocatedMemoryEnabled, which is a behaviour change not mentioned in the CSR.

> I didn't disturb the existing checks in the test, just added code to check the result of getThreadAllocatedBytes(long) on a non-current thread, plus the back-to-back no-allocation checks. The former wasn't needed before because getThreadAllocatedBytes(long) was just a wrapper around getThreadAllocatedBytes(long[]). This patch changes that, so I added a separate test. The latter is supposed to fail if there's object allocation on calls to getCurrentThreadAllocatedBytes and getThreadAllocatedBytes(long). I.e., a feature, not a bug, because accumulation of transient small objects can be a performance problem. Thanks to your review, I noticed that the back-to-back check on the current thread was using getThreadAllocatedBytes(long) instead of getCurrentThreadAllocatedBytes and fixed it. I also removed all instances of "TEST FAILED: ".

The back-to-back check is not valid in general. You don't know if the first check might trigger some class loading on the return path after it has obtained the first memory value. The check might also fail if using JVMCI and some compilation related activity occurs in the current thread on the second call. Also, with the introduction of handshakes it's possible the current thread might hit a safepoint check that results in it executing a handshake operation that performs allocation. Potentially there could be numerous non-deterministic actions that might occur, leading to unanticipated allocation. I understand what you want to test here, I just don't think it is reliably doable.
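
To make the non-determinism concern concrete, here is a hedged sketch of how one might observe it rather than assert on a single pair of calls. It is not taken from the webrev or from the test under review: the class TolerantAllocationCheck and the method countQuietPairs() are hypothetical names, and the cast assumes a HotSpot-based JDK. The sketch repeats the back-to-back reading and only counts how many pairs showed no change, leaving the pass/fail policy open.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical illustration: counts back-to-back pairs with identical readings. */
final class TolerantAllocationCheck {

    /**
     * Performs the back-to-back reading 'attempts' times on the current thread
     * and returns how many pairs reported no allocation in between.
     * Returns 0 if the measurement is unsupported or disabled.
     */
    static int countQuietPairs(int attempts) {
        // Assumption: the platform MXBean implements the com.sun.management extension.
        com.sun.management.ThreadMXBean mbean =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        if (!mbean.isThreadAllocatedMemorySupported()
                || !mbean.isThreadAllocatedMemoryEnabled()) {
            return 0;
        }
        long id = Thread.currentThread().getId();
        int quiet = 0;
        for (int i = 0; i < attempts; i++) {
            long first = mbean.getThreadAllocatedBytes(id);
            long second = mbean.getThreadAllocatedBytes(id);
            if (second == first) {
                quiet++;
            }
        }
        return quiet;
    }

    public static void main(String[] args) {
        int attempts = 100;
        System.out.println(countQuietPairs(attempts) + " of " + attempts
            + " back-to-back pairs saw no allocation");
    }
}
```

On releases where getThreadAllocatedBytes(long) itself allocates, the count may well be zero, so the number is diagnostic only and does not by itself resolve the reliability objection.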

Thanks,
David
-----

From sgehwolf at redhat.com Fri Sep 13 08:33:10 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Fri, 13 Sep 2019 10:33:10 +0200
Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
In-Reply-To:
References: <538fbd9d96e88ef3179f6ad7c410e4311e87c420.camel@redhat.com>
Message-ID: <44f49ad20c8c0b5cf2141b82f3fe6db6a9088857.camel@redhat.com>

On Thu, 2019-09-12 at 19:53 +0000, Langer, Christoph wrote:
> Hi Severin,
>
> that seems an interesting idea for an elegant solution. However, after trying this on a decently fast linux x86 box by leveraging one of these ProcessTools::startProcess methods that would wait for a certain output to appear in the child before returning, I figured that the elapsed runtime of the test increases from about 3 seconds to 3 minutes. It evidently takes a lot longer to bootstrap a JVM and get the results of a first println than just forking a process and immediately accessing its proc filesystem. So I think we don't want to do that.

Wow, that's unfortunate.

> I would like to go with this version (changed the comment to Thomas' suggestion): http://cr.openjdk.java.net/~clanger/webrevs/8230850.1/

Patch seems OK in this case.

Thanks,
Severin

> Chris, are you ok with it?
>
> Thanks
> Christoph
>
> > -----Original Message-----
> > From: Severin Gehwolf
> > Sent: Thursday, September 12, 2019 11:01
> > To: Langer, Christoph; Thomas Stüfe
> > Cc: OpenJDK Serviceability
> > Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
> >
> > Hi Christoph,
> >
> > Have you considered to wait for TestProcess - the spawned processes - to print this on stdout:
> >
> > "The process started, pid: XXX"
> >
> > Once that's ready on stdout, checking the main class should always pass.
> > I believe the p.isAlive() check which is currently done is insufficient.
> >
> > Thanks,
> > Severin
> >
> > On Thu, 2019-09-12 at 08:12 +0000, Langer, Christoph wrote:
> > > Hi Thomas,
> > >
> > > sounds reasonable, will do.
> > >
> > > Thanks
> > > Christoph
> > >
> > > From: Thomas Stüfe
> > > Sent: Thursday, September 12, 2019 10:11
> > > To: Langer, Christoph
> > > Cc: Chris Plummer; OpenJDK Serviceability
> > > Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
> > >
> > > I'm fine with the patch if you would reshape the platform dependent comment. Proposal:
> > > ----
> > > - // Depending on hw/os, process helper can return null here
> > > - // because /proc/<pid>/cmdline is not ready yet. To cover that case,
> > >   // give it some retries.
> > > ->
> > > + getMainClass() may return NULL, e.g. due to timing issues. Attempt some limited retries.
> > > ----
> > > I do not need another webrev.
> > > Cheers, Thomas

From matthias.baesken at sap.com Fri Sep 13 10:01:39 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Fri, 13 Sep 2019 10:01:39 +0000
Subject: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code
References:
Message-ID:

Hello,

my colleague Ralf pointed out that the NULL-check of the result of GetStringUTFChars should be done right after the GetStringUTFChars call, so I moved the NULL-check up:

http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.2/

Best regards, Matthias

Hi Thomas, thanks for the review. You are correct about atoi. New webrev:

http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/

I had 2 additional observations:

1. With OJDK on solaris 32bit gone for quite some time, we might be able to kick out the whole non _LP64 code because we are always 64 bit (maybe someone could comment if this is a safe assumption, there might be old 32bit solaris core files flying around for some reason even these days?)

   http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp.frames.html

   // some older versions of libproc.so crash when trying to attach 32 bit
   // debugger to 64 bit core file. check and throw error.
   #ifndef _LP64
   ...

2. The usage of atoi is commented here:

   https://docs.oracle.com/cd/E86824_01/html/E54766/atoi-3c.html

   "However, applications should not use the atoi(), atol(), or atoll() functions unless they know the value represented by the argument will be in range for the corresponding result type"

   And here:

   https://pubs.opengroup.org/onlinepubs/009695399/functions/atoi.html

   "If the number is not known to be in range, strtol() should be used because atoi() is not required to perform any error checking"

However, we have a number of usages in the coding where atoi is called without knowing that the argument is in the allowed range. Some examples:

src/hotspot/share/runtime/arguments.cpp-382-    if (match_option(option, "-Dsun.java.launcher.pid=", &tail)) {
src/hotspot/share/runtime/arguments.cpp:383:      _sun_java_launcher_pid = atoi(tail);
src/hotspot/share/runtime/arguments.cpp-384-      continue;

src/java.desktop/unix/native/libawt_xawt/xawt/XToolkit.c

    value = getenv("_AWT_MAX_POLL_TIMEOUT");
    if (value != NULL) {
        AWT_MAX_POLL_TIMEOUT = atoi(value);

src/java.desktop/unix/native/common/awt/X11Color.c-781-    if (getenv("CMAPSIZE") != 0) {
src/java.desktop/unix/native/common/awt/X11Color.c:782:        cmapsize = atoi(getenv("CMAPSIZE"));

Should I open a bug for these?

Best regards, Matthias

From richard.reingruber at sap.com Fri Sep 13 14:12:24 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 13 Sep 2019 14:12:24 +0000
Subject: RFR(S) 8230956: Should disable Escape Analysis when JVMTI capability can_tag_objects is taken
Message-ID:

Hi,

could I please get reviews for

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/2019/8230956/webrev.0/
Bug: https://bugs.openjdk.java.net/browse/JDK-8230956

JVMTI provides functions to follow references beginning at the roots of the object graph and it provides functions to iterate all objects on the heap [1][2]. These functions are means to access objects which are otherwise local to a Java thread. In terms of escape analysis these local objects escape through these JVMTI functions, invalidating optimizations based on escape analysis.

Example:

- Let J be a JavaThread that calls a compiled method M with a NoEscape instance I of class C that is scalar replaced.
- JVMTI agent A uses JVMTI FollowReferences() to iterate the objects in the object graph, tagging all instances of C.
- A uses GetObjectsWithTags() to retrieve the tagged instances of C.
- Error: I is missing because its allocation was eliminated / scalar replaced.

Agents are required to possess the capability can_tag_objects in order to call the JVMTI heap functions that let objects escape. Currently it is not possible to revert EA based optimizations just before objects escape through JVMTI, therefore escape analysis should be disabled as soon as the JVMTI capability can_tag_objects is taken. But this is not sufficient, because there may be compiled frames on stack with EA based optimizations when a JVMTI agent takes can_tag_objects (see included exclusive test cases), and then it does not help to disable escape analysis or invalidate compiled methods with EA based optimizations. In general it is still an improvement to do so. JDK-8227745 would be a complete solution to the issue. A further improvement could be to invalidate methods compiled by C2 when can_tag_objects gets added, but I'd rather suggest to integrate the implementation for JDK-8227745.

Note also that after calling JVMTI AddCapabilities(), even with an empty set of capabilities, JvmtiExport::can_walk_any_space() will return true.

I've run tier1 tests.

Thanks, Richard.

[1] https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#Heap
[2] https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#Heap_1_0

From chris.plummer at oracle.com Fri Sep 13 17:49:00 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 13 Sep 2019 10:49:00 -0700
Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
In-Reply-To:
References: <538fbd9d96e88ef3179f6ad7c410e4311e87c420.camel@redhat.com>
Message-ID: <73cc892e-abf2-d9cc-22c3-058ca7b01ebd@oracle.com>

3 minutes??? Sounds like something is wrong. What is the JVM doing during this time?

Chris

On 9/12/19 12:53 PM, Langer, Christoph wrote:
> Hi Severin,
>
> that seems an interesting idea for an elegant solution.
However, after trying this on a decently fast linux x86 box by leveraging one of these ProcessTools::startProcess methods that would wait for a certain output to appear in the child before returning, I figured that the elapsed runtime of the test increases from about 3 seconds to 3 minutes. It evidently takes a lot longer to bootstrap a JVM and get the results of a first println than just forking a process and immediately accessing its proc filesystem. So I think we don't want to do that. > > I would like to go with this version (changed the comment to Thomas' suggestion): http://cr.openjdk.java.net/~clanger/webrevs/8230850.1/ > > Chris, are you ok with it? > > Thanks > Christoph > >> -----Original Message----- >> From: Severin Gehwolf >> Sent: Donnerstag, 12. September 2019 11:01 >> To: Langer, Christoph ; Thomas St?fe >> >> Cc: OpenJDK Serviceability >> Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java >> fails intermittently >> >> Hi Christoph, >> >> Have you considered to wait for TestProcess - the spawned processes - to >> print this on stdout: >> >> "The process started, pid: XXX" >> >> Once that's ready on stdout, checking the main class should always >> pass. I believe p.isAlive() check which is currrently done is >> insufficient. >> >> Thanks, >> Severin >> >> On Thu, 2019-09-12 at 08:12 +0000, Langer, Christoph wrote: >>> Hi Thomas, >>> >>> sounds reasonable, will do. >>> >>> Thanks >>> Christoph >>> >>> From: Thomas St?fe >>> Sent: Donnerstag, 12. September 2019 10:11 >>> To: Langer, Christoph >>> Cc: Chris Plummer ; OpenJDK Serviceability >> >>> Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java >> fails intermittently >>> I'm fine with the patch if you would reshape the platform dependent >> comment. Proposal: >>> ---- >>> - // Depending on hw/os, process helper can return null here >>> - // because /proc//cmdline is not ready yet. To cover that case, >>> // give it some retries. 
>>> -> >>> + getMainClass() may return NULL, e.g. due to timing issues. Attempt some >> limited retries. >>> ---- >>> I do not need another webrev. >>> Cheers, Thomas >>> >>> >>> >>> >>> On Wed, Sep 11, 2019 at 11:37 PM Langer, Christoph >> wrote: >>>> Hi Chris, Thomas, >>>> >>>> thanks for looking at this. I was also wondering whether a fix in >> ProcessHelper would be appropriate. But I think introducing retries and >> delays in that code can do more harm than help. >>>> For this special test case, aiming to test the ProcessHelper functionality >> (on Linux) only, the observed problem is that the /proc//cmdline file is >> not ready yet when it gets evaluated because the test can be quicker than >> the spawned processes. But in real life usage of jcmd this seems rather >> unlikely. One will probably use jcmd quite some time after a java process was >> started and /proc//cmdline should be ready. If then there are >> problems reading it, there are likely other issues which won?t go away by >> waiting. And for these cases the fallback is to use the attach framework, as >> implemented in ProcessArgumentMatcher, which provides some chance to >> be working still. And this fallback should also cover the exotic case when jcmd >> is issued too early. >>>> After all, ProcessHelper::getMainClass also documents that its result can >> be null. >>>> @Thomas, as for your other points: >>>> PID reusage: Hm, maybe one can construct cases. However, I?d think the >> /proc/pid files should be gone after a process ends. Or at least be >> reconstructed if there were orphans and a new process reusing an old pid >> gets started. But who knows what can happen ? we?ll maybe see ?? >>>> Comment for Linux only issue: The test is in fact a Linux only test. See line >> 55: * @requires os.family == "linux". So, if we?ll eventually see >> implementations for ProcessHelper::getMainClass on other platforms, this >> comment might have to be adopted. 
But for the time being I guess it's fine at >> its current place. >>>> Would you agree? >>>> >>>> Best regards >>>> Christoph >>>> >>>> From: Chris Plummer >>>> Sent: Wednesday, 11 September 2019 19:21 >>>> To: Thomas Stüfe ; Langer, Christoph >> >>>> Cc: OpenJDK Serviceability >>>> Subject: Re: RFR (S): 8230850: Test >> sun/tools/jcmd/TestProcessHelper.java fails intermittently >>>> It does seem that the fix should be in ProcessHelper.java in >> getMainClass(), or maybe even getCommandLine(). Fixing it in the test >> implies that every user of getMainClass() should be doing something similar. >> But then also note what ProcessArgumentMatcher.check() is doing. It also >> deals with getMainClass() returning null. >>>> thanks, >>>> >>>> Chris >>>> >>>> On 9/11/19 6:59 AM, Thomas Stüfe wrote: >>>>> Hi Christoph, >>>>> >>>>> in general I think this is fine. The increase-by-pow2 sleep time is odd >> but okay :) >>>>> The whole thing seems rather fragile and has a lot of question marks >> but I think your fix does not make it worse. One fun error now is that with a >> follow up java test reusing the PID we could get a wrong main class but I think >> the chances are astronomically low. >>>>> Only remark, you fix this in the platform shared code, if this is a Linux >> only issue maybe it should be fixed in /shared/projects/openjdk/jdk- >> jdk/source/src/jdk.jcmd/linux/classes/sun/tools/ProcessHelper.java >> instead? If not, I would remove at least the /proc//cmdline comment >> since this is quite platform specific. >>>>> Cheers, Thomas >>>>> >>>>> >>>>> On Wed, Sep 11, 2019 at 2:39 PM Langer, Christoph >> wrote: >>>>>> Hi, >>>>>> >>>>>> please review this change for test >> sun/tools/jcmd/TestProcessHelper.java to make it more robust.
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8230850 >>>>>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230850.0/ >>>>>> >>>>>> This Linux only test is starting several Java processes and then tries to >> figure out the main class by invoking jdk.jcmd's linux specific ProcessHelper >> implementation which parses the contents of /proc//cmdline. >>>>>> Under some circumstances, the test already attempts to read >> /proc//cmdline before it actually exists or is filled with data. This can be >> fixed with some sleeps/retries to wait for that data to be ready. >>>>>> In the actual jcmd tool, such behavior of ProcessHelper. getMainClass >> should not be an issue because it is handled in ProcessArgumentMatcher [0]. >>>>>> Thanks >>>>>> Christoph >>>>>> >>>>>> [0] >> http://hg.openjdk.java.net/jdk/jdk/file/8b08eaf9a0eb/src/jdk.jcmd/share/cl >> asses/sun/tools/common/ProcessArgumentMatcher.java#l86 >>>> From hohensee at amazon.com Fri Sep 13 19:11:30 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 13 Sep 2019 19:11:30 +0000 Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread In-Reply-To: <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> Message-ID: <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> Hi David, thanks for your comments. New webrev in http://cr.openjdk.java.net/~phh/8207266/webrev.08/ Both the old and new versions of the code check that thread allocated memory is both supported and enabled. 
The existing version of getThreadAllocatedBytes(long []) calls verifyThreadAllocatedMemory(long []), which checks inline to make sure thread allocated memory is supported, then calls isThreadAllocatedMemoryEnabled() to verify that it's enabled. isThreadAllocatedMemoryEnabled() duplicates (!) the support check and returns the enabled flag. I removed the redundant check in the new version. You're of course correct about the back-to-back check. Application code can't know when the runtime will hijack a thread for its own purposes. I've removed the check. Paul ?On 9/13/19, 12:50 AM, "David Holmes" wrote: Hi Paul, On 13/09/2019 10:29 am, Hohensee, Paul wrote: > Thanks for clarifying the review rules. Would someone from the > serviceability team please review? New webrev at > > http://cr.openjdk.java.net/~phh/8207266/webrev.07/ One aspect of the functional change needs clarification for me - and apologies if this has been covered in the past. It seems to me that currently we only check isThreadAllocatedMemorySupported for these operations, but if I read things correctly the updated code additionally checks isThreadAllocatedMemoryEnabled, which is a behaviour change not mentioned in the CSR. > I didn?t disturb the existing checks in the test, just added code to > check the result of getThreadAllocatedBytes(long) on a non-current > thread, plus the back-to-back no-allocation checks. The former wasn?t > needed before because getThreadAllocatedBytes(long) was just a wrapper > around getThreadAllocatedBytes(long []). This patch changes that, so I > added a separate test. The latter is supposed to fail if there?s object > allocation on calls to getCurrentThreadAllocatedBytes and > getThreadAllocatedBytes(long). I.e., a feature, not a bug, because > accumulation of transient small objects can be a performance problem. 
> Thanks to your review, I noticed that the back-to-back check on the > current thread was using getThreadAllocatedBytes(long) instead of > getCurrentThreadAllocatedBytes and fixed it. I also removed all > instances of ?TEST FAILED: ?. The back-to-back check is not valid in general. You don't know if the first check might trigger some class loading on the return path after it has obtained the first memory value. The check might also fail if using JVMCI and some compilation related activity occurs in the current thread on the second call. Also with the introduction of handshakes its possible the current thread might hit a safepoint checks that results in it executing a handshake operation that performs allocation. Potentially there could be numerous non-deterministic actions that might occur leading to unanticipated allocation. I understand what you want to test here, I just don't think it is reliably doable. Thanks, David ----- > > Paul > > *From: *Mandy Chung > *Date: *Thursday, September 12, 2019 at 10:09 AM > *To: *"Hohensee, Paul" > *Cc: *OpenJDK Serviceability , > "hotspot-gc-dev at openjdk.java.net" > *Subject: *Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() > can be quicker for self thread > > On 9/3/19 12:38 PM, Hohensee, Paul wrote: > > Minor update in new webrevhttp://cr.openjdk.java.net/~phh/8207266/webrev.05/. > > > I only reviewed the library side implementation that looks good. I > expect the serviceability team to review the test and hotspot change. > > > Need a confirmatory review to push this. If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review. > > > You need another reviewer to advice the following because I was not > close to the ThreadsList work. 
> > 2087 ThreadsListHandle tlh; > > 2088 JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id); > > 2089 > > 2090 if (java_thread != NULL) { > > 2091 return java_thread->cooked_allocated_bytes(); > > 2092 } > > This looks right to me. > > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java > > - "ThreadAllocatedMemory is expected to be disabled"); > > + "TEST FAILED: ThreadAllocatedMemory is expected to be > disabled"); > > Prepending "TEST FAILED" in exception message (in several places) > > seems redundant since throwing such a RuntimeException already signals > > a test failure. > > + // back-to-back calls shouldn't allocate any memory > > + size = mbean.getThreadAllocatedBytes(id); > > + size1 = mbean.getThreadAllocatedBytes(id); > > + if (size1 != size) { > > Is there anything the test can do to help guarantee this? I didn't > > closely review this test. The main thing I advise is to improve > > the reliability of this test. Put another way, we want to > > ensure that this test change will pass all the time in various > > test configurations. > > Mandy > From serguei.spitsyn at oracle.com Fri Sep 13 21:58:06 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 13 Sep 2019 14:58:06 -0700 Subject: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code In-Reply-To: References: Message-ID: Hi Matthias, The fix looks good to me. Thank you for catching and fixing this! Thanks, Serguei On 9/13/19 3:01 AM, Baesken, Matthias wrote: > > Hello, my colleague Ralf pointed out that the NULL-check of the > result of GetStringUTFChars > > should be done right after the "GetStringUTFChars", so I moved the > NULL-check up: > > http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.2/ > > > Best regards, Matthias > > Hi Thomas, thanks for the review. > > You are correct about atoi.
> > New webrev: > > http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/ > > > I had 2 additional observations: > > 1. With OJDK on solaris 32bit gone for quite some time, we might be > able to kick out the whole non _LP64 code because we are > always 64 bit > > (maybe someone could comment if this is a safe assumption, there > might be old 32bit solaris core files flying around for some reason > even these days) > > http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp.frames.html > > > 696 // some older versions of libproc.so crash when trying to attach > 32 bit > > 697 // debugger to 64 bit core file. check and throw error. > > 698 #ifndef _LP64 > > ... > > 2. The usage of atoi is commented here: > > https://docs.oracle.com/cd/E86824_01/html/E54766/atoi-3c.html > > "However, applications should not use the atoi(), atol(), or > atoll() functions unless they know the value represented by the > argument will be in range for the corresponding result type" > > And here: > > https://pubs.opengroup.org/onlinepubs/009695399/functions/atoi.html > > "If the number is not known to be in range, strtol() > should > be used because atoi() is not required to perform any error checking" > > However we have a number of usages in the code where atoi is > called without knowing that the argument is in the allowed range. > > some examples: > > src/hotspot/share/runtime/arguments.cpp-382- if (match_option(option, > "-Dsun.java.launcher.pid=", &tail)) { > > src/hotspot/share/runtime/arguments.cpp:383: _sun_java_launcher_pid = > atoi(tail); > > src/hotspot/share/runtime/arguments.cpp-384- continue; > > src/java.desktop/unix/native/libawt_xawt/xawt/XToolkit.c > > 455 value = getenv("_AWT_MAX_POLL_TIMEOUT"); > > 456 if (value != NULL) { > > 457
AWT_MAX_POLL_TIMEOUT = atoi(value); > > src/java.desktop/unix/native/common/awt/X11Color.c-781- if > (getenv("CMAPSIZE") != 0) { > > src/java.desktop/unix/native/common/awt/X11Color.c:782: cmapsize = > atoi(getenv("CMAPSIZE")); > > Should I open a bug for these? > > Best regards, Matthias > From serguei.spitsyn at oracle.com Fri Sep 13 22:18:20 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 13 Sep 2019 15:18:20 -0700 Subject: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code In-Reply-To: References: Message-ID: <9dca6ade-aba5-daf1-fa26-07b3d0bad35e@oracle.com> Hi Matthias, On 9/12/19 4:52 AM, Baesken, Matthias wrote: > > Hi Thomas, thanks for the review. > > You are correct about atoi. > > New webrev: > > http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/ > > > I had 2 additional observations: > > 1. With OJDK on solaris 32bit gone for quite some time, we might be > able to kick out the whole non _LP64 code because we are > always 64 bit > > (maybe someone could comment if this is a safe assumption, there > might be old 32bit solaris core files flying around for some reason > even these days) > > http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp.frames.html > > > 696 // some older versions of libproc.so crash when trying to attach > 32 bit > > 697 // debugger to 64 bit core file. check and throw error. > > 698 #ifndef _LP64 > > ... > > 2. The usage of atoi is commented here: > > https://docs.oracle.com/cd/E86824_01/html/E54766/atoi-3c.html > > "However, applications should not use the atoi(), atol(), or > atoll() functions unless they know the value represented by the > argument will be in range for the corresponding result type" > >
And here: > > https://pubs.opengroup.org/onlinepubs/009695399/functions/atoi.html > > "If the number is not known to be in range, strtol() > should > be used because atoi() is not required to perform any error checking" > > However we have a number of usages in the code where atoi is > called without knowing that the argument is in the allowed range. > > some examples: > > src/hotspot/share/runtime/arguments.cpp-382- if (match_option(option, > "-Dsun.java.launcher.pid=", &tail)) { > > src/hotspot/share/runtime/arguments.cpp:383: _sun_java_launcher_pid = > atoi(tail); > > src/hotspot/share/runtime/arguments.cpp-384- continue; > > src/java.desktop/unix/native/libawt_xawt/xawt/XToolkit.c > > 455 value = getenv("_AWT_MAX_POLL_TIMEOUT"); > > 456 if (value != NULL) { > > 457 AWT_MAX_POLL_TIMEOUT = atoi(value); > > src/java.desktop/unix/native/common/awt/X11Color.c-781- if > (getenv("CMAPSIZE") != 0) { > > src/java.desktop/unix/native/common/awt/X11Color.c:782: cmapsize = > atoi(getenv("CMAPSIZE")); > > Should I open a bug for these? > Probably, two different bugs are needed: hotspot/runtime and AWT. Thanks, Serguei > Best regards, Matthias > > *From:*Thomas Stüfe > *Sent:* Thursday, 12 September 2019 12:22 > *To:* Baesken, Matthias > *Cc:* serviceability-dev at openjdk.java.net > *Subject:* Re: RFR [XS]: 8230901: missing ReleaseStringUTFChars in > servicability native code > > Hi Matthias, > > your changes look good. > > > an additional bug: > > http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.0/src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp.frames.html > > > > 698 #ifndef _LP64 > 699 atoi(cmdLine_cstr); > 700 if (errno) { > > Behaviour of atoi() in error case is undefined. errno values are not > defined. > > See: https://pubs.opengroup.org/onlinepubs/009695399/functions/atoi.html > > And even if atoi would set errno, this is still not enough since errno > may contain a stale value.
One would have to set errno=0 before the > function call. > > > If you want to fix this too, I'd suggest replacing this call with strtol(). > > Cheers, Thomas > > On Thu, Sep 12, 2019 at 12:11 PM Baesken, Matthias > > wrote: > > Hello, please review this small change. > > It adds ReleaseStringUTFChars calls at some places in early > return cases. > > ( in src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp > > THROW_NEW_DEBUGGER_EXCEPTION contains a return, see the macro > declaration > > 39 #define THROW_NEW_DEBUGGER_EXCEPTION(str) { > throwNewDebuggerException(env, str); return;} > > ) > > Bug/webrev: > > https://bugs.openjdk.java.net/browse/JDK-8230901 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.0/ > > > Thanks, Matthias > From david.holmes at oracle.com Fri Sep 13 22:34:38 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 14 Sep 2019 08:34:38 +1000 Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread In-Reply-To: <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> Message-ID: <28a1cc57-dee3-7beb-19ef-d434ddf038ca@oracle.com> Hi Paul, On 14/09/2019 5:11 am, Hohensee, Paul wrote: > Hi David, thanks for your comments. New webrev in > > http://cr.openjdk.java.net/~phh/8207266/webrev.08/ > > Both the old and new versions of the code check that thread allocated memory is both supported and enabled.
The existing version of getThreadAllocatedBytes(long []) calls verifyThreadAllocatedMemory(long []), which checks inline to make sure thread allocated memory is supported, then calls isThreadAllocatedMemoryEnabled() to verify that it's enabled. isThreadAllocatedMemoryEnabled() duplicates (!) the support check and returns the enabled flag. I removed the redundant check in the new version. Thanks for clarifying. > You're of course correct about the back-to-back check. Application code can't know when the runtime will hijack a thread for its own purposes. I've removed the check. Updated test looks fine. Nothing further from me. Thanks, David ----- > Paul > > ?On 9/13/19, 12:50 AM, "David Holmes" wrote: > > Hi Paul, > > On 13/09/2019 10:29 am, Hohensee, Paul wrote: > > Thanks for clarifying the review rules. Would someone from the > > serviceability team please review? New webrev at > > > > http://cr.openjdk.java.net/~phh/8207266/webrev.07/ > > One aspect of the functional change needs clarification for me - and > apologies if this has been covered in the past. It seems to me that > currently we only check isThreadAllocatedMemorySupported for these > operations, but if I read things correctly the updated code additionally > checks isThreadAllocatedMemoryEnabled, which is a behaviour change not > mentioned in the CSR. > > > I didn?t disturb the existing checks in the test, just added code to > > check the result of getThreadAllocatedBytes(long) on a non-current > > thread, plus the back-to-back no-allocation checks. The former wasn?t > > needed before because getThreadAllocatedBytes(long) was just a wrapper > > around getThreadAllocatedBytes(long []). This patch changes that, so I > > added a separate test. The latter is supposed to fail if there?s object > > allocation on calls to getCurrentThreadAllocatedBytes and > > getThreadAllocatedBytes(long). I.e., a feature, not a bug, because > > accumulation of transient small objects can be a performance problem. 
> > Thanks to your review, I noticed that the back-to-back check on the > > current thread was using getThreadAllocatedBytes(long) instead of > > getCurrentThreadAllocatedBytes and fixed it. I also removed all > > instances of ?TEST FAILED: ?. > > The back-to-back check is not valid in general. You don't know if the > first check might trigger some class loading on the return path after it > has obtained the first memory value. The check might also fail if using > JVMCI and some compilation related activity occurs in the current thread > on the second call. Also with the introduction of handshakes its > possible the current thread might hit a safepoint checks that results in > it executing a handshake operation that performs allocation. Potentially > there could be numerous non-deterministic actions that might occur > leading to unanticipated allocation. > > I understand what you want to test here, I just don't think it is > reliably doable. > > Thanks, > David > ----- > > > > > Paul > > > > *From: *Mandy Chung > > *Date: *Thursday, September 12, 2019 at 10:09 AM > > *To: *"Hohensee, Paul" > > *Cc: *OpenJDK Serviceability , > > "hotspot-gc-dev at openjdk.java.net" > > *Subject: *Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() > > can be quicker for self thread > > > > On 9/3/19 12:38 PM, Hohensee, Paul wrote: > > > > Minor update in new webrevhttp://cr.openjdk.java.net/~phh/8207266/webrev.05/. > > > > > > I only reviewed the library side implementation that looks good. I > > expect the serviceability team to review the test and hotspot change. > > > > > > Need a confirmatory review to push this. If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review. > > > > > > You need another reviewer to advice the following because I was not > > close to the ThreadsList work. 
> > > > 2087 ThreadsListHandle tlh; > > > > 2088 JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id); > > > > 2089 > > > > 2090 if (java_thread != NULL) { > > > > 2091 return java_thread->cooked_allocated_bytes(); > > > > 2092 } > > > > This looks right to me. > > > > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java > > > > - "ThreadAllocatedMemory is expected to be disabled"); > > > > + "TEST FAILED: ThreadAllocatedMemory is expected to be > > disabled"); > > > > Prepending "TEST FAILED" in exception message (in several places) > > > > seems redundant since such RuntimeException is thrown and expected > > > > a test failure. > > > > + // back-to-back calls shouldn't allocate any memory > > > > + size = mbean.getThreadAllocatedBytes(id); > > > > + size1 = mbean.getThreadAllocatedBytes(id); > > > > + if (size1 != size) { > > > > Is there anything in the test can do to help guarantee this? I didn't > > > > closely review this test. The main thing I advice is to improve > > > > the reliability of this test. Put it in another way, we want to > > > > ensure that this test change will pass all the time in various > > > > test configuration. > > > > Mandy > > > > From serguei.spitsyn at oracle.com Fri Sep 13 23:33:29 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 13 Sep 2019 16:33:29 -0700 Subject: RFR: 8230881: serviceability/sa/TestJmapCore tests fail with java.lang.RuntimeException: Could not find dump file In-Reply-To: References: Message-ID: <9de2c900-16d5-d38a-e707-266313b39a5d@oracle.com> Hi Leonid, +1 Thanks, Serguei On 9/12/19 4:46 PM, David Holmes wrote: > Looks good! > > Thanks for fixing. > > David > > On 13/09/2019 2:56 am, Leonid Mesnik wrote: >> Hi >> >> Could you please verify following fix which update core filename >> patterns. Some hosts are configured to compress core files during >> core dump. 
In such cases tests should throw skipped exception instead >> of failing reading compressed cores. >> Fixing all tests to support unpacking core.pid.gz files is going to >> be fixed in separate RFE. >> >> >> Verified that test pass if plain core is generated and skipped in >> other case. >> >> >> webrev: http://cr.openjdk.java.net/~lmesnik/8230881/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8230881 >> >> Leonid From leonid.mesnik at oracle.com Fri Sep 13 23:42:17 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Fri, 13 Sep 2019 16:42:17 -0700 Subject: RFR: 8230881: serviceability/sa/TestJmapCore tests fail with java.lang.RuntimeException: Could not find dump file In-Reply-To: <9de2c900-16d5-d38a-e707-266313b39a5d@oracle.com> References: <9de2c900-16d5-d38a-e707-266313b39a5d@oracle.com> Message-ID: <9487A838-FB4C-4E40-83F1-E094B27063ED@oracle.com> Thank you for your review. I filed another bug (rfe?) to support testing on hosts with compressing cores https://bugs.openjdk.java.net/browse/JDK-8230942 Will fix it later. Leonid > On Sep 13, 2019, at 4:33 PM, serguei.spitsyn at oracle.com wrote: > > Hi Leonid, > > +1 > > Thanks, > Serguei > > On 9/12/19 4:46 PM, David Holmes wrote: >> Looks good! >> >> Thanks for fixing. >> >> David >> >> On 13/09/2019 2:56 am, Leonid Mesnik wrote: >>> Hi >>> >>> Could you please verify following fix which update core filename patterns. Some hosts are configured to compress core files during core dump. In such cases tests should throw skipped exception instead of failing reading compressed cores. >>> Fixing all tests to support unpacking core.pid.gz files is going to be fixed in separate RFE. >>> >>> >>> Verified that test pass if plain core is generated and skipped in other case. >>> >>> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8230881/webrev.00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8230881 >>> >>> Leonid > -------------- next part -------------- An HTML attachment was scrubbed... 
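The skip-on-compressed-core behaviour discussed in the 8230881 thread above could look roughly like the following. This is only a sketch under stated assumptions, not the code from the webrev: the class CoreFileLocator, the SkipTestException type, and the file names core, core.PID and core.PID.gz are all hypothetical and chosen for illustration; a real test would use whatever skip exception and core file patterns its harness defines.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

public class CoreFileLocator {

    /** Hypothetical stand-in for the skip exception a test harness would provide. */
    public static class SkipTestException extends RuntimeException {
        public SkipTestException(String msg) {
            super(msg);
        }
    }

    /**
     * Looks in dir for a core file belonging to pid.
     * Returns the plain core if one exists, throws SkipTestException if only
     * a gzip-compressed core exists, and returns empty if nothing matches.
     * The name patterns used here are assumptions, not taken from the webrev.
     */
    public static Optional<Path> findCore(Path dir, long pid) throws IOException {
        Path compressed = null;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "core*")) {
            for (Path f : files) {
                String name = f.getFileName().toString();
                if (name.equals("core") || name.equals("core." + pid)) {
                    return Optional.of(f); // plain core, usable as is
                }
                if (name.equals("core." + pid + ".gz")) {
                    compressed = f; // remember it, but keep looking for a plain core
                }
            }
        }
        if (compressed != null) {
            // Only a compressed core was found: skip instead of failing.
            throw new SkipTestException("Core file is compressed: " + compressed);
        }
        return Optional.empty();
    }
}
```

A caller would treat an empty result as the existing "Could not find dump file" failure and let SkipTestException propagate so the harness can report the test as skipped.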
URL: From serguei.spitsyn at oracle.com Fri Sep 13 23:58:30 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 13 Sep 2019 16:58:30 -0700 Subject: RFR: 8230881: serviceability/sa/TestJmapCore tests fail with java.lang.RuntimeException: Could not find dump file In-Reply-To: <9487A838-FB4C-4E40-83F1-E094B27063ED@oracle.com> References: <9de2c900-16d5-d38a-e707-266313b39a5d@oracle.com> <9487A838-FB4C-4E40-83F1-E094B27063ED@oracle.com> Message-ID: <41f2a5ea-b783-98b5-3bbe-285583312f6f@oracle.com> Hi Leonid, Thank you for filing the bug! I've changed it to Enhancement. Thanks, Serguei On 9/13/19 4:42 PM, Leonid Mesnik wrote: > Thank you for your review. > > I filed another bug (rfe?) to support testing on hosts with > compressing cores > > https://bugs.openjdk.java.net/browse/JDK-8230942 > > Will fix it later. > > Leonid > >> On Sep 13, 2019, at 4:33 PM, serguei.spitsyn at oracle.com >> wrote: >> >> Hi Leonid, >> >> +1 >> >> Thanks, >> Serguei >> >> On 9/12/19 4:46 PM, David Holmes wrote: >>> Looks good! >>> >>> Thanks for fixing. >>> >>> David >>> >>> On 13/09/2019 2:56 am, Leonid Mesnik wrote: >>>> Hi >>>> >>>> Could you please verify following fix which update core filename >>>> patterns. Some hosts are configured to compress core files during >>>> core dump. In such cases tests should throw skipped exception >>>> instead of failing reading compressed cores. >>>> Fixing all tests to support unpacking core.pid.gz files is going to >>>> be fixed in separate RFE. >>>> >>>> >>>> Verified that test pass if plain core is generated and skipped in >>>> other case. >>>> >>>> >>>> webrev: http://cr.openjdk.java.net/~lmesnik/8230881/webrev.00/ >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8230881 >>>> >>>> Leonid >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Sat Sep 14 00:48:51 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 13 Sep 2019 17:48:51 -0700 Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread In-Reply-To: <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> Message-ID: <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com> Hi Paul, It looks pretty good in general. http://cr.openjdk.java.net/~phh/8207266/webrev.08/test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java.frames.html It would be nice to refactor the java main() method as it becomes too big. Two ways ofgetCurrentThreadAllocatedBytes() testing are good candidates to become separate methods. 98 long size1 = mbean.getThreadAllocatedBytes(id); Just wanted to double check if you wanted to invoke the getCurrentThreadAllocatedBytes() instead as it is a part of: 85 // First way, getCurrentThreadAllocatedBytes Thanks, Serguei On 9/13/19 12:11 PM, Hohensee, Paul wrote: > Hi David, thanks for your comments. New webrev in > > http://cr.openjdk.java.net/~phh/8207266/webrev.08/ > > Both the old and new versions of the code check that thread allocated memory is both supported and enabled. The existing version of getThreadAllocatedBytes(long []) calls verifyThreadAllocatedMemory(long []), which checks inline to make sure thread allocated memory is supported, then calls isThreadAllocatedMemoryEnabled() to verify that it's enabled. isThreadAllocatedMemoryEnabled() duplicates (!) the support check and returns the enabled flag. I removed the redundant check in the new version. 
> > You're of course correct about the back-to-back check. Application code can't know when the runtime will hijack a thread for its own purposes. I've removed the check. > > Paul > > ?On 9/13/19, 12:50 AM, "David Holmes" wrote: > > Hi Paul, > > On 13/09/2019 10:29 am, Hohensee, Paul wrote: > > Thanks for clarifying the review rules. Would someone from the > > serviceability team please review? New webrev at > > > > http://cr.openjdk.java.net/~phh/8207266/webrev.07/ > > One aspect of the functional change needs clarification for me - and > apologies if this has been covered in the past. It seems to me that > currently we only check isThreadAllocatedMemorySupported for these > operations, but if I read things correctly the updated code additionally > checks isThreadAllocatedMemoryEnabled, which is a behaviour change not > mentioned in the CSR. > > > I didn?t disturb the existing checks in the test, just added code to > > check the result of getThreadAllocatedBytes(long) on a non-current > > thread, plus the back-to-back no-allocation checks. The former wasn?t > > needed before because getThreadAllocatedBytes(long) was just a wrapper > > around getThreadAllocatedBytes(long []). This patch changes that, so I > > added a separate test. The latter is supposed to fail if there?s object > > allocation on calls to getCurrentThreadAllocatedBytes and > > getThreadAllocatedBytes(long). I.e., a feature, not a bug, because > > accumulation of transient small objects can be a performance problem. > > Thanks to your review, I noticed that the back-to-back check on the > > current thread was using getThreadAllocatedBytes(long) instead of > > getCurrentThreadAllocatedBytes and fixed it. I also removed all > > instances of ?TEST FAILED: ?. > > The back-to-back check is not valid in general. You don't know if the > first check might trigger some class loading on the return path after it > has obtained the first memory value. 
The check might also fail if using > JVMCI and some compilation related activity occurs in the current thread > on the second call. Also with the introduction of handshakes its > possible the current thread might hit a safepoint checks that results in > it executing a handshake operation that performs allocation. Potentially > there could be numerous non-deterministic actions that might occur > leading to unanticipated allocation. > > I understand what you want to test here, I just don't think it is > reliably doable. > > Thanks, > David > ----- > > > > > Paul > > > > *From: *Mandy Chung > > *Date: *Thursday, September 12, 2019 at 10:09 AM > > *To: *"Hohensee, Paul" > > *Cc: *OpenJDK Serviceability , > > "hotspot-gc-dev at openjdk.java.net" > > *Subject: *Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() > > can be quicker for self thread > > > > On 9/3/19 12:38 PM, Hohensee, Paul wrote: > > > > Minor update in new webrevhttp://cr.openjdk.java.net/~phh/8207266/webrev.05/. > > > > > > I only reviewed the library side implementation that looks good. I > > expect the serviceability team to review the test and hotspot change. > > > > > > Need a confirmatory review to push this. If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review. > > > > > > You need another reviewer to advice the following because I was not > > close to the ThreadsList work. > > > > 2087 ThreadsListHandle tlh; > > > > 2088 JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id); > > > > 2089 > > > > 2090 if (java_thread != NULL) { > > > > 2091 return java_thread->cooked_allocated_bytes(); > > > > 2092 } > > > > This looks right to me. 
> > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java
> >
> > - "ThreadAllocatedMemory is expected to be disabled");
> > + "TEST FAILED: ThreadAllocatedMemory is expected to be disabled");
> >
> > Prepending "TEST FAILED" to the exception message (in several places)
> > seems redundant, since throwing such a RuntimeException already
> > signals a test failure.
> >
> > + // back-to-back calls shouldn't allocate any memory
> > + size = mbean.getThreadAllocatedBytes(id);
> > + size1 = mbean.getThreadAllocatedBytes(id);
> > + if (size1 != size) {
> >
> > Is there anything the test can do to help guarantee this? I didn't
> > closely review this test. The main thing I advise is to improve
> > the reliability of this test. Put another way, we want to
> > ensure that this test change will pass all the time in various
> > test configurations.
> >
> > Mandy

From christoph.langer at sap.com Sun Sep 15 05:57:15 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Sun, 15 Sep 2019 05:57:15 +0000
Subject: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java fails intermittently
In-Reply-To: <73cc892e-abf2-d9cc-22c3-058ca7b01ebd@oracle.com>
References: <538fbd9d96e88ef3179f6ad7c410e4311e87c420.camel@redhat.com> <73cc892e-abf2-d9cc-22c3-058ca7b01ebd@oracle.com>
Message-ID:

Hi Chris,

I guess it's the multitude of JVMs that get started one after another and not in parallel. So I'll push my fix then.

Best regards
Christoph

> -----Original Message-----
> From: Chris Plummer
> Sent: Friday, September 13, 2019 19:49
> To: Langer, Christoph; Severin Gehwolf; Thomas Stüfe
> Cc: OpenJDK Serviceability
> Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java
> fails intermittently
>
> 3 minutes??? Sounds like something is wrong. What is the JVM doing
> during this time?
>
> Chris
>
> On 9/12/19 12:53 PM, Langer, Christoph wrote:
> > Hi Severin,
> >
> > that seems an interesting idea for an elegant solution. However, after
> > trying this on a decently fast Linux x86 box by leveraging one of the
> > ProcessTools::startProcess methods that wait for a certain output to
> > appear in the child before returning, I found that the elapsed runtime of the
> > test increases from about 3 seconds to 3 minutes. It evidently takes a lot
> > longer to bootstrap a JVM and get the result of a first println than to just
> > fork a process and immediately access its proc filesystem. So I think we
> > don't want to do that.
> >
> > I would like to go with this version (changed the comment to Thomas'
> > suggestion): http://cr.openjdk.java.net/~clanger/webrevs/8230850.1/
> >
> > Chris, are you ok with it?
> >
> > Thanks
> > Christoph
> >
> >> -----Original Message-----
> >> From: Severin Gehwolf
> >> Sent: Thursday, September 12, 2019 11:01
> >> To: Langer, Christoph; Thomas Stüfe
> >> Cc: OpenJDK Serviceability
> >> Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java
> >> fails intermittently
> >>
> >> Hi Christoph,
> >>
> >> Have you considered waiting for TestProcess - the spawned processes - to
> >> print this on stdout:
> >>
> >> "The process started, pid: XXX"
> >>
> >> Once that's ready on stdout, checking the main class should always
> >> pass. I believe the p.isAlive() check which is currently done is
> >> insufficient.
> >>
> >> Thanks,
> >> Severin
> >>
> >> On Thu, 2019-09-12 at 08:12 +0000, Langer, Christoph wrote:
> >>> Hi Thomas,
> >>>
> >>> sounds reasonable, will do.
> >>>
> >>> Thanks
> >>> Christoph
> >>>
> >>> From: Thomas Stüfe
> >>> Sent: Thursday, September 12, 2019 10:11
> >>> To: Langer, Christoph
> >>> Cc: Chris Plummer; OpenJDK Serviceability
> >>> Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java
> >>> fails intermittently
> >>>
> >>> I'm fine with the patch if you would reshape the platform dependent
> >>> comment. Proposal:
> >>> ----
> >>> - // Depending on hw/os, process helper can return null here
> >>> - // because /proc//cmdline is not ready yet. To cover that case,
> >>> // give it some retries.
> >>> ->
> >>> + getMainClass() may return NULL, e.g. due to timing issues. Attempt some
> >>> limited retries.
> >>> ----
> >>> I do not need another webrev.
> >>> Cheers, Thomas
> >>>
> >>> On Wed, Sep 11, 2019 at 11:37 PM Langer, Christoph wrote:
> >>>> Hi Chris, Thomas,
> >>>>
> >>>> thanks for looking at this. I was also wondering whether a fix in
> >>>> ProcessHelper would be appropriate. But I think introducing retries and
> >>>> delays in that code can do more harm than good.
> >>>> For this special test case, aiming to test the ProcessHelper functionality
> >>>> (on Linux) only, the observed problem is that the /proc//cmdline file is
> >>>> not ready yet when it gets evaluated because the test can be quicker than
> >>>> the spawned processes. But in real life usage of jcmd this seems rather
> >>>> unlikely. One will probably use jcmd quite some time after a Java process was
> >>>> started and /proc//cmdline should be ready. If there are then
> >>>> problems reading it, there are likely other issues which won't go away by
> >>>> waiting. And for these cases the fallback is to use the attach framework, as
> >>>> implemented in ProcessArgumentMatcher, which provides some chance of
> >>>> still working. And this fallback should also cover the exotic case when jcmd
> >>>> is issued too early.
> >>>> After all, ProcessHelper::getMainClass also documents that its result can
> >>>> be null.
> >>>> @Thomas, as for your other points:
> >>>> PID reusage: Hm, maybe one can construct cases. However, I'd think the
> >>>> /proc/pid files should be gone after a process ends. Or at least be
> >>>> reconstructed if there were orphans and a new process reusing an old pid
> >>>> gets started. But who knows what can happen - we'll maybe see.
> >>>> Comment for Linux only issue: The test is in fact a Linux only test. See line
> >>>> 55: * @requires os.family == "linux". So, if we eventually see
> >>>> implementations for ProcessHelper::getMainClass on other platforms, this
> >>>> comment might have to be adapted. But for the time being I guess it's fine at
> >>>> its current place.
> >>>> Would you agree?
> >>>>
> >>>> Best regards
> >>>> Christoph
> >>>>
> >>>> From: Chris Plummer
> >>>> Sent: Wednesday, September 11, 2019 19:21
> >>>> To: Thomas Stüfe; Langer, Christoph
> >>>> Cc: OpenJDK Serviceability
> >>>> Subject: Re: RFR (S): 8230850: Test sun/tools/jcmd/TestProcessHelper.java
> >>>> fails intermittently
> >>>>
> >>>> It does seem that the fix should be in ProcessHelper.java in
> >>>> getMainClass(), or maybe even getCommandLine(). Fixing it in the test
> >>>> implies that every user of getMainClass() should be doing something similar.
> >>>> But then also note what ProcessArgumentMatcher.check() is doing. It also
> >>>> deals with getMainClass() returning null.
> >>>> thanks,
> >>>>
> >>>> Chris
> >>>>
> >>>> On 9/11/19 6:59 AM, Thomas Stüfe wrote:
> >>>>> Hi Christoph,
> >>>>>
> >>>>> in general I think this is fine. The increase-by-pow2 sleep time is odd
> >>>>> but okay :)
> >>>>> The whole thing seems rather fragile and has a lot of question marks
> >>>>> but I think your fix does not make it worse. One fun error now is that with a
> >>>>> follow-up Java test reusing the PID we could get a wrong main class, but I think
> >>>>> the chances are astronomically low.
> >>>>> Only remark, you fix this in the platform shared code, if this is a Linux
> >>>>> only issue maybe it should be fixed in
> >>>>> /shared/projects/openjdk/jdk-jdk/source/src/jdk.jcmd/linux/classes/sun/tools/ProcessHelper.java
> >>>>> instead? If not, I would remove at least the /proc//cmdline comment
> >>>>> since this is quite platform specific.
> >>>>> Cheers, Thomas
> >>>>>
> >>>>> On Wed, Sep 11, 2019 at 2:39 PM Langer, Christoph wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> please review this change for test
> >>>>>> sun/tools/jcmd/TestProcessHelper.java to make it more robust.
> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8230850
> >>>>>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8230850.0/
> >>>>>>
> >>>>>> This Linux only test is starting several Java processes and then tries to
> >>>>>> figure out the main class by invoking jdk.jcmd's linux specific ProcessHelper
> >>>>>> implementation which parses the contents of /proc//cmdline.
> >>>>>> Under some circumstances, the test already attempts to read
> >>>>>> /proc//cmdline before it actually exists or is filled with data. This can be
> >>>>>> fixed with some sleeps/retries to wait for that data to be ready.
> >>>>>> In the actual jcmd tool, such behavior of ProcessHelper.getMainClass
> >>>>>> should not be an issue because it is handled in ProcessArgumentMatcher [0].
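
The sleeps/retries idea described above could look roughly like the following. This is only a sketch under stated assumptions, not the code from the webrev: `retryUntilNonNull`, its parameters, and the doubling sleep are hypothetical names and choices made for illustration, and the lookup is stood in for by a plain `Supplier` rather than the real `ProcessHelper.getMainClass`.

```java
import java.util.function.Supplier;

class RetrySketch {
    // Hypothetical helper (not from the webrev): calls a lookup that may
    // transiently return null, sleeping between attempts and doubling the
    // sleep each time, and gives up after maxAttempts.
    static <T> T retryUntilNonNull(Supplier<T> lookup, int maxAttempts, long initialSleepMs) {
        long sleepMs = initialSleepMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = lookup.get();
            if (result != null) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException e) {
                    // Preserve the interrupt and stop retrying.
                    Thread.currentThread().interrupt();
                    return null;
                }
                sleepMs *= 2;
            }
        }
        return null; // still not available; the caller must handle null
    }

    public static void main(String[] args) {
        // Stand-in lookup that only succeeds on the third call.
        int[] calls = {0};
        String mainClass = retryUntilNonNull(() -> ++calls[0] < 3 ? null : "TestProcess", 5, 1);
        System.out.println(mainClass + " after " + calls[0] + " attempts");
    }
}
```

Keeping the attempt count bounded means a genuinely broken lookup still fails the test instead of hanging it.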
> >>>>>> Thanks
> >>>>>> Christoph
> >>>>>>
> >>>>>> [0] http://hg.openjdk.java.net/jdk/jdk/file/8b08eaf9a0eb/src/jdk.jcmd/share/classes/sun/tools/common/ProcessArgumentMatcher.java#l86

From hohensee at amazon.com Sun Sep 15 09:52:30 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Sun, 15 Sep 2019 09:52:30 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com>
Message-ID:

Hi, Serguei, thanks for the review. New webrev at

http://cr.openjdk.java.net/~phh/8207266/webrev.09/

I refactored the test's main() method, and you're correct, getThreadAllocatedBytes should be getCurrentThreadAllocatedBytes in that context: fixed.

Paul

From: "serguei.spitsyn at oracle.com"
Organization: Oracle Corporation
Date: Friday, September 13, 2019 at 5:50 PM
To: "Hohensee, Paul", David Holmes, Mandy Chung
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

Hi Paul,

It looks pretty good in general.

http://cr.openjdk.java.net/~phh/8207266/webrev.08/test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java.frames.html

It would be nice to refactor the Java main() method, as it has become too big. The two ways of testing getCurrentThreadAllocatedBytes() are good candidates to become separate methods.
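
For readers unfamiliar with the API under review, here is a minimal sketch of reading the allocation counter for the current thread and checking it in a way that tolerates incidental allocation by the runtime. It is an illustration only, not the test from the webrev: the class and method names are hypothetical, it uses the long-standing `getThreadAllocatedBytes(long)` variant rather than the proposed `getCurrentThreadAllocatedBytes()`, and it returns -1 when the feature is unsupported or disabled.

```java
import java.lang.management.ManagementFactory;

class AllocatedBytesSketch {
    // Hypothetical helper: bytes allocated so far by the calling thread,
    // or -1 if the platform bean does not support or enable the feature.
    static long allocatedByCurrentThread() {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!(bean instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean mbean = (com.sun.management.ThreadMXBean) bean;
        if (!mbean.isThreadAllocatedMemorySupported() || !mbean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        return mbean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        long before = allocatedByCurrentThread();
        byte[] payload = new byte[1 << 20]; // allocate something measurable
        long after = allocatedByCurrentThread();
        // Only assume the counter does not go backwards; exact equality or an
        // exact delta is not something a test can rely on.
        System.out.println("non-decreasing: " + (after >= before) + ", payload=" + payload.length);
    }
}
```

The check is deliberately `after >= before` rather than an exact comparison, since the runtime may allocate on the calling thread between the two reads.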
98 long size1 = mbean.getThreadAllocatedBytes(id);

Just wanted to double-check whether you meant to invoke getCurrentThreadAllocatedBytes() instead, as it is part of:

85 // First way, getCurrentThreadAllocatedBytes

Thanks,
Serguei

On 9/13/19 12:11 PM, Hohensee, Paul wrote:

Hi David, thanks for your comments. New webrev in

http://cr.openjdk.java.net/~phh/8207266/webrev.08/

Both the old and new versions of the code check that thread allocated memory is both supported and enabled. The existing version of getThreadAllocatedBytes(long []) calls verifyThreadAllocatedMemory(long []), which checks inline to make sure thread allocated memory is supported, then calls isThreadAllocatedMemoryEnabled() to verify that it's enabled. isThreadAllocatedMemoryEnabled() duplicates (!) the support check and returns the enabled flag. I removed the redundant check in the new version.

You're of course correct about the back-to-back check. Application code can't know when the runtime will hijack a thread for its own purposes. I've removed the check.

Paul

On 9/13/19, 12:50 AM, "David Holmes" wrote:

Hi Paul,

On 13/09/2019 10:29 am, Hohensee, Paul wrote:
> Thanks for clarifying the review rules. Would someone from the
> serviceability team please review? New webrev at
>
> http://cr.openjdk.java.net/~phh/8207266/webrev.07/

One aspect of the functional change needs clarification for me - and apologies if this has been covered in the past. It seems to me that currently we only check isThreadAllocatedMemorySupported for these operations, but if I read things correctly the updated code additionally checks isThreadAllocatedMemoryEnabled, which is a behaviour change not mentioned in the CSR.

> I didn't disturb the existing checks in the test, just added code to
> check the result of getThreadAllocatedBytes(long) on a non-current
> thread, plus the back-to-back no-allocation checks.
> The former wasn't
> needed before because getThreadAllocatedBytes(long) was just a wrapper
> around getThreadAllocatedBytes(long []). This patch changes that, so I
> added a separate test. The latter is supposed to fail if there's object
> allocation on calls to getCurrentThreadAllocatedBytes and
> getThreadAllocatedBytes(long). I.e., a feature, not a bug, because
> accumulation of transient small objects can be a performance problem.
> Thanks to your review, I noticed that the back-to-back check on the
> current thread was using getThreadAllocatedBytes(long) instead of
> getCurrentThreadAllocatedBytes and fixed it. I also removed all
> instances of "TEST FAILED: ".

The back-to-back check is not valid in general. You don't know if the first check might trigger some class loading on the return path after it has obtained the first memory value. The check might also fail if using JVMCI and some compilation related activity occurs in the current thread on the second call. Also with the introduction of handshakes it's possible the current thread might hit a safepoint check that results in it executing a handshake operation that performs allocation. Potentially there could be numerous non-deterministic actions that might occur leading to unanticipated allocation.

I understand what you want to test here, I just don't think it is reliably doable.

Thanks,
David
-----

> Paul
>
> From: Mandy Chung
> Date: Thursday, September 12, 2019 at 10:09 AM
> To: "Hohensee, Paul"
> Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
> Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes()
> can be quicker for self thread
>
> On 9/3/19 12:38 PM, Hohensee, Paul wrote:
>
> Minor update in new webrev http://cr.openjdk.java.net/~phh/8207266/webrev.05/.
>
> I only reviewed the library side implementation that looks good. I
> expect the serviceability team to review the test and hotspot change.
>
> Need a confirmatory review to push this.
> If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review.
>
> You need another reviewer to advise on the following because I was not
> close to the ThreadsList work.
>
> 2087 ThreadsListHandle tlh;
> 2088 JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id);
> 2089
> 2090 if (java_thread != NULL) {
> 2091 return java_thread->cooked_allocated_bytes();
> 2092 }
>
> This looks right to me.
>
> test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java
>
> - "ThreadAllocatedMemory is expected to be disabled");
> + "TEST FAILED: ThreadAllocatedMemory is expected to be disabled");
>
> Prepending "TEST FAILED" to the exception message (in several places)
> seems redundant, since throwing such a RuntimeException already
> signals a test failure.
>
> + // back-to-back calls shouldn't allocate any memory
> + size = mbean.getThreadAllocatedBytes(id);
> + size1 = mbean.getThreadAllocatedBytes(id);
> + if (size1 != size) {
>
> Is there anything the test can do to help guarantee this? I didn't
> closely review this test. The main thing I advise is to improve
> the reliability of this test. Put another way, we want to
> ensure that this test change will pass all the time in various
> test configurations.
>
> Mandy

From daniil.x.titov at oracle.com Mon Sep 16 18:18:49 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Sep 2019 11:18:49 -0700
Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To:
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com>
Message-ID: <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com>

Hello,

After investigating with Claes the impact of this change on performance (thanks a lot Claes for helping with it!), the conclusion was that the impact on thread startup time is not a blocker for this change.

I also measured the memory footprint using Native Memory Tracking, and the results showed around 40 bytes per live thread.

Please review a new version of the fix, webrev.06 [1]. As a reminder, webrev.05 was abandoned, and webrev.06 [1] is webrev.04 [3] minus the changes in src/hotspot/share/services/management.cpp (which were factored out to a separate issue [4]) plus a change in the ThreadsList::find_JavaThread_from_java_tid() method (please see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock.
src/hotspot/share/runtime/threadSMR.cpp

 622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
 623         MutexLocker ml(Threads_lock);
 624         // Must be inside the lock to ensure that we don't add the thread to the table
 625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
 626         if (!thread->is_exiting()) {
 627           ThreadTable::add_thread(java_tid, thread);
 628           return thread;
 629         }
 630       }

[1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06
[2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
[3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04
[4] https://bugs.openjdk.java.net/browse/JDK-8229391

Thank you,
Daniil

> On 8/4/19, 7:54 PM, "David Holmes" wrote:
>
> Hi Daniil,
>
> On 3/08/2019 8:16 am, Daniil Titov wrote:
> > Hi David,
> >
> > Thank you for your detailed review. Please review a new version of the fix that includes
> > the changes you suggested:
> > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only;
> > - ThreadTableCreate_lock is made _safepoint_check_always;
>
> Okay.
>
> > - ServiceThread is no longer responsible for the resizing of the thread table, instead,
> > the thread table is changed to grow on demand by the thread that is doing the addition;
>
> Okay - I'm happy to get the serviceThread out of the picture here.
>
> > - fixed nits and formatting issues.
>
> Okay.
>
> >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
> >>> as Daniel suggested.
> >> Not sure it's best to combine these, but if they are limited to the
> >> changes in management.cpp only then that may be okay.
> >
> > The additional optimization for some callers of find_JavaThread_from_java_tid() is
> > limited to management.cpp (plus a new test) so I left them in the webrev but
> > I also could move it to a separate issue if required.
>
> I'd prefer this part to be separated out, but won't insist.
> Let's see if
> Dan or Serguei have a strong opinion.
>
> > > src/hotspot/share/runtime/threadSMR.cpp
> > > 755 jlong tid = SharedRuntime::get_java_tid(thread);
> > > 926 jlong tid = SharedRuntime::get_java_tid(thread);
> > > I think it cleaner/better to just use
> > > jlong tid = java_lang_Thread::thread_id(thread->threadObj());
> > > as we know thread is not NULL, it is a JavaThread and it has to have a
> > > non-null threadObj.
> >
> > I had to leave this code unchanged since it turned out the threadObj is null
> > when VM is destroyed:
> >
> > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67
> > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116
> > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82
> > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9
> > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c
> > C [libjli.so+0x4333] JavaMain+0x2c3
> > C [libjli.so+0x8159] ThreadJavaMain+0x9
>
> This is actually nothing to do with the VM being destroyed, but is an
> issue with JNI_AttachCurrentThread and its interaction with the
> ThreadSMR iterators. The attach process is:
> - create JavaThread
> - mark as "is attaching via jni"
> - add to ThreadsList
> - create java.lang.Thread object (you can only execute Java code after
>   you are attached)
> - mark as "attach completed"
>
> So while a thread "is attaching" it will be seen by the ThreadSMR thread
> iterator but will have a NULL java.lang.Thread object.
>
> We special-case attaching threads in a number of places in the VM and I
> think we should be explicitly doing something here to filter out
> attaching threads, rather than just being tolerant of a NULL j.l.Thread
> object.
> Specifically in ThreadsSMRSupport::add_thread:
>
>   if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
>     jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>     ThreadTable::add_thread(tid, thread);
>   }
>
> Note that in ThreadsSMRSupport::remove_thread we can use the same guard,
> which covers the case where the JNI attach encountered an error trying to
> create the j.l.Thread object.
>
> >> src/hotspot/share/services/threadTable.cpp
> >> 71 static uintx get_hash(Value const& value, bool* is_dead) {
> >
> >> The is_dead parameter still bothers me here. I can't make enough sense
> >> out of the template code in ConcurrentHashTable to see why we have to
> >> have it, but I'm concerned that its very existence means we perhaps
> >> should not be trying to extend CHT in this context. ??
> >
> > My understanding is that the is_dead parameter provides a mechanism for
> > ConcurrentHashTable to remove stale entries that were not explicitly
> > removed by calling the ConcurrentHashTable::remove() method.
> > I think that just because in our case we don't use this mechanism doesn't
> > mean we should not use ConcurrentHashTable.
>
> Can you confirm that this usage is okay with Robbin Ehn please. He's
> back from vacation this week.
>
> >> I would still want to see what impact this has on thread
> >> startup cost, both with and without the table being initialized.
> >
> > I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
> > starts some threads as a warm-up, and then creates and starts 100,000 threads
> > (each thread just sleeps for 100 ms). When the thread table is enabled,
> > the 100,000 threads are created and started in about 15200 ms. If the thread table
> > is off the test takes about 14800 ms. Based on this information the enabled
> > thread table makes the thread startup about 2.7% slower.
>
> That doesn't sound very good. I think we may need to get Claes involved to
> help investigate overall performance impact here.
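
The 2.7% figure quoted above follows from (15200 - 14800) / 14800. As a rough illustration of how such a measurement might be set up, here is a sketch; it is not Daniil's actual test, the class and method names are hypothetical, and toggling the thread table itself (which happens inside the VM) is outside what this snippet can do.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadStartupSketch {
    // Relative slowdown of `measuredMs` against `baselineMs`, in percent.
    static double slowdownPercent(double baselineMs, double measuredMs) {
        return (measuredMs - baselineMs) / baselineMs * 100.0;
    }

    // Creates and starts `threadCount` threads that each sleep for `sleepMs`,
    // waits for all of them, and returns the elapsed wall-clock time in ms
    // (or -1 if the calling thread was interrupted while waiting).
    static long timeThreadStartupMs(int threadCount, long sleepMs) {
        List<Thread> threads = new ArrayList<>(threadCount);
        long start = System.nanoTime();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // A much smaller thread count than the 100,000 used in the thread.
        System.out.println("elapsed ms: " + timeThreadStartupMs(1000, 100));
        System.out.printf("slowdown: %.1f%%%n", slowdownPercent(14800, 15200));
    }
}
```

A single run like this is noisy; comparing several runs per configuration would give a more trustworthy number than one pair of timings.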
> > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > No further code comments. > > I didn't look at the test in detail. > > Thanks, > David > > > Thanks! > > --Daniil > > > > > > ?On 7/29/19, 12:53 AM, "David Holmes" wrote: > > > > Hi Daniil, > > > > Overall I think this is a reasonable approach but I would still like to > > see some performance and footprint numbers, both to verify it fixes the > > problem reported, and that we are not getting penalized elsewhere. > > > > On 25/07/2019 3:21 am, Daniil Titov wrote: > > > Hi David, Daniel, and Serguei, > > > > > > Please review the new version of the fix, that makes the thread table initialization on demand and > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > > table is being initialized . Such threads will be found by the linear search and added to the thread table > > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > > > The initialization allows the created but unpopulated, or partially > > populated, table to be seen by other threads - is that your intention? > > It seems it should be okay as the other threads will then race with the > > initializing thread to add specific entries, and this is a concurrent > > map so that should be functionally correct. But if so then I think you > > can also reduce the scope of the ThreadTableCreate_lock so that it > > covers creation of the table only, not the initial population of the table. > > > > I like the approach of only initializing the table when needed and using > > that to control when the add/remove-thread code needs to update the > > table. 
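
The on-demand approach discussed here (build the lookup table only on the first query, and have add/remove maintain it only once it exists) can be sketched in plain Java. This is a simplified analogy under stated assumptions, not the HotSpot implementation: `LazyIndexSketch` and `Worker` are hypothetical names, and every method is synchronized to keep the sketch free of the add/remove races that the real code has to handle with Threads_lock and a concurrent table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LazyIndexSketch {
    // Stand-in for a thread with a numeric id.
    static final class Worker {
        final long id;
        Worker(long id) { this.id = id; }
    }

    private final List<Worker> workers = new ArrayList<>();
    private Map<Long, Worker> index; // stays null until the first lookup

    synchronized void add(Worker w) {
        workers.add(w);
        if (index != null) {
            index.put(w.id, w); // maintain the index only once it exists
        }
    }

    synchronized void remove(Worker w) {
        workers.remove(w);
        if (index != null) {
            index.remove(w.id);
        }
    }

    synchronized Worker find(long id) {
        if (index == null) {
            // First query: pay the population cost once.
            index = new HashMap<>();
            for (Worker w : workers) {
                index.put(w.id, w);
            }
        }
        return index.get(id);
    }

    synchronized boolean isIndexed() {
        return index != null;
    }
}
```

Programs that never call `find` pay only the null check in `add` and `remove`, which is the point of making the table optional.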
But I would still want to see what impact this has on thread > > startup cost, both with and without the table being initialized. > > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > as Daniel suggested. > > > > Not sure it's best to combine these, but if they are limited to the > > changes in management.cpp only then that may be okay. It helps to be > > able to focus on the table related changes without being distracted by > > other optimizations. > > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > > to strip it of the all functionality that is not required in the thread table case. > > > > The revised version seems better in that regard. But I still have a > > concern, see below. > > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > > growing the thread table when required. > > > > Yes but why? Why can't this table be grown on demand by the thread that > > is doing the addition? For other tables we may have to delegate to the > > service thread because the current thread cannot perform the action, or > > it doesn't want to perform it at the time the need for the resize is > > detected (e.g. its detected at a safepoint and you want the resize to > > happen later outside the safepoint). It's not apparent to me that such > > restrictions apply here. > > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. 
It will make > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > > already has ConcurrentHashTable doesn't seem reasonable for me. > > > > Ok. > > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > > > Some specific code comments: > > > > src/hotspot/share/runtime/mutexLocker.cpp > > > > + def(ThreadTableCreate_lock , PaddedMutex , special, > > false, Monitor::_safepoint_check_never); > > > > I think this needs to be a _safepoint_check_always lock. The table will > > be created by regular JavaThreads and they should (nearly) always be > > checking for safepoints if they are going to block acquiring the lock. > > And it isn't at all obvious that the thread doing the creation can't go > > to a safepoint whilst this lock is held. > > > > --- > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > Nit: > > > > 618 JavaThread* thread = thread_at(i); > > > > you could reuse the new java_thread local you introduced at line 613 and > > just rename that "new" variable to "thread" so you don't have to change > > all other uses. > > > > 628 } else if (java_thread != NULL && ... > > > > You don't need to check != NULL here as you only get here when > > java_thread is not NULL. > > > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > I think it cleaner/better to just use > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > non-null threadObj. > > > > --- > > > > src/hotspot/share/services/management.cpp > > > > 1323 if (THREAD->is_Java_thread()) { > > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > > > These calls can only be made on a JavaThread so this be simplified to > > remove the is_Java_thread() call. Similarly in other places. 
> > > > --- > > > > src/hotspot/share/services/threadTable.cpp > > > > 55 class ThreadTableEntry : public CHeapObj { > > 56 private: > > 57 jlong _tid; > > > > I believe hotspot style is to not indent the access modifiers in C++ > > class declarations, so the above would just be: > > > > 55 class ThreadTableEntry : public CHeapObj { > > 56 private: > > 57 jlong _tid; > > > > etc. > > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > > 61 _tid(tid),_java_thread(java_thread) {} > > > > line 61 should be indented as it continues line 60. > > > > 67 class ThreadTableConfig : public AllStatic { > > ... > > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > The is_dead parameter still bothers me here. I can't make enough sense > > out of the template code in ConcurrentHashtable to see why we have to > > have it, but I'm concerned that its very existence means we perhaps > > should not be trying to extend CHT in this context. ?? > > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > > 116 ? size_log : DefaultThreadTableSizeLog; > > > > line 116 should be indented, though in this case I think a better layout > > would be: > > > > 115 size_t start_size_log = > > 116 size_log > DefaultThreadTableSizeLog ? size_log : > > DefaultThreadTableSizeLog; > > > > 131 double ThreadTable::get_load_factor() { > > 132 return (double)_items_count/_current_size; > > 133 } > > > > Not sure that is doing what you want/expect. It will perform integer > > division and then cast that whole integer to a double. If you want > > double arithmetic you need: > > > > return ((double)_items_count)/_current_size; > > > > 180 jlong _tid; > > 181 uintx _hash; > > > > Nit: no need for all those spaces before the variable name. > > > > 183 ThreadTableLookup(jlong tid) > > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > > > line 184 should be indented. 
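
A side note on the get_load_factor() comment above, offered as a sketch rather than a correction of the webrev: assuming both operands are integral, a cast binds more tightly than division in C++ (and in Java), so `(double)a/b` already converts `a` first and divides in floating point; the extra parentheses change readability, not the result. Integer division only happens when the cast wraps the whole quotient. The Java snippet below, with hypothetical method names, shows the three spellings.

```java
class CastPrecedenceSketch {
    // Cast applies to `items` first, then the division is done in double.
    static double castThenDivide(long items, long size) {
        return (double) items / size;
    }

    // Same computation with the grouping spelled out.
    static double explicitParentheses(long items, long size) {
        return ((double) items) / size;
    }

    // Integer division happens first; the truncated result is then widened.
    static double divideThenCast(long items, long size) {
        return (double) (items / size);
    }

    public static void main(String[] args) {
        System.out.println(castThenDivide(3, 2));
        System.out.println(explicitParentheses(3, 2));
        System.out.println(divideThenCast(3, 2));
    }
}
```

With 3 and 2 as inputs, the first two methods return 1.5 and the third returns 1.0.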
> > > > 201 ThreadGet():_return(NULL) {} > > > > Nit: need space after : > > > > 211 assert(_is_initialized, "Thread table is not initialized"); > > 212 _has_work = false; > > > > line 211 is indented one space too far. > > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > > > Nit: need space after , > > > > 252 return _local_table->remove(thread,lookup); > > > > Nit: need space after , > > > > Thanks, > > David > > ------ > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! > > > --Daniil > > > > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > > Hi Serguei and David, > > > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. > > > > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. > > > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > > > I have the same concerns as David H. about this new ThreadTable. > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > > in src/hotspot/share/services/management.cpp so I think that table > > > needs to enabled and populated only if it is going to be used. > > > > > > I've taken a look at the webrev below and I see that David has > > > followed up with additional comments. 
Before I do a crawl through > > > code review for this, I would like to see the ThreadTable stuff > > > made optional and David's other comments addressed. > > > > > > Another possible optimization is for callers of > > > find_JavaThread_from_java_tid() to save the calling thread's > > > tid value before they loop and if the current tid == saved_tid > > > then use the current JavaThread* instead of calling > > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > > > Dan > > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! > > > > --Daniil > > > > > > > > From: > > > > Organization: Oracle Corporation > > > > Date: Friday, June 28, 2019 at 7:56 PM > > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > > > Hi Daniil, > > > > > > > > I have several quick comments. > > > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. > > > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > > 619 // to the thread table. 
> > > > 620 for (uint i = 0; i < length(); i++) { > > > > 621 JavaThread* thread = thread_at(i); > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > 623 ThreadTable::add_thread(java_tid, thread); > > > > 624 return thread; > > > > 625 } > > > > 626 } > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > 628 return java_thread; > > > > 629 } > > > > 630 return NULL; > > > > 631 } > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > > 633 oop tobj = java_thread->threadObj(); > > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > > 635 // or is starting to exit. > > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > > 638 } > > > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). > > > > > > > > A space is missed after the comma: > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > > > An empty line is needed before L632. > > > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > > It'd better to list parameters in the opposite order. > > > > > > > > The call to is_valid_java_thread() is confusing: > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > > > > Thanks, > > > > Serguei > > > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > > > Hi Daniil, > > > > > > > > The definition and use of this hashtable (yet another hashtable > > > > implementation!) will need careful examination. 
We have to be concerned > > > > about the cost of maintaining it when it may never even be queried. You > > > > would need to look at footprint cost and performance impact. > > > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > > next few days. I will try to look at this asap next week, but we will > > > > need a lot more data on it. > > > > > > > > Thanks, > > > > David > > > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > > in the thread table. > > > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! > > > > > > > > Best regards, > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From alexey.menkov at oracle.com Mon Sep 16 19:24:23 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 16 Sep 2019 12:24:23 -0700 Subject: Ping: Re: RFR: JDK-8186825: some memory leak issues in the transport_startTransport In-Reply-To: <9531fb93-e9e0-5897-a466-8422244477ca@oracle.com> References: <9531fb93-e9e0-5897-a466-8422244477ca@oracle.com> Message-ID: <2744cb9d-5d6f-0321-97a9-f2dae8c6f5d9@oracle.com> Need 2nd review for the fix --alex On 09/05/2019 12:08, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > Looks good to me. > > Thanks, > Serguei > > > On 9/5/19 11:20, Alex Menkov wrote: >> Hi all, >> >> Please review the fix for >> ? 
https://bugs.openjdk.java.net/browse/JDK-8186825
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp_memory_leak/webrev/
>>
>> TransportInfo structure is used to pass data from main thread
>> (transport_startTransport function) to acceptThread/attachThread and
>> should be released by acceptThread/attachThread.
>>
>> --alex
>

From hohensee at amazon.com Mon Sep 16 21:18:22 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 16 Sep 2019 21:18:22 +0000
Subject: Ping: Re: RFR: JDK-8186825: some memory leak issues in the transport_startTransport
In-Reply-To: <2744cb9d-5d6f-0321-97a9-f2dae8c6f5d9@oracle.com>
References: <9531fb93-e9e0-5897-a466-8422244477ca@oracle.com> <2744cb9d-5d6f-0321-97a9-f2dae8c6f5d9@oracle.com>
Message-ID: <3AFDE38B-FC0A-406A-BC58-FE1D92250A0C@amazon.com>

Looks good

Paul

On 9/16/19, 12:25 PM, "serviceability-dev on behalf of Alex Menkov" wrote:

Need 2nd review for the fix

--alex

On 09/05/2019 12:08, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> Looks good to me.
>
> Thanks,
> Serguei
>
>
> On 9/5/19 11:20, Alex Menkov wrote:
>> Hi all,
>>
>> Please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8186825
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp_memory_leak/webrev/
>>
>> TransportInfo structure is used to pass data from main thread
>> (transport_startTransport function) to acceptThread/attachThread and
>> should be released by acceptThread/attachThread.
>>
>> --alex
>

From david.holmes at oracle.com Mon Sep 16 23:01:15 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Sep 2019 09:01:15 +1000
Subject: RFR: 8230857: Avoid reflection in sun.tools.common.ProcessHelper
In-Reply-To:
References: <555a2cf2-e15e-abb6-5c0a-fb3ff4c0716f@oracle.com>
Message-ID: <7cef2fd2-74cb-f069-d837-b5219924efc0@oracle.com>

Hi Christoph,

Sorry for the delay getting back to you.

cc'd build-dev to get some clarification on the below ...

On 12/09/2019 7:30 pm, Langer, Christoph wrote: > Hi David, > >>> please review an enhancement which I've identified when working with >>> Processhelper for JDK-8230850. >>> >>> I noticed that ProcessHelper is an interface in common code with a >>> static method that would lookup the actual platform implementation via >>> reflection. This seems a little cumbersome since we can have a common >>> dummy for ProcessHelper and override it with the platform specific >>> implementation, leveraging the build system. >> >> I don't see you leveraging the build system. You have two source files >> that compile to the same destination class file. What is ensuring the >> platform specific version is compiled after the generic one? >> >> Service-provider patterns use reflection to instantiate the service >> implementation. I don't see any problem here that needs solving. > > TL;DR: > There are two source files, one in share/classes and one in linux/classes. The build system overrides the share/classes implementation with the linux/classes implementation in the linux build. This is not by coincidence and only one class is contained in the generated jdk.jcmd module. Then there won't be a need for having a service interface and a service implementation that is looked up via reflection (which is not a bad pattern by itself). I agree that it's not a big problem to be solved but still not "no problem". > Here is some longer elaboration how the build system prefers specific implementations of classes and filters generic duplicates: > The SetupJavaCompilation function from JavaCompilation.gmk [0] is used to compile the java sources for JDK modules. In its documentation, for argument SRC [1], it claims: "one or more directories to search for sources. The order of the source roots is significant. The first found file of a certain name has priority". In its implementation the found files are first ordered [3] and duplicates filtered out [4]. 
> The potential source files are handed to SetupJavaCompilation in CompileJavaModules.gmk [5] and were collected by a call to FindModuleSrcDirs [6]. FindModuleSrcDirs iterates over all potential source dirs for Java classes in the module [7]. The evaluated subdirs are (in that order) $(OPENJDK_TARGET_OS)/classes, $(OPENJDK_TARGET_OS_TYPE)/classes and share/classes, as per [8]. > Hope that explains what I'm trying to leverage here. I'm not 100% certain that what you describe actually ensures what you want it to ensure. I can't reconcile "the first found file ... has priority" with the fact found files are sorted and duplicates eliminated. It is the sorting that concerns me as it suggests linux/Foo.java might replace shared/Foo.java, but if we're on Windows then we have a problem! That said there is also this comment: # Order src files according to the order of the src dirs. Correct odering is # needed for correct overriding between different source roots. I'd need the build team to clarify what "correct overriding" is actually defined as. 
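To make the rule under discussion concrete, here is a minimal model of "the first found file of a certain name has priority" as Christoph describes it. It is only a sketch of the described behaviour, not the makefile logic itself: the function name `select_sources`, the `SourceFile` struct and the example file names are assumptions made for illustration. The point it shows is that a root which is not in the list for the current platform (for example linux/classes on a Windows build) never contributes a file, so it cannot override share/classes there.

```cpp
#include <map>
#include <string>
#include <vector>

// One candidate source file: the root it was found under and its path
// relative to that root (illustrative values only).
struct SourceFile {
  std::string root;
  std::string relative_path;
};

// Returns relative_path -> winning root. `roots` is in priority order,
// e.g. <os>/classes, <os_type>/classes, share/classes. For each relative
// path the first root that provides it wins; later duplicates are dropped.
std::map<std::string, std::string> select_sources(
    const std::vector<std::string>& roots,
    const std::vector<SourceFile>& found) {
  std::map<std::string, std::string> selected;
  for (const std::string& root : roots) {
    for (const SourceFile& f : found) {
      if (f.root == root) {
        // emplace() does not overwrite an existing key, so a file from a
        // higher-priority root is never replaced by a later one.
        selected.emplace(f.relative_path, f.root);
      }
    }
  }
  return selected;
}
```

Under this model, a Linux build with roots linux/classes, unix/classes, share/classes picks the linux copy of ProcessHelper.java, while a Windows build with roots windows/classes, share/classes falls back to the shared copy. Whether the real JavaCompilation.gmk sorting preserves exactly this order is the question left for the build team.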
Thanks,
David
-----

> I've uploaded an updated webrev which contains some cleanup to the Test changes: http://cr.openjdk.java.net/~clanger/webrevs/8230857.1/
>
> Thanks
> Christoph
>
> [0] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l185
> [1] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l157
> [3] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l225
> [4] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l257
> [5] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/CompileJavaModules.gmk#l603
> [6] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/CompileJavaModules.gmk#l555
> [7] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/Modules.gmk#l300
> [8] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/Modules.gmk#l243
>
>

From david.holmes at oracle.com Tue Sep 17 02:26:48 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Sep 2019 12:26:48 +1000
Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com>
Message-ID:

Hi Daniil,

Thanks again for your perseverance on this one.

I think there is a problem with initialization of the thread table.

Suppose thread T1 has called ThreadsList::find_JavaThread_from_java_tid and has commenced execution of ThreadTable::lazy_initialize, but not yet marked _is_initialized as true.
Now two new threads (T2 and T3) are created and start running - they aren't added to the ThreadTable yet because it isn't initialized. Now T0 also calls ThreadsList::find_JavaThread_from_java_tid using an updated ThreadsList that contains T2 and T3. It also calls ThreadTable::lazy_initialize. If _is_initialized is still false T0 will attempt initialization but once it gets the lock it will see the table has now been initialized by T1. It will then proceed to update the table with its own ThreadList content - adding T2 and T3. That is all fine. But now suppose T0 initially sees _is_initialized as true, it will do nothing in lazy_initialize and simply return to find_JavaThread_from_java_tid. But now T2 and T3 are missing from the ThreadTable and nothing will cause them to be added. More generally any ThreadsList that is created after the ThreadsList that will be used for initialization, may contain threads that will not be added to the table. Thanks, David On 17/09/2019 4:18 am, Daniil Titov wrote: > Hello, > > After investigating with Claes the impact of this change on the performance (thanks a lot Claes for helping with it!) the conclusion was that the impact on the thread startup time is not a blocker for this change. > > I also measured the memory footprint using Native Memory Tracking and results showed around 40 bytes per live thread. > > Please review a new version of the fix, webrev.06 [1]. Just to remind, webrev.05 was abandoned and webrev.06 [1] is webrev.04 [3] minus changes in src/hotspot/share/services/management.cpp (that were factored out to a separate issue [4]) and plus a change in ThreadsList::find_JavaThread_from_java_tid() method (please, see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock. 
> > src/hotspot/share/runtime/threadSMR.cpp > > 622 if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) { > 623 MutexLocker ml(Threads_lock); > 624 // Must be inside the lock to ensure that we don't add the thread to the table > 625 // that has just passed the removal point in ThreadsSMRSupport::remove_thread() > 626 if (!thread->is_exiting()) { > 627 ThreadTable::add_thread(java_tid, thread); > 628 return thread; > 629 } > 630 } > > [1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06 > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > [3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04 > [4] https://bugs.openjdk.java.net/browse/JDK-8229391 > > ?Thank you, > Daniil > > > > > > > ?On 8/4/19, 7:54 PM, "David Holmes" wrote: > > > > Hi Daniil, > > > > On 3/08/2019 8:16 am, Daniil Titov wrote: > > > Hi David, > > > > > > Thank you for your detailed review. Please review a new version of the fix that includes > > > the changes you suggested: > > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > > > - ThreadTableCreate_lock is made _safepoint_check_always; > > > > Okay. > > > > > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > > > the thread table is changed to grow on demand by the thread that is doing the addition; > > > > Okay - I'm happy to get the serviceThread out of the picture here. > > > > > - fixed nits and formatting issues. > > > > Okay. > > > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > >>> as Daniel suggested. > > >> Not sure it's best to combine these, but if they are limited to the > > >> changes in management.cpp only then that may be okay. 
> > > > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > > > limited to management.cpp (plus a new test) so I left them in the webrev but > > > I also could move it in the separate issue if required. > > > > I'd prefer this part of be separated out, but won't insist. Let's see if > > Dan or Serguei have a strong opinion. > > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > I think it cleaner/better to just use > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > > non-null threadObj. > > > > > > I had to leave this code unchanged since it turned out the threadObj is null > > > when VM is destroyed: > > > > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > > > C [libjli.so+0x4333] JavaMain+0x2c3 > > > C [libjli.so+0x8159] ThreadJavaMain+0x9 > > > > This is actually nothing to do with the VM being destroyed, but is an > > issue with JNI_AttachCurrentThread and its interaction with the > > ThreadSMR iterators. The attach process is: > > - create JavaThread > > - mark as "is attaching via jni" > > - add to ThreadsList > > - create java.lang.Thread object (you can only execute Java code after > > you are attached) > > - mark as "attach completed" > > > > So while a thread "is attaching" it will be seen by the ThreadSMR thread > > iterator but will have a NULL java.lang.Thread object. 
> > > > We special-case attaching threads in a number of places in the VM and I > > think we should be explicitly doing something here to filter out > > attaching threads, rather than just being tolerant of a NULL j.l.Thread > > object. Specifically in ThreadsSMRSupport::add_thread: > > > > if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) { > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > ThreadTable::add_thread(tid, thread); > > } > > > > Note that in ThreadsSMRSupport::remove_thread we can use the same guard, > > which covers the case the JNI attach encountered an error trying to > > create the j.l.Thread object. > > > > >> src/hotspot/share/services/threadTable.cpp > > >> 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > > >> The is_dead parameter still bothers me here. I can't make enough sense > > >> out of the template code in ConcurrentHashtable to see why we have to > > >> have it, but I'm concerned that its very existence means we perhaps > > >> should not be trying to extend CHT in this context. ?? > > > > > > My understanding is that is_dead parameter provides a mechanism for > > > ConcurrentHashtable to remove stale entries that were not explicitly > > > removed by calling ConcurrentHashTable::remove() method. > > > I think that just because in our case we don't use this mechanism doesn't > > > mean we should not use ConcurrentHashTable. > > > > Can you confirm that this usage is okay with Robbin Ehn please. He's > > back from vacation this week. > > > > >> I would still want to see what impact this has on thread > > >> startup cost, both with and without the table being initialized. > > > > > > I run a test that initializes the table by calling ThreadMXBean.get getThreadInfo(), > > > starts some threads as a worm-up, and then creates and starts 100,000 threads > > > (each thread just sleeps for 100 ms). 
In case when the thread table is enabled > > > 100,000 threads are created and started for about 15200 ms. If the thread table > > > is off the test takes about 14800 ms. Based on this information the enabled > > > thread table makes the thread startup about 2.7% slower. > > > > That doesn't sound very good. I think we may need to Claes involved to > > help investigate overall performance impact here. > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > No further code comments. > > > > I didn't look at the test in detail. > > > > Thanks, > > David > > > > > Thanks! > > > --Daniil > > > > > > > > > ?On 7/29/19, 12:53 AM, "David Holmes" wrote: > > > > > > Hi Daniil, > > > > > > Overall I think this is a reasonable approach but I would still like to > > > see some performance and footprint numbers, both to verify it fixes the > > > problem reported, and that we are not getting penalized elsewhere. > > > > > > On 25/07/2019 3:21 am, Daniil Titov wrote: > > > > Hi David, Daniel, and Serguei, > > > > > > > > Please review the new version of the fix, that makes the thread table initialization on demand and > > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > > > table is being initialized . Such threads will be found by the linear search and added to the thread table > > > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > > > > > The initialization allows the created but unpopulated, or partially > > > populated, table to be seen by other threads - is that your intention? 
> > > It seems it should be okay as the other threads will then race with the > > > initializing thread to add specific entries, and this is a concurrent > > > map so that should be functionally correct. But if so then I think you > > > can also reduce the scope of the ThreadTableCreate_lock so that it > > > covers creation of the table only, not the initial population of the table. > > > > > > I like the approach of only initializing the table when needed and using > > > that to control when the add/remove-thread code needs to update the > > > table. But I would still want to see what impact this has on thread > > > startup cost, both with and without the table being initialized. > > > > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > > as Daniel suggested. > > > > > > Not sure it's best to combine these, but if they are limited to the > > > changes in management.cpp only then that may be okay. It helps to be > > > able to focus on the table related changes without being distracted by > > > other optimizations. > > > > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > > > to strip it of the all functionality that is not required in the thread table case. > > > > > > The revised version seems better in that regard. But I still have a > > > concern, see below. > > > > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > > > growing the thread table when required. > > > > > > Yes but why? Why can't this table be grown on demand by the thread that > > > is doing the addition? 
For other tables we may have to delegate to the > > > service thread because the current thread cannot perform the action, or > > > it doesn't want to perform it at the time the need for the resize is > > > detected (e.g. its detected at a safepoint and you want the resize to > > > happen later outside the safepoint). It's not apparent to me that such > > > restrictions apply here. > > > > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > > > already has ConcurrentHashTable doesn't seem reasonable for me. > > > > > > Ok. > > > > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > > > > > Some specific code comments: > > > > > > src/hotspot/share/runtime/mutexLocker.cpp > > > > > > + def(ThreadTableCreate_lock , PaddedMutex , special, > > > false, Monitor::_safepoint_check_never); > > > > > > I think this needs to be a _safepoint_check_always lock. The table will > > > be created by regular JavaThreads and they should (nearly) always be > > > checking for safepoints if they are going to block acquiring the lock. > > > And it isn't at all obvious that the thread doing the creation can't go > > > to a safepoint whilst this lock is held. > > > > > > --- > > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > > > Nit: > > > > > > 618 JavaThread* thread = thread_at(i); > > > > > > you could reuse the new java_thread local you introduced at line 613 and > > > just rename that "new" variable to "thread" so you don't have to change > > > all other uses. > > > > > > 628 } else if (java_thread != NULL && ... > > > > > > You don't need to check != NULL here as you only get here when > > > java_thread is not NULL. 
> > > > > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > > > I think it cleaner/better to just use > > > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > non-null threadObj. > > > > > > --- > > > > > > src/hotspot/share/services/management.cpp > > > > > > 1323 if (THREAD->is_Java_thread()) { > > > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > > > > > These calls can only be made on a JavaThread so this be simplified to > > > remove the is_Java_thread() call. Similarly in other places. > > > > > > --- > > > > > > src/hotspot/share/services/threadTable.cpp > > > > > > 55 class ThreadTableEntry : public CHeapObj { > > > 56 private: > > > 57 jlong _tid; > > > > > > I believe hotspot style is to not indent the access modifiers in C++ > > > class declarations, so the above would just be: > > > > > > 55 class ThreadTableEntry : public CHeapObj { > > > 56 private: > > > 57 jlong _tid; > > > > > > etc. > > > > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > > > 61 _tid(tid),_java_thread(java_thread) {} > > > > > > line 61 should be indented as it continues line 60. > > > > > > 67 class ThreadTableConfig : public AllStatic { > > > ... > > > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > > > The is_dead parameter still bothers me here. I can't make enough sense > > > out of the template code in ConcurrentHashtable to see why we have to > > > have it, but I'm concerned that its very existence means we perhaps > > > should not be trying to extend CHT in this context. ?? > > > > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > > > 116 ? 
size_log : DefaultThreadTableSizeLog; > > > > > > line 116 should be indented, though in this case I think a better layout > > > would be: > > > > > > 115 size_t start_size_log = > > > 116 size_log > DefaultThreadTableSizeLog ? size_log : > > > DefaultThreadTableSizeLog; > > > > > > 131 double ThreadTable::get_load_factor() { > > > 132 return (double)_items_count/_current_size; > > > 133 } > > > > > > Not sure that is doing what you want/expect. It will perform integer > > > division and then cast that whole integer to a double. If you want > > > double arithmetic you need: > > > > > > return ((double)_items_count)/_current_size; > > > > > > 180 jlong _tid; > > > 181 uintx _hash; > > > > > > Nit: no need for all those spaces before the variable name. > > > > > > 183 ThreadTableLookup(jlong tid) > > > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > > > > > line 184 should be indented. > > > > > > 201 ThreadGet():_return(NULL) {} > > > > > > Nit: need space after : > > > > > > 211 assert(_is_initialized, "Thread table is not initialized"); > > > 212 _has_work = false; > > > > > > line 211 is indented one space too far. > > > > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > > > > > Nit: need space after , > > > > > > 252 return _local_table->remove(thread,lookup); > > > > > > Nit: need space after , > > > > > > Thanks, > > > David > > > ------ > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! > > > > --Daniil > > > > > > > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > > > > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > > > Hi Serguei and David, > > > > > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. 
> > > > > > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. > > > > > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > > > > > I have the same concerns as David H. about this new ThreadTable. > > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > > > in src/hotspot/share/services/management.cpp so I think that table > > > > needs to enabled and populated only if it is going to be used. > > > > > > > > I've taken a look at the webrev below and I see that David has > > > > followed up with additional comments. Before I do a crawl through > > > > code review for this, I would like to see the ThreadTable stuff > > > > made optional and David's other comments addressed. > > > > > > > > Another possible optimization is for callers of > > > > find_JavaThread_from_java_tid() to save the calling thread's > > > > tid value before they loop and if the current tid == saved_tid > > > > then use the current JavaThread* instead of calling > > > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > > > > > Dan > > > > > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > > > Thanks! 
> > > > > --Daniil > > > > > > > > > > From: > > > > > Organization: Oracle Corporation > > > > > Date: Friday, June 28, 2019 at 7:56 PM > > > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > > > > > Hi Daniil, > > > > > > > > > > I have several quick comments. > > > > > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. > > > > > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > > > 619 // to the thread table. > > > > > 620 for (uint i = 0; i < length(); i++) { > > > > > 621 JavaThread* thread = thread_at(i); > > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > 623 ThreadTable::add_thread(java_tid, thread); > > > > > 624 return thread; > > > > > 625 } > > > > > 626 } > > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > 628 return java_thread; > > > > > 629 } > > > > > 630 return NULL; > > > > > 631 } > > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > > > 633 oop tobj = java_thread->threadObj(); > > > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > > > 635 // or is starting to exit. 
> > > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > > > 638 } > > > > > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). > > > > > > > > > > A space is missed after the comma: > > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > > > > > An empty line is needed before L632. > > > > > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > > > It'd better to list parameters in the opposite order. > > > > > > > > > > The call to is_valid_java_thread() is confusing: > > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > > > > > > > Thanks, > > > > > Serguei > > > > > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > > > > > Hi Daniil, > > > > > > > > > > The definition and use of this hashtable (yet another hashtable > > > > > implementation!) will need careful examination. We have to be concerned > > > > > about the cost of maintaining it when it may never even be queried. You > > > > > would need to look at footprint cost and performance impact. > > > > > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > > > next few days. I will try to look at this asap next week, but we will > > > > > need a lot more data on it. > > > > > > > > > > Thanks, > > > > > David > > > > > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > > > information for specific threads. 
The change introduces the thread table that uses ConcurrentHashTable
> > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear
> > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup
> > > > > in the thread table.
> > > > >
> > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed.
> > > > >
> > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/
> > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Best regards,
> > > > > Daniil

From daniil.x.titov at oracle.com Tue Sep 17 04:36:01 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Sep 2019 21:36:01 -0700
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To:
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com>
Message-ID: <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com>

Hi David,

The case you have described is exactly the reason why we still have code inside the
ThreadsList::find_JavaThread_from_java_tid() method that does a linear scan and adds
the requested thread to the thread table if it is not there (lines 614-631 below). The
assumption is that this is quite uncommon, and even when it happens the linear scan runs
only once per such thread.
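For readers outside HotSpot, the fallback described above can be sketched in standalone C++. This is only an illustration under stated assumptions: FakeThread and TidIndex are hypothetical names, std::unordered_map stands in for ConcurrentHashTable, and a single std::mutex stands in for Threads_lock. It is not the code in the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for JavaThread (assumption, not the HotSpot class).
struct FakeThread {
  std::int64_t tid;
  bool exiting;
};

// Hypothetical stand-in for ThreadTable plus the ThreadsList fallback scan.
class TidIndex {
 public:
  explicit TidIndex(std::vector<FakeThread*> threads)
      : _threads(std::move(threads)) {}

  FakeThread* find(std::int64_t tid) {
    {
      // Fast path: look the tid up in the table.
      std::lock_guard<std::mutex> guard(_lock);
      auto it = _table.find(tid);
      if (it != _table.end()) {
        return it->second->exiting ? nullptr : it->second;
      }
    }
    // Table miss: fall back to a linear scan of the thread list.
    for (FakeThread* t : _threads) {
      if (t->tid == tid) {
        // Re-check the exiting flag under the lock before caching, so a
        // thread that is already on its way out is never added.
        std::lock_guard<std::mutex> guard(_lock);
        if (!t->exiting) {
          _table.emplace(tid, t);
          return t;
        }
      }
    }
    return nullptr;
  }

  // Number of threads currently cached in the table.
  std::size_t cached() {
    std::lock_guard<std::mutex> guard(_lock);
    return _table.size();
  }

 private:
  std::vector<FakeThread*> _threads;
  std::unordered_map<std::int64_t, FakeThread*> _table;
  std::mutex _lock;
};
```

In this sketch the first find() for a given tid pays for the scan and later calls hit the table, which mirrors the "only once per such thread" assumption.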

611 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
612   ThreadTable::lazy_initialize(this);
613   JavaThread* thread = ThreadTable::find_thread_by_tid(java_tid);
614   if (thread == NULL) {
615     // If the thread is not found in the table find it
616     // with a linear search and add to the table.
617     for (uint i = 0; i < length(); i++) {
618       thread = thread_at(i);
619       oop tobj = thread->threadObj();
620       // Ignore the thread if it hasn't run yet, has exited
621       // or is starting to exit.
622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
623         MutexLocker ml(Threads_lock);
624         // Must be inside the lock to ensure that we don't add the thread to the table
625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
626         if (!thread->is_exiting()) {
627           ThreadTable::add_thread(java_tid, thread);
628           return thread;
629         }
630       }
631     }
632   } else if (!thread->is_exiting()) {
633     return thread;
634   }
635   return NULL;
636 }

Thanks,
Daniil

On 9/16/19, 7:27 PM, "David Holmes" wrote:

    Hi Daniil,

    Thanks again for your perseverance on this one.

    I think there is a problem with initialization of the thread table.
    Suppose thread T1 has called ThreadsList::find_JavaThread_from_java_tid
    and has commenced execution of ThreadTable::lazy_initialize, but not yet
    marked _is_initialized as true. Now two new threads (T2 and T3) are
    created and start running - they aren't added to the ThreadTable yet
    because it isn't initialized. Now T0 also calls
    ThreadsList::find_JavaThread_from_java_tid using an updated ThreadsList
    that contains T2 and T3. It also calls ThreadTable::lazy_initialize. If
    _is_initialized is still false T0 will attempt initialization but once
    it gets the lock it will see the table has now been initialized by T1.
    It will then proceed to update the table with its own ThreadList content
    - adding T2 and T3. That is all fine.
But now suppose T0 initially sees _is_initialized as true, it will do nothing in lazy_initialize and simply return to find_JavaThread_from_java_tid. But now T2 and T3 are missing from the ThreadTable and nothing will cause them to be added. More generally any ThreadsList that is created after the ThreadsList that will be used for initialization, may contain threads that will not be added to the table. Thanks, David On 17/09/2019 4:18 am, Daniil Titov wrote: > Hello, > > After investigating with Claes the impact of this change on the performance (thanks a lot Claes for helping with it!) the conclusion was that the impact on the thread startup time is not a blocker for this change. > > I also measured the memory footprint using Native Memory Tracking and results showed around 40 bytes per live thread. > > Please review a new version of the fix, webrev.06 [1]. Just to remind, webrev.05 was abandoned and webrev.06 [1] is webrev.04 [3] minus changes in src/hotspot/share/services/management.cpp (that were factored out to a separate issue [4]) and plus a change in ThreadsList::find_JavaThread_from_java_tid() method (please, see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock. 
> > src/hotspot/share/runtime/threadSMR.cpp > > 622 if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) { > 623 MutexLocker ml(Threads_lock); > 624 // Must be inside the lock to ensure that we don't add the thread to the table > 625 // that has just passed the removal point in ThreadsSMRSupport::remove_thread() > 626 if (!thread->is_exiting()) { > 627 ThreadTable::add_thread(java_tid, thread); > 628 return thread; > 629 } > 630 } > > [1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06 > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > [3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04 > [4] https://bugs.openjdk.java.net/browse/JDK-8229391 > > ?Thank you, > Daniil > > > > > > > ?On 8/4/19, 7:54 PM, "David Holmes" wrote: > > > > Hi Daniil, > > > > On 3/08/2019 8:16 am, Daniil Titov wrote: > > > Hi David, > > > > > > Thank you for your detailed review. Please review a new version of the fix that includes > > > the changes you suggested: > > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > > > - ThreadTableCreate_lock is made _safepoint_check_always; > > > > Okay. > > > > > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > > > the thread table is changed to grow on demand by the thread that is doing the addition; > > > > Okay - I'm happy to get the serviceThread out of the picture here. > > > > > - fixed nits and formatting issues. > > > > Okay. > > > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > >>> as Daniel suggested. > > >> Not sure it's best to combine these, but if they are limited to the > > >> changes in management.cpp only then that may be okay. 
> > > > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > > > limited to management.cpp (plus a new test) so I left them in the webrev but > > > I also could move it in the separate issue if required. > > > > I'd prefer this part of be separated out, but won't insist. Let's see if > > Dan or Serguei have a strong opinion. > > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > I think it cleaner/better to just use > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > > non-null threadObj. > > > > > > I had to leave this code unchanged since it turned out the threadObj is null > > > when VM is destroyed: > > > > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > > > C [libjli.so+0x4333] JavaMain+0x2c3 > > > C [libjli.so+0x8159] ThreadJavaMain+0x9 > > > > This is actually nothing to do with the VM being destroyed, but is an > > issue with JNI_AttachCurrentThread and its interaction with the > > ThreadSMR iterators. The attach process is: > > - create JavaThread > > - mark as "is attaching via jni" > > - add to ThreadsList > > - create java.lang.Thread object (you can only execute Java code after > > you are attached) > > - mark as "attach completed" > > > > So while a thread "is attaching" it will be seen by the ThreadSMR thread > > iterator but will have a NULL java.lang.Thread object. 
> > > > We special-case attaching threads in a number of places in the VM and I > > think we should be explicitly doing something here to filter out > > attaching threads, rather than just being tolerant of a NULL j.l.Thread > > object. Specifically in ThreadsSMRSupport::add_thread: > > > > if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) { > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > ThreadTable::add_thread(tid, thread); > > } > > > > Note that in ThreadsSMRSupport::remove_thread we can use the same guard, > > which covers the case the JNI attach encountered an error trying to > > create the j.l.Thread object. > > > > >> src/hotspot/share/services/threadTable.cpp > > >> 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > > >> The is_dead parameter still bothers me here. I can't make enough sense > > >> out of the template code in ConcurrentHashtable to see why we have to > > >> have it, but I'm concerned that its very existence means we perhaps > > >> should not be trying to extend CHT in this context. ?? > > > > > > My understanding is that is_dead parameter provides a mechanism for > > > ConcurrentHashtable to remove stale entries that were not explicitly > > > removed by calling ConcurrentHashTable::remove() method. > > > I think that just because in our case we don't use this mechanism doesn't > > > mean we should not use ConcurrentHashTable. > > > > Can you confirm that this usage is okay with Robbin Ehn please. He's > > back from vacation this week. > > > > >> I would still want to see what impact this has on thread > > >> startup cost, both with and without the table being initialized. > > > > > > I run a test that initializes the table by calling ThreadMXBean.get getThreadInfo(), > > > starts some threads as a worm-up, and then creates and starts 100,000 threads > > > (each thread just sleeps for 100 ms). 
In case when the thread table is enabled > > > 100,000 threads are created and started for about 15200 ms. If the thread table > > > is off the test takes about 14800 ms. Based on this information the enabled > > > thread table makes the thread startup about 2.7% slower. > > > > That doesn't sound very good. I think we may need to Claes involved to > > help investigate overall performance impact here. > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > No further code comments. > > > > I didn't look at the test in detail. > > > > Thanks, > > David > > > > > Thanks! > > > --Daniil > > > > > > > > > ?On 7/29/19, 12:53 AM, "David Holmes" wrote: > > > > > > Hi Daniil, > > > > > > Overall I think this is a reasonable approach but I would still like to > > > see some performance and footprint numbers, both to verify it fixes the > > > problem reported, and that we are not getting penalized elsewhere. > > > > > > On 25/07/2019 3:21 am, Daniil Titov wrote: > > > > Hi David, Daniel, and Serguei, > > > > > > > > Please review the new version of the fix, that makes the thread table initialization on demand and > > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > > > table is being initialized . Such threads will be found by the linear search and added to the thread table > > > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > > > > > The initialization allows the created but unpopulated, or partially > > > populated, table to be seen by other threads - is that your intention? 
> > > It seems it should be okay as the other threads will then race with the > > > initializing thread to add specific entries, and this is a concurrent > > > map so that should be functionally correct. But if so then I think you > > > can also reduce the scope of the ThreadTableCreate_lock so that it > > > covers creation of the table only, not the initial population of the table. > > > > > > I like the approach of only initializing the table when needed and using > > > that to control when the add/remove-thread code needs to update the > > > table. But I would still want to see what impact this has on thread > > > startup cost, both with and without the table being initialized. > > > > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > > as Daniel suggested. > > > > > > Not sure it's best to combine these, but if they are limited to the > > > changes in management.cpp only then that may be okay. It helps to be > > > able to focus on the table related changes without being distracted by > > > other optimizations. > > > > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > > > to strip it of the all functionality that is not required in the thread table case. > > > > > > The revised version seems better in that regard. But I still have a > > > concern, see below. > > > > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > > > growing the thread table when required. > > > > > > Yes but why? Why can't this table be grown on demand by the thread that > > > is doing the addition? 
For other tables we may have to delegate to the > > > service thread because the current thread cannot perform the action, or > > > it doesn't want to perform it at the time the need for the resize is > > > detected (e.g. its detected at a safepoint and you want the resize to > > > happen later outside the safepoint). It's not apparent to me that such > > > restrictions apply here. > > > > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > > > already has ConcurrentHashTable doesn't seem reasonable for me. > > > > > > Ok. > > > > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > > > > > Some specific code comments: > > > > > > src/hotspot/share/runtime/mutexLocker.cpp > > > > > > + def(ThreadTableCreate_lock , PaddedMutex , special, > > > false, Monitor::_safepoint_check_never); > > > > > > I think this needs to be a _safepoint_check_always lock. The table will > > > be created by regular JavaThreads and they should (nearly) always be > > > checking for safepoints if they are going to block acquiring the lock. > > > And it isn't at all obvious that the thread doing the creation can't go > > > to a safepoint whilst this lock is held. > > > > > > --- > > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > > > Nit: > > > > > > 618 JavaThread* thread = thread_at(i); > > > > > > you could reuse the new java_thread local you introduced at line 613 and > > > just rename that "new" variable to "thread" so you don't have to change > > > all other uses. > > > > > > 628 } else if (java_thread != NULL && ... > > > > > > You don't need to check != NULL here as you only get here when > > > java_thread is not NULL. 
> > > > > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > > > I think it cleaner/better to just use > > > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > non-null threadObj. > > > > > > --- > > > > > > src/hotspot/share/services/management.cpp > > > > > > 1323 if (THREAD->is_Java_thread()) { > > > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > > > > > These calls can only be made on a JavaThread so this be simplified to > > > remove the is_Java_thread() call. Similarly in other places. > > > > > > --- > > > > > > src/hotspot/share/services/threadTable.cpp > > > > > > 55 class ThreadTableEntry : public CHeapObj { > > > 56 private: > > > 57 jlong _tid; > > > > > > I believe hotspot style is to not indent the access modifiers in C++ > > > class declarations, so the above would just be: > > > > > > 55 class ThreadTableEntry : public CHeapObj { > > > 56 private: > > > 57 jlong _tid; > > > > > > etc. > > > > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > > > 61 _tid(tid),_java_thread(java_thread) {} > > > > > > line 61 should be indented as it continues line 60. > > > > > > 67 class ThreadTableConfig : public AllStatic { > > > ... > > > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > > > The is_dead parameter still bothers me here. I can't make enough sense > > > out of the template code in ConcurrentHashtable to see why we have to > > > have it, but I'm concerned that its very existence means we perhaps > > > should not be trying to extend CHT in this context. ?? > > > > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > > > 116 ? 
size_log : DefaultThreadTableSizeLog; > > > > > > line 116 should be indented, though in this case I think a better layout > > > would be: > > > > > > 115 size_t start_size_log = > > > 116 size_log > DefaultThreadTableSizeLog ? size_log : > > > DefaultThreadTableSizeLog; > > > > > > 131 double ThreadTable::get_load_factor() { > > > 132 return (double)_items_count/_current_size; > > > 133 } > > > > > > Not sure that is doing what you want/expect. It will perform integer > > > division and then cast that whole integer to a double. If you want > > > double arithmetic you need: > > > > > > return ((double)_items_count)/_current_size; > > > > > > 180 jlong _tid; > > > 181 uintx _hash; > > > > > > Nit: no need for all those spaces before the variable name. > > > > > > 183 ThreadTableLookup(jlong tid) > > > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > > > > > line 184 should be indented. > > > > > > 201 ThreadGet():_return(NULL) {} > > > > > > Nit: need space after : > > > > > > 211 assert(_is_initialized, "Thread table is not initialized"); > > > 212 _has_work = false; > > > > > > line 211 is indented one space too far. > > > > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > > > > > Nit: need space after , > > > > > > 252 return _local_table->remove(thread,lookup); > > > > > > Nit: need space after , > > > > > > Thanks, > > > David > > > ------ > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! > > > > --Daniil > > > > > > > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > > > > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > > > Hi Serguei and David, > > > > > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. 

From david.holmes at oracle.com Tue Sep 17 05:17:03 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Sep 2019 15:17:03 +1000
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com> <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com>
Message-ID:

Hi Daniil,

On 17/09/2019 2:36 pm, Daniil Titov wrote:
> Hi David,
>
> The case you have described is exactly the reason why we still have code inside the
> ThreadsList::find_JavaThread_from_java_tid() method that does a linear scan and adds
> the requested thread to the thread table if it is not there (lines 614-631 below). The
> assumption is that this is quite uncommon, and even when it happens the linear scan runs
> only once per such thread.

But that is a linear scan of the current threadslist which doesn't
contain the new threads ... hmmm need to think more on this.

If the tid I'm looking for is valid it should belong to a thread that
existed before the current ThreadsList was created to process that tid.
So it would only be a problem if the tid was speculatively produced and
matched one of those new threads. But that is a race anyway. Okay.

Thanks for clarifying.

David
-----

>
> 611 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
> 612   ThreadTable::lazy_initialize(this);
> 613   JavaThread* thread = ThreadTable::find_thread_by_tid(java_tid);
> 614   if (thread == NULL) {
> 615     // If the thread is not found in the table find it
> 616     // with a linear search and add to the table.
> 617     for (uint i = 0; i < length(); i++) {
> 618       thread = thread_at(i);
> 619       oop tobj = thread->threadObj();
> 620       // Ignore the thread if it hasn't run yet, has exited
> 621       // or is starting to exit.
> 622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
> 623         MutexLocker ml(Threads_lock);
> 624         // Must be inside the lock to ensure that we don't add the thread to the table
> 625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
> 626         if (!thread->is_exiting()) {
> 627           ThreadTable::add_thread(java_tid, thread);
> 628           return thread;
> 629         }
> 630       }
> 631     }
> 632   } else if (!thread->is_exiting()) {
> 633     return thread;
> 634   }
> 635   return NULL;
> 636 }
>
> Thanks,
> Daniil
>
> On 9/16/19, 7:27 PM, "David Holmes" wrote:
>
>     Hi Daniil,
>
>     Thanks again for your perseverance on this one.
>
>     I think there is a problem with initialization of the thread table.
>     Suppose thread T1 has called ThreadsList::find_JavaThread_from_java_tid
>     and has commenced execution of ThreadTable::lazy_initialize, but not yet
>     marked _is_initialized as true.
Now two new threads (T2 and T3) are > created and start running - they aren't added to the ThreadTable yet > because it isn't initialized. Now T0 also calls > ThreadsList::find_JavaThread_from_java_tid using an updated ThreadsList > that contains T2 and T3. It also calls ThreadTable::lazy_initialize. If > _is_initialized is still false T0 will attempt initialization but once > it gets the lock it will see the table has now been initialized by T1. > It will then proceed to update the table with its own ThreadList content > - adding T2 and T3. That is all fine. But now suppose T0 initially sees > _is_initialized as true, it will do nothing in lazy_initialize and > simply return to find_JavaThread_from_java_tid. But now T2 and T3 are > missing from the ThreadTable and nothing will cause them to be added. > > More generally any ThreadsList that is created after the ThreadsList > that will be used for initialization, may contain threads that will not > be added to the table. > > Thanks, > David > > On 17/09/2019 4:18 am, Daniil Titov wrote: > > Hello, > > > > After investigating with Claes the impact of this change on the performance (thanks a lot Claes for helping with it!) the conclusion was that the impact on the thread startup time is not a blocker for this change. > > > > I also measured the memory footprint using Native Memory Tracking and results showed around 40 bytes per live thread. > > > > Please review a new version of the fix, webrev.06 [1]. Just to remind, webrev.05 was abandoned and webrev.06 [1] is webrev.04 [3] minus changes in src/hotspot/share/services/management.cpp (that were factored out to a separate issue [4]) and plus a change in ThreadsList::find_JavaThread_from_java_tid() method (please, see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock. 
> > > > src/hotspot/share/runtime/threadSMR.cpp > > > > 622 if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) { > > 623 MutexLocker ml(Threads_lock); > > 624 // Must be inside the lock to ensure that we don't add the thread to the table > > 625 // that has just passed the removal point in ThreadsSMRSupport::remove_thread() > > 626 if (!thread->is_exiting()) { > > 627 ThreadTable::add_thread(java_tid, thread); > > 628 return thread; > > 629 } > > 630 } > > > > [1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06 > > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > [3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04 > > [4] https://bugs.openjdk.java.net/browse/JDK-8229391 > > > > ?Thank you, > > Daniil > > > > > > > > > > > > ?On 8/4/19, 7:54 PM, "David Holmes" wrote: > > > > > > Hi Daniil, > > > > > > On 3/08/2019 8:16 am, Daniil Titov wrote: > > > > Hi David, > > > > > > > > Thank you for your detailed review. Please review a new version of the fix that includes > > > > the changes you suggested: > > > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > > > > - ThreadTableCreate_lock is made _safepoint_check_always; > > > > > > Okay. > > > > > > > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > > > > the thread table is changed to grow on demand by the thread that is doing the addition; > > > > > > Okay - I'm happy to get the serviceThread out of the picture here. > > > > > > > - fixed nits and formatting issues. > > > > > > Okay. > > > > > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > >>> as Daniel suggested. > > > >> Not sure it's best to combine these, but if they are limited to the > > > >> changes in management.cpp only then that may be okay. 
> > > > > > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > > > > limited to management.cpp (plus a new test) so I left them in the webrev but > > > > I also could move it in the separate issue if required. > > > > > > I'd prefer this part of be separated out, but won't insist. Let's see if > > > Dan or Serguei have a strong opinion. > > > > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > > I think it cleaner/better to just use > > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > > > non-null threadObj. > > > > > > > > I had to leave this code unchanged since it turned out the threadObj is null > > > > when VM is destroyed: > > > > > > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > > > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > > > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > > > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > > > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > > > > C [libjli.so+0x4333] JavaMain+0x2c3 > > > > C [libjli.so+0x8159] ThreadJavaMain+0x9 > > > > > > This is actually nothing to do with the VM being destroyed, but is an > > > issue with JNI_AttachCurrentThread and its interaction with the > > > ThreadSMR iterators. The attach process is: > > > - create JavaThread > > > - mark as "is attaching via jni" > > > - add to ThreadsList > > > - create java.lang.Thread object (you can only execute Java code after > > > you are attached) > > > - mark as "attach completed" > > > > > > So while a thread "is attaching" it will be seen by the ThreadSMR thread > > > iterator but will have a NULL java.lang.Thread object. 
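The attach sequence above can be modelled with a small toy, sketched here under stated assumptions: ToyVM, ToyJavaThread, ToyThreadObj, begin_attach and complete_attach are hypothetical names, and nothing here is the real JNI or ThreadSMR code. It shows the window in which a thread is already on the list but has no thread object yet, and one possible way to tolerate it: add_thread skips threads that are still attaching.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Toy stand-in for the java.lang.Thread object; only the id matters here.
struct ToyThreadObj {
  int64_t tid;
};

// Toy stand-in for JavaThread (assumption).
struct ToyJavaThread {
  bool attaching_via_jni = false;
  std::unique_ptr<ToyThreadObj> thread_obj;  // null until it is created
};

struct ToyVM {
  std::vector<ToyJavaThread*> threads_list;           // what iterators see
  std::unordered_map<int64_t, ToyJavaThread*> table;  // tid -> thread

  // Puts the thread on the list. Only threads that are not mid-attach,
  // and therefore already have a thread object, go into the table.
  void add_thread(ToyJavaThread* t) {
    threads_list.push_back(t);
    if (!t->attaching_via_jni) {
      assert(t->thread_obj != nullptr);
      table[t->thread_obj->tid] = t;
    }
  }

  // First half of the attach: mark as attaching, then add to the list.
  void begin_attach(ToyJavaThread* t) {
    t->attaching_via_jni = true;
    add_thread(t);
  }

  // Second half: create the thread object, then mark the attach completed.
  void complete_attach(ToyJavaThread* t, int64_t tid) {
    t->thread_obj.reset(new ToyThreadObj{tid});
    t->attaching_via_jni = false;
  }
};
```

Between begin_attach and complete_attach the thread is visible through threads_list while thread_obj is still null, which is the state the crash trace above ran into. In this toy the attached thread never enters the table on its own; it would have to be picked up later by some other path, such as a fallback scan.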
> > >
> > > We special-case attaching threads in a number of places in the VM and I
> > > think we should be explicitly doing something here to filter out
> > > attaching threads, rather than just being tolerant of a NULL j.l.Thread
> > > object. Specifically in ThreadsSMRSupport::add_thread:
> > >
> > >   if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
> > >     jlong tid = java_lang_Thread::thread_id(thread->threadObj());
> > >     ThreadTable::add_thread(tid, thread);
> > >   }
> > >
> > > Note that in ThreadsSMRSupport::remove_thread we can use the same guard,
> > > which covers the case where the JNI attach encountered an error trying to
> > > create the j.l.Thread object.
> > >
> > > >> src/hotspot/share/services/threadTable.cpp
> > > >> 71 static uintx get_hash(Value const& value, bool* is_dead) {
> > > >
> > > >> The is_dead parameter still bothers me here. I can't make enough sense
> > > >> out of the template code in ConcurrentHashTable to see why we have to
> > > >> have it, but I'm concerned that its very existence means we perhaps
> > > >> should not be trying to extend CHT in this context. ??
> > > >
> > > > My understanding is that the is_dead parameter provides a mechanism for
> > > > ConcurrentHashTable to remove stale entries that were not explicitly
> > > > removed by calling the ConcurrentHashTable::remove() method.
> > > > I think that just because in our case we don't use this mechanism doesn't
> > > > mean we should not use ConcurrentHashTable.
> > >
> > > Can you confirm that this usage is okay with Robbin Ehn, please? He's
> > > back from vacation this week.
> > >
> > > >> I would still want to see what impact this has on thread
> > > >> startup cost, both with and without the table being initialized.
> > > >
> > > > I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
> > > > starts some threads as a warm-up, and then creates and starts 100,000 threads
> > > > (each thread just sleeps for 100 ms).
> > > > When the thread table is enabled,
> > > > creating and starting 100,000 threads takes about 15200 ms. If the thread table
> > > > is off, the test takes about 14800 ms. Based on this information, the enabled
> > > > thread table makes thread startup about 2.7% slower.
> > >
> > > That doesn't sound very good. I think we may need to get Claes involved to
> > > help investigate the overall performance impact here.
> > >
> > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > >
> > > No further code comments.
> > >
> > > I didn't look at the test in detail.
> > >
> > > Thanks,
> > > David
> > >
> > > > Thanks!
> > > > --Daniil
> > > >
> > > >
> > > > On 7/29/19, 12:53 AM, "David Holmes" wrote:
> > > >
> > > > Hi Daniil,
> > > >
> > > > Overall I think this is a reasonable approach but I would still like to
> > > > see some performance and footprint numbers, both to verify it fixes the
> > > > problem reported, and that we are not getting penalized elsewhere.
> > > >
> > > > On 25/07/2019 3:21 am, Daniil Titov wrote:
> > > > > Hi David, Daniel, and Serguei,
> > > > >
> > > > > Please review the new version of the fix, which makes the thread table initialization on demand and
> > > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At creation time the thread table
> > > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock
> > > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread
> > > > > table is being initialized. Such threads will be found by the linear search and added to the thread table
> > > > > later, in ThreadsList::find_JavaThread_from_java_tid().
> > > >
> > > > The initialization allows the created but unpopulated, or partially
> > > > populated, table to be seen by other threads - is that your intention?
> > > > It seems it should be okay as the other threads will then race with the > > > > initializing thread to add specific entries, and this is a concurrent > > > > map so that should be functionally correct. But if so then I think you > > > > can also reduce the scope of the ThreadTableCreate_lock so that it > > > > covers creation of the table only, not the initial population of the table. > > > > > > > > I like the approach of only initializing the table when needed and using > > > > that to control when the add/remove-thread code needs to update the > > > > table. But I would still want to see what impact this has on thread > > > > startup cost, both with and without the table being initialized. > > > > > > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > > > as Daniel suggested. > > > > > > > > Not sure it's best to combine these, but if they are limited to the > > > > changes in management.cpp only then that may be okay. It helps to be > > > > able to focus on the table related changes without being distracted by > > > > other optimizations. > > > > > > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > > > > to strip it of the all functionality that is not required in the thread table case. > > > > > > > > The revised version seems better in that regard. But I still have a > > > > concern, see below. > > > > > > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > > > > growing the thread table when required. > > > > > > > > Yes but why? Why can't this table be grown on demand by the thread that > > > > is doing the addition? 
For other tables we may have to delegate to the > > > > service thread because the current thread cannot perform the action, or > > > > it doesn't want to perform it at the time the need for the resize is > > > > detected (e.g. its detected at a safepoint and you want the resize to > > > > happen later outside the safepoint). It's not apparent to me that such > > > > restrictions apply here. > > > > > > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > > > > already has ConcurrentHashTable doesn't seem reasonable for me. > > > > > > > > Ok. > > > > > > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > > > > > > > Some specific code comments: > > > > > > > > src/hotspot/share/runtime/mutexLocker.cpp > > > > > > > > + def(ThreadTableCreate_lock , PaddedMutex , special, > > > > false, Monitor::_safepoint_check_never); > > > > > > > > I think this needs to be a _safepoint_check_always lock. The table will > > > > be created by regular JavaThreads and they should (nearly) always be > > > > checking for safepoints if they are going to block acquiring the lock. > > > > And it isn't at all obvious that the thread doing the creation can't go > > > > to a safepoint whilst this lock is held. > > > > > > > > --- > > > > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > > > > > Nit: > > > > > > > > 618 JavaThread* thread = thread_at(i); > > > > > > > > you could reuse the new java_thread local you introduced at line 613 and > > > > just rename that "new" variable to "thread" so you don't have to change > > > > all other uses. > > > > > > > > 628 } else if (java_thread != NULL && ... 
> > > > > > > > You don't need to check != NULL here as you only get here when > > > > java_thread is not NULL. > > > > > > > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > > > > > I think it cleaner/better to just use > > > > > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > > non-null threadObj. > > > > > > > > --- > > > > > > > > src/hotspot/share/services/management.cpp > > > > > > > > 1323 if (THREAD->is_Java_thread()) { > > > > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > > > > > > > These calls can only be made on a JavaThread so this be simplified to > > > > remove the is_Java_thread() call. Similarly in other places. > > > > > > > > --- > > > > > > > > src/hotspot/share/services/threadTable.cpp > > > > > > > > 55 class ThreadTableEntry : public CHeapObj { > > > > 56 private: > > > > 57 jlong _tid; > > > > > > > > I believe hotspot style is to not indent the access modifiers in C++ > > > > class declarations, so the above would just be: > > > > > > > > 55 class ThreadTableEntry : public CHeapObj { > > > > 56 private: > > > > 57 jlong _tid; > > > > > > > > etc. > > > > > > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > > > > 61 _tid(tid),_java_thread(java_thread) {} > > > > > > > > line 61 should be indented as it continues line 60. > > > > > > > > 67 class ThreadTableConfig : public AllStatic { > > > > ... > > > > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > > > > > The is_dead parameter still bothers me here. I can't make enough sense > > > > out of the template code in ConcurrentHashtable to see why we have to > > > > have it, but I'm concerned that its very existence means we perhaps > > > > should not be trying to extend CHT in this context. ?? 
> > > > > > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > > > > 116 ? size_log : DefaultThreadTableSizeLog; > > > > > > > > line 116 should be indented, though in this case I think a better layout > > > > would be: > > > > > > > > 115 size_t start_size_log = > > > > 116 size_log > DefaultThreadTableSizeLog ? size_log : > > > > DefaultThreadTableSizeLog; > > > > > > > > 131 double ThreadTable::get_load_factor() { > > > > 132 return (double)_items_count/_current_size; > > > > 133 } > > > > > > > > Not sure that is doing what you want/expect. It will perform integer > > > > division and then cast that whole integer to a double. If you want > > > > double arithmetic you need: > > > > > > > > return ((double)_items_count)/_current_size; > > > > > > > > 180 jlong _tid; > > > > 181 uintx _hash; > > > > > > > > Nit: no need for all those spaces before the variable name. > > > > > > > > 183 ThreadTableLookup(jlong tid) > > > > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > > > > > > > line 184 should be indented. > > > > > > > > 201 ThreadGet():_return(NULL) {} > > > > > > > > Nit: need space after : > > > > > > > > 211 assert(_is_initialized, "Thread table is not initialized"); > > > > 212 _has_work = false; > > > > > > > > line 211 is indented one space too far. > > > > > > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > > > > > > > Nit: need space after , > > > > > > > > 252 return _local_table->remove(thread,lookup); > > > > > > > > Nit: need space after , > > > > > > > > Thanks, > > > > David > > > > ------ > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > > > Thanks! > > > > > --Daniil > > > > > > > > > > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. 
Daugherty" wrote: > > > > > > > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > > > > Hi Serguei and David, > > > > > > > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. > > > > > > > > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. > > > > > > > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > > > > > > > I have the same concerns as David H. about this new ThreadTable. > > > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > > > > in src/hotspot/share/services/management.cpp so I think that table > > > > > needs to enabled and populated only if it is going to be used. > > > > > > > > > > I've taken a look at the webrev below and I see that David has > > > > > followed up with additional comments. Before I do a crawl through > > > > > code review for this, I would like to see the ThreadTable stuff > > > > > made optional and David's other comments addressed. > > > > > > > > > > Another possible optimization is for callers of > > > > > find_JavaThread_from_java_tid() to save the calling thread's > > > > > tid value before they loop and if the current tid == saved_tid > > > > > then use the current JavaThread* instead of calling > > > > > find_JavaThread_from_java_tid() to get the JavaThread*. 
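The caller-side optimization suggested above might look roughly like the following sketch. ToyThread, ToyLookup, find_by_tid and resolve_all are hypothetical names used for illustration only; the real callers live in management.cpp and are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy stand-in for JavaThread; only the id matters here (assumption).
struct ToyThread {
  int64_t tid;
};

// Toy stand-in for the tid lookup; counts how often it is consulted.
struct ToyLookup {
  std::unordered_map<int64_t, ToyThread*> by_tid;
  int lookups = 0;

  // Returns the thread registered under tid, or nullptr if there is none.
  ToyThread* find_by_tid(int64_t tid) {
    ++lookups;
    auto it = by_tid.find(tid);
    return it == by_tid.end() ? nullptr : it->second;
  }
};

// Resolves every id in ids. The calling thread's own id is saved once
// before the loop and short-circuits the lookup when it matches.
std::vector<ToyThread*> resolve_all(ToyLookup& lookup, ToyThread* current,
                                    const std::vector<int64_t>& ids) {
  const int64_t saved_tid = current->tid;
  std::vector<ToyThread*> out;
  out.reserve(ids.size());
  for (int64_t tid : ids) {
    out.push_back(tid == saved_tid ? current : lookup.find_by_tid(tid));
  }
  return out;
}
```

For a request that names only the calling thread, this path performs no lookup at all, which is the common "ask about myself" case the suggestion targets.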
> > > > > > > > > > Dan > > > > > > > > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > > > > > Thanks! > > > > > > --Daniil > > > > > > > > > > > > From: > > > > > > Organization: Oracle Corporation > > > > > > Date: Friday, June 28, 2019 at 7:56 PM > > > > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > > > > > > > Hi Daniil, > > > > > > > > > > > > I have several quick comments. > > > > > > > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. > > > > > > > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > > > > 619 // to the thread table. 
> > > > > > 620 for (uint i = 0; i < length(); i++) { > > > > > > 621 JavaThread* thread = thread_at(i); > > > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > 623 ThreadTable::add_thread(java_tid, thread); > > > > > > 624 return thread; > > > > > > 625 } > > > > > > 626 } > > > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > 628 return java_thread; > > > > > > 629 } > > > > > > 630 return NULL; > > > > > > 631 } > > > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > > > > 633 oop tobj = java_thread->threadObj(); > > > > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > > > > 635 // or is starting to exit. > > > > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > > > > 638 } > > > > > > > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). > > > > > > > > > > > > A space is missed after the comma: > > > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > > > > > > > An empty line is needed before L632. > > > > > > > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > > > > It'd better to list parameters in the opposite order. > > > > > > > > > > > > The call to is_valid_java_thread() is confusing: > > > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? 
> > > > > > > > > > > > > > > > > > Thanks, > > > > > > Serguei > > > > > > > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > > > > > > > Hi Daniil, > > > > > > > > > > > > The definition and use of this hashtable (yet another hashtable > > > > > > implementation!) will need careful examination. We have to be concerned > > > > > > about the cost of maintaining it when it may never even be queried. You > > > > > > would need to look at footprint cost and performance impact. > > > > > > > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > > > > next few days. I will try to look at this asap next week, but we will > > > > > > need a lot more data on it. > > > > > > > > > > > > Thanks, > > > > > > David > > > > > > > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > > > > in the thread table. > > > > > > > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. > > > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > > > > > Thanks! 
> > > > > >
> > > > > > Best regards,
> > > > > > Daniil

From serguei.spitsyn at oracle.com Tue Sep 17 08:53:54 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Sep 2019 01:53:54 -0700
Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com>
Message-ID: <0105ea55-9d9c-ca09-53af-3e9863e78e95@oracle.com>

An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Tue Sep 17 09:04:42 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Sep 2019 02:04:42 -0700
Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To:
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Tue Sep 17 09:10:10 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Sep 2019 02:10:10 -0700
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com> <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com>
Message-ID: <0b307a92-5b5d-bd36-a128-99af6d0f3b1b@oracle.com>

Hi Daniil,

On 9/16/19 21:36, Daniil Titov wrote:
> Hi David,
>
> The case you have described is exactly the reason why we still have code inside the
> ThreadsList::find_JavaThread_from_java_tid() method that does a linear scan and adds
> the requested thread to the thread table if it is not there (lines 614-631 below).

I disagree because it is easy to avoid concurrent ThreadTable initialization
(please see my separate email).
The reason for this code is to cover the case of late/lazy ThreadTable initialization.

Thanks,
Serguei

> The
> assumption is that it's quite uncommon and even if this is the case the linear scan happens
> only once per such thread.
>
>
> 611 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
> 612   ThreadTable::lazy_initialize(this);
> 613   JavaThread* thread = ThreadTable::find_thread_by_tid(java_tid);
> 614   if (thread == NULL) {
> 615     // If the thread is not found in the table find it
> 616     // with a linear search and add to the table.
> 617 for (uint i = 0; i < length(); i++) { > 618 thread = thread_at(i); > 619 oop tobj = thread->threadObj(); > 620 // Ignore the thread if it hasn't run yet, has exited > 621 // or is starting to exit. > 622 if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) { > 623 MutexLocker ml(Threads_lock); > 624 // Must be inside the lock to ensure that we don't add the thread to the table > 625 // that has just passed the removal point in ThreadsSMRSupport::remove_thread() > 626 if (!thread->is_exiting()) { > 627 ThreadTable::add_thread(java_tid, thread); > 628 return thread; > 629 } > 630 } > 631 } > 632 } else if (!thread->is_exiting()) { > 633 return thread; > 634 } > 635 return NULL; > 636 } > > Thanks, > Daniil > > ?On 9/16/19, 7:27 PM, "David Holmes" wrote: > > Hi Daniil, > > Thanks again for your perseverance on this one. > > I think there is a problem with initialization of the thread table. > Suppose thread T1 has called ThreadsList::find_JavaThread_from_java_tid > and has commenced execution of ThreadTable::lazy_initialize, but not yet > marked _is_initialized as true. Now two new threads (T2 and T3) are > created and start running - they aren't added to the ThreadTable yet > because it isn't initialized. Now T0 also calls > ThreadsList::find_JavaThread_from_java_tid using an updated ThreadsList > that contains T2 and T3. It also calls ThreadTable::lazy_initialize. If > _is_initialized is still false T0 will attempt initialization but once > it gets the lock it will see the table has now been initialized by T1. > It will then proceed to update the table with its own ThreadList content > - adding T2 and T3. That is all fine. But now suppose T0 initially sees > _is_initialized as true, it will do nothing in lazy_initialize and > simply return to find_JavaThread_from_java_tid. But now T2 and T3 are > missing from the ThreadTable and nothing will cause them to be added. 
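The interleaving described above can be replayed deterministically in a single-threaded toy, sketched here under stated assumptions. ToyThread, ToyTable and find_from_tid are hypothetical names, the "race" is simply modelled by not adding T2 and T3 to the table when they are created, and no real locking or ThreadsList snapshotting is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy stand-in for JavaThread; only the id matters here (assumption).
struct ToyThread {
  int64_t tid;
};

// Toy stand-in for ThreadTable with one-shot lazy initialization.
class ToyTable {
 public:
  // Populates the table from one snapshot of the thread list.
  // Calls made after the first one are no-ops.
  void lazy_initialize(const std::vector<ToyThread*>& snapshot) {
    if (initialized_) {
      return;
    }
    for (ToyThread* t : snapshot) {
      table_[t->tid] = t;
    }
    initialized_ = true;
  }

  // Returns the thread registered under tid, or nullptr if there is none.
  ToyThread* find(int64_t tid) const {
    auto it = table_.find(tid);
    return it == table_.end() ? nullptr : it->second;
  }

  void add(ToyThread* t) { table_[t->tid] = t; }
  std::size_t size() const { return table_.size(); }

 private:
  bool initialized_ = false;
  std::unordered_map<int64_t, ToyThread*> table_;
};

// Table lookup first; on a miss, fall back to a linear scan of the
// caller's list and cache the hit in the table.
ToyThread* find_from_tid(ToyTable& table, const std::vector<ToyThread*>& list,
                         int64_t tid) {
  table.lazy_initialize(list);
  if (ToyThread* hit = table.find(tid)) {
    return hit;
  }
  for (ToyThread* t : list) {
    if (t->tid == tid) {
      table.add(t);
      return t;
    }
  }
  return nullptr;
}
```

Initializing from a list that holds only T1 and then querying with a later list that also holds T2 and T3 leaves both out of the table until the first lookup for each, at which point the fallback scan finds and caches them. The toy says nothing about which of the positions taken in this thread is preferable for the real VM.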
> > More generally any ThreadsList that is created after the ThreadsList > that will be used for initialization, may contain threads that will not > be added to the table. > > Thanks, > David > > On 17/09/2019 4:18 am, Daniil Titov wrote: > > Hello, > > > > After investigating with Claes the impact of this change on the performance (thanks a lot Claes for helping with it!) the conclusion was that the impact on the thread startup time is not a blocker for this change. > > > > I also measured the memory footprint using Native Memory Tracking and results showed around 40 bytes per live thread. > > > > Please review a new version of the fix, webrev.06 [1]. Just to remind, webrev.05 was abandoned and webrev.06 [1] is webrev.04 [3] minus changes in src/hotspot/share/services/management.cpp (that were factored out to a separate issue [4]) and plus a change in ThreadsList::find_JavaThread_from_java_tid() method (please, see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock. 
> > > > src/hotspot/share/runtime/threadSMR.cpp > > > > 622 if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) { > > 623 MutexLocker ml(Threads_lock); > > 624 // Must be inside the lock to ensure that we don't add the thread to the table > > 625 // that has just passed the removal point in ThreadsSMRSupport::remove_thread() > > 626 if (!thread->is_exiting()) { > > 627 ThreadTable::add_thread(java_tid, thread); > > 628 return thread; > > 629 } > > 630 } > > > > [1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06 > > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > [3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04 > > [4] https://bugs.openjdk.java.net/browse/JDK-8229391 > > > > ?Thank you, > > Daniil > > > > > > > > > > > > ?On 8/4/19, 7:54 PM, "David Holmes" wrote: > > > > > > Hi Daniil, > > > > > > On 3/08/2019 8:16 am, Daniil Titov wrote: > > > > Hi David, > > > > > > > > Thank you for your detailed review. Please review a new version of the fix that includes > > > > the changes you suggested: > > > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > > > > - ThreadTableCreate_lock is made _safepoint_check_always; > > > > > > Okay. > > > > > > > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > > > > the thread table is changed to grow on demand by the thread that is doing the addition; > > > > > > Okay - I'm happy to get the serviceThread out of the picture here. > > > > > > > - fixed nits and formatting issues. > > > > > > Okay. > > > > > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > >>> as Daniel suggested. > > > >> Not sure it's best to combine these, but if they are limited to the > > > >> changes in management.cpp only then that may be okay. 
> > > > > > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > > > > limited to management.cpp (plus a new test) so I left them in the webrev but > > > > I also could move it in the separate issue if required. > > > > > > I'd prefer this part of be separated out, but won't insist. Let's see if > > > Dan or Serguei have a strong opinion. > > > > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > > I think it cleaner/better to just use > > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > > > non-null threadObj. > > > > > > > > I had to leave this code unchanged since it turned out the threadObj is null > > > > when VM is destroyed: > > > > > > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > > > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > > > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > > > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > > > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > > > > C [libjli.so+0x4333] JavaMain+0x2c3 > > > > C [libjli.so+0x8159] ThreadJavaMain+0x9 > > > > > > This is actually nothing to do with the VM being destroyed, but is an > > > issue with JNI_AttachCurrentThread and its interaction with the > > > ThreadSMR iterators. The attach process is: > > > - create JavaThread > > > - mark as "is attaching via jni" > > > - add to ThreadsList > > > - create java.lang.Thread object (you can only execute Java code after > > > you are attached) > > > - mark as "attach completed" > > > > > > So while a thread "is attaching" it will be seen by the ThreadSMR thread > > > iterator but will have a NULL java.lang.Thread object. 
> > >
> > > We special-case attaching threads in a number of places in the VM and I
> > > think we should be explicitly doing something here to filter out
> > > attaching threads, rather than just being tolerant of a NULL j.l.Thread
> > > object. Specifically in ThreadsSMRSupport::add_thread:
> > >
> > > if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
> > >   jlong tid = java_lang_Thread::thread_id(thread->threadObj());
> > >   ThreadTable::add_thread(tid, thread);
> > > }
> > >
> > > Note that in ThreadsSMRSupport::remove_thread we can use the same guard,
> > > which covers the case where the JNI attach encountered an error trying to
> > > create the j.l.Thread object.
> > >
> > > >> src/hotspot/share/services/threadTable.cpp
> > > >> 71 static uintx get_hash(Value const& value, bool* is_dead) {
> > > >
> > > >> The is_dead parameter still bothers me here. I can't make enough sense
> > > >> out of the template code in ConcurrentHashtable to see why we have to
> > > >> have it, but I'm concerned that its very existence means we perhaps
> > > >> should not be trying to extend CHT in this context. ??
> > > >
> > > > My understanding is that the is_dead parameter provides a mechanism for
> > > > ConcurrentHashtable to remove stale entries that were not explicitly
> > > > removed by calling the ConcurrentHashTable::remove() method.
> > > > I think that just because in our case we don't use this mechanism doesn't
> > > > mean we should not use ConcurrentHashTable.
> > >
> > > Can you confirm that this usage is okay with Robbin Ehn please. He's
> > > back from vacation this week.
> > >
> > > >> I would still want to see what impact this has on thread
> > > >> startup cost, both with and without the table being initialized.
> > > >
> > > > I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
> > > > starts some threads as a warm-up, and then creates and starts 100,000 threads
> > > > (each thread just sleeps for 100 ms). When the thread table is enabled,
> > > > 100,000 threads are created and started in about 15200 ms. If the thread table
> > > > is off the test takes about 14800 ms. Based on this information the enabled
> > > > thread table makes the thread startup about 2.7% slower.
> > >
> > > That doesn't sound very good. I think we may need to get Claes involved to
> > > help investigate overall performance impact here.
> > >
> > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > >
> > > No further code comments.
> > >
> > > I didn't look at the test in detail.
> > >
> > > Thanks,
> > > David
> > >
> > > > Thanks!
> > > > --Daniil
> > > >
> > > > On 7/29/19, 12:53 AM, "David Holmes" wrote:
> > > >
> > > > Hi Daniil,
> > > >
> > > > Overall I think this is a reasonable approach but I would still like to
> > > > see some performance and footprint numbers, both to verify it fixes the
> > > > problem reported, and that we are not getting penalized elsewhere.
> > > >
> > > > On 25/07/2019 3:21 am, Daniil Titov wrote:
> > > > > Hi David, Daniel, and Serguei,
> > > > >
> > > > > Please review the new version of the fix, that makes the thread table initialization on demand and
> > > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table
> > > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock
> > > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread
> > > > > table is being initialized. Such threads will be found by the linear search and added to the thread table
> > > > > later, in ThreadsList::find_JavaThread_from_java_tid().
> > > >
> > > > The initialization allows the created but unpopulated, or partially
> > > > populated, table to be seen by other threads - is that your intention?
> > > > It seems it should be okay as the other threads will then race with the
> > > > initializing thread to add specific entries, and this is a concurrent
> > > > map so that should be functionally correct. But if so then I think you
> > > > can also reduce the scope of the ThreadTableCreate_lock so that it
> > > > covers creation of the table only, not the initial population of the table.
> > > >
> > > > I like the approach of only initializing the table when needed and using
> > > > that to control when the add/remove-thread code needs to update the
> > > > table. But I would still want to see what impact this has on thread
> > > > startup cost, both with and without the table being initialized.
> > > >
> > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
> > > > > as Daniel suggested.
> > > >
> > > > Not sure it's best to combine these, but if they are limited to the
> > > > changes in management.cpp only then that may be okay. It helps to be
> > > > able to focus on the table related changes without being distracted by
> > > > other optimizations.
> > > >
> > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried
> > > > > to strip it of all the functionality that is not required in the thread table case.
> > > >
> > > > The revised version seems better in that regard. But I still have a
> > > > concern, see below.
> > > >
> > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid
> > > > > reserving excessive memory a priori or deteriorating lookup times. The ServiceThread is responsible for
> > > > > growing the thread table when required.
> > > >
> > > > Yes but why? Why can't this table be grown on demand by the thread that
> > > > is doing the addition? For other tables we may have to delegate to the
> > > > service thread because the current thread cannot perform the action, or
> > > > it doesn't want to perform it at the time the need for the resize is
> > > > detected (e.g. it's detected at a safepoint and you want the resize to
> > > > happen later outside the safepoint). It's not apparent to me that such
> > > > restrictions apply here.
> > > >
> > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation
> > > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make
> > > > > the backporting more complicated, however, adding a new implementation of the hash table in Java 14 while it
> > > > > already has ConcurrentHashTable doesn't seem reasonable to me.
> > > >
> > > > Ok.
> > > >
> > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03
> > > >
> > > > Some specific code comments:
> > > >
> > > > src/hotspot/share/runtime/mutexLocker.cpp
> > > >
> > > > + def(ThreadTableCreate_lock , PaddedMutex , special,
> > > > false, Monitor::_safepoint_check_never);
> > > >
> > > > I think this needs to be a _safepoint_check_always lock. The table will
> > > > be created by regular JavaThreads and they should (nearly) always be
> > > > checking for safepoints if they are going to block acquiring the lock.
> > > > And it isn't at all obvious that the thread doing the creation can't go
> > > > to a safepoint whilst this lock is held.
> > > >
> > > > ---
> > > >
> > > > src/hotspot/share/runtime/threadSMR.cpp
> > > >
> > > > Nit:
> > > >
> > > > 618 JavaThread* thread = thread_at(i);
> > > >
> > > > you could reuse the new java_thread local you introduced at line 613 and
> > > > just rename that "new" variable to "thread" so you don't have to change
> > > > all other uses.
> > > >
> > > > 628 } else if (java_thread != NULL && ...
> > > >
> > > > You don't need to check != NULL here as you only get here when
> > > > java_thread is not NULL.
> > > >
> > > > 755 jlong tid = SharedRuntime::get_java_tid(thread);
> > > > 926 jlong tid = SharedRuntime::get_java_tid(thread);
> > > >
> > > > I think it is cleaner/better to just use
> > > >
> > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj());
> > > >
> > > > as we know thread is not NULL, it is a JavaThread and it has to have a
> > > > non-null threadObj.
> > > >
> > > > ---
> > > >
> > > > src/hotspot/share/services/management.cpp
> > > >
> > > > 1323 if (THREAD->is_Java_thread()) {
> > > > 1324   JavaThread* current_thread = (JavaThread*)THREAD;
> > > >
> > > > These calls can only be made on a JavaThread so this can be simplified to
> > > > remove the is_Java_thread() call. Similarly in other places.
> > > >
> > > > ---
> > > >
> > > > src/hotspot/share/services/threadTable.cpp
> > > >
> > > > 55 class ThreadTableEntry : public CHeapObj {
> > > > 56   private:
> > > > 57     jlong _tid;
> > > >
> > > > I believe hotspot style is to not indent the access modifiers in C++
> > > > class declarations, so the above would just be:
> > > >
> > > > 55 class ThreadTableEntry : public CHeapObj {
> > > > 56 private:
> > > > 57   jlong _tid;
> > > >
> > > > etc.
> > > >
> > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) :
> > > > 61 _tid(tid),_java_thread(java_thread) {}
> > > >
> > > > line 61 should be indented as it continues line 60.
> > > >
> > > > 67 class ThreadTableConfig : public AllStatic {
> > > > ...
> > > > 71 static uintx get_hash(Value const& value, bool* is_dead) {
> > > >
> > > > The is_dead parameter still bothers me here. I can't make enough sense
> > > > out of the template code in ConcurrentHashtable to see why we have to
> > > > have it, but I'm concerned that its very existence means we perhaps
> > > > should not be trying to extend CHT in this context. ??
> > > >
> > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog
> > > > 116 ? size_log : DefaultThreadTableSizeLog;
> > > >
> > > > line 116 should be indented, though in this case I think a better layout
> > > > would be:
> > > >
> > > > 115 size_t start_size_log =
> > > > 116     size_log > DefaultThreadTableSizeLog ? size_log : DefaultThreadTableSizeLog;
> > > >
> > > > 131 double ThreadTable::get_load_factor() {
> > > > 132   return (double)_items_count/_current_size;
> > > > 133 }
> > > >
> > > > Not sure that is doing what you want/expect. It will perform integer
> > > > division and then cast that whole integer to a double. If you want
> > > > double arithmetic you need:
> > > >
> > > > return ((double)_items_count)/_current_size;
> > > >
> > > > 180 jlong _tid;
> > > > 181 uintx _hash;
> > > >
> > > > Nit: no need for all those spaces before the variable name.
> > > >
> > > > 183 ThreadTableLookup(jlong tid)
> > > > 184 : _tid(tid), _hash(primitive_hash(tid)) {}
> > > >
> > > > line 184 should be indented.
> > > >
> > > > 201 ThreadGet():_return(NULL) {}
> > > >
> > > > Nit: need space after :
> > > >
> > > > 211     assert(_is_initialized, "Thread table is not initialized");
> > > > 212    _has_work = false;
> > > >
> > > > line 211 is indented one space too far.
> > > >
> > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread);
> > > >
> > > > Nit: need space after ,
> > > >
> > > > 252 return _local_table->remove(thread,lookup);
> > > >
> > > > Nit: need space after ,
> > > >
> > > > Thanks,
> > > > David
> > > > ------
> > > >
> > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > > > >
> > > > > Thanks!
> > > > > --Daniil
> > > > >
> > > > > On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote:
> > > > >
> > > > > On 6/29/19 12:06 PM, Daniil Titov wrote:
> > > > > > Hi Serguei and David,
> > > > > >
> > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid.
> > > > > >
> > > > > > Please find a new version of the fix that includes the changes Serguei suggested.
> > > > > >
> > > > > > Regarding the concern about maintaining the thread table when it may never even be queried, one of
> > > > > > the options could be to add a ThreadTable::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table
> > > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag.
> > > > > >
> > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable::isEnabled
> > > > > > is on and if not then set it on and populate the thread table with all existing threads from the thread list.
> > > > >
> > > > > I have the same concerns as David H. about this new ThreadTable.
> > > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code
> > > > > in src/hotspot/share/services/management.cpp so I think that table
> > > > > needs to be enabled and populated only if it is going to be used.
> > > > >
> > > > > I've taken a look at the webrev below and I see that David has
> > > > > followed up with additional comments. Before I do a crawl through
> > > > > code review for this, I would like to see the ThreadTable stuff
> > > > > made optional and David's other comments addressed.
> > > > >
> > > > > Another possible optimization is for callers of
> > > > > find_JavaThread_from_java_tid() to save the calling thread's
> > > > > tid value before they loop and if the current tid == saved_tid
> > > > > then use the current JavaThread* instead of calling
> > > > > find_JavaThread_from_java_tid() to get the JavaThread*.
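The caller-side shortcut described above could look roughly like the following standalone C++ sketch. ToyThread, ToyThreadsList, find_from_tid and resolve are hypothetical names made up for illustration (the real callers live in management.cpp and are not reproduced here), and the lookup counter exists only so the effect of the shortcut can be observed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a JavaThread.
struct ToyThread {
  int64_t tid;
};

// Hypothetical stand-in for a ThreadsList with a tid lookup.
struct ToyThreadsList {
  std::vector<ToyThread*> threads;
  mutable int lookups = 0;  // instrumentation for this sketch only

  // Models find_JavaThread_from_java_tid(): whatever it costs, each call counts.
  ToyThread* find_from_tid(int64_t tid) const {
    ++lookups;
    for (ToyThread* t : threads) {
      if (t->tid == tid) return t;
    }
    return nullptr;
  }
};

// Resolve a batch of ids. The calling thread's tid is saved once before the
// loop; ids equal to it reuse the current thread pointer and skip the lookup.
std::vector<ToyThread*> resolve(const ToyThreadsList& list,
                                ToyThread* current,
                                const std::vector<int64_t>& ids) {
  const int64_t saved_tid = current->tid;
  std::vector<ToyThread*> result;
  result.reserve(ids.size());
  for (int64_t id : ids) {
    result.push_back(id == saved_tid ? current : list.find_from_tid(id));
  }
  return result;
}
```

The shortcut only pays off when callers commonly ask about themselves, which is the case the 8207266 thread ("can be quicker for self thread") is concerned with.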
> > > > >
> > > > > Dan
> > > > >
> > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/
> > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > > > > >
> > > > > > Thanks!
> > > > > > --Daniil
> > > > > >
> > > > > > From:
> > > > > > Organization: Oracle Corporation
> > > > > > Date: Friday, June 28, 2019 at 7:56 PM
> > > > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net"
> > > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
> > > > > >
> > > > > > Hi Daniil,
> > > > > >
> > > > > > I have several quick comments.
> > > > > >
> > > > > > The indent in the hotspot c/c++ files has to be 2, not 4.
> > > > > >
> > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html
> > > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
> > > > > > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
> > > > > > 616     if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) {
> > > > > > 617         // ThreadsSMRSupport::add_thread() is not called for the primordial
> > > > > > 618         // thread. Thus, we find this thread with a linear search and add it
> > > > > > 619         // to the thread table.
> > > > > > 620         for (uint i = 0; i < length(); i++) {
> > > > > > 621             JavaThread* thread = thread_at(i);
> > > > > > 622             if (is_valid_java_thread(java_tid,thread)) {
> > > > > > 623                 ThreadTable::add_thread(java_tid, thread);
> > > > > > 624                 return thread;
> > > > > > 625             }
> > > > > > 626         }
> > > > > > 627     } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
> > > > > > 628         return java_thread;
> > > > > > 629     }
> > > > > > 630     return NULL;
> > > > > > 631 }
> > > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) {
> > > > > > 633     oop tobj = java_thread->threadObj();
> > > > > > 634     // Ignore the thread if it hasn't run yet, has exited
> > > > > > 635     // or is starting to exit.
> > > > > > 636     return (tobj != NULL && !java_thread->is_exiting() &&
> > > > > > 637             java_tid == java_lang_Thread::thread_id(tobj));
> > > > > > 638 }
> > > > > >
> > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid);
> > > > > >
> > > > > > I'd suggest renaming find_thread() to find_thread_by_tid().
> > > > > >
> > > > > > A space is missing after the comma:
> > > > > > 622 if (is_valid_java_thread(java_tid,thread)) {
> > > > > >
> > > > > > An empty line is needed before L632.
> > > > > >
> > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me.
> > > > > > Something like 'is_alive_java_thread_with_tid()' would be better.
> > > > > > It'd be better to list parameters in the opposite order.
> > > > > >
> > > > > > The call to is_valid_java_thread() is confusing:
> > > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
> > > > > >
> > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid?
> > > > > >
> > > > > > Thanks,
> > > > > > Serguei
> > > > > >
> > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote:
> > > > > >
> > > > > > Hi Daniil,
> > > > > >
> > > > > > The definition and use of this hashtable (yet another hashtable
> > > > > > implementation!) will need careful examination. We have to be concerned
> > > > > > about the cost of maintaining it when it may never even be queried. You
> > > > > > would need to look at footprint cost and performance impact.
> > > > > >
> > > > > > Unfortunately I'm just about to board a plane and will be out for the
> > > > > > next few days. I will try to look at this asap next week, but we will
> > > > > > need a lot more data on it.
> > > > > >
> > > > > > Thanks,
> > > > > > David
> > > > > >
> > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote:
> > > > > > Please review the change that improves performance of ThreadMXBean methods returning the
> > > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable
> > > > > > to store a one-to-one mapping between the thread ids and JavaThread objects and replaces the linear
> > > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup
> > > > > > in the thread table.
> > > > > >
> > > > > > Testing: Mach5 tier1, tier2 and tier3 tests successfully passed.
> > > > > >
> > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/
> > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Best regards,
> > > > > > Daniil

From serguei.spitsyn at oracle.com Tue Sep 17 09:26:29 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Sep 2019 02:26:29 -0700
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To:
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From david.holmes at oracle.com Tue Sep 17 10:46:07 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Sep 2019 20:46:07 +1000
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <0b307a92-5b5d-bd36-a128-99af6d0f3b1b@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com> <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com> <0b307a92-5b5d-bd36-a128-99af6d0f3b1b@oracle.com>
Message-ID:

Hi Serguei,

On 17/09/2019 7:10 pm, serguei.spitsyn at oracle.com wrote:
> Hi Daniil,
>
> On 9/16/19 21:36, Daniil Titov wrote:
>> Hi David,
>>
>> The case you have described is exactly the reason why we still have code inside
>> the ThreadsList::find_JavaThread_from_java_tid() method that does a linear scan and adds
>> the requested thread to the thread table if it is not there (lines 614-631 below).
>
> I disagree because it is easy to avoid concurrent ThreadTable
> initialization (please, see my separate email).
> The reason for this code is to cover a case of late/lazy ThreadTable
> initialization.

I'm not sure I follow. With the current code if two threads are racing
to initialize the ThreadTable with ThreadsLists that contain a different
set of threads then there are two possibilities with regard to the
interleaving. Assume T1 initializes the table with its set of threads
and so finds the tid it is looking for in the table. Meanwhile T2 is
racing with the initialization logic:

- If T2 sees _is_initialized then lazy_initialize does nothing for T2,
  and the additional threads in its ThreadsList (say T3 and T4) are not
  added to the table. But the specific thread associated with the tid
  (say T3) will be found by linear search of the ThreadsList and then
  added. If any other threads come searching for T4 they too will not
  find it in the ThreadTable but instead perform the linear search of
  their ThreadsList (and add it).

- If T2 doesn't see _is_initialized at first it will try to acquire the
  lock, and eventually see _is_initialized is true, at which point it
  will try to add all of its threads to the table (so T3 and T4 will be
  added). When lazy_initialize returns, T3 will be found in the table
  and returned. If any other threads come searching for T4 they will
  also find it in the table.

With your suggested code change this second case is not possible so for
any racing initialization the lookup of any threads not in the original
ThreadsList will always result in using the linear search before adding
to the table.

Both seem correct to me.
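The two interleavings described above can be replayed with a small sequential model in standalone C++. This is a sketch under loud assumptions: ToyTable and its members are hypothetical, there is no real locking or concurrency, and the flag observed_initialized simply records what the caller's first unlocked read of _is_initialized returned. It only illustrates the bookkeeping, namely which tids end up in the table and how many linear searches are needed.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical sequential model of a lazily initialized tid table.
struct ToyTable {
  bool is_initialized = false;
  std::set<int64_t> tids;
  int linear_searches = 0;  // instrumentation for this sketch only

  // observed_initialized: result of the caller's first, unlocked flag check.
  // Case 1 (true): the caller returns at once and adds nothing.
  // Case 2 (false): the caller goes on to the locked path and adds every
  // thread from its own snapshot, even if another thread won the race.
  void lazy_initialize(const std::vector<int64_t>& snapshot,
                       bool observed_initialized) {
    if (observed_initialized) return;
    is_initialized = true;
    tids.insert(snapshot.begin(), snapshot.end());
  }

  // Models the lookup: table first, then a linear scan of the caller's
  // snapshot that adds the thread to the table when it is found.
  bool find(const std::vector<int64_t>& snapshot, int64_t tid,
            bool observed_initialized) {
    lazy_initialize(snapshot, observed_initialized);
    if (tids.count(tid) != 0) return true;
    ++linear_searches;
    for (int64_t candidate : snapshot) {
      if (candidate == tid) {
        tids.insert(tid);
        return true;
      }
    }
    return false;
  }
};
```

Replaying case 1 with snapshots {1, 2} and {1, 2, 3, 4} costs one linear search per late thread that is actually looked up; replaying case 2 costs none, at the price of inserting the whole second snapshot up front.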
Which one is more efficient will totally depend on the number of
differences between the ThreadsLists and whether the code ever tries to
look up those additional threads. If we assume racing initialization is
likely to be rare anyway (because generally one thread is in charge of
doing the monitoring) then the choice seems somewhat arbitrary.

Cheers,
David
-----

> Thanks,
> Serguei
>
>> The assumption is that it's quite uncommon and even if this is the case
>> the linear scan happens only once per such thread.
>>
>> 611 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
>> 612   ThreadTable::lazy_initialize(this);
>> 613   JavaThread* thread = ThreadTable::find_thread_by_tid(java_tid);
>> 614   if (thread == NULL) {
>> 615     // If the thread is not found in the table find it
>> 616     // with a linear search and add to the table.
>> 617     for (uint i = 0; i < length(); i++) {
>> 618       thread = thread_at(i);
>> 619       oop tobj = thread->threadObj();
>> 620       // Ignore the thread if it hasn't run yet, has exited
>> 621       // or is starting to exit.
>> 622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
>> 623         MutexLocker ml(Threads_lock);
>> 624         // Must be inside the lock to ensure that we don't add the thread to the table
>> 625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
>> 626         if (!thread->is_exiting()) {
>> 627           ThreadTable::add_thread(java_tid, thread);
>> 628           return thread;
>> 629         }
>> 630       }
>> 631     }
>> 632   } else if (!thread->is_exiting()) {
>> 633       return thread;
>> 634   }
>> 635   return NULL;
>> 636 }
>>
>> Thanks,
>> Daniil
>>
>> On 9/16/19, 7:27 PM, "David Holmes" wrote:
>>
>>     Hi Daniil,
>>
>>     Thanks again for your perseverance on this one.
>>
>>     I think there is a problem with initialization of the thread table.
>>
>>     Suppose thread T1 has called ThreadsList::find_JavaThread_from_java_tid
>>     and has commenced execution of ThreadTable::lazy_initialize, but not yet
>>     marked _is_initialized as true. Now two new threads (T2 and T3) are
>>     created and start running - they aren't added to the ThreadTable yet
>>     because it isn't initialized. Now T0 also calls
>>     ThreadsList::find_JavaThread_from_java_tid using an updated ThreadsList
>>     that contains T2 and T3. It also calls ThreadTable::lazy_initialize. If
>>     _is_initialized is still false T0 will attempt initialization but once
>>     it gets the lock it will see the table has now been initialized by T1.
>>     It will then proceed to update the table with its own ThreadsList content
>>     - adding T2 and T3. That is all fine. But now suppose T0 initially sees
>>     _is_initialized as true, it will do nothing in lazy_initialize and
>>     simply return to find_JavaThread_from_java_tid. But now T2 and T3 are
>>     missing from the ThreadTable and nothing will cause them to be added.
>>
>>     More generally any ThreadsList that is created after the ThreadsList
>>     that will be used for initialization may contain threads that will not
>>     be added to the table.
>>
>>     Thanks,
>>     David
>>
>>     On 17/09/2019 4:18 am, Daniil Titov wrote:
>>     > Hello,
>>     >
>>     > After investigating with Claes the impact of this change on the performance (thanks a lot Claes for helping with it!) the conclusion was that the impact on the thread startup time is not a blocker for this change.
>>     >
>>     > I also measured the memory footprint using Native Memory Tracking and results showed around 40 bytes per live thread.
>>     >
>>     > Please review a new version of the fix, webrev.06 [1]. As a reminder, webrev.05 was abandoned and webrev.06 [1] is webrev.04 [3] minus changes in src/hotspot/share/services/management.cpp (that were factored out to a separate issue [4]) and plus a change in ThreadsList::find_JavaThread_from_java_tid() method (please, see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock.
>>     >
>>     > src/hotspot/share/runtime/threadSMR.cpp
>>     >
>>     > 622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
>>     > 623         MutexLocker ml(Threads_lock);
>>     > 624         // Must be inside the lock to ensure that we don't add the thread to the table
>>     > 625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
>>     > 626         if (!thread->is_exiting()) {
>>     > 627           ThreadTable::add_thread(java_tid, thread);
>>     > 628           return thread;
>>     > 629         }
>>     > 630       }
>>     >
>>     > [1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06
>>     > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>     > [3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04
>>     > [4] https://bugs.openjdk.java.net/browse/JDK-8229391
>>     >
>>     > Thank you,
>>     > Daniil
>>     >
>>     > > On 8/4/19, 7:54 PM, "David Holmes" wrote:
>>     > >
>>     > > Hi Daniil,
>>     > >
>>     > > On 3/08/2019 8:16 am, Daniil Titov wrote:
>>     > > > Hi David,
>>     > > >
>>     > > > Thank you for your detailed review. Please review a new version of the fix that includes
>>     > > > the changes you suggested:
>>     > > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only;
>>     > > > - ThreadTableCreate_lock is made _safepoint_check_always;
>>     > >
>>     > > Okay.
>>     > >
>>     > > > - ServiceThread is no longer responsible for the resizing of the thread table, instead,
>>     > > >   the thread table is changed to grow on demand by the thread that is doing the addition;
>>     > >
>>     > > Okay - I'm happy to get the serviceThread out of the picture here.
>>     > >
>>     > > > - fixed nits and formatting issues.
>>     > >
>>     > > Okay.
>>     > >
>>     > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
>>     > > >>> as Daniel suggested.
>>     > > >> Not sure it's best to combine these, but if they are limited to the
>>     > > >> changes in management.cpp only then that may be okay.
>>     > > >
>>     > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is
>>     > > > limited to management.cpp (plus a new test) so I left it in the webrev, but
>>     > > > I could also move it into a separate issue if required.
>>     > >
>>     > > I'd prefer this part to be separated out, but won't insist. Let's see if
>>     > > Dan or Serguei have a strong opinion.
>>     > >
>>     > > > > src/hotspot/share/runtime/threadSMR.cpp
>>     > > > > 755 jlong tid = SharedRuntime::get_java_tid(thread);
>>     > > > > 926 jlong tid = SharedRuntime::get_java_tid(thread);
>>     > > > > I think it is cleaner/better to just use
>>     > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>     > > > > as we know thread is not NULL, it is a JavaThread and it has to have a
>>     > > > > non-null threadObj.
>>     > > >
>>     > > > I had to leave this code unchanged since it turned out the threadObj is null
>>     > > > when the VM is destroyed:
>>     > > >
>>     > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67
>>     > > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116
>>     > > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82
>>     > > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9
>>     > > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c
>>     > > > C [libjli.so+0x4333] JavaMain+0x2c3
>>     > > > C [libjli.so+0x8159] ThreadJavaMain+0x9
>>     > >
>>     > > This is actually nothing to do with the VM being destroyed, but is an
>>     > > issue with JNI_AttachCurrentThread and its interaction with the
>>     > > ThreadSMR iterators. The attach process is:
>>     > > - create JavaThread
>>     > > - mark as "is attaching via jni"
>>     > > - add to ThreadsList
>>     > > - create java.lang.Thread object (you can only execute Java code after
>>     > >   you are attached)
>>     > > - mark as "attach completed"
>>     > >
>>     > > So while a thread "is attaching" it will be seen by the ThreadSMR thread
>>     > > iterator but will have a NULL java.lang.Thread object.
>>     > >
>>     > > We special-case attaching threads in a number of places in the VM and I
>>     > > think we should be explicitly doing something here to filter out
>>     > > attaching threads, rather than just being tolerant of a NULL j.l.Thread
>>     > > object. Specifically in ThreadsSMRSupport::add_thread:
>>     > >
>>     > > if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
>>     > >   jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>     > >   ThreadTable::add_thread(tid, thread);
>>     > > }
>>     > >
>>     > > Note that in ThreadsSMRSupport::remove_thread we can use the same guard,
>>     > > which covers the case where the JNI attach encountered an error trying to
>>     > > create the j.l.Thread object.
>>     > >
>>     > > >> src/hotspot/share/services/threadTable.cpp
>>     > > >> 71 static uintx get_hash(Value const& value, bool* is_dead) {
>>     > > >
>>     > > >> The is_dead parameter still bothers me here. I can't make enough sense
>>     > > >> out of the template code in ConcurrentHashtable to see why we have to
>>     > > >> have it, but I'm concerned that its very existence means we perhaps
>>     > > >> should not be trying to extend CHT in this context. ??
>>     > > >
>>     > > > My understanding is that the is_dead parameter provides a mechanism for
>>     > > > ConcurrentHashtable to remove stale entries that were not explicitly
>>     > > > removed by calling the ConcurrentHashTable::remove() method.
>>     > > > I think that just because in our case we don't use this mechanism doesn't
>>     > > > mean we should not use ConcurrentHashTable.
>>     > >
>>     > > Can you confirm that this usage is okay with Robbin Ehn please. He's
>>     > > back from vacation this week.
>>     > >
>>     > > >> I would still want to see what impact this has on thread
>>     > > >> startup cost, both with and without the table being initialized.
>>     > > >
>>     > > > I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
>>     > > > starts some threads as a warm-up, and then creates and starts 100,000 threads
>>     > > > (each thread just sleeps for 100 ms). When the thread table is enabled,
>>     > > > 100,000 threads are created and started in about 15200 ms. If the thread table
>>     > > > is off the test takes about 14800 ms. Based on this information the enabled
>>     > > > thread table makes the thread startup about 2.7% slower.
>>     > >
>>     > > That doesn't sound very good. I think we may need to get Claes involved to
>>     > > help investigate overall performance impact here.
>>     > >
>>     > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
>>     > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>     > >
>>     > > No further code comments.
>>     > >
>>     > > I didn't look at the test in detail.
>>     > >
>>     > > Thanks,
>>     > > David
>>     > >
>>     > > > Thanks!
>>     > > > --Daniil
>>     > > >
>>     > > > On 7/29/19, 12:53 AM, "David Holmes" wrote:
>>     > > >
>>     > > > Hi Daniil,
>>     > > >
>>     > > > Overall I think this is a reasonable approach but I would still like to
>>     > > > see some performance and footprint numbers, both to verify it fixes the
>>     > > > problem reported, and that we are not getting penalized elsewhere.
>>     > > >
>>     > > > On 25/07/2019 3:21 am, Daniil Titov wrote:
>>     > > > > Hi David, Daniel, and Serguei,
>>     > > > >
>>     > > > > Please review the new version of the fix, that makes the thread table initialization on demand and
>>     > > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table
>>     > > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock
>>     > > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread
>>     > > > > table is being initialized. Such threads will be found by the linear search and added to the thread table
>>     > > > > later, in ThreadsList::find_JavaThread_from_java_tid().
>>     > > >
>>     > > > The initialization allows the created but unpopulated, or partially
>>     > > > populated, table to be seen by other threads - is that your intention?
>>     > > > It seems it should be okay as the other threads will then race with the
>>     > > > initializing thread to add specific entries, and this is a concurrent
>>     > > > map so that should be functionally correct. But if so then I think you
can also reduce the scope of the >> ThreadTableCreate_lock so that it >> ???? >????????? >????? >????? covers creation of the table only, not >> the initial population of the table. >> ???? >????????? >????? > >> ???? >????????? >????? >????? I like the approach of only initializing >> the table when needed and using >> ???? >????????? >????? >????? that to control when the >> add/remove-thread code needs to update the >> ???? >????????? >????? >????? table. But I would still want to see >> what impact this has on thread >> ???? >????????? >????? >????? startup cost, both with and without the >> table being initialized. >> ???? >????????? >????? > >> ???? >????????? >????? >????? > The change also includes additional >> optimization for some callers of find_JavaThread_from_java_tid() >> ???? >????????? >????? >????? > as Daniel suggested. >> ???? >????????? >????? > >> ???? >????????? >????? >????? Not sure it's best to combine these, but >> if they are limited to the >> ???? >????????? >????? >????? changes in management.cpp only then that >> may be okay. It helps to be >> ???? >????????? >????? >????? able to focus on the table related >> changes without being distracted by >> ???? >????????? >????? >????? other optimizations. >> ???? >????????? >????? > >> ???? >????????? >????? >????? > That is correct that >> ResolvedMethodTable was used as a blueprint for the thread table, >> however, I tried >> ???? >????????? >????? >????? > to strip it of the all functionality >> that is not required in the thread table case. >> ???? >????????? >????? > >> ???? >????????? >????? >????? The revised version seems better in that >> regard. But I still have a >> ???? >????????? >????? >????? concern, see below. >> ???? >????????? >????? > >> ???? >????????? >????? >????? > We need to have the thread table >> resizable and allow it to grow as the number of threads increases to >> avoid >> ???? >????????? >????? >????? 
> reserving excessive memory a-priori or >> deteriorating lookup times. The ServiceThread is responsible for >> ???? >????????? >????? >????? > growing the thread table when required. >> ???? >????????? >????? > >> ???? >????????? >????? >????? Yes but why? Why can't this table be >> grown on demand by the thread that >> ???? >????????? >????? >????? is doing the addition? For other tables >> we may have to delegate to the >> ???? >????????? >????? >????? service thread because the current >> thread cannot perform the action, or >> ???? >????????? >????? >????? it doesn't want to perform it at the >> time the need for the resize is >> ???? >????????? >????? >????? detected (e.g. its detected at a >> safepoint and you want the resize to >> ???? >????????? >????? >????? happen later outside the safepoint). >> It's not apparent to me that such >> ???? >????????? >????? >????? restrictions apply here. >> ???? >????????? >????? > >> ???? >????????? >????? >????? > There is no ConcurrentHashTable >> available in Java 8 and for backporting this fix to Java 8 another >> implementation >> ???? >????????? >????? >????? > of the hash table, probably originally >> suggested in the patch attached to the JBS issue, should be used.? It >> will make >> ???? >????????? >????? >????? > the backporting more complicated, >> however, adding a new Implementation of the hash table in Java 14 >> while it >> ???? >????????? >????? >????? > already has ConcurrentHashTable >> doesn't seem? reasonable for me. >> ???? >????????? >????? > >> ???? >????????? >????? >????? Ok. >> ???? >????????? >????? > >> ???? >????????? >????? >????? > Webrev: >> http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 >> ???? >????????? >????? > >> ???? >????????? >????? >????? Some specific code comments: >> ???? >????????? >????? > >> ???? >????????? >????? >????? src/hotspot/share/runtime/mutexLocker.cpp >> ???? >????????? >????? > >> ???? >????????? >????? >????? +?? def(ThreadTableCreate_lock?????? 
, >> PaddedMutex? , special, >> ???? >????????? >????? >????? false, Monitor::_safepoint_check_never); >> ???? >????????? >????? > >> ???? >????????? >????? >????? I think this needs to be a >> _safepoint_check_always lock. The table will >> ???? >????????? >????? >????? be created by regular JavaThreads and >> they should (nearly) always be >> ???? >????????? >????? >????? checking for safepoints if they are >> going to block acquiring the lock. >> ???? >????????? >????? >????? And it isn't at all obvious that the >> thread doing the creation can't go >> ???? >????????? >????? >????? to a safepoint whilst this lock is held. >> ???? >????????? >????? > >> ???? >????????? >????? >????? --- >> ???? >????????? >????? > >> ???? >????????? >????? >????? src/hotspot/share/runtime/threadSMR.cpp >> ???? >????????? >????? > >> ???? >????????? >????? >????? Nit: >> ???? >????????? >????? > >> ???? >????????? >????? >??????? 618?????? JavaThread* thread = >> thread_at(i); >> ???? >????????? >????? > >> ???? >????????? >????? >????? you could reuse the new java_thread >> local you introduced at line 613 and >> ???? >????????? >????? >????? just rename that "new" variable to >> "thread" so you don't have to change >> ???? >????????? >????? >????? all other uses. >> ???? >????????? >????? > >> ???? >????????? >????? >????? 628?? } else if (java_thread != NULL && ... >> ???? >????????? >????? > >> ???? >????????? >????? >????? You don't need to check != NULL here as >> you only get here when >> ???? >????????? >????? >????? java_thread is not NULL. >> ???? >????????? >????? > >> ???? >????????? >????? >??????? 755???? jlong tid = >> SharedRuntime::get_java_tid(thread); >> ???? >????????? >????? >??????? 926???? jlong tid = >> SharedRuntime::get_java_tid(thread); >> ???? >????????? >????? > >> ???? >????????? >????? >????? I think it cleaner/better to just use >> ???? >????????? >????? > >> ???? >????????? >????? >????? 
jlong tid = >> java_lang_Thread::thread_id(thread->threadObj()); >> ???? >????????? >????? > >> ???? >????????? >????? >????? as we know thread is not NULL, it is a >> JavaThread and it has to have a >> ???? >????????? >????? >????? non-null threadObj. >> ???? >????????? >????? > >> ???? >????????? >????? >????? --- >> ???? >????????? >????? > >> ???? >????????? >????? >????? src/hotspot/share/services/management.cpp >> ???? >????????? >????? > >> ???? >????????? >????? >????? 1323???????? if >> (THREAD->is_Java_thread()) { >> ???? >????????? >????? >????? 1324?????????? JavaThread* >> current_thread = (JavaThread*)THREAD; >> ???? >????????? >????? > >> ???? >????????? >????? >????? These calls can only be made on a >> JavaThread so this be simplified to >> ???? >????????? >????? >????? remove the is_Java_thread() call. >> Similarly in other places. >> ???? >????????? >????? > >> ???? >????????? >????? >????? --- >> ???? >????????? >????? > >> ???? >????????? >????? >????? src/hotspot/share/services/threadTable.cpp >> ???? >????????? >????? > >> ???? >????????? >????? >???????? 55 class ThreadTableEntry : public >> CHeapObj { >> ???? >????????? >????? >???????? 56?? private: >> ???? >????????? >????? >???????? 57???? jlong _tid; >> ???? >????????? >????? > >> ???? >????????? >????? >????? I believe hotspot style is to not indent >> the access modifiers in C++ >> ???? >????????? >????? >????? class declarations, so the above would >> just be: >> ???? >????????? >????? > >> ???? >????????? >????? >???????? 55 class ThreadTableEntry : public >> CHeapObj { >> ???? >????????? >????? >???????? 56 private: >> ???? >????????? >????? >???????? 57?? jlong _tid; >> ???? >????????? >????? > >> ???? >????????? >????? >????? etc. >> ???? >????????? >????? > >> ???? >????????? >????? >??????? 60???? ThreadTableEntry(jlong tid, >> JavaThread* java_thread) : >> ???? >????????? >????? >??????? 61 >> _tid(tid),_java_thread(java_thread) {} >> ???? >????????? >????? > >> ???? 
>????????? >????? >????? line 61 should be indented as it >> continues line 60. >> ???? >????????? >????? > >> ???? >????????? >????? >???????? 67 class ThreadTableConfig : public >> AllStatic { >> ???? >????????? >????? >???????? ... >> ???? >????????? >????? >???????? 71???? static uintx get_hash(Value >> const& value, bool* is_dead) { >> ???? >????????? >????? > >> ???? >????????? >????? >????? The is_dead parameter still bothers me >> here. I can't make enough sense >> ???? >????????? >????? >????? out of the template code in >> ConcurrentHashtable to see why we have to >> ???? >????????? >????? >????? have it, but I'm concerned that its very >> existence means we perhaps >> ???? >????????? >????? >????? should not be trying to extend CHT in >> this context. ?? >> ???? >????????? >????? > >> ???? >????????? >????? >??????? 115?? size_t start_size_log = size_log >> > DefaultThreadTableSizeLog >> ???? >????????? >????? >??????? 116?? ? size_log : >> DefaultThreadTableSizeLog; >> ???? >????????? >????? > >> ???? >????????? >????? >????? line 116 should be indented, though in >> this case I think a better layout >> ???? >????????? >????? >????? would be: >> ???? >????????? >????? > >> ???? >????????? >????? >??????? 115?? size_t start_size_log = >> ???? >????????? >????? >??????? 116?????? size_log > >> DefaultThreadTableSizeLog ? size_log : >> ???? >????????? >????? >????? DefaultThreadTableSizeLog; >> ???? >????????? >????? > >> ???? >????????? >????? >??????? 131 double >> ThreadTable::get_load_factor() { >> ???? >????????? >????? >??????? 132?? return >> (double)_items_count/_current_size; >> ???? >????????? >????? >??????? 133 } >> ???? >????????? >????? > >> ???? >????????? >????? >????? Not sure that is doing what you >> want/expect. It will perform integer >> ???? >????????? >????? >????? division and then cast that whole >> integer to a double. If you want >> ???? >????????? >????? >????? double arithmetic you need: >> ???? >????????? >????? > >> ???? 
>????????? >????? >????? return >> ((double)_items_count)/_current_size; >> ???? >????????? >????? > >> ???? >????????? >????? >????? 180???? jlong????????? _tid; >> ???? >????????? >????? >????? 181???? uintx???????? _hash; >> ???? >????????? >????? > >> ???? >????????? >????? >????? Nit: no need for all those spaces before >> the variable name. >> ???? >????????? >????? > >> ???? >????????? >????? >??????? 183???? ThreadTableLookup(jlong tid) >> ???? >????????? >????? >??????? 184???? : _tid(tid), >> _hash(primitive_hash(tid)) {} >> ???? >????????? >????? > >> ???? >????????? >????? >????? line 184 should be indented. >> ???? >????????? >????? > >> ???? >????????? >????? >????? 201???? ThreadGet():_return(NULL) {} >> ???? >????????? >????? > >> ???? >????????? >????? >????? Nit: need space after : >> ???? >????????? >????? > >> ???? >????????? >????? >??????? 211??? assert(_is_initialized, "Thread >> table is not initialized"); >> ???? >????????? >????? >??????? 212?? _has_work = false; >> ???? >????????? >????? > >> ???? >????????? >????? >????? line 211 is indented one space too far. >> ???? >????????? >????? > >> ???? >????????? >????? >????? 229???? ThreadTableEntry* entry = new >> ThreadTableEntry(tid,java_thread); >> ???? >????????? >????? > >> ???? >????????? >????? >????? Nit: need space after , >> ???? >????????? >????? > >> ???? >????????? >????? >????? 252?? return >> _local_table->remove(thread,lookup); >> ???? >????????? >????? > >> ???? >????????? >????? >????? Nit: need space after , >> ???? >????????? >????? > >> ???? >????????? >????? >????? Thanks, >> ???? >????????? >????? >????? David >> ???? >????????? >????? >????? ------ >> ???? >????????? >????? > >> ???? >????????? >????? >????? > Bug: >> https://bugs.openjdk.java.net/browse/JDK-8185005 >> ???? >????????? >????? >????? > >> ???? >????????? >????? >????? > Thanks! >> ???? >????????? >????? >????? > --Daniil >> ???? >????????? >????? >????? > >> ???? >????????? >????? >????? > >> ???? 
>????????? >????? >????? > ?On 7/8/19, 3:24 PM, "Daniel D. >> Daugherty" wrote: >> ???? >????????? >????? >????? > >> ???? >????????? >????? >????? >????? On 6/29/19 12:06 PM, Daniil Titov >> wrote: >> ???? >????????? >????? >????? >????? > Hi Serguei and David, >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > Serguei is right, >> ThreadTable::find_thread(java_tid) cannot? return a JavaThread with an >> unmatched java_tid. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > Please find a new version of >> the fix that includes the changes Serguei suggested. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > Regarding the concern about the >> maintaining the thread table when it may never even be queried, one of >> ???? >????????? >????? >????? >????? > the options could be to add >> ThreadTable ::isEnabled flag, set it to "false" by default, and wrap >> the calls to the thread table >> ???? >????????? >????? >????? >????? > in ThreadsSMRSupport >> add_thread() and remove_thread() methods to check this flag. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > When >> ThreadsList::find_JavaThread_from_java_tid() is called for the first >> time it could check if ThreadTable ::isEnabled >> ???? >????????? >????? >????? >????? > Is on and if not then set it on >> and populate the thread table with all existing threads from the >> thread list. >> ???? >????????? >????? >????? > >> ???? >????????? >????? >????? >????? I have the same concerns as David >> H. about this new ThreadTable. >> ???? >????????? >????? >????? > >> ThreadsList::find_JavaThread_from_java_tid() is only called from code >> ???? >????????? >????? >????? >????? in >> src/hotspot/share/services/management.cpp so I think that table >> ???? >????????? >????? >????? >????? needs to enabled and populated >> only if it is going to be used. >> ???? >????????? >????? >????? 
> >> ???? >????????? >????? >????? >????? I've taken a look at the webrev >> below and I see that David has >> ???? >????????? >????? >????? >????? followed up with additional >> comments. Before I do a crawl through >> ???? >????????? >????? >????? >????? code review for this, I would >> like to see the ThreadTable stuff >> ???? >????????? >????? >????? >????? made optional and David's other >> comments addressed. >> ???? >????????? >????? >????? > >> ???? >????????? >????? >????? >????? Another possible optimization is >> for callers of >> ???? >????????? >????? >????? >????? find_JavaThread_from_java_tid() >> to save the calling thread's >> ???? >????????? >????? >????? >????? tid value before they loop and if >> the current tid == saved_tid >> ???? >????????? >????? >????? >????? then use the current JavaThread* >> instead of calling >> ???? >????????? >????? >????? >????? find_JavaThread_from_java_tid() >> to get the JavaThread*. >> ???? >????????? >????? >????? > >> ???? >????????? >????? >????? >????? Dan >> ???? >????????? >????? >????? > >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > Webrev: >> https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ >> ???? >????????? >????? >????? >????? > Bug: >> https://bugs.openjdk.java.net/browse/JDK-8185005 >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > Thanks! >> ???? >????????? >????? >????? >????? > --Daniil >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > From: >> ???? >????????? >????? >????? >????? > Organization: Oracle Corporation >> ???? >????????? >????? >????? >????? > Date: Friday, June 28, 2019 at >> 7:56 PM >> ???? >????????? >????? >????? >????? > To: Daniil Titov >> , OpenJDK Serviceability >> , >> "hotspot-runtime-dev at openjdk.java.net" >> , "jmx-dev at openjdk.java.net" >> >> ???? >????????? >????? >????? >????? 
> Subject: Re: RFR: 8185005: >> Improve performance of ThreadMXBean.getThreadInfo(long ids[], int >> maxDepth) >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > Hi Daniil, >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > I have several quick comments. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > The indent in the hotspot c/c++ >> files has to be 2, not 4. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > >> https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html >> >> ???? >????????? >????? >????? >????? > 614 JavaThread* >> ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { >> ???? >????????? >????? >????? >????? >?? 615???? JavaThread* >> java_thread = ThreadTable::find_thread(java_tid); >> ???? >????????? >????? >????? >????? >?? 616???? if (java_thread == >> NULL && java_tid == PMIMORDIAL_JAVA_TID) { >> ???? >????????? >????? >????? >????? >?? 617???????? // >> ThreadsSMRSupport::add_thread() is not called for the primordial >> ???? >????????? >????? >????? >????? >?? 618???????? // thread. Thus, >> we find this thread with a linear search and add it >> ???? >????????? >????? >????? >????? >?? 619???????? // to the thread >> table. >> ???? >????????? >????? >????? >????? >?? 620???????? for (uint i = 0; >> i < length(); i++) { >> ???? >????????? >????? >????? >????? >?? 621???????????? JavaThread* >> thread = thread_at(i); >> ???? >????????? >????? >????? >????? >?? 622???????????? if >> (is_valid_java_thread(java_tid,thread)) { >> ???? >????????? >????? >????? >????? >?? 623 >> ThreadTable::add_thread(java_tid, thread); >> ???? >????????? >????? >????? >????? >?? 624???????????????? return >> thread; >> ???? >????????? >????? >????? >????? >?? 625???????????? } >> ???? >????????? >????? >????? >????? >?? 626???????? } >> ???? >????????? >????? >????? 
>????? >?? 627???? } else if >> (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { >> ???? >????????? >????? >????? >????? >?? 628???????? return java_thread; >> ???? >????????? >????? >????? >????? >?? 629???? } >> ???? >????????? >????? >????? >????? >?? 630???? return NULL; >> ???? >????????? >????? >????? >????? >?? 631 } >> ???? >????????? >????? >????? >????? >?? 632 bool >> ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* >> java_thread) { >> ???? >????????? >????? >????? >????? >?? 633???? oop tobj = >> java_thread->threadObj(); >> ???? >????????? >????? >????? >????? >?? 634???? // Ignore the thread >> if it hasn't run yet, has exited >> ???? >????????? >????? >????? >????? >?? 635???? // or is starting to >> exit. >> ???? >????????? >????? >????? >????? >?? 636???? return (tobj != NULL >> && !java_thread->is_exiting() && >> ???? >????????? >????? >????? >????? >?? 637???????????? java_tid == >> java_lang_Thread::thread_id(tobj)); >> ???? >????????? >????? >????? >????? >?? 638 } >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? >?? 615???? JavaThread* >> java_thread = ThreadTable::find_thread(java_tid); >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? >??? I'd suggest to rename >> find_thread() to find_thread_by_tid(). >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > A space is missed after the comma: >> ???? >????????? >????? >????? >????? >??? 622 if >> (is_valid_java_thread(java_tid,thread)) { >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > An empty line is needed before >> L632. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > The name 'is_valid_java_thread' >> looks wrong (or confusing) to me. >> ???? >????????? >????? >????? >????? > Something like >> 'is_alive_java_thread_with_tid()' would be better. >> ???? >????????? >????? >????? >????? 
> It'd better to list parameters >> in the opposite order. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > The call to >> is_valid_java_thread() is confusing: >> ???? >????????? >????? >????? >????? >???? 627 } else if (java_thread >> != NULL && is_valid_java_thread(java_tid, java_thread)) { >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > Why would the call >> ThreadTable::find_thread(java_tid) return a JavaThread with an >> unmatched java_tid? >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > Thanks, >> ???? >????????? >????? >????? >????? > Serguei >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? > On 6/28/19, 9:40 PM, "David >> Holmes" wrote: >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? >????? Hi Daniil, >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? >????? The definition and use of >> this hashtable (yet another hashtable >> ???? >????????? >????? >????? >????? >????? implementation!) will need >> careful examination. We have to be concerned >> ???? >????????? >????? >????? >????? >????? about the cost of >> maintaining it when it may never even be queried. You >> ???? >????????? >????? >????? >????? >????? would need to look at >> footprint cost and performance impact. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? >????? Unfortunately I'm just >> about to board a plane and will be out for the >> ???? >????????? >????? >????? >????? >????? next few days. I will try >> to look at this asap next week, but we will >> ???? >????????? >????? >????? >????? >????? need a lot more data on it. >> ???? >????????? >????? >????? >????? > >> ???? >????????? >????? >????? >????? >????? Thanks, >> ???? >????????? >????? >????? >????? >????? David >> ???? >????????? >????? >????? >????? 
>> > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote:
>> > > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the
>> > > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable
>> > > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear
>> > > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup
>> > > > > > in the thread table.
>> > > > > >
>> > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed.
>> > > > > >
>> > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/
>> > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>> > > > > >
>> > > > > > Thanks!
>> > > > > >
>> > > > > > Best regards,
>> > > > > > Daniil

From matthias.baesken at sap.com Tue Sep 17 11:07:43 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Tue, 17 Sep 2019 11:07:43 +0000
Subject: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code
In-Reply-To: <9dca6ade-aba5-daf1-fa26-07b3d0bad35e@oracle.com>
References: <9dca6ade-aba5-daf1-fa26-07b3d0bad35e@oracle.com>
Message-ID:

Hi Serguei and Thomas, thanks for the reviews.

> Should I open a bug for these ?
> Probably, two different bugs are needed: hotspot/runtime and AWT.

Regarding the atoi on input provided by getenv - I'll open 2 bugs for this.

Best regards, Matthias

From: serguei.spitsyn at oracle.com
Sent: Saturday, 14 September 2019 00:18
To: Baesken, Matthias; Thomas Stüfe
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code

Hi Matthias,

On 9/12/19 4:52 AM, Baesken, Matthias wrote:

Hi Thomas, thanks for the review. You are correct about atoi.

New webrev:
http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/

I had 2 additional observations:

1. With OpenJDK on Solaris 32-bit gone for quite some time, we might be able to kick out the whole non-_LP64 code because we are always 64 bit (maybe someone could comment on whether this is a safe assumption; there might be old 32-bit Solaris core files flying around for some reason even these days?)

http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.1/src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp.frames.html

   696   // some older versions of libproc.so crash when trying to attach 32 bit
   697   // debugger to 64 bit core file. check and throw error.
   698 #ifndef _LP64
   ...

2. The usage of atoi is commented here:

https://docs.oracle.com/cd/E86824_01/html/E54766/atoi-3c.html

"However, applications should not use the atoi(), atol(), or atoll() functions unless they know the value represented by the argument will be in range for the corresponding result type"

And here:

https://pubs.opengroup.org/onlinepubs/009695399/functions/atoi.html

"If the number is not known to be in range, strtol() should be used because atoi() is not required to perform any error checking"

However, we have a number of usages in the code where atoi is called without knowing that the argument is in the allowed range. Some examples:

src/hotspot/share/runtime/arguments.cpp-382-    if (match_option(option, "-Dsun.java.launcher.pid=", &tail)) {
src/hotspot/share/runtime/arguments.cpp:383:      _sun_java_launcher_pid = atoi(tail);
src/hotspot/share/runtime/arguments.cpp-384-      continue;

src/java.desktop/unix/native/libawt_xawt/xawt/XToolkit.c
   455     value = getenv("_AWT_MAX_POLL_TIMEOUT");
   456     if (value != NULL) {
   457         AWT_MAX_POLL_TIMEOUT = atoi(value);

src/java.desktop/unix/native/common/awt/X11Color.c-781-    if (getenv("CMAPSIZE") != 0) {
src/java.desktop/unix/native/common/awt/X11Color.c:782:        cmapsize = atoi(getenv("CMAPSIZE"));

Should I open a bug for these ?

Probably, two different bugs are needed: hotspot/runtime and AWT.

Thanks,
Serguei

Best regards, Matthias

From: Thomas Stüfe
Sent: Thursday, 12 September 2019 12:22
To: Baesken, Matthias
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR [XS]: 8230901: missing ReleaseStringUTFChars in servicability native code

Hi Matthias,

your changes look good.

An additional bug:

http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.0/src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp.frames.html

   698 #ifndef _LP64
   699   atoi(cmdLine_cstr);
   700   if (errno) {

Behaviour of atoi() in error case is undefined. errno values are not defined.
See: https://pubs.opengroup.org/onlinepubs/009695399/functions/atoi.html

And even if atoi would set errno, this is still not enough since errno may contain a stale value. One would have to set errno=0 before the function call.

If you want to fix this too, I'd suggest replacing this call with strtol().

Cheers, Thomas

On Thu, Sep 12, 2019 at 12:11 PM Baesken, Matthias wrote:

Hello, please review this small change.
It adds ReleaseStringUTFChars calls at some places in early return cases.
(In src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp, THROW_NEW_DEBUGGER_EXCEPTION contains a return; see the macro declaration

   39 #define THROW_NEW_DEBUGGER_EXCEPTION(str) { throwNewDebuggerException(env, str); return;}
)

Bug/webrev:

https://bugs.openjdk.java.net/browse/JDK-8230901

http://cr.openjdk.java.net/~mbaesken/webrevs/8230901.0/

Thanks, Matthias

From magnus.ihse.bursie at oracle.com Tue Sep 17 11:26:19 2019
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Tue, 17 Sep 2019 13:26:19 +0200
Subject: RFR: 8230857: Avoid reflection in sun.tools.common.ProcessHelper
In-Reply-To: <7cef2fd2-74cb-f069-d837-b5219924efc0@oracle.com>
References: <555a2cf2-e15e-abb6-5c0a-fb3ff4c0716f@oracle.com> <7cef2fd2-74cb-f069-d837-b5219924efc0@oracle.com>
Message-ID: <105a9f1f-9b1e-c707-b65d-6e71db7c701d@oracle.com>

On 2019-09-17 01:01, David Holmes wrote:
> Hi Christoph,
>
> Sorry for the delay getting back to you.
>
> cc'd build-dev to get some clarification on the below ...
>
> On 12/09/2019 7:30 pm, Langer, Christoph wrote:
>> Hi David,
>>
>>>> please review an enhancement which I've identified when working with
>>>> ProcessHelper for JDK-8230850.
>>>>
>>>> I noticed that ProcessHelper is an interface in common code with a
>>>> static method that would lookup the actual platform implementation via
>>>> reflection.
This seems a little cumbersome since we can have a common
>>>> dummy for ProcessHelper and override it with the platform specific
>>>> implementation, leveraging the build system.
>>>
>>> I don't see you leveraging the build system. You have two source files
>>> that compile to the same destination class file. What is ensuring the
>>> platform specific version is compiled after the generic one?
>>>
>>> Service-provider patterns use reflection to instantiate the service
>>> implementation. I don't see any problem here that needs solving.
>>
>> TL;DR:
>> There are two source files, one in share/classes and one in
>> linux/classes. The build system overrides the share/classes
>> implementation with the linux/classes implementation in the linux
>> build. This is not by coincidence and only one class is contained in
>> the generated jdk.jcmd module. Then there won't be a need for having
>> a service interface and a service implementation that is looked up
>> via reflection (which is not a bad pattern by itself). I agree that
>> it's not a big problem to be solved but still not "no problem".
>> Here is some longer elaboration how the build system prefers specific
>> implementations of classes and filters generic duplicates:
>> The SetupJavaCompilation function from JavaCompilation.gmk [0] is
>> used to compile the java sources for JDK modules. In its
>> documentation, for argument SRC [1], it claims: "one or more
>> directories to search for sources. The order of the source roots is
>> significant. The first found file of a certain name has priority". In
>> its implementation the found files are first ordered [3] and
>> duplicates filtered out [4].
>> The potential source files are handed to SetupJavaCompilation in
>> CompileJavaModules.gmk [5] and were collected by a call to
>> FindModuleSrcDirs [6]. FindModuleSrcDirs iterates over all potential
>> source dirs for Java classes in the module [7].
The evaluated subdirs >> are (in that order) $(OPENJDK_TARGET_OS)/classes, >> $(OPENJDK_TARGET_OS_TYPE)/classes and share/classes, as per [8]. >> Hope that explains what I'm trying to leverage here. > > I'm not 100% certain that what you describe actually ensures what you > want it to ensure. I can't reconcile "the first found file ... has > priority" with the fact found files are sorted and duplicates > eliminated. It is the sorting that concerns me as it suggests > linux/Foo.java might replace shared/Foo.java, but if we're on Windows > then we have a problem! That said there is also this comment: > > # Order src files according to the order of the src dirs. Correct > odering is > # needed for correct overriding between different source roots. > > I'd need the build team to clarify what "correct overriding" is > actually defined as. David, Christoph is correct. linux/Foo.java will override share/Foo.java. I don't remember how the magic in JavaCompilation.gmk works anymore :-), but we have relied on this behavior in other places for a long time, so I'm pretty certain it is still working correctly. Presumably, the $(sort ...) is there to remove (identical) duplicates, which is a side-effect of sort. 
/Magnus

>
> Thanks,
> David
> -----
>
>> I've uploaded an updated webrev which contains some cleanup to the
>> Test changes: http://cr.openjdk.java.net/~clanger/webrevs/8230857.1/
>>
>> Thanks
>> Christoph
>>
>> [0] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l185
>> [1] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l157
>> [3] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l225
>> [4] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/JavaCompilation.gmk#l257
>> [5] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/CompileJavaModules.gmk#l603
>> [6] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/CompileJavaModules.gmk#l555
>> [7] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/Modules.gmk#l300
>> [8] http://hg.openjdk.java.net/jdk/jdk/file/ea93d6a9f720/make/common/Modules.gmk#l243

From david.holmes at oracle.com Tue Sep 17 12:59:40 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Sep 2019 22:59:40 +1000
Subject: RFR: 8230857: Avoid reflection in sun.tools.common.ProcessHelper
In-Reply-To: <105a9f1f-9b1e-c707-b65d-6e71db7c701d@oracle.com>
References: <555a2cf2-e15e-abb6-5c0a-fb3ff4c0716f@oracle.com> <7cef2fd2-74cb-f069-d837-b5219924efc0@oracle.com> <105a9f1f-9b1e-c707-b65d-6e71db7c701d@oracle.com>
Message-ID: <431b85fb-b131-b55b-f7a8-d7112b2c9fa4@oracle.com>

Hi Magnus,

On 17/09/2019 9:26 pm, Magnus Ihse Bursie wrote:
> David,
>
> Christoph is correct. linux/Foo.java will override share/Foo.java. I
> don't remember how the magic in JavaCompilation.gmk works anymore :-),
> but we have relied on this behavior in other places for a long time, so
> I'm pretty certain it is still working correctly. Presumably, the $(sort
> ...) is there to remove (identical) duplicates, which is a side-effect
> of sort.

Thanks for confirming. I'd still like to understand exactly what these overriding rules are though. It's not a mechanism I was aware of.

Thanks,
David

From erik.joelsson at oracle.com Tue Sep 17 13:39:25 2019
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Tue, 17 Sep 2019 06:39:25 -0700
Subject: RFR: 8230857: Avoid reflection in sun.tools.common.ProcessHelper
In-Reply-To: <431b85fb-b131-b55b-f7a8-d7112b2c9fa4@oracle.com>
References: <555a2cf2-e15e-abb6-5c0a-fb3ff4c0716f@oracle.com> <7cef2fd2-74cb-f069-d837-b5219924efc0@oracle.com> <105a9f1f-9b1e-c707-b65d-6e71db7c701d@oracle.com> <431b85fb-b131-b55b-f7a8-d7112b2c9fa4@oracle.com>
Message-ID: <98c7eadb-3003-cea7-f5a3-c902d9a6f1db@oracle.com>

Hello,

On 2019-09-17 05:59, David Holmes wrote:
> Thanks for confirming. I'd still like to understand exactly what these
> overriding rules are though. It's not a mechanism I was aware of.

SetupJavaCompilation is indeed behaving as Christoph describes and it is by design.
I implemented support for this behavior in: https://bugs.openjdk.java.net/browse/JDK-8079344

The relevant parts of SetupJavaCompilation look like this:

  # Order src files according to the order of the src dirs. Correct odering is
  # needed for correct overriding between different source roots.
  $1_ALL_SRC_RAW := $$(call FindFiles, $$($1_SRC))
  $1_ALL_SRCS := $$($1_EXTRA_FILES) \
      $$(foreach d, $$($1_SRC), $$(filter $$d%, $$($1_ALL_SRC_RAW)))

The second line orders the src files by the src roots. (We used to just call find for one src root at a time, but the above actually performs better due to only running 1 external process.)

Further down we have this:

  ifneq ($$($1_KEEP_DUPS), true)
    # Remove duplicate source files by keeping the first found of each duplicate.
    # This allows for automatic overrides with custom or platform specific versions
    # source files.
    #
    # For the smart javac wrapper case, add each removed file to an extra exclude
    # file list to prevent sjavac from finding duplicate sources.
    $1_SRCS := $$(strip $$(foreach s, $$($1_SRCS), \
        $$(eval relative_src := $$(call remove-prefixes, $$($1_SRC), $$(s))) \
        $$(if $$($1_$$(relative_src)), \
          $$(eval $1_SJAVAC_EXCLUDE_FILES += $$(s)), \
          $$(eval $1_$$(relative_src) := 1) $$(s))))
  endif

This loop is a bit hairy to wrap your head around. It's iterating over all the src files, in the order of importance. The variable relative_src is the path from the src root, the part that is common to all duplicate src files. The variables of the form $1_$$(relative_src) basically act as a hash map (string->boolean). So for each src file, if the relative path for it has already been seen, add it to an exclude list, else mark it as seen and add it to the return list.
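The loop described above can be sketched outside of make as follows. This is only an illustration in C++, assuming hypothetical names (SourceFile, Selection, select_sources) that do not exist in the JDK build; the std::unordered_set stands in for the $1_$$(relative_src) marker variables:

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical illustration of the "first found file wins" rule; none of
// these names are taken from the JDK build.
struct SourceFile {
  std::string root;      // source root, e.g. "linux/classes"
  std::string relative;  // path below the root, shared by all duplicates
};

struct Selection {
  std::vector<SourceFile> kept;      // files that would be compiled
  std::vector<SourceFile> excluded;  // overridden duplicates (the exclude list)
};

// 'files' must already be ordered by source root priority, most specific
// root first, as the first make snippet arranges.
static Selection select_sources(const std::vector<SourceFile>& files) {
  Selection result;
  std::unordered_set<std::string> seen;  // relative paths encountered so far
  for (const SourceFile& f : files) {
    if (seen.insert(f.relative).second) {
      result.kept.push_back(f);      // first occurrence of this relative path
    } else {
      result.excluded.push_back(f);  // lower-priority duplicate
    }
  }
  return result;
}
```

With a linux/classes entry listed before a share/classes entry for the same relative path, only the linux/classes file ends up in the kept list, which matches the override behaviour described in this thread.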
/Erik

From hohensee at amazon.com Tue Sep 17 14:10:26 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 17 Sep 2019 14:10:26 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To:
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com>
Message-ID: <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com>

Thanks, Serguei. :)

David, are you ok with the patch?

Paul

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, September 17, 2019 at 2:26 AM
To: "Hohensee, Paul", David Holmes, Mandy Chung
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

Hi Paul,

Thank you for refactoring and fixing the test. It looks great now!

Thanks,
Serguei

On 9/15/19 02:52, Hohensee, Paul wrote:

Hi, Serguei, thanks for the review. New webrev at

http://cr.openjdk.java.net/~phh/8207266/webrev.09/

I refactored the test's main() method, and you're correct, getThreadAllocatedBytes should be getCurrentThreadAllocatedBytes in that context: fixed.

Paul

From: "serguei.spitsyn at oracle.com"
Organization: Oracle Corporation
Date: Friday, September 13, 2019 at 5:50 PM
To: "Hohensee, Paul", David Holmes, Mandy Chung
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

Hi Paul,

It looks pretty good in general.
http://cr.openjdk.java.net/~phh/8207266/webrev.08/test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java.frames.html It would be nice to refactor the java main() method as it becomes too big. Two ways of getCurrentThreadAllocatedBytes() testing are good candidates to become separate methods. 98 long size1 = mbean.getThreadAllocatedBytes(id); Just wanted to double check if you wanted to invoke the getCurrentThreadAllocatedBytes() instead as it is a part of: 85 // First way, getCurrentThreadAllocatedBytes Thanks, Serguei On 9/13/19 12:11 PM, Hohensee, Paul wrote: Hi David, thanks for your comments. New webrev in http://cr.openjdk.java.net/~phh/8207266/webrev.08/ Both the old and new versions of the code check that thread allocated memory is both supported and enabled. The existing version of getThreadAllocatedBytes(long []) calls verifyThreadAllocatedMemory(long []), which checks inline to make sure thread allocated memory is supported, then calls isThreadAllocatedMemoryEnabled() to verify that it's enabled. isThreadAllocatedMemoryEnabled() duplicates (!) the support check and returns the enabled flag. I removed the redundant check in the new version. You're of course correct about the back-to-back check. Application code can't know when the runtime will hijack a thread for its own purposes. I've removed the check. Paul On 9/13/19, 12:50 AM, "David Holmes" wrote: Hi Paul, On 13/09/2019 10:29 am, Hohensee, Paul wrote: > Thanks for clarifying the review rules. Would someone from the > serviceability team please review? New webrev at > > http://cr.openjdk.java.net/~phh/8207266/webrev.07/ One aspect of the functional change needs clarification for me - and apologies if this has been covered in the past. It seems to me that currently we only check isThreadAllocatedMemorySupported for these operations, but if I read things correctly the updated code additionally checks isThreadAllocatedMemoryEnabled, which is a behaviour change not mentioned in the CSR. 
> I didn't disturb the existing checks in the test, just added code to
> check the result of getThreadAllocatedBytes(long) on a non-current
> thread, plus the back-to-back no-allocation checks. The former wasn't
> needed before because getThreadAllocatedBytes(long) was just a wrapper
> around getThreadAllocatedBytes(long []). This patch changes that, so I
> added a separate test. The latter is supposed to fail if there's object
> allocation on calls to getCurrentThreadAllocatedBytes and
> getThreadAllocatedBytes(long). I.e., a feature, not a bug, because
> accumulation of transient small objects can be a performance problem.
> Thanks to your review, I noticed that the back-to-back check on the
> current thread was using getThreadAllocatedBytes(long) instead of
> getCurrentThreadAllocatedBytes and fixed it. I also removed all
> instances of "TEST FAILED: ".

The back-to-back check is not valid in general. You don't know if the first check might trigger some class loading on the return path after it has obtained the first memory value. The check might also fail if using JVMCI and some compilation related activity occurs in the current thread on the second call. Also, with the introduction of handshakes it's possible the current thread might hit a safepoint check that results in it executing a handshake operation that performs allocation. Potentially there could be numerous non-deterministic actions that might occur leading to unanticipated allocation.

I understand what you want to test here, I just don't think it is reliably doable.

Thanks,
David
-----

>
> Paul
>
> From: Mandy Chung
> Date: Thursday, September 12, 2019 at 10:09 AM
> To: "Hohensee, Paul"
> Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
> Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
>
> On 9/3/19 12:38 PM, Hohensee, Paul wrote:
>
>     Minor update in new webrev http://cr.openjdk.java.net/~phh/8207266/webrev.05/.
>
> I only reviewed the library side implementation that looks good. I
> expect the serviceability team to review the test and hotspot change.
>
>     Need a confirmatory review to push this. If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review.
>
> You need another reviewer to advise on the following because I was not
> close to the ThreadsList work.
>
> 2087   ThreadsListHandle tlh;
> 2088   JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id);
> 2089
> 2090   if (java_thread != NULL) {
> 2091     return java_thread->cooked_allocated_bytes();
> 2092   }
>
> This looks right to me.
>
> test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java
>
> -                "ThreadAllocatedMemory is expected to be disabled");
> +                "TEST FAILED: ThreadAllocatedMemory is expected to be disabled");
>
> Prepending "TEST FAILED" in exception message (in several places)
> seems redundant since such RuntimeException is thrown and expected
> a test failure.
>
> +        // back-to-back calls shouldn't allocate any memory
> +        size = mbean.getThreadAllocatedBytes(id);
> +        size1 = mbean.getThreadAllocatedBytes(id);
> +        if (size1 != size) {
>
> Is there anything the test can do to help guarantee this? I didn't
> closely review this test. The main thing I advise is to improve
> the reliability of this test. Put it in another way, we want to
> ensure that this test change will pass all the time in various
> test configuration.
>
> Mandy
From david.holmes at oracle.com Tue Sep 17 22:50:24 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Sep 2019 08:50:24 +1000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com> <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com>
Message-ID: <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com>

On 18/09/2019 12:10 am, Hohensee, Paul wrote:
> Thanks, Serguei. :)
>
> David, are you ok with the patch?

Yep, nothing further from me.

David

From david.holmes at oracle.com Tue Sep 17 23:12:49 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Sep 2019 09:12:49 +1000
Subject: RFR: 8230857: Avoid reflection in sun.tools.common.ProcessHelper
In-Reply-To: <98c7eadb-3003-cea7-f5a3-c902d9a6f1db@oracle.com>
References: <555a2cf2-e15e-abb6-5c0a-fb3ff4c0716f@oracle.com> <7cef2fd2-74cb-f069-d837-b5219924efc0@oracle.com> <105a9f1f-9b1e-c707-b65d-6e71db7c701d@oracle.com> <431b85fb-b131-b55b-f7a8-d7112b2c9fa4@oracle.com> <98c7eadb-3003-cea7-f5a3-c902d9a6f1db@oracle.com>
Message-ID:

Hi Erik,

Thanks for the additional details (I can't say I fully understand them :) ).
David On 17/09/2019 11:39 pm, Erik Joelsson wrote: > Hello, > > On 2019-09-17 05:59, David Holmes wrote: >> Hi Magnus, >> >> On 17/09/2019 9:26 pm, Magnus Ihse Bursie wrote: >>> On 2019-09-17 01:01, David Holmes wrote: >>>> Hi Christoph, >>>> >>>> Sorry for the delay getting back you. >>>> >>>> cc'd build-dev to get some clarification on the below ... >>>> >>>> On 12/09/2019 7:30 pm, Langer, Christoph wrote: >>>>> Hi David, >>>>> >>>>>>> please review an enhancement which I've identified when working with >>>>>>> Processhelper for JDK-8230850. >>>>>>> >>>>>>> I noticed that ProcessHelper is an interface in common code with a >>>>>>> static method that would lookup the actual platform >>>>>>> implementation via >>>>>>> reflection. This seems a little cumbersome since we can have a >>>>>>> common >>>>>>> dummy for ProcessHelper and override it with the platform specific >>>>>>> implementation, leveraging the build system. >>>>>> >>>>>> I don't see you leveraging the build system. You have two source >>>>>> files >>>>>> that compile to the same destination class file. What is ensuring the >>>>>> platform specific version is compiled after the generic one? >>>>>> >>>>>> Service-provider patterns use reflection to instantiate the service >>>>>> implementation. I don't see any problem here that needs solving. >>>>> >>>>> TL;DR: >>>>> There are two source files, one in share/classes and one in >>>>> linux/classes. The build system overrides the share/classes >>>>> implementation with the linux/classes implementation in the linux >>>>> build. This is not by coincidence and only one class is contained >>>>> in the generated jdk.jcmd module. Then there won't be a need for >>>>> having a service interface and a service implementation that is >>>>> looked up via reflection (which is not a bad pattern by itself). I >>>>> agree that it's not a big problem to be solved but still not "no >>>>> problem". 
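The override described in the TL;DR above amounts to a "first found wins" selection over source roots listed in priority order. A minimal C++ sketch of that idea follows. It is only an illustration, not how the build system is implemented (that logic lives in make), and `SourceFile` and `select_sources` are names invented here.

```cpp
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical illustration only; this is not the build system's code.
struct SourceFile {
  std::string root;      // source root, e.g. "linux/classes" or "share/classes"
  std::string relative;  // path below the root, shared by all duplicates
};

// Splits 'files' into {kept, excluded}. 'files' must already be ordered by
// source-root priority (most specific root first). The first file seen for a
// given relative path is kept; later duplicates are excluded.
std::pair<std::vector<SourceFile>, std::vector<SourceFile>>
select_sources(const std::vector<SourceFile>& files) {
  std::unordered_set<std::string> seen;
  std::vector<SourceFile> kept;
  std::vector<SourceFile> excluded;
  for (const SourceFile& f : files) {
    if (seen.insert(f.relative).second) {
      kept.push_back(f);      // first occurrence of this relative path
    } else {
      excluded.push_back(f);  // duplicate from a lower-priority root
    }
  }
  return {kept, excluded};
}
```

With the linux root listed ahead of the shared root, a file present in both roots is taken from linux/classes and the share/classes copy ends up in the excluded list.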
>>>>> Here is some longer elaboration how the build system prefers >>>>> specific implementations of classes and filters generic duplicates: >>>>> The SetupJavaCompilation function from JavaCompilation.gmk [0] is >>>>> used to compile the java sources for JDK modules. In its >>>>> documentation, for argument SRC [1], it claims: "one or more >>>>> directories to search for sources. The order of the source roots is >>>>> significant. The first found file of a certain name has priority". >>>>> In its implementation the found files are first ordered [3] and >>>>> duplicates filtered out [4]. >>>>> The potential source files are handed to SetupJavaCompilation in >>>>> CompileJavaModules.gmk [5] and were collected by a call to >>>>> FindModuleSrcDirs [6]. FindModuleSrcDirs iterates over all >>>>> potential source dirs for Java classes in the module [7]. The >>>>> evaluated subdirs are (in that order) $(OPENJDK_TARGET_OS)/classes, >>>>> $(OPENJDK_TARGET_OS_TYPE)/classes and share/classes, as per [8]. >>>>> Hope that explains what I'm trying to leverage here. >>>> >>>> I'm not 100% certain that what you describe actually ensures what >>>> you want it to ensure. I can't reconcile "the first found file ... >>>> has priority" with the fact found files are sorted and duplicates >>>> eliminated. It is the sorting that concerns me as it suggests >>>> linux/Foo.java might replace shared/Foo.java, but if we're on >>>> Windows then we have a problem! That said there is also this comment: >>>> >>>> # Order src files according to the order of the src dirs. Correct >>>> odering is >>>> # needed for correct overriding between different source roots. >>>> >>>> I'd need the build team to clarify what "correct overriding" is >>>> actually defined as. >>> David, >>> >>> Christoph is correct. linux/Foo.java will override share/Foo.java. 
>>> I don't remember how the magic in JavaCompilation.gmk works anymore
>>> :-), but we have relied on this behavior in other places for a long
>>> time, so I'm pretty certain it is still working correctly.
>>> Presumably, the $(sort ...) is there to remove (identical)
>>> duplicates, which is a side-effect of sort.
>>
>> Thanks for confirming. I'd still like to understand exactly what these
>> overriding rules are though. It's not a mechanism I was aware of.
>>
> SetupJavaCompilation is indeed behaving as Christoph describes and it is
> by design. I implemented support for this behavior in:
>
> https://bugs.openjdk.java.net/browse/JDK-8079344
>
> The relevant parts of SetupJavaCompilation look like this:
>
>   # Order src files according to the order of the src dirs. Correct odering is
>   # needed for correct overriding between different source roots.
>   $1_ALL_SRC_RAW := $$(call FindFiles, $$($1_SRC))
>   $1_ALL_SRCS := $$($1_EXTRA_FILES) \
>       $$(foreach d, $$($1_SRC), $$(filter $$d%, $$($1_ALL_SRC_RAW)))
>
> The second line orders the src files by the src roots. (We used to just
> call find for one src root at a time, but the above actually performs
> better due to only running one external process.)
>
> Further down we have this:
>
>   ifneq ($$($1_KEEP_DUPS), true)
>     # Remove duplicate source files by keeping the first found of each duplicate.
>     # This allows for automatic overrides with custom or platform specific versions
>     # source files.
>     #
>     # For the smart javac wrapper case, add each removed file to an extra exclude
>     # file list to prevent sjavac from finding duplicate sources.
>     $1_SRCS := $$(strip $$(foreach s, $$($1_SRCS), \
>         $$(eval relative_src := $$(call remove-prefixes, $$($1_SRC), $$(s))) \
>         $$(if $$($1_$$(relative_src)), \
>           $$(eval $1_SJAVAC_EXCLUDE_FILES += $$(s)), \
>           $$(eval $1_$$(relative_src) := 1) $$(s))))
>   endif
>
> This loop is a bit hairy to wrap your head around. It's iterating over
> all the src files, in the order of importance. The variable relative_src
> is the path from the src root, the part that is common to all duplicate
> src files. The variables of the form $1_$$(relative_src) basically act
> as a hash map (string->boolean). So for each src file, if the relative
> path for it has already been seen, add it to an exclude list, else mark
> it as seen and add it to the return list.
>
> /Erik
>

From daniil.x.titov at oracle.com Wed Sep 18 00:13:57 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 17 Sep 2019 17:13:57 -0700
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <0105ea55-9d9c-ca09-53af-3e9863e78e95@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com> <0105ea55-9d9c-ca09-53af-3e9863e78e95@oracle.com>
Message-ID: <5560D680-CD20-442F-8902-7F7034B0736A@oracle.com>

Hi Serguei,

Please find below my answers to the concerns you mentioned in the previous email.

1.
> I have a concern about the checks for thread->is_exiting().
> - the lines 632-633 are useless as they do not really protect from returning an exiting thread
> It is interesting what might happen if an exiting thread is returned by the
> ThreadsList::find_JavaThread_from_java_tid ().
> Does it make sense to develop a test that would cover these cases?

I agree, it doesn't really provide any protection so it makes sense to just remove it.
The current implementation of find_JavaThread_from_java_tid() doesn't provide such protection either, since the thread could start exiting immediately after find_JavaThread_from_java_tid() returns. The assumption is therefore that callers of find_JavaThread_from_java_tid() expect to deal with such threads; looking at some of them shows that they usually try to retrieve the threadObj or a thread statistic object and, if it is NULL, just do nothing.

I'm not sure we could cover this specific case with a test. The window between find_JavaThread_from_java_tid() returning and the caller continuing execution is too small. The window between the thread starting to exit and removing itself from the thread table is very small as well.

2.
> - the lines 105-108 can result in adding exiting threads into the ThreadTable

I agree, it was missed. We need to wrap this code inside Threads_lock in a similar way to how it is done in find_JavaThread_from_java_tid().

3.
> I would suggest to rewrite this fragment in a safe way:
>  95     {
>  96       MutexLocker ml(ThreadTableCreate_lock);
>  97       if (!_is_initialized) {
>  98         create_table(threads->length());
>  99         _is_initialized = true;
> 100       }
> 101     }
> as:
>     {
>       MutexLocker ml(ThreadTableCreate_lock);
>       if (_is_initialized) {
>         return;
>       }
>       create_table(threads->length());
>       _is_initialized = true;
>     }

It was intentional not to block while populating the table with the threads from the current thread list. There is no need to have other threads that call find_JavaThread_from_java_tid() blocked and waiting for it to complete, since the requested thread might not be present in the thread list that triggers the thread table initialization. Plus, in the case of racing initialization, it allows threads from thread lists other than the original one to be added to the table, and thus avoids the linear scan when these threads are looked up for the first time.

4.
>> The case you have described is exactly the reason why we still have code inside
>> ThreadsList::find_JavaThread_from_java_tid() method that does a linear scan and adds
>> the requested thread to the thread table if it is not there (lines 614-613 below).
> I disagree because it is easy to avoid concurrent ThreadTable
> initialization (please, see my separate email).
> The reason for this code is to cover a case of late/lazy ThreadTable
> initialization.

David Holmes replied to this in a separate email providing a very detailed explanation of the possible cases and how the proposed implementation satisfies them.

Best regards,
Daniil

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, September 17, 2019 at 1:53 AM
To: Daniil Titov , Robbin Ehn , David Holmes , , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" , Claes Redestad
Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)

Hi Daniil,

Thank you for your patience in working on this issue!
Also, I like that the current thread related optimizations in management.cpp were factored out.
It was a good idea to separate them.

I have a concern about the checks for thread->is_exiting().
The threads are added to and removed from the ThreadTable under protection of Threads_lock.
However, the thread->is_exiting() checks are not protected, and so, they are racy.

There are a couple of such checks to mention:

611 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
612   ThreadTable::lazy_initialize(this);
613   JavaThread* thread = ThreadTable::find_thread_by_tid(java_tid);
614   if (thread == NULL) {
615     // If the thread is not found in the table find it
616     // with a linear search and add to the table.
617     for (uint i = 0; i < length(); i++) {
618       thread = thread_at(i);
619       oop tobj = thread->threadObj();
620       // Ignore the thread if it hasn't run yet, has exited
621       // or is starting to exit.
622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
623         MutexLocker ml(Threads_lock);
624         // Must be inside the lock to ensure that we don't add the thread to the table
625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
626         if (!thread->is_exiting()) {
627           ThreadTable::add_thread(java_tid, thread);
628           return thread;
629         }
630       }
631     }
632   } else if (!thread->is_exiting()) {
633     return thread;
634   }
635   return NULL;
636 }
  ...
 93 void ThreadTable::lazy_initialize(const ThreadsList *threads) {
 94   if (!_is_initialized) {
 95     {
 96       MutexLocker ml(ThreadTableCreate_lock);
 97       if (!_is_initialized) {
 98         create_table(threads->length());
 99         _is_initialized = true;
100       }
101     }
102     for (uint i = 0; i < threads->length(); i++) {
103       JavaThread* thread = threads->thread_at(i);
104       oop tobj = thread->threadObj();
105       if (tobj != NULL && !thread->is_exiting()) {
106         jlong java_tid = java_lang_Thread::thread_id(tobj);
107         add_thread(java_tid, thread);
108       }
109     }
110   }
111 }

A thread may start exiting right after the checks at the lines 626 and 105.
So that:
 - the lines 632-633 are useless as they do not really protect from returning an exiting thread
 - the lines 105-108 can result in adding exiting threads into the ThreadTable

Please, note, the lines 626-629 are safe in terms of addition to the ThreadTable as they are protected with the Threads_lock. But the returned thread still can exit after that.
It is interesting what might happen if an exiting thread is returned by the ThreadsList::find_JavaThread_from_java_tid ().

Does it make sense to develop a test that would cover these cases?

Thanks,
Serguei

On 9/16/19 11:18, Daniil Titov wrote:

Hello,

After investigating with Claes the impact of this change on the performance (thanks a lot Claes for helping with it!) the conclusion was that the impact on the thread startup time is not a blocker for this change.
I also measured the memory footprint using Native Memory Tracking and results showed around 40 bytes per live thread.

Please review a new version of the fix, webrev.06 [1]. As a reminder, webrev.05 was abandoned and webrev.06 [1] is webrev.04 [3] minus changes in src/hotspot/share/services/management.cpp (that were factored out to a separate issue [4]) plus a change in ThreadsList::find_JavaThread_from_java_tid() method (please, see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock.

src/hotspot/share/runtime/threadSMR.cpp

622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
623         MutexLocker ml(Threads_lock);
624         // Must be inside the lock to ensure that we don't add the thread to the table
625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
626         if (!thread->is_exiting()) {
627           ThreadTable::add_thread(java_tid, thread);
628           return thread;
629         }
630       }

[1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06
[2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
[3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04
[4] https://bugs.openjdk.java.net/browse/JDK-8229391

Thank you,
Daniil

>
> On 8/4/19, 7:54 PM, "David Holmes" mailto:david.holmes at oracle.com wrote:
>
>     Hi Daniil,
>
>     On 3/08/2019 8:16 am, Daniil Titov wrote:
>     > Hi David,
>     >
>     > Thank you for your detailed review. Please review a new version of the fix that includes
>     > the changes you suggested:
>     > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only;
>     > - ThreadTableCreate_lock is made _safepoint_check_always;
>
>     Okay.
>
>     > - ServiceThread is no longer responsible for the resizing of the thread table, instead,
>     >   the thread table is changed to grow on demand by the thread that is doing the addition;
>
>     Okay - I'm happy to get the serviceThread out of the picture here.
>
>     > - fixed nits and formatting issues.
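The reduced lock scope in the first item of the list above (the create lock covers creation of the table only, not its initial population) can be sketched roughly as follows. This is a standalone illustration using the C++ standard library, not HotSpot code: `LazyTable`, its members, and the plain mutex-guarded map standing in for the concurrent hash table are all assumptions made for the example.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazily created id -> value table.
class LazyTable {
 public:
  // Creates the table on first use and seeds it with 'initial'.
  // Returns true if this call found the table not yet initialized.
  bool lazy_initialize(const std::vector<std::pair<long, int>>& initial) {
    if (initialized_.load(std::memory_order_acquire)) {
      return false;  // fast path: no lock taken once the table exists
    }
    {
      // The create lock guards creation only, so it is held briefly.
      std::lock_guard<std::mutex> guard(create_lock_);
      if (!initialized_.load(std::memory_order_relaxed)) {
        table_ = std::make_unique<std::unordered_map<long, int>>(initial.size());
        initialized_.store(true, std::memory_order_release);
      }
    }
    // Population happens outside the create lock; racing initializers may
    // both get here, which is harmless because add() ignores duplicates.
    for (const auto& entry : initial) {
      add(entry.first, entry.second);
    }
    return true;
  }

  // Adds an entry if the table exists. Returns false if it does not yet.
  bool add(long id, int value) {
    if (!initialized_.load(std::memory_order_acquire)) {
      return false;
    }
    std::lock_guard<std::mutex> guard(table_lock_);
    table_->emplace(id, value);  // no effect if 'id' is already present
    return true;
  }

  bool is_initialized() const {
    return initialized_.load(std::memory_order_acquire);
  }

  std::size_t size() const {
    if (!is_initialized()) {
      return 0;
    }
    std::lock_guard<std::mutex> guard(table_lock_);
    return table_->size();
  }

 private:
  std::atomic<bool> initialized_{false};
  std::mutex create_lock_;         // guards creation of table_ only
  mutable std::mutex table_lock_;  // stand-in for a concurrent map's own synchronization
  std::unique_ptr<std::unordered_map<long, int>> table_;
};
```

The point of the split is that a thread arriving while another one is still seeding the table is not blocked on the create lock; it can already add or look up entries.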
> > Okay. > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > >>> as Daniel suggested. > >> Not sure it's best to combine these, but if they are limited to the > >> changes in management.cpp only then that may be okay. > > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > > limited to management.cpp (plus a new test) so I left them in the webrev but > > I also could move it in the separate issue if required. > > I'd prefer this part of be separated out, but won't insist. Let's see if > Dan or Serguei have a strong opinion. > > > > src/hotspot/share/runtime/threadSMR.cpp > > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > I think it cleaner/better to just use > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > non-null threadObj. > > > > I had to leave this code unchanged since it turned out the threadObj is null > > when VM is destroyed: > > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > > C [libjli.so+0x4333] JavaMain+0x2c3 > > C [libjli.so+0x8159] ThreadJavaMain+0x9 > > This is actually nothing to do with the VM being destroyed, but is an > issue with JNI_AttachCurrentThread and its interaction with the > ThreadSMR iterators. 
The attach process is: > - create JavaThread > - mark as "is attaching via jni" > - add to ThreadsList > - create java.lang.Thread object (you can only execute Java code after > you are attached) > - mark as "attach completed" > > So while a thread "is attaching" it will be seen by the ThreadSMR thread > iterator but will have a NULL java.lang.Thread object. > > We special-case attaching threads in a number of places in the VM and I > think we should be explicitly doing something here to filter out > attaching threads, rather than just being tolerant of a NULL j.l.Thread > object. Specifically in ThreadsSMRSupport::add_thread: > > if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) { > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > ThreadTable::add_thread(tid, thread); > } > > Note that in ThreadsSMRSupport::remove_thread we can use the same guard, > which covers the case the JNI attach encountered an error trying to > create the j.l.Thread object. > > >> src/hotspot/share/services/threadTable.cpp > >> 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > >> The is_dead parameter still bothers me here. I can't make enough sense > >> out of the template code in ConcurrentHashtable to see why we have to > >> have it, but I'm concerned that its very existence means we perhaps > >> should not be trying to extend CHT in this context. ?? > > > > My understanding is that is_dead parameter provides a mechanism for > > ConcurrentHashtable to remove stale entries that were not explicitly > > removed by calling ConcurrentHashTable::remove() method. > > I think that just because in our case we don't use this mechanism doesn't > > mean we should not use ConcurrentHashTable. > > Can you confirm that this usage is okay with Robbin Ehn please. He's > back from vacation this week. > > >> I would still want to see what impact this has on thread > >> startup cost, both with and without the table being initialized. 
>     >
>     > I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
>     > starts some threads as a warm-up, and then creates and starts 100,000 threads
>     > (each thread just sleeps for 100 ms). With the thread table enabled, the
>     > 100,000 threads are created and started in about 15200 ms. With the thread table
>     > off, the test takes about 14800 ms. Based on this information the enabled
>     > thread table makes the thread startup about 2.7% slower.
>
>     That doesn't sound very good. I think we may need to get Claes involved to
>     help investigate overall performance impact here.
>
>     > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
>     > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>
>     No further code comments.
>
>     I didn't look at the test in detail.
>
>     Thanks,
>     David
>
>     > Thanks!
>     > --Daniil
>     >
>     >
>     > On 7/29/19, 12:53 AM, "David Holmes" mailto:david.holmes at oracle.com wrote:
>     >
>     >     Hi Daniil,
>     >
>     >     Overall I think this is a reasonable approach but I would still like to
>     >     see some performance and footprint numbers, both to verify it fixes the
>     >     problem reported, and that we are not getting penalized elsewhere.
>     >
>     >     On 25/07/2019 3:21 am, Daniil Titov wrote:
>     >     > Hi David, Daniel, and Serguei,
>     >     >
>     >     > Please review the new version of the fix, that makes the thread table initialization on demand and
>     >     > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table
>     >     > is initialized with the threads from the current thread list. We don't want to hold Threads_lock
>     >     > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread
>     >     > table is being initialized. Such threads will be found by the linear search and added to the thread table
>     >     > later, in ThreadsList::find_JavaThread_from_java_tid().
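The "table first, linear search as fallback" lookup described in the paragraph above can be sketched as follows. This is a simplified, single-threaded illustration with invented names (`FakeThread`, `ThreadIndex`); the locking that the real code needs around the add is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a thread record; not a HotSpot type.
struct FakeThread {
  std::int64_t tid;
  bool exiting;
};

class ThreadIndex {
 public:
  // Looks 'tid' up in the table first. On a miss, scans 'threads' linearly
  // and caches the hit, so later lookups for the same id skip the scan.
  // The caller must keep 'threads' alive and unmodified while cached
  // pointers are in use. No locking is shown here.
  const FakeThread* find(std::int64_t tid, const std::vector<FakeThread>& threads) {
    auto it = table_.find(tid);
    if (it != table_.end()) {
      return it->second;
    }
    for (const FakeThread& t : threads) {
      if (t.tid == tid && !t.exiting) {
        table_.emplace(tid, &t);
        return &t;
      }
    }
    return nullptr;  // unknown id, or the thread is already exiting
  }

  std::size_t cached() const { return table_.size(); }

 private:
  std::unordered_map<std::int64_t, const FakeThread*> table_;
};
```

Threads created after the table was seeded are simply misses on their first lookup; the scan finds them once and every later lookup is a table hit.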
> > > > The initialization allows the created but unpopulated, or partially > > populated, table to be seen by other threads - is that your intention? > > It seems it should be okay as the other threads will then race with the > > initializing thread to add specific entries, and this is a concurrent > > map so that should be functionally correct. But if so then I think you > > can also reduce the scope of the ThreadTableCreate_lock so that it > > covers creation of the table only, not the initial population of the table. > > > > I like the approach of only initializing the table when needed and using > > that to control when the add/remove-thread code needs to update the > > table. But I would still want to see what impact this has on thread > > startup cost, both with and without the table being initialized. > > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > as Daniel suggested. > > > > Not sure it's best to combine these, but if they are limited to the > > changes in management.cpp only then that may be okay. It helps to be > > able to focus on the table related changes without being distracted by > > other optimizations. > > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > > to strip it of the all functionality that is not required in the thread table case. > > > > The revised version seems better in that regard. But I still have a > > concern, see below. > > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > > growing the thread table when required. > > > > Yes but why? Why can't this table be grown on demand by the thread that > > is doing the addition? 
For other tables we may have to delegate to the > > service thread because the current thread cannot perform the action, or > > it doesn't want to perform it at the time the need for the resize is > > detected (e.g. its detected at a safepoint and you want the resize to > > happen later outside the safepoint). It's not apparent to me that such > > restrictions apply here. > > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > > already has ConcurrentHashTable doesn't seem reasonable for me. > > > > Ok. > > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > > > Some specific code comments: > > > > src/hotspot/share/runtime/mutexLocker.cpp > > > > + def(ThreadTableCreate_lock , PaddedMutex , special, > > false, Monitor::_safepoint_check_never); > > > > I think this needs to be a _safepoint_check_always lock. The table will > > be created by regular JavaThreads and they should (nearly) always be > > checking for safepoints if they are going to block acquiring the lock. > > And it isn't at all obvious that the thread doing the creation can't go > > to a safepoint whilst this lock is held. > > > > --- > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > Nit: > > > > 618 JavaThread* thread = thread_at(i); > > > > you could reuse the new java_thread local you introduced at line 613 and > > just rename that "new" variable to "thread" so you don't have to change > > all other uses. > > > > 628 } else if (java_thread != NULL && ... > > > > You don't need to check != NULL here as you only get here when > > java_thread is not NULL. 
> > > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > I think it cleaner/better to just use > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > non-null threadObj. > > > > --- > > > > src/hotspot/share/services/management.cpp > > > > 1323 if (THREAD->is_Java_thread()) { > > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > > > These calls can only be made on a JavaThread so this be simplified to > > remove the is_Java_thread() call. Similarly in other places. > > > > --- > > > > src/hotspot/share/services/threadTable.cpp > > > > 55 class ThreadTableEntry : public CHeapObj { > > 56 private: > > 57 jlong _tid; > > > > I believe hotspot style is to not indent the access modifiers in C++ > > class declarations, so the above would just be: > > > > 55 class ThreadTableEntry : public CHeapObj { > > 56 private: > > 57 jlong _tid; > > > > etc. > > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > > 61 _tid(tid),_java_thread(java_thread) {} > > > > line 61 should be indented as it continues line 60. > > > > 67 class ThreadTableConfig : public AllStatic { > > ... > > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > The is_dead parameter still bothers me here. I can't make enough sense > > out of the template code in ConcurrentHashtable to see why we have to > > have it, but I'm concerned that its very existence means we perhaps > > should not be trying to extend CHT in this context. ?? > > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > > 116 ? size_log : DefaultThreadTableSizeLog; > > > > line 116 should be indented, though in this case I think a better layout > > would be: > > > > 115 size_t start_size_log = > > 116 size_log > DefaultThreadTableSizeLog ? 
size_log : > > DefaultThreadTableSizeLog; > > > > 131 double ThreadTable::get_load_factor() { > > 132 return (double)_items_count/_current_size; > > 133 } > > > > Not sure that is doing what you want/expect. It will perform integer > > division and then cast that whole integer to a double. If you want > > double arithmetic you need: > > > > return ((double)_items_count)/_current_size; > > > > 180 jlong _tid; > > 181 uintx _hash; > > > > Nit: no need for all those spaces before the variable name. > > > > 183 ThreadTableLookup(jlong tid) > > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > > > line 184 should be indented. > > > > 201 ThreadGet():_return(NULL) {} > > > > Nit: need space after : > > > > 211 assert(_is_initialized, "Thread table is not initialized"); > > 212 _has_work = false; > > > > line 211 is indented one space too far. > > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > > > Nit: need space after , > > > > 252 return _local_table->remove(thread,lookup); > > > > Nit: need space after , > > > > Thanks, > > David > > ------ > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! > > > --Daniil > > > > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" mailto:daniel.daugherty at oracle.com wrote: > > > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > > Hi Serguei and David, > > > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. > > > > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. 
> > > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > > > I have the same concerns as David H. about this new ThreadTable. > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > > in src/hotspot/share/services/management.cpp so I think that table > > > needs to enabled and populated only if it is going to be used. > > > > > > I've taken a look at the webrev below and I see that David has > > > followed up with additional comments. Before I do a crawl through > > > code review for this, I would like to see the ThreadTable stuff > > > made optional and David's other comments addressed. > > > > > > Another possible optimization is for callers of > > > find_JavaThread_from_java_tid() to save the calling thread's > > > tid value before they loop and if the current tid == saved_tid > > > then use the current JavaThread* instead of calling > > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > > > Dan > > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! > > > > --Daniil > > > > > > > > From: mailto:serguei.spitsyn at oracle.com > > > > Organization: Oracle Corporation > > > > Date: Friday, June 28, 2019 at 7:56 PM > > > > To: Daniil Titov mailto:daniil.x.titov at oracle.com, OpenJDK Serviceability mailto:serviceability-dev at openjdk.java.net, mailto:hotspot-runtime-dev at openjdk.java.net mailto:hotspot-runtime-dev at openjdk.java.net, mailto:jmx-dev at openjdk.java.net mailto:jmx-dev at openjdk.java.net > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > > > Hi Daniil, > > > > > > > > I have several quick comments. 
> > > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. > > > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > > 619 // to the thread table. > > > > 620 for (uint i = 0; i < length(); i++) { > > > > 621 JavaThread* thread = thread_at(i); > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > 623 ThreadTable::add_thread(java_tid, thread); > > > > 624 return thread; > > > > 625 } > > > > 626 } > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > 628 return java_thread; > > > > 629 } > > > > 630 return NULL; > > > > 631 } > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > > 633 oop tobj = java_thread->threadObj(); > > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > > 635 // or is starting to exit. > > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > > 638 } > > > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). > > > > > > > > A space is missed after the comma: > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > > > An empty line is needed before L632. > > > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > > It'd better to list parameters in the opposite order. 
> > > > > > > > The call to is_valid_java_thread() is confusing: > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > > > > Thanks, > > > > Serguei > > > > > > > > On 6/28/19, 9:40 PM, "David Holmes" mailto:david.holmes at oracle.com wrote: > > > > > > > > Hi Daniil, > > > > > > > > The definition and use of this hashtable (yet another hashtable > > > > implementation!) will need careful examination. We have to be concerned > > > > about the cost of maintaining it when it may never even be queried. You > > > > would need to look at footprint cost and performance impact. > > > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > > next few days. I will try to look at this asap next week, but we will > > > > need a lot more data on it. > > > > > > > > Thanks, > > > > David > > > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > > in the thread table. > > > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! 
> > > >
> > > > Best regards,
> > > > Daniil

From david.holmes at oracle.com Wed Sep 18 06:26:43 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Sep 2019 16:26:43 +1000
Subject: RFR (XXXXS): 8231162: JVMTI RawMonitorWait triggers assertion failure: Only JavaThreads can be interruptible
Message-ID:

Bug: https://bugs.openjdk.java.net/browse/JDK-8231162
webrev: http://cr.openjdk.java.net/~dholmes/8231162/webrev/

- r = rmonitor->raw_wait(millis, true, thread);
+ r = rmonitor->raw_wait(millis, false, thread);

Non-JavaThreads are not interruptible and so "true" should not have been passed. This tripped over the assertions added as part of the movement of the interrupt code to JavaThread under JDK-8230424.

Dan: FYI I overlooked this because I already rewrote all this RawMonitor logic under "8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor" to do the right thing, but of course that hasn't been pushed yet. And this isn't detected until tier 4 testing.

Thanks,
David

From serguei.spitsyn at oracle.com Wed Sep 18 07:13:15 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Sep 2019 00:13:15 -0700
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To:
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com> <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com> <0b307a92-5b5d-bd36-a128-99af6d0f3b1b@oracle.com>
Message-ID: <88ec033f-a216-3e0a-8e27-b82fa4728055@oracle.com>

An HTML attachment was scrubbed...
From david.holmes at oracle.com Wed Sep 18 07:25:29 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Sep 2019 17:25:29 +1000
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <88ec033f-a216-3e0a-8e27-b82fa4728055@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com> <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com> <0b307a92-5b5d-bd36-a128-99af6d0f3b1b@oracle.com> <88ec033f-a216-3e0a-8e27-b82fa4728055@oracle.com>
Message-ID: <48c3ef87-a171-18d1-fbe8-ed4dcb622193@oracle.com>

Hi Serguei,

In the interests of full disclosure I was the one who told Daniil to not hold the lock while populating the table. I'll leave you two to work out which way you want to go there.

Thanks,
David

On 18/09/2019 5:13 pm, serguei.spitsyn at oracle.com wrote:
> Hi David,
>
>
> On 9/17/19 03:46, David Holmes wrote:
>> Hi Serguei,
>>
>> On 17/09/2019 7:10 pm, serguei.spitsyn at oracle.com wrote:
>>> Hi Daniil,
>>>
>>>
>>> On 9/16/19 21:36, Daniil Titov wrote:
>>>> Hi David,
>>>>
>>>> The case you have described is exactly the reason why we still have
>>>> code inside the ThreadsList::find_JavaThread_from_java_tid() method
>>>> that does a linear scan and adds the requested thread to the thread
>>>> table if it is not there (lines 614-631 below).
>>>
>>> I disagree because it is easy to avoid concurrent ThreadTable
>>> initialization (please, see my separate email).
>>> The reason for this code is to cover a case of late/lazy ThreadTable
>>> initialization.
>>
>> I'm not sure I follow.
>> With the current code, if two threads are racing
>> to initialize the ThreadTable with ThreadsLists that contain a
>> different set of threads then there are two possibilities with regard
>> to the interleaving. Assume T1 initializes the table with its set of
>> threads and so finds the tid it is looking for in the table. Meanwhile
>> T2 is racing with the initialization logic:
>>
>> - If T2 sees _is_initialized then lazy_initialize does nothing for
>> T2, and the additional threads in its ThreadsList (say T3 and T4) are
>> not added to the table. But the specific thread associated with the
>> tid (say T3) will be found by linear search of the ThreadsList and
>> then added. If any other threads come searching for T4 they too will
>> not find it in the ThreadTable but instead perform the linear search
>> of their ThreadsList (and add it).
>>
>> - If T2 doesn't see _is_initialized at first it will try to acquire
>> the lock, and eventually see _is_initialized is true, at which point
>> it will try to add all of its threads to the table (so T3 and T4 will
>> be added). When lazy_initialize returns, T3 will be found in the table
>> and returned. If any other threads come searching for T4 they will
>> also find it in the table.
>
> My main concerns are simplicity and reliability.
> I do not care much about extra overhead at the ThreadTable initialization.
> The probability of ThreadTable::lazy_initialize() being called
> concurrently is low.
> Also, it might happen only once for the whole VM process execution.
>
> I was wrong in thinking that adding new threads to the ThreadTable after
> its initialization would result in a linear thread search as well. So my
> conclusion was that we should not care if it happens once more at the
> lazy initialization point.
> But now I see that after ThreadTable::is_initialized() returns true,
> ThreadsSMRSupport::add_thread() makes a call to ThreadTable::add_thread().
> So, no more linear search will happen.
>
>
> However, it seems to me, a possible concurrent lazy initialization in
> webrev.06 introduces its own extra overhead - competing threads
> (suppose, we have just two of them) will do the same amount of work
> concurrently:
>    - all indirect memory readings, lookup/comparisons and other checks
>      will be performed twice
>    - a new ThreadTableEntry can be allocated twice for some threads in
>      the list (depending on how ConcurrentHashTable is implemented,
>      there can be a potential memory leak)
>
> So, I doubt we win much in performance here but can lose in reliability.
>
>
> I'd suggest simplifying the lazy initialization code and making it more
> reliable this way:
>
> if (!_is_initialized) {
>   MutexLocker ml(ThreadTableCreate_lock);
>   if (!_is_initialized) {
>     create_table(threads->length());
>     _is_initialized = true;
>   }
>   for (uint i = 0; i < threads->length(); i++) {
>     JavaThread* thread = threads->thread_at(i);
>     oop tobj = thread->threadObj();
>     if (tobj != NULL && !thread->is_exiting()) {
>       jlong java_tid = java_lang_Thread::thread_id(tobj);
>       add_thread(java_tid, thread);
>     }
>   }
> }
>
>> With your suggested code change this second case is not possible so
>> for any racing initialization the lookup of any threads not in the
>> original ThreadsList will always result in using the linear search
>> before adding to the table.
>
> Yes, but I did not care about this.
> The overhead is expected to be lower than the lazy initialization cost,
> especially because the probability of concurrent initialization is low.
>
>> Both seem correct to me. Which one is more efficient will totally
>> depend on the number of differences between the ThreadsLists and
>> whether the code ever tries to look up those additional threads. If we
>> assume racing initialization is likely to be rare anyway (because
>> generally one thread is in charge of doing the monitoring) then the
>> choice seems somewhat arbitrary.
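The trade-off discussed above can be modelled outside HotSpot. What follows is a minimal sketch and not the webrev code: ToyThread, ToyThreadTable and all member names are hypothetical, and a single std::mutex guarding a std::unordered_map stands in for both ThreadTableCreate_lock and ConcurrentHashTable. It assumes the behaviour David attributes to the simplified variant: the table is populated once under the lock, and threads that only appear in later lists are picked up by the linear-search fallback.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for JavaThread: an id plus an "exiting" flag.
struct ToyThread {
  int64_t tid;
  bool exiting;
};

// Hypothetical stand-in for ThreadTable; one mutex guards creation and the map.
class ToyThreadTable {
 public:
  // Double-checked lazy initialization: only the first caller populates.
  void lazy_initialize(const std::vector<ToyThread*>& threads) {
    if (_is_initialized.load(std::memory_order_acquire)) return;
    std::lock_guard<std::mutex> guard(_lock);
    if (_is_initialized.load(std::memory_order_relaxed)) return;
    for (ToyThread* t : threads) {
      if (!t->exiting) _map.emplace(t->tid, t);
    }
    _is_initialized.store(true, std::memory_order_release);
  }

  // Table lookup first; on a miss, scan the caller's list and cache the hit.
  ToyThread* find(const std::vector<ToyThread*>& threads, int64_t tid) {
    lazy_initialize(threads);
    std::lock_guard<std::mutex> guard(_lock);
    auto it = _map.find(tid);
    if (it != _map.end()) {
      return it->second->exiting ? nullptr : it->second;
    }
    ++_linear_scans;
    for (ToyThread* t : threads) {
      if (t->tid == tid && !t->exiting) {
        _map.emplace(tid, t);
        return t;
      }
    }
    return nullptr;
  }

  // Read without the lock; intended for single-threaded inspection only.
  std::size_t linear_scans() const { return _linear_scans; }

 private:
  std::atomic<bool> _is_initialized{false};
  std::mutex _lock;
  std::unordered_map<int64_t, ToyThread*> _map;
  std::size_t _linear_scans = 0;
};
```

In this model, a thread that is missing from the list used for initialization costs one linear scan on its first lookup and is served from the table afterwards, which matches the "only once per such thread" cost mentioned in the thread.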
>
> I agree, but have a little preference in favor of simplicity.
> It was a good discussion though. :)
>
> Thanks,
> Serguei
>
>> Cheers,
>> David
>> -----
>>
>>> Thanks,
>>> Serguei
>>>
>>>> The
>>>> assumption is that it's quite uncommon and even if this is the case
>>>> the linear scan happens
>>>> only once per such thread.
>>>>
>>>>  611 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
>>>>  612   ThreadTable::lazy_initialize(this);
>>>>  613   JavaThread* thread = ThreadTable::find_thread_by_tid(java_tid);
>>>>  614   if (thread == NULL) {
>>>>  615     // If the thread is not found in the table find it
>>>>  616     // with a linear search and add to the table.
>>>>  617     for (uint i = 0; i < length(); i++) {
>>>>  618       thread = thread_at(i);
>>>>  619       oop tobj = thread->threadObj();
>>>>  620       // Ignore the thread if it hasn't run yet, has exited
>>>>  621       // or is starting to exit.
>>>>  622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
>>>>  623         MutexLocker ml(Threads_lock);
>>>>  624         // Must be inside the lock to ensure that we don't add the thread to the table
>>>>  625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
>>>>  626         if (!thread->is_exiting()) {
>>>>  627           ThreadTable::add_thread(java_tid, thread);
>>>>  628           return thread;
>>>>  629         }
>>>>  630       }
>>>>  631     }
>>>>  632   } else if (!thread->is_exiting()) {
>>>>  633     return thread;
>>>>  634   }
>>>>  635   return NULL;
>>>>  636 }
>>>>
>>>> Thanks,
>>>> Daniil
>>>>
>>>> On 9/16/19, 7:27 PM, "David Holmes" wrote:
>>>>
>>>>     Hi Daniil,
>>>>     Thanks again for your perseverance on this one.
>>>>     I think there is a problem with initialization of the thread table.
>>>>     Suppose thread T1 has called ThreadsList::find_JavaThread_from_java_tid
>>>>     and has commenced execution of ThreadTable::lazy_initialize, but not yet
>>>>     marked _is_initialized as true. Now two new threads (T2 and T3) are
>>>>     created and start running - they aren't added to the ThreadTable yet
>>>>     because it isn't initialized. Now T0 also calls
>>>>     ThreadsList::find_JavaThread_from_java_tid using an updated ThreadsList
>>>>     that contains T2 and T3. It also calls ThreadTable::lazy_initialize. If
>>>>     _is_initialized is still false T0 will attempt initialization but once
>>>>     it gets the lock it will see the table has now been initialized by T1.
>>>>     It will then proceed to update the table with its own ThreadsList content
>>>>     - adding T2 and T3. That is all fine. But now suppose T0 initially sees
>>>>     _is_initialized as true: it will do nothing in lazy_initialize and
>>>>     simply return to find_JavaThread_from_java_tid. But now T2 and T3 are
>>>>     missing from the ThreadTable and nothing will cause them to be added.
>>>>     More generally, any ThreadsList that is created after the ThreadsList
>>>>     that will be used for initialization may contain threads that will not
>>>>     be added to the table.
>>>>     Thanks,
>>>>     David
>>>>     On 17/09/2019 4:18 am, Daniil Titov wrote:
>>>>     > Hello,
>>>>     >
>>>>     > After investigating with Claes the impact of this change on the performance (thanks a lot Claes for helping with it!) the conclusion was that the impact on the thread startup time is not a blocker for this change.
>>>>     >
>>>>     > I also measured the memory footprint using Native Memory Tracking and results showed around 40 bytes per live thread.
>>>>     >
>>>>     > Please review a new version of the fix, webrev.06 [1]. Just to remind,
>>>>     > webrev.05 was abandoned and webrev.06 [1] is webrev.04 [3] minus changes in src/hotspot/share/services/management.cpp (that were factored out to a separate issue [4]) plus a change in the ThreadsList::find_JavaThread_from_java_tid() method (please, see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock.
>>>>     >
>>>>     > src/hotspot/share/runtime/threadSMR.cpp
>>>>     >
>>>>     > 622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
>>>>     > 623         MutexLocker ml(Threads_lock);
>>>>     > 624         // Must be inside the lock to ensure that we don't add the thread to the table
>>>>     > 625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
>>>>     > 626         if (!thread->is_exiting()) {
>>>>     > 627           ThreadTable::add_thread(java_tid, thread);
>>>>     > 628           return thread;
>>>>     > 629         }
>>>>     > 630       }
>>>>     >
>>>>     > [1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06
>>>>     > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>     > [3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04
>>>>     > [4] https://bugs.openjdk.java.net/browse/JDK-8229391
>>>>     >
>>>>     > Thank you,
>>>>     > Daniil
>>>>     >
>>>>     >
>>>>     >
>>>>     >          >
>>>>     >          > On 8/4/19, 7:54 PM, "David Holmes" wrote:
>>>>     >          >
>>>>     >          >      Hi Daniil,
>>>>     >          >
>>>>     >          >      On 3/08/2019 8:16 am, Daniil Titov wrote:
>>>>     >          >      > Hi David,
>>>>     >          >      >
>>>>     >          >      > Thank you for your detailed review. Please review a new version of the fix that includes
>>>>     >          >      > the changes you suggested:
> - ThreadTableCreate_lock scope is reduced >>>> to cover the creation of the table only; >>>> ???? >????????? >????? > - ThreadTableCreate_lock is made >>>> _safepoint_check_always; >>>> ???? >????????? > >>>> ???? >????????? >????? Okay. >>>> ???? >????????? > >>>> ???? >????????? >????? > - ServiceThread is no longer responsible >>>> for the resizing of the thread table, instead, >>>> ???? >????????? >????? >??? the thread table is changed to grow on >>>> demand by the thread that is doing the addition; >>>> ???? >????????? > >>>> ???? >????????? >????? Okay - I'm happy to get the serviceThread out >>>> of the picture here. >>>> ???? >????????? > >>>> ???? >????????? >????? > - fixed nits and formatting issues. >>>> ???? >????????? > >>>> ???? >????????? >????? Okay. >>>> ???? >????????? > >>>> ???? >????????? >????? >>> The change also includes additional >>>> optimization for some callers of find_JavaThread_from_java_tid() >>>> ???? >????????? >????? >>>?? as Daniel suggested. >>>> ???? >????????? >????? >> Not sure it's best to combine these, but >>>> if they are limited to the >>>> ???? >????????? >????? >> changes in management.cpp only then that >>>> may be okay. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? > The additional optimization for some >>>> callers of find_JavaThread_from_java_tid() is >>>> ???? >????????? >????? > limited to management.cpp (plus a new test) >>>> so I left them in the webrev? but >>>> ???? >????????? >????? > I also could move it in the separate issue >>>> if required. >>>> ???? >????????? > >>>> ???? >????????? >????? I'd prefer this part of be separated out, but >>>> won't insist. Let's see if >>>> ???? >????????? >????? Dan or Serguei have a strong opinion. >>>> ???? >????????? > >>>> ???? >????????? >????? >??? > src/hotspot/share/runtime/threadSMR.cpp >>>> ???? >????????? >????? >??? >755???? jlong tid = >>>> SharedRuntime::get_java_tid(thread); >>>> ???? >????????? >????? >??? > 926???? 
jlong tid = >>>> SharedRuntime::get_java_tid(thread); >>>> ???? >????????? >????? >?? >? I think it cleaner/better to just use >>>> ???? >????????? >????? >?? > jlong tid = >>>> java_lang_Thread::thread_id(thread->threadObj()); >>>> ???? >????????? >????? >?? > as we know thread is not NULL, it is a >>>> JavaThread and it has to have a >>>> ???? >????????? >????? >?? > non-null threadObj. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? > I had to leave this code unchanged since it >>>> turned out the threadObj is null >>>> ???? >????????? >????? > when VM is destroyed: >>>> ???? >????????? >????? > >>>> ???? >????????? >????? > V? [libjvm.so+0xe165d7] >>>> oopDesc::long_field(int) const+0x67 >>>> ???? >????????? >????? > V? [libjvm.so+0x16e06c6] >>>> ThreadsSMRSupport::add_thread(JavaThread*)+0x116 >>>> ???? >????????? >????? > V? [libjvm.so+0x16d1302] >>>> Threads::add(JavaThread*, bool)+0x82 >>>> ???? >????????? >????? > V? [libjvm.so+0xef8369] >>>> attach_current_thread.part.197+0xc9 >>>> ???? >????????? >????? > V? [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c >>>> ???? >????????? >????? > C? [libjli.so+0x4333] JavaMain+0x2c3 >>>> ???? >????????? >????? > C? [libjli.so+0x8159] ThreadJavaMain+0x9 >>>> ???? >????????? > >>>> ???? >????????? >????? This is actually nothing to do with the VM >>>> being destroyed, but is an >>>> ???? >????????? >????? issue with JNI_AttachCurrentThread and its >>>> interaction with the >>>> ???? >????????? >????? ThreadSMR iterators. The attach process is: >>>> ???? >????????? >????? - create JavaThread >>>> ???? >????????? >????? - mark as "is attaching via jni" >>>> ???? >????????? >????? - add to ThreadsList >>>> ???? >????????? >????? - create java.lang.Thread object (you can >>>> only execute Java code after >>>> ???? >????????? >????? you are attached) >>>> ???? >????????? >????? - mark as "attach completed" >>>> ???? >????????? > >>>> ???? >????????? >????? 
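The attach sequence listed above leaves a window in which a thread is already on the list but has no java.lang.Thread object yet. A minimal sketch of that window, and of a guard that keeps such a thread out of the id table, might look as follows. ToyThreadObj, ToyJavaThread, ToyRegistry and attach() are hypothetical names for illustration and are not HotSpot types.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for java.lang.Thread: it only carries the thread id.
struct ToyThreadObj {
  int64_t tid;
};

// Hypothetical stand-in for JavaThread.
struct ToyJavaThread {
  bool attaching_via_jni = false;
  std::unique_ptr<ToyThreadObj> thread_obj;  // null until the object is created
};

class ToyRegistry {
 public:
  // Models "add to ThreadsList": the thread becomes visible to scans, but it
  // is entered in the id table only if it is not in the middle of attaching.
  void add_thread(ToyJavaThread* t) {
    _list.push_back(t);
    if (!t->attaching_via_jni) {
      assert(t->thread_obj != nullptr);
      _table.emplace(t->thread_obj->tid, t);
    }
  }

  // Linear scan that tolerates threads without a thread object.
  ToyJavaThread* find_by_scan(int64_t tid) const {
    for (ToyJavaThread* t : _list) {
      if (t->thread_obj && t->thread_obj->tid == tid) return t;
    }
    return nullptr;
  }

  bool in_table(int64_t tid) const { return _table.count(tid) != 0; }

 private:
  std::vector<ToyJavaThread*> _list;
  std::unordered_map<int64_t, ToyJavaThread*> _table;
};

// Models the attach steps in order; the caller has already created the thread.
inline void attach(ToyRegistry& reg, ToyJavaThread& t, int64_t tid) {
  t.attaching_via_jni = true;                 // mark as "is attaching via jni"
  reg.add_thread(&t);                         // add to ThreadsList
  t.thread_obj.reset(new ToyThreadObj{tid});  // create java.lang.Thread object
  t.attaching_via_jni = false;                // mark as "attach completed"
}
```

In this model a thread attached this way never enters the table through add_thread(); it stays reachable through the scan, so a fallback lookup would be the path that eventually registers it.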
So while a thread "is attaching" it will be >>>> seen by the ThreadSMR thread >>>> ???? >????????? >????? iterator but will have a NULL >>>> java.lang.Thread object. >>>> ???? >????????? > >>>> ???? >????????? >????? We special-case attaching threads in a number >>>> of places in the VM and I >>>> ???? >????????? >????? think we should be explicitly doing something >>>> here to filter out >>>> ???? >????????? >????? attaching threads, rather than just being >>>> tolerant of a NULL j.l.Thread >>>> ???? >????????? >????? object. Specifically in >>>> ThreadsSMRSupport::add_thread: >>>> ???? >????????? > >>>> ???? >????????? >????? if (ThreadTable::is_initialized() && >>>> !thread->is_attaching_via_jni()) { >>>> ???? >????????? >???????? jlong tid = >>>> java_lang_Thread::thread_id(thread->threadObj()); >>>> ???? >????????? >???????? ThreadTable::add_thread(tid, thread); >>>> ???? >????????? >????? } >>>> ???? >????????? > >>>> ???? >????????? >????? Note that in ThreadsSMRSupport::remove_thread >>>> we can use the same guard, >>>> ???? >????????? >????? which covers the case the JNI attach >>>> encountered an error trying to >>>> ???? >????????? >????? create the j.l.Thread object. >>>> ???? >????????? > >>>> ???? >????????? >????? >> src/hotspot/share/services/threadTable.cpp >>>> ???? >????????? >????? >> 71???? static uintx get_hash(Value const& >>>> value, bool* is_dead) { >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >> The is_dead parameter still bothers me >>>> here. I can't make enough sense >>>> ???? >????????? >????? >> out of the template code in >>>> ConcurrentHashtable to see why we have to >>>> ???? >????????? >????? >> have it, but I'm concerned that its very >>>> existence means we perhaps >>>> ???? >????????? >????? >> should not be trying to extend CHT in this >>>> context. ?? >>>> ???? >????????? >????? > >>>> ???? >????????? >????? > My understanding is that is_dead parameter >>>> provides a mechanism for >>>> ???? >????????? >????? 
> ConcurrentHashtable to remove stale entries >>>> that were not explicitly >>>> ???? >????????? >????? > removed by calling >>>> ConcurrentHashTable::remove() method. >>>> ???? >????????? >????? > I think that just because in our case we >>>> don't use this mechanism doesn't >>>> ???? >????????? >????? > mean we should not use ConcurrentHashTable. >>>> ???? >????????? > >>>> ???? >????????? >????? Can you confirm that this usage is okay with >>>> Robbin Ehn please. He's >>>> ???? >????????? >????? back from vacation this week. >>>> ???? >????????? > >>>> ???? >????????? >????? >> I would still want to see what impact this >>>> has on thread >>>> ???? >????????? >????? >> startup cost, both with and without the >>>> table being initialized. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? > I run a test that initializes the table by >>>> calling ThreadMXBean.get getThreadInfo(), >>>> ???? >????????? >????? > starts some threads as a worm-up, and then >>>> creates and starts 100,000 threads >>>> ???? >????????? >????? > (each thread just sleeps for 100 ms). In >>>> case when the thread table is enabled >>>> ???? >????????? >????? > 100,000 threads are created and started >>>> for about 15200 ms. If the thread table >>>> ???? >????????? >????? > is off the test takes about 14800 ms. Based >>>> on this information the enabled >>>> ???? >????????? >????? > thread table makes the thread startup about >>>> 2.7% slower. >>>> ???? >????????? > >>>> ???? >????????? >????? That doesn't sound very good. I think we may >>>> need to Claes involved to >>>> ???? >????????? >????? help investigate overall performance impact >>>> here. >>>> ???? >????????? > >>>> ???? >????????? >????? > Webrev: >>>> https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/ >>>> ???? >????????? >????? > Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8185005 >>>> ???? >????????? > >>>> ???? >????????? >????? No further code comments. >>>> ???? >????????? > >>>> ???? >????????? >????? 
I didn't look at the test in detail. >>>> ???? >????????? > >>>> ???? >????????? >????? Thanks, >>>> ???? >????????? >????? David >>>> ???? >????????? > >>>> ???? >????????? >????? > Thanks! >>>> ???? >????????? >????? > --Daniil >>>> ???? >????????? >????? > >>>> ???? >????????? >????? > >>>> ???? >????????? >????? > ?On 7/29/19, 12:53 AM, "David Holmes" >>>> wrote: >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? Hi Daniil, >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? Overall I think this is a reasonable >>>> approach but I would still like to >>>> ???? >????????? >????? >????? see some performance and footprint >>>> numbers, both to verify it fixes the >>>> ???? >????????? >????? >????? problem reported, and that we are not >>>> getting penalized elsewhere. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? On 25/07/2019 3:21 am, Daniil Titov >>>> wrote: >>>> ???? >????????? >????? >????? > Hi David, Daniel, and Serguei, >>>> ???? >????????? >????? >????? > >>>> ???? >????????? >????? >????? > Please review the new version of the >>>> fix, that makes the thread table initialization on demand and >>>> ???? >????????? >????? >????? > moves it inside >>>> ThreadsList::find_JavaThread_from_java_tid(). At the creation time >>>> the thread table >>>> ???? >????????? >????? >????? >?? is initialized with the threads >>>> from the current thread list. We don't want to hold Threads_lock >>>> ???? >????????? >????? >????? > inside >>>> find_JavaThread_from_java_tid(),? thus new threads still could be >>>> created? while the thread >>>> ???? >????????? >????? >????? > table is being initialized . Such >>>> threads will be found by the linear search and added to the thread >>>> table >>>> ???? >????????? >????? >????? > later, in >>>> ThreadsList::find_JavaThread_from_java_tid(). >>>> ????? >????????? >????? > >>>> ???? >????????? >????? >????? 
The initialization allows the created >>>> but unpopulated, or partially >>>> ???? >????????? >????? >????? populated, table to be seen by other >>>> threads - is that your intention? >>>> ???? >????????? >????? >????? It seems it should be okay as the >>>> other threads will then race with the >>>> ???? >????????? >????? >????? initializing thread to add specific >>>> entries, and this is a concurrent >>>> ???? >????????? >????? >????? map so that should be functionally >>>> correct. But if so then I think you >>>> ???? >????????? >????? >????? can also reduce the scope of the >>>> ThreadTableCreate_lock so that it >>>> ???? >????????? >????? >????? covers creation of the table only, not >>>> the initial population of the table. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? I like the approach of only >>>> initializing the table when needed and using >>>> ???? >????????? >????? >????? that to control when the >>>> add/remove-thread code needs to update the >>>> ???? >????????? >????? >????? table. But I would still want to see >>>> what impact this has on thread >>>> ???? >????????? >????? >????? startup cost, both with and without >>>> the table being initialized. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? > The change also includes additional >>>> optimization for some callers of find_JavaThread_from_java_tid() >>>> ???? >????????? >????? >????? > as Daniel suggested. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? Not sure it's best to combine these, >>>> but if they are limited to the >>>> ???? >????????? >????? >????? changes in management.cpp only then >>>> that may be okay. It helps to be >>>> ???? >????????? >????? >????? able to focus on the table related >>>> changes without being distracted by >>>> ???? >????????? >????? >????? other optimizations. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? 
> That is correct that >>>> ResolvedMethodTable was used as a blueprint for the thread table, >>>> however, I tried >>>> ???? >????????? >????? >????? > to strip it of the all functionality >>>> that is not required in the thread table case. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? The revised version seems better in >>>> that regard. But I still have a >>>> ???? >????????? >????? >????? concern, see below. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? > We need to have the thread table >>>> resizable and allow it to grow as the number of threads increases to >>>> avoid >>>> ???? >????????? >????? >????? > reserving excessive memory a-priori >>>> or deteriorating lookup times. The ServiceThread is responsible for >>>> ???? >????????? >????? >????? > growing the thread table when required. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? Yes but why? Why can't this table be >>>> grown on demand by the thread that >>>> ???? >????????? >????? >????? is doing the addition? For other >>>> tables we may have to delegate to the >>>> ???? >????????? >????? >????? service thread because the current >>>> thread cannot perform the action, or >>>> ???? >????????? >????? >????? it doesn't want to perform it at the >>>> time the need for the resize is >>>> ???? >????????? >????? >????? detected (e.g. its detected at a >>>> safepoint and you want the resize to >>>> ???? >????????? >????? >????? happen later outside the safepoint). >>>> It's not apparent to me that such >>>> ???? >????????? >????? >????? restrictions apply here. >>>> ???? >????????? >????? > >>>> ???? >????????? >????? >????? > There is no ConcurrentHashTable >>>> available in Java 8 and for backporting this fix to Java 8 another >>>> implementation >>>> ???? >????????? >????? >????? > of the hash table, probably >>>> originally suggested in the patch attached to the JBS issue, should >>>> be used.? It will make >>>> ???? >????????? >????? >????? 
> the backporting more complicated, however, adding a new implementation of the hash table in Java 14 while it
>>>>      >          >      >      > already has ConcurrentHashTable doesn't seem reasonable for me.
>>>>      >          >      >
>>>>      >          >      >      Ok.
>>>>      >          >      >
>>>>      >          >      >      > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03
>>>>      >          >      >
>>>>      >          >      >      Some specific code comments:
>>>>      >          >      >
>>>>      >          >      >      src/hotspot/share/runtime/mutexLocker.cpp
>>>>      >          >      >
>>>>      >          >      >      + def(ThreadTableCreate_lock       , PaddedMutex  , special,
>>>>      >          >      >      false, Monitor::_safepoint_check_never);
>>>>      >          >      >
>>>>      >          >      >      I think this needs to be a _safepoint_check_always lock. The table will
>>>>      >          >      >      be created by regular JavaThreads and they should (nearly) always be
>>>>      >          >      >      checking for safepoints if they are going to block acquiring the lock.
>>>>      >          >      >      And it isn't at all obvious that the thread doing the creation can't go
>>>>      >          >      >      to a safepoint whilst this lock is held.
>>>>      >          >      >
>>>>      >          >      >      ---
>>>>      >          >      >
>>>>      >          >      >      src/hotspot/share/runtime/threadSMR.cpp
>>>>      >          >      >
>>>>      >          >      >      Nit:
>>>>      >          >      >
>>>>      >          >      >        618       JavaThread* thread = thread_at(i);
>>>>      >          >      >
>>>>      >          >      >      you could reuse the new java_thread local you introduced at line 613 and
>>>>      >          >      >      just rename that "new" variable to "thread" so you don't have to change
>>>>      >          >      >      all other uses.
>>>>      >          >      >
>>>>      >          >      >      628   } else if (java_thread != NULL && ...
>>>>      >          >      >
>>>>      >          >      >      You don't need to check != NULL here as you only get here when
>>>>      >          >      >      java_thread is not NULL.
>>>>      >          >      >
>>>>      >          >      >        755     jlong tid = SharedRuntime::get_java_tid(thread);
>>>>      >          >      >        926     jlong tid = SharedRuntime::get_java_tid(thread);
>>>>      >          >      >
>>>>      >          >      >      I think it cleaner/better to just use
>>>>      >          >      >
>>>>      >          >      >      jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>>>      >          >      >
>>>>      >          >      >      as we know thread is not NULL, it is a JavaThread and it has to have a
>>>>      >          >      >      non-null threadObj.
>>>>      >          >      >
>>>>      >          >      >      ---
>>>>      >          >      >
>>>>      >          >      >      src/hotspot/share/services/management.cpp
>>>>      >          >      >
>>>>      >          >      >      1323         if (THREAD->is_Java_thread()) {
>>>>      >          >      >      1324 JavaThread* current_thread = (JavaThread*)THREAD;
>>>>      >          >      >
>>>>      >          >      >      These calls can only be made on a JavaThread so this can be simplified to
>>>>      >          >      >      remove the is_Java_thread() call. Similarly in other places.
>>>>      >          >      >
>>>>      >          >      >      ---
>>>>      >          >      >
>>>>      >          >      >      src/hotspot/share/services/threadTable.cpp
>>>>      >          >      >
>>>>      >          >      >         55 class ThreadTableEntry : public CHeapObj {
>>>>      >          >      >         56   private:
>>>>      >          >      >         57     jlong _tid;
>>>>      >          >      >
>>>>      >          >      >      I believe hotspot style is to not indent the access modifiers in C++
>>>>      >          >      >      class declarations, so the above would just be:
>>>>      >          >      >
>>>>      >          >      >         55 class ThreadTableEntry : public CHeapObj {
>>>>      >          >      >         56 private:
>>>>      >          >      >         57   jlong _tid;
>>>>      >          >      >
>>>>      >          >      >      etc.
>>>>      >          >      >
>>>>      >          >      >        60 ThreadTableEntry(jlong tid, JavaThread* java_thread) :
>>>>      >          >      >        61 _tid(tid),_java_thread(java_thread) {}
>>>>      >          >      >
>>>>      >          >      >      line 61 should be indented as it continues line 60.
>>>>      >          >      >
>>>>      >          >      >         67 class ThreadTableConfig : public AllStatic {
>>>>      >          >      >         ...
>>>>      >          >      >         71     static uintx get_hash(Value const& value, bool* is_dead) {
>>>>      >          >      >
>>>>      >          >      >      The is_dead parameter still bothers me here. I can't make enough sense
>>>>      >          >      >      out of the template code in ConcurrentHashtable to see why we have to
>>>>      >          >      >      have it, but I'm concerned that its very existence means we perhaps
>>>>      >          >      >      should not be trying to extend CHT in this context. ??
>>>>      >          >      >
>>>>      >          >      >        115   size_t start_size_log = size_log > DefaultThreadTableSizeLog
>>>>      >          >      >        116   ? size_log : DefaultThreadTableSizeLog;
>>>>      >          >      >
>>>>      >          >      >      line 116 should be indented, though in this case I think a better layout
>>>>      >          >      >      would be:
>>>>      >          >      >
>>>>      >          >      >        115   size_t start_size_log =
>>>>      >          >      >        116       size_log > DefaultThreadTableSizeLog ? size_log :
>>>>      >          >      >                  DefaultThreadTableSizeLog;
>>>>      >          >      >
>>>>      >          >      >        131 double ThreadTable::get_load_factor() {
>>>>      >          >      >        132   return (double)_items_count/_current_size;
>>>>      >          >      >        133 }
>>>>      >          >      >
>>>>      >          >      >      Not sure that is doing what you want/expect. It will perform integer
>>>>      >          >      >      division and then cast that whole integer to a double. If you want
>>>>      >          >      >      double arithmetic you need:
>>>>      >          >      >
>>>>      >          >      >      return ((double)_items_count)/_current_size;
>>>>      >          >      >
>>>>      >          >      >      180     jlong _tid;
>>>>      >          >      >      181     uintx _hash;
>>>>      >          >      >
>>>>      >          >      >      Nit: no need for all those spaces before the variable name.
>>>>      >          >      >
>>>>      >          >      >        183 ThreadTableLookup(jlong tid)
>>>>      >          >      >        184     : _tid(tid), _hash(primitive_hash(tid)) {}
>>>>      >          >      >
>>>>      >          >      >      line 184 should be indented.
>>>>      >          >      >
>>>>      >          >      >      201 ThreadGet():_return(NULL) {}
>>>>      >          >      >
>>>>      >          >      >      Nit: need space after :
>>>>      >          >      >
>>>>      >          >      >        211 assert(_is_initialized, "Thread table is not initialized");
>>>>      >          >      >        212   _has_work = false;
>>>>      >          >      >
>>>>      >          >      >      line 211 is indented one space too far.
>>>>      >          >      >
>>>>      >          >      >      229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread);
>>>>      >          >      >
>>>>      >          >      >      Nit: need space after ,
>>>>      >          >      >
>>>>      >          >      >      252   return _local_table->remove(thread,lookup);
>>>>      >          >      >
>>>>      >          >      >      Nit: need space after ,
>>>>      >          >      >
>>>>      >          >      >      Thanks,
>>>>      >          >      >      David
>>>>      >          >      >      ------
>>>>      >          >      >
>>>>      >          >      >      > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>      >          >      >      >
>>>>      >          >      >      > Thanks!
>>>>      >          >      >      > --Daniil
>>>>      >          >      >      >
>>>>      >          >      >      >
>>>>      >          >      >      >  On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote:
>>>>      >          >      >      >
>>>>      >          >      >      >      On 6/29/19 12:06 PM, Daniil Titov wrote:
>>>>      >          >      >      >      > Hi Serguei and David,
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > Please find a new version of the fix that includes the changes Serguei suggested.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > Regarding the concern about maintaining the thread table when it may never even be queried, one of
>>>>      >          >      >      >      > the options could be to add a ThreadTable::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table
>>>>      >          >      >      >      > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable::isEnabled
>>>>      >          >      >      >      > is on and if not then set it on and populate the thread table with all existing threads from the thread list.
>>>>      >          >      >      >
>>>>      >          >      >      >      I have the same concerns as David H. about this new ThreadTable.
>>>>      >          >      >      >      ThreadsList::find_JavaThread_from_java_tid() is only called from code
>>>>      >          >      >      >      in src/hotspot/share/services/management.cpp so I think that table
>>>>      >          >      >      >      needs to be enabled and populated only if it is going to be used.
>>>>      >          >      >      >
>>>>      >          >      >      >      I've taken a look at the webrev below and I see that David has
>>>>      >          >      >      >      followed up with additional comments. Before I do a crawl through
>>>>      >          >      >      >      code review for this, I would like to see the ThreadTable stuff
>>>>      >          >      >      >      made optional and David's other comments addressed.
>>>>      >          >      >      >
>>>>      >          >      >      >      Another possible optimization is for callers of
>>>>      >          >      >      >      find_JavaThread_from_java_tid() to save the calling thread's
>>>>      >          >      >      >      tid value before they loop and if the current tid == saved_tid
>>>>      >          >      >      >      then use the current JavaThread* instead of calling
>>>>      >          >      >      >      find_JavaThread_from_java_tid() to get the JavaThread*.
>>>>      >          >      >      >
>>>>      >          >      >      >      Dan
>>>>      >          >      >      >
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/
>>>>      >          >      >      >      > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > Thanks!
>>>>      >          >      >      >      > --Daniil
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > From:
>>>>      >          >      >      >      > Organization: Oracle Corporation
>>>>      >          >      >      >      > Date: Friday, June 28, 2019 at 7:56 PM
>>>>      >          >      >      >      > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net"
>>>>      >          >      >      >      > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > Hi Daniil,
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > I have several quick comments.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > The indent in the hotspot c/c++ files has to be 2, not 4.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html
>>>>      >          >      >      >      > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
>>>>      >          >      >      >      > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
>>>>      >          >      >      >      > 616     if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) {
>>>>      >          >      >      >      > 617         // ThreadsSMRSupport::add_thread() is not called for the primordial
>>>>      >          >      >      >      > 618         // thread. Thus, we find this thread with a linear search and add it
>>>>      >          >      >      >      > 619         // to the thread table.
>>>>      >          >      >      >      > 620         for (uint i = 0; i < length(); i++) {
>>>>      >          >      >      >      > 621             JavaThread* thread = thread_at(i);
>>>>      >          >      >      >      > 622             if (is_valid_java_thread(java_tid,thread)) {
>>>>      >          >      >      >      > 623                 ThreadTable::add_thread(java_tid, thread);
>>>>      >          >      >      >      > 624                 return thread;
>>>>      >          >      >      >      > 625             }
>>>>      >          >      >      >      > 626         }
>>>>      >          >      >      >      > 627     } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
>>>>      >          >      >      >      > 628         return java_thread;
>>>>      >          >      >      >      > 629     }
>>>>      >          >      >      >      > 630     return NULL;
>>>>      >          >      >      >      > 631 }
>>>>      >          >      >      >      > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) {
>>>>      >          >      >      >      > 633     oop tobj = java_thread->threadObj();
>>>>      >          >      >      >      > 634     // Ignore the thread if it hasn't run yet, has exited
>>>>      >          >      >      >      > 635     // or is starting to exit.
>>>>      >          >      >      >      > 636     return (tobj != NULL && !java_thread->is_exiting() &&
>>>>      >          >      >      >      > 637             java_tid == java_lang_Thread::thread_id(tobj));
>>>>      >          >      >      >      > 638 }
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
>>>>      >          >      >      >      >
>>>>      >          >      >      >      >    I'd suggest to rename find_thread() to find_thread_by_tid().
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > A space is missed after the comma:
>>>>      >          >      >      >      >    622 if (is_valid_java_thread(java_tid,thread)) {
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > An empty line is needed before L632.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > The name 'is_valid_java_thread' looks wrong (or confusing) to me.
>>>>      >          >      >      >      > Something like 'is_alive_java_thread_with_tid()' would be better.
>>>>      >          >      >      >      > It'd better to list parameters in the opposite order.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > The call to is_valid_java_thread() is confusing:
>>>>      >          >      >      >      >     627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid?
>>>>      >          >      >      >      >
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > Thanks,
>>>>      >          >      >      >      > Serguei
>>>>      >          >      >      >      >
>>>>      >          >      >      >      > On 6/28/19, 9:40 PM, "David Holmes" wrote:
>>>>      >          >      >      >      >
>>>>      >          >      >      >      >      Hi Daniil,
>>>>      >          >      >      >      >
>>>>      >          >      >      >      >      The definition and use of this hashtable (yet another hashtable
>>>>      >          >      >      >      >      implementation!) will need careful examination. We have to be concerned
>>>>      >          >      >      >      >      about the cost of maintaining it when it may never even be queried. You
>>>>      >          >      >      >      >      would need to look at footprint cost and performance impact.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      >      Unfortunately I'm just about to board a plane and will be out for the
>>>>      >          >      >      >      >      next few days. I will try to look at this asap next week, but we will
>>>>      >          >      >      >      >      need a lot more data on it.
>>>>      >          >      >      >      >
>>>>      >          >      >      >      >      Thanks,
>>>>      >          >      >      >      >      David
>>>>      >          >      >      >      >
>>>>      >          >      >      >      >      On 6/28/19 3:31 PM, Daniil Titov wrote:
>>>>      >          >      >      >      >      > Please review the change that improves performance of ThreadMXBean MXBean methods returning the
>>>>      >          >      >      >      >      > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable
>>>>      >          >      >      >      >      > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear
>>>>      >          >      >      >      >      > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup
>>>>      >          >      >      >      >      > in the thread table.
>>>>      >          >      >      >      >      >
>>>>      >          >      >      >      >      > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed.
>>>>      >          >      >      >      >      >
>>>>      >          >      >      >      >      > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/
>>>>      >          >      >      >      >      > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>      >          >      >      >      >      >
>>>>      >          >      >      >      >      > Thanks!
>>>>      >          >      >      >      >      >
>>>>      >          >      >      >      >      > Best regards,
>>>>      >          >      >      >      >      > Daniil
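
The idea in the original request quoted above (keep a tid-to-thread table and fall back to a linear scan of the thread list on a miss) can be sketched in standalone C++. This is only a minimal single-threaded illustration under stated assumptions, not HotSpot code: FakeThread, ThreadRegistry and linear_hits() are hypothetical names, std::unordered_map stands in for ConcurrentHashTable, and all locking (Threads_lock, ThreadTableCreate_lock) is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a VM thread; this is NOT HotSpot's JavaThread.
struct FakeThread {
  std::int64_t tid;
  bool exiting;
};

// Toy registry: the vector plays the role of the thread list and the
// unordered_map plays the role of the thread table (assumption: no
// concurrency, so no locks are modelled here).
class ThreadRegistry {
 public:
  void add(FakeThread* t) {
    assert(t != nullptr);
    list_.push_back(t);
  }

  // Table lookup first; on a miss, scan the list once and cache the hit.
  FakeThread* find_by_tid(std::int64_t tid) {
    auto it = table_.find(tid);
    if (it != table_.end()) {
      // A cached thread that has started exiting is reported as not found.
      return it->second->exiting ? nullptr : it->second;
    }
    for (FakeThread* t : list_) {
      if (t->tid == tid && !t->exiting) {
        table_[tid] = t;   // later lookups for this tid skip the scan
        ++linear_hits_;
        return t;
      }
    }
    return nullptr;
  }

  // Number of lookups that were satisfied by the linear scan.
  int linear_hits() const { return linear_hits_; }

 private:
  std::vector<FakeThread*> list_;
  std::unordered_map<std::int64_t, FakeThread*> table_;
  int linear_hits_ = 0;
};
```

In this sketch the scan happens at most once per live thread id; every later lookup for the same id is served from the map, which is the property the thread argues about for the real table.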

From serguei.spitsyn at oracle.com  Wed Sep 18 07:34:55 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Sep 2019 00:34:55 -0700
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <48c3ef87-a171-18d1-fbe8-ed4dcb622193@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com>
 <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com>
 <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com>
 <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com>
 <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com>
 <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com>
 <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com>
 <9A125A5A-3904-4E3B-9650-308B56E15F20@oracle.com>
 <0b307a92-5b5d-bd36-a128-99af6d0f3b1b@oracle.com>
 <88ec033f-a216-3e0a-8e27-b82fa4728055@oracle.com>
 <48c3ef87-a171-18d1-fbe8-ed4dcb622193@oracle.com>
Message-ID:

Hi David,


On 9/18/19 00:25, David Holmes wrote:
> Hi Serguei,
>
> In the interests of full disclosure I was the one who told Daniil to
> not hold the lock while populating the table.

I well understand your point about not holding the lock while populating
the table.

>
> I'll leave you two to work out which way you want to go there.

In fact, I have no strong opinion here, just my gut feelings. :)
And I'm open to any solution.
Just wanted to share my thoughts with you, guys, to have all reasoning
on the table.
Let me briefly talk to Daniil tomorrow.

Thanks,
Serguei

> Thanks,
> David
>
> On 18/09/2019 5:13 pm, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>>
>> On 9/17/19 03:46, David Holmes wrote:
>>> Hi Serguei,
>>>
>>> On 17/09/2019 7:10 pm, serguei.spitsyn at oracle.com wrote:
>>>> Hi Daniil,
>>>>
>>>>
>>>> On 9/16/19 21:36, Daniil Titov wrote:
>>>>> Hi David,
>>>>>
>>>>> The case you have described is exactly the reason why we still have
>>>>> code inside the ThreadsList::find_JavaThread_from_java_tid() method
>>>>> that does a linear scan and adds the requested thread to the thread
>>>>> table if it is not there (lines 614-631 below).
>>>>
>>>> I disagree because it is easy to avoid concurrent ThreadTable
>>>> initialization (please, see my separate email).
>>>> The reason for this code is to cover a case of late/lazy
>>>> ThreadTable initialization.
>>>
>>> I'm not sure I follow. With the current code if two threads are
>>> racing to initialize the ThreadTable with ThreadsLists that contain
>>> a different set of threads then there are two possibilities with
>>> regards to the interleaving. Assume T1 initializes the table with
>>> its set of threads and so finds the tid it is looking for in the
>>> table. Meanwhile T2 is racing with the initialization logic:
>>>
>>> - If T2 sees _is_initialized then lazy_initialize does nothing
>>> for T2, and the additional threads in its ThreadsList (say T3 and
>>> T4) are not added to the table. But the specific thread associated
>>> with the tid (say T3) will be found by linear search of the
>>> ThreadsList and then added. If any other threads come searching for
>>> T4 they too will not find it in the ThreadTable but instead perform
>>> the linear search of their ThreadsList (and add it).
>>>
>>> - If T2 doesn't see _is_initialized at first it will try to acquire
>>> the lock, and eventually see _is_initialized is true, at which point
>>> it will try to add all of its threads to the table (so T3 and T4
>>> will be added). When lazy_initialize returns, T3 will be found in
>>> the table and returned. If any other threads come searching for T4
>>> they will also find it in the table.
>>
>> My main concerns are simplicity and reliability.
>> I do not care much about extra overhead at the ThreadTable
>> initialization.
>> The probability of ThreadTable::lazy_initialize() being called
>> concurrently is low.
>> Also, it might happen only once for the whole VM process execution.
>>
>> I was wrong in thinking that adding new threads to the ThreadTable
>> after its initialization will result in a linear thread search as
>> well. So my conclusion was that we should not care if it happens
>> once more at the lazy initialization point.
>> But now I see that after ThreadTable::is_initialized() returns true
>> the ThreadsSMRSupport::add_thread() makes a call to
>> ThreadTable::add_thread(). So, no more linear search will happen.
>>
>>
>> However, it seems to me, a possible concurrent lazy initialization in
>> webrev.06 introduces its own extra overhead - competing threads
>> (suppose we have just two of them) will do the same amount of work
>> concurrently:
>>     - all indirect memory readings, lookup/comparisons and other
>>       checks will be performed twice
>>     - a new ThreadTableEntry can be allocated twice for some threads in
>>       the list (depending on how ConcurrentHashTable is implemented,
>>       there can be a potential memory leak)
>>
>> So, I doubt we win much in performance here but can lose in
>> reliability.
>>
>>
>> I'd suggest to simplify the lazy initialization code and make it more
>> reliable this way:
>>
>>       if (!_is_initialized) {
>>         MutexLocker ml(ThreadTableCreate_lock);
>>         if (!_is_initialized) {
>>           create_table(threads->length());
>>           _is_initialized = true;
>>         }
>>         for (uint i = 0; i < threads->length(); i++) {
>>           JavaThread* thread = threads->thread_at(i);
>>           oop tobj = thread->threadObj();
>>           if (tobj != NULL && !thread->is_exiting()) {
>>             jlong java_tid = java_lang_Thread::thread_id(tobj);
>>             add_thread(java_tid, thread);
>>           }
>>         }
>>       }
>>
>>> With your suggested code change this second case is not possible so
>>> for any racing initialization the lookup of any threads not in the
>>> original ThreadsList will always result in using the linear search
>>> before adding to the table.
>>
>> Yes, but I did not care about this.
>> The overhead is expected to be lower than the lazy initialization cost,
>> especially because the probability of concurrent initialization is low.
>>
>>> Both seem correct to me. Which one is more efficient will totally
>>> depend on the number of differences between the ThreadsLists and
>>> whether the code ever tries to look up those additional threads. If
>>> we assume racing initialization is likely to be rare anyway (because
>>> generally one thread is in charge of doing the monitoring) then the
>>> choice seems somewhat arbitrary.
>>
>> I agree, but have a little preference in favor of simplicity.
>> It was a good discussion though. :)
>>
>> Thanks,
>> Serguei
>>
>>> Cheers,
>>> David
>>> -----
>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>> The assumption is that it's quite uncommon and even if this is the
>>>>> case the linear scan happens only once per such thread.
>>>>>
>>>>>   611 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
>>>>>   612   ThreadTable::lazy_initialize(this);
>>>>>   613   JavaThread* thread = ThreadTable::find_thread_by_tid(java_tid);
>>>>>   614   if (thread == NULL) {
>>>>>   615     // If the thread is not found in the table find it
>>>>>   616     // with a linear search and add to the table.
>>>>>   617     for (uint i = 0; i < length(); i++) {
>>>>>   618       thread = thread_at(i);
>>>>>   619       oop tobj = thread->threadObj();
>>>>>   620       // Ignore the thread if it hasn't run yet, has exited
>>>>>   621       // or is starting to exit.
>>>>>   622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
>>>>>   623         MutexLocker ml(Threads_lock);
>>>>>   624         // Must be inside the lock to ensure that we don't add the thread to the table
>>>>>   625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
>>>>>   626         if (!thread->is_exiting()) {
>>>>>   627           ThreadTable::add_thread(java_tid, thread);
>>>>>   628           return thread;
>>>>>   629         }
>>>>>   630       }
>>>>>   631     }
>>>>>   632   } else if (!thread->is_exiting()) {
>>>>>   633       return thread;
>>>>>   634   }
>>>>>   635   return NULL;
>>>>>   636 }
>>>>>
>>>>> Thanks,
>>>>> Daniil
>>>>>
>>>>>  On 9/16/19, 7:27 PM, "David Holmes" wrote:
>>>>>
>>>>>      Hi Daniil,
>>>>>      Thanks again for your perseverance on this one.
>>>>>      I think there is a problem with initialization of the thread
>>>>>      table.
>>>>>      Suppose thread T1 has called ThreadsList::find_JavaThread_from_java_tid
>>>>>      and has commenced execution of ThreadTable::lazy_initialize, but not yet
>>>>>      marked _is_initialized as true. Now two new threads (T2 and T3) are
>>>>>      created and start running - they aren't added to the ThreadTable yet
>>>>>      because it isn't initialized. Now T0 also calls
>>>>>      ThreadsList::find_JavaThread_from_java_tid using an updated ThreadsList
>>>>>      that contains T2 and T3. It also calls ThreadTable::lazy_initialize. If
>>>>>      _is_initialized is still false T0 will attempt initialization but once
>>>>>      it gets the lock it will see the table has now been initialized by T1.
>>>>>      It will then proceed to update the table with its own ThreadsList content
>>>>>      - adding T2 and T3. That is all fine. But now suppose T0 initially sees
>>>>>      _is_initialized as true, it will do nothing in lazy_initialize and
>>>>>      simply return to find_JavaThread_from_java_tid. But now T2 and T3 are
>>>>>      missing from the ThreadTable and nothing will cause them to be added.
>>>>>      More generally any ThreadsList that is created after the ThreadsList
>>>>>      that will be used for initialization, may contain threads that will not
>>>>>      be added to the table.
>>>>>      Thanks,
>>>>>      David
>>>>>      On 17/09/2019 4:18 am, Daniil Titov wrote:
>>>>>      > Hello,
>>>>>      >
>>>>>      > After investigating with Claes the impact of this change on
>>>>>      > the performance (thanks a lot Claes for helping with it!) the
>>>>>      > conclusion was that the impact on the thread startup time is not a
>>>>>      > blocker for this change.
>>>>>      >
>>>>>      > I also measured the memory footprint using Native Memory
>>>>>      > Tracking and results showed around 40 bytes per live thread.
>>>>>      >
>>>>>      > Please review a new version of the fix, webrev.06 [1].
>>>>>      > Just to remind, webrev.05 was abandoned and webrev.06 [1] is
>>>>>      > webrev.04 [3] minus changes in
>>>>>      > src/hotspot/share/services/management.cpp (that were factored out
>>>>>      > to a separate issue [4]) and plus a change in
>>>>>      > ThreadsList::find_JavaThread_from_java_tid() method (please, see
>>>>>      > below) that addresses the problem Robbin found and puts the code
>>>>>      > that adds a new thread to the thread table inside Threads_lock.
>>>>>      >
>>>>>      > src/hotspot/share/runtime/threadSMR.cpp
>>>>>      >
>>>>>      > 622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
>>>>>      > 623         MutexLocker ml(Threads_lock);
>>>>>      > 624         // Must be inside the lock to ensure that we don't add the thread to the table
>>>>>      > 625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
>>>>>      > 626         if (!thread->is_exiting()) {
>>>>>      > 627           ThreadTable::add_thread(java_tid, thread);
>>>>>      > 628           return thread;
>>>>>      > 629         }
>>>>>      > 630       }
>>>>>      >
>>>>>      > [1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06
>>>>>      > [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>>      > [3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04
>>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8229391
>>>>>      >
>>>>>      >  Thank you,
>>>>>      > Daniil
>>>>>      >
>>>>>      >
>>>>>      >
>>>>>      >          >
>>>>>      >          >  On 8/4/19, 7:54 PM, "David Holmes" wrote:
>>>>>      >          >
>>>>>      >          >      Hi Daniil,
>>>>>      >          >
>>>>>      >          >      On 3/08/2019 8:16 am, Daniil Titov wrote:
>>>>>      >          >      > Hi David,
>>>>>      >          >      >
>>>>>      >          >      > Thank you for your detailed review. Please review a new version of the fix that includes
>>>>>      >          >      > the changes you suggested:
>>>>>      >          >      > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only;
>>>>>      >          >      > - ThreadTableCreate_lock is made _safepoint_check_always;
>>>>>      >          >
>>>>>      >          >      Okay.
>>>>>      >          >
>>>>>      >          >      > - ServiceThread is no longer responsible for the resizing of the thread table, instead,
>>>>>      >          >      >    the thread table is changed to grow on demand by the thread that is doing the addition;
>>>>>      >          >
>>>>>      >          >      Okay - I'm happy to get the serviceThread out of the picture here.
>>>>>      >          >
>>>>>      >          >      > - fixed nits and formatting issues.
>>>>>      >          >
>>>>>      >          >      Okay.
>>>>>      >          >
>>>>>      >          >      >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
>>>>>      >          >      >>>   as Daniel suggested.
>>>>> > > >> Not sure it's best to combine these, but if they are limited to the
>>>>> > > >> changes in management.cpp only then that may be okay.
>>>>> > > >
>>>>> > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is
>>>>> > > > limited to management.cpp (plus a new test) so I left them in the webrev, but
>>>>> > > > I also could move it in the separate issue if required.
>>>>> > >
>>>>> > >     I'd prefer this part to be separated out, but won't insist. Let's see if
>>>>> > >     Dan or Serguei have a strong opinion.
>>>>> > >
>>>>> > > > > src/hotspot/share/runtime/threadSMR.cpp
>>>>> > > > > 755     jlong tid = SharedRuntime::get_java_tid(thread);
>>>>> > > > > 926     jlong tid = SharedRuntime::get_java_tid(thread);
>>>>> > > > > I think it cleaner/better to just use
>>>>> > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>>>> > > > > as we know thread is not NULL, it is a JavaThread and it has to have a
>>>>> > > > > non-null threadObj.
>>>>> > > >
>>>>> > > > I had to leave this code unchanged since it turned out the threadObj is null
>>>>> > > > when VM is destroyed:
>>>>> > > >
>>>>> > > > V  [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67
>>>>> > > > V  [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116
>>>>> > > > V  [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82
>>>>> > > > V  [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9
>>>>> > > > V  [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c
>>>>> > > > C  [libjli.so+0x4333] JavaMain+0x2c3
>>>>> > > > C  [libjli.so+0x8159] ThreadJavaMain+0x9
>>>>> > >
>>>>> > >     This is actually nothing to do with the VM being destroyed, but is an
>>>>> > >     issue with JNI_AttachCurrentThread and its interaction with the
>>>>> > >     ThreadSMR iterators. The attach process is:
>>>>> > >     - create JavaThread
>>>>> > >     - mark as "is attaching via jni"
>>>>> > >     - add to ThreadsList
>>>>> > >     - create java.lang.Thread object (you can only execute Java code after
>>>>> > >       you are attached)
>>>>> > >     - mark as "attach completed"
>>>>> > >
>>>>> > >     So while a thread "is attaching" it will be seen by the ThreadSMR thread
>>>>> > >     iterator but will have a NULL java.lang.Thread object.
>>>>> > >
>>>>> > >     We special-case attaching threads in a number of places in the VM and I
>>>>> > >     think we should be explicitly doing something here to filter out
>>>>> > >     attaching threads, rather than just being tolerant of a NULL j.l.Thread
>>>>> > >     object. Specifically in ThreadsSMRSupport::add_thread:
>>>>> > >
>>>>> > >     if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
>>>>> > >       jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>>>> > >       ThreadTable::add_thread(tid, thread);
>>>>> > >     }
>>>>> > >
>>>>> > >     Note that in ThreadsSMRSupport::remove_thread we can use the same guard,
>>>>> > >     which covers the case the JNI attach encountered an error trying to
>>>>> > >     create the j.l.Thread object.
>>>>> > >
>>>>> > > >> src/hotspot/share/services/threadTable.cpp
>>>>> > > >> 71     static uintx get_hash(Value const& value, bool* is_dead) {
>>>>> > > >
>>>>> > > >> The is_dead parameter still bothers me here. I can't make enough sense
>>>>> > > >> out of the template code in ConcurrentHashtable to see why we have to
>>>>> > > >> have it, but I'm concerned that its very existence means we perhaps
>>>>> > > >> should not be trying to extend CHT in this context. ??
>>>>> > > >
>>>>> > > > My understanding is that is_dead parameter provides a mechanism for
>>>>> > > > ConcurrentHashtable to remove stale entries that were not explicitly
>>>>> > > > removed by calling ConcurrentHashTable::remove() method.
>>>>> > > > I think that just because in our case we don't use this mechanism doesn't
>>>>> > > > mean we should not use ConcurrentHashTable.
>>>>> > >
>>>>> > >     Can you confirm that this usage is okay with Robbin Ehn please. He's
>>>>> > >     back from vacation this week.
>>>>> > >
>>>>> > > >> I would still want to see what impact this has on thread
>>>>> > > >> startup cost, both with and without the table being initialized.
>>>>> > > >
>>>>> > > > I run a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
>>>>> > > > starts some threads as a warm-up, and then creates and starts 100,000 threads
>>>>> > > > (each thread just sleeps for 100 ms). In case when the thread table is enabled
>>>>> > > > 100,000 threads are created and started for about 15200 ms. If the thread table
>>>>> > > > is off the test takes about 14800 ms. Based on this information the enabled
>>>>> > > > thread table makes the thread startup about 2.7% slower.
>>>>> > >
>>>>> > >     That doesn't sound very good. I think we may need to get Claes involved to
>>>>> > >     help investigate overall performance impact here.
>>>>> > >
>>>>> > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
>>>>> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>> > >
>>>>> > >     No further code comments.
>>>>> > >
>>>>> > >     I didn't look at the test in detail.
>>>>> > >
>>>>> > >     Thanks,
>>>>> > >     David
>>>>> > >
>>>>> > > > Thanks!
>>>>> > > > --Daniil
>>>>> > > >
>>>>> > > >
>>>>> > > > On 7/29/19, 12:53 AM, "David Holmes" wrote:
>>>>> > > >
>>>>> > > >     Hi Daniil,
>>>>> > > >
>>>>> > > >     Overall I think this is a reasonable approach but I would still like to
>>>>> > > >     see some performance and footprint numbers, both to verify it fixes the
>>>>> > > >     problem reported, and that we are not getting penalized elsewhere.
>>>>> > > >
>>>>> > > >     On 25/07/2019 3:21 am, Daniil Titov wrote:
>>>>> > > > > Hi David, Daniel, and Serguei,
>>>>> > > > >
>>>>> > > > > Please review the new version of the fix, that makes the thread table initialization on demand and
>>>>> > > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table
>>>>> > > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock
>>>>> > > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread
>>>>> > > > > table is being initialized. Such threads will be found by the linear search and added to the thread table
>>>>> > > > > later, in ThreadsList::find_JavaThread_from_java_tid().
>>>>> > > >
>>>>> > > >     The initialization allows the created but unpopulated, or partially
>>>>> > > >     populated, table to be seen by other threads - is that your intention?
>>>>> > > >     It seems it should be okay as the other threads will then race with the
>>>>> > > >     initializing thread to add specific entries, and this is a concurrent
>>>>> > > >     map so that should be functionally correct. But if so then I think you
>>>>> > > >     can also reduce the scope of the ThreadTableCreate_lock so that it
>>>>> > > >     covers creation of the table only, not the initial population of the table.
>>>>> > > >
>>>>> > > >     I like the approach of only initializing the table when needed and using
>>>>> > > >     that to control when the add/remove-thread code needs to update the
>>>>> > > >     table. But I would still want to see what impact this has on thread
>>>>> > > >     startup cost, both with and without the table being initialized.
>>>>> > > >
>>>>> > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
>>>>> > > > > as Daniel suggested.
>>>>> > > >
>>>>> > > >     Not sure it's best to combine these, but if they are limited to the
>>>>> > > >     changes in management.cpp only then that may be okay. It helps to be
>>>>> > > >     able to focus on the table related changes without being distracted by
>>>>> > > >     other optimizations.
>>>>> > > >
>>>>> > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried
>>>>> > > > > to strip it of all the functionality that is not required in the thread table case.
>>>>> > > >
>>>>> > > >     The revised version seems better in that regard. But I still have a
>>>>> > > >     concern, see below.
>>>>> > > >
>>>>> > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid
>>>>> > > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for
>>>>> > > > > growing the thread table when required.
>>>>> > > >
>>>>> > > >     Yes but why?
>>>>> > > >     Why can't this table be grown on demand by the thread that
>>>>> > > >     is doing the addition? For other tables we may have to delegate to the
>>>>> > > >     service thread because the current thread cannot perform the action, or
>>>>> > > >     it doesn't want to perform it at the time the need for the resize is
>>>>> > > >     detected (e.g. it's detected at a safepoint and you want the resize to
>>>>> > > >     happen later outside the safepoint). It's not apparent to me that such
>>>>> > > >     restrictions apply here.
>>>>> > > >
>>>>> > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation
>>>>> > > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make
>>>>> > > > > the backporting more complicated, however, adding a new implementation of the hash table in Java 14 while it
>>>>> > > > > already has ConcurrentHashTable doesn't seem reasonable to me.
>>>>> > > >
>>>>> > > >     Ok.
>>>>> > > >
>>>>> > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03
>>>>> > > >
>>>>> > > >     Some specific code comments:
>>>>> > > >
>>>>> > > >     src/hotspot/share/runtime/mutexLocker.cpp
>>>>> > > >
>>>>> > > >     + def(ThreadTableCreate_lock, PaddedMutex, special, false, Monitor::_safepoint_check_never);
>>>>> > > >
>>>>> > > >     I think this needs to be a _safepoint_check_always lock. The table will
>>>>> > > >     be created by regular JavaThreads and they should (nearly) always be
>>>>> > > >     checking for safepoints if they are going to block acquiring the lock.
>>>>> > > >     And it isn't at all obvious that the thread doing the creation can't go
>>>>> > > >     to a safepoint whilst this lock is held.
>>>>> > > >
>>>>> > > >     ---
>>>>> > > >
>>>>> > > >     src/hotspot/share/runtime/threadSMR.cpp
>>>>> > > >
>>>>> > > >     Nit:
>>>>> > > >
>>>>> > > >       618 JavaThread* thread = thread_at(i);
>>>>> > > >
>>>>> > > >     you could reuse the new java_thread local you introduced at line 613 and
>>>>> > > >     just rename that "new" variable to "thread" so you don't have to change
>>>>> > > >     all other uses.
>>>>> > > >
>>>>> > > >     628   } else if (java_thread != NULL && ...
>>>>> > > >
>>>>> > > >     You don't need to check != NULL here as you only get here when
>>>>> > > >     java_thread is not NULL.
>>>>> > > >
>>>>> > > >       755     jlong tid = SharedRuntime::get_java_tid(thread);
>>>>> > > >       926     jlong tid = SharedRuntime::get_java_tid(thread);
>>>>> > > >
>>>>> > > >     I think it cleaner/better to just use
>>>>> > > >
>>>>> > > >     jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>>>> > > >
>>>>> > > >     as we know thread is not NULL, it is a JavaThread and it has to have a
>>>>> > > >     non-null threadObj.
>>>>> > > >
>>>>> > > >     ---
>>>>> > > >
>>>>> > > >     src/hotspot/share/services/management.cpp
>>>>> > > >
>>>>> > > >     1323         if (THREAD->is_Java_thread()) {
>>>>> > > >     1324 JavaThread* current_thread = (JavaThread*)THREAD;
>>>>> > > >
>>>>> > > >     These calls can only be made on a JavaThread so this can be simplified to
>>>>> > > >     remove the is_Java_thread() call. Similarly in other places.
>>>>> > > >
>>>>> > > >     ---
>>>>> > > >
>>>>> > > >     src/hotspot/share/services/threadTable.cpp
>>>>> > > >
>>>>> > > >        55 class ThreadTableEntry : public CHeapObj {
>>>>> > > >        56   private:
>>>>> > > >        57     jlong _tid;
>>>>> > > >
>>>>> > > >     I believe hotspot style is to not indent the access modifiers in C++
>>>>> > > >     class declarations, so the above would just be:
>>>>> > > >
>>>>> > > >        55 class ThreadTableEntry : public CHeapObj {
>>>>> > > >        56 private:
>>>>> > > >        57   jlong _tid;
>>>>> > > >
>>>>> > > >     etc.
>>>>> > > >
>>>>> > > >       60 ThreadTableEntry(jlong tid, JavaThread* java_thread) :
>>>>> > > >       61 _tid(tid),_java_thread(java_thread) {}
>>>>> > > >
>>>>> > > >     line 61 should be indented as it continues line 60.
>>>>> > > >
>>>>> > > >        67 class ThreadTableConfig : public AllStatic {
>>>>> > > >        ...
>>>>> > > >        71     static uintx get_hash(Value const& value, bool* is_dead) {
>>>>> > > >
>>>>> > > >     The is_dead parameter still bothers me here. I can't make enough sense
>>>>> > > >     out of the template code in ConcurrentHashtable to see why we have to
>>>>> > > >     have it, but I'm concerned that its very existence means we perhaps
>>>>> > > >     should not be trying to extend CHT in this context. ??
>>>>> > > >
>>>>> > > >       115   size_t start_size_log = size_log > DefaultThreadTableSizeLog
>>>>> > > >       116   ? size_log : DefaultThreadTableSizeLog;
>>>>> > > >
>>>>> > > >     line 116 should be indented, though in this case I think a better layout
>>>>> > > >     would be:
>>>>> > > >
>>>>> > > >       115   size_t start_size_log =
>>>>> > > >       116     size_log > DefaultThreadTableSizeLog ? size_log :
>>>>> > > >               DefaultThreadTableSizeLog;
>>>>> > > >
>>>>> > > >       131 double ThreadTable::get_load_factor() {
>>>>> > > >       132   return (double)_items_count/_current_size;
>>>>> > > >       133 }
>>>>> > > >
>>>>> > > >     Not sure that is doing what you want/expect. It will perform integer
>>>>> > > >     division and then cast that whole integer to a double. If you want
>>>>> > > >     double arithmetic you need:
>>>>> > > >
>>>>> > > >     return ((double)_items_count)/_current_size;
>>>>> > > >
>>>>> > > >     180     jlong _tid;
>>>>> > > >     181     uintx _hash;
>>>>> > > >
>>>>> > > >     Nit: no need for all those spaces before the variable name.
>>>>> > > >
>>>>> > > >       183 ThreadTableLookup(jlong tid)
>>>>> > > >       184     : _tid(tid), _hash(primitive_hash(tid)) {}
>>>>> > > >
>>>>> > > >     line 184 should be indented.
>>>>> > > >
>>>>> > > >     201 ThreadGet():_return(NULL) {}
>>>>> > > >
>>>>> > > >     Nit: need space after :
>>>>> > > >
>>>>> > > >       211 assert(_is_initialized, "Thread table is not initialized");
>>>>> > > >       212   _has_work = false;
>>>>> > > >
>>>>> > > >     line 211 is indented one space too far.
>>>>> > > >
>>>>> > > >     229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread);
>>>>> > > >
>>>>> > > >     Nit: need space after ,
>>>>> > > >
>>>>> > > >     252   return _local_table->remove(thread,lookup);
>>>>> > > >
>>>>> > > >     Nit: need space after ,
>>>>> > > >
>>>>> > > >     Thanks,
>>>>> > > >     David
>>>>> > > >     ------
>>>>> > > >
>>>>> > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>> > > > >
>>>>> > > > > Thanks!
>>>>> > > > > --Daniil
>>>>> > > > >
>>>>> > > > >
>>>>> > > > > On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote:
>>>>> > > > >
>>>>> > > > >     On 6/29/19 12:06 PM, Daniil Titov wrote:
>>>>> > > > > > Hi Serguei and David,
>>>>> > > > > >
>>>>> > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid.
>>>>> > > > > >
>>>>> > > > > > Please find a new version of the fix that includes the changes Serguei suggested.
>>>>> > > > > >
>>>>> > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of
>>>>> > > > > > the options could be to add ThreadTable::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table
>>>>> > > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag.
>>>>> > > > > >
>>>>> > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable::isEnabled
>>>>> > > > > > is on and if not then set it on and populate the thread table with all existing threads from the thread list.
>>>>> > > > >
>>>>> > > > >     I have the same concerns as David H.
>>>>> > > > >     about this new ThreadTable.
>>>>> > > > >     ThreadsList::find_JavaThread_from_java_tid() is only called from code
>>>>> > > > >     in src/hotspot/share/services/management.cpp so I think that table
>>>>> > > > >     needs to be enabled and populated only if it is going to be used.
>>>>> > > > >
>>>>> > > > >     I've taken a look at the webrev below and I see that David has
>>>>> > > > >     followed up with additional comments. Before I do a crawl through
>>>>> > > > >     code review for this, I would like to see the ThreadTable stuff
>>>>> > > > >     made optional and David's other comments addressed.
>>>>> > > > >
>>>>> > > > >     Another possible optimization is for callers of
>>>>> > > > >     find_JavaThread_from_java_tid() to save the calling thread's
>>>>> > > > >     tid value before they loop and if the current tid == saved_tid
>>>>> > > > >     then use the current JavaThread* instead of calling
>>>>> > > > >     find_JavaThread_from_java_tid() to get the JavaThread*.
>>>>> > > > >
>>>>> > > > >     Dan
>>>>> > > > >
>>>>> > > > > >
>>>>> > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/
>>>>> > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>> > > > > >
>>>>> > > > > > Thanks!
>>>>> > > > > > --Daniil
>>>>> > > > > >
>>>>> > > > > > From:
>>>>> > > > > > Organization: Oracle Corporation
>>>>> > > > > > Date: Friday, June 28, 2019 at 7:56 PM
>>>>> > > > > > To: Daniil Titov, OpenJDK Serviceability, "hotspot-runtime-dev at openjdk.java.net", "jmx-dev at openjdk.java.net"
>>>>> > > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
>>>>> > > > > >
>>>>> > > > > > Hi Daniil,
>>>>> > > > > >
>>>>> > > > > > I have several quick comments.
>>>>> > > > > >
>>>>> > > > > > The indent in the hotspot c/c++ files has to be 2, not 4.
>>>>> > > > > >
>>>>> > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html
>>>>> > > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
>>>>> > > > > > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
>>>>> > > > > > 616     if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) {
>>>>> > > > > > 617         // ThreadsSMRSupport::add_thread() is not called for the primordial
>>>>> > > > > > 618         // thread. Thus, we find this thread with a linear search and add it
>>>>> > > > > > 619         // to the thread table.
>>>>> > > > > > 620         for (uint i = 0; i < length(); i++) {
>>>>> > > > > > 621             JavaThread* thread = thread_at(i);
>>>>> > > > > > 622             if (is_valid_java_thread(java_tid,thread)) {
>>>>> > > > > > 623                 ThreadTable::add_thread(java_tid, thread);
>>>>> > > > > > 624                 return thread;
>>>>> > > > > > 625             }
>>>>> > > > > > 626         }
>>>>> > > > > > 627     } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
>>>>> > > > > > 628         return java_thread;
>>>>> > > > > > 629     }
>>>>> > > > > > 630     return NULL;
>>>>> > > > > > 631 }
>>>>> > > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) {
>>>>> > > > > > 633     oop tobj = java_thread->threadObj();
>>>>> > > > > > 634     // Ignore the thread if it hasn't run yet, has exited
>>>>> > > > > > 635     // or is starting to exit.
>>>>> > > > > > 636     return (tobj != NULL && !java_thread->is_exiting() &&
>>>>> > > > > > 637             java_tid == java_lang_Thread::thread_id(tobj));
>>>>> > > > > > 638 }
>>>>> > > > > >
>>>>> > > > > > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
>>>>> > > > > >
>>>>> > > > > > I'd suggest to rename find_thread() to find_thread_by_tid().
>>>>> > > > > >
>>>>> > > > > > A space is missing after the comma:
>>>>> > > > > > 622 if (is_valid_java_thread(java_tid,thread)) {
>>>>> > > > > >
>>>>> > > > > > An empty line is needed before L632.
>>>>> > > > > >
>>>>> > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me.
>>>>> > > > > > Something like 'is_alive_java_thread_with_tid()' would be better.
>>>>> > > > > > It'd be better to list parameters in the opposite order.
>>>>> > > > > >
>>>>> > > > > > The call to is_valid_java_thread() is confusing:
>>>>> > > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
>>>>> > > > > >
>>>>> > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid?
>>>>> > > > > >
>>>>> > > > > >
>>>>> > > > > > Thanks,
>>>>> > > > > > Serguei
>>>>> > > > > >
>>>>> > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote:
>>>>> > > > > >
>>>>> > > > > > Hi Daniil,
>>>>> > > > > >
>>>>> > > > > > The definition and use of this hashtable (yet another hashtable
>>>>> > > > > > implementation!) will need careful examination. We have to be concerned
>>>>> > > > > > about the cost of maintaining it when it may never even be queried. You
>>>>> > > > > > would need to look at footprint cost and performance impact.
>>>>> > > > > >
>>>>> > > > > > Unfortunately I'm just about to board a plane and will be out for the
>>>>> > > > > > next few days. I will try to look at this asap next week, but we will
>>>>> > > > > > need a lot more data on it.
>>>>> > > > > >
>>>>> > > > > > Thanks,
>>>>> > > > > > David
>>>>> > > > > >
>>>>> > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote:
>>>>> > > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the
>>>>> > > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable
>>>>> > > > > > to store the one-to-one mapping between the thread ids and JavaThread objects and replaces the linear
>>>>> > > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup
>>>>> > > > > > in the thread table.
>>>>> > > > > >
>>>>> > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed.
>>>>> > > > > >
>>>>> > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/
>>>>> > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>>>> > > > > >
>>>>> > > > > > Thanks!
>>>>> > > > > >
>>>>> > > > > > Best regards,
>>>>> > > > > > Daniil

From serguei.spitsyn at oracle.com Wed Sep 18 08:01:25 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Sep 2019 01:01:25 -0700
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <5560D680-CD20-442F-8902-7F7034B0736A@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <1D2CC008-A509-4B0B-A8C7-75C1F94545AD@oracle.com> <0105ea55-9d9c-ca09-53af-3e9863e78e95@oracle.com> <5560D680-CD20-442F-8902-7F7034B0736A@oracle.com>
Message-ID:

Hi Daniil,

On 9/17/19 17:13, Daniil Titov wrote:
> Hi Serguei,
>
> Please find below my answers to the concerns you mentioned in the previous email.
>
> 1.
>> I have a concern about the checks for thread->is_exiting().
> > - the lines 632-633 are useless as they do not really protect from returning an exiting thread
>> It is interesting what might happen if an exiting thread is returned by the
>> ThreadsList::find_JavaThread_from_java_tid().
>> Does it make sense to develop a test that would cover these cases?
> I agree, it doesn't really provide any protection, so it makes sense to just remove it.

Now, I'm not that confident about it. :)

> The current implementation of
> find_JavaThread_from_java_tid() doesn't provide such protection either, since the thread could start exiting
> immediately after find_JavaThread_from_java_tid() returns, so the assumption is that the callers of
> find_JavaThread_from_java_tid() expect to deal with such threads, and looking at some of them shows that
> they usually try to retrieve threadObj or a thread statistic object and, if it is NULL, just do nothing.

If I understand it correctly, the jt->threadObj() can remain non-NULL for some time
while jt->is_exiting() == true. It is not clear how reliable it is to use it.
But this is a pre-existing issue. It is not you who introduced it. :)
So, we can skip it for now.
But for the record, we may have a source of intermittent issues.

> I'm not sure we could cover this specific case with the test. The window between find_JavaThread_from_java_tid() returning and the caller
> continuing the execution is too small. The window between the thread starting to exit and removing itself from the thread table is very small as well.

Understood.

> 2.
>> - the lines 105-108 can result in adding exiting threads into the ThreadTable
> I agree, it was missed; we need to wrap this code inside Threads_lock in a similar way to how it is done in find_JavaThread_from_java_tid()

Okay, thanks!

> 3.
>> I would suggest to rewrite this fragment in a safe way:
>>   95     {
>>   96       MutexLocker ml(ThreadTableCreate_lock);
>>   97       if (!_is_initialized) {
>>   98         create_table(threads->length());
>>   99         _is_initialized = true;
>>  100       }
>>  101     }
>> as:
>>     {
>>       MutexLocker ml(ThreadTableCreate_lock);
>>       if (_is_initialized) {
>>         return;
>>       }
>>       create_table(threads->length());
>>       _is_initialized = true;
>>     }
>
> It was an intention to not block while populating the table with the threads from the current thread list.
> There is no need to have other threads that call find_JavaThread_from_java_tid() blocked and waiting for
> it to complete, since the requested thread might not be present in the thread list that triggers the thread table
> initialization. Plus, in case of racing initialization, it allows threads from thread lists other than the original one to be added to the table
> and thus avoids the linear scan when these threads are looked up for the first time.

I've replied to David in another email.
Let's talk once more about it tomorrow.

> 4.
>>> The case you have described is exactly the reason why we still have code inside
>>> ThreadsList::find_JavaThread_from_java_tid() that does a linear scan and adds
>>> the requested thread to the thread table if it is not there (lines 614-613 below).
>> I disagree because it is easy to avoid concurrent ThreadTable
>> initialization (please, see my separate email).
>> The reason for this code is to cover a case of late/lazy ThreadTable
>> initialization.
> David Holmes replied to this in a separate email providing a very detailed
> explanation of the possible cases and how the proposed implementation satisfies them.

Yes. Please, see above.
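
[Editorial sketch, not part of the original message.] The trade-off debated in point 3 can be illustrated with a small standalone C++ program: the table is created under a creation lock, while population happens outside that lock so racing callers are not blocked. Everything here is a hypothetical stand-in, assuming std::mutex in place of HotSpot's MutexLocker and std::unordered_map in place of ConcurrentHashTable; the name ThreadTableSketch and its members are invented for illustration and do not appear in the webrev.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the thread table discussed above; not HotSpot code.
class ThreadTableSketch {
 public:
  // Creation is serialized by create_lock_. Population runs outside that lock,
  // so a caller that raced on creation is not blocked while the table fills.
  void lazy_initialize(const std::vector<int64_t>& tids) {
    if (initialized_.load(std::memory_order_acquire)) {
      return;  // fast path: table already exists
    }
    {
      std::lock_guard<std::mutex> guard(create_lock_);
      if (!initialized_.load(std::memory_order_relaxed)) {
        creations_++;  // stands in for create_table(threads->length())
        initialized_.store(true, std::memory_order_release);
      }
    }
    // Every caller that saw the table as uninitialized adds its own list.
    for (int64_t tid : tids) {
      add(tid);
    }
  }

  void add(int64_t tid) {
    std::lock_guard<std::mutex> guard(table_lock_);
    table_.emplace(tid, tid);  // emplace leaves an existing key untouched
  }

  bool contains(int64_t tid) {
    std::lock_guard<std::mutex> guard(table_lock_);
    return table_.count(tid) != 0;
  }

  int creations() const { return creations_; }

 private:
  std::atomic<bool> initialized_{false};
  std::mutex create_lock_;   // guards creation only
  std::mutex table_lock_;    // the map itself is not concurrent
  std::unordered_map<int64_t, int64_t> table_;
  int creations_ = 0;        // written only under create_lock_
};
```

In this sketch a later caller returns on the fast path without adding anything, which is why the real code still needs the linear-search fallback in find_JavaThread_from_java_tid() for threads that were not in the initializing list.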

Thanks,
Serguei

> Best regards,
> Daniil
>
> From: "serguei.spitsyn at oracle.com"
> Date: Tuesday, September 17, 2019 at 1:53 AM
> To: Daniil Titov, Robbin Ehn, David Holmes, OpenJDK Serviceability, "hotspot-runtime-dev at openjdk.java.net", "jmx-dev at openjdk.java.net", Claes Redestad
> Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
>
> Hi Daniil,
>
> Thank you for your patience in working on this issue!
> Also, I like that the current thread related optimizations in management.cpp were factored out.
> It was a good idea to separate them.
>
> I have a concern about the checks for thread->is_exiting().
> The threads are added to and removed from the ThreadTable under protection of Threads_lock.
> However, the thread->is_exiting() checks are not protected, and so, they are racy.
>
> There is a couple of such checks to mention:
> 611 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
> 612   ThreadTable::lazy_initialize(this);
> 613   JavaThread* thread = ThreadTable::find_thread_by_tid(java_tid);
> 614   if (thread == NULL) {
> 615     // If the thread is not found in the table find it
> 616     // with a linear search and add to the table.
> 617     for (uint i = 0; i < length(); i++) {
> 618       thread = thread_at(i);
> 619       oop tobj = thread->threadObj();
> 620       // Ignore the thread if it hasn't run yet, has exited
> 621       // or is starting to exit.
> 622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
> 623         MutexLocker ml(Threads_lock);
> 624         // Must be inside the lock to ensure that we don't add the thread to the table
> 625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
> 626         if (!thread->is_exiting()) {
> 627           ThreadTable::add_thread(java_tid, thread);
> 628           return thread;
> 629         }
> 630       }
> 631     }
> 632   } else if (!thread->is_exiting()) {
> 633     return thread;
> 634   }
> 635   return NULL;
> 636 }
>   ...
>  93 void ThreadTable::lazy_initialize(const ThreadsList *threads) {
>  94   if (!_is_initialized) {
>  95     {
>  96       MutexLocker ml(ThreadTableCreate_lock);
>  97       if (!_is_initialized) {
>  98         create_table(threads->length());
>  99         _is_initialized = true;
> 100       }
> 101     }
> 102     for (uint i = 0; i < threads->length(); i++) {
> 103       JavaThread* thread = threads->thread_at(i);
> 104       oop tobj = thread->threadObj();
> 105       if (tobj != NULL && !thread->is_exiting()) {
> 106         jlong java_tid = java_lang_Thread::thread_id(tobj);
> 107         add_thread(java_tid, thread);
> 108       }
> 109     }
> 110   }
> 111 }
>
> A thread may start exiting right after the checks at the lines 626 and 105.
> So that:
>  - the lines 632-633 are useless as they do not really protect from returning an exiting thread
>  - the lines 105-108 can result in adding exiting threads into the ThreadTable
>
> Please, note, the lines 626-629 are safe in terms of addition to the ThreadTable as they
> are protected with the Threads_lock. But the returned thread still can exit after that.
> It is interesting what might happen if an exiting thread is returned by the
> ThreadsList::find_JavaThread_from_java_tid().
>
> Does it make sense to develop a test that would cover these cases?
>
> Thanks,
> Serguei
>
>
> On 9/16/19 11:18, Daniil Titov wrote:
> Hello,
>
> After investigating with Claes the impact of this change on the performance (thanks a lot Claes for helping with it!) the conclusion was that the impact on the thread startup time is not a blocker for this change.
>
> I also measured the memory footprint using Native Memory Tracking and results showed around 40 bytes per live thread.
>
> Please review a new version of the fix, webrev.06 [1].
> Just to remind, webrev.05 was abandoned and webrev.06 [1] is webrev.04 [3] minus changes in src/hotspot/share/services/management.cpp (that were factored out to a separate issue [4]) and plus a change in ThreadsList::find_JavaThread_from_java_tid() method (please, see below) that addresses the problem Robbin found and puts the code that adds a new thread to the thread table inside Threads_lock.
>
> src/hotspot/share/runtime/threadSMR.cpp
>
> 622       if (tobj != NULL && java_tid == java_lang_Thread::thread_id(tobj)) {
> 623         MutexLocker ml(Threads_lock);
> 624         // Must be inside the lock to ensure that we don't add the thread to the table
> 625         // that has just passed the removal point in ThreadsSMRSupport::remove_thread()
> 626         if (!thread->is_exiting()) {
> 627           ThreadTable::add_thread(java_tid, thread);
> 628           return thread;
> 629         }
> 630       }
>
> [1] Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.06
> [2] Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> [3] https://cr.openjdk.java.net/~dtitov/8185005/webrev.04
> [4] https://bugs.openjdk.java.net/browse/JDK-8229391
>
> Thank you,
> Daniil
>
>
> On 8/4/19, 7:54 PM, "David Holmes" <david.holmes at oracle.com> wrote:
>
> > Hi Daniil,
> >
> > On 3/08/2019 8:16 am, Daniil Titov wrote:
> > > Hi David,
> > >
> > > Thank you for your detailed review. Please review a new version of the fix that includes
> > > the changes you suggested:
> > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only;
> > > - ThreadTableCreate_lock is made _safepoint_check_always;
> >
> > Okay.
> >
> > > - ServiceThread is no longer responsible for the resizing of the thread table, instead,
> > >   the thread table is changed to grow on demand by the thread that is doing the addition;
> >
> > Okay - I'm happy to get the serviceThread out of the picture here.
> >
> > > - fixed nits and formatting issues.
> >
> > Okay.
> >
> > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
> > >>> as Daniel suggested.
> > >> Not sure it's best to combine these, but if they are limited to the
> > >> changes in management.cpp only then that may be okay.
> > >
> > > The additional optimization for some callers of find_JavaThread_from_java_tid() is
> > > limited to management.cpp (plus a new test) so I left them in the webrev but
> > > I also could move it in the separate issue if required.
> >
> > I'd prefer this part to be separated out, but won't insist. Let's see if
> > Dan or Serguei have a strong opinion.
> >
> > > > src/hotspot/share/runtime/threadSMR.cpp
> > > > 755     jlong tid = SharedRuntime::get_java_tid(thread);
> > > > 926     jlong tid = SharedRuntime::get_java_tid(thread);
> > > > I think it cleaner/better to just use
> > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj());
> > > > as we know thread is not NULL, it is a JavaThread and it has to have a
> > > > non-null threadObj.
> > >
> > > I had to leave this code unchanged since it turned out the threadObj is null
> > > when VM is destroyed:
> > >
> > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67
> > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116
> > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82
> > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9
> > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c
> > > C [libjli.so+0x4333] JavaMain+0x2c3
> > > C [libjli.so+0x8159] ThreadJavaMain+0x9
> >
> > This is actually nothing to do with the VM being destroyed, but is an
> > issue with JNI_AttachCurrentThread and its interaction with the
> > ThreadSMR iterators. The attach process is:
> > - create JavaThread
> > - mark as "is attaching via jni"
> > - add to ThreadsList
> > - create java.lang.Thread object (you can only execute Java code after
> >   you are attached)
> > - mark as "attach completed"
> >
> > So while a thread "is attaching" it will be seen by the ThreadSMR thread
> > iterator but will have a NULL java.lang.Thread object.
> >
> > We special-case attaching threads in a number of places in the VM and I
> > think we should be explicitly doing something here to filter out
> > attaching threads, rather than just being tolerant of a NULL j.l.Thread
> > object. Specifically in ThreadsSMRSupport::add_thread:
> >
> >   if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
> >     jlong tid = java_lang_Thread::thread_id(thread->threadObj());
> >     ThreadTable::add_thread(tid, thread);
> >   }
> >
> > Note that in ThreadsSMRSupport::remove_thread we can use the same guard,
> > which covers the case the JNI attach encountered an error trying to
> > create the j.l.Thread object.
> >
> > >> src/hotspot/share/services/threadTable.cpp
> > >> 71     static uintx get_hash(Value const& value, bool* is_dead) {
> > >
> > >> The is_dead parameter still bothers me here. I can't make enough sense
> > >> out of the template code in ConcurrentHashtable to see why we have to
> > >> have it, but I'm concerned that its very existence means we perhaps
> > >> should not be trying to extend CHT in this context. ??
> > >
> > > My understanding is that is_dead parameter provides a mechanism for
> > > ConcurrentHashtable to remove stale entries that were not explicitly
> > > removed by calling ConcurrentHashTable::remove() method.
> > > I think that just because in our case we don't use this mechanism doesn't
> > > mean we should not use ConcurrentHashTable.
> >
> > Can you confirm that this usage is okay with Robbin Ehn please. He's
> > back from vacation this week.
> >
> > >> I would still want to see what impact this has on thread
> > >> startup cost, both with and without the table being initialized.
> > >
> > > I run a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
> > > starts some threads as a warm-up, and then creates and starts 100,000 threads
> > > (each thread just sleeps for 100 ms). In case when the thread table is enabled
> > > 100,000 threads are created and started for about 15200 ms. If the thread table
> > > is off the test takes about 14800 ms. Based on this information the enabled
> > > thread table makes the thread startup about 2.7% slower.
> >
> > That doesn't sound very good. I think we may need to get Claes involved to
> > help investigate overall performance impact here.
> >
> > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> >
> > No further code comments.
> >
> > I didn't look at the test in detail.
> >
> > Thanks,
> > David
> >
> > > Thanks!
> > > --Daniil
> > >
> > >
> > > On 7/29/19, 12:53 AM, "David Holmes" <david.holmes at oracle.com> wrote:
> > >
> > > Hi Daniil,
> > >
> > > Overall I think this is a reasonable approach but I would still like to
> > > see some performance and footprint numbers, both to verify it fixes the
> > > problem reported, and that we are not getting penalized elsewhere.
> > >
> > > On 25/07/2019 3:21 am, Daniil Titov wrote:
> > > > Hi David, Daniel, and Serguei,
> > > >
> > > > Please review the new version of the fix, that makes the thread table initialization on demand and
> > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table
> > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock
> > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread
> > > > table is being initialized. Such threads will be found by the linear search and added to the thread table
> > > > later, in ThreadsList::find_JavaThread_from_java_tid().
> > >
> > > The initialization allows the created but unpopulated, or partially
> > > populated, table to be seen by other threads - is that your intention?
> > > It seems it should be okay as the other threads will then race with the
> > > initializing thread to add specific entries, and this is a concurrent
> > > map so that should be functionally correct. But if so then I think you
> > > can also reduce the scope of the ThreadTableCreate_lock so that it
> > > covers creation of the table only, not the initial population of the table.
> > >
> > > I like the approach of only initializing the table when needed and using
> > > that to control when the add/remove-thread code needs to update the
> > > table. But I would still want to see what impact this has on thread
> > > startup cost, both with and without the table being initialized.
> > >
> > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
> > > > as Daniel suggested.
> > >
> > > Not sure it's best to combine these, but if they are limited to the
> > > changes in management.cpp only then that may be okay. It helps to be
> > > able to focus on the table related changes without being distracted by
> > > other optimizations.
> > >
> > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried
> > > > to strip it of the all functionality that is not required in the thread table case.
> > >
> > > The revised version seems better in that regard. But I still have a
> > > concern, see below.
> > >
> > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid
> > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for
> > > > growing the thread table when required.
> > >
> > > Yes but why? Why can't this table be grown on demand by the thread that
> > > is doing the addition? For other tables we may have to delegate to the
> > > service thread because the current thread cannot perform the action, or
> > > it doesn't want to perform it at the time the need for the resize is
> > > detected (e.g. its detected at a safepoint and you want the resize to
> > > happen later outside the safepoint). It's not apparent to me that such
> > > restrictions apply here.
> > >
> > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation
> > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make
> > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it
> > > > already has ConcurrentHashTable doesn't seem reasonable for me.
> > >
> > > Ok.
> > >
> > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03
> > >
> > > Some specific code comments:
> > >
> > > src/hotspot/share/runtime/mutexLocker.cpp
> > >
> > > +   def(ThreadTableCreate_lock , PaddedMutex , special,
> > > false, Monitor::_safepoint_check_never);
> > >
> > > I think this needs to be a _safepoint_check_always lock. The table will
> > > be created by regular JavaThreads and they should (nearly) always be
> > > checking for safepoints if they are going to block acquiring the lock.
> > > And it isn't at all obvious that the thread doing the creation can't go
> > > to a safepoint whilst this lock is held.
> > >
> > > ---
> > >
> > > src/hotspot/share/runtime/threadSMR.cpp
> > >
> > > Nit:
> > >
> > > 618       JavaThread* thread = thread_at(i);
> > >
> > > you could reuse the new java_thread local you introduced at line 613 and
> > > just rename that "new" variable to "thread" so you don't have to change
> > > all other uses.
> > >
> > > 628   } else if (java_thread != NULL && ...
> > >
> > > You don't need to check != NULL here as you only get here when
> > > java_thread is not NULL.
> > >
> > > 755     jlong tid = SharedRuntime::get_java_tid(thread);
> > > 926     jlong tid = SharedRuntime::get_java_tid(thread);
> > >
> > > I think it cleaner/better to just use
> > >
> > > jlong tid = java_lang_Thread::thread_id(thread->threadObj());
> > >
> > > as we know thread is not NULL, it is a JavaThread and it has to have a
> > > non-null threadObj.
> > >
> > > ---
> > >
> > > src/hotspot/share/services/management.cpp
> > >
> > > 1323     if (THREAD->is_Java_thread()) {
> > > 1324       JavaThread* current_thread = (JavaThread*)THREAD;
> > >
> > > These calls can only be made on a JavaThread so this be simplified to
> > > remove the is_Java_thread() call. Similarly in other places.
> > >
> > > ---
> > >
> > > src/hotspot/share/services/threadTable.cpp
> > >
> > > 55 class ThreadTableEntry : public CHeapObj {
> > > 56   private:
> > > 57     jlong _tid;
> > >
> > > I believe hotspot style is to not indent the access modifiers in C++
> > > class declarations, so the above would just be:
> > >
> > > 55 class ThreadTableEntry : public CHeapObj {
> > > 56 private:
> > > 57   jlong _tid;
> > >
> > > etc.
> > >
> > > 60   ThreadTableEntry(jlong tid, JavaThread* java_thread) :
> > > 61   _tid(tid),_java_thread(java_thread) {}
> > >
> > > line 61 should be indented as it continues line 60.
> > >
> > > 67 class ThreadTableConfig : public AllStatic {
> > > ...
> > > 71     static uintx get_hash(Value const& value, bool* is_dead) {
> > >
> > > The is_dead parameter still bothers me here. I can't make enough sense
> > > out of the template code in ConcurrentHashtable to see why we have to
> > > have it, but I'm concerned that its very existence means we perhaps
> > > should not be trying to extend CHT in this context. ??
> > >
> > > 115   size_t start_size_log = size_log > DefaultThreadTableSizeLog
> > > 116   ? size_log : DefaultThreadTableSizeLog;
> > >
> > > line 116 should be indented, though in this case I think a better layout
> > > would be:
> > >
> > > 115   size_t start_size_log =
> > > 116     size_log > DefaultThreadTableSizeLog ? size_log :
> > > DefaultThreadTableSizeLog;
> > >
> > > 131 double ThreadTable::get_load_factor() {
> > > 132   return (double)_items_count/_current_size;
> > > 133 }
> > >
> > > Not sure that is doing what you want/expect. It will perform integer
> > > division and then cast that whole integer to a double. If you want
> > > double arithmetic you need:
> > >
> > > return ((double)_items_count)/_current_size;
> > >
> > > 180   jlong           _tid;
> > > 181   uintx           _hash;
> > >
> > > Nit: no need for all those spaces before the variable name.
> > >
> > > 183   ThreadTableLookup(jlong tid)
> > > 184   : _tid(tid), _hash(primitive_hash(tid)) {}
> > >
> > > line 184 should be indented.
> > >
> > > 201   ThreadGet():_return(NULL) {}
> > >
> > > Nit: need space after :
> > >
> > > 211    assert(_is_initialized, "Thread table is not initialized");
> > > 212   _has_work = false;
> > >
> > > line 211 is indented one space too far.
> > >
> > > 229   ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread);
> > >
> > > Nit: need space after ,
> > >
> > > 252   return _local_table->remove(thread,lookup);
> > >
> > > Nit: need space after ,
> > >
> > > Thanks,
> > > David
> > > ------
> > >
> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > > >
> > > > Thanks!
> > > > --Daniil
> > > >
> > > >
> > > > On 7/8/19, 3:24 PM, "Daniel D. Daugherty" <daniel.daugherty at oracle.com> wrote:
> > > >
> > > > On 6/29/19 12:06 PM, Daniil Titov wrote:
> > > > > Hi Serguei and David,
> > > > >
> > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid.
> > > > >
> > > > > Please find a new version of the fix that includes the changes Serguei suggested.
> > > > >
> > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of
> > > > > the options could be to add ThreadTable::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table
> > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag.
> > > > >
> > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable::isEnabled
> > > > > is on and if not then set it on and populate the thread table with all existing threads from the thread list.
> > > >
> > > > I have the same concerns as David H. about this new ThreadTable.
> > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code
> > > > in src/hotspot/share/services/management.cpp so I think that table
> > > > needs to be enabled and populated only if it is going to be used.
> > > >
> > > > I've taken a look at the webrev below and I see that David has
> > > > followed up with additional comments. Before I do a crawl through
> > > > code review for this, I would like to see the ThreadTable stuff
> > > > made optional and David's other comments addressed.
> > > >
> > > > Another possible optimization is for callers of
> > > > find_JavaThread_from_java_tid() to save the calling thread's
> > > > tid value before they loop and if the current tid == saved_tid
> > > > then use the current JavaThread* instead of calling
> > > > find_JavaThread_from_java_tid() to get the JavaThread*.
> > > >
> > > > Dan
> > > >
> > > > >
> > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/
> > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > > > >
> > > > > Thanks!
> > > > > --Daniil
> > > > >
> > > > > From: <serguei.spitsyn at oracle.com>
> > > > > Organization: Oracle Corporation
> > > > > Date: Friday, June 28, 2019 at 7:56 PM
> > > > > To: Daniil Titov <daniil.x.titov at oracle.com>, OpenJDK Serviceability <serviceability-dev at openjdk.java.net>, <hotspot-runtime-dev at openjdk.java.net>, <jmx-dev at openjdk.java.net>
> > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
> > > > >
> > > > > Hi Daniil,
> > > > >
> > > > > I have several quick comments.
> > > > >
> > > > > The indent in the hotspot c/c++ files has to be 2, not 4.
> > > > >
> > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html
> > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
> > > > > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
> > > > > 616     if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) {
> > > > > 617         // ThreadsSMRSupport::add_thread() is not called for the primordial
> > > > > 618         // thread. Thus, we find this thread with a linear search and add it
> > > > > 619         // to the thread table.
> > > > > 620         for (uint i = 0; i < length(); i++) {
> > > > > 621             JavaThread* thread = thread_at(i);
> > > > > 622             if (is_valid_java_thread(java_tid,thread)) {
> > > > > 623                 ThreadTable::add_thread(java_tid, thread);
> > > > > 624                 return thread;
> > > > > 625             }
> > > > > 626         }
> > > > > 627     } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
> > > > > 628         return java_thread;
> > > > > 629     }
> > > > > 630     return NULL;
> > > > > 631 }
> > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) {
> > > > > 633     oop tobj = java_thread->threadObj();
> > > > > 634     // Ignore the thread if it hasn't run yet, has exited
> > > > > 635     // or is starting to exit.
> > > > > 636     return (tobj != NULL && !java_thread->is_exiting() &&
> > > > > 637             java_tid == java_lang_Thread::thread_id(tobj));
> > > > > 638 }
> > > > >
> > > > > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
> > > > >
> > > > > I'd suggest to rename find_thread() to find_thread_by_tid().
> > > > >
> > > > > A space is missed after the comma:
> > > > > 622             if (is_valid_java_thread(java_tid,thread)) {
> > > > >
> > > > > An empty line is needed before L632.
> > > > >
> > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me.
> > > > > Something like 'is_alive_java_thread_with_tid()' would be better.
> > > > > It'd better to list parameters in the opposite order.
> > > > >
> > > > > The call to is_valid_java_thread() is confusing:
> > > > > 627     } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
> > > > >
> > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid?
> > > > >
> > > > >
> > > > > Thanks,
> > > > > Serguei
> > > > >
> > > > > On 6/28/19, 9:40 PM, "David Holmes" <david.holmes at oracle.com> wrote:
> > > > >
> > > > > Hi Daniil,
> > > > >
> > > > > The definition and use of this hashtable (yet another hashtable
> > > > > implementation!) will need careful examination. We have to be concerned
> > > > > about the cost of maintaining it when it may never even be queried. You
> > > > > would need to look at footprint cost and performance impact.
> > > > >
> > > > > Unfortunately I'm just about to board a plane and will be out for the
> > > > > next few days. I will try to look at this asap next week, but we will
> > > > > need a lot more data on it.
> > > > >
> > > > > Thanks,
> > > > > David
> > > > >
> > > > > On 6/28/19 3:31 PM, Daniil Titov wrote:
> > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the
> > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable
> > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear
> > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup
> > > > > in the thread table.
> > > > >
> > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed.
> > > > >
> > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/
> > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Best regards,
> > > > > Daniil

From daniel.daugherty at oracle.com Wed Sep 18 13:32:09 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Sep 2019 09:32:09 -0400
Subject: RFR (XXXXS): 8231162: JVMTI RawMonitorWait triggers assertion failure: Only JavaThreads can be interruptible
In-Reply-To:
References:
Message-ID: <3ffac0c6-9dd6-41db-4993-1e9c3435768c@oracle.com>

Thumbs up! This is a trivial change and only needs a single (R)eviewer.

Did you rerun the failing tests to make sure this is the only issue?

Dan

P.S.
I've done the "test a stack of patches" together and have something break
when you push just one of the patches... just recently in fact. :-(
At least yours didn't happen until Tier4... :-)

On 9/18/19 2:26 AM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231162
> webrev: http://cr.openjdk.java.net/~dholmes/8231162/webrev/
>
> -      r = rmonitor->raw_wait(millis, true, thread);
> +      r = rmonitor->raw_wait(millis, false, thread);
>
> Non-JavaThreads are not interruptible and so "true" should not have
> been being passed. This tripped over the assertions added as part of
> the movement of the interrupt code to JavaThread under JDK-8230424.
>
> Dan: FYI I overlooked this because I already rewrote all this
> RawMonitor logic under "8229160: Reimplement JvmtiRawMonitor to use
> PlatformMonitor" to do the right thing, but of course that hasn't been
> pushed yet. And this isn't detected until tier 4 testing.
>
> Thanks,
> David

From daniel.daugherty at oracle.com Wed Sep 18 13:36:00 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Sep 2019 09:36:00 -0400
Subject: RFR (XXXXS): 8231162: JVMTI RawMonitorWait triggers assertion failure: Only JavaThreads can be interruptible
In-Reply-To: <3ffac0c6-9dd6-41db-4993-1e9c3435768c@oracle.com>
References: <3ffac0c6-9dd6-41db-4993-1e9c3435768c@oracle.com>
Message-ID: <7a6609f7-9349-31d7-5419-a34577adf560@oracle.com>

Forgot to say: I'm really happy you put in that new assert:

  if (interruptible) {
    assert(THREAD->is_Java_thread(), "Only JavaThreads can be interruptible");

Dan

On 9/18/19 9:32 AM, Daniel D. Daugherty wrote:
> Thumbs up! This is a trivial change and only needs a single (R)eviewer.
>
> Did you rerun the failing tests to make sure this is the only issue?
>
> Dan
>
> P.S.
> I've done the "test a stack of patches" together and have something break
> when you push just one of the patches... just recently in fact. :-(
> At least yours didn't happen until Tier4... :-)
>
> On 9/18/19 2:26 AM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231162
>> webrev: http://cr.openjdk.java.net/~dholmes/8231162/webrev/
>>
>> -      r = rmonitor->raw_wait(millis, true, thread);
>> +      r = rmonitor->raw_wait(millis, false, thread);
>>
>> Non-JavaThreads are not interruptible and so "true" should not have
>> been being passed. This tripped over the assertions added as part of
>> the movement of the interrupt code to JavaThread under JDK-8230424.
>>
>> Dan: FYI I overlooked this because I already rewrote all this
>> RawMonitor logic under "8229160: Reimplement JvmtiRawMonitor to use
>> PlatformMonitor" to do the right thing, but of course that hasn't
>> been pushed yet. And this isn't detected until tier 4 testing.
>> >> Thanks, >> David > From hohensee at amazon.com Wed Sep 18 15:03:07 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 18 Sep 2019 15:03:07 +0000 Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread In-Reply-To: <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com> References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com> <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com> <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com> Message-ID: <7D09AF54-54DC-4064-83F9-29EA671C3484@amazon.com> Thanks David (and Mandy and Serguei). Pushed. ?On 9/17/19, 3:51 PM, "David Holmes" wrote: On 18/09/2019 12:10 am, Hohensee, Paul wrote: > Thanks, Serguei. :) > > David, are you ok with the patch? Yep, nothing further from me. David > Paul > > *From: *"serguei.spitsyn at oracle.com" > *Date: *Tuesday, September 17, 2019 at 2:26 AM > *To: *"Hohensee, Paul" , David Holmes > , Mandy Chung > *Cc: *OpenJDK Serviceability , > "hotspot-gc-dev at openjdk.java.net" > *Subject: *Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() > can be quicker for self thread > > Hi Paul, > > Thank you for refactoring and fixing the test. > It looks great now! > > Thanks, > Serguei > > > On 9/15/19 02:52, Hohensee, Paul wrote: > > Hi, Serguei, thanks for the review. New webrev at > > http://cr.openjdk.java.net/~phh/8207266/webrev.09/ > > I refactored the test?s main() method, and you?re correct, > getThreadAllocatedBytes should be getCurrentThreadAllocatedBytes in > that context: fixed. 
> > Paul > > *From: *"serguei.spitsyn at oracle.com" > > > *Organization: *Oracle Corporation > *Date: *Friday, September 13, 2019 at 5:50 PM > *To: *"Hohensee, Paul" > , David Holmes > , Mandy Chung > > *Cc: *OpenJDK Serviceability > , > "hotspot-gc-dev at openjdk.java.net" > > > > *Subject: *Re: RFR (M): 8207266: > ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread > > Hi Paul, > > It looks pretty good in general. > > http://cr.openjdk.java.net/~phh/8207266/webrev.08/test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java.frames.html > > It would be nice to refactor the java main() method as it becomes > too big. > Two ways ofgetCurrentThreadAllocatedBytes() testing are good candidates > to become separate methods. > > 98 long size1 = mbean.getThreadAllocatedBytes(id); > > Just wanted to double check if you wanted to invoke > the getCurrentThreadAllocatedBytes() instead as it is > a part of: > > 85 // First way, getCurrentThreadAllocatedBytes > > > Thanks, > Serguei > > On 9/13/19 12:11 PM, Hohensee, Paul wrote: > > Hi David, thanks for your comments. New webrev in > > > > http://cr.openjdk.java.net/~phh/8207266/webrev.08/ > > > > Both the old and new versions of the code check that thread allocated memory is both supported and enabled. The existing version of getThreadAllocatedBytes(long []) calls verifyThreadAllocatedMemory(long []), which checks inline to make sure thread allocated memory is supported, then calls isThreadAllocatedMemoryEnabled() to verify that it's enabled. isThreadAllocatedMemoryEnabled() duplicates (!) the support check and returns the enabled flag. I removed the redundant check in the new version. > > > > You're of course correct about the back-to-back check. Application code can't know when the runtime will hijack a thread for its own purposes. I've removed the check. 
> > > > Paul > > > > On 9/13/19, 12:50 AM, "David Holmes" wrote: > > > > Hi Paul, > > > > On 13/09/2019 10:29 am, Hohensee, Paul wrote: > > > Thanks for clarifying the review rules. Would someone from the > > > serviceability team please review? New webrev at > > > > > >http://cr.openjdk.java.net/~phh/8207266/webrev.07/ > > > > One aspect of the functional change needs clarification for me - and > > apologies if this has been covered in the past. It seems to me that > > currently we only check isThreadAllocatedMemorySupported for these > > operations, but if I read things correctly the updated code additionally > > checks isThreadAllocatedMemoryEnabled, which is a behaviour change not > > mentioned in the CSR. > > > > > I didn?t disturb the existing checks in the test, just added code to > > > check the result of getThreadAllocatedBytes(long) on a non-current > > > thread, plus the back-to-back no-allocation checks. The former wasn?t > > > needed before because getThreadAllocatedBytes(long) was just a wrapper > > > around getThreadAllocatedBytes(long []). This patch changes that, so I > > > added a separate test. The latter is supposed to fail if there?s object > > > allocation on calls to getCurrentThreadAllocatedBytes and > > > getThreadAllocatedBytes(long). I.e., a feature, not a bug, because > > > accumulation of transient small objects can be a performance problem. > > > Thanks to your review, I noticed that the back-to-back check on the > > > current thread was using getThreadAllocatedBytes(long) instead of > > > getCurrentThreadAllocatedBytes and fixed it. I also removed all > > > instances of ?TEST FAILED: ?. > > > > The back-to-back check is not valid in general. You don't know if the > > first check might trigger some class loading on the return path after it > > has obtained the first memory value. The check might also fail if using > > JVMCI and some compilation related activity occurs in the current thread > > on the second call. 
Also with the introduction of handshakes its > > possible the current thread might hit a safepoint checks that results in > > it executing a handshake operation that performs allocation. Potentially > > there could be numerous non-deterministic actions that might occur > > leading to unanticipated allocation. > > > > I understand what you want to test here, I just don't think it is > > reliably doable. > > > > Thanks, > > David > > ----- > > > > > > > > Paul > > > > > > *From: *Mandy Chung > > > *Date: *Thursday, September 12, 2019 at 10:09 AM > > > *To: *"Hohensee, Paul" > > > *Cc: *OpenJDK Serviceability , > > >"hotspot-gc-dev at openjdk.java.net" > > > *Subject: *Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() > > > can be quicker for self thread > > > > > > On 9/3/19 12:38 PM, Hohensee, Paul wrote: > > > > > > Minor update in new webrevhttp://cr.openjdk.java.net/~phh/8207266/webrev.05/. > > > > > > > > > I only reviewed the library side implementation that looks good. I > > > expect the serviceability team to review the test and hotspot change. > > > > > > > > > Need a confirmatory review to push this. If I understand the rules correctly, it doesn't need a Reviewer review since Mandy's already reviewed it, it just needs a Committer review. > > > > > > > > > You need another reviewer to advice the following because I was not > > > close to the ThreadsList work. > > > > > > 2087 ThreadsListHandle tlh; > > > > > > 2088 JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id); > > > > > > 2089 > > > > > > 2090 if (java_thread != NULL) { > > > > > > 2091 return java_thread->cooked_allocated_bytes(); > > > > > > 2092 } > > > > > > This looks right to me. 
> > > > > > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java > > > > > > - "ThreadAllocatedMemory is expected to be disabled"); > > > > > > + "TEST FAILED: ThreadAllocatedMemory is expected to be > > > disabled"); > > > > > > Prepending "TEST FAILED" in exception message (in several places) > > > > > > seems redundant since such RuntimeException is thrown and expected > > > > > > a test failure. > > > > > > + // back-to-back calls shouldn't allocate any memory > > > > > > + size = mbean.getThreadAllocatedBytes(id); > > > > > > + size1 = mbean.getThreadAllocatedBytes(id); > > > > > > + if (size1 != size) { > > > > > > Is there anything in the test can do to help guarantee this? I didn't > > > > > > closely review this test. The main thing I advice is to improve > > > > > > the reliability of this test. Put it in another way, we want to > > > > > > ensure that this test change will pass all the time in various > > > > > > test configuration. > > > > > > Mandy > > > > > > > > > > From hohensee at amazon.com Wed Sep 18 17:56:28 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 18 Sep 2019 17:56:28 +0000 Subject: RFA Backport CSR to 11u: 8231194: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread Message-ID: <4A7D9E4F-0D48-4C31-B092-55E03A45E0D1@amazon.com> Please review/approve an 11u CSR backport request. https://bugs.openjdk.java.net/browse/JDK-8231194 Original issue: https://bugs.openjdk.java.net/browse/JDK-8207266 Original CSR: https://bugs.openjdk.java.net/browse/JDK-8230311 11u backport issue: https://bugs.openjdk.java.net/browse/JDK-8231193 Email threads: https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-August/029011.html https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029033.html Thanks, Paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From daniel.daugherty at oracle.com Wed Sep 18 20:26:15 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Sep 2019 16:26:15 -0400
Subject: Bytecode Instrumentation and Class Loading.
In-Reply-To:
References:
Message-ID: <02f974fc-37cb-ca65-0596-30b2a1fd3feb@oracle.com>

Forwarding this to the serviceability-dev at ... alias and Bcc'ing
jdk-dev at ..., which is a really broad alias...

Dan

On 9/18/19 2:47 PM, Sam Thomas wrote:
> Hi,
>
> I'm trying to understand if a class will load as soon as all the
> transformers return. The aim is to get a class reference of a class I have
> seen in my transformer.
> I'm currently using Class.forName().
>
> Also, is there a way to get the same without triggering class loading,
> i.e. if the class is not loaded, return a null reference? I ask this
> because there is a scenario where there are two agents on JBoss where one
> asks for the class reference using Class.forName() and the other went and
> performed a redefineClass() on it, and I end up getting a
> "Failed to define class".
>
> Thanks
> ./Sam

From chris.plummer at oracle.com Wed Sep 18 21:11:17 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Sep 2019 14:11:17 -0700
Subject: RFR(S): 8228625: [TESTBUG] sun/tools/jhsdb/JShellHeapDumpTest.java fails with RuntimeException 'JShellToolProvider' missing from stdout/stderr
Message-ID:

Hello,

Please review the following changes:

http://cr.openjdk.java.net/~cjplummer/8228625/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8228625

There are actually numerous ways that JShellHeapDumpTest.java fails. One
is a test bug, being addressed here, and the rest all seem to be SA
bugs. Those are now being covered by JDK-8230872. All the issues seem to
stem from the fact that the test spawns a jshell process, and then
immediately does a "jhsdb jmap" on the process before jshell has fully
started up.

The test bug happens when the jmap succeeds, but jshell has not yet
entered the main java thread. Thus the search for "JShellToolProvider"
in the output fails. The test expects "JShellToolProvider" to be in the
output because it is part of a method name in the main thread, and the
test dumps all the thread stacks contained in the jmap-generated hprof
file. When the test fails in this way, you can see the stack dump in the
output, but the main thread is missing.

There are a couple of ways to fix this. One is to just add a delay (10s
seems to be more than enough), and the other is to retry the "jhsdb
jmap" command until the stack contains the JShellToolProvider symbol. I
chose the latter because a 10s delay masks the SA issues that are
now covered by JDK-8230872. In a way the 10s delay is a better fix,
because it makes this test pass every time, but I did not like that it
also hid real SA problems in JDK-8230872. My plan for now is to do this
retry fix, and then if there are too many failures due to JDK-8230872,
also add a 10s delay, with the intention of removing it once
JDK-8230872 is fixed. From what I can see, JDK-8230872 failures happen
on about 1% of the runs.

I made a few other changes. One was to no longer redirect stderr from
the jmap process, as was done by the following:

    processBuilder.redirectError(ProcessBuilder.Redirect.INHERIT);

This causes the output not to appear in the OutputAnalyzer output,
resulting in the following not working:

    output.shouldNotContain("null");

Also I added code to dump the output of the jshell process so you can
see if the jshell prompt was ever generated.
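
For illustration only, a minimal sketch of the retry idea described above. This is not the code from the webrev: the class name RetryUntilFound, the method retryUntil, and the fake dump supplier are all hypothetical, and in the real test the supplier would stand for running "jhsdb jmap" and dumping the thread stacks from the resulting hprof file.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper; not part of jdk.test.lib or of the webrev. */
public class RetryUntilFound {

    /**
     * Runs the command up to maxAttempts times and returns the first output
     * accepted by the predicate, sleeping pauseMillis between attempts.
     * Throws RuntimeException if no attempt produces acceptable output.
     */
    public static String retryUntil(Supplier<String> command,
                                    Predicate<String> accepted,
                                    int maxAttempts,
                                    long pauseMillis) {
        String last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            last = command.get();
            if (accepted.test(last)) {
                return last;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt status and give up.
                    Thread.currentThread().interrupt();
                    throw new RuntimeException("interrupted while retrying", e);
                }
            }
        }
        throw new RuntimeException("expected text missing after "
                + maxAttempts + " attempts; last output: " + last);
    }

    public static void main(String[] args) {
        // Stand-in for the jmap + stack dump step: the first two "dumps"
        // are taken too early and lack the main thread.
        int[] calls = {0};
        Supplier<String> fakeDump = () -> ++calls[0] < 3
                ? "Thread-0 stack only"
                : "... JShellToolProvider.main ...";
        retryUntil(fakeDump, s -> s.contains("JShellToolProvider"), 5, 10);
        System.out.println("found after " + calls[0] + " attempts");
    }
}
```

Unlike a fixed 10s sleep, a loop of this shape only waits as long as needed, and a failure of the command itself (as opposed to a missing symbol) can still surface on any attempt.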

thanks,

Chris

From david.holmes at oracle.com Wed Sep 18 21:31:06 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 19 Sep 2019 07:31:06 +1000
Subject: RFR (XXXXS): 8231162: JVMTI RawMonitorWait triggers assertion failure: Only JavaThreads can be interruptible
In-Reply-To: <3ffac0c6-9dd6-41db-4993-1e9c3435768c@oracle.com>
References: <3ffac0c6-9dd6-41db-4993-1e9c3435768c@oracle.com>
Message-ID: <5094af97-6f27-ae7e-9953-fe0abb02fd56@oracle.com>

Hi Dan,

On 18/09/2019 11:32 pm, Daniel D. Daugherty wrote:
> Thumbs up! This is a trivial change and only needs a single (R)eviewer.

Thanks for the review. I missed you by 10 minutes last night :(

> Did you rerun the failing tests to make sure this is the only issue?

Yes all the failing tests pass now.

Thanks,
David

> Dan
>
> P.S.
> I've done the "test a stack of patches" together and have something break
> when you push just one of the patches... just recently in fact. :-(
> At least yours didn't happen until Tier4... :-)
>
> On 9/18/19 2:26 AM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231162
>> webrev: http://cr.openjdk.java.net/~dholmes/8231162/webrev/
>>
>> -      r = rmonitor->raw_wait(millis, true, thread);
>> +      r = rmonitor->raw_wait(millis, false, thread);
>>
>> Non-JavaThreads are not interruptible and so "true" should not have
>> been being passed. This tripped over the assertions added as part of
>> the movement of the interrupt code to JavaThread under JDK-8230424.
>>
>> Dan: FYI I overlooked this because I already rewrote all this
>> RawMonitor logic under "8229160: Reimplement JvmtiRawMonitor to use
>> PlatformMonitor" to do the right thing, but of course that hasn't been
>> pushed yet. And this isn't detected until tier 4 testing.
>> >> Thanks, >> David > From alexey.menkov at oracle.com Wed Sep 18 22:01:25 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 18 Sep 2019 15:01:25 -0700 Subject: RFR(S): 8228625: [TESTBUG] sun/tools/jhsdb/JShellHeapDumpTest.java fails with RuntimeException 'JShellToolProvider' missing from stdout/stderr In-Reply-To: References: Message-ID: <4ea2ef0c-0530-41bf-683c-ca57def47beb@oracle.com> Hi Chris, Did you think about waiting for jshell prompt ("jshell>") before run "jhsdb jmap" command instead of delay or re-tries? --alex On 09/18/2019 14:11, Chris Plummer wrote: > Hello, > > Please review the following changes: > > http://cr.openjdk.java.net/~cjplummer/8228625/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8228625 > > There are actually numerous ways that JShellHeapDumpTest.java fails. One > is a test bug, being addressed here, and the rest all seem to be SA > bugs. Those are now being covered by JDK-8230872. All the issues seem to > stem from the fact that the test spawns a jshell process, and then > immediately does a "jhsdb jmap" on the process before jshell has fully > started up. > > The test bug happens when the jmap succeeds, but jshell has not yet > entered the main java thread. Thus the search for "JShellToolProvider" > in the output fails. It expects "JShellToolProvider" to be in the output > because it is part of a method name in the main thread, and the test > dump all the thread stacks contained in the jmap generated hprof file. > When the test fails in this way, you can see the stack dump in the > output, but the main thread is missing. > > There's a couple of ways to fix this. One is to just add a delay (10s > seems to be more than enough), and the other is to retry the "jhsdb > jmap" command until the stack contains the JShellToolProvider symbol. I > chose the later because doing a 10s delay masks the SA issues that are > now covered by JDK-8230872. 
In a way the 10s delay is a better fix, > because it makes this test pass every time, but I did not like that it > also hid real SA problems in JDK-8230872. My plan for now is to do this > retry fix, and then if there are too many failures due to JDK-8230872, > then also add a 10s delay, with the intention of removing it once > JDK-8230872 if fixed. From what I can see, JDK-8230872 failures happen > on about 1% of the runs. > > I made a few of other changes. One was to no longer redirect stderr from > the jmap process as was done from the following: > > processBuilder.redirectError(ProcessBuilder.Redirect.INHERIT); > > This causes the output not to appear in the OutputAnalyzer output, > resulting in the following not working: > > ??????????? output.shouldNotContain("null"); > > Also I added code to dump the output of the jshell process so you can > see if the jshell prompt was ever generated. > > thanks, > > Chris > From chris.plummer at oracle.com Wed Sep 18 22:44:05 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 18 Sep 2019 15:44:05 -0700 Subject: RFR(S): 8228625: [TESTBUG] sun/tools/jhsdb/JShellHeapDumpTest.java fails with RuntimeException 'JShellToolProvider' missing from stdout/stderr In-Reply-To: <4ea2ef0c-0530-41bf-683c-ca57def47beb@oracle.com> References: <4ea2ef0c-0530-41bf-683c-ca57def47beb@oracle.com> Message-ID: <51dd8f1e-1c3b-7688-53f4-9e1ccd5748fa@oracle.com> Is there an easy way of doing this? Currently the jshell process is just spawned using Runtime.exec(). Chris On 9/18/19 3:01 PM, Alex Menkov wrote: > Hi Chris, > > Did you think about waiting for jshell prompt ("jshell>") before run > "jhsdb jmap" command instead of delay or re-tries? 
> > --alex > > On 09/18/2019 14:11, Chris Plummer wrote: >> Hello, >> >> Please review the following changes: >> >> http://cr.openjdk.java.net/~cjplummer/8228625/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8228625 >> >> There are actually numerous ways that JShellHeapDumpTest.java fails. >> One is a test bug, being addressed here, and the rest all seem to be >> SA bugs. Those are now being covered by JDK-8230872. All the issues >> seem to stem from the fact that the test spawns a jshell process, and >> then immediately does a "jhsdb jmap" on the process before jshell has >> fully started up. >> >> The test bug happens when the jmap succeeds, but jshell has not yet >> entered the main java thread. Thus the search for >> "JShellToolProvider" in the output fails. It expects >> "JShellToolProvider" to be in the output because it is part of a >> method name in the main thread, and the test dump all the thread >> stacks contained in the jmap generated hprof file. When the test >> fails in this way, you can see the stack dump in the output, but the >> main thread is missing. >> >> There's a couple of ways to fix this. One is to just add a delay (10s >> seems to be more than enough), and the other is to retry the "jhsdb >> jmap" command until the stack contains the JShellToolProvider symbol. >> I chose the later because doing a 10s delay masks the SA issues that >> are now covered by JDK-8230872. In a way the 10s delay is a better >> fix, because it makes this test pass every time, but I did not like >> that it also hid real SA problems in JDK-8230872. My plan for now is >> to do this retry fix, and then if there are too many failures due to >> JDK-8230872, then also add a 10s delay, with the intention of >> removing it once JDK-8230872 if fixed. From what I can see, >> JDK-8230872 failures happen on about 1% of the runs. >> >> I made a few of other changes. 
One was to no longer redirect stderr
>> from the jmap process as was done from the following:
>>
>> processBuilder.redirectError(ProcessBuilder.Redirect.INHERIT);
>>
>> This causes the output not to appear in the OutputAnalyzer output,
>> resulting in the following not working:
>>
>>             output.shouldNotContain("null");
>>
>> Also I added code to dump the output of the jshell process so you can
>> see if the jshell prompt was ever generated.
>>
>> thanks,
>>
>> Chris
>>

From alexey.menkov at oracle.com Wed Sep 18 23:29:36 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 18 Sep 2019 16:29:36 -0700
Subject: RFR(S): 8228625: [TESTBUG] sun/tools/jhsdb/JShellHeapDumpTest.java fails with RuntimeException 'JShellToolProvider' missing from stdout/stderr
In-Reply-To: <51dd8f1e-1c3b-7688-53f4-9e1ccd5748fa@oracle.com>
References: <4ea2ef0c-0530-41bf-683c-ca57def47beb@oracle.com> <51dd8f1e-1c3b-7688-53f4-9e1ccd5748fa@oracle.com>
Message-ID:

You can use jdk.test.lib classes to simplify the things.
Something like

ProcessBuilder pb = new ProcessBuilder(JDKToolFinder.getTestJDKTool("jshell"));
Process p = ProcessTools.startProcess("JShell", pb,
    s -> {  // warm-up predicate
        return s.contains(">jshell");
    });

--alex

On 09/18/2019 15:44, Chris Plummer wrote:
> Is there an easy way of doing this? Currently the jshell process is just
> spawned using Runtime.exec().
>
> Chris
>
> On 9/18/19 3:01 PM, Alex Menkov wrote:
>> Hi Chris,
>>
>> Did you think about waiting for jshell prompt ("jshell>") before run
>> "jhsdb jmap" command instead of delay or re-tries?
>>
>> --alex
>>
>> On 09/18/2019 14:11, Chris Plummer wrote:
>>> Hello,
>>>
>>> Please review the following changes:
>>>
>>> http://cr.openjdk.java.net/~cjplummer/8228625/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8228625
>>>
>>> There are actually numerous ways that JShellHeapDumpTest.java fails.
>>> One is a test bug, being addressed here, and the rest all seem to be
>>> SA bugs.
Those are now being covered by JDK-8230872. All the issues >>> seem to stem from the fact that the test spawns a jshell process, and >>> then immediately does a "jhsdb jmap" on the process before jshell has >>> fully started up. >>> >>> The test bug happens when the jmap succeeds, but jshell has not yet >>> entered the main java thread. Thus the search for >>> "JShellToolProvider" in the output fails. It expects >>> "JShellToolProvider" to be in the output because it is part of a >>> method name in the main thread, and the test dump all the thread >>> stacks contained in the jmap generated hprof file. When the test >>> fails in this way, you can see the stack dump in the output, but the >>> main thread is missing. >>> >>> There's a couple of ways to fix this. One is to just add a delay (10s >>> seems to be more than enough), and the other is to retry the "jhsdb >>> jmap" command until the stack contains the JShellToolProvider symbol. >>> I chose the later because doing a 10s delay masks the SA issues that >>> are now covered by JDK-8230872. In a way the 10s delay is a better >>> fix, because it makes this test pass every time, but I did not like >>> that it also hid real SA problems in JDK-8230872. My plan for now is >>> to do this retry fix, and then if there are too many failures due to >>> JDK-8230872, then also add a 10s delay, with the intention of >>> removing it once JDK-8230872 if fixed. From what I can see, >>> JDK-8230872 failures happen on about 1% of the runs. >>> >>> I made a few of other changes. One was to no longer redirect stderr >>> from the jmap process as was done from the following: >>> >>> processBuilder.redirectError(ProcessBuilder.Redirect.INHERIT); >>> >>> This causes the output not to appear in the OutputAnalyzer output, >>> resulting in the following not working: >>> >>> ???????????? 
output.shouldNotContain("null"); >>> >>> Also I added code to dump the output of the jshell process so you can >>> see if the jshell prompt was ever generated. >>> >>> thanks, >>> >>> Chris >>> > > From alexey.menkov at oracle.com Wed Sep 18 23:39:16 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 18 Sep 2019 16:39:16 -0700 Subject: RFR(S): 8228625: [TESTBUG] sun/tools/jhsdb/JShellHeapDumpTest.java fails with RuntimeException 'JShellToolProvider' missing from stdout/stderr In-Reply-To: References: <4ea2ef0c-0530-41bf-683c-ca57def47beb@oracle.com> <51dd8f1e-1c3b-7688-53f4-9e1ccd5748fa@oracle.com> Message-ID: <690e1b0e-8f9c-ecda-d9bf-1232a52f03c8@oracle.com> Oh, I mean s.contains("jshell>") --alex On 09/18/2019 16:29, Alex Menkov wrote: > You can use jdk.test.lib classes to simplify the things. > Something like > > ProcessBuilder pb = new > ProcessBuilder(JDKToolFinder.getTestJDKTool("jshell")); > Process p = ProcessTools.startProcess("JShell", pb, > ??? s -> {? // warm-up predicate > ??????? return s.contains(">jshell"); > ??? }); > > --alex > > On 09/18/2019 15:44, Chris Plummer wrote: >> Is there an easy way of doing this? Currently the jshell process is >> just spawned using Runtime.exec(). >> >> Chris >> >> On 9/18/19 3:01 PM, Alex Menkov wrote: >>> Hi Chris, >>> >>> Did you think about waiting for jshell prompt ("jshell>") before run >>> "jhsdb jmap" command instead of delay or re-tries? >>> >>> --alex >>> >>> On 09/18/2019 14:11, Chris Plummer wrote: >>>> Hello, >>>> >>>> Please review the following changes: >>>> >>>> http://cr.openjdk.java.net/~cjplummer/8228625/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8228625 >>>> >>>> There are actually numerous ways that JShellHeapDumpTest.java fails. >>>> One is a test bug, being addressed here, and the rest all seem to be >>>> SA bugs. Those are now being covered by JDK-8230872. 
All the issues >>>> seem to stem from the fact that the test spawns a jshell process, >>>> and then immediately does a "jhsdb jmap" on the process before >>>> jshell has fully started up. >>>> >>>> The test bug happens when the jmap succeeds, but jshell has not yet >>>> entered the main java thread. Thus the search for >>>> "JShellToolProvider" in the output fails. It expects >>>> "JShellToolProvider" to be in the output because it is part of a >>>> method name in the main thread, and the test dump all the thread >>>> stacks contained in the jmap generated hprof file. When the test >>>> fails in this way, you can see the stack dump in the output, but the >>>> main thread is missing. >>>> >>>> There's a couple of ways to fix this. One is to just add a delay >>>> (10s seems to be more than enough), and the other is to retry the >>>> "jhsdb jmap" command until the stack contains the JShellToolProvider >>>> symbol. I chose the later because doing a 10s delay masks the SA >>>> issues that are now covered by JDK-8230872. In a way the 10s delay >>>> is a better fix, because it makes this test pass every time, but I >>>> did not like that it also hid real SA problems in JDK-8230872. My >>>> plan for now is to do this retry fix, and then if there are too many >>>> failures due to JDK-8230872, then also add a 10s delay, with the >>>> intention of removing it once JDK-8230872 if fixed. From what I can >>>> see, JDK-8230872 failures happen on about 1% of the runs. >>>> >>>> I made a few of other changes. One was to no longer redirect stderr >>>> from the jmap process as was done from the following: >>>> >>>> processBuilder.redirectError(ProcessBuilder.Redirect.INHERIT); >>>> >>>> This causes the output not to appear in the OutputAnalyzer output, >>>> resulting in the following not working: >>>> >>>> ???????????? output.shouldNotContain("null"); >>>> >>>> Also I added code to dump the output of the jshell process so you >>>> can see if the jshell prompt was ever generated. 
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>
>>

From david.holmes at oracle.com Wed Sep 18 23:40:32 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 19 Sep 2019 09:40:32 +1000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com> <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com> <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com>
Message-ID: <6732ee43-532f-377d-f37c-fe4e0f9becfe@oracle.com>

Paul,

Unfortunately this patch has broken the vmTestbase/nsk/monitoring tests:

[2019-09-18T22:59:32,349Z]
/scratch/mesos/jib-master/install/jdk-14+15-615/src.full/open/test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/server/ServerThreadMXBeanNew.java:32:
error: ServerThreadMXBeanNew is not abstract and does not override
abstract method getCurrentThreadAllocatedBytes() in ThreadMXBean

and possibly other issues as we are seeing hundreds of failures.

David

On 18/09/2019 8:50 am, David Holmes wrote:
> On 18/09/2019 12:10 am, Hohensee, Paul wrote:
>> Thanks, Serguei. :)
>>
>> David, are you ok with the patch?
>
> Yep, nothing further from me.
>
> David
>
>> Paul
>>
>> *From: *"serguei.spitsyn at oracle.com"
>> *Date: *Tuesday, September 17, 2019 at 2:26 AM
>> *To: *"Hohensee, Paul" , David Holmes
>> , Mandy Chung
>> *Cc: *OpenJDK Serviceability ,
>> "hotspot-gc-dev at openjdk.java.net"
>> *Subject: *Re: RFR (M): 8207266:
>> ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
>>
>> Hi Paul,
>>
>> Thank you for refactoring and fixing the test.
>> It looks great now!
>>
>> Thanks,
>> Serguei
>>
>>
>> On 9/15/19 02:52, Hohensee, Paul wrote:
>>
>>     Hi, Serguei, thanks for the review. New webrev at
>>
>>     http://cr.openjdk.java.net/~phh/8207266/webrev.09/
>>
>>     I refactored the test's main() method, and you're correct,
>>     getThreadAllocatedBytes should be getCurrentThreadAllocatedBytes in
>>     that context: fixed.
>>
>>     Paul
>>
>>     *From: *"serguei.spitsyn at oracle.com"
>>     *Organization: *Oracle Corporation
>>     *Date: *Friday, September 13, 2019 at 5:50 PM
>>     *To: *"Hohensee, Paul", David Holmes, Mandy Chung
>>     *Cc: *OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
>>     *Subject: *Re: RFR (M): 8207266:
>>     ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
>>
>>     Hi Paul,
>>
>>     It looks pretty good in general.
>>
>>     http://cr.openjdk.java.net/~phh/8207266/webrev.08/test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java.frames.html
>>
>>     It would be nice to refactor the java main() method as it becomes
>>     too big.
>>     Two ways of getCurrentThreadAllocatedBytes() testing are good candidates
>>     to become separate methods.
>>
>>       98        long size1 = mbean.getThreadAllocatedBytes(id);
>>
>>     Just wanted to double check if you wanted to invoke
>>     the getCurrentThreadAllocatedBytes() instead as it is
>>     a part of:
>>
>>       85         // First way, getCurrentThreadAllocatedBytes
>>
>>     Thanks,
>>     Serguei
>>
>>     On 9/13/19 12:11 PM, Hohensee, Paul wrote:
>>
>>         Hi David, thanks for your comments. New webrev in
>>
>>         http://cr.openjdk.java.net/~phh/8207266/webrev.08/
>>
>>         Both the old and new versions of the code check that thread
>>         allocated memory is both supported and enabled. The existing version
>>         of getThreadAllocatedBytes(long []) calls
>>         verifyThreadAllocatedMemory(long []), which checks inline to make sure
>>         thread allocated memory is supported, then calls
>>         isThreadAllocatedMemoryEnabled() to verify that it's enabled.
>>         isThreadAllocatedMemoryEnabled() duplicates (!) the support check and
>>         returns the enabled flag. I removed the redundant check in the new
>>         version.
>>
>>         You're of course correct about the back-to-back check.
>>         Application code can't know when the runtime will hijack a thread for
>>         its own purposes. I've removed the check.
>>
>>         Paul
>>
>>         On 9/13/19, 12:50 AM, "David Holmes" wrote:
>>
>>             Hi Paul,
>>
>>             On 13/09/2019 10:29 am, Hohensee, Paul wrote:
>>             > Thanks for clarifying the review rules. Would someone from the
>>             > serviceability team please review? New webrev at
>>             >
>>             > http://cr.openjdk.java.net/~phh/8207266/webrev.07/
>>
>>             One aspect of the functional change needs clarification for me - and
>>             apologies if this has been covered in the past. It seems to me that
>>             currently we only check isThreadAllocatedMemorySupported for these
>>             operations, but if I read things correctly the updated code additionally
>>             checks isThreadAllocatedMemoryEnabled, which is a behaviour change not
>>             mentioned in the CSR.
>>
>>             > I didn't disturb the existing checks in the test, just added code to
>>             > check the result of getThreadAllocatedBytes(long) on a non-current
>>             > thread, plus the back-to-back no-allocation checks. The former wasn't
>>             > needed before because getThreadAllocatedBytes(long) was just a wrapper
>>             > around getThreadAllocatedBytes(long []). This patch changes that, so I
>>             > added a separate test. The latter is supposed to fail if there's object
>>             > allocation on calls to getCurrentThreadAllocatedBytes and
>>             > getThreadAllocatedBytes(long). I.e., a feature, not a bug, because
>>             > accumulation of transient small objects can be a performance problem.
>>             > Thanks to your review, I noticed that the back-to-back check on the
>>             > current thread was using getThreadAllocatedBytes(long) instead of
>>             > getCurrentThreadAllocatedBytes and fixed it. I also removed all
>>             > instances of "TEST FAILED: ".
>>
>>             The back-to-back check is not valid in general. You don't know if the
>>             first check might trigger some class loading on the return path after it
>>             has obtained the first memory value. The check might also fail if using
>>             JVMCI and some compilation related activity occurs in the current thread
>>             on the second call. Also with the introduction of handshakes its
>>             possible the current thread might hit a safepoint checks that results in
>>             it executing a handshake operation that performs allocation. Potentially
>>             there could be numerous non-deterministic actions that might occur
>>             leading to unanticipated allocation.
>>
>>             I understand what you want to test here, I just don't think it is
>>             reliably doable.
>>
>>             Thanks,
>>             David
>>             -----
>>
>>             > Paul
>>             >
>>             > *From: *Mandy Chung
>>             > *Date: *Thursday, September 12, 2019 at 10:09 AM
>>             > *To: *"Hohensee, Paul"
>>             > *Cc: *OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
>>             > *Subject: *Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes()
>>             > can be quicker for self thread
>>             >
>>             > On 9/3/19 12:38 PM, Hohensee, Paul wrote:
>>             >
>>             >     Minor update in new webrev http://cr.openjdk.java.net/~phh/8207266/webrev.05/.
>>             >
>>             > I only reviewed the library side implementation that looks good.  I
>>             > expect the serviceability team to review the test and hotspot change.
>>             >
>>             >     Need a confirmatory review to push this. If I understand the rules
>>             >     correctly, it doesn't need a Reviewer review since Mandy's already
>>             >     reviewed it, it just needs a Committer review.
>>             >
>>             > You need another reviewer to advice the following because I was not
>>             > close to the ThreadsList work.
>>             >
>>             > 2087   ThreadsListHandle tlh;
>>             > 2088   JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(thread_id);
>>             > 2089
>>             > 2090   if (java_thread != NULL) {
>>             > 2091     return java_thread->cooked_allocated_bytes();
>>             > 2092   }
>>             >
>>             > This looks right to me.
>>             >
>>             > test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java
>>             >
>>             > -                "ThreadAllocatedMemory is expected to be disabled");
>>             > +                "TEST FAILED: ThreadAllocatedMemory is expected to be disabled");
>>             >
>>             > Prepending "TEST FAILED" in exception message (in several places)
>>             > seems redundant since such RuntimeException is thrown and expected
>>             > a test failure.
>>             >
>>             > +        // back-to-back calls shouldn't allocate any memory
>>             > +        size = mbean.getThreadAllocatedBytes(id);
>>             > +        size1 = mbean.getThreadAllocatedBytes(id);
>>             > +        if (size1 != size) {
>>             >
>>             > Is there anything in the test can do to help guarantee this? I didn't
>>             > closely review this test.  The main thing I advice is to improve
>>             > the reliability of this test.  Put it in another way, we want to
>>             > ensure that this test change will pass all the time in various
>>             > test configuration.
>>             >
>>             > Mandy
From hohensee at amazon.com Wed Sep 18 23:47:02 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 18 Sep 2019 23:47:02 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <6732ee43-532f-377d-f37c-fe4e0f9becfe@oracle.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com> <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com> <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com> <6732ee43-532f-377d-f37c-fe4e0f9becfe@oracle.com>
Message-ID:

I'll take a look.

On 9/18/19, 4:40 PM, "David Holmes" wrote:

    Paul,

    Unfortunately this patch has broken the vmTestbase/nsk/monitoring tests:

    [2019-09-18T22:59:32,349Z]
    /scratch/mesos/jib-master/install/jdk-14+15-615/src.full/open/test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/server/ServerThreadMXBeanNew.java:32:
    error: ServerThreadMXBeanNew is not abstract and does not override
    abstract method getCurrentThreadAllocatedBytes() in ThreadMXBean

    and possibly other issues as we are seeing hundreds of failures.

    David

    On 18/09/2019 8:50 am, David Holmes wrote:
    > On 18/09/2019 12:10 am, Hohensee, Paul wrote:
    >> Thanks, Serguei. :)
    >>
    >> David, are you ok with the patch?
    >
    > Yep, nothing further from me.
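
Earlier in this thread David Holmes pointed out that an exact back-to-back equality check on the allocation counter is not reliable, because the runtime can allocate on the current thread at any time. One alternative is to test a lower bound after a known allocation. The following is a minimal Java sketch of that idea, not the code from the webrev: the class name AllocatedBytesSketch, the sink field, and the helper deltaAfterAllocating are hypothetical, and it assumes a JVM whose platform thread bean implements com.sun.management.ThreadMXBean.

```java
import java.lang.management.ManagementFactory;

// Hypothetical illustration; names are not taken from the patch or its test.
class AllocatedBytesSketch {
    // Publishing the array to a volatile field keeps it reachable, so the JIT
    // cannot optimize the allocation away.
    static volatile byte[] sink;

    // Returns how much the current thread's allocation counter grew across one
    // array allocation, or -1 if the measurement is unavailable on this JVM.
    static long deltaAfterAllocating(int bytes) {
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean mbean = (com.sun.management.ThreadMXBean) base;
        // Check "supported" first: the "enabled" query is only meaningful when supported.
        if (!mbean.isThreadAllocatedMemorySupported() || !mbean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = mbean.getThreadAllocatedBytes(id);
        sink = new byte[bytes];
        long after = mbean.getThreadAllocatedBytes(id);
        return after - before;
    }

    public static void main(String[] args) {
        int request = 1 << 20;
        long delta = deltaAfterAllocating(request);
        if (delta < 0) {
            System.out.println("thread allocation accounting unavailable");
            return;
        }
        // Lower bound only: class loading, JIT activity or handshakes may add
        // further allocation on this thread between the two reads.
        if (delta < request) {
            throw new RuntimeException("counter grew by less than the requested size");
        }
        System.out.println("ok: counter grew by at least the requested size");
    }
}
```

The check asserts only that the counter grew by at least the array size, so extra allocation performed by the runtime on the same thread does not cause a spurious failure.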
From hohensee at amazon.com Thu Sep 19 00:00:58 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 19 Sep 2019 00:00:58 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To:
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com> <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com> <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com> <6732ee43-532f-377d-f37c-fe4e0f9becfe@oracle.com>
Message-ID: <93FFE1B3-C1BA-4568-9402-48EB74BB089B@amazon.com>

They all implement com.sun.management.ThreadMXBean, so adding a getCurrentThreadAllocatedBytes broke them. Potential fix is to give it a default implementation, viz.

    public default long getCurrentThreadAllocatedBytes() {
        return -1;
    }

Shall I go with that, or reverse the original patch?

On 9/18/19, 4:48 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote:

    I'll take a look.

    On 9/18/19, 4:40 PM, "David Holmes" wrote:

        Paul,

        Unfortunately this patch has broken the vmTestbase/nsk/monitoring tests:

        [2019-09-18T22:59:32,349Z]
        /scratch/mesos/jib-master/install/jdk-14+15-615/src.full/open/test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/server/ServerThreadMXBeanNew.java:32:
        error: ServerThreadMXBeanNew is not abstract and does not override
        abstract method getCurrentThreadAllocatedBytes() in ThreadMXBean

        and possibly other issues as we are seeing hundreds of failures.

        David

        On 18/09/2019 8:50 am, David Holmes wrote:
        > On 18/09/2019 12:10 am, Hohensee, Paul wrote:
        >> Thanks, Serguei. :)
        >>
        >> David, are you ok with the patch?
        >
        > Yep, nothing further from me.
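
The compile failure reported in this thread, and the default-method fix proposed for it, can be reproduced in isolation. The sketch below is an illustration under stated assumptions rather than the JDK code: AllocationBean is a hypothetical stand-in for com.sun.management.ThreadMXBean, and LegacyBean and UpdatedBean are hypothetical implementors. It shows that a method added to an interface as a default method leaves older implementors compiling, at the price of a placeholder return value (-1 here, as in the proposal).

```java
// Hypothetical stand-in for an MXBean-style interface; not the real
// com.sun.management.ThreadMXBean.
interface AllocationBean {
    long getThreadAllocatedBytes(long id);

    // Method added after implementors already existed. If it were declared
    // abstract, LegacyBean below would fail to compile with "is not abstract
    // and does not override abstract method". As a default method it is
    // inherited instead.
    default long getCurrentThreadAllocatedBytes() {
        return -1;
    }
}

// Written against the older interface: overrides only the original method.
class LegacyBean implements AllocationBean {
    @Override
    public long getThreadAllocatedBytes(long id) {
        return 0L;
    }
}

// An implementor that supports the new method properly.
class UpdatedBean implements AllocationBean {
    @Override
    public long getThreadAllocatedBytes(long id) {
        return 42L; // fixed value so the example is deterministic
    }

    @Override
    public long getCurrentThreadAllocatedBytes() {
        return getThreadAllocatedBytes(Thread.currentThread().getId());
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        System.out.println("legacy: " + new LegacyBean().getCurrentThreadAllocatedBytes());   // legacy: -1
        System.out.println("updated: " + new UpdatedBean().getCurrentThreadAllocatedBytes()); // updated: 42
    }
}
```

Callers then have to treat -1 as "not available", which is why supporting the new method in the affected implementors is described in the thread as the better long-term fix.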
From hohensee at amazon.com Thu Sep 19 00:10:02 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 19 Sep 2019 00:10:02 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <93FFE1B3-C1BA-4568-9402-48EB74BB089B@amazon.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com> <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com> <03A2509C-5587-448A-82F8-9240EA040326@amazon.com> <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com> <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com> <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com> <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com> <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com> <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com> <6732ee43-532f-377d-f37c-fe4e0f9becfe@oracle.com> <93FFE1B3-C1BA-4568-9402-48EB74BB089B@amazon.com>
Message-ID:

I've filed https://bugs.openjdk.java.net/browse/JDK-8231209 for this quick fix. A better fix is to support getCurrentThreadAllocatedBytes in these tests.

On 9/18/19, 5:02 PM, "hotspot-gc-dev on behalf of Hohensee, Paul" wrote:

    They all implement com.sun.management.ThreadMXBean, so adding a getCurrentThreadAllocatedBytes broke them. Potential fix is to give it a default implementation, viz.

        public default long getCurrentThreadAllocatedBytes() {
            return -1;
        }

    Shall I go with that, or reverse the original patch?

    On 9/18/19, 4:48 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote:

        I'll take a look.

        On 9/18/19, 4:40 PM, "David Holmes" wrote:

            Paul,

            Unfortunately this patch has broken the vmTestbase/nsk/monitoring tests:

            [2019-09-18T22:59:32,349Z]
            /scratch/mesos/jib-master/install/jdk-14+15-615/src.full/open/test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/server/ServerThreadMXBeanNew.java:32:
            error: ServerThreadMXBeanNew is not abstract and does not override
            abstract method getCurrentThreadAllocatedBytes() in ThreadMXBean

            and possibly other issues as we are seeing hundreds of failures.

            David

            On 18/09/2019 8:50 am, David Holmes wrote:
            > On 18/09/2019 12:10 am, Hohensee, Paul wrote:
            >> Thanks, Serguei. :)
            >>
            >> David, are you ok with the patch?
            >
            > Yep, nothing further from me.
            >
            > David
            >
            >> Paul
>> >> >
>> >> > +        // back-to-back calls shouldn't allocate any memory
>> >> > +        size = mbean.getThreadAllocatedBytes(id);
>> >> > +        size1 = mbean.getThreadAllocatedBytes(id);
>> >> > +        if (size1 != size) {
>> >> >
>> >> > Is there anything in the test can do to help guarantee this? I didn't
>> >> > closely review this test. The main thing I advice is to improve
>> >> > the reliability of this test. Put it in another way, we want to
>> >> > ensure that this test change will pass all the time in various
>> >> > test configuration.
>> >> >
>> >> > Mandy
>> >> >
>> >>

From daniel.daugherty at oracle.com Thu Sep 19 00:14:13 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Sep 2019 20:14:13 -0400
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <93FFE1B3-C1BA-4568-9402-48EB74BB089B@amazon.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com>
 <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>
 <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com>
 <03A2509C-5587-448A-82F8-9240EA040326@amazon.com>
 <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com>
 <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com>
 <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com>
 <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com>
 <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com>
 <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com>
 <6732ee43-532f-377d-f37c-fe4e0f9becfe@oracle.com>
 <93FFE1B3-C1BA-4568-9402-48EB74BB089B@amazon.com>
Message-ID: <64069096-23e3-efd5-44cd-2fff0618d28f@oracle.com>

> Shall I go with that, or reverse the original patch?

I'm a bit worried about what else might show up since the
NSK monitoring tests were not run prior to this push.

I vote for backing out the fix until proper testing has
been done (and at least the one problem fixed...)

Dan


On 9/18/19 8:00 PM, Hohensee, Paul wrote:
> They all implement com.sun.management.ThreadMXBean, so adding a
> getCurrentThreadAllocatedBytes broke them. Potential fix is to give it
> a default implementation, vis
>
>     public default long getCurrentThreadAllocatedBytes() {
>         return -1;
>     }
>
> Shall I go with that, or reverse the original patch?
>
> On 9/18/19, 4:48 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote:
>
>     I'll take a look.
>
>     On 9/18/19, 4:40 PM, "David Holmes" wrote:
>
>         Paul,
>
>         Unfortunately this patch has broken the vmTestbase/nsk/monitoring tests:
>
>         [2019-09-18T22:59:32,349Z]
>         /scratch/mesos/jib-master/install/jdk-14+15-615/src.full/open/test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/server/ServerThreadMXBeanNew.java:32:
>         error: ServerThreadMXBeanNew is not abstract and does not override
>         abstract method getCurrentThreadAllocatedBytes() in ThreadMXBean
>
>         and possibly other issues as we are seeing hundreds of failures.
>
>         David

From hohensee at amazon.com Thu Sep 19 00:17:04 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 19 Sep 2019 00:17:04 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <64069096-23e3-efd5-44cd-2fff0618d28f@oracle.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com>
 <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>
 <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com>
 <03A2509C-5587-448A-82F8-9240EA040326@amazon.com>
 <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com>
 <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com>
 <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com>
 <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com>
 <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com>
 <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com>
 <6732ee43-532f-377d-f37c-fe4e0f9becfe@oracle.com>
 <93FFE1B3-C1BA-4568-9402-48EB74BB089B@amazon.com>
 <64069096-23e3-efd5-44cd-2fff0618d28f@oracle.com>
Message-ID: <9A151A35-DA66-4B0F-B67F-E7F5BECF205A@amazon.com>

Is there a tool that will generate a reversal patch?

On 9/18/19, 5:14 PM, "Daniel D. Daugherty" wrote:

    > Shall I go with that, or reverse the original patch?

    I'm a bit worried about what else might show up since the
    NSK monitoring tests were not run prior to this push.

    I vote for backing out the fix until proper testing has
    been done (and at least the one problem fixed...)

    Dan


    On 9/18/19 8:00 PM, Hohensee, Paul wrote:
    > They all implement com.sun.management.ThreadMXBean, so adding a
    > getCurrentThreadAllocatedBytes broke them. Potential fix is to give it
    > a default implementation, vis
    >
    >     public default long getCurrentThreadAllocatedBytes() {
    >         return -1;
    >     }
    >
    > Shall I go with that, or reverse the original patch?
    >
    > On 9/18/19, 4:48 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote:
    >
    >     I'll take a look.
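
The source-compatibility point raised in this exchange can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the JDK code: AllocationBean, LegacyBean and UpdatedBean are hypothetical stand-ins for com.sun.management.ThreadMXBean and the classes that implement it, and the -1 return value simply mirrors the default body proposed above. It shows that a class written before the new method existed still compiles and inherits the default, while an updated class is free to override it.

```java
// Hypothetical stand-ins for the situation discussed in the thread;
// none of these types are part of the JDK.
public class DefaultMethodDemo {

    /** Plays the role of an existing, widely implemented interface. */
    interface AllocationBean {
        long getThreadAllocatedBytes(long id);

        // Newly added method. Because it has a default body, implementers
        // written before it existed keep compiling. The -1 value mirrors
        // the default proposed in the thread.
        default long getCurrentThreadAllocatedBytes() {
            return -1;
        }
    }

    /** Written before the new method existed; does not override it. */
    static class LegacyBean implements AllocationBean {
        @Override
        public long getThreadAllocatedBytes(long id) {
            return 0L;
        }
    }

    /** An updated implementer that replaces the default with real logic. */
    static class UpdatedBean implements AllocationBean {
        @Override
        public long getThreadAllocatedBytes(long id) {
            return 1024L; // fixed value, for illustration only
        }

        @Override
        public long getCurrentThreadAllocatedBytes() {
            return getThreadAllocatedBytes(Thread.currentThread().getId());
        }
    }

    public static void main(String[] args) {
        // The legacy class inherits the default body.
        System.out.println(new LegacyBean().getCurrentThreadAllocatedBytes());
        // The updated class uses its own override.
        System.out.println(new UpdatedBean().getCurrentThreadAllocatedBytes());
    }
}
```

If getCurrentThreadAllocatedBytes() were declared in the interface without a body, LegacyBean would be rejected by the compiler with the same kind of "is not abstract and does not override abstract method" error reported for ServerThreadMXBeanNew.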

From daniel.daugherty at oracle.com Thu Sep 19 00:18:43 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Sep 2019 20:18:43 -0400
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <9A151A35-DA66-4B0F-B67F-E7F5BECF205A@amazon.com>
References: <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>
 <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com>
 <03A2509C-5587-448A-82F8-9240EA040326@amazon.com>
 <6f674d71-58f6-bc79-7d08-7bcc24e3b0fa@oracle.com>
 <5252a51d-4217-000b-1444-a088bb8a6a58@oracle.com>
 <873119A8-C595-4B73-AD0B-1625D6CAC47D@amazon.com>
 <56ea7c5f-8c91-9a05-6d95-255bfd0c154d@oracle.com>
 <5417BEA4-AECD-4130-B269-19847C0092B3@amazon.com>
 <1561d09b-68ff-55fa-128a-045798a3d6a9@oracle.com>
 <6732ee43-532f-377d-f37c-fe4e0f9becfe@oracle.com>
 <93FFE1B3-C1BA-4568-9402-48EB74BB089B@amazon.com>
 <64069096-23e3-efd5-44cd-2fff0618d28f@oracle.com>
 <9A151A35-DA66-4B0F-B67F-E7F5BECF205A@amazon.com>
Message-ID:

% hg backout

is the usual way to do this...

Dan


On 9/18/19 8:17 PM, Hohensee, Paul wrote:
> Is there a tool that will generate a reversal patch?
>
> On 9/18/19, 5:14 PM, "Daniel D. Daugherty" wrote:
>
>     > Shall I go with that, or reverse the original patch?
>
>     I'm a bit worried about what else might show up since the
>     NSK monitoring tests were not run prior to this push.
>
>     I vote for backing out the fix until proper testing has
>     been done (and at least the one problem fixed...)
>
>     Dan
>
>
>     On 9/18/19 8:00 PM, Hohensee, Paul wrote:
>     > They all implement com.sun.management.ThreadMXBean, so adding a
>     > getCurrentThreadAllocatedBytes broke them. Potential fix is to give it
>     > a default implementation, vis
>     >
>     >     public default long getCurrentThreadAllocatedBytes() {
>     >         return -1;
>     >     }
>     >
>     > Shall I go with that, or reverse the original patch?
>     >
>     > On 9/18/19, 4:48 PM, "serviceability-dev on behalf of Hohensee, Paul" wrote:
>     >
>     >     I'll take a look.
>     >
>     >     On 9/18/19, 4:40 PM, "David Holmes" wrote:
>     >
>     >         Paul,
>     >
>     >         Unfortunately this patch has broken the vmTestbase/nsk/monitoring tests:
>     >
>     >         [2019-09-18T22:59:32,349Z]
>     >         /scratch/mesos/jib-master/install/jdk-14+15-615/src.full/open/test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/server/ServerThreadMXBeanNew.java:32:
>     >         error: ServerThreadMXBeanNew is not abstract and does not override
>     >         abstract method getCurrentThreadAllocatedBytes() in ThreadMXBean
>     >
>     >         and possibly other issues as we are seeing hundreds of failures.
>     >
>     >         David
>     >
>     >         On 18/09/2019 8:50 am, David Holmes wrote:
>     >         > On 18/09/2019 12:10 am, Hohensee, Paul wrote:
>     >         >> Thanks, Serguei. :)
>     >         >>
>     >         >> David, are you ok with the patch?
>     >         >
>     >         > Yep, nothing further from me.
>     >         >
>     >         > David
>     >         >
>     >         >> Paul
>     >         >>
>     >         >> *From: *"serguei.spitsyn at oracle.com"
>     >         >> *Date: *Tuesday, September 17, 2019 at 2:26 AM
>     >         >> *To: *"Hohensee, Paul", David Holmes, Mandy Chung
>     >         >> *Cc: *OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
>     >         >> *Subject: *Re: RFR (M): 8207266:
>     >         >> ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
>     >         >>
>     >         >> Hi Paul,
>     >         >>
>     >         >> Thank you for refactoring and fixing the test.
>     >         >> It looks great now!
>     >         >>
>     >         >> Thanks,
>     >         >> Serguei
>     >         >>
>     >         >>
>     >         >> On 9/15/19 02:52, Hohensee, Paul wrote:
>     >         >>
>     >         >>     Hi, Serguei, thanks for the review. New webrev at
>     >         >>
>     >         >>     http://cr.openjdk.java.net/~phh/8207266/webrev.09/
>     >         >>
>     >         >>     I refactored the test's main() method, and you're correct,
>     >         >>     getThreadAllocatedBytes should be getCurrentThreadAllocatedBytes in
>     >         >>     that context: fixed.
>     >         >>
>     >         >>     Paul
>     >         >>
>     >         >>     *From: *"serguei.spitsyn at oracle.com"
>     >         >>     *Organization: *Oracle Corporation
>     >         >>     *Date: *Friday, September 13, 2019 at 5:50 PM
>     >         >>     *To: *"Hohensee, Paul", David Holmes, Mandy Chung
>     >         >>     *Cc: *OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
>     >         >>     *Subject: *Re: RFR (M): 8207266:
>     >         >>     ThreadMXBean::getThreadAllocatedBytes() can be quicker for self
>     >         >>     thread
>     >         >>
>     >         >>     Hi Paul,
>     >         >>
>     >         >>     It looks pretty good in general.
>     >         >>
>     >         >>     http://cr.openjdk.java.net/~phh/8207266/webrev.08/test/jdk/com/sun/management/ThreadMXBean/ThreadAllocatedMemory.java.frames.html
>     >         >>
>     >         >>     It would be nice to refactor the java main() method as it becomes
>     >         >>     too big.
>     >         >>     Two ways of getCurrentThreadAllocatedBytes() testing are good
>     >         >>     candidates
>     >         >>     to become separate methods.
>     >         >>
>     >         >>     98 long size1 = mbean.getThreadAllocatedBytes(id);
>     >         >>
>     >         >>     Just wanted to double check if you wanted to invoke
>     >         >>     the getCurrentThreadAllocatedBytes() instead as it is
>     >         >>     a part of:
>     >         >>
>     >         >>     85 // First way, getCurrentThreadAllocatedBytes
>     >         >>
>     >         >>     Thanks,
>     >         >>     Serguei
>     >         >>
>     >         >>     On 9/13/19 12:11 PM, Hohensee, Paul wrote:
>     >         >>
>     >         >>         Hi David, thanks for your comments. New webrev in
>     >         >>
>     >         >>         http://cr.openjdk.java.net/~phh/8207266/webrev.08/
>     >         >>
>     >         >>         Both the old and new versions of the code check that thread
>     >         >>         allocated memory is both supported and enabled. The existing version
>     >         >>         of getThreadAllocatedBytes(long []) calls
>     >         >>         verifyThreadAllocatedMemory(long []), which checks inline to make sure
>     >         >>         thread allocated memory is supported, then calls
>     >         >>         isThreadAllocatedMemoryEnabled() to verify that it's enabled.
>     >         >>         isThreadAllocatedMemoryEnabled() duplicates (!) the support check and
>     >         >>         returns the enabled flag. I removed the redundant check in the new
>     >         >>         version.
>     >         >>
>     >         >>         You're of course correct about the back-to-back check.
>     >         >>         Application code can't know when the runtime will hijack a thread for