From dmitry.samersoff at oracle.com Sat Feb 1 01:02:29 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Sat, 01 Feb 2014 13:02:29 +0400
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EC62CD.8000300@oracle.com>
References: <52EC62CD.8000300@oracle.com>
Message-ID: <52ECB825.70600@oracle.com>

Looks good to me!

On 2014-02-01 06:58, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
> https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/
>
>
> Summary:
>
> There is a general issue in the suspend equivalent condition mechanism:
> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may
> return different results:
> - 1-st: true
> - 2-nd: false
>
> This more generic suspend equivalent issue is covered by another bug:
> https://bugs.openjdk.java.net/browse/JDK-6280037
>
> The bug to fix in this review is a specific manifestation of the 6280037
> in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
> It is on the Test Stabilization radar (as well as the 6280037).
> There are many tests intermittently failing because of this.
>
> The webrev for review is a one-liner work around the 6280037 for the
> GetFrameCount().
>
> The JVMTI GetFrameCount() spec tells:
> "If this function is called for a thread actively executing
> bytecodes (for example,
> not the current thread and not suspended), the information returned
> is transient."
>
> So, it is Ok to call the GetFrameCount() for non-suspended target
> threads.
> To achieve safety, the frame count for non-suspended threads is
> calculated at a safepoint.
> It should be Ok and more safe to do the same for suspended threads as
> well.
> There is no big performance impact because it is already on a slow path.
> It is still important to avoid safepointing when the target thread is
> current.
> > The bug 6280037 should go out of the Test Stabilization radar (remove > the svc-nightly label) > as the most of the impacted tests are covered by the 6471769. > > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests > > > Thanks, > Serguei > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the source code. From serguei.spitsyn at oracle.com Sat Feb 1 04:24:43 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sat, 01 Feb 2014 04:24:43 -0800 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <52ECB825.70600@oracle.com> References: <52EC62CD.8000300@oracle.com> <52ECB825.70600@oracle.com> Message-ID: <52ECE78B.10809@oracle.com> Thanks, Dmitry! Serguei On 2/1/14 1:02 AM, Dmitry Samersoff wrote: > Looks good for me! > > On 2014-02-01 06:58, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6471769 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ >> >> >> Summary: >> >> There is a general issue in the suspend equivalent condition mechanism: >> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may >> return different results: >> - 1-st: true >> - 2-nd: false >> >> This more generic suspend equivalent issue is covered by another bug: >> https://bugs.openjdk.java.net/browse/JDK-6280037 >> >> The bug to fix in this review is a specific manifestation of the 6280037 >> in the JVMTI GetFrameCount() that has a big impact on the SQE nightly. >> It is on the Test Stabilization radar (as well as the 6280037). >> There are many tests intermittently failing because of this. >> >> The webrev for review is a one-liner work around the 6280037 for the >> GetFrameCount(). 
>>
>> The JVMTI GetFrameCount() spec tells:
>> "If this function is called for a thread actively executing
>> bytecodes (for example,
>> not the current thread and not suspended), the information returned
>> is transient."
>>
>> So, it is Ok to call the GetFrameCount() for non-suspended target
>> threads.
>> To achieve safety, the frame count for non-suspended threads is
>> calculated at a safepoint.
>> It should be Ok and more safe to do the same for suspended threads as
>> well.
>> There is no big performance impact because it is already on a slow path.
>> It is still important to avoid safepointing when the target thread is
>> current.
>>
>> The bug 6280037 should go out of the Test Stabilization radar (remove
>> the svc-nightly label)
>> as the most of the impacted tests are covered by the 6471769.
>>
>>
>> Testing:
>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>
>>
>> Thanks,
>> Serguei
>>
>

From daniel.daugherty at oracle.com Sat Feb 1 10:38:32 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Sat, 01 Feb 2014 11:38:32 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
Message-ID: <52ED3F28.5070806@oracle.com>

Greetings,

I have a fix ready for the following bug:

8028073 race condition in ObjectMonitor implementation causing deadlocks
https://bugs.openjdk.java.net/browse/JDK-8028073

On the surface, this is a very simple fix that relocates a few lines of
code, relocates and rewrites the comments associated with that code and
adds several new comments.

Of course, in reality, the issue is much more complicated, but I'm
hoping to make it easy for anyone not acquainted with this issue to
understand what's going on.

Here are the JDK9 webrev URLs:

OpenJDK:
http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/

Oracle internal:
http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/

The simple summary:

- since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
  it is possible for a JVM/TI monitor event handler to accidentally
  consume a ParkEvent.unpark() call meant for the Java Monitor layer
- the original code fix was made on 2005.07.04 using this bug ID:
  https://bugs.openjdk.java.net/browse/JDK-5030359
- it's the right fix, but it's in the wrong place
- the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
  event handler is called because it is that event handler
  that can cause the hang


Testing
-------

- a new StressMonitorWait test has been created that reliably
  reproduces the hang in JDK[6789]; see the bug's gory details
  for the specific versions where the hang has been reproduced
  - the test reliably reproduces the hang in 5 seconds on my
    T7600 running Solaris 10u11 X86; 1 minute runs reproduce
    the hang reliably on other machines
- 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
  Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
  bits did not reproduce the hang
- JPRT test job
- VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
  Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
  and Windows-X64:
  - vm.quick
  - Kitchensink (bigapps)
  - Weblogic+medrec (bigapps)
  - runThese (bigapps)


The Gory Details Start Here
---------------------------

This is the old location of the block of code that's being moved:

src/share/vm/runtime/objectMonitor.cpp:

1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {

1499    exit (true, Self) ;                    // exit the monitor

1513    if (node._notified != 0 && _succ == Self) {
1514       node._event->unpark();
1515    }


This is the new location of the block of code that's being moved:

src/share/vm/runtime/objectMonitor.cpp:

1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {

1601      if (JvmtiExport::should_post_monitor_waited()) {
1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);

1604        if (node._notified != 0 && _succ == Self) {

1620          node._event->unpark();
1621        }


The Risks
---------

- The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
  is enabled:
  - previously it was always executed
  - while the old code was not effective for the hang that is being
    fixed with this bug, it is possible that the old code prevented
    a different bug in the successor protocol from manifesting
  - thorough analysis of the successor protocol did not reveal a
    case where the old code was needed in the old location
- Thorough analysis indicates that the other JVM/TI monitor events
  do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
  - the successor protocol is complicated and the analysis could
    be wrong when certain options are used
  - comments were added to each location where a JVM/TI monitor
    event handler is called documenting why a fix like this one
    is not needed there
  - if the analysis is wrong, the new comments show where a new
    code change would be needed


The Scenario
------------

I've created a scenario that reproduces this hang:

T1 - enters monitor and calls monitor.wait()
T2 - enters the monitor, calls monitor.notify() and exits the monitor
T3 - enters and exits the monitor
T4 - enters the monitor, delays for 5 seconds, exits the monitor

A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.

Here are the six events necessary to make this hang happen:

// KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
// KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.

// KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
// KEY-EVENT-2b: is consumed by the JVM/TI event handler.

// KEY-EVENT-3a: T3 made T1 the successor
// KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.

// KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
// KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.

// KEY-EVENT-5a: T4 sees that T1 is still the successor so
// KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)

// KEY-EVENT-6a: T1 is about to park and it is the successor, but
// KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
// KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.


This bug is intertwined with:

- The ObjectMonitor successor protocol
- the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors

There is a very long successor.notes attachment to JDK-8028073 that
attempts to describe the ObjectMonitor successor protocol. It's good
for putting pretty much anyone to sleep.

Since this hang reproduces back to JDK6, this bug is taking the easily
backported solution of moving the original fix to the right location.
The following new bug has been filed for possible future work in this
area by the Serviceability Team:

8033399 add a separate ParkEvent for JVM/TI RawMonitor use
https://bugs.openjdk.java.net/browse/JDK-8033399


The Symptoms
------------

With intermittent hangs like this, it is useful to know what to look
for in order to determine if you are running into this issue:

- if you aren't using a debugger or a profiler or some other
  JVM/TI agent, then this hang is not the same as yours
- if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
  event handler, then this hang is not the same as yours
- if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
  JVM/TI RawMonitors, then this hang is not the same as yours
- if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
  back into Java code, then you might just be insane and this
  hang might be similar to yours.
  However, using a Java callback
  in an event handler is an even bigger problem/risk so fix that
  first.
- if you have one or more threads blocked like this and making no
  progress, then this hang might be the same as yours:

"T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
   java.lang.Thread.State: BLOCKED (on object monitor)
   JavaThread state: _thread_blocked
Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
   JavaThread state: _thread_blocked
        at java.lang.Object.wait(Native Method)
        - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
        at java.lang.Object.wait(Object.java:502)
        at SMW_WorkerThread.run(StressMonitorWait.java:103)
        - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)

"T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
   java.lang.Thread.State: BLOCKED (on object monitor)
   JavaThread state: _thread_blocked
Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
   JavaThread state: _thread_blocked
        at SMW_WorkerThread.run(StressMonitorWait.java:120)
        - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)

"T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
   JavaThread state: _thread_blocked
Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
   JavaThread state: _thread_blocked
        at SMW_WorkerThread.run(StressMonitorWait.java:139)
        - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)

Key symptoms in thread T1:

- had the object locked:

    locked <0xfffffd7e6a2b6ff0> (a java.lang.String)

- did an Object.wait():

    waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)

- is blocked on reentry:

    waiting for monitor entry [0xfffffd7fc0231000]

Key symptoms in thread T2:

- is blocked waiting to lock the object:

    waiting for monitor entry [0xfffffd7fc0130000]
    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)

Key symptoms in thread T3:

- is blocked waiting to lock the object:

    waiting for monitor entry [0xfffffd7fc002f000]
    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)

From staffan.larsen at oracle.com Mon Feb 3 02:39:12 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 3 Feb 2014 11:39:12 +0100
Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not available in client VM
In-Reply-To: <52EBFC89.3030809@oracle.com>
References: <52EBFC89.3030809@oracle.com>
Message-ID:

It looks like this new version will leave some things uninitialized if an item higher on the list throws an exception. I don't think that is what you want.

/Staffan

On 31 jan 2014, at 20:42, Dmitry Samersoff wrote:

> Hi Everybody,
>
> Please review the fix:
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>
> sa.js try to initialize all possible VM structures regardless of type of
> VM we are running on
>
> --
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the sources.

From dmitry.samersoff at oracle.com Mon Feb 3 02:59:01 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Mon, 03 Feb 2014 14:59:01 +0400
Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not available in client VM
In-Reply-To:
References: <52EBFC89.3030809@oracle.com>
Message-ID: <52EF7675.3020707@oracle.com>

Staffan,

All items under the try-catch exist only in the server (C2 compiler) VM, so if
one of them is missing in the target VM, the other items are missing as well.

I have no idea what happens later if SA tries to access one of these
non-initialized items, but I don't see any errors in my home testing.

-Dmitry

On 2014-02-03 14:39, Staffan Larsen wrote:
> It looks like this new version will leave some things uninitialized
> if an item higher on the list throws an exception.
> I don't think that
> is what you want.
>
> /Staffan
>
>
> On 31 jan 2014, at 20:42, Dmitry Samersoff
> wrote:
>
>> Hi Everybody,
>>
>> Please review the fix:
>>
>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>>
>> sa.js try to initialize all possible VM structures regardless of
>> type of VM we are running on
>>
>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>> Russia * I would love to change the world, but they won't give me
>> the sources.
>

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From staffan.larsen at oracle.com Mon Feb 3 03:57:27 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 3 Feb 2014 12:57:27 +0100
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EC62CD.8000300@oracle.com>
References: <52EC62CD.8000300@oracle.com>
Message-ID: <3E8D21DD-BD7A-4AEF-9C02-C175AEB40738@oracle.com>

Looks good!

Thanks,
/Staffan

On 1 feb 2014, at 03:58, serguei.spitsyn at oracle.com wrote:

> Please, review the fix for:
> https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/
>
> Summary:
>
> There is a general issue in the suspend equivalent condition mechanism:
> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may return different results:
> - 1-st: true
> - 2-nd: false
>
> This more generic suspend equivalent issue is covered by another bug:
> https://bugs.openjdk.java.net/browse/JDK-6280037
>
> The bug to fix in this review is a specific manifestation of the 6280037
> in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
> It is on the Test Stabilization radar (as well as the 6280037).
> There are many tests intermittently failing because of this.
> > The webrev for review is a one-liner work around the 6280037 for the GetFrameCount(). > > The JVMTI GetFrameCount() spec tells: > "If this function is called for a thread actively executing bytecodes (for example, > not the current thread and not suspended), the information returned is transient." > > So, it is Ok to call the GetFrameCount() for non-suspended target threads. > To achieve safety, the frame count for non-suspended threads is calculated at a safepoint. > It should be Ok and more safe to do the same for suspended threads as well. > There is no big performance impact because it is already on a slow path. > It is still important to avoid safepointing when the target thread is current. > > The bug 6280037 should go out of the Test Stabilization radar (remove the svc-nightly label) > as the most of the impacted tests are covered by the 6471769. > > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests > > > Thanks, > Serguei > From david.holmes at oracle.com Mon Feb 3 04:10:50 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 03 Feb 2014 22:10:50 +1000 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52ED3F28.5070806@oracle.com> References: <52ED3F28.5070806@oracle.com> Message-ID: <52EF874A.8020603@oracle.com> Hi Dan, On 2/02/2014 4:38 AM, Daniel D. Daugherty wrote: > Greetings, > > I have a fix ready for the following bug: > > 8028073 race condition in ObjectMonitor implementation causing > deadlocks > https://bugs.openjdk.java.net/browse/JDK-8028073 > > On the surface, this is a very simple fix that relocates a few lines of > code, relocates and rewrites the comments associated with that code and > adds several new comments. > > Of course, in reality, the issue is much more complicated, but I'm > hoping to make it easy for anyone not acquainted with this issue to > understand what's going on. 
Let's spare them the trouble - Reviewed :) I'm still troubled by the rationale for the original placement of this code. It makes sense to issue the unpark prior to the park() call but on the other hand there is no agent/event code in that path that would have consumed the original unpark. ?? Thanks, I know you've really enjoyed this one :) David ----- > Here are the JDK9 webrev URLs: > > OpenJDK: > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ > > Oracle internal: > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ > > The simple summary: > > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, > it is possible for a JVM/TI monitor event handler to accidentally > consume a ParkEvent.unpark() call meant for Java Monitor layer > - the original code fix was made on 2005.07.04 using this bug ID: > https://bugs.openjdk.java.net/browse/JDK-5030359 > - it's the right fix, but it's in the wrong place > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED > event handler is called because it is that event handler > that can cause the hang > > > Testing > ------- > > - a new StessMonitorWait test has been created that reliably > reproduces the hang in JDK[6789]; see the bug's gory details > for the specific versions where the hang has been reproduced > - the test reliably reproduces the hang in 5 seconds on my > T7600 running Solaris 10u11 X86; 1 minute runs reproduce > the hang reliably on other machines > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT > bits did not reproduce the hang > - JPRT test job > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, > and Windows-X64: > - vm.quick > - Kitchensink (bigapps) > - Weblogic+medrec (bigapps) > - runThese (bigapps) > > > The Gory Details Start Here > --------------------------- > > This is the old 
location of block of code that's being moved: > > src/share/vm/runtime/objectMonitor.cpp: > > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > 1499 exit (true, Self) ; // exit the monitor > > 1513 if (node._notified != 0 && _succ == Self) { > 1514 node._event->unpark(); > 1515 } > > > This is the new location of block of code that's being moved: > > src/share/vm/runtime/objectMonitor.cpp: > > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > 1601 if (JvmtiExport::should_post_monitor_waited()) { > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT); > > 1604 if (node._notified != 0 && _succ == Self) { > > 1620 node._event->unpark(); > 1621 } > > > The Risks > --------- > > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event > is enabled: > - previously it was always executed > - while the old code was not effective for the hang that is being > fixed with this bug, it is possible that the old code prevented > a different bug in the successor protocol from manifesting > - thorough analysis of the successor protocol did not reveal a > case where the old code was needed in the old location > - Thorough analysis indicates that the other JVM/TI monitor events > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: > - the successor protocol is complicated and the analysis could > be wrong when certain options are used > - comments were added to each location where a JVM/TI monitor > event handler is called documenting why a fix like this one > is not needed there > - if the analysis is wrong, the new comments show where a new > code change would be needed > > > The Scenario > ------------ > > I've created a scenario that reproduces this hang: > > T1 - enters monitor and calls monitor.wait() > T2 - enters the monitor, calls monitor.notify() and exits the monitor > T3 - enters and exits the monitor > T4 - enters the monitor, delays for 5 seconds, exits the monitor > > A JVM/TI 
agent that enables JVMTI_EVENT_MONITOR_WAITED and has a > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. > > Here are the six events necessary to make this hang happen: > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, > but > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor > again. > > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > // KEY-EVENT-3a: T3 made T1 the successor > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck. > > > This bug is intertwined with: > > - The ObjectMonitor successor protocol > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors > > There is a very long successor.notes attachment to JDK-8028073 that > attempts to describe the ObjectMonitor successor protocol. It's good > for putting pretty much anyone to sleep. > > Since this hang reproduces back to JDK6, this bug is taking the easily > backported solution of moving the original fix to the right location. 
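The lost wakeup described in the key events quoted above can be sketched with a toy model. This is only a minimal illustration under stated assumptions, not HotSpot code: `ToyParkEvent`, `tryPark()` and `wakeupSurvives()` are hypothetical names, and the model assumes a park event holds at most one pending permit that any park on the same event consumes. It shows T3's unpark being eaten by the event handler's raw monitor wait, and the permit being restored when the unpark is re-issued after the handler returns.

```java
// Toy model only: ToyParkEvent is a hypothetical stand-in, NOT HotSpot's ParkEvent.
class SharedParkEventSketch {

    /** Assumed semantics: an event holds at most one pending wakeup permit. */
    static final class ToyParkEvent {
        private boolean permit;

        /** Posts a wakeup; repeated calls do not accumulate permits. */
        void unpark() {
            permit = true;
        }

        /** Returns true if a pending permit was consumed, false if a real park would block. */
        boolean tryPark() {
            boolean hadPermit = permit;
            permit = false;
            return hadPermit;
        }
    }

    /**
     * Replays the scenario on one shared event and reports whether T1 still
     * finds a wakeup permit when it finally parks to reenter the monitor.
     */
    static boolean wakeupSurvives(boolean reissueAfterHandler) {
        ToyParkEvent event = new ToyParkEvent();
        boolean notified = true;     // T1 was notified by T2
        boolean isSuccessor = true;  // T3 made T1 the successor while exiting

        event.unpark();              // T3's unpark, meant for the Java monitor layer
        event.tryPark();             // handler's raw monitor wait parks on the same event

        // Models the relocated check: re-issue the unpark after the handler has run.
        if (reissueAfterHandler && notified && isSuccessor) {
            event.unpark();
        }
        return event.tryPark();      // T1's park on monitor reentry
    }

    public static void main(String[] args) {
        System.out.println("without re-issue: " + wakeupSurvives(false)); // prints false
        System.out.println("with re-issue:    " + wakeupSurvives(true));  // prints true
    }
}
```

The model is single-threaded on purpose so the outcome is deterministic. It says nothing about the real successor protocol beyond the one-permit assumption.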
> The following new bug has been filed for possible future work in this > area by the Serviceability Team: > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > The Symptoms > ------------ > > With intermittent hangs like this, it is useful to know what to look > for in order to determine if you are running into this issue: > > - if you aren't using a debugger or a profiler or some other > JVM/TI agent, then this hang is not the same as yours > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > event handler, then this hang is not the same as yours > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > JVM/TI RawMonitors, then this hang is not the same as yours > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > back into Java code, then you might just be insane and this > hang might be similar to yours. However, using a Java callback > in an event handler is an even bigger problem/risk so fix that > first. 
> - if you one or more threads blocked like this and making no > progress, then this hang might be the same as yours: > > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for > monitor e > ntry [0xfffffd7fc0231000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back > 0 _at_p > oll_safepoint 0 > JavaThread state: _thread_blocked > at java.lang.Object.wait(Native Method) > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > at java.lang.Object.wait(Object.java:502) > at SMW_WorkerThread.run(StressMonitorWait.java:103) > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for > monitor e > ntry [0xfffffd7fc0130000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back > 0 _at_p > oll_safepoint 0 > JavaThread state: _thread_blocked > at SMW_WorkerThread.run(StressMonitorWait.java:120) > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for > monitor e > ntry [0xfffffd7fc002f000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back > 0 _at_p > oll_safepoint 0 > JavaThread state: _thread_blocked > at SMW_WorkerThread.run(StressMonitorWait.java:139) > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > Key symptoms in thread T1: > > - had the object locked: > > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > - did an Object.wait(): > > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > - is blocked on reentry: > > waiting for monitor entry [0xfffffd7fc0231000] > > Key symtoms in thread T2: > > - is blocked waiting to lock the object: > > waiting for 
monitor entry [0xfffffd7fc0130000]
>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> Key symptoms in thread T3:
>
> - is blocked waiting to lock the object:
>
>     waiting for monitor entry [0xfffffd7fc002f000]
>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>

From staffan.larsen at oracle.com Mon Feb 3 04:13:18 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 3 Feb 2014 13:13:18 +0100
Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not available in client VM
In-Reply-To: <52EF7675.3020707@oracle.com>
References: <52EBFC89.3030809@oracle.com> <52EF7675.3020707@oracle.com>
Message-ID:

On 3 feb 2014, at 11:59, Dmitry Samersoff wrote:

> Staffan,
>
> All items under the try-catch exist only in the server (C2 compiler) VM, so if
> one of them is missing in the target VM, the other items are missing as well.

Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?

/Staffan

>
> I have no idea what happens later if SA tries to access one of these
> non-initialized items, but I don't see any errors in my home testing.
>
> -Dmitry
>
> On 2014-02-03 14:39, Staffan Larsen wrote:
>> It looks like this new version will leave some things uninitialized
>> if an item higher on the list throws an exception. I don't think that
>> is what you want.
>>
>> /Staffan
>>
>>
>> On 31 jan 2014, at 20:42, Dmitry Samersoff
>> wrote:
>>
>>> Hi Everybody,
>>>
>>> Please review the fix:
>>>
>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>>>
>>> sa.js try to initialize all possible VM structures regardless of
>>> type of VM we are running on
>>>
>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>>> Russia * I would love to change the world, but they won't give me
>>> the sources.
>> > > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources. From david.holmes at oracle.com Mon Feb 3 04:16:39 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 03 Feb 2014 22:16:39 +1000 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <52EC62CD.8000300@oracle.com> References: <52EC62CD.8000300@oracle.com> Message-ID: <52EF88A7.3090506@oracle.com> Hi Serguei, On 1/02/2014 12:58 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6471769 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ > > > Summary: > > There is a general issue in the suspend equivalent condition mechanism: > Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may > return different results: > - 1-st: true > - 2-nd: false > > This more generic suspend equivalent issue is covered by another bug: > https://bugs.openjdk.java.net/browse/JDK-6280037 > > The bug to fix in this review is a specific manifestation of the 6280037 > in the JVMTI GetFrameCount() that has a big impact on the SQE nightly. > It is on the Test Stabilization radar (as well as the 6280037). > There are many tests intermittently failing because of this. > > The webrev for review is a one-liner work around the 6280037 for the > GetFrameCount(). > > The JVMTI GetFrameCount() spec tells: > "If this function is called for a thread actively executing > bytecodes (for example, > not the current thread and not suspended), the information > returned is transient." > > So, it is Ok to call the GetFrameCount() for non-suspended target > threads. > To achieve safety, the frame count for non-suspended threads is > calculated at a safepoint. > It should be Ok and more safe to do the same for suspended threads as > well. 
> There is no big performance impact because it is already on a slow path. > It is still important to avoid safepointing when the target thread is > current. This sounds completely reasonable - Reviewed. I can't check at the moment whether VM_GetFrameCount properly checks for a live target thread? David > The bug 6280037 should go out of the Test Stabilization radar (remove > the svc-nightly label) > as the most of the impacted tests are covered by the 6471769. > > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests > > > Thanks, > Serguei > From daniel.daugherty at oracle.com Mon Feb 3 06:56:28 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 03 Feb 2014 07:56:28 -0700 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52EF874A.8020603@oracle.com> References: <52ED3F28.5070806@oracle.com> <52EF874A.8020603@oracle.com> Message-ID: <52EFAE1C.3050202@oracle.com> Adding Dave Dice to this thread... On 2/3/14 5:10 AM, David Holmes wrote: > Hi Dan, > > On 2/02/2014 4:38 AM, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a fix ready for the following bug: >> >> 8028073 race condition in ObjectMonitor implementation causing >> deadlocks >> https://bugs.openjdk.java.net/browse/JDK-8028073 >> >> On the surface, this is a very simple fix that relocates a few lines of >> code, relocates and rewrites the comments associated with that code and >> adds several new comments. >> >> Of course, in reality, the issue is much more complicated, but I'm >> hoping to make it easy for anyone not acquainted with this issue to >> understand what's going on. > > Let's spare them the trouble - Reviewed :) Thanks for the review! > I'm still troubled by the rationale for the original placement of this > code. It makes sense to issue the unpark prior to the park() call but > on the other hand there is no agent/event code in that path that would > have consumed the original unpark. ?? 
I think I've acknowledged that concern in the "Risks" section below: - while the old code was not effective for the hang that is being fixed with this bug, it is possible that the old code prevented a different bug in the successor protocol from manifesting Hopefully, Dave Dice will chime in and share the history... > Thanks, I know you've really enjoyed this one :) I actually did enjoy this one from a 'challenge to the mind' POV. But I have to say that the intricacies of the successor protocol gave me headaches... Hopefully, my successor.notes will help anyone else that is crazy enough to wander down this path... Dan > > David > ----- > >> Here are the JDK9 webrev URLs: >> >> OpenJDK: >> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >> >> Oracle internal: >> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >> >> The simple summary: >> >> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >> it is possible for a JVM/TI monitor event handler to accidentally >> consume a ParkEvent.unpark() call meant for Java Monitor layer >> - the original code fix was made on 2005.07.04 using this bug ID: >> https://bugs.openjdk.java.net/browse/JDK-5030359 >> - it's the right fix, but it's in the wrong place >> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >> event handler is called because it is that event handler >> that can cause the hang >> >> >> Testing >> ------- >> >> - a new StessMonitorWait test has been created that reliably >> reproduces the hang in JDK[6789]; see the bug's gory details >> for the specific versions where the hang has been reproduced >> - the test reliably reproduces the hang in 5 seconds on my >> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >> the hang reliably on other machines >> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >> bits did not reproduce the hang >> - JPRT test job >> - VM/SQE 
Adhoc test job on Server VM, fastdebug bits on Linux-X86, >> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >> and Windows-X64: >> - vm.quick >> - Kitchensink (bigapps) >> - Weblogic+medrec (bigapps) >> - runThese (bigapps) >> >> >> The Gory Details Start Here >> --------------------------- >> >> This is the old location of block of code that's being moved: >> >> src/share/vm/runtime/objectMonitor.cpp: >> >> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >> >> 1499 exit (true, Self) ; // exit the monitor >> >> 1513 if (node._notified != 0 && _succ == Self) { >> 1514 node._event->unpark(); >> 1515 } >> >> >> This is the new location of block of code that's being moved: >> >> src/share/vm/runtime/objectMonitor.cpp: >> >> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >> >> 1601 if (JvmtiExport::should_post_monitor_waited()) { >> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >> OS_TIMEOUT); >> >> 1604 if (node._notified != 0 && _succ == Self) { >> >> 1620 node._event->unpark(); >> 1621 } >> >> >> The Risks >> --------- >> >> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >> is enabled: >> - previously it was always executed >> - while the old code was not effective for the hang that is being >> fixed with this bug, it is possible that the old code prevented >> a different bug in the successor protocol from manifesting >> - thorough analysis of the successor protocol did not reveal a >> case where the old code was needed in the old location >> - Thorough analysis indicates that the other JVM/TI monitor events >> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >> - the successor protocol is complicated and the analysis could >> be wrong when certain options are used >> - comments were added to each location where a JVM/TI monitor >> event handler is called documenting why a fix like this one >> is not needed there >> - if the analysis is wrong, the new 
comments show where a new >> code change would be needed >> >> >> The Scenario >> ------------ >> >> I've created a scenario that reproduces this hang: >> >> T1 - enters monitor and calls monitor.wait() >> T2 - enters the monitor, calls monitor.notify() and exits the monitor >> T3 - enters and exits the monitor >> T4 - enters the monitor, delays for 5 seconds, exits the monitor >> >> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. >> >> Here are the six events necessary to make this hang happen: >> >> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, >> but >> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor >> again. >> >> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >> >> // KEY-EVENT-3a: T3 made T1 the successor >> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. >> >> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >> >> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >> >> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler >> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >> stuck. >> >> >> This bug is intertwined with: >> >> - The ObjectMonitor successor protocol >> - the sharing of a ParkEvent between Java Monitors and JVM/TI >> RawMonitors >> >> There is a very long successor.notes attachment to JDK-8028073 that >> attempts to describe the ObjectMonitor successor protocol. It's good >> for putting pretty much anyone to sleep. 
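The lost-wakeup mechanism in the scenario quoted above can be sketched with a toy single-permit model. This is only an illustration under stated assumptions: `ParkEventModel`, `park_would_return` and `t1_wakes_up` are hypothetical names, not HotSpot's ParkEvent API. The model is single-threaded and simply replays the order of events from the KEY-EVENT list. Because the permit is binary, an unpark issued before the handler runs (the old placement) is eaten just the same, so the model only distinguishes "re-post after the handler" from "do not re-post".

```cpp
#include <cassert>

// Hypothetical single-permit model of a park event; not HotSpot's ParkEvent class.
struct ParkEventModel {
    bool permit = false;

    void unpark() { permit = true; }  // post the (single) wakeup permit

    // Returns true if a permit was available (park would return at once),
    // false if the caller would block waiting for a future unpark().
    bool park_would_return() {
        bool had = permit;
        permit = false;
        return had;
    }
};

// Replays the quoted scenario on one event shared by both monitor layers.
// reissue_after_handler selects whether the unpark is re-posted after the
// MONITOR_WAITED handler has run (the relocated fix) or not.
bool t1_wakes_up(bool reissue_after_handler) {
    ParkEventModel event;         // shared by the Java monitor and the raw monitor
    bool notified = true;         // T2 called notify() on T1
    bool t1_is_successor = true;  // T3 made T1 the successor on exit

    event.unpark();               // T3's unpark, meant for the Java monitor layer
    event.park_would_return();    // raw monitor wait in the handler eats the permit

    if (reissue_after_handler && notified && t1_is_successor) {
        event.unpark();           // restore the permit the handler consumed
    }
    return event.park_would_return();  // T1 parks while re-entering the monitor
}
```

In this model, calling `t1_wakes_up(false)` leaves T1 without a permit when it parks, while `t1_wakes_up(true)` lets the park return. A separate ParkEvent for raw monitors, as proposed in 8033399, would presumably remove the sharing that makes the permit stealable in the first place.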
>> >> Since this hang reproduces back to JDK6, this bug is taking the easily >> backported solution of moving the original fix to the right location. >> The following new bug has been filed for possible future work in this >> area by the Serviceability Team: >> >> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >> https://bugs.openjdk.java.net/browse/JDK-8033399 >> >> >> The Symptoms >> ------------ >> >> With intermittent hangs like this, it is useful to know what to look >> for in order to determine if you are running into this issue: >> >> - if you aren't using a debugger or a profiler or some other >> JVM/TI agent, then this hang is not the same as yours >> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >> event handler, then this hang is not the same as yours >> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >> JVM/TI RawMonitors, then this hang is not the same as yours >> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >> back into Java code, then you might just be insane and this >> hang might be similar to yours. However, using a Java callback >> in an event handler is an even bigger problem/risk so fix that >> first. 
>> - if you one or more threads blocked like this and making no >> progress, then this hang might be the same as yours: >> >> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for >> monitor e >> ntry [0xfffffd7fc0231000] >> java.lang.Thread.State: BLOCKED (on object monitor) >> JavaThread state: _thread_blocked >> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back >> 0 _at_p >> oll_safepoint 0 >> JavaThread state: _thread_blocked >> at java.lang.Object.wait(Native Method) >> - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >> at java.lang.Object.wait(Object.java:502) >> at SMW_WorkerThread.run(StressMonitorWait.java:103) >> - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >> >> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for >> monitor e >> ntry [0xfffffd7fc0130000] >> java.lang.Thread.State: BLOCKED (on object monitor) >> JavaThread state: _thread_blocked >> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back >> 0 _at_p >> oll_safepoint 0 >> JavaThread state: _thread_blocked >> at SMW_WorkerThread.run(StressMonitorWait.java:120) >> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> >> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for >> monitor e >> ntry [0xfffffd7fc002f000] >> java.lang.Thread.State: BLOCKED (on object monitor) >> JavaThread state: _thread_blocked >> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back >> 0 _at_p >> oll_safepoint 0 >> JavaThread state: _thread_blocked >> at SMW_WorkerThread.run(StressMonitorWait.java:139) >> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> >> Key symptoms in thread T1: >> >> - had the object locked: >> >> locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >> >> - did an Object.wait(): >> >> waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >> >> - is blocked on reentry: >> >> waiting for monitor entry [0xfffffd7fc0231000] >> >> Key symtoms in thread T2: >> >> 
- is blocked waiting to lock the object: >> >> waiting for monitor entry [0xfffffd7fc0130000] >> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> >> Key symtoms in thread T3: >> >> - is blocked waiting to lock the object: >> >> waiting for monitor entry [0xfffffd7fc002f000] >> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> From daniel.daugherty at oracle.com Mon Feb 3 07:51:18 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 03 Feb 2014 08:51:18 -0700 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <52EC62CD.8000300@oracle.com> References: <52EC62CD.8000300@oracle.com> Message-ID: <52EFBAF6.8000404@oracle.com> On 1/31/14 7:58 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6471769 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ > src/share/vm/prims/jvmtiEnv.cpp Thumbs up on the code change. I've never been a fan of the: if (is_thread_fully_suspended(...)) do-something-direct else do-something-via-a-safepoint model. We don't have any mechanism in the VM to keep the is_thread_fully_suspended() constraint true over the execution time of the "do-something-direct" operation so we've always been exposed to races. Also, I think this model is used for several other JVM/TI calls so you might want to visit those also with an eye on safety. For this specific change, I think the comment would be better as: // It is only safe to perform the direct operation on the current // thread. All other usage needs to use a vm-op for safety. In particular, I don't agree with this part of the comment: // Performance-wise the only important case is when current thread is // the target thread. 
I'm not sure how you can state that performance on the current thread
is the only important case, but maybe I don't understand what you're
really trying to say here.

Now for the more general question with respect to this bug:

    6471769

I'm having trouble connecting this change (which I like) to this
particular bug report. The crash in the bug's description looks
like this:

# Internal Error (/net/prt-solamd64-q1-2/PrtBuildDir/workspace/src/share/vm/prims/jvmtiThreadState.cpp, 273), pid=9561, tid=2
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20060914135846.dcubed.service_hs_b02_merge.2-debug compiled mode)
#
# Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth out of sync")

V [libjvm.so+0x1217206];; void VMError::report_and_die()+0x606
V [libjvm.so+0x667841];; void report_assertion_failure(const char*,int,const char*)+0x61
V [libjvm.so+0xca9557];; int JvmtiThreadState::cur_stack_depth()+0x6e7
V [libjvm.so+0xc1caaa];; void JvmtiExport::post_method_exit(JavaThread*,methodOop,frame)+0x21ba
V [libjvm.so+0x8385de];; void InterpreterRuntime::post_method_exit(JavaThread*)+0x21e
j java.io.ObjectInputStream$BlockDataInputStream.getBlockDataMode()Z+4

In the above crash, the current thread is trying to post a
JVMTI_EVENT_METHOD_EXIT and during the event posting code,
it calls cur_stack_depth() which fails an assert because
the cached stack depth doesn't match the current queried
value:

    int JvmtiThreadState::cur_stack_depth() {
      uint32_t debug_bits = 0;
      guarantee(JavaThread::current() == get_thread() ||
        JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits),
        "must be current thread or suspended");

      if (!is_interp_only_mode() || _cur_stack_depth == UNKNOWN_STACK_DEPTH) {
        _cur_stack_depth = count_frames();
      } else {
        // heavy weight assert
        assert(_cur_stack_depth == count_frames(),
               "cur_stack_depth out of sync");
      }
      return _cur_stack_depth;
    }

This is not an unsafe query from one thread to
another target thread so I don't see how the proposed change will
address this failure mode. In particular, the assertion is troubling
because it tells me that the cached stack depth for the current
thread is either wrong or has been corrupted.

To reiterate:

- I like the code change!
- I would like to see the comment tweaked a bit

You should consider going forward with this change using a different
bug ID, perhaps something like:

    JDK-NNNNNNN improve safety of JVM/TI GetFrameCount()

Dan

>
> Summary:
>
> There is a general issue in the suspend equivalent condition mechanism:
> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended()
> may return different results:
> - 1-st: true
> - 2-nd: false
>
> This more generic suspend equivalent issue is covered by another bug:
> https://bugs.openjdk.java.net/browse/JDK-6280037
>
> The bug to fix in this review is a specific manifestation of the
> 6280037
> in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
> It is on the Test Stabilization radar (as well as the 6280037).
> There are many tests intermittently failing because of this.
>
> The webrev for review is a one-liner work around the 6280037 for the
> GetFrameCount().
>
> The JVMTI GetFrameCount() spec tells:
> "If this function is called for a thread actively executing
> bytecodes (for example,
> not the current thread and not suspended), the information
> returned is transient."
>
> So, it is Ok to call the GetFrameCount() for non-suspended target
> threads.
> To achieve safety, the frame count for non-suspended threads is
> calculated at a safepoint.
> It should be Ok and more safe to do the same for suspended threads
> as well.
> There is no big performance impact because it is already on a slow
> path.
> It is still important to avoid safepointing when the target thread
> is current.
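The dispatch rule described in the quoted summary can be sketched as a small decision function. This is a minimal illustration, not the webrev's code: `ThreadModel`, `CountPath`, `choose_path_old` and `choose_path_new` are hypothetical names, and the `suspended` flag stands in for a check that, in a real VM, may change between the test and the stack walk.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two ways of counting frames; not VM types.
enum class CountPath { Direct, Safepoint };

struct ThreadModel {
    int id;
    bool suspended;  // in a real VM this answer may go stale right after it is read
};

// Old model: walk the target's stack directly if it looked suspended.
// Nothing keeps the target suspended for the duration of the walk.
CountPath choose_path_old(const ThreadModel& current, const ThreadModel& target) {
    if (current.id == target.id || target.suspended) {
        return CountPath::Direct;
    }
    return CountPath::Safepoint;
}

// Proposed model: only the current thread walks its own stack directly;
// every other target goes through a safepoint operation.
CountPath choose_path_new(const ThreadModel& current, const ThreadModel& target) {
    return current.id == target.id ? CountPath::Direct : CountPath::Safepoint;
}
```

With a caller `{1, false}` and a suspended target `{2, true}`, the old rule picks `Direct` while the new rule picks `Safepoint`; both rules pick `Direct` when the target is the caller itself, which is the one case where no safepoint is taken.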
> > The bug 6280037 should go out of the Test Stabilization radar > (remove the svc-nightly label) > as the most of the impacted tests are covered by the 6471769. > > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests > > > Thanks, > Serguei > From serguei.spitsyn at oracle.com Mon Feb 3 10:30:51 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 03 Feb 2014 10:30:51 -0800 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <3E8D21DD-BD7A-4AEF-9C02-C175AEB40738@oracle.com> References: <52EC62CD.8000300@oracle.com> <3E8D21DD-BD7A-4AEF-9C02-C175AEB40738@oracle.com> Message-ID: <52EFE05B.1070504@oracle.com> Thanks, Staffan! Serguei On 2/3/14 3:57 AM, Staffan Larsen wrote: > Looks good! > > Thanks, > /Staffan > > On 1 feb 2014, at 03:58, serguei.spitsyn at oracle.com wrote: > >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6471769 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ >> >> Summary: >> >> There is a general issue in the suspend equivalent condition mechanism: >> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may return different results: >> - 1-st: true >> - 2-nd: false >> >> This more generic suspend equivalent issue is covered by another bug: >> https://bugs.openjdk.java.net/browse/JDK-6280037 >> >> The bug to fix in this review is a specific manifestation of the 6280037 >> in the JVMTI GetFrameCount() that has a big impact on the SQE nightly. >> It is on the Test Stabilization radar (as well as the 6280037). >> There are many tests intermittently failing because of this. >> >> The webrev for review is a one-liner work around the 6280037 for the GetFrameCount(). 
>> >> The JVMTI GetFrameCount() spec tells: >> "If this function is called for a thread actively executing bytecodes (for example, >> not the current thread and not suspended), the information returned is transient." >> >> So, it is Ok to call the GetFrameCount() for non-suspended target threads. >> To achieve safety, the frame count for non-suspended threads is calculated at a safepoint. >> It should be Ok and more safe to do the same for suspended threads as well. >> There is no big performance impact because it is already on a slow path. >> It is still important to avoid safepointing when the target thread is current. >> >> The bug 6280037 should go out of the Test Stabilization radar (remove the svc-nightly label) >> as the most of the impacted tests are covered by the 6471769. >> >> >> Testing: >> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests >> >> >> Thanks, >> Serguei >> From serguei.spitsyn at oracle.com Mon Feb 3 10:36:50 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 03 Feb 2014 10:36:50 -0800 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52ED3F28.5070806@oracle.com> References: <52ED3F28.5070806@oracle.com> Message-ID: <52EFE1C2.2070209@oracle.com> Hi Dan, It looks good to me. Your work on this issue is outstanding. I know what it took to resolve this one. Great job! Thanks, Serguei On 2/1/14 10:38 AM, Daniel D. Daugherty wrote: > Greetings, > > I have a fix ready for the following bug: > > 8028073 race condition in ObjectMonitor implementation causing > deadlocks > https://bugs.openjdk.java.net/browse/JDK-8028073 > > On the surface, this is a very simple fix that relocates a few lines of > code, relocates and rewrites the comments associated with that code and > adds several new comments. > > Of course, in reality, the issue is much more complicated, but I'm > hoping to make it easy for anyone not acquainted with this issue to > understand what's going on. 
> > Here are the JDK9 webrev URLs: > > OpenJDK: > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ > > Oracle internal: > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ > > The simple summary: > > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, > it is possible for a JVM/TI monitor event handler to accidentally > consume a ParkEvent.unpark() call meant for Java Monitor layer > - the original code fix was made on 2005.07.04 using this bug ID: > https://bugs.openjdk.java.net/browse/JDK-5030359 > - it's the right fix, but it's in the wrong place > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED > event handler is called because it is that event handler > that can cause the hang > > > Testing > ------- > > - a new StessMonitorWait test has been created that reliably > reproduces the hang in JDK[6789]; see the bug's gory details > for the specific versions where the hang has been reproduced > - the test reliably reproduces the hang in 5 seconds on my > T7600 running Solaris 10u11 X86; 1 minute runs reproduce > the hang reliably on other machines > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT > bits did not reproduce the hang > - JPRT test job > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, > and Windows-X64: > - vm.quick > - Kitchensink (bigapps) > - Weblogic+medrec (bigapps) > - runThese (bigapps) > > > The Gory Details Start Here > --------------------------- > > This is the old location of block of code that's being moved: > > src/share/vm/runtime/objectMonitor.cpp: > > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > 1499 exit (true, Self) ; // exit the monitor > > 1513 if (node._notified != 0 && _succ == Self) { > 1514 node._event->unpark(); > 1515 } > > > This is the new location of block of code 
that's being moved: > > src/share/vm/runtime/objectMonitor.cpp: > > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > 1601 if (JvmtiExport::should_post_monitor_waited()) { > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == > OS_TIMEOUT); > > 1604 if (node._notified != 0 && _succ == Self) { > > 1620 node._event->unpark(); > 1621 } > > > The Risks > --------- > > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event > is enabled: > - previously it was always executed > - while the old code was not effective for the hang that is being > fixed with this bug, it is possible that the old code prevented > a different bug in the successor protocol from manifesting > - thorough analysis of the successor protocol did not reveal a > case where the old code was needed in the old location > - Thorough analysis indicates that the other JVM/TI monitor events > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: > - the successor protocol is complicated and the analysis could > be wrong when certain options are used > - comments were added to each location where a JVM/TI monitor > event handler is called documenting why a fix like this one > is not needed there > - if the analysis is wrong, the new comments show where a new > code change would be needed > > > The Scenario > ------------ > > I've created a scenario that reproduces this hang: > > T1 - enters monitor and calls monitor.wait() > T2 - enters the monitor, calls monitor.notify() and exits the monitor > T3 - enters and exits the monitor > T4 - enters the monitor, delays for 5 seconds, exits the monitor > > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. 
> > Here are the six events necessary to make this hang happen: > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ > field, but > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor > again. > > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > // KEY-EVENT-3a: T3 made T1 the successor > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be > stuck. > > > This bug is intertwined with: > > - The ObjectMonitor successor protocol > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors > > There is a very long successor.notes attachment to JDK-8028073 that > attempts to describe the ObjectMonitor successor protocol. It's good > for putting pretty much anyone to sleep. > > Since this hang reproduces back to JDK6, this bug is taking the easily > backported solution of moving the original fix to the right location. 
> The following new bug has been filed for possible future work in this > area by the Serviceability Team: > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > The Symptoms > ------------ > > With intermittent hangs like this, it is useful to know what to look > for in order to determine if you are running into this issue: > > - if you aren't using a debugger or a profiler or some other > JVM/TI agent, then this hang is not the same as yours > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > event handler, then this hang is not the same as yours > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > JVM/TI RawMonitors, then this hang is not the same as yours > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > back into Java code, then you might just be insane and this > hang might be similar to yours. However, using a Java callback > in an event handler is an even bigger problem/risk so fix that > first. 
> - if you one or more threads blocked like this and making no > progress, then this hang might be the same as yours: > > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for > monitor e > ntry [0xfffffd7fc0231000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint > _has_called_back 0 _at_p > oll_safepoint 0 > JavaThread state: _thread_blocked > at java.lang.Object.wait(Native Method) > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > at java.lang.Object.wait(Object.java:502) > at SMW_WorkerThread.run(StressMonitorWait.java:103) > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for > monitor e > ntry [0xfffffd7fc0130000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint > _has_called_back 0 _at_p > oll_safepoint 0 > JavaThread state: _thread_blocked > at SMW_WorkerThread.run(StressMonitorWait.java:120) > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for > monitor e > ntry [0xfffffd7fc002f000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint > _has_called_back 0 _at_p > oll_safepoint 0 > JavaThread state: _thread_blocked > at SMW_WorkerThread.run(StressMonitorWait.java:139) > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > Key symptoms in thread T1: > > - had the object locked: > > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > - did an Object.wait(): > > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > - is blocked on reentry: > > waiting for monitor entry [0xfffffd7fc0231000] > > Key symtoms in thread T2: > > - is blocked waiting to lock the object: > > waiting for 
monitor entry [0xfffffd7fc0130000] > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > Key symtoms in thread T3: > > - is blocked waiting to lock the object: > > waiting for monitor entry [0xfffffd7fc002f000] > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > From daniel.daugherty at oracle.com Mon Feb 3 10:50:48 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 03 Feb 2014 11:50:48 -0700 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52EFE1C2.2070209@oracle.com> References: <52ED3F28.5070806@oracle.com> <52EFE1C2.2070209@oracle.com> Message-ID: <52EFE508.2010808@oracle.com> Thanks for the review Serguei! Dan On 2/3/14 11:36 AM, serguei.spitsyn at oracle.com wrote: > Hi Dan, > > It looks good to me. > Your work on this issue is outstanding. > I know what it took to resolve this one. > Great job! > > Thanks, > Serguei > > On 2/1/14 10:38 AM, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a fix ready for the following bug: >> >> 8028073 race condition in ObjectMonitor implementation causing >> deadlocks >> https://bugs.openjdk.java.net/browse/JDK-8028073 >> >> On the surface, this is a very simple fix that relocates a few lines of >> code, relocates and rewrites the comments associated with that code and >> adds several new comments. >> >> Of course, in reality, the issue is much more complicated, but I'm >> hoping to make it easy for anyone not acquainted with this issue to >> understand what's going on. 
>> >> Here are the JDK9 webrev URLs: >> >> OpenJDK: >> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >> >> Oracle internal: >> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >> >> The simple summary: >> >> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >> it is possible for a JVM/TI monitor event handler to accidentally >> consume a ParkEvent.unpark() call meant for Java Monitor layer >> - the original code fix was made on 2005.07.04 using this bug ID: >> https://bugs.openjdk.java.net/browse/JDK-5030359 >> - it's the right fix, but it's in the wrong place >> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >> event handler is called because it is that event handler >> that can cause the hang >> >> >> Testing >> ------- >> >> - a new StessMonitorWait test has been created that reliably >> reproduces the hang in JDK[6789]; see the bug's gory details >> for the specific versions where the hang has been reproduced >> - the test reliably reproduces the hang in 5 seconds on my >> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >> the hang reliably on other machines >> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >> bits did not reproduce the hang >> - JPRT test job >> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >> and Windows-X64: >> - vm.quick >> - Kitchensink (bigapps) >> - Weblogic+medrec (bigapps) >> - runThese (bigapps) >> >> >> The Gory Details Start Here >> --------------------------- >> >> This is the old location of block of code that's being moved: >> >> src/share/vm/runtime/objectMonitor.cpp: >> >> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >> >> 1499 exit (true, Self) ; // exit the monitor >> >> 1513 if (node._notified != 0 && _succ == Self) { >> 1514 node._event->unpark(); >> 
1515 } >> >> >> This is the new location of block of code that's being moved: >> >> src/share/vm/runtime/objectMonitor.cpp: >> >> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >> >> 1601 if (JvmtiExport::should_post_monitor_waited()) { >> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >> OS_TIMEOUT); >> >> 1604 if (node._notified != 0 && _succ == Self) { >> >> 1620 node._event->unpark(); >> 1621 } >> >> >> The Risks >> --------- >> >> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >> is enabled: >> - previously it was always executed >> - while the old code was not effective for the hang that is being >> fixed with this bug, it is possible that the old code prevented >> a different bug in the successor protocol from manifesting >> - thorough analysis of the successor protocol did not reveal a >> case where the old code was needed in the old location >> - Thorough analysis indicates that the other JVM/TI monitor events >> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >> - the successor protocol is complicated and the analysis could >> be wrong when certain options are used >> - comments were added to each location where a JVM/TI monitor >> event handler is called documenting why a fix like this one >> is not needed there >> - if the analysis is wrong, the new comments show where a new >> code change would be needed >> >> >> The Scenario >> ------------ >> >> I've created a scenario that reproduces this hang: >> >> T1 - enters monitor and calls monitor.wait() >> T2 - enters the monitor, calls monitor.notify() and exits the monitor >> T3 - enters and exits the monitor >> T4 - enters the monitor, delays for 5 seconds, exits the monitor >> >> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. 
>> >> Here are the six events necessary to make this hang happen: >> >> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >> field, but >> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor >> again. >> >> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >> >> // KEY-EVENT-3a: T3 made T1 the successor >> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. >> >> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >> >> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >> >> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler >> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >> stuck. >> >> >> This bug is intertwined with: >> >> - The ObjectMonitor successor protocol >> - the sharing of a ParkEvent between Java Monitors and JVM/TI >> RawMonitors >> >> There is a very long successor.notes attachment to JDK-8028073 that >> attempts to describe the ObjectMonitor successor protocol. It's good >> for putting pretty much anyone to sleep. >> >> Since this hang reproduces back to JDK6, this bug is taking the easily >> backported solution of moving the original fix to the right location. 
>> The following new bug has been filed for possible future work in this >> area by the Serviceability Team: >> >> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >> https://bugs.openjdk.java.net/browse/JDK-8033399 >> >> >> The Symptoms >> ------------ >> >> With intermittent hangs like this, it is useful to know what to look >> for in order to determine if you are running into this issue: >> >> - if you aren't using a debugger or a profiler or some other >> JVM/TI agent, then this hang is not the same as yours >> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >> event handler, then this hang is not the same as yours >> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >> JVM/TI RawMonitors, then this hang is not the same as yours >> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >> back into Java code, then you might just be insane and this >> hang might be similar to yours. However, using a Java callback >> in an event handler is an even bigger problem/risk so fix that >> first. 
>> - if you have one or more threads blocked like this and making no
>>   progress, then this hang might be the same as yours:
>>
>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at java.lang.Object.wait(Native Method)
>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>         at java.lang.Object.wait(Object.java:502)
>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> Key symptoms in thread T1:
>>
>>   - had the object locked:
>>
>>     locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>>   - did an Object.wait():
>>
>>     waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>>   - is blocked on reentry:
>>
>>     waiting for monitor entry [0xfffffd7fc0231000]
>>
>> Key symptoms in thread T2:
>>
>>   - is blocked waiting to lock the object:
>>
>>     waiting for monitor entry [0xfffffd7fc0130000]
>>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> Key symptoms in thread T3:
>>
>>   - is blocked waiting to lock the object:
>>
>>     waiting for monitor entry [0xfffffd7fc002f000]
>>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>

From serguei.spitsyn at oracle.com Mon Feb 3 11:12:37 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 11:12:37 -0800
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EF88A7.3090506@oracle.com>
References: <52EC62CD.8000300@oracle.com> <52EF88A7.3090506@oracle.com>
Message-ID: <52EFEA25.8080905@oracle.com>

On 2/3/14 4:16 AM, David Holmes wrote:
> Hi Serguei,
>
> On 1/02/2014 12:58 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>> https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/
>>
>> Summary:
>>
>> There is a general issue in the suspend equivalent condition mechanism:
>> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may
>> return different results:
>>   - 1-st: true
>>   - 2-nd: false
>>
>> This more generic suspend equivalent issue is covered by another bug:
>> https://bugs.openjdk.java.net/browse/JDK-6280037
>>
>> The bug to fix in this review is a specific manifestation of the 6280037
>> in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
>> It is on the Test Stabilization radar (as well as the 6280037).
>> There are many tests intermittently failing because of this.
>>
>> The webrev for review is a one-liner work around the 6280037 for the
>> GetFrameCount().
>>
>> The JVMTI GetFrameCount() spec tells:
>>   "If this function is called for a thread actively executing
>>   bytecodes (for example, not the current thread and not suspended),
>>   the information returned is transient."
>>
>> So, it is Ok to call the GetFrameCount() for non-suspended target threads.
>> To achieve safety, the frame count for non-suspended threads is
>> calculated at a safepoint.
>> It should be Ok and more safe to do the same for suspended threads as well.
>> There is no big performance impact because it is already on a slow path.
>> It is still important to avoid safepointing when the target thread is
>> current.
>
> This sounds completely reasonable - Reviewed.
>
> I can't check at the moment whether VM_GetFrameCount properly checks
> for a live target thread?

David, thank you for the review!
It looks like the check is missing.
Please, give me some time to make sure what is needed here.

Thanks!
Serguei

>
> David
>
>> The bug 6280037 should go out of the Test Stabilization radar (remove
>> the svc-nightly label)
>> as the most of the impacted tests are covered by the 6471769.
>>
>>
>> Testing:
>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>
>>
>> Thanks,
>> Serguei
>>

From staffan.larsen at oracle.com Mon Feb 3 11:18:46 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 3 Feb 2014 20:18:46 +0100
Subject: RFR(S): JDK-8029808 com/sun/jdi/ProcessAttachTest.sh times out
In-Reply-To: <52EA872E.7070209@oracle.com>
References: <79C6D4D7-F8A0-4A33-AF7A-D1007F227A86@oracle.com> <52EA7F46.5070405@oracle.com> <16D6BA50-82FE-41F6-8423-FE218D397C58@oracle.com> <52EA84DC.5090008@oracle.com> <52EA872E.7070209@oracle.com>
Message-ID:

OK, my last try at this was very broken - I'm glad it didn't pass the review.

Here is a new try: http://cr.openjdk.java.net/~sla/8029808/webrev.01/

As Alan says, we can't use jps on the (second) debuggee because it is started with suspend=y.
In that mode it is waiting for a debugger to attach before it has started executing any Java code and before jps is able to identify it.

I did find a race in the test, though. When the first debuggee is started (suspend=n) there is a race where the debugger process can attach and finish before the debuggee has written out the port file. In that case ShutdownDebuggee will fail. I have seen this failure mode happen and I can force it by inserting a Thread.sleep into the debuggee. I solved this by waiting for the port file to appear in the suspend=n case as well.

There is also a potential problem where both debuggees use the same output file and we may think the second debuggee has started before it actually has (because the output file isn't empty). I solved that by creating different output files for the two test cases.

I've also redirected all error streams so any errors aren't lost.

Thanks,
/Staffan

On 30 jan 2014, at 18:09, Alan Bateman wrote:

> On 30/01/2014 16:59, Dmitry Samersoff wrote:
>> Staffan,
>>
>> No it's not to your code. Sorry for not being clear enough.
>>
>> You cleaned up unix code, but windows code remains bad. Particularly, it
>> has bad unconditional sleep 2
>>
>> I would propose replace *windows manipulation with CYGWIN/MKS pids* to
>> call to JPS that return windows pid
>>
> I have a vague memory that we had to use ps to get the pid because of the suspend=y test which causes the debuggee to suspend during startup. I might be wrong on this of course, it was a long time ago.
>
> -Alan

From Alan.Bateman at oracle.com Mon Feb 3 12:07:28 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 03 Feb 2014 20:07:28 +0000
Subject: RFR(S): JDK-8029808 com/sun/jdi/ProcessAttachTest.sh times out
In-Reply-To:
References: <79C6D4D7-F8A0-4A33-AF7A-D1007F227A86@oracle.com> <52EA7F46.5070405@oracle.com> <16D6BA50-82FE-41F6-8423-FE218D397C58@oracle.com> <52EA84DC.5090008@oracle.com> <52EA872E.7070209@oracle.com>
Message-ID: <52EFF700.2070803@oracle.com>

On 03/02/2014 19:18, Staffan Larsen wrote:
> OK, my last try at this was very broken - I'm glad it didn't pass the review.
>
> Here is a new try: http://cr.openjdk.java.net/~sla/8029808/webrev.01/
>
> As Alan says, we can't use jps on the (second) debuggee because it is started with suspend=y. In that mode it is waiting for a debugger to attach before it has started executing any Java code and before jps is able to identify it.
>
> I did find a race in the test, though. When the first debuggee is started (suspend=n) there is a race where the debugger process can attach and finish before the debuggee has written out the port file. In that case ShutdownDebuggee will fail. I have seen this failure mode happen and I can force it by inserting a Thread.sleep into the debuggee. I solved this by waiting for the port file to appear in the suspend=n case as well.
>
> There is also a potential problem where both debuggees use the same output file and we may think the second debuggee has started before it actually has (because the output file isn't empty). I solved that by creating different output files for the two test cases.
>
> I've also redirected all error streams so any errors aren't lost.
>
This update looks okay to me (and good to have found the race).

-Alan.
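
The port-file wait described in the quoted message can be sketched as a bounded polling loop. This is a minimal Java sketch under stated assumptions, not the actual change to ProcessAttachTest.sh (which is a shell script). PortFileWait and waitForNonEmptyFile are hypothetical names, and treating "exists and is non-empty" as "ready" is an assumption about how the debuggee writes the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class PortFileWait {
    // Poll until the file exists and is non-empty, or until the timeout
    // elapses. Returns true if the file became ready, false on timeout.
    static boolean waitForNonEmptyFile(Path file, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (Files.exists(file) && Files.size(file) > 0) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

A bounded wait replaces an unconditional sleep: the caller proceeds as soon as the file is ready and still fails in finite time if the debuggee never writes it. A non-empty file does not prove the writer has finished, so a real test may need a stronger readiness signal.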
From serguei.spitsyn at oracle.com Mon Feb 3 12:13:02 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 03 Feb 2014 12:13:02 -0800 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <52EFBAF6.8000404@oracle.com> References: <52EC62CD.8000300@oracle.com> <52EFBAF6.8000404@oracle.com> Message-ID: <52EFF84E.1050800@oracle.com> Dan, Thank you for reviewing! Some comments are below. On 2/3/14 7:51 AM, Daniel D. Daugherty wrote: > On 1/31/14 7:58 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6471769 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ >> > > src/share/vm/prims/jvmtiEnv.cpp > Thumbs up on the code change. > > I've never been a fan of the: > > if (is_thread_fully_suspended(...)) > do-something-direct > else > do-something-via-a-safepoint > > model. We don't have any mechanism in the VM to keep the > is_thread_fully_suspended() constraint true over the > execution time of the "do-something-direct" operation so > we've always been exposed to races. Thank you for this confirmation. So that, we are in sync here. > Also, I think this > model is used for several other JVM/TI calls so you might > want to visit those also with an eye on safety. Yes, I'm looking to isolate and double check other places like this in the code. Another piece of work is to isolate other bugs on this topic. > > For this specific change, I think the comment would be > better as: > > // It is only safe to perform the direct operation on the current > // thread. All other usage needs to use a vm-op for safety. It is taken, thanks! > In particular, I don't agree with this part of the comment: > > // Performance-wise the only important case is when current thread is > // the target thread. 
> > I'm not sure how you can state that performance on the current thread > is the only important case, but maybe I don't understand what you're > really trying to say here. I agree, the comment is not clear. Probably, the comment had to say: "the only case where we can keep performance without loosing safety is the target thread is current ...". In fact, collecting data at a safepoint is the only way to make it safe. If the target thread is not current then: - a) thread has been suspended: already slow path, so that performance is not that critical - b) thread has not been suspended: the information we provide is transient, safepoint is the only way to get it safe (is is correct?) > > Now for the more general question with respect to this bug: 6471769 > I've having trouble connecting this change (which I like) to this > particular bug report. > > The crash in the bug's description looks like this: > > # Internal Error > (/net/prt-solamd64-q1-2/PrtBuildDir/workspace/src/share/vm/prims/jvmtiThreadState.cpp, > 273), pid=9561, tid=2 > # > # Java VM: Java HotSpot(TM) 64-Bit Server VM > (20060914135846.dcubed.service_hs_b02_merge.2-debug compiled mode) > # > # Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth > out of sync") > > V [libjvm.so+0x1217206];; void VMError::report_and_die()+0x606 > V [libjvm.so+0x667841];; void report_assertion_failure(const > char*,int,const char*)+0x61 > V [libjvm.so+0xca9557];; int JvmtiThreadState::cur_stack_depth()+0x6e7 > V [libjvm.so+0xc1caaa];; void > JvmtiExport::post_method_exit(JavaThread*,methodOop,frame)+0x21ba > V [libjvm.so+0x8385de];; void > InterpreterRuntime::post_method_exit(JavaThread*)+0x21e > j java.io.ObjectInputStream$BlockDataInputStream.getBlockDataMode()Z+4 > > In the above crash, the current thread is trying to post a > JVMTI_EVENT_METHOD_EXIT and during the event posting code, it > calls cur_stack_depth() which fails an assert because the > cached stack depth doesn't match the current queried 
value: > > 282 int JvmtiThreadState::cur_stack_depth() { > 283 uint32_t debug_bits = 0; > 284 guarantee(JavaThread::current() == get_thread() || > 285 JvmtiEnv::is_thread_fully_suspended(get_thread(), false, > &debug_bits > ), > 286 "must be current thread or suspended"); > 287 > 288 if (!is_interp_only_mode() || _cur_stack_depth == > UNKNOWN_STACK_DEPTH) > { > 289 _cur_stack_depth = count_frames(); > 290 } else { > 291 // heavy weight assert > 292 assert(_cur_stack_depth == count_frames(), > 293 "cur_stack_depth out of sync"); > 294 } > 295 return _cur_stack_depth; > 296 } > > > This is not an unsafe query from one thread to another target thread > so I don't see how the proposed change will address this failure mode. > In particular, the assertion is troubling because it tells me that > the cached stack depth for the current thread is either wrong or has > been corrupted. The above are consequences of unsafe operations we are doing. Current thread should not calculate frame count of target thread (if different) other than at a safepoint. We assume the thread is suspended but, in fact, it was not. So that we lost safety here and all manifestation we observe are the consequences. This bug covers the frames miscount issue. In the fastdebug mode we may get the assert as in this bug report: https://bugs.openjdk.java.net/browse/JDK-8032223 There is another potential race that I forgot to check and fix but David pointed out. I will try to cover it in the next version of webrev. It is still possible there can be other issues here (as you say). But it is hard to make sure it is the case because it is extremely hard to reproduce (you know it!). I'd suggest to resolve it step-by-step: - fix the known safety issues - watch the nightly if anything else is left, file and fix new bugs > > To reiterate: > > - I like the code change! 
> - I would like to see the comment tweaked a bit > > You should consider going forward with this change using a different > bug ID, perhaps something like: > > JDK-NNNNNNN improve safety of JVM/TI GetFrameCount() There is already bug on this topic (and there are even more dups): https://bugs.openjdk.java.net/browse/JDK-8032223 But I still believe, the issue is the same ... Thanks! Serguei > > Dan > > > >> >> Summary: >> >> There is a general issue in the suspend equivalent condition >> mechanism: >> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() >> may return different results: >> - 1-st: true >> - 2-nd: false >> >> This more generic suspend equivalent issue is covered by another bug: >> https://bugs.openjdk.java.net/browse/JDK-6280037 >> >> The bug to fix in this review is a specific manifestation of the >> 6280037 >> in the JVMTI GetFrameCount() that has a big impact on the SQE nightly. >> It is on the Test Stabilization radar (as well as the 6280037). >> There are many tests intermittently failing because of this. >> >> The webrev for review is a one-liner work around the 6280037 for >> the GetFrameCount(). >> >> The JVMTI GetFrameCount() spec tells: >> "If this function is called for a thread actively executing >> bytecodes (for example, >> not the current thread and not suspended), the information >> returned is transient." >> >> So, it is Ok to call the GetFrameCount() for non-suspended target >> threads. >> To achieve safety, the frame count for non-suspended threads is >> calculated at a safepoint. >> It should be Ok and more safe to do the same for suspended threads >> as well. >> There is no big performance impact because it is already on a slow >> path. >> It is still important to avoid safepointing when the target thread >> is current. >> >> The bug 6280037 should go out of the Test Stabilization radar >> (remove the svc-nightly label) >> as the most of the impacted tests are covered by the 6471769. 
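
The rule discussed in this thread (direct operation only when the target is the current thread, a VM operation for every other target) has roughly the following control-flow shape. This is a toy Java model, not the jvmtiEnv.cpp change. FrameCountSketch, VM_THREAD, lastCountedOn and getFrameCount are hypothetical names, and the single-threaded executor only stands in for the VM thread: unlike a real safepoint, it does not stop the target thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class FrameCountSketch {
    // Hypothetical stand-in for the VM thread that runs VM operations.
    private static final ExecutorService VM_THREAD =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "toy-vm-thread");
                t.setDaemon(true);
                return t;
            });

    // Name of the thread that performed the most recent count (illustration only).
    static volatile String lastCountedOn;

    private static int countFrames(Thread target) {
        lastCountedOn = Thread.currentThread().getName();
        return target.getStackTrace().length;
    }

    static int getFrameCount(Thread target)
            throws InterruptedException, ExecutionException {
        if (target == Thread.currentThread()) {
            // Direct path: a thread inspecting its own stack.
            return countFrames(target);
        }
        // Every other target, suspended or not, goes through the "VM operation";
        // there is no check-then-act on a suspended flag that could go stale.
        return VM_THREAD.submit(() -> countFrames(target)).get();
    }
}
```

The point of the shape is that the decision depends only on thread identity, which cannot change underneath the caller, rather than on a suspension state that may no longer hold by the time the direct operation runs.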
>>
>>
>> Testing:
>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>
>>
>> Thanks,
>> Serguei
>>
>

From dmitry.samersoff at oracle.com Mon Feb 3 12:31:49 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Tue, 04 Feb 2014 00:31:49 +0400
Subject: RFR(S): JDK-8029808 com/sun/jdi/ProcessAttachTest.sh times out
In-Reply-To:
References: <79C6D4D7-F8A0-4A33-AF7A-D1007F227A86@oracle.com> <52EA7F46.5070405@oracle.com> <16D6BA50-82FE-41F6-8423-FE218D397C58@oracle.com> <52EA84DC.5090008@oracle.com> <52EA872E.7070209@oracle.com>
Message-ID: <52EFFCB5.2030004@oracle.com>

Staffan,

Looks good to me!

-Dmitry

On 2014-02-03 23:18, Staffan Larsen wrote:
> OK, my last try at this was very broken - I'm glad it didn't pass the review.
>
> Here is a new try: http://cr.openjdk.java.net/~sla/8029808/webrev.01/
>
> As Alan says, we can't use jps on the (second) debuggee because it is started with suspend=y. In that mode it is waiting for a debugger to attach before it has started executing any Java code and before jps is able to identify it.
>
> I did find a race in the test, though. When the first debuggee is started (suspend=n) there is a race where the debugger process can attach and finish before the debuggee has written out the port file. In that case ShutdownDebuggee will fail. I have seen this failure mode happen and I can force it by inserting a Thread.sleep into the debuggee. I solved this by waiting for the port file to appear in the suspend=n case as well.
>
> There is also a potential problem where both debuggees use the same output file and we may think the second debuggee has started before it actually has (because the output file isn't empty). I solved that by creating different output files for the two test cases.
>
> I've also redirected all error streams so any errors aren't lost.
>
> Thanks,
> /Staffan
>
> On 30 jan 2014, at 18:09, Alan Bateman wrote:
>
>> On 30/01/2014 16:59, Dmitry Samersoff wrote:
>>> Staffan,
>>>
>>> No it's not to your code. Sorry for not being clear enough.
>>>
>>> You cleaned up unix code, but windows code remains bad. Particularly, it
>>> has bad unconditional sleep 2
>>>
>>> I would propose replace *windows manipulation with CYGWIN/MKS pids* to
>>> call to JPS that return windows pid
>>>
>> I have a vague memory that we had to use ps to get the pid because of the suspend=y test which causes the debuggee to suspend during startup. I might be wrong on this of course, it was a long time ago.
>>
>> -Alan
>

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From serguei.spitsyn at oracle.com Mon Feb 3 14:17:13 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 14:17:13 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced
Message-ID: <52F01569.3070201@oracle.com>

Please, review the fix for:
https://bugs.openjdk.java.net/browse/JDK-8025841

Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1

Summary:

The fix was contributed by Oleg Mazurov to improve profiling data quality. It moves the "vtable stub" dynamic code notification to the right place. I've already reviewed the fix, and it looks good to me.

Bug report description:

"JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled when a new chunk of memory for subsequent vtable and itable stubs is allocated. That chunk is uninitialized (contains zeros or garbage) although due to the fact that the actual event delivery is deferred, at least one vtable comes out right. This event should describe an individual vtable/itable stub (base address and size) and only after it's been created (memory is actually populated with code).
Where VM diagnostic messages about vtable/itable stubs are issued upon -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI events as well. Getting vtables/itables right is important in the context of performance analysis as that dynamically generated code may accumulate quite noticeable CPU time (especially itables), sometimes larger than the actual Java methods called."

Testing:
Oleg tested it in the Oracle Studio Performance Analyzer environment.
nsk.jvmti, nsk.jdi, nsk.jdwp,
In progress: Jtreg com/sun/jdi, java/lang/instrument

Thanks,
Serguei

From jaroslav.bachorik at oracle.com Mon Feb 3 14:33:15 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 03 Feb 2014 23:33:15 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
Message-ID: <52F0192B.9060104@oracle.com>

Please, review the following test fix:

Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00

The fix prevents the situation where the first debuggee has not managed to finish its initialization when the second one is started up, which leaves the port available for the second debuggee and fails the test.

The patch uses the library methods to configure and launch the debuggee, and the test waits for the well-known string to appear in the first debuggee's output before attempting to launch the second debuggee.

Thanks,

-JB-

From daniel.daugherty at oracle.com Mon Feb 3 14:38:27 2014
From: daniel.daugherty at oracle.com (Daniel D.
Daugherty) Date: Mon, 03 Feb 2014 15:38:27 -0700 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <52EFEA25.8080905@oracle.com> References: <52EC62CD.8000300@oracle.com> <52EF88A7.3090506@oracle.com> <52EFEA25.8080905@oracle.com> Message-ID: <52F01A63.7080206@oracle.com> On 2/3/14 12:12 PM, serguei.spitsyn at oracle.com wrote: > On 2/3/14 4:16 AM, David Holmes wrote: >> Hi Serguei, >> >> On 1/02/2014 12:58 PM, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ >>> >>> >>> >>> Summary: >>> >>> There is a general issue in the suspend equivalent condition >>> mechanism: >>> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() >>> may >>> return different results: >>> - 1-st: true >>> - 2-nd: false >>> >>> This more generic suspend equivalent issue is covered by another >>> bug: >>> https://bugs.openjdk.java.net/browse/JDK-6280037 >>> >>> The bug to fix in this review is a specific manifestation of the >>> 6280037 >>> in the JVMTI GetFrameCount() that has a big impact on the SQE >>> nightly. >>> It is on the Test Stabilization radar (as well as the 6280037). >>> There are many tests intermittently failing because of this. >>> >>> The webrev for review is a one-liner work around the 6280037 for the >>> GetFrameCount(). >>> >>> The JVMTI GetFrameCount() spec tells: >>> "If this function is called for a thread actively executing >>> bytecodes (for example, >>> not the current thread and not suspended), the information >>> returned is transient." >>> >>> So, it is Ok to call the GetFrameCount() for non-suspended target >>> threads. >>> To achieve safety, the frame count for non-suspended threads is >>> calculated at a safepoint. 
>>> It should be Ok and more safe to do the same for suspended >>> threads as >>> well. >>> There is no big performance impact because it is already on a >>> slow path. >>> It is still important to avoid safepointing when the target >>> thread is >>> current. >> >> This sounds completely reasonable - Reviewed. >> >> I can't check at the moment whether VM_GetFrameCount properly checks >> for a live target thread? > > David, thank you for the review! > It looks like the check is missed. > Please, give me some time to make sure what is needed here. Check out the work done via this changeset: http://hg.openjdk.java.net/hsx/hotspot-rt/hotspot/rev/c1d7040a1183 Sergey did the push via: JDK-8022836 JVM crashes in JVMTIENVBASE::GET_CURRENT_CONTENDED_MONITOR and GET_OWNED_MONITOR https://bugs.openjdk.java.net/browse/JDK-8022836 but the real commentary about the bug is in this one: JDK-7154963 crash in JvmtiEnvBase::get_current_contended_monitor() https://bugs.openjdk.java.net/browse/JDK-7154963 I think above code is what you want... Dan > > Thanks! > Serguei > >> >> David >> >>> The bug 6280037 should go out of the Test Stabilization radar >>> (remove >>> the svc-nightly label) >>> as the most of the impacted tests are covered by the 6471769. >>> >>> >>> Testing: >>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests >>> >>> >>> Thanks, >>> Serguei >>> > From daniel.daugherty at oracle.com Mon Feb 3 14:54:06 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 03 Feb 2014 15:54:06 -0700 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <52EFF84E.1050800@oracle.com> References: <52EC62CD.8000300@oracle.com> <52EFBAF6.8000404@oracle.com> <52EFF84E.1050800@oracle.com> Message-ID: <52F01E0D.8010304@oracle.com> On 2/3/14 1:13 PM, serguei.spitsyn at oracle.com wrote: > Dan, > > Thank you for reviewing! > Some comments are below. > > On 2/3/14 7:51 AM, Daniel D. 
Daugherty wrote: >> On 1/31/14 7:58 PM, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ >>> >> >> src/share/vm/prims/jvmtiEnv.cpp >> Thumbs up on the code change. >> >> I've never been a fan of the: >> >> if (is_thread_fully_suspended(...)) >> do-something-direct >> else >> do-something-via-a-safepoint >> >> model. We don't have any mechanism in the VM to keep the >> is_thread_fully_suspended() constraint true over the >> execution time of the "do-something-direct" operation so >> we've always been exposed to races. > > Thank you for this confirmation. > So that, we are in sync here. > > >> Also, I think this >> model is used for several other JVM/TI calls so you might >> want to visit those also with an eye on safety. > > Yes, I'm looking to isolate and double check other places like this in > the code. > Another piece of work is to isolate other bugs on this topic. > > >> >> For this specific change, I think the comment would be >> better as: >> >> // It is only safe to perform the direct operation on the current >> // thread. All other usage needs to use a vm-op for safety. > > It is taken, thanks! > >> In particular, I don't agree with this part of the comment: >> >> // Performance-wise the only important case is when current >> thread is >> // the target thread. >> >> I'm not sure how you can state that performance on the current >> thread >> is the only important case, but maybe I don't understand what you're >> really trying to say here. > > I agree, the comment is not clear. > Probably, the comment had to say: "the only case where we can keep > performance > without loosing safety is the target thread is current ...". > In fact, collecting data at a safepoint is the only way to make it safe. 
> If the target thread is not current then:
>   - a) thread has been suspended: already slow path, so
>        performance is not that critical
>   - b) thread has not been suspended: the information we provide is
>        transient, safepoint is the only way to get it safely (is it correct?)

Yes, the only safe way to get info for another thread is via
a safepoint operation.

>
>>
>> Now for the more general question with respect to this bug: 6471769
>> I'm having trouble connecting this change (which I like) to this
>> particular bug report.
>>
>> The crash in the bug's description looks like this:
>>
>> # Internal Error (/net/prt-solamd64-q1-2/PrtBuildDir/workspace/src/share/vm/prims/jvmtiThreadState.cpp, 273), pid=9561, tid=2
>> #
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20060914135846.dcubed.service_hs_b02_merge.2-debug compiled mode)
>> #
>> # Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth out of sync")
>>
>> V [libjvm.so+0x1217206];; void VMError::report_and_die()+0x606
>> V [libjvm.so+0x667841];; void report_assertion_failure(const char*,int,const char*)+0x61
>> V [libjvm.so+0xca9557];; int JvmtiThreadState::cur_stack_depth()+0x6e7
>> V [libjvm.so+0xc1caaa];; void JvmtiExport::post_method_exit(JavaThread*,methodOop,frame)+0x21ba
>> V [libjvm.so+0x8385de];; void InterpreterRuntime::post_method_exit(JavaThread*)+0x21e
>> j java.io.ObjectInputStream$BlockDataInputStream.getBlockDataMode()Z+4
>>
>> In the above crash, the current thread is trying to post a
>> JVMTI_EVENT_METHOD_EXIT and during the event posting code, it
>> calls cur_stack_depth() which fails an assert because the
>> cached stack depth doesn't match the current queried value:
>>
>> int JvmtiThreadState::cur_stack_depth() {
>>   uint32_t debug_bits = 0;
>>   guarantee(JavaThread::current() == get_thread() ||
>>     JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits),
>>     "must be current thread or suspended");
>>
>>   if (!is_interp_only_mode() || _cur_stack_depth == UNKNOWN_STACK_DEPTH) {
>>     _cur_stack_depth = count_frames();
>>   } else {
>>     // heavy weight assert
>>     assert(_cur_stack_depth == count_frames(),
>>            "cur_stack_depth out of sync");
>>   }
>>   return _cur_stack_depth;
>> }
>>
>>
>> This is not an unsafe query from one thread to another target thread
>> so I don't see how the proposed change will address this failure mode.
>> In particular, the assertion is troubling because it tells me that
>> the cached stack depth for the current thread is either wrong or has
>> been corrupted.
>
> The above are consequences of unsafe operations we are doing.
> Current thread should not calculate frame count of target thread (if
> different) other than at a safepoint.
> We assume the thread is suspended but, in fact, it was not.
> So we lost safety here, and all the manifestations we observe are the
> consequences.

For at least the stack trace shown in the description of 6471769,
it is not an unsafe case because the thread is operating on itself.
All I'm saying here is that 6471769 is not the right bug to use for
this fix...

> This bug covers the frames miscount issue.
> In the fastdebug mode we may get the assert as in this bug report:
> https://bugs.openjdk.java.net/browse/JDK-8032223

This bug: JDK-8032223 seems to be a perfect match for the code that you
are proposing to change in your current webrev. You have a thread
("JDWP Transport Listener: dt_socket") that is trying to query the
frame count on another thread... That target JavaThread met the
is_thread_fully_suspended() criteria when execution went down the
direct path, but later the target stopped meeting those criteria...

And boom...

> There is another potential race that I forgot to check and fix but
> David pointed out.
> I will try to cover it in the next version of webrev.

See the separate reply that I sent on that part...

>
> It is still possible there can be other issues here (as you say).
> But it is hard to make sure it is the case because it is extremely > hard to reproduce (you know it!). > I'd suggest to resolve it step-by-step: > - fix the known safety issues > - watch the nightly if anything else is left, file and fix new bugs This sounds like a good plan. > >> >> To reiterate: >> >> - I like the code change! >> - I would like to see the comment tweaked a bit >> >> You should consider going forward with this change using a different >> bug ID, perhaps something like: >> >> JDK-NNNNNNN improve safety of JVM/TI GetFrameCount() > > There is already bug on this topic (and there are even more dups): > https://bugs.openjdk.java.net/browse/JDK-8032223 > > But I still believe, the issue is the same ... I don't think the issues in JDK-6471769 and JDK-8032223 are the same; well not the same for the crash in the description of JDK-6471769. I didn't look closely at the other crashes... Dan > > > Thanks! > Serguei > >> >> Dan >> >> >> >>> >>> Summary: >>> >>> There is a general issue in the suspend equivalent condition >>> mechanism: >>> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() >>> may return different results: >>> - 1-st: true >>> - 2-nd: false >>> >>> This more generic suspend equivalent issue is covered by another bug: >>> https://bugs.openjdk.java.net/browse/JDK-6280037 >>> >>> The bug to fix in this review is a specific manifestation of the >>> 6280037 >>> in the JVMTI GetFrameCount() that has a big impact on the SQE >>> nightly. >>> It is on the Test Stabilization radar (as well as the 6280037). >>> There are many tests intermittently failing because of this. >>> >>> The webrev for review is a one-liner work around the 6280037 for >>> the GetFrameCount(). >>> >>> The JVMTI GetFrameCount() spec tells: >>> "If this function is called for a thread actively executing >>> bytecodes (for example, >>> not the current thread and not suspended), the information >>> returned is transient." 
>>> >>> So, it is Ok to call the GetFrameCount() for non-suspended target >>> threads. >>> To achieve safety, the frame count for non-suspended threads is >>> calculated at a safepoint. >>> It should be Ok and more safe to do the same for suspended threads >>> as well. >>> There is no big performance impact because it is already on a slow >>> path. >>> It is still important to avoid safepointing when the target thread >>> is current. >>> >>> The bug 6280037 should go out of the Test Stabilization radar >>> (remove the svc-nightly label) >>> as the most of the impacted tests are covered by the 6471769. >>> >>> >>> Testing: >>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests >>> >>> >>> Thanks, >>> Serguei >>> >> > From serguei.spitsyn at oracle.com Mon Feb 3 14:54:14 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 03 Feb 2014 14:54:14 -0800 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <52F01A63.7080206@oracle.com> References: <52EC62CD.8000300@oracle.com> <52EF88A7.3090506@oracle.com> <52EFEA25.8080905@oracle.com> <52F01A63.7080206@oracle.com> Message-ID: <52F01E16.7050200@oracle.com> On 2/3/14 2:38 PM, Daniel D. 
Daugherty wrote: > > On 2/3/14 12:12 PM, serguei.spitsyn at oracle.com wrote: >> On 2/3/14 4:16 AM, David Holmes wrote: >>> Hi Serguei, >>> >>> On 1/02/2014 12:58 PM, serguei.spitsyn at oracle.com wrote: >>>> Please, review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>> >>>> >>>> Open webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ >>>> >>>> >>>> >>>> Summary: >>>> >>>> There is a general issue in the suspend equivalent condition >>>> mechanism: >>>> Two subsequent calls to the >>>> JvmtiEnv::is_thread_fully_suspended() may >>>> return different results: >>>> - 1-st: true >>>> - 2-nd: false >>>> >>>> This more generic suspend equivalent issue is covered by another >>>> bug: >>>> https://bugs.openjdk.java.net/browse/JDK-6280037 >>>> >>>> The bug to fix in this review is a specific manifestation of the >>>> 6280037 >>>> in the JVMTI GetFrameCount() that has a big impact on the SQE >>>> nightly. >>>> It is on the Test Stabilization radar (as well as the 6280037). >>>> There are many tests intermittently failing because of this. >>>> >>>> The webrev for review is a one-liner work around the 6280037 for >>>> the >>>> GetFrameCount(). >>>> >>>> The JVMTI GetFrameCount() spec tells: >>>> "If this function is called for a thread actively executing >>>> bytecodes (for example, >>>> not the current thread and not suspended), the information >>>> returned is transient." >>>> >>>> So, it is Ok to call the GetFrameCount() for non-suspended target >>>> threads. >>>> To achieve safety, the frame count for non-suspended threads is >>>> calculated at a safepoint. >>>> It should be Ok and more safe to do the same for suspended >>>> threads as >>>> well. >>>> There is no big performance impact because it is already on a >>>> slow path. >>>> It is still important to avoid safepointing when the target >>>> thread is >>>> current. >>> >>> This sounds completely reasonable - Reviewed. 
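The dispatch rule under review can be sketched as a small standalone C++ model. This is only an illustration of the two shapes discussed in the thread (direct query after a one-time suspended check versus direct query only for the current thread); the enum and function names below are hypothetical and are not HotSpot identifiers.

```cpp
#include <cassert>

// Hypothetical model of how a frame-count request is serviced.
enum class FrameCountPath { Direct, SafepointVmOp };

// Shape criticized in the review: take the direct path whenever a one-time
// "fully suspended" check returned true. Nothing keeps that answer valid
// while the direct stack walk runs, so the result can go stale.
inline FrameCountPath choose_path_old(bool looked_suspended) {
  return looked_suspended ? FrameCountPath::Direct
                          : FrameCountPath::SafepointVmOp;
}

// Shape proposed in the review: only the current thread may walk its own
// stack directly; every other target goes through a safepoint operation.
inline FrameCountPath choose_path_new(bool target_is_current) {
  return target_is_current ? FrameCountPath::Direct
                           : FrameCountPath::SafepointVmOp;
}
```

In this model a suspended, non-current target takes the direct path under the old rule and the safepoint path under the new one, which is the slow-path cost the summary argues is acceptable.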
>>> >>> I can't check at the moment whether VM_GetFrameCount properly checks >>> for a live target thread? >> >> David, thank you for the review! >> It looks like the check is missed. >> Please, give me some time to make sure what is needed here. > > Check out the work done via this changeset: > > http://hg.openjdk.java.net/hsx/hotspot-rt/hotspot/rev/c1d7040a1183 > > Sergey did the push via: > > JDK-8022836 JVM crashes in > JVMTIENVBASE::GET_CURRENT_CONTENDED_MONITOR > and GET_OWNED_MONITOR > https://bugs.openjdk.java.net/browse/JDK-8022836 > > but the real commentary about the bug is in this one: > > JDK-7154963 crash in JvmtiEnvBase::get_current_contended_monitor() > https://bugs.openjdk.java.net/browse/JDK-7154963 > > I think above code is what you want... I know what to check, just need to make sure new check is in a right place. :) One of the examples is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2013/hotspot/8030027-JVMTI-HS101.1/ Thanks, Serguei > > Dan > > >> >> Thanks! >> Serguei >> >>> >>> David >>> >>>> The bug 6280037 should go out of the Test Stabilization radar >>>> (remove >>>> the svc-nightly label) >>>> as the most of the impacted tests are covered by the 6471769. >>>> >>>> >>>> Testing: >>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >> > From serguei.spitsyn at oracle.com Mon Feb 3 16:04:10 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 03 Feb 2014 16:04:10 -0800 Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <52F01E0D.8010304@oracle.com> References: <52EC62CD.8000300@oracle.com> <52EFBAF6.8000404@oracle.com> <52EFF84E.1050800@oracle.com> <52F01E0D.8010304@oracle.com> Message-ID: <52F02E7A.9090603@oracle.com> On 2/3/14 2:54 PM, Daniel D. Daugherty wrote: > On 2/3/14 1:13 PM, serguei.spitsyn at oracle.com wrote: >> Dan, >> >> Thank you for reviewing! 
>> Some comments are below.
>>
>> On 2/3/14 7:51 AM, Daniel D. Daugherty wrote:
>>> On 1/31/14 7:58 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for:
>>>> https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>
>>>>
>>>> Open webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/
>>>>
>>>
>>> src/share/vm/prims/jvmtiEnv.cpp
>>>     Thumbs up on the code change.
>>>
>>>     I've never been a fan of the:
>>>
>>>         if (is_thread_fully_suspended(...))
>>>           do-something-direct
>>>         else
>>>           do-something-via-a-safepoint
>>>
>>>     model. We don't have any mechanism in the VM to keep the
>>>     is_thread_fully_suspended() constraint true over the
>>>     execution time of the "do-something-direct" operation so
>>>     we've always been exposed to races.
>>
>> Thank you for this confirmation.
>> So we are in sync here.
>>
>>
>>>     Also, I think this
>>>     model is used for several other JVM/TI calls so you might
>>>     want to visit those also with an eye on safety.
>>
>> Yes, I'm looking to isolate and double check other places like this
>> in the code.
>> Another piece of work is to isolate other bugs on this topic.
>>
>>
>>>
>>>     For this specific change, I think the comment would be
>>>     better as:
>>>
>>>     // It is only safe to perform the direct operation on the current
>>>     // thread. All other usage needs to use a vm-op for safety.
>>
>> It is taken, thanks!
>>
>>>     In particular, I don't agree with this part of the comment:
>>>
>>>     // Performance-wise the only important case is when current thread is
>>>     // the target thread.
>>>
>>>     I'm not sure how you can state that performance on the current thread
>>>     is the only important case, but maybe I don't understand what you're
>>>     really trying to say here.
>>
>> I agree, the comment is not clear.
>> Probably, the comment had to say: "the only case where we can keep
>> performance without losing safety is when the target thread is current ...".
>> In fact, collecting data at a safepoint is the only way to make it safe.
>> If the target thread is not current then:
>>   - a) thread has been suspended: already slow path, so
>>        performance is not that critical
>>   - b) thread has not been suspended: the information we provide is
>>        transient, safepoint is the only way to get it safely (is it correct?)
>
> Yes, the only safe way to get info for another thread is via
> a safepoint operation.

Ok, thanks.

>>>
>>> Now for the more general question with respect to this bug: 6471769
>>> I'm having trouble connecting this change (which I like) to this
>>> particular bug report.
>>>
>>> The crash in the bug's description looks like this:
>>>
>>> # Internal Error (/net/prt-solamd64-q1-2/PrtBuildDir/workspace/src/share/vm/prims/jvmtiThreadState.cpp, 273), pid=9561, tid=2
>>> #
>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20060914135846.dcubed.service_hs_b02_merge.2-debug compiled mode)
>>> #
>>> # Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth out of sync")
>>>
>>> V [libjvm.so+0x1217206];; void VMError::report_and_die()+0x606
>>> V [libjvm.so+0x667841];; void report_assertion_failure(const char*,int,const char*)+0x61
>>> V [libjvm.so+0xca9557];; int JvmtiThreadState::cur_stack_depth()+0x6e7
>>> V [libjvm.so+0xc1caaa];; void JvmtiExport::post_method_exit(JavaThread*,methodOop,frame)+0x21ba
>>> V [libjvm.so+0x8385de];; void InterpreterRuntime::post_method_exit(JavaThread*)+0x21e
>>> j java.io.ObjectInputStream$BlockDataInputStream.getBlockDataMode()Z+4
>>>
>>> In the above crash, the current thread is trying to post a
>>> JVMTI_EVENT_METHOD_EXIT and during the event posting code, it
>>> calls cur_stack_depth() which fails an assert because the
>>> cached stack depth doesn't match the current queried value:
>>>
>>> int JvmtiThreadState::cur_stack_depth() {
>>>   uint32_t debug_bits = 0;
>>>   guarantee(JavaThread::current() == get_thread() ||
>>>     JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits),
>>>     "must be current thread or suspended");
>>>
>>>   if (!is_interp_only_mode() || _cur_stack_depth == UNKNOWN_STACK_DEPTH) {
>>>     _cur_stack_depth = count_frames();
>>>   } else {
>>>     // heavy weight assert
>>>     assert(_cur_stack_depth == count_frames(),
>>>            "cur_stack_depth out of sync");
>>>   }
>>>   return _cur_stack_depth;
>>> }
>>>
>>>
>>> This is not an unsafe query from one thread to another target thread
>>> so I don't see how the proposed change will address this failure mode.
>>> In particular, the assertion is troubling because it tells me that
>>> the cached stack depth for the current thread is either wrong or has
>>> been corrupted.
>>
>> The above are consequences of unsafe operations we are doing.
>> Current thread should not calculate frame count of target thread (if
>> different) other than at a safepoint.
>> We assume the thread is suspended but, in fact, it was not.
>> So we lost safety here, and all the manifestations we observe are the
>> consequences.
>
> For at least the stack trace shown in the description of 6471769,
> it is not an unsafe case because the thread is operating on itself.
> All I'm saying here is that 6471769 is not the right bug to use for
> this fix...

Thank you for sharing the concern.
I was not able to reproduce the issue, and so did not prove my theory
(which may be wrong).
But you have a pretty valid point here.
I still think the stack trace with the assert is a consequence of a
previous unsafe calculation on another thread.
But that calculation is probably different from the one I'm fixing in
this webrev.

>> This bug covers the frames miscount issue.
>> In the fastdebug mode we may get the assert as in this bug report:
>> https://bugs.openjdk.java.net/browse/JDK-8032223
>
> This bug: JDK-8032223 seems to be a perfect match for the code that you
> are proposing to change in your current webrev.
You have a thread > ("JDWP Transport Listener: dt_socket") that is trying to query the > frame count on another thread... That target JavaThread met the > is_thread_fully_suspended() criteria when execution went down the > direct path, but later the target stopped meeting that criteria... > > And boom... > >> There is another potential race that I forgot to check and fix but >> David pointed out. >> I will try to cover it in the next version of webrev. > > See the separate reply that I sent on that part... > > >> >> It is still possible there can be other issues here (as you say). >> But it is hard to make sure it is the case because it is extremely >> hard to reproduce (you know it!). >> I'd suggest to resolve it step-by-step: >> - fix the known safety issues >> - watch the nightly if anything else is left, file and fix new bugs > > This sounds like a good plan. > > >> >>> >>> To reiterate: >>> >>> - I like the code change! >>> - I would like to see the comment tweaked a bit >>> >>> You should consider going forward with this change using a different >>> bug ID, perhaps something like: >>> >>> JDK-NNNNNNN improve safety of JVM/TI GetFrameCount() >> >> There is already bug on this topic (and there are even more dups): >> https://bugs.openjdk.java.net/browse/JDK-8032223 >> >> But I still believe, the issue is the same ... > > I don't think the issues in JDK-6471769 and JDK-8032223 are the same; > well not the same for the crash in the description of JDK-6471769. I > didn't look closely at the other crashes... Ok. I will reopen the JDK-8032223 and use it for this fix. Thank you for helping to sort this out! Serguei > > > Dan > > >> >> >> Thanks! 
>> Serguei >> >>> >>> Dan >>> >>> >>> >>>> >>>> Summary: >>>> >>>> There is a general issue in the suspend equivalent condition >>>> mechanism: >>>> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() >>>> may return different results: >>>> - 1-st: true >>>> - 2-nd: false >>>> >>>> This more generic suspend equivalent issue is covered by another >>>> bug: >>>> https://bugs.openjdk.java.net/browse/JDK-6280037 >>>> >>>> The bug to fix in this review is a specific manifestation of the >>>> 6280037 >>>> in the JVMTI GetFrameCount() that has a big impact on the SQE >>>> nightly. >>>> It is on the Test Stabilization radar (as well as the 6280037). >>>> There are many tests intermittently failing because of this. >>>> >>>> The webrev for review is a one-liner work around the 6280037 for >>>> the GetFrameCount(). >>>> >>>> The JVMTI GetFrameCount() spec tells: >>>> "If this function is called for a thread actively executing >>>> bytecodes (for example, >>>> not the current thread and not suspended), the information >>>> returned is transient." >>>> >>>> So, it is Ok to call the GetFrameCount() for non-suspended target >>>> threads. >>>> To achieve safety, the frame count for non-suspended threads is >>>> calculated at a safepoint. >>>> It should be Ok and more safe to do the same for suspended >>>> threads as well. >>>> There is no big performance impact because it is already on a >>>> slow path. >>>> It is still important to avoid safepointing when the target >>>> thread is current. >>>> >>>> The bug 6280037 should go out of the Test Stabilization radar >>>> (remove the svc-nightly label) >>>> as the most of the impacted tests are covered by the 6471769. 
>>>> >>>> >>>> Testing: >>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>> >> > From david.holmes at oracle.com Mon Feb 3 22:39:48 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 04 Feb 2014 16:39:48 +1000 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52EFAE1C.3050202@oracle.com> References: <52ED3F28.5070806@oracle.com> <52EF874A.8020603@oracle.com> <52EFAE1C.3050202@oracle.com> Message-ID: <52F08B34.1010300@oracle.com> On 4/02/2014 12:56 AM, Daniel D. Daugherty wrote: > Adding Dave Dice to this thread... > > On 2/3/14 5:10 AM, David Holmes wrote: >> Hi Dan, >> >> On 2/02/2014 4:38 AM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> I have a fix ready for the following bug: >>> >>> 8028073 race condition in ObjectMonitor implementation causing >>> deadlocks >>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>> >>> On the surface, this is a very simple fix that relocates a few lines of >>> code, relocates and rewrites the comments associated with that code and >>> adds several new comments. >>> >>> Of course, in reality, the issue is much more complicated, but I'm >>> hoping to make it easy for anyone not acquainted with this issue to >>> understand what's going on. >> >> Let's spare them the trouble - Reviewed :) > > Thanks for the review! > > >> I'm still troubled by the rationale for the original placement of this >> code. It makes sense to issue the unpark prior to the park() call but >> on the other hand there is no agent/event code in that path that would >> have consumed the original unpark. ?? > > I think I've acknowledged that concern in the "Risks" section below: > > - while the old code was not effective for the hang that is being > fixed with this bug, it is possible that the old code prevented > a different bug in the successor protocol from manifesting > > Hopefully, Dave Dice will chime in and share the history... 
For the record, Dave did chime in off-list: this code block dates
back to the JVMPI implementation, which did
jvmpi::post_monitor_contended_exit_event at the end of exit(), which is
called just before the code block in question. That was back in JDK 5.
By JDK 6 JVMPI had been stripped out but the code block remained.

>> Thanks, I know you've really enjoyed this one :)
>
> I actually did enjoy this one from a 'challenge to the mind' POV.
> But I have to say that the intricacies of the successor protocol
> gave me headaches... Hopefully, my successor.notes will help anyone
> else that is crazy enough to wander down this path...

Indeed!

Thanks,
David

> Dan
>
>
>>
>> David
>> -----
>>
>>> Here are the JDK9 webrev URLs:
>>>
>>> OpenJDK:
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> Oracle internal:
>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> The simple summary:
>>>
>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>   it is possible for a JVM/TI monitor event handler to accidentally
>>>   consume a ParkEvent.unpark() call meant for the Java Monitor layer
>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>   https://bugs.openjdk.java.net/browse/JDK-5030359
>>> - it's the right fix, but it's in the wrong place
>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>   event handler is called because it is that event handler
>>>   that can cause the hang
>>>
>>>
>>> Testing
>>> -------
>>>
>>> - a new StressMonitorWait test has been created that reliably
>>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>>   for the specific versions where the hang has been reproduced
>>>   - the test reliably reproduces the hang in 5 seconds on my
>>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>     the hang reliably on other machines
>>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>   Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>   bits did not reproduce the hang
>>> - JPRT test job
>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>   and Windows-X64:
>>>   - vm.quick
>>>   - Kitchensink (bigapps)
>>>   - Weblogic+medrec (bigapps)
>>>   - runThese (bigapps)
>>>
>>>
>>> The Gory Details Start Here
>>> ---------------------------
>>>
>>> This is the old location of the block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> <snip>
>>> 1499   exit (true, Self) ; // exit the monitor
>>> <snip>
>>> 1513   if (node._notified != 0 && _succ == Self) {
>>> 1514     node._event->unpark();
>>> 1515   }
>>>
>>>
>>> This is the new location of the block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> <snip>
>>> 1601   if (JvmtiExport::should_post_monitor_waited()) {
>>> 1602     JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>> <snip>
>>> 1604     if (node._notified != 0 && _succ == Self) {
>>> <snip>
>>> 1620       node._event->unpark();
>>> 1621     }
>>>
>>>
>>> The Risks
>>> ---------
>>>
>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>   is enabled:
>>>   - previously it was always executed
>>>   - while the old code was not effective for the hang that is being
>>>     fixed with this bug, it is possible that the old code prevented
>>>     a different bug in the successor protocol from manifesting
>>>   - thorough analysis of the successor protocol did not reveal a
>>>     case where the old code was needed in the old location
>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>   - the successor protocol is complicated and the analysis could
>>>     be wrong when certain options are used
>>>   - comments were added to each location where a
JVM/TI monitor
>>>     event handler is called documenting why a fix like this one
>>>     is not needed there
>>>   - if the analysis is wrong, the new comments show where a new
>>>     code change would be needed
>>>
>>>
>>> The Scenario
>>> ------------
>>>
>>> I've created a scenario that reproduces this hang:
>>>
>>> T1 - enters monitor and calls monitor.wait()
>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>> T3 - enters and exits the monitor
>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>
>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>>
>>> Here are the six events necessary to make this hang happen:
>>>
>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>>
>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>
>>> // KEY-EVENT-3a: T3 made T1 the successor
>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>
>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>
>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>
>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
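The lost wakeup in KEY-EVENT-2 and KEY-EVENT-6 can be replayed with a tiny single-threaded C++ model. It is only a sketch under stated assumptions: `PermitModel` is a hypothetical stand-in for a per-thread ParkEvent reduced to one binary permit, and the two helper functions are illustrative names, not HotSpot code. The point it shows is that when two layers share one permit, whichever layer parks first consumes the wakeup meant for the other.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a per-thread ParkEvent: one binary permit.
struct PermitModel {
  std::atomic<bool> permit{false};
  // Grant the permit; granting twice still leaves only one permit.
  void unpark() { permit.store(true); }
  // Consume the permit if present. Returns true when a real park()
  // would have returned immediately, false when it would have blocked.
  bool try_park() { return permit.exchange(false); }
};

// Replays the failure: T3 unparks T1, the event handler's raw-monitor
// wait consumes the permit, and T1's later park finds nothing.
inline bool t1_blocks_without_fix() {
  PermitModel ev;                      // shared by both monitor layers
  ev.unpark();                         // T3 makes T1 the successor and unparks it
  bool handler_woke = ev.try_park();   // handler's timed raw-monitor wait eats the permit
  bool t1_woke = ev.try_park();        // T1 parks on reentry: no permit left
  return handler_woke && !t1_woke;
}

// Models the relocated check: after the handler returns, re-issue the
// unpark if T1 was notified and is still the successor.
inline bool t1_blocks_with_fix(bool notified, bool is_successor) {
  PermitModel ev;
  ev.unpark();                         // T3's unpark
  ev.try_park();                       // consumed by the handler
  if (notified && is_successor) {
    ev.unpark();                       // compensating unpark placed after the handler
  }
  return !ev.try_park();               // true means T1 would block
}
```

In this model the compensating unpark only helps when it runs after the handler, which matches the thread's argument that the original fix was right but placed too early.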
>>> >>> >>> This bug is intertwined with: >>> >>> - The ObjectMonitor successor protocol >>> - the sharing of a ParkEvent between Java Monitors and JVM/TI >>> RawMonitors >>> >>> There is a very long successor.notes attachment to JDK-8028073 that >>> attempts to describe the ObjectMonitor successor protocol. It's good >>> for putting pretty much anyone to sleep. >>> >>> Since this hang reproduces back to JDK6, this bug is taking the easily >>> backported solution of moving the original fix to the right location. >>> The following new bug has been filed for possible future work in this >>> area by the Serviceability Team: >>> >>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>> >>> >>> The Symptoms >>> ------------ >>> >>> With intermittent hangs like this, it is useful to know what to look >>> for in order to determine if you are running into this issue: >>> >>> - if you aren't using a debugger or a profiler or some other >>> JVM/TI agent, then this hang is not the same as yours >>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>> event handler, then this hang is not the same as yours >>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>> JVM/TI RawMonitors, then this hang is not the same as yours >>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>> back into Java code, then you might just be insane and this >>> hang might be similar to yours. However, using a Java callback >>> in an event handler is an even bigger problem/risk so fix that >>> first. 
>>> - if you have one or more threads blocked like this and making no
>>>   progress, then this hang might be the same as yours:
>>>
>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>    JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at java.lang.Object.wait(Native Method)
>>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>         at java.lang.Object.wait(Object.java:502)
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>    JavaThread state: _thread_blocked
>>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>    JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T1:
>>>
>>> - had the object locked:
>>>
>>>     locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - did an Object.wait():
>>>
>>>     waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - is blocked on reentry:
>>>
>>>     waiting for monitor entry [0xfffffd7fc0231000]
>>>
>>> Key symptoms in thread T2:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>     waiting for monitor entry [0xfffffd7fc0130000]
>>>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T3:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>     waiting for monitor entry [0xfffffd7fc002f000]
>>>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>

From shanliang.jiang at oracle.com Tue Feb 4 00:54:20 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 04 Feb 2014 09:54:20 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F0192B.9060104@oracle.com>
References: <52F0192B.9060104@oracle.com>
Message-ID: <52F0AABC.7000904@oracle.com>

Jaroslav,

Your fix should work in most cases, but wouldn't it be better and more
reliable to wait for a VM event, as suggested in the bug? Even though
your timeout is adapted to the test timeout factor, the solution still
depends on a fixed timeout and a fixed line of output.

Shanliang

Jaroslav Bachorik wrote:
> Please, review the following test fix:
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>
> The fix prevents the situation when the first debuggee has not managed
> to finish its initialization while the second one is started up thus
> making the port available for the second debuggee and failing the test.
>
> The patch is using the library methods to configure and launch the
> debuggee and the test waits for the well known string to appear in the
> first debuggee output before attempting to launch the second debuggee.
>
> Thanks,
>
> -JB-

From jaroslav.bachorik at oracle.com Tue Feb 4 00:58:42 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 04 Feb 2014 09:58:42 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F0AABC.7000904@oracle.com>
References: <52F0192B.9060104@oracle.com> <52F0AABC.7000904@oracle.com>
Message-ID: <52F0ABC2.4040700@oracle.com>

On 4.2.2014 09:54, shanliang wrote:
> Jaroslav,
>
> Your fix should work in most cases, but wouldn't it be better and more
> reliable to wait for a VM event, as suggested in the bug? Even though
> your timeout is adapted to the test timeout factor, the solution still
> depends on a fixed timeout and a fixed line of output.

Well, if I get the test logic correctly, it is supposed to test that the
agent blocks the port even when no client has connected yet. Connecting
to the agent and waiting for the event would actually change the thing
the test checks.

-JB-

>
> Shanliang
>
> Jaroslav Bachorik wrote:
>> Please, review the following test fix:
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
>> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>>
>> The fix prevents the situation when the first debuggee has not managed
>> to finish its initialization while the second one is started up thus
>> making the port available for the second debuggee and failing the test.
>>
>> The patch is using the library methods to configure and launch the
>> debuggee and the test waits for the well known string to appear in the
>> first debuggee output before attempting to launch the second debuggee.
>> >> Thanks, >> >> -JB- > From jesper.wilhelmsson at oracle.com Tue Feb 4 01:17:16 2014 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Tue, 04 Feb 2014 10:17:16 +0100 Subject: RFR (backport of): 8028391 - Make the Min/MaxHeapFreeRatio flags manageable Message-ID: <52F0B01C.2040600@oracle.com> Hi, The patch from jdk9/hs-gc to make MinHeapFreeRatio and MaxHeapFreeRatio manageable flags and supported in ParallelGC applied cleanly to jdk8u/hs-dev. hsx/jdk7u60 required some manual labor to patch since the argument parsing code had moved around, but the relevant logic was untouched so I didn't have to change anything in the new code. For 7 I omitted some of the minor cleanups that was done in the original patch. Webrev for 7: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.7u60/ Webrev for 8: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.8u20/ (The 8u20 webrev is identical with the final version for 9.) So kindly approve this change for backport to jdk8u20 and jdk7u60. 
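For readers less familiar with the term: a "manageable" flag is one that can be inspected and changed in a running VM. The following is only a minimal sketch of what that looks like from Java, assuming a HotSpot VM that exposes `com.sun.management.HotSpotDiagnosticMXBean`; the class name `HeapFreeRatioDemo` and its helper methods are made up for illustration and are not part of the patch. Whether `MinHeapFreeRatio` is actually writeable depends on the JDK build in use.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only.
public class HeapFreeRatioDemo {

    // Describes a VM flag as "name=value (writeable=...)", or notes that it is unknown.
    static String describe(String flag) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return flag + ": no HotSpot diagnostic bean available";
        }
        try {
            VMOption opt = bean.getVMOption(flag);
            return opt.getName() + "=" + opt.getValue()
                + " (writeable=" + opt.isWriteable() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when the VM has no flag with this name.
            return flag + " is not known to this VM";
        }
    }

    // Tries to change a flag at run time; returns false if the VM refuses.
    static boolean trySet(String flag, String value) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return false;
        }
        try {
            bean.setVMOption(flag, value);
            return true;
        } catch (IllegalArgumentException e) {
            // Unknown flag, flag not writeable, or value rejected by the VM.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("MinHeapFreeRatio"));
        System.out.println(describe("MaxHeapFreeRatio"));
        // The value 20 is an arbitrary example, not a recommendation.
        System.out.println("set ok: " + trySet("MinHeapFreeRatio", "20"));
    }
}
```

On a build without this change the `trySet` call is expected to return false, since the flag would not be writeable there.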
The bugs: jdk9: https://bugs.openjdk.java.net/browse/JDK-8028391 jdk8: https://bugs.openjdk.java.net/browse/JDK-8033209 jdk7: https://bugs.openjdk.java.net/browse/JDK-8028720 Reviews in this thread: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-January/009187.html The jdk9 changeset: http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/44315152d434 Thanks, /Jesper From olivier.lagneau at oracle.com Tue Feb 4 02:15:29 2014 From: olivier.lagneau at oracle.com (Olivier Lagneau) Date: Tue, 04 Feb 2014 11:15:29 +0100 Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval for CR 8019389: SA-JDI JSR292: sun.jvm.hotspot.jdi.StackFrame.thisObject() throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check In-Reply-To: <2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com> References: <52DD37E3.1020402@oracle.com> <52DD3D1B.9070700@oracle.com> <52DD53C2.6070904@oracle.com> <52DE32CE.3080703@oracle.com> <2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com> Message-ID: <52F0BDC1.7050405@oracle.com> I need a second 7u reviewer for this 7u60 critical request fix (very simple fix). Could someone review it please ? Thanks, Olivier. Christian Thalinger said on date 1/23/2014 10:09 PM: > Looks good. > > On Jan 21, 2014, at 12:41 AM, Olivier Lagneau > > wrote: > >> Please find the new webrev with copyright date fixed (changed to 2014). >> >> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ >> >> >> Olivier. >> >> Olivier Lagneau said on date 1/20/2014 5:50 PM: >>> Oops, right ! >>> >>> Will fix that. >>> >>> Olivier. >>> >>> shanliang said on date 1/20/2014 4:13 PM: >>>> Olivier, >>>> >>>> Now it is 2014 :) >>>> >>>> >>>> Olivier Lagneau wrote: >>>>> Please review the following simple fix. 
>>>>> >>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389 >>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ >>>>> >>>>> >>>>> The issue is due to the fact that _invokeHandle bytecode is >>>>> generated by hotspot, >>>>> but is not declared in agent code. Just declaring the new bytecode >>>>> solves the assertion failure. >>>>> >>>>> However the tests reported in 8019389 >>>>> (bootstrapOtherStratumInStackTrace, targetOtherStratumInStackTrace) >>>>> suffer the problem from JDK-7016268 >>>>> : Can't get >>>>> strata information through SA-JDI >>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still be >>>>> present after fix. >>>>> This second problem has to be fixed through JDK-7016268. >>>>> >>>>> Thanks, >>>>> Olivier. >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140204/46a80282/attachment-0001.html From serguei.spitsyn at oracle.com Tue Feb 4 03:03:27 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 04 Feb 2014 03:03:27 -0800 Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval for CR 8019389: SA-JDI JSR292: sun.jvm.hotspot.jdi.StackFrame.thisObject() throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check In-Reply-To: <52F0BDC1.7050405@oracle.com> References: <52DD37E3.1020402@oracle.com> <52DD3D1B.9070700@oracle.com> <52DD53C2.6070904@oracle.com> <52DE32CE.3080703@oracle.com> <2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com> <52F0BDC1.7050405@oracle.com> Message-ID: <52F0C8FF.70100@oracle.com> Hi Oliver, The fix looks good. Was this already fixed in the JDK 9 and 8? Thanks, Serguei On 2/4/14 2:15 AM, Olivier Lagneau wrote: > I need a second 7u reviewer for this 7u60 critical request fix (very > simple fix). > > Could someone review it please ? > > Thanks, > Olivier. > > Christian Thalinger said on date 1/23/2014 10:09 PM: >> Looks good. 
>> >> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau >> > wrote: >> >>> Please find the new webrev with copyright date fixed (changed to 2014). >>> >>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ >>> >>> >>> Olivier. >>> >>> Olivier Lagneau said on date 1/20/2014 5:50 PM: >>>> Oops, right ! >>>> >>>> Will fix that. >>>> >>>> Olivier. >>>> >>>> shanliang said on date 1/20/2014 4:13 PM: >>>>> Olivier, >>>>> >>>>> Now it is 2014 :) >>>>> >>>>> >>>>> Olivier Lagneau wrote: >>>>>> Please review the following simple fix. >>>>>> >>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389 >>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ >>>>>> >>>>>> >>>>>> The issue is due to the fact that _invokeHandle bytecode is >>>>>> generated by hotspot, >>>>>> but is not declared in agent code. Just declaring the new >>>>>> bytecode solves the assertion failure. >>>>>> >>>>>> However the tests reported in 8019389 >>>>>> (bootstrapOtherStratumInStackTrace, targetOtherStratumInStackTrace) >>>>>> suffer the problem from JDK-7016268 >>>>>> : Can't get >>>>>> strata information through SA-JDI >>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still be >>>>>> present after fix. >>>>>> This second problem has to be fixed through JDK-7016268. >>>>>> >>>>>> Thanks, >>>>>> Olivier. >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140204/6ae668b8/attachment.html From serguei.spitsyn at oracle.com Tue Feb 4 03:13:20 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 04 Feb 2014 03:13:20 -0800 Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) Message-ID: <52F0CB50.3040703@oracle.com> Please, review the fix for: https://bugs.openjdk.java.net/browse/JDK-8032223 Open webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ Summary: This is the second round of review for this issue. But it was decided that the JDK-8032223 must be used to cover it instead of the JDK-6471769. The 8032223 was initially closed as a dup of 6471769 but it has been re-open now. There is a general issue in the suspend equivalent condition mechanism: Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may return different results: - 1-st: true - 2-nd: false This suspend equivalent issue is covered by another bug: https://bugs.openjdk.java.net/browse/JDK-6280037 The bug to fix in this review is a specific manifestation of the 6280037 in the JVMTI GetFrameCount() that has a major impact on the SQE nightly. It is on the Test Stabilization radar as well as the 6280037. There are many tests intermittently failing because of this. I've also decided to fix the same issue in the JVMTI GetFrameLocation() as well. The JVMTI GetFrameCount() spec tells: "If this function is called for a thread actively executing bytecodes (for example, not the current thread and not suspended), the information returned is transient." So, it is Ok to call the GetFrameCount() for the non-suspended target thread. To achieve safety, the frame count for non-suspended threads is calculated at a safepoint. 
It should be Ok and more safe to do the same for suspended threads as well. There should be no big performance impact because it is already on a slow path. It is still important to avoid safepointing when the target thread is current. The bug 6280037 should go out of the Test Stabilization radar (remove the svc-nightly label) as the most of the impacted tests must be covered by the 8032223. Testing: In progress: - nsk.jvmti, nsk.jdi, nsk.jdwp - JTreg com/sun/jdi Thanks, Serguei From olivier.lagneau at oracle.com Tue Feb 4 03:10:38 2014 From: olivier.lagneau at oracle.com (Olivier Lagneau) Date: Tue, 04 Feb 2014 12:10:38 +0100 Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval for CR 8019389: SA-JDI JSR292: sun.jvm.hotspot.jdi.StackFrame.thisObject() throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check In-Reply-To: <52F0C8FF.70100@oracle.com> References: <52DD37E3.1020402@oracle.com> <52DD3D1B.9070700@oracle.com> <52DD53C2.6070904@oracle.com> <52DE32CE.3080703@oracle.com> <2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com> <52F0BDC1.7050405@oracle.com> <52F0C8FF.70100@oracle.com> Message-ID: <52F0CAAE.3010106@oracle.com> Thanks Serguei ! This is a 7u only fix. This code is already present in 8 and 9. Olivier. serguei.spitsyn at oracle.com said on date 2/4/2014 12:03 PM: > Hi Oliver, > > The fix looks good. > Was this already fixed in the JDK 9 and 8? > > Thanks, > Serguei > > On 2/4/14 2:15 AM, Olivier Lagneau wrote: >> I need a second 7u reviewer for this 7u60 critical request fix (very >> simple fix). >> >> Could someone review it please ? >> >> Thanks, >> Olivier. >> >> Christian Thalinger said on date 1/23/2014 10:09 PM: >>> Looks good. >>> >>> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau >>> > wrote: >>> >>>> Please find the new webrev with copyright date fixed (changed to 2014). >>>> >>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ >>>> >>>> >>>> Olivier. 
>>>> >>>> Olivier Lagneau said on date 1/20/2014 5:50 PM: >>>>> Oops, right ! >>>>> >>>>> Will fix that. >>>>> >>>>> Olivier. >>>>> >>>>> shanliang said on date 1/20/2014 4:13 PM: >>>>>> Olivier, >>>>>> >>>>>> Now it is 2014 :) >>>>>> >>>>>> >>>>>> Olivier Lagneau wrote: >>>>>>> Please review the following simple fix. >>>>>>> >>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389 >>>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ >>>>>>> >>>>>>> >>>>>>> The issue is due to the fact that _invokeHandle bytecode is >>>>>>> generated by hotspot, >>>>>>> but is not declared in agent code. Just declaring the new >>>>>>> bytecode solves the assertion failure. >>>>>>> >>>>>>> However the tests reported in 8019389 >>>>>>> (bootstrapOtherStratumInStackTrace, targetOtherStratumInStackTrace) >>>>>>> suffer the problem from JDK-7016268 >>>>>>> : Can't get >>>>>>> strata information through SA-JDI >>>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still >>>>>>> be present after fix. >>>>>>> This second problem has to be fixed through JDK-7016268. >>>>>>> >>>>>>> Thanks, >>>>>>> Olivier. >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140204/8ffc47b8/attachment.html From serguei.spitsyn at oracle.com Tue Feb 4 03:32:05 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 04 Feb 2014 03:32:05 -0800 Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval for CR 8019389: SA-JDI JSR292: sun.jvm.hotspot.jdi.StackFrame.thisObject() throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check In-Reply-To: <52F0CAAE.3010106@oracle.com> References: <52DD37E3.1020402@oracle.com> <52DD3D1B.9070700@oracle.com> <52DD53C2.6070904@oracle.com> <52DE32CE.3080703@oracle.com> <2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com> <52F0BDC1.7050405@oracle.com> <52F0C8FF.70100@oracle.com> <52F0CAAE.3010106@oracle.com> Message-ID: <52F0CFB5.3040107@oracle.com> On 2/4/14 3:10 AM, Olivier Lagneau wrote: > Thanks Serguei ! > > This is a 7u only fix. This code is already present in 8 and 9. Ok, thanks. Just in case, I've attached one of the latest e-mail notifications from Alejandro. Ask questions if there is any confusion. Thanks, Serguei > > Olivier. > > serguei.spitsyn at oracle.com said on date 2/4/2014 12:03 PM: >> Hi Oliver, >> >> The fix looks good. >> Was this already fixed in the JDK 9 and 8? >> >> Thanks, >> Serguei >> >> On 2/4/14 2:15 AM, Olivier Lagneau wrote: >>> I need a second 7u reviewer for this 7u60 critical request fix (very >>> simple fix). >>> >>> Could someone review it please ? >>> >>> Thanks, >>> Olivier. >>> >>> Christian Thalinger said on date 1/23/2014 10:09 PM: >>>> Looks good. >>>> >>>> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau >>>> > wrote: >>>> >>>>> Please find the new webrev with copyright date fixed (changed to >>>>> 2014). >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ >>>>> >>>>> >>>>> Olivier. >>>>> >>>>> Olivier Lagneau said on date 1/20/2014 5:50 PM: >>>>>> Oops, right ! >>>>>> >>>>>> Will fix that. >>>>>> >>>>>> Olivier. 
>>>>>> >>>>>> shanliang said on date 1/20/2014 4:13 PM: >>>>>>> Olivier, >>>>>>> >>>>>>> Now it is 2014 :) >>>>>>> >>>>>>> >>>>>>> Olivier Lagneau wrote: >>>>>>>> Please review the following simple fix. >>>>>>>> >>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389 >>>>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> The issue is due to the fact that _invokeHandle bytecode is >>>>>>>> generated by hotspot, >>>>>>>> but is not declared in agent code. Just declaring the new >>>>>>>> bytecode solves the assertion failure. >>>>>>>> >>>>>>>> However the tests reported in 8019389 >>>>>>>> (bootstrapOtherStratumInStackTrace, targetOtherStratumInStackTrace) >>>>>>>> suffer the problem from JDK-7016268 >>>>>>>> : Can't get >>>>>>>> strata information through SA-JDI >>>>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still >>>>>>>> be present after fix. >>>>>>>> This second problem has to be fixed through JDK-7016268. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Olivier. >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140204/75360197/attachment-0001.html -------------- next part -------------- An embedded message was scrubbed... 
From: Alejandro E Murillo Subject: [Rectification] Re: New hotspot repo for 7u60 changes Date: Mon, 03 Feb 2014 16:14:29 -0700 Size: 4082 Url: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140204/75360197/AttachedMessage-0001.nws From david.holmes at oracle.com Tue Feb 4 03:46:20 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 04 Feb 2014 21:46:20 +1000 Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) In-Reply-To: <52F0CB50.3040703@oracle.com> References: <52F0CB50.3040703@oracle.com> Message-ID: <52F0D30C.20401@oracle.com> Hi Serguei, Looks okay to me. Minor nit: "use a vm-op for safety" is actually "use a vm-safepoint-op for safety". Not all VM ops need involve a safepoint. David On 4/02/2014 9:13 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-8032223 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ > > > Summary: > > This is the second round of review for this issue. > But it was decided that the JDK-8032223 must be used to cover it > instead of the JDK-6471769. > The 8032223 was initially closed as a dup of 6471769 but it has been > re-open now. > > There is a general issue in the suspend equivalent condition mechanism: > Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may > return different results: > - 1-st: true > - 2-nd: false > > This suspend equivalent issue is covered by another bug: > https://bugs.openjdk.java.net/browse/JDK-6280037 > > The bug to fix in this review is a specific manifestation of the 6280037 > in the JVMTI GetFrameCount() that has a major impact on the SQE nightly. > It is on the Test Stabilization radar as well as the 6280037. > There are many tests intermittently failing because of this. 
> I've also decided to fix the same issue in the JVMTI > GetFrameLocation() as well. > > The JVMTI GetFrameCount() spec tells: > "If this function is called for a thread actively executing > bytecodes (for example, > not the current thread and not suspended), the information > returned is transient." > > So, it is Ok to call the GetFrameCount() for the non-suspended target > thread. > To achieve safety, the frame count for non-suspended threads is > calculated at a safepoint. > It should be Ok and more safe to do the same for suspended threads as > well. > There should be no big performance impact because it is already on a > slow path. > It is still important to avoid safepointing when the target thread is > current. > > The bug 6280037 should go out of the Test Stabilization radar (remove > the svc-nightly label) > as the most of the impacted tests must be covered by the 8032223. > > > Testing: > In progress: > - nsk.jvmti, nsk.jdi, nsk.jdwp > - JTreg com/sun/jdi > > > Thanks, > Serguei > From olivier.lagneau at oracle.com Tue Feb 4 03:33:34 2014 From: olivier.lagneau at oracle.com (Olivier Lagneau) Date: Tue, 04 Feb 2014 12:33:34 +0100 Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval for CR 8019389: SA-JDI JSR292: sun.jvm.hotspot.jdi.StackFrame.thisObject() throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check In-Reply-To: <52F0CFB5.3040107@oracle.com> References: <52DD37E3.1020402@oracle.com> <52DD3D1B.9070700@oracle.com> <52DD53C2.6070904@oracle.com> <52DE32CE.3080703@oracle.com> <2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com> <52F0BDC1.7050405@oracle.com> <52F0C8FF.70100@oracle.com> <52F0CAAE.3010106@oracle.com> <52F0CFB5.3040107@oracle.com> Message-ID: <52F0D00E.2040700@oracle.com> serguei.spitsyn at oracle.com said on date 2/4/2014 12:32 PM: > > Just in case, I've attached one of the latest e-mail notifications > from Alejandro. > Ask questions if there is any confusion. Thanks Serguey for reminding. 
I have been in touch with Alejandro about where to push the fix. Olivier. > > Thanks, > Serguei > >> >> Olivier. >> >> serguei.spitsyn at oracle.com said on date 2/4/2014 12:03 PM: >>> Hi Oliver, >>> >>> The fix looks good. >>> Was this already fixed in the JDK 9 and 8? >>> >>> Thanks, >>> Serguei >>> >>> On 2/4/14 2:15 AM, Olivier Lagneau wrote: >>>> I need a second 7u reviewer for this 7u60 critical request fix >>>> (very simple fix). >>>> >>>> Could someone review it please ? >>>> >>>> Thanks, >>>> Olivier. >>>> >>>> Christian Thalinger said on date 1/23/2014 10:09 PM: >>>>> Looks good. >>>>> >>>>> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau >>>>> > >>>>> wrote: >>>>> >>>>>> Please find the new webrev with copyright date fixed (changed to >>>>>> 2014). >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ >>>>>> >>>>>> >>>>>> Olivier. >>>>>> >>>>>> Olivier Lagneau said on date 1/20/2014 5:50 PM: >>>>>>> Oops, right ! >>>>>>> >>>>>>> Will fix that. >>>>>>> >>>>>>> Olivier. >>>>>>> >>>>>>> shanliang said on date 1/20/2014 4:13 PM: >>>>>>>> Olivier, >>>>>>>> >>>>>>>> Now it is 2014 :) >>>>>>>> >>>>>>>> >>>>>>>> Olivier Lagneau wrote: >>>>>>>>> Please review the following simple fix. >>>>>>>>> >>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389 >>>>>>>>> Webrev: >>>>>>>>> http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ >>>>>>>>> >>>>>>>>> >>>>>>>>> The issue is due to the fact that _invokeHandle bytecode is >>>>>>>>> generated by hotspot, >>>>>>>>> but is not declared in agent code. Just declaring the new >>>>>>>>> bytecode solves the assertion failure. >>>>>>>>> >>>>>>>>> However the tests reported in 8019389 >>>>>>>>> (bootstrapOtherStratumInStackTrace, >>>>>>>>> targetOtherStratumInStackTrace) >>>>>>>>> suffer the problem from JDK-7016268 >>>>>>>>> : Can't get >>>>>>>>> strata information through SA-JDI >>>>>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still >>>>>>>>> be present after fix. 
>>>>>>>>> This second problem has to be fixed through JDK-7016268. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Olivier. >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140204/b50a26bc/attachment.html From dmitry.samersoff at oracle.com Tue Feb 4 05:46:49 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Tue, 04 Feb 2014 17:46:49 +0400 Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not available in client VM In-Reply-To: References: <52EBFC89.3030809@oracle.com> <52EF7675.3020707@oracle.com> Message-ID: <52F0EF49.4050003@oracle.com> Staffan, You were right, only two classes are C2-specific. Here is the updated webrev. http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.02/ -Dmitry On 2014-02-03 16:13, Staffan Larsen wrote: > > On 3 feb 2014, at 11:59, Dmitry Samersoff wrote: > >> Staffan, >> >> All items under the try-catch exist only in the server (C2 compiler) VM, so if >> one of them is missing in the target VM, the other items are missing as well. > > Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821? > > /Staffan > > >> >> I have no idea what happens later if SA tries to access one of these >> non-initialized items, but I don't see any errors in my home testing. >> >> -Dmitry >> >> On 2014-02-03 14:39, Staffan Larsen wrote: >>> It looks like this new version will leave some things uninitialized >>> if an item higher on the list throws an exception. I don't think that >>> is what you want.
>>> >>> /Staffan >>> >>> >>> On 31 jan 2014, at 20:42, Dmitry Samersoff >>> wrote: >>> >>>> Hi Everybody, >>>> >>>> Please review the fix: >>>> >>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/ >>>> >>>> sa.js tries to initialize all possible VM structures regardless of >>>> the type of VM we are running on >>>> >>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, >>>> Russia * I would love to change the world, but they won't give me >>>> the sources. >>> >> >> >> -- >> Dmitry Samersoff >> Oracle Java development team, Saint Petersburg, Russia >> * I would love to change the world, but they won't give me the sources. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From staffan.larsen at oracle.com Tue Feb 4 05:47:33 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 4 Feb 2014 14:47:33 +0100 Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not available in client VM In-Reply-To: <52F0EF49.4050003@oracle.com> References: <52EBFC89.3030809@oracle.com> <52EF7675.3020707@oracle.com> <52F0EF49.4050003@oracle.com> Message-ID: Looks good now! /Staffan On 4 feb 2014, at 14:46, Dmitry Samersoff wrote: > Staffan, > > You were right, only two classes are C2-specific. > > Here is the updated webrev. > > http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.02/ > > -Dmitry > > On 2014-02-03 16:13, Staffan Larsen wrote: >> >> On 3 feb 2014, at 11:59, Dmitry Samersoff wrote: >> >>> Staffan, >>> >>> All items under the try-catch exist only in the server (C2 compiler) VM, so if >>> one of them is missing in the target VM, the other items are missing as well. >> >> Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?
>> >> /Staffan >> >> >>> >>> I have no idea what happens later if SA tries to access one of these >>> non-initialized items, but I don't see any errors in my home testing. >>> >>> -Dmitry >>> >>> On 2014-02-03 14:39, Staffan Larsen wrote: >>>> It looks like this new version will leave some things uninitialized >>>> if an item higher on the list throws an exception. I don't think that >>>> is what you want. >>>> >>>> /Staffan >>>> >>>> >>>> On 31 jan 2014, at 20:42, Dmitry Samersoff >>>> wrote: >>>> >>>>> Hi Everybody, >>>>> >>>>> Please review the fix: >>>>> >>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/ >>>>> >>>>> sa.js tries to initialize all possible VM structures regardless of >>>>> the type of VM we are running on >>>>> >>>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, >>>>> Russia * I would love to change the world, but they won't give me >>>>> the sources. >>>> >>> >>> >>> -- >>> Dmitry Samersoff >>> Oracle Java development team, Saint Petersburg, Russia >>> * I would love to change the world, but they won't give me the sources. >> > > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources. From mikael.gerdin at oracle.com Tue Feb 4 06:44:55 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 04 Feb 2014 15:44:55 +0100 Subject: RFR (backport of): 8028391 - Make the Min/MaxHeapFreeRatio flags manageable In-Reply-To: <52F0B01C.2040600@oracle.com> References: <52F0B01C.2040600@oracle.com> Message-ID: <16972825.PJJixsIhPJ@mgerdin03> Hi Jesper, On Tuesday 04 February 2014 10.17.16 Jesper Wilhelmsson wrote: > Hi, > > The patch from jdk9/hs-gc to make MinHeapFreeRatio and MaxHeapFreeRatio > manageable flags and supported in ParallelGC applied cleanly to > jdk8u/hs-dev.
> > hsx/jdk7u60 required some manual labor to patch since the argument parsing > code had moved around, but the relevant logic was untouched so I didn't > have to change anything in the new code. For 7 I omitted some of the minor > cleanups that was done in the original patch. > > Webrev for 7: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.7u60/ The 7u60 backport looks good to me. > Webrev for 8: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.8u20/ > > (The 8u20 webrev is identical with the final version for 9.) In that case, you don't need to ask for reviews for the 8u20 backport, you can just go ahead and push it. /Mikael > > So kindly approve this change for backport to jdk8u20 and jdk7u60. > > The bugs: > jdk9: https://bugs.openjdk.java.net/browse/JDK-8028391 > jdk8: https://bugs.openjdk.java.net/browse/JDK-8033209 > jdk7: https://bugs.openjdk.java.net/browse/JDK-8028720 > > Reviews in this thread: > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-January/009187.ht > ml > > The jdk9 changeset: > http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/44315152d434 > > Thanks, > /Jesper From shanliang.jiang at oracle.com Tue Feb 4 07:07:13 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 04 Feb 2014 16:07:13 +0100 Subject: RFR 6791551: ExclusiveBind.java has a race condition In-Reply-To: <52F0ABC2.4040700@oracle.com> References: <52F0192B.9060104@oracle.com> <52F0AABC.7000904@oracle.com> <52F0ABC2.4040700@oracle.com> Message-ID: <52F10221.6010303@oracle.com> Jaroslav Bachorik wrote: > On 4.2.2014 09:54, shanliang wrote: >> Jaroslav, >> >> Your fix should work in most case, but is it better and more reliable to >> wait a VM event as suggested in the bug? even your timeout is adapted to >> the test time factory, but the solution still depends to a fixed timeout >> and a fixed line out. > > Well, if I get the test logic correctly it is supposed to test that > the agent blocks the port even when no client has connected yet. 
> Connecting to the agent and waiting for the event would change the > thing the test checks, actually. You are right that the test should not attach a VM before launching the second debuggee. Let's hope that 5000 * Utils.TIMEOUT_FACTOR works for all testing machines. Looks OK. Thanks, Shanliang > > -JB- > >> >> Shanliang >> >> Jaroslav Bachorik wrote: >>> Please, review the following test fix: >>> >>> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551 >>> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00 >>> >>> The fix prevents the situation when the first debuggee has not managed >>> to finish its intialization while the second one is started up thus >>> making the port available for the second debuggee and failing the test. >>> >>> The patch is using the library methods to configure and launch the >>> debuggee and the test waits for the well known string to appear in the >>> first debuggee output before attempting to launch the second debuggee. >>> >>> Thanks, >>> >>> -JB- >> > From bengt.rutisson at oracle.com Tue Feb 4 07:18:34 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 04 Feb 2014 16:18:34 +0100 Subject: RFR (backport of): 8028391 - Make the Min/MaxHeapFreeRatio flags manageable In-Reply-To: <52F0B01C.2040600@oracle.com> References: <52F0B01C.2040600@oracle.com> Message-ID: <52F104CA.3030702@oracle.com> Hi Jesper, I browsed the 7u60 back port and it looks fine as far as I can tell. I did not look at the 8u20 back port since I don't think you need a review if it applied cleanly. One small detail that you probably should not change for these back ports but that I should have noticed in the original patch is the error message in Arguments::verify_MinHeapFreeRatio(). It says "MinHeapFreeRatio must have a value between 0 and 100" but in fact we only allow 0-99. I guess this is probably not worth fixing. Not even for 9. But I just wanted to mention it since I should have caught it in the original review. 
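To make the mismatch concrete, here is a minimal sketch of a range check whose message is derived from the same bounds it enforces, so the two cannot drift apart. This is not the HotSpot code (Arguments::verify_MinHeapFreeRatio() is C++ and is not reproduced here); the class `RatioCheck`, its method, and the constants are hypothetical, and the 0-99 bound is simply taken from the remark above.

```java
// Hypothetical illustration only; not the actual HotSpot argument checking.
public class RatioCheck {
    static final int MIN = 0;
    static final int MAX = 99; // upper bound as described in the remark above

    // Returns null when the value is accepted, otherwise an error message
    // that names exactly the range being enforced.
    static String verify(String flagName, int value) {
        if (value < MIN || value > MAX) {
            return flagName + " must have a value between " + MIN + " and " + MAX;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(verify("MinHeapFreeRatio", 99));   // accepted, prints null
        System.out.println(verify("MinHeapFreeRatio", 100));  // rejected, prints the message
    }
}
```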
Thanks, Bengt On 2/4/14 10:17 AM, Jesper Wilhelmsson wrote: > Hi, > > The patch from jdk9/hs-gc to make MinHeapFreeRatio and > MaxHeapFreeRatio manageable flags and supported in ParallelGC applied > cleanly to jdk8u/hs-dev. > > hsx/jdk7u60 required some manual labor to patch since the argument > parsing code had moved around, but the relevant logic was untouched so > I didn't have to change anything in the new code. For 7 I omitted some > of the minor cleanups that was done in the original patch. > > Webrev for 7: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.7u60/ > Webrev for 8: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.8u20/ > > (The 8u20 webrev is identical with the final version for 9.) > > So kindly approve this change for backport to jdk8u20 and jdk7u60. > > The bugs: > jdk9: https://bugs.openjdk.java.net/browse/JDK-8028391 > jdk8: https://bugs.openjdk.java.net/browse/JDK-8033209 > jdk7: https://bugs.openjdk.java.net/browse/JDK-8028720 > > Reviews in this thread: > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-January/009187.html > > > The jdk9 changeset: > http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/44315152d434 > > Thanks, > /Jesper From daniel.daugherty at oracle.com Tue Feb 4 07:48:10 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 04 Feb 2014 08:48:10 -0700 Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) In-Reply-To: <52F0CB50.3040703@oracle.com> References: <52F0CB50.3040703@oracle.com> Message-ID: <52F10BBA.8060504@oracle.com> On 2/4/14 4:13 AM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-8032223 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ > src/share/vm/prims/jvmtiEnvBase.hpp No comments. 
src/share/vm/prims/jvmtiEnv.cpp No comments beyond David's tweak to the comment. For future work... Looks like these VM ops also need the liveness check on the target JavaThread: VM_GetOwnedMonitorInfo VM_GetStackTrace JvmtiEnv::GetStackTrace() looks like it has the same is_thread_fully_suspended() flaw... Dan > > Summary: > > This is the second round of review for this issue. > But it was decided that the JDK-8032223 must be used to cover it > instead of the JDK-6471769. > The 8032223 was initially closed as a dup of 6471769 but it has been > re-open now. > > There is a general issue in the suspend equivalent condition mechanism: > Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() > may return different results: > - 1-st: true > - 2-nd: false > > This suspend equivalent issue is covered by another bug: > https://bugs.openjdk.java.net/browse/JDK-6280037 > > The bug to fix in this review is a specific manifestation of the > 6280037 > in the JVMTI GetFrameCount() that has a major impact on the SQE > nightly. > It is on the Test Stabilization radar as well as the 6280037. > There are many tests intermittently failing because of this. > I've also decided to fix the same issue in the JVMTI > GetFrameLocation() as well. > > The JVMTI GetFrameCount() spec tells: > "If this function is called for a thread actively executing > bytecodes (for example, > not the current thread and not suspended), the information > returned is transient." > > So, it is Ok to call the GetFrameCount() for the non-suspended > target thread. > To achieve safety, the frame count for non-suspended threads is > calculated at a safepoint. > It should be Ok and more safe to do the same for suspended threads > as well. > There should be no big performance impact because it is already on a > slow path. > It is still important to avoid safepointing when the target thread > is current. 
> > The bug 6280037 should go out of the Test Stabilization radar > (remove the svc-nightly label) > as the most of the impacted tests must be covered by the 8032223. > > > Testing: > In progress: > - nsk.jvmti, nsk.jdi, nsk.jdwp > - JTreg com/sun/jdi > > > Thanks, > Serguei > From daniel.daugherty at oracle.com Tue Feb 4 10:52:32 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 04 Feb 2014 11:52:32 -0700 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52F08B34.1010300@oracle.com> References: <52ED3F28.5070806@oracle.com> <52EF874A.8020603@oracle.com> <52EFAE1C.3050202@oracle.com> <52F08B34.1010300@oracle.com> Message-ID: <52F136F0.5050408@oracle.com> On 2/3/14 11:39 PM, David Holmes wrote: > On 4/02/2014 12:56 AM, Daniel D. Daugherty wrote: >> Adding Dave Dice to this thread... >> >> On 2/3/14 5:10 AM, David Holmes wrote: >>> Hi Dan, >>> >>> On 2/02/2014 4:38 AM, Daniel D. Daugherty wrote: >>>> Greetings, >>>> >>>> I have a fix ready for the following bug: >>>> >>>> 8028073 race condition in ObjectMonitor implementation causing >>>> deadlocks >>>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>>> >>>> On the surface, this is a very simple fix that relocates a few >>>> lines of >>>> code, relocates and rewrites the comments associated with that code >>>> and >>>> adds several new comments. >>>> >>>> Of course, in reality, the issue is much more complicated, but I'm >>>> hoping to make it easy for anyone not acquainted with this issue to >>>> understand what's going on. >>> >>> Let's spare them the trouble - Reviewed :) >> >> Thanks for the review! >> >> >>> I'm still troubled by the rationale for the original placement of this >>> code. It makes sense to issue the unpark prior to the park() call but >>> on the other hand there is no agent/event code in that path that would >>> have consumed the original unpark. ?? 
>> >> I think I've acknowledged that concern in the "Risks" section below: >> >> - while the old code was not effective for the hang that is being >> fixed with this bug, it is possible that the old code prevented >> a different bug in the successor protocol from manifesting >> >> Hopefully, Dave Dice will chime in and share the history... > > For the record Dave did chime in off-list and this code block dates > back to the JVMPI implementation which did > jvmpi::post_monitor_contended_exit_event at the end of exit(), which > is called just before the code block in question. That was back in JDK > 5. By JDK 6 JVMPI had been stripped out but the code block remained. David, thanks for closing the loop on this one. Dan > >>> Thanks, I know you've really enjoyed this one :) >> >> I actually did enjoy this one from a 'challenge to the mind' POV. >> But I have to say that the intricacies of the successor protocol >> gave me headaches... Hopefully, my successor.notes will help anyone >> else that is crazy enough to wander down this path... > > Indeed! 
> > Thanks, > David > > >> Dan >> >> >>> >>> David >>> ----- >>> >>>> Here are the JDK9 webrev URLs: >>>> >>>> OpenJDK: >>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>>> >>>> Oracle internal: >>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>>> >>>> >>>> The simple summary: >>>> >>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>>> it is possible for a JVM/TI monitor event handler to accidentally >>>> consume a ParkEvent.unpark() call meant for Java Monitor layer >>>> - the original code fix was made on 2005.07.04 using this bug ID: >>>> https://bugs.openjdk.java.net/browse/JDK-5030359 >>>> - it's the right fix, but it's in the wrong place >>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>>> event handler is called because it is that event handler >>>> that can cause the hang >>>> >>>> >>>> Testing >>>> ------- >>>> >>>> - a new StessMonitorWait test has been created that reliably >>>> reproduces the hang in JDK[6789]; see the bug's gory details >>>> for the specific versions where the hang has been reproduced >>>> - the test reliably reproduces the hang in 5 seconds on my >>>> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>>> the hang reliably on other machines >>>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>>> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>>> bits did not reproduce the hang >>>> - JPRT test job >>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>>> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>>> and Windows-X64: >>>> - vm.quick >>>> - Kitchensink (bigapps) >>>> - Weblogic+medrec (bigapps) >>>> - runThese (bigapps) >>>> >>>> >>>> The Gory Details Start Here >>>> --------------------------- >>>> >>>> This is the old location of block of code that's being moved: >>>> >>>> src/share/vm/runtime/objectMonitor.cpp: >>>> >>>> 1440 void ObjectMonitor::wait(jlong millis, 
bool interruptible, >>>> TRAPS) { >>>> >>>> 1499 exit (true, Self) ; // exit the monitor >>>> >>>> 1513 if (node._notified != 0 && _succ == Self) { >>>> 1514 node._event->unpark(); >>>> 1515 } >>>> >>>> >>>> This is the new location of block of code that's being moved: >>>> >>>> src/share/vm/runtime/objectMonitor.cpp: >>>> >>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, >>>> TRAPS) { >>>> >>>> 1601 if (JvmtiExport::should_post_monitor_waited()) { >>>> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>>> OS_TIMEOUT); >>>> >>>> 1604 if (node._notified != 0 && _succ == Self) { >>>> >>>> 1620 node._event->unpark(); >>>> 1621 } >>>> >>>> >>>> The Risks >>>> --------- >>>> >>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >>>> is enabled: >>>> - previously it was always executed >>>> - while the old code was not effective for the hang that is being >>>> fixed with this bug, it is possible that the old code prevented >>>> a different bug in the successor protocol from manifesting >>>> - thorough analysis of the successor protocol did not reveal a >>>> case where the old code was needed in the old location >>>> - Thorough analysis indicates that the other JVM/TI monitor events >>>> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>>> - the successor protocol is complicated and the analysis could >>>> be wrong when certain options are used >>>> - comments were added to each location where a JVM/TI monitor >>>> event handler is called documenting why a fix like this one >>>> is not needed there >>>> - if the analysis is wrong, the new comments show where a new >>>> code change would be needed >>>> >>>> >>>> The Scenario >>>> ------------ >>>> >>>> I've created a scenario that reproduces this hang: >>>> >>>> T1 - enters monitor and calls monitor.wait() >>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor >>>> T3 - enters and exits the monitor >>>> T4 - enters the monitor, delays for 5 
seconds, exits the monitor >>>> >>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>>> handler that: enters a raw monitor, waits for 1ms, exits a raw >>>> monitor. >>>> >>>> Here are the six events necessary to make this hang happen: >>>> >>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>>> field, >>>> but >>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor >>>> again. >>>> >>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>>> >>>> // KEY-EVENT-3a: T3 made T1 the successor >>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed >>>> it. >>>> >>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>>> >>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>>> >>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event >>>> handler >>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >>>> stuck. >>>> >>>> >>>> This bug is intertwined with: >>>> >>>> - The ObjectMonitor successor protocol >>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI >>>> RawMonitors >>>> >>>> There is a very long successor.notes attachment to JDK-8028073 that >>>> attempts to describe the ObjectMonitor successor protocol. It's good >>>> for putting pretty much anyone to sleep. >>>> >>>> Since this hang reproduces back to JDK6, this bug is taking the easily >>>> backported solution of moving the original fix to the right location. 
>>>> The following new bug has been filed for possible future work in this >>>> area by the Serviceability Team: >>>> >>>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>>> >>>> >>>> The Symptoms >>>> ------------ >>>> >>>> With intermittent hangs like this, it is useful to know what to look >>>> for in order to determine if you are running into this issue: >>>> >>>> - if you aren't using a debugger or a profiler or some other >>>> JVM/TI agent, then this hang is not the same as yours >>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>>> event handler, then this hang is not the same as yours >>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>>> JVM/TI RawMonitors, then this hang is not the same as yours >>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>>> back into Java code, then you might just be insane and this >>>> hang might be similar to yours. However, using a Java callback >>>> in an event handler is an even bigger problem/risk so fix that >>>> first. 
>>>> - if you have one or more threads blocked like this and making no >>>> progress, then this hang might be the same as yours: >>>> >>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for >>>> monitor entry [0xfffffd7fc0231000] >>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>> JavaThread state: _thread_blocked >>>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint >>>> _has_called_back 0 _at_poll_safepoint 0 >>>> JavaThread state: _thread_blocked >>>> at java.lang.Object.wait(Native Method) >>>> - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>> at java.lang.Object.wait(Object.java:502) >>>> at SMW_WorkerThread.run(StressMonitorWait.java:103) >>>> - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>> >>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for >>>> monitor entry [0xfffffd7fc0130000] >>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>> JavaThread state: _thread_blocked >>>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint >>>> _has_called_back 0 _at_poll_safepoint 0 >>>> JavaThread state: _thread_blocked >>>> at SMW_WorkerThread.run(StressMonitorWait.java:120) >>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>> >>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for >>>> monitor entry [0xfffffd7fc002f000] >>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>> JavaThread state: _thread_blocked >>>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint >>>> _has_called_back 0 _at_poll_safepoint 0 >>>> JavaThread state: _thread_blocked >>>> at SMW_WorkerThread.run(StressMonitorWait.java:139) >>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>> >>>> Key symptoms in thread T1: >>>> >>>> - had the object locked: >>>> >>>> locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>> >>>> - did an Object.wait(): >>>> >>>> waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>
>>>> - is blocked on reentry: >>>> >>>> waiting for monitor entry [0xfffffd7fc0231000] >>>> >>>> Key symptoms in thread T2: >>>> >>>> - is blocked waiting to lock the object: >>>> >>>> waiting for monitor entry [0xfffffd7fc0130000] >>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>> >>>> Key symptoms in thread T3: >>>> >>>> - is blocked waiting to lock the object: >>>> >>>> waiting for monitor entry [0xfffffd7fc002f000] >>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>> >> > From serguei.spitsyn at oracle.com Tue Feb 4 11:25:17 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 04 Feb 2014 11:25:17 -0800 Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) In-Reply-To: <52F0D30C.20401@oracle.com> References: <52F0CB50.3040703@oracle.com> <52F0D30C.20401@oracle.com> Message-ID: <52F13E9D.2040105@oracle.com> Thanks, David! Serguei On 2/4/14 3:46 AM, David Holmes wrote: > Hi Serguei, > > Looks okay to me. > > Minor nit: "use a vm-op for safety" is actually "use a vm-safepoint-op > for safety". Not all VM ops need involve a safepoint. > > David > > On 4/02/2014 9:13 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-8032223 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ >> >> >> >> Summary: >> >> This is the second round of review for this issue. >> But it was decided that the JDK-8032223 must be used to cover it >> instead of the JDK-6471769. >> The 8032223 was initially closed as a dup of 6471769 but it has been >> re-opened now.
>> >> There is a general issue in the suspend equivalent condition >> mechanism: >> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may >> return different results: >> - 1-st: true >> - 2-nd: false >> >> This suspend equivalent issue is covered by another bug: >> https://bugs.openjdk.java.net/browse/JDK-6280037 >> >> The bug to fix in this review is a specific manifestation of the >> 6280037 >> in the JVMTI GetFrameCount() that has a major impact on the SQE >> nightly. >> It is on the Test Stabilization radar as well as the 6280037. >> There are many tests intermittently failing because of this. >> I've also decided to fix the same issue in the JVMTI >> GetFrameLocation() as well. >> >> The JVMTI GetFrameCount() spec tells: >> "If this function is called for a thread actively executing >> bytecodes (for example, >> not the current thread and not suspended), the information >> returned is transient." >> >> So, it is Ok to call the GetFrameCount() for the non-suspended target >> thread. >> To achieve safety, the frame count for non-suspended threads is >> calculated at a safepoint. >> It should be Ok and more safe to do the same for suspended threads as >> well. >> There should be no big performance impact because it is already on a >> slow path. >> It is still important to avoid safepointing when the target thread is >> current. >> >> The bug 6280037 should go out of the Test Stabilization radar (remove >> the svc-nightly label) >> as the most of the impacted tests must be covered by the 8032223. 
>> >> >> Testing: >> In progress: >> - nsk.jvmti, nsk.jdi, nsk.jdwp >> - JTreg com/sun/jdi >> >> >> Thanks, >> Serguei >> From serguei.spitsyn at oracle.com Tue Feb 4 12:45:00 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 04 Feb 2014 12:45:00 -0800 Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) In-Reply-To: <52F10BBA.8060504@oracle.com> References: <52F0CB50.3040703@oracle.com> <52F10BBA.8060504@oracle.com> Message-ID: <52F1514C.1050405@oracle.com> Thanks, Dan! Some comments below. On 2/4/14 7:48 AM, Daniel D. Daugherty wrote: > On 2/4/14 4:13 AM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-8032223 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ >> > > src/share/vm/prims/jvmtiEnvBase.hpp > No comments. > > src/share/vm/prims/jvmtiEnv.cpp > No comments beyond David's tweak to the comment. Will fix it before push. > For future work... > > Looks like these VM ops also need the liveness check on the > target JavaThread: > > VM_GetOwnedMonitorInfo > VM_GetStackTrace > > JvmtiEnv::GetStackTrace() looks like it has the same > is_thread_fully_suspended() flaw... I know about this. There are even more issues: GetCurrentContendedMonitor GetOwnedMonitorStackDepthInfo NotifyFramePop I'll check if we can use an existing bug to add this extra request. Otherwise, will file new one. Thanks, Serguei > > Dan > > > >> >> Summary: >> >> This is the second round of review for this issue. >> But it was decided that the JDK-8032223 must be used to cover it >> instead of the JDK-6471769. >> The 8032223 was initially closed as a dup of 6471769 but it has >> been re-open now. 
>> >> There is a general issue in the suspend equivalent condition >> mechanism: >> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() >> may return different results: >> - 1-st: true >> - 2-nd: false >> >> This suspend equivalent issue is covered by another bug: >> https://bugs.openjdk.java.net/browse/JDK-6280037 >> >> The bug to fix in this review is a specific manifestation of the >> 6280037 >> in the JVMTI GetFrameCount() that has a major impact on the SQE >> nightly. >> It is on the Test Stabilization radar as well as the 6280037. >> There are many tests intermittently failing because of this. >> I've also decided to fix the same issue in the JVMTI >> GetFrameLocation() as well. >> >> The JVMTI GetFrameCount() spec tells: >> "If this function is called for a thread actively executing >> bytecodes (for example, >> not the current thread and not suspended), the information >> returned is transient." >> >> So, it is Ok to call the GetFrameCount() for the non-suspended >> target thread. >> To achieve safety, the frame count for non-suspended threads is >> calculated at a safepoint. >> It should be Ok and more safe to do the same for suspended threads >> as well. >> There should be no big performance impact because it is already on >> a slow path. >> It is still important to avoid safepointing when the target thread >> is current. >> >> The bug 6280037 should go out of the Test Stabilization radar >> (remove the svc-nightly label) >> as the most of the impacted tests must be covered by the 8032223. 
>> >> >> Testing: >> In progress: >> - nsk.jvmti, nsk.jdi, nsk.jdwp >> - JTreg com/sun/jdi >> >> >> Thanks, >> Serguei >> > From vladimir.kozlov at oracle.com Tue Feb 4 15:45:29 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 04 Feb 2014 15:45:29 -0800 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: <52F01569.3070201@oracle.com> References: <52F01569.3070201@oracle.com> Message-ID: <52F17B99.6080405@oracle.com> Changes look reasonable. Thank, Vladimir On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-8025841 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 > > > Summary: > > The fix contributed by Oleg Mazurov to improve profiling data quality. > It moves the "vtable stub" dynamic code notification to the right place. > I've already reviewed the fix, and it looks good to me. > > Bug report description: > > "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled > when > a new chunk of memory for subsequent vtable and itable stubs is > allocated. > That chunk is uninitialized (contains zeros or garbage) although > due to the fact > that the actual event delivery is deferred, at least one vtable > comes out right. > > This event should describe an individual vtable/itable stub (base > address and size) > and only after it's been created (memory is actually populated with > code). > Where VM diagnostic messages about vtable/itable stubs are issued upon > -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI > events as well. > > Getting vtables/itables right is important in the context of > performance analysis as > that dynamically generated code may accumulate quite noticeable CPU > time > (especially itabes), sometimes larger than the actual Java methods > called." 
> > > Testing: > Oleg tested it in the Oracle Studio Performance Analyzer environment. > nsk.jvmti, nsk.jdi, nsk.jdwp, > In progress: Jtreg com/sun/jdi, java/lang/instrument > > > Thanks, > Serguei > From serguei.spitsyn at oracle.com Tue Feb 4 16:14:48 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 04 Feb 2014 16:14:48 -0800 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: <52F17B99.6080405@oracle.com> References: <52F01569.3070201@oracle.com> <52F17B99.6080405@oracle.com> Message-ID: <52F18278.309@oracle.com> Thanks a lot, Vladimir! Serguei On 2/4/14 3:45 PM, Vladimir Kozlov wrote: > Changes look reasonable. > > Thank, > Vladimir > > On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-8025841 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 >> >> >> >> Summary: >> >> The fix contributed by Oleg Mazurov to improve profiling data >> quality. >> It moves the "vtable stub" dynamic code notification to the right >> place. >> I've already reviewed the fix, and it looks good to me. >> >> Bug report description: >> >> "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled >> when >> a new chunk of memory for subsequent vtable and itable stubs is >> allocated. >> That chunk is uninitialized (contains zeros or garbage) although >> due to the fact >> that the actual event delivery is deferred, at least one vtable >> comes out right. >> >> This event should describe an individual vtable/itable stub (base >> address and size) >> and only after it's been created (memory is actually populated with >> code). >> Where VM diagnostic messages about vtable/itable stubs are >> issued upon >> -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI >> events as well. 
>> >> Getting vtables/itables right is important in the context of >> performance analysis as >> that dynamically generated code may accumulate quite noticeable CPU >> time >> (especially itables), sometimes larger than the actual Java methods >> called." >> >> >> Testing: >> Oleg tested it in the Oracle Studio Performance Analyzer environment. >> nsk.jvmti, nsk.jdi, nsk.jdwp, >> In progress: Jtreg com/sun/jdi, java/lang/instrument >> >> >> Thanks, >> Serguei >> From staffan.larsen at oracle.com Wed Feb 5 00:38:27 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 5 Feb 2014 09:38:27 +0100 Subject: RFR(S): JDK-8029808 com/sun/jdi/ProcessAttachTest.sh times out In-Reply-To: <52EFFCB5.2030004@oracle.com> References: <79C6D4D7-F8A0-4A33-AF7A-D1007F227A86@oracle.com> <52EA7F46.5070405@oracle.com> <16D6BA50-82FE-41F6-8423-FE218D397C58@oracle.com> <52EA84DC.5090008@oracle.com> <52EA872E.7070209@oracle.com> <52EFFCB5.2030004@oracle.com> Message-ID: Dmitry, Alan: Thanks! On 3 feb 2014, at 21:31, Dmitry Samersoff wrote: > Staffan, > > Looks good for me! > > -Dmitry > > On 2014-02-03 23:18, Staffan Larsen wrote: >> OK, my last try at this was very broken - I'm glad it didn't pass the review. >> >> Here is a new try: http://cr.openjdk.java.net/~sla/8029808/webrev.01/ >> >> As Alan says, we can't use jps on the (second) debuggee because it is started with suspend=y. In that mode it is waiting for a debugger to attach before it has started executing any Java code and before jps is able to identify it. >> >> I did find a race in the test, though. When the first debuggee is started (suspend=n) there is a race where the debugger process can attach and finish before the debuggee has written out the portfile. In that case ShutdownDebuggee will fail. I have seen this failure mode happen and I can force it by inserting a Thread.sleep into the debuggee. I solved this by waiting for the port file to appear in the suspend=n case as well.
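The wait-for-the-port-file idea in the paragraph above can be sketched as a small polling helper. This is only an illustration in Java: the actual test is a shell script, and the class name PortFileWaiter, the method waitForPortFile, and the 50 ms poll interval are hypothetical choices, not taken from the webrev. It also assumes that a non-empty file means the debuggee has finished writing it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

// Hypothetical helper, not part of ProcessAttachTest.sh.
public class PortFileWaiter {

    /**
     * Polls until portFile exists and is non-empty, or until the timeout
     * elapses. Returns true if the file showed up in time.
     */
    public static boolean waitForPortFile(Path portFile, Duration timeout)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (isReady(portFile)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the debuggee never wrote the file
            }
            Thread.sleep(50); // assumed poll interval
        }
    }

    // A file counts as ready once it is a regular file with some content.
    private static boolean isReady(Path p) {
        try {
            return Files.isRegularFile(p) && Files.size(p) > 0;
        } catch (IOException e) {
            return false; // file vanished or is unreadable; poll again
        }
    }
}
```

A bounded wait like this replaces an unconditional sleep: the caller proceeds as soon as the file is there and gets an explicit failure when it never appears.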
>> >> There is also a potential problem where both debuggees use the same output file and we may think the second debuggee has started before it actually has (because the output file isn't empty). I solved that by creating different output files for the two test cases. >> >> I've also redirected all error streams so any errors aren't lost. >> >> Thanks, >> /Staffan >> >> On 30 jan 2014, at 18:09, Alan Bateman wrote: >> >>> On 30/01/2014 16:59, Dmitry Samersoff wrote: >>>> Staffan, >>>> >>>> No it's not to your code. Sorry for not being clear enough. >>>> >>>> You cleaned up unix code, but windows code remains bad. Particularly, it >>>> has bad unconditional sleep 2 >>>> >>>> I would propose replacing *windows manipulation with CYGWIN/MKS pids* with a >>>> call to JPS that returns the windows pid >>>> >>> I have a vague memory that we had to use ps to get the pid because of the suspend=y test which causes the debuggee to suspend during startup. I might be wrong on this of course, it was a long time ago. >>> >>> -Alan >> > > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources. From erik.helin at oracle.com Wed Feb 5 02:40:30 2014 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 05 Feb 2014 11:40:30 +0100 Subject: JDK-7090324: gclog rotation via external tool In-Reply-To: <52EA33B3.6080409@lab.ntt.co.jp> References: <52483BDB.8040206@ysfactory.dip.jp> <52496A21.8080608@ysfactory.dip.jp> <527CDD56.7080106@ysfactory.dip.jp> <52A09642.4030609@ysfactory.dip.jp> <52E26FA2.40909@ysfactory.dip.jp> <0E26045D-F7F9-49BC-AB36-A42C1DC6E64E@oracle.com> <52E91000.9010600@ysfactory.dip.jp> <52E91AAA.3060008@oracle.com> <52E9248D.2090108@ysfactory.dip.jp> <52E9FA5B.6010306@lab.ntt.co.jp> <5B189F0A-7408-47C7-9719-DC2990355209@oracle.com> <52EA33B3.6080409@lab.ntt.co.jp> Message-ID: <52F2151E.4040805@oracle.com> Hi Yasumasa, I've looked through the latest patch, it is much better!
I just have two comments:

- ostream.hpp:
  Why did you add GCLogFileSize != 0 in should_rotate? The old check
  just checked that _bytes_written > GCLogFileSize.

- TestGCLogRotationViaJcmd.java:
  Could you use the helper class JDKToolLauncher to start jcmd? The
  code would then be slightly easier to read:

      for (int times = 1; times < NUM_LOGS; times++) {
          // Run jcmd <pid> GC.rotate_log
          JDKToolLauncher jcmd = JDKToolLauncher.create("jcmd")
                                                .addToolArg(pid)
                                                .addToolArg("GC.rotate_log");
          ProcessBuilder pb = new ProcessBuilder(jcmd.getCommand());

          // Make sure we didn't crash
          OutputAnalyzer output = new OutputAnalyzer(pb.start());
          output.shouldHaveExitValue(0);
      }

Thanks,
Erik

On 01/30/2014 12:12 PM, Yasumasa Suenaga wrote: > Hi Staffan, > > I've uploaded new webrev: > http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.03/ > > On 2014/01/30 17:23, Staffan Larsen wrote: >> Would it be possible for the Diagnostic Command to output the location >> of the rotated log? When invoking the command it would be good to get >> some kind of feedback. > > I changed rotate_log() to redirect messages to jcmd. > If GC.rotate_log is executed, we can get messages on jcmd console as below: > ------------ > $ jcmd 18976 GC.rotate_log > 18976: > 2014-01-30 19:59:39 GC log rotation request has been received. Saved as > test.log.0 > 2014-01-30 19:59:39 GC log file created test.log.1 > ------------ > > >> test/gc/7090324/Test7090324.java: >> - I think this needs to have the Oracle copyright notice as well. >> - Tests should now use descriptive names, not bug numbers: >> https://wiki.openjdk.java.net/display/HotSpot/Naming+HotSpot+JTReg+Tests >> - nits: lots of missing spaces before '{', and after 'for', 'if' >> - line 47: you don't need to clean up old files, jtreg will give you a >> fresh scratch directory to run in > > I've fixed. > Could you review again?
> > > Thanks, > > Yasumasa > >> /Staffan >> >> >> >> On 30 jan 2014, at 08:08, Yasumasa >> Suenaga wrote: >> >>> Hi Erik, Staffan, >>> >>> I've uploaded new webrev. Could you review this ? >>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.02/ >>> >>> This patch includes fixes from comments of Staffan and Erik. >>> >>> And I created new test of this patch as Test7090324 . >>> This test works fine with jtreg. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> On 2014/01/30 0:55, Yasumasa Suenaga wrote: >>>> Hi Erik, >>>> >>>> On 2014/01/30 0:13, Erik Helin wrote: >>>>> Hi Yasumasa, >>>>> >>>>> (have to use HTML email to get a width of more than 78 chars, sorry) >>>>> >>>>> why did you change the code in arguments.cpp in the method >>>>> check_gclog_consistency? >>>> >>>> In current implementation, check_gclog_consistency() checks three >>>> parameters: >>>> >>>> - GC log filename >>>> - NumberOfGCLogFiles >>>> - GCLogFileSize >>>> >>>> My customer uses external trigger "ONLY" for rotating logs. >>>> If they want to do that, GCLogFileSize is not needed. >>>> >>>> >>>>> Next, the gcLogFileStream::rotate_log method now does a lot of things. >>>>> Could you separate out the first block into a new method, >>>>> gcLogFileStream::should_rotate(bool force)? >>>>> >>>>> This way, the code would read: >>>>> >>>>>> bool gcLogFileStream::should_rotate(bool force) { >>>>>> return force || _bytes_written >= GCLogFileSize; >>>>>> } >>>>>> >>>>>> void gcLogFileStream::rotate_log(bool force) { >>>>>> char time_msg[FILENAMEBUFLEN]; >>>>>> char time_str[EXTRACHARLEN]; >>>>>> char current_file_name[FILENAMEBUFLEN]; >>>>>> char renamed_file_name[FILENAMEBUFLEN]; >>>>>> >>>>>> if (!should_rotate(force)) { >>>>>> return; >>>>>> } >>>>>> >>>>>> ... >>>>>> } >>>>> >>>>> Could you please update your patch? >>>> >>>> I will do that. >>>> >>>> >>>>> There is a new empty line in the rotate_log method: >>>>> >>>>>> } >>>>>> + >>>>>> #ifdef ASSERT >>>>> >>>>> could you please remove it?
>>>> >>>> I will do that. >>>> >>>> >>>>> The logging change in rotate_log uses a different kind of if/else >>>>> syntax >>>>> than the rest of the file: >>>>> >>>>>> if (force) { >>>>>> ... >>>>>> } >>>>>> else { >>>>>> ... >>>>>> } >>>>> >>>>> The other if/else statements in the file use: >>>>> >>>>>> if (cond) { >>>>>> ... >>>>>> } else { >>>>>> ... >>>>>> } >>>>> >>>>> Could you please update your change to use the same if/else syntax? >>>> >>>> I will do that. >>>> >>>> >>>>> This part of the change duplicates the code: >>>>> >>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log rotation >>>>> request has been received. Saved as %s\n", >>>>> + os::local_time_string((char *)time_str, sizeof(time_str)), >>>>> + renamed_file_name); >>>>> + } >>>>> + else { >>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log file has >>>>> reached the" >>>>> " maximum size. Saved as %s\n", >>>>> - os::local_time_string((char *)time_str, sizeof(time_str)), >>>>> + os::local_time_string((char *)time_str, sizeof(time_str)), >>>>> renamed_file_name); >>>>> >>>>> Could you instead just change the message, as in: >>>>> >>>>>> const char* msg = force ? "%s GC log rotation request has been >>>>>> received. Saved as %s\n" : >>>>>> "%s GC log file has reached the maximum size. Saved as %s\n"; >>>>>> jio_snprintf(time_msg, sizeof(time_msg), msg, >>>>>> os::local_time_string((char *)time_str, sizeof(time_str)), >>>>>> renamed_file_name); >>>> >>>> I will do that. >>>> >>>> >>>>> The declaration of rotate_log in ostream.hpp still uses the old >>>>> variable name is_force, it should use force, >>>>> just as the definition. >>>> >>>> Sorry, I will fix it. >>>> >>>> >>>>> Finally, could you add a test that tests your change? Have a look >>>>> at the other tests >>>>> in hotspot/test/gc to see how you can do it >>>>> (you might want to use some functionality from >>>>> hotspot/test/testlibrary). >>>> >>>> I found three tests as following: >>>> >>>> [ysuenaga at xelvis test]$ find .
-iname "*jcmd*" >>>> ./runtime/NMT/JcmdWithNMTDisabled.java >>>> ./runtime/NMT/JcmdScale.java >>>> ./gc/TestG1ZeroPGCTJcmdThreadPrint.java >>>> >>>> I understand that these tests checks output (stdout/stderr) with >>>> OutputAnalyzer. >>>> However, my patch affects target VM. So I guess current test cannot >>>> check >>>> that GC log rotation is succeeded. >>>> >>>> Should I make test which checks exit value of jcmd ? >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>>> Thanks, >>>>> Erik >>>>> >>>>> On 2014-01-29 15:28, Yasumasa Suenaga wrote: >>>>>> Hi Staffan, >>>>>> >>>>>> Thank you for reviewing! >>>>>> I've uploaded new webrev. >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.01/ >>>>>> >>>>>> On 2014/01/29 20:56, Staffan Larsen wrote: >>>>>>> Yasumasa, >>>>>>> >>>>>>> src/share/vm/runtime/arguments.cpp >>>>>>> no comments >>>>>>> >>>>>>> src/share/vm/runtime/safepoint.cpp >>>>>>> I was surprised that gc log size was checked after each safe >>>>>>> point. That seems an uneccssary burden to place on a safe point. >>>>>>> Instead we should switch to a periodic task that checks the gc >>>>>>> log size. However, this is unrelated to you patch, so please >>>>>>> ignore for now. >>>>>> >>>>>> Agree. >>>>>> However, I think that PeriodicTask also is not appropriate for this. >>>>>> >>>>>> Size of GC log file is increased when GC is occurred. >>>>>> So I think rotate function should be called at entry of each GC >>>>>> events >>>>>> e.g. VM_GC_Operation::doit_prologue() etc... >>>>>> >>>>>> >>>>>>> src/share/vm/runtime/vm_operations.hpp >>>>>>> line 402: nit: missing space before { >>>>>> >>>>>> Fixed. >>>>>> >>>>>> >>>>>>> line 405: I think ?force? is a better name than ?is_force? >>>>>> >>>>>> I removed "force" option from DCmd. >>>>>> So I removed this from VMOperation. >>>>>> >>>>>> >>>>>>> src/share/vm/services/diagnosticCommand.cpp >>>>>>> line 666: What does this do without the -force option? 
It looks >>>>>>> to me that the non-force case will happen after each safe point >>>>>>> (see above) and that there is no need to ever do this from a >>>>>>> diagnostic command. Can we remove the option? >>>>>> >>>>>> Indeed. >>>>>> I removed "force" option. >>>>>> >>>>>> >>>>>>> line 677: ?Target VM does not support GC log file rotation." >>>>>> >>>>>> Fixed. >>>>>> >>>>>> >>>>>>> nits: some missing spaces before ?{' and after ?if' >>>>>> >>>>>> Fixed. >>>>>> >>>>>> >>>>>>> src/share/vm/services/diagnosticCommand.hpp >>>>>>> I think RotateGCLogDCmd should require the ?control? permission >>>>>>> when executed via JMX, so please add: >>>>>>> static const JavaPermission permission() { >>>>>>> JavaPermission p = {"java.lang.management.ManagementPermission", >>>>>>> "control", NULL}; >>>>>>> return p; >>>>>>> } >>>>>> >>>>>> Added. >>>>>> >>>>>> >>>>>>> line 394: Maybe ?Force the GC log file to be rotated.? is a >>>>>>> better description? >>>>>> >>>>>> Fixed. >>>>>> >>>>>> >>>>>>> src/share/vm/utilities/ostream.cpp >>>>>>> line 662: I think ?force? is a better name than ?is_force? >>>>>>> line 668: The comment says exactly the same thing as the code so >>>>>>> I think it can be skipped >>>>>>> line 671: ?GC log file rotation occurs by external trigger ONLY." >>>>>>> line 675: "not need? -> ?no need? >>>>>>> line 718: "GC log rotation request has been received? >>>>>> >>>>>> Fixed them. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>> src/share/vm/utilities/ostream.hpp >>>>>>> no comments >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> /Staffan >>>>>>> >>>>>>> On 24 jan 2014, at 14:50, Yasumasa >>>>>>> Suenaga wrote: >>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I've created webrev: >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.00/ >>>>>>>> >>>>>>>> This patch works fine on current jdk9/hs-rt . >>>>>>>> Could you review this? >>>>>>>> >>>>>>>> >>>>>>>> I am just an Author. So I need a sponsor. >>>>>>>> Could you help me? 
>>>>>>>> >>>>>>>> >>>>>>>> Please cooperate. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2013/12/06 0:05, Yasumasa Suenaga wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Did someone read my email? >>>>>>>>> I really hope to merge "JDK-7090324: gclog rotation via >>>>>>>>> external tool" . >>>>>>>>> >>>>>>>>> I hear that someone need this RFE. So I want to discuss about >>>>>>>>> this. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> On 2013/11/08 21:47, Yasumasa Suenaga wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> Did someone read my mail? >>>>>>>>>> >>>>>>>>>> I think that this RFE helps us to watch Java heap on >>>>>>>>>> production system. >>>>>>>>>> Also I think this RFE is able to be part of the JEP 158 >>>>>>>>>> (Unified JVM Logging) . >>>>>>>>>> >>>>>>>>>> I want to update this RFE in JDK Bug System, but I don't have >>>>>>>>>> account. >>>>>>>>>> So I've posted email at first. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2013/09/30 21:10, Yasumasa Suenaga wrote: >>>>>>>>>>> In previous email, I've attached new patch for this RFE. >>>>>>>>>>> It works fine with current hsx. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> On 2013/09/29 23:40, Yasu wrote: >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> We are using "logrotate" tool on RHEL for various log rotation. >>>>>>>>>>>> Current HotSpot has gclog rotation function for log size base, >>>>>>>>>>>> however I need to rotate gc log synchronizing with logrotate >>>>>>>>>>>> tool. >>>>>>>>>>>> >>>>>>>>>>>> So I've created RFE as "JDK-7090324: gclog rotation via >>>>>>>>>>>> external tool" . >>>>>>>>>>>> And Sr. Engineering Manager in Oracle said he use the >>>>>>>>>>>> essence of my patch in one >>>>>>>>>>>> of the jcmd subcommands. 
>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-September/003274.html
>>>>>>>>>>>>
>>>>>>>>>>>> 2 years ago, I posted a patch for this RFE.
>>>>>>>>>>>> But this patch is too old to apply for current HotSpot.
>>>>>>>>>>>>
>>>>>>>>>>>> In last month, a similar discussion was appeared in ML.
>>>>>>>>>>>> So I think it's time to discuss this RFE.
>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-August/008029.html
>>>>>>>>>>>>
>>>>>>>>>>>> Please cooperate.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Yasumasa

From yasu at ysfactory.dip.jp Wed Feb 5 04:09:04 2014
From: yasu at ysfactory.dip.jp (Yasumasa Suenaga)
Date: Wed, 05 Feb 2014 21:09:04 +0900
Subject: JDK-7090324: gclog rotation via external tool
In-Reply-To: <52F2151E.4040805@oracle.com>
References: <52483BDB.8040206@ysfactory.dip.jp> <52496A21.8080608@ysfactory.dip.jp> <527CDD56.7080106@ysfactory.dip.jp> <52A09642.4030609@ysfactory.dip.jp> <52E26FA2.40909@ysfactory.dip.jp> <0E26045D-F7F9-49BC-AB36-A42C1DC6E64E@oracle.com> <52E91000.9010600@ysfactory.dip.jp> <52E91AAA.3060008@oracle.com> <52E9248D.2090108@ysfactory.dip.jp> <52E9FA5B.6010306@lab.ntt.co.jp> <5B189F0A-7408-47C7-9719-DC2990355209@oracle.com> <52EA33B3.6080409@lab.ntt.co.jp> <52F2151E.4040805@oracle.com>
Message-ID: <52F229E0.3010400@ysfactory.dip.jp>

Hi Erik,

Thank you for reviewing again!
I've uploaded a new webrev.

On 02/05/2014 07:40 PM, Erik Helin wrote:
> Hi Yasumasa,
>
> I've looked through the latest patch, it is much better! I just have
> two comments:
>
> - ostream.hpp:
>   Why did you add GCLogFileSize != 0 in should_rotate? The old check
>   just checked that _bytes_written > GCLogFileSize.

The default value of GCLogFileSize is "0" in globals.hpp .
So if this check is missing, should_rotate() always returns true.
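The point about the default flag value can be illustrated with a small standalone sketch. This is not the code from the webrev: `GcLogStreamSketch`, `should_rotate_unguarded` and `self_check` are hypothetical names, the global `GCLogFileSize` here is only a stand-in for the real HotSpot flag, and the `>` comparison is assumed from Erik's description of the old check.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the HotSpot flag; the thread says it defaults to 0.
static std::size_t GCLogFileSize = 0;

// Hypothetical, heavily simplified model of gcLogFileStream.
struct GcLogStreamSketch {
  std::size_t bytes_written = 0;

  // Size comparison only, as in the "old check" mentioned in the review.
  bool should_rotate_unguarded(bool force) const {
    return force || bytes_written > GCLogFileSize;
  }

  // With the extra guard: a limit of 0 means "no size-based rotation".
  bool should_rotate(bool force) const {
    return force || (GCLogFileSize != 0 && bytes_written > GCLogFileSize);
  }
};

// Exercises the default-flag case described in the reply above.
inline void self_check() {
  GCLogFileSize = 0;
  GcLogStreamSketch log;
  log.bytes_written = 128;
  assert(log.should_rotate_unguarded(false));  // rotates although no limit was set
  assert(!log.should_rotate(false));           // the guard suppresses that
  assert(log.should_rotate(true));             // an external request still rotates
}
```

With the guard in place, a VM started without a size limit would rotate only on an explicit request, which seems to be the "external trigger ONLY" use case described earlier in the thread.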
> - TestGCLogRotationViaJcmd.java:
>   Could you use the helper class JDKToolLauncher to start jcmd? The
>   code would then be slightly easier to read:
>
>   for (int times = 1; times < NUM_LOGS; times++) {
>     // Run jcmd GC.rotate_log
>     JDKToolLauncher jcmd = JDKToolLauncher.create("jcmd")
>                                           .addToolArg(pid)
>                                           .addToolArg("GC.rotate_log");
>     ProcessBuilder pb = new ProcessBuilder(jcmd.getCommand());
>
>     // Make sure we didn't crash
>     OutputAnalyzer output = new OutputAnalyzer(pb.start());
>     output.shouldHaveExitValue(0);
>   }

I've fixed it. Could you check the patch?

Thanks,

Yasumasa

> Thanks,
> Erik
>
> On 01/30/2014 12:12 PM, Yasumasa Suenaga wrote:
>> Hi Staffan,
>>
>> I've uploaded new webrev:
>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.03/
>>
>> On 2014/01/30 17:23, Staffan Larsen wrote:
>>> Would it be possible for the Diagnostic Command to output the location
>>> of the rotated log? When invoking the command it would be good to get
>>> some kind of feedback.
>>
>> I changed rotate_log() to redirect messages to jcmd.
>> If GC.rotate_log is executed, we can get messages on jcmd console as
>> below:
>> ------------
>> $ jcmd 18976 GC.rotate_log
>> 18976:
>> 2014-01-30 19:59:39 GC log rotation request has been received. Saved as test.log.0
>> 2014-01-30 19:59:39 GC log file created test.log.1
>> ------------
>>
>>> test/gc/7090324/Test7090324.java:
>>> - I think this needs to have the Oracle copyright notice as well.
>>> - Tests should now use descriptive names, not bug numbers:
>>>   https://wiki.openjdk.java.net/display/HotSpot/Naming+HotSpot+JTReg+Tests
>>> - nits: lots of missing spaces before '{', and after 'for', 'if'
>>> - line 47: you don't need to clean up old files, jtreg will give you a
>>>   fresh scratch directory to run in
>>
>> I've fixed.
>> Could you review again?
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Yasumasa

From yasu at ysfactory.dip.jp Wed Feb 5 04:13:20 2014
From: yasu at ysfactory.dip.jp (Yasumasa Suenaga)
Date: Wed, 05 Feb 2014 21:13:20 +0900
Subject: JDK-7090324: gclog rotation via external tool
In-Reply-To: <52F229E0.3010400@ysfactory.dip.jp>
References: <52483BDB.8040206@ysfactory.dip.jp> <52496A21.8080608@ysfactory.dip.jp> <527CDD56.7080106@ysfactory.dip.jp> <52A09642.4030609@ysfactory.dip.jp> <52E26FA2.40909@ysfactory.dip.jp> <0E26045D-F7F9-49BC-AB36-A42C1DC6E64E@oracle.com> <52E91000.9010600@ysfactory.dip.jp> <52E91AAA.3060008@oracle.com> <52E9248D.2090108@ysfactory.dip.jp> <52E9FA5B.6010306@lab.ntt.co.jp> <5B189F0A-7408-47C7-9719-DC2990355209@oracle.com> <52EA33B3.6080409@lab.ntt.co.jp> <52F2151E.4040805@oracle.com> <52F229E0.3010400@ysfactory.dip.jp>
Message-ID: <52F22AE0.1000008@ysfactory.dip.jp>

Sorry, I forgot to paste the URL of the new webrev :-P
http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.04/

Yasumasa

On 02/05/2014 09:09 PM, Yasumasa Suenaga wrote:
> Hi Erik,
>
> Thank you for reviewing again!
> I've uploaded a new webrev.
>
> On 02/05/2014 07:40 PM, Erik Helin wrote:
>> Hi Yasumasa,
>>
>> I've looked through the latest patch, it is much better! I just have
>> two comments:
>>
>> - ostream.hpp:
>>   Why did you add GCLogFileSize != 0 in should_rotate? The old check
>>   just checked that _bytes_written > GCLogFileSize.
>
> The default value of GCLogFileSize is "0" in globals.hpp .
> So if this check is missing, should_rotate() always returns true.
>
>> - TestGCLogRotationViaJcmd.java:
>>   Could you use the helper class JDKToolLauncher to start jcmd?
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> >> > From taras.ledkov at oracle.com Wed Feb 5 04:42:59 2014 From: taras.ledkov at oracle.com (taras ledkov) Date: Wed, 05 Feb 2014 16:42:59 +0400 Subject: Review request for 7195249: Some jtreg tests use hard coded ports In-Reply-To: <52DE41BB.40309@oracle.com> References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> <52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> Message-ID: <52F231D3.1050208@oracle.com> Hi, So please take a look at the review against JDK9. The reviewed patch had not been integrated into JDK8. Port to JDK9 is identical. The difference: the ProcessTools.java has been already patched by Jaroslav. Webrev for jdk part: http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ Webrev for hs part: http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ On 21.01.2014 13:45, Jaroslav Bachorik wrote: > Hi Taras, > > On 21.1.2014 10:30, taras ledkov wrote: >> Hi Jaroslav, >> >> Could you please review the last changes? >> Are you OK? > > Yes, the change looks ok. But I think we will need to get back to this > problem eventually and implement a central port dispatcher if we want to > be 100% sure the port conflicts wouldn't occur. But your changes reduce > the chance significantly. > > Thanks for taking care of this. > > -JB- > >> >> On 20.01.2014 19:21, Staffan Larsen wrote: >>> Sorry for not replying earlier. Yes, I?m ok with these changes. >>> >>> Thanks, >>> /Staffan >>> >>> On 20 jan 2014, at 16:07, taras ledkov wrote: >>> >>>> Hi Staffan, >>>> >>>> I fixed the tests according with your comments. >>>> Are you OK? >>>> >>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>> Hi, >>>>> >>>>> Please take a look at the new review. 
>>>>> >>>>> Webrev for jdk part: >>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>> >>>>> Webrev for hs part: >>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>> >>>>> My answers are inline: >>>>> >>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>> Hi Taras, >>>>>> >>>>>> Thanks for doing this clean up and conversion of tests into Java. >>>>>> Here's a couple of comments: >>>>>> >>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>> This test could be simplified by not specifying an address at all. >>>>>> Since the test never connects to the JVM started with -Xrunjdwp, >>>>>> there >>>>>> is no reason to specify an address. If address is unspecified (and >>>>>> server=y), the connector will pick an address and print it to the >>>>>> command line. Thus the only change that needs to be done is to remove >>>>>> ",address=8888" from the @run command. >>>>> fixed >>>>> >>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>> These tests do not compile cleanly with an empty JTwork directory. It >>>>>> seems that having one @build for each class does not work well - when >>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving all >>>>>> classes to one @build statement solved this problem for me. >>>>> fixed >>>>> >>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>> 187 Future stdoutTask = stdout.process(); >>>>>> 188 Future stderrTask = stderr.process(); >>>>>> The stdoutTask and stderrTask variables are unused. >>>>> fixed >>>>> >>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>> At first I thought something was wrong with this file - the diff is >>>>>> very weird. Then I realized you renamed an old file and created a new >>>>>> file using the old name. >>>>> You are right. I did it to keep the test name.
>>>>> >>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>> >>>>>> >>>>>> - Is resetPasswordFilePermission() really necessary? It looks like >>>>>> you >>>>>> delete the files at the beginning of the test in any case. >>>>> I think yes. In the first place, this functionality was in the old >>>>> code. >>>>> In the second place, a file without write permission may be a problem >>>>> for a further cleanup (not by the test, for example for the tests >>>>> launcher scripts etc.) >>>>> >>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest" >>>>>> confusing. They are both Paths. One is used directly by the >>>>>> sub-classes, the other has a getter method. >>>>> fixed >>>>> >>>>>> - Lines 57-58: Don't swallow exceptions, add an ex.printStackTrace(). >>>>>> (Same thing for all other places where you call Integer.parseInt()) >>>>> fixed >>>>> >>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>> This file is never used as far as I can see. >>>>> It is used by PasswordFilePermissionTest & SSLConfigFilePermissionTest >>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>> AbstractFilePermissionTest : 162). >>>>> >>>>>> Thanks, >>>>>> /Staffan >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Please take a look at the review with fixed issues about trying to >>>>>>> launch test that needs free port several times. >>>>>>> >>>>>>> Webrev for jdk part: >>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/ >>>>>>> >>>>>>> Webrev for hs part: >>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/ >>>>>>> >>>>>>> Pay your attention to new method ProcessTools.startProcess(String, >>>>>>> ProcessBuilder, Consumer) that is used to analyze all output >>>>>>> of a sub-process.
It has common part with >>>>>>> ProcessTools.startProcess(String, ProcessBuilder, Predicate, >>>>>>> long, TimeUnit) that is used to determine the warm-up moment. >>>>>>> >>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder, >>>>>>> Predicate, long, TimeUnit) may be changed by adding LinePump >>>>>>> to stderr if there is not serious reason for restricting the warm-up >>>>>>> analysis to stdout stream. >>>>>>> >>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I've consulted with Serviceability engineers (add them to CC >>>>>>>> list) and >>>>>>>> they would like to see tests to solve these problem so far: >>>>>>>> >>>>>>>> 2. Implement loops in every test. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Katja >>>>>>>> >>>>>>>> >>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote: >>>>>>>>> Guys. >>>>>>>>> >>>>>>>>> Let me try to sum up what was said before and may be suggest a >>>>>>>>> compromise. >>>>>>>>> >>>>>>>>> 1. There is a desire to have a support port allocation on the >>>>>>>>> level of >>>>>>>>> a JTReg suite execution. Taras created a bug for that >>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it >>>>>>>>> is a >>>>>>>>> test harness API or a library API does not really matter from >>>>>>>>> usage >>>>>>>>> point of view. >>>>>>>>> >>>>>>>>> 2. There is no way to make the tests absolutely stable, whatever >>>>>>>>> port >>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>> perform >>>>>>>>> the test logic with different ports until the test succeeds. >>>>>>>>> >>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of course, >>>>>>>>> but >>>>>>>>> better be used in conjunction with a meaningful port selection >>>>>>>>> algorithm. >>>>>>>>> >>>>>>>>> At the same time, copying a loop-until-success logic from one >>>>>>>>> test to >>>>>>>>> another may be not the best solution. Library could help with >>>>>>>>> that I >>>>>>>>> believe.
There only need to be an API method which takes >>>>>>>>> behavior as a >>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>> public runOnAFreePort(Function) >>>>>>>>> or similar. There could be arguments of how/whether to implement >>>>>>>>> it, >>>>>>>>> the solution would not work for shell tests, etc, but still ... >>>>>>>>> >>>>>>>>> >>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>> >>>>>>>>> 1. Integrate tests as is. Get to it later after reaching >>>>>>>>> agreement in >>>>>>>>> the library, etc. >>>>>>>>> 2. Implement loops in every test. >>>>>>>>> 3. Wait for the library to be ready and only then integrate the >>>>>>>>> changes. >>>>>>>>> >>>>>>>>> Please let us know which one is closer to your heart. >>>>>>>>> >>>>>>>>> I personally prefer #1 for the reason that the changes already >>>>>>>>> supposed to make the tests more stable and also there are many >>>>>>>>> more >>>>>>>>> tests tests which use ports, so the scope of the problem is bigger >>>>>>>>> than these. >>>>>>>>> >>>>>>>>> Shura >>>>>>>>> >>>>>>>>> >>>>>>>>>> Taras, >>>>>>>>>> >>>>>>>>>> I agree with the previous comments, that Utils.getFreePort() >>>>>>>>>> does not >>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>> process. >>>>>>>>>> Unfortunately I don't think the library can do more. However, >>>>>>>>>> there is a >>>>>>>>>> solution. >>>>>>>>>> >>>>>>>>>> Please, look at the *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to >>>>>>>>>> start a >>>>>>>>>> process with a free port and then check if >>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has been >>>>>>>>>> thrown. >>>>>>>>>> If yes, you have to retry. 
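[Editorial sketch] The retry idea discussed in the messages above (a runOnAFreePort-style helper that reruns the test body when the chosen port turns out to be taken) might look roughly like the following. This is only an illustration under stated assumptions: the names RetryOnPortConflictSketch, PortAction and maxAttempts are hypothetical and not part of jdk.testlibrary, and java.net.BindException stands in for the java.rmi.server.ExportException check that JstatdTest is described as performing. The port range 49152-65535 is the one mentioned elsewhere in this thread.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

public class RetryOnPortConflictSketch {

    /** Hypothetical callback: the test body, parameterized by the port to use. */
    interface PortAction<T> {
        T run(int port) throws IOException;
    }

    /**
     * Runs the action on a randomly chosen port from the dynamic range and
     * retries with a new port whenever the action reports a bind conflict.
     */
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts) throws IOException {
        BindException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Upper bound is exclusive, so this yields 49152..65535.
            int port = ThreadLocalRandom.current().nextInt(49152, 65536);
            try {
                return action.run(port);
            } catch (BindException e) {
                last = e; // port was taken: pick another one and try again
            }
        }
        throw new IOException("No usable port after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real test body: just bind and release the port.
        Integer bound = runOnAFreePort(port -> {
            try (ServerSocket server = new ServerSocket(port)) {
                return server.getLocalPort();
            }
        }, 10);
        System.out.println("Test body ran on port " + bound);
    }
}
```

The helper does no pre-check of the port; the conflict is detected only where the port is actually used, which is the point made in the thread about pre-checks having no value.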
>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Katja >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>> Hi Everyone, >>>>>>>>>>> >>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is the >>>>>>>>>>> library responsibility to implements it, would not you agree? >>>>>>>>>>> >>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is. >>>>>>>>>>> >>>>>>>>>>> Should we decide to replace logic of the port selection, we >>>>>>>>>>> could do >>>>>>>>>>> it later in the library. >>>>>>>>>>> >>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>> Roger, >>>>>>>>>>>>> >>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the >>>>>>>>>>>>> port is >>>>>>>>>>>>> free. >>>>>>>>>>>>> >>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>> accessible >>>>>>>>>>>>> for >>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>> discussions >>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and SO_LINGER >>>>>>>>>>>>> for >>>>>>>>>>>>> BSD. >>>>>>>>>>>>> >>>>>>>>>>>>> So from stability point of view it's better to just return >>>>>>>>>>>>> random >>>>>>>>>>>>> number >>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>> >>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When >>>>>>>>>>>> there are >>>>>>>>>>>> more >>>>>>>>>>>> tests run in parallel, all of them requiring a free port, >>>>>>>>>>>> nothing >>>>>>>>>>>> prevents the random function to return the same port to all of >>>>>>>>>>>> them. >>>>>>>>>>>> Also, two subsequent requests can return the same port and >>>>>>>>>>>> cause >>>>>>>>>>>> problems with timing when a port used by a previous test is not >>>>>>>>>>>> fully >>>>>>>>>>>> ready to be assigned to a different socket. 
And as Dmitry >>>>>>>>>>>> pointed out >>>>>>>>>>>> unless one can keep hold of the allocated socket and use it >>>>>>>>>>>> later >>>>>>>>>>>> there >>>>>>>>>>>> is no guarantee that a port which was tested unallocated will >>>>>>>>>>>> remain >>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>> >>>>>>>>>>>> The only fail proof solution would be a port allocating service >>>>>>>>>>>> provided >>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease the >>>>>>>>>>>> chance >>>>>>>>>>>> of intermittent failures due to a port being in use. >>>>>>>>>>>> >>>>>>>>>>>> -JB- >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> [1] >>>>>>>>>>>>> >>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>> IOException { >>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>> 143 >>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>> 146 >>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>> 148 try { >>>>>>>>>>>>> 149 serverSocket = new ServerSocket(0); >>>>>>>>>>>>> 150 port = serverSocket.getLocalPort(); >>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>> 153 } >>>>>>>>>>>>> 154 } >>>>>>>>>>>>> 155 >>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>> 157 } >>>>>>>>>>>>> 158 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> fyi, The jdk.testlibrary.Utils.getFreePort() method will >>>>>>>>>>>>>> Open an >>>>>>>>>>>>>> free >>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>> the port number. >>>>>>>>>>>>>> >>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have the >>>>>>>>>>>>>> system >>>>>>>>>>>>>> assign >>>>>>>>>>>>>> the port #. 
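[Editorial sketch] The point made above, that a port is only guaranteed while the socket is still held, can be illustrated with a small example. It assumes the code under test can accept an already-open ServerSocket instead of a port number; the class and method names (HeldSocketSketch, openOnSystemAssignedPort) are hypothetical. Passing 0 to the ServerSocket constructor asks the system to assign the port.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class HeldSocketSketch {

    /**
     * Binds to port 0 so the system assigns a free port, and returns the
     * open socket itself. The caller keeps it open for as long as the port
     * must stay reserved; there is no window in which the port is released.
     */
    static ServerSocket openOnSystemAssignedPort() throws IOException {
        return new ServerSocket(0);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = openOnSystemAssignedPort()) {
            // The port number can be read and published while the socket is held.
            System.out.println("Listening on port " + server.getLocalPort());
        }
    }
}
```

This differs from the getFreePort() listing quoted above, which closes the socket before returning the number, so another process may take the port in between.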
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Roger >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>> 2. Open socket >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass >>>>>>>>>>>>>>> port >>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it has no >>>>>>>>>>>>>>> value >>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>> as soon as you close socket someone can take the port. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So just choose a random number within the range above and >>>>>>>>>>>>>>> let >>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>> code opening socket to handle port conflict. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> There are two webrevs: >>>>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please take a look at some notes: >>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & Jaroslav >>>>>>>>>>>>>>>> Bachorik >>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>> shell tests have been converted to java based tests >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - PasswordFilePermissionTest & SSLConfigFilePermissionTest >>>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>> looked >>>>>>>>>>>>>>>> very similar, so a common parent class was created for >>>>>>>>>>>>>>>> them: >>>>>>>>>>>>>>>> AbstractFilePermissionTest >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've renamed to >>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace old >>>>>>>>>>>>>>>> shell >>>>>>>>>>>>>>>> script >>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called RmiRegistrySslTest.java, >>>>>>>>>>>>>>>> hence the >>>>>>>>>>>>>>>> huge >>>>>>>>>>>>>>>> diff. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines similar >>>>>>>>>>>>>>>> to the >>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided >>>>>>>>>>>>>>>> to not >>>>>>>>>>>>>>>> complicate the code further and leave it as is. 
Please >>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>> know if >>>>>>>>>>>>>>>> this is somehow not acceptable >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is added to >>>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>>> repository is taken from this patch: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - These tests will need additional changes when test >>>>>>>>>>>>>>>> library >>>>>>>>>>>>>>>> process >>>>>>>>>>>>>>>> tools will support command line options inheritance >>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> With best regards, >>>>>>> Taras Ledkov >>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>> skype: taras_ledkov >>>>>>> Phone: 7(812)3346-157 >>>>>> >>>>> >>>> >>>> -- >>>> With best regards, >>>> Taras Ledkov >>>> Mail-To: taras.ledkov at oracle.com >>>> skype: taras_ledkov >>>> Phone: 7(812)3346-157 >>> >> > -- With best regards, Taras Ledkov Mail-To: taras.ledkov at oracle.com skype: taras_ledkov Phone: 7(812)3346-157 From jaroslav.bachorik at oracle.com Wed Feb 5 05:20:23 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 05 Feb 2014 14:20:23 +0100 Subject: Review request for 7195249: Some jtreg tests use hard coded ports In-Reply-To: <52F231D3.1050208@oracle.com> References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> 
<52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> <52F231D3.1050208@oracle.com> Message-ID: <52F23A97.7020807@oracle.com> Hi Taras, thanks for taking care of this. The changes look fine to me. One minor nit is unused imports of the library classes in "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". It does not use any of those classes as its base class "AbstractFilePermissionTest" does all the heavy lifting. Cheers, -JB- On 5.2.2014 13:42, taras ledkov wrote: > Hi, > > So please take a look at the review against JDK9. > The reviewed patch had not been integrated into JDK8. > > Port to JDK9 is identical. The difference: the ProcessTools.java has > been already patched by Jaroslav. > > Webrev for jdk part: > http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ > > Webrev for hs part: > http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ > > > On 21.01.2014 13:45, Jaroslav Bachorik wrote: >> Hi Taras, >> >> On 21.1.2014 10:30, taras ledkov wrote: >>> Hi Jaroslav, >>> >>> Could you please review the last changes? >>> Are you OK? >> >> Yes, the change looks ok. But I think we will need to get back to this >> problem eventually and implement a central port dispatcher if we want to >> be 100% sure the port conflicts wouldn't occur. But your changes reduce >> the chance significantly. >> >> Thanks for taking care of this. >> >> -JB- >> >>> >>> On 20.01.2014 19:21, Staffan Larsen wrote: >>>> Sorry for not replying earlier. Yes, I?m ok with these changes. >>>> >>>> Thanks, >>>> /Staffan >>>> >>>> On 20 jan 2014, at 16:07, taras ledkov wrote: >>>> >>>>> Hi Staffan, >>>>> >>>>> I fixed the tests according with your comments. >>>>> Are you OK? >>>>> >>>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>>> Hi, >>>>>> >>>>>> Please take a look at the new review. 
>>>>>> >>>>>> Webrev for jdk part: >>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>>> >>>>>> Webrev for hs part: >>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>>> >>>>>> My answers are inline: >>>>>> >>>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>>> Hi Taras, >>>>>>> >>>>>>> Thanks for doing this clean up and conversion of tests into Java. >>>>>>> Here's a couple of comments: >>>>>>> >>>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>>> This test could be simplified by not specifying an address at all. >>>>>>> Since the test never connects to the JVM started with -Xrunjdwp, >>>>>>> there >>>>>>> is no reason to specify an address. If address is unspecified (and >>>>>>> server=y), the connector will pick an address and print it to the >>>>>>> command line. Thus the only change that needs to be done is to >>>>>>> remove >>>>>>> ",address=8888" from the @run command. >>>>>> fixed >>>>>> >>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>>> These tests do not compile cleanly with an empty JTwork >>>>>>> directory. It >>>>>>> seems that having one @build for each class does not work well - >>>>>>> when >>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving >>>>>>> all >>>>>>> classes to one @build statement solved this problem for me. >>>>>> fixed >>>>>> >>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>>> 187 Future stdoutTask = stdout.process(); >>>>>>> 188 Future stderrTask = stderr.process(); >>>>>>> The stdoutTask and stderrTask variables are unused. >>>>>> fixed >>>>>> >>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>>> At first I thought something was wrong with this file - the diff is >>>>>>> very weird. Then I realized you renamed an old file and created a >>>>>>> new >>>>>>> file using the old name. >>>>>> You are right.
I did it to keep the test name. >>>>>> >>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>>> >>>>>>> >>>>>>> >>>>>>> - Is resetPasswordFilePermission() really necessary? It looks like >>>>>>> you >>>>>>> delete the files at the beginning of the test in any case. >>>>>> I think yes. In the first place, this functionality was in the old >>>>>> code. >>>>>> In the second place, a file without write permission may be a problem >>>>>> for a further cleanup (not by the test, for example for the tests >>>>>> launcher scripts etc.) >>>>>> >>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest" >>>>>>> confusing. They are both Paths. One is used directly by the >>>>>>> sub-classes, the other has a getter method. >>>>>> fixed >>>>>> >>>>>>> - Lines 57-58: Don't swallow exceptions, add an >>>>>>> ex.printStackTrace(). >>>>>>> (Same thing for all other places where you call Integer.parseInt()) >>>>>> fixed >>>>>> >>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>>> This file is never used as far as I can see. >>>>>> It is used by PasswordFilePermissionTest & >>>>>> SSLConfigFilePermissionTest >>>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>>> AbstractFilePermissionTest : 162). >>>>>> >>>>>>> Thanks, >>>>>>> /Staffan >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> Please take a look at the review with fixed issues about trying to >>>>>>>> launch test that needs free port several times. >>>>>>>> >>>>>>>> Webrev for jdk part: >>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/ >>>>>>>> >>>>>>>> Webrev for hs part: >>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/ >>>>>>>> >>>>>>>> Pay your attention to new method ProcessTools.startProcess(String, >>>>>>>> ProcessBuilder, Consumer) that is used to analyze all >>>>>>>> output >>>>>>>> of a sub-process.
It has common part with >>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>> Predicate, >>>>>>>> long, TimeUnit) that is used to determine the warm-up moment. >>>>>>>> >>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>> Predicate, long, TimeUnit) may be changed by adding >>>>>>>> LinePump >>>>>>>> to stderr if there is not serious reason for restricting the >>>>>>>> warm-up >>>>>>>> analysis to stdout stream. >>>>>>>> >>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I've consulted with Serviceability engineers (add them to CC >>>>>>>>> list) and >>>>>>>>> they would like to see tests to solve these problem so far: >>>>>>>>> >>>>>>>>> 2. Implement loops in every test. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Katja >>>>>>>>> >>>>>>>>> >>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote: >>>>>>>>>> Guys. >>>>>>>>>> >>>>>>>>>> Let me try to sum up what was said before and may be suggest a >>>>>>>>>> compromise. >>>>>>>>>> >>>>>>>>>> 1. There is a desire to have a support port allocation on the >>>>>>>>>> level of >>>>>>>>>> a JTReg suite execution. Taras created a bug for that >>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it >>>>>>>>>> is a >>>>>>>>>> test harness API or a library API does not really matter from >>>>>>>>>> usage >>>>>>>>>> point of view. >>>>>>>>>> >>>>>>>>>> 2. There is no way to make the tests absolutely stable, whatever >>>>>>>>>> port >>>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>>> perform >>>>>>>>>> the test logic with different ports until the test succeeds. >>>>>>>>>> >>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of course, >>>>>>>>>> but >>>>>>>>>> better be used in conjunction with a meaningful port selection >>>>>>>>>> algorithm.
>>>>>>>>>> >>>>>>>>>> At the same time, copying a loop-until-success login from one >>>>>>>>>> test to >>>>>>>>>> another may be not the best solution. Library could help with >>>>>>>>>> that I >>>>>>>>>> believe. There only need to be an API method which takes >>>>>>>>>> behavior as a >>>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>>> public runOnAFreePort(Function) >>>>>>>>>> or similar. There could be arguments of how/whether to implement >>>>>>>>>> it, >>>>>>>>>> the solution would not work for shell tests, etc, but still ... >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>>> >>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching >>>>>>>>>> agreement in >>>>>>>>>> the library, etc. >>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>> 3. Wait for the library to be ready and only then integrate the >>>>>>>>>> changes. >>>>>>>>>> >>>>>>>>>> Please let us know which one is closer to your heart. >>>>>>>>>> >>>>>>>>>> I personally prefer #1 for the reason that the changes already >>>>>>>>>> supposed to make the tests more stable and also there are many >>>>>>>>>> more >>>>>>>>>> tests tests which use ports, so the scope of the problem is >>>>>>>>>> bigger >>>>>>>>>> than these. >>>>>>>>>> >>>>>>>>>> Shura >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Taras, >>>>>>>>>>> >>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort() >>>>>>>>>>> does not >>>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>>> process. >>>>>>>>>>> Unfortunately I don't think the library can do more. However, >>>>>>>>>>> there is a >>>>>>>>>>> solution. >>>>>>>>>>> >>>>>>>>>>> Please, look at the *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>>> tryToSetupJstatdProcess()*. 
In brief, the test will try to >>>>>>>>>>> start a >>>>>>>>>>> process with a free port and then check if >>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has been >>>>>>>>>>> thrown. >>>>>>>>>>> If yes, you have to retry. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Katja >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>> >>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is the >>>>>>>>>>>> library responsibility to implements it, would not you agree? >>>>>>>>>>>> >>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is. >>>>>>>>>>>> >>>>>>>>>>>> Should we decide to replace logic of the port selection, we >>>>>>>>>>>> could do >>>>>>>>>>>> it later in the library. >>>>>>>>>>>> >>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>>> Roger, >>>>>>>>>>>>>> >>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the >>>>>>>>>>>>>> port is >>>>>>>>>>>>>> free. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>>> accessible >>>>>>>>>>>>>> for >>>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>>> discussions >>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and SO_LINGER >>>>>>>>>>>>>> for >>>>>>>>>>>>>> BSD. >>>>>>>>>>>>>> >>>>>>>>>>>>>> So from stability point of view it's better to just return >>>>>>>>>>>>>> random >>>>>>>>>>>>>> number >>>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>>> >>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When >>>>>>>>>>>>> there are >>>>>>>>>>>>> more >>>>>>>>>>>>> tests run in parallel, all of them requiring a free port, >>>>>>>>>>>>> nothing >>>>>>>>>>>>> prevents the random function to return the same port to all of >>>>>>>>>>>>> them. 
>>>>>>>>>>>>> Also, two subsequent requests can return the same port and >>>>>>>>>>>>> cause >>>>>>>>>>>>> problems with timing when a port used by a previous test is >>>>>>>>>>>>> not >>>>>>>>>>>>> fully >>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry >>>>>>>>>>>>> pointed out >>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it >>>>>>>>>>>>> later >>>>>>>>>>>>> there >>>>>>>>>>>>> is no guarantee that a port which was tested unallocated will >>>>>>>>>>>>> remain >>>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>>> >>>>>>>>>>>>> The only fail proof solution would be a port allocating >>>>>>>>>>>>> service >>>>>>>>>>>>> provided >>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease >>>>>>>>>>>>> the >>>>>>>>>>>>> chance >>>>>>>>>>>>> of intermittent failures due to a port being in use. >>>>>>>>>>>>> >>>>>>>>>>>>> -JB- >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> [1] >>>>>>>>>>>>>> >>>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>>> IOException { >>>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>>> 143 >>>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>>> 146 >>>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>>> 148 try { >>>>>>>>>>>>>> 149 serverSocket = new ServerSocket(0); >>>>>>>>>>>>>> 150 port = serverSocket.getLocalPort(); >>>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>>> 153 } >>>>>>>>>>>>>> 154 } >>>>>>>>>>>>>> 155 >>>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>>> 157 } >>>>>>>>>>>>>> 158 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> fyi, The jdk.testlibrary.Utils.getFreePort() method will >>>>>>>>>>>>>>> Open an >>>>>>>>>>>>>>> free >>>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>>> 
the port number. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have the >>>>>>>>>>>>>>> system >>>>>>>>>>>>>>> assign >>>>>>>>>>>>>>> the port #. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Roger >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>>> 2. Open socket >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass >>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it has no >>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>>> as soon as you close socket someone can take the port. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So just choose a random number within the range above and >>>>>>>>>>>>>>>> let >>>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>>> code opening socket to handle port conflict. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & Jaroslav Bachorik some
>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & SSLConfigFilePermissionTest tests looked
>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for them:
>>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've renamed to
>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace old shell script
>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called RmiRegistrySslTest.java, hence the huge
>>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines similar to the
>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided to not
>>>>>>>>>>>>>>>>> complicate the code further and leave it as is.
>>>>>>>>>>>>>>>>> Please let me know if this is somehow not acceptable
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is added to hotspot
>>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - These tests will need additional changes when test library process
>>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> With best regards,
>>>>>>>> Taras Ledkov
>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>> skype: taras_ledkov
>>>>>>>> Phone: 7(812)3346-157
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> With best regards,
>>>>> Taras Ledkov
>>>>> Mail-To: taras.ledkov at oracle.com
>>>>> skype: taras_ledkov
>>>>> Phone: 7(812)3346-157
>>>>
>>>
>>
>

From dmitry.samersoff at oracle.com Wed Feb 5 05:45:15 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Wed, 05 Feb 2014 17:45:15 +0400
Subject: Need Second reviewer!
Re: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not available in client VM
In-Reply-To: <52F0EF49.4050003@oracle.com>
References: <52EBFC89.3030809@oracle.com> <52EF7675.3020707@oracle.com> <52F0EF49.4050003@oracle.com>
Message-ID: <52F2406B.70405@oracle.com>

On 2014-02-04 17:46, Dmitry Samersoff wrote:
> Staffan,
>
> You were right, only two classes are C2 specific.
>
> Here is the updated webrev.
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.02/
>
> -Dmitry
>
> On 2014-02-03 16:13, Staffan Larsen wrote:
>>
>> On 3 feb 2014, at 11:59, Dmitry Samersoff wrote:
>>
>>> Staffan,
>>>
>>> All items under try-catch exist only in the server (C2 compiler) VM, so if
>>> one of them is missing in the target VM the other items are missing as well.
>>
>> Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?
>>
>> /Staffan
>>
>>
>>>
>>> I have no idea what happens later if SA tries to access one of these
>>> non-initialized items, but I don't see any errors in my home testing.
>>>
>>> -Dmitry
>>>
>>> On 2014-02-03 14:39, Staffan Larsen wrote:
>>>> It looks like this new version will leave some things uninitialized
>>>> if an item higher on the list throws an exception. I don't think that
>>>> is what you want.
>>>>
>>>> /Staffan
>>>>
>>>>
>>>> On 31 jan 2014, at 20:42, Dmitry Samersoff
>>>> wrote:
>>>>
>>>>> Hi Everybody,
>>>>>
>>>>> Please review the fix:
>>>>>
>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>>>>>
>>>>> sa.js tries to initialize all possible VM structures regardless of
>>>>> the type of VM we are running on
>>>>>
>>>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>>>>> Russia * I would love to change the world, but they won't give me
>>>>> the sources.
>>>>
>>>
>>>
>>> --
>>> Dmitry Samersoff
>>> Oracle Java development team, Saint Petersburg, Russia
>>> * I would love to change the world, but they won't give me the sources.
>>
>
>

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the source code.

From volker.simonis at gmail.com Wed Feb 5 10:09:25 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 5 Feb 2014 19:09:25 +0100
Subject: Why do we need both - export maps AND -fvisibility=hidden/__attribute__((visibility("default")))
In-Reply-To: <52F1E551.2070304@oracle.com>
References: <52F18B03.1000107@oracle.com> <52F1E551.2070304@oracle.com>
Message-ID:

On Wed, Feb 5, 2014 at 8:16 AM, David Holmes wrote:
> On 5/02/2014 5:09 PM, Jeremy Manson wrote:
>>
>> Also, don't you statically link libstdc++ into Hotspot?
>
>
> Normally yes.
>

That's bad - security-wise as well as from a usability perspective:

Security-wise because the JDK will need to be rebuilt for every security fix in libgcc and libstdc++. I'm not sure if there's at least somebody actively and regularly looking at libgcc/libstdc++ security issues. Considering the fact that the JDK is usually built on a "not so new" Linux release when it is released for the first time, this even increases the possibility that libgcc/libstdc++ are actively supported on that release for the full JDK life time.

From a user perspective, because users can only safely use/link C++ JNI libraries with the JDK which have been built against exactly the same libgcc/libstdc++ version.

Others, like RedHat (OpenJDK/IcedTea) and SAP (SAP JVM) don't do this any more - i.e. they link libgcc/libstdc++ dynamically, without any problems. While I'm aware that dynamic linking imposes some compatibility risks as well if the same binary must support a bunch of different releases, I still think that is easier to manage.
I think static linking of libgcc/libstdc++ is pre-gcc-2.9.2 paranoia, but that's my personal opinion. Also that's a totally different discussion not related to my initial question.

>
>> It would break
>> many, many things if the symbols from that were exported from Hotspot
>> (pretty much anything with native code written in C++).
>
>
> Does static linking automatically re-export all the visible symbols of the
> linked in library? That would seem undesirable under most circumstances.
>

Yes, Jeremy is totally right. I did some tests and realized that even if we use -fvisibility=hidden in the compile step of a file, this will not help if we make a static library (i.e. a '.a' archive) from it. Linking such an archive into a shared library will still let the shared library export all "non-static" objects from the archive.

So, considering that the OpenJDK still statically links libgcc/libstdc++ by default, this is one justification for the usage of map files.

I did some more experiments and looked at the current build system and also realized the following: the build dynamically builds a list of all vtable symbols of all classes and puts them into the map file which is used to generate the libjvm.so. If I remember right, they are (or were) needed for the serviceability agent. Now that would be another justification for the usage of a map file as I don't know how we could otherwise export the vtable symbols of all classes. But the funny thing is that I've just realized that this technique isn't working any more with gcc 4.3 and above (which I think is the build compiler of jdk8).
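For readers less familiar with the two mechanisms compared in this thread, here is a minimal sketch in C of per-symbol visibility. It is not taken from HotSpot: the macro names DEMO_EXPORT and DEMO_HIDDEN and both functions are hypothetical, chosen only to mirror the JNIEXPORT idea described above.

```c
/* Hypothetical macros mirroring the JNIEXPORT idea; not HotSpot code. */
#if defined(__GNUC__)
#  define DEMO_EXPORT __attribute__((visibility("default")))
#  define DEMO_HIDDEN __attribute__((visibility("hidden")))
#else
#  define DEMO_EXPORT
#  define DEMO_HIDDEN
#endif

/* Marked hidden: when built into a shared object this symbol is not
 * placed in the dynamic symbol table, so other libraries cannot bind to it. */
DEMO_HIDDEN int demo_internal_helper(int x)
{
    return x * 2;
}

/* Marked default: stays exported even if the file is compiled
 * with -fvisibility=hidden. */
DEMO_EXPORT int demo_public_entry(int x)
{
    return demo_internal_helper(x) + 1;
}
```

Assuming the file is saved as demo.c, something like 'gcc -shared -fPIC -fvisibility=hidden demo.c -o libdemo.so' followed by 'nm -D --defined-only libdemo.so' should list only demo_public_entry. An export map would express the same intent on the linker side with a version script along the lines of '{ global: demo_public_entry; local: *; };' passed via -Wl,--version-script=demo.map (the file name is again an assumption).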
Here are the details:

The vtable symbols are defined as weak symbols in the object files like so:

0000000000000000 V _ZTV10ArrayKlass

If such an object file is linked with gcc 4.1.2 without a map file into the libjvm.so, the symbol turns into a local data object like so:

0000000000e37160 d _ZTV10ArrayKlass

However, if we use a map file which specifies that the symbol '_ZTV10ArrayKlass' should be exported, the libjvm.so will contain the following global symbol:

0000000001423240 D _ZTV10arrayKlass

Now I think this is the expected result of the whole dynamic map-file generation process. However, with gcc 4.3 and later, there's no difference whether the corresponding vtable symbol is mentioned in the export map or not. The resulting libjvm.so always only contains a local data object (just use 'nm --defined-only --extern-only libjvm.so' to check).

Now I'm not sure if the vtable symbols aren't used by the serviceability agent any more or if just nobody has realized until now that they are not exported from the libjvm.so any more. In any case, the whole magic of constructing a link export file at build time seems unnecessary by now. I've CC-ed the serviceability group to comment on this issue.

I plan to write down all my findings in a more readable way within the next days. If there are any more/other comments on this topic I'll be highly interested.

Regards,
Volker

> David
>
>> Jeremy
>>
>>
>> On Tue, Feb 4, 2014 at 4:51 PM, David Holmes wrote:
>>
>> Hi Volker,
>>
>>
>> On 5/02/2014 2:18 AM, Volker Simonis wrote:
>>
>> I know this is an old topic and actually I was involved myself in some
>> of these discussions. Nevertheless, the topic regularly pops up again
>> and every time this happens and I take a deeper look at the topic I
>> have the feeling I still haven't completely understood it.
>>
>> Current state:
>>
>> On Linux, we are currently compiling the HotSpot with the
>> '-fvisibility=hidden' flag which effectively hides all symbols except
>> the ones which are explicitly declared to have default visibility
>> (with the macro JNIEXPORT which expands to
>> __attribute__((visibility("default")))).
>>
>>
>> This was introduced by "6588413: Use -fvisibility=hidden for gcc
>> compiles" [1] in HS 21 and it is potentially a good change because it
>> hides all symbols by default and it enables certain compiler
>> optimizations on some platforms (see "How To Write Shared Libraries"
>> [2]).
>>
>> However, independently of this change, we still use export maps (i.e.
>> --version-script=mapfile_reorder) in addition to setting the explicit
>> symbol visibility. While [2] recommends this as good practice, I don't
>> see the real benefit of having both -
>> -fvisibility=hidden/__attribute__((visibility("default"))) and export
>> maps except if we were maintaining a versioned ABI. But as far as I
>> can tell, that's not the case for HotSpot.
>>
>> So finally my question: is there any real reason (except the fact that
>> "we always did it that way") why we would still need to use export
>> maps on Linux where symbol hiding is done with
>> -fvisibility=hidden/__attribute__((visibility("default"))) anyway?
>> Dropping the maps would make the maintenance of exported symbols
>> easier as we wouldn't have to keep the export map files and the
>> JNIEXPORT functions in sync.
>>
>>
>> From past discussions isn't this because we need the map files to
>> expose dynamically generated symbols? (And so we must then also add
>> the other visible symbols.) Does that make the visibility attributes
>> redundant? I'm not sure, but they serve as a clear statement of
>> intent anyway.
>>
>> David
>>
>>
>> Thank you and best regards,
>> Volker
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-6588413
>>
>> [2] http://www.akkadia.org/drepper/dsohowto.pdf
>>
>>
>>
>

From daniel.daugherty at oracle.com Wed Feb 5 15:20:52 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 05 Feb 2014 16:20:52 -0700
Subject: code review round 0 for minor FDS makefile fix (8033714)
Message-ID: <52F2C754.40809@oracle.com>

This code review request is going to three different aliases.
Don't use Thunderbird's "reply to list" option since it will
pick just _one_ of the _three_ lists.


Greetings,

Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
makefile fix our way. Here are the bug and webrev URLs:

http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/

8033714 hotspot 'install_jvm' bld target broken with ZIP_DEBUGINFO_FILES=0
https://bugs.openjdk.java.net/browse/JDK-8033714

As you might guess from the bug synopsis, this fix is needed when
building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0).
Based on the Graal project fix, I've updated a few other places where
building with FDS disabled is affected.

As always, comments and suggestions are welcome.

Dan

From ron.durbin at oracle.com Wed Feb 5 18:04:02 2014
From: ron.durbin at oracle.com (Ron Durbin)
Date: Wed, 5 Feb 2014 18:04:02 -0800 (PST)
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <52F2C754.40809@oracle.com>
References: <52F2C754.40809@oracle.com>
Message-ID: <626b3248-50fd-4f91-8aba-fed5fb3a49cc@default>

Dan

The changes look good

Ron

> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: Wednesday, February 05, 2014 4:21 PM
> To: hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; build-dev;
> Doug Simon; Tom Rodriguez
> Subject: code review round 0 for minor FDS makefile fix (8033714)
>
> This code review request is going to three different aliases.
> Don't use Thunderbird's "reply to list" option since it will pick just _one_ of the _three_
> lists.
>
>
> Greetings,
>
> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) makefile fix our way. Here
> are the bug and webrev URLs:
>
> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>
> 8033714 hotspot 'install_jvm' bld target broken with ZIP_DEBUGINFO_FILES=0
> https://bugs.openjdk.java.net/browse/JDK-8033714
>
> As you might guess from the bug synopsis, this fix is needed when building without ZIP'ing
> the debuginfo files (ZIP_DEBUGINFO_FILES=0).
> Based on the Graal project fix, I've updated a few other places where building with FDS
> disabled is affected.
>
> As always, comments and suggestions are welcome.
>
> Dan

From daniel.daugherty at oracle.com Wed Feb 5 20:25:49 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 05 Feb 2014 21:25:49 -0700
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <626b3248-50fd-4f91-8aba-fed5fb3a49cc@default>
References: <52F2C754.40809@oracle.com> <626b3248-50fd-4f91-8aba-fed5fb3a49cc@default>
Message-ID: <52F30ECD.8080303@oracle.com>

Ron,

Thanks for the review.

Dan

On 2/5/14 7:04 PM, Ron Durbin wrote:
> Dan
>
> The changes look good
>
> Ron
>> -----Original Message-----
>> From: Daniel D. Daugherty
>> Sent: Wednesday, February 05, 2014 4:21 PM
>> To: hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; build-dev;
>> Doug Simon; Tom Rodriguez
>> Subject: code review round 0 for minor FDS makefile fix (8033714)
>>
>> This code review request is going to three different aliases.
>> Don't use Thunderbird's "reply to list" option since it will pick just _one_ of the _three_
>> lists.
>>
>>
>> Greetings,
>>
>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) makefile fix our way.
>> Here are the bug and webrev URLs:
>>
>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>>
>> 8033714 hotspot 'install_jvm' bld target broken with ZIP_DEBUGINFO_FILES=0
>> https://bugs.openjdk.java.net/browse/JDK-8033714
>>
>> As you might guess from the bug synopsis, this fix is needed when building without ZIP'ing
>> the debuginfo files (ZIP_DEBUGINFO_FILES=0).
>> Based on the Graal project fix, I've updated a few other places where building with FDS
>> disabled is affected.
>>
>> As always, comments and suggestions are welcome.
>>
>> Dan

From david.holmes at oracle.com Wed Feb 5 20:28:44 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Feb 2014 14:28:44 +1000
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <52F2C754.40809@oracle.com>
References: <52F2C754.40809@oracle.com>
Message-ID: <52F30F7C.7010407@oracle.com>

Hi Dan,

Looks good to me.

(I never run the install targets :( )

Thanks,
David

On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote:
> This code review request is going to three different aliases.
> Don't use Thunderbird's "reply to list" option since it will
> pick just _one_ of the _three_ lists.
>
>
> Greetings,
>
> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
> makefile fix our way. Here are the bug and webrev URLs:
>
> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>
> 8033714 hotspot 'install_jvm' bld target broken with ZIP_DEBUGINFO_FILES=0
> https://bugs.openjdk.java.net/browse/JDK-8033714
>
> As you might guess from the bug synopsis, this fix is needed when
> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0).
> Based on the Graal project fix, I've updated a few other places where
> building with FDS disabled is affected.
>
> As always, comments and suggestions are welcome.
>
> Dan

From david.holmes at oracle.com Wed Feb 5 20:36:40 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Feb 2014 14:36:40 +1000
Subject: Need Second reviewer! Re: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not available in client VM
In-Reply-To: <52F2406B.70405@oracle.com>
References: <52EBFC89.3030809@oracle.com> <52EF7675.3020707@oracle.com> <52F0EF49.4050003@oracle.com> <52F2406B.70405@oracle.com>
Message-ID: <52F31158.1050101@oracle.com>

Hi Dmitry,

This looks okay to me.

But can I suggest not starting a new email thread when you need to prompt for reviews, as it makes this review disconnected from the original RFR email. (I would have replied to the original but I don't have a local copy to do that.)

Thanks,
David

On 5/02/2014 11:45 PM, Dmitry Samersoff wrote:
> On 2014-02-04 17:46, Dmitry Samersoff wrote:
>> Staffan,
>>
>> You were right, only two classes are C2 specific.
>>
>> Here is the updated webrev.
>>
>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.02/
>>
>> -Dmitry
>>
>> On 2014-02-03 16:13, Staffan Larsen wrote:
>>>
>>> On 3 feb 2014, at 11:59, Dmitry Samersoff wrote:
>>>
>>>> Staffan,
>>>>
>>>> All items under try-catch exist only in the server (C2 compiler) VM, so if
>>>> one of them is missing in the target VM the other items are missing as well.
>>>
>>> Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?
>>>
>>> /Staffan
>>>
>>>
>>>>
>>>> I have no idea what happens later if SA tries to access one of these
>>>> non-initialized items, but I don't see any errors in my home testing.
>>>>
>>>> -Dmitry
>>>>
>>>> On 2014-02-03 14:39, Staffan Larsen wrote:
>>>>> It looks like this new version will leave some things uninitialized
>>>>> if an item higher on the list throws an exception. I don't think that
>>>>> is what you want.
>>>>>
>>>>> /Staffan
>>>>>
>>>>>
>>>>> On 31 jan 2014, at 20:42, Dmitry Samersoff
>>>>> wrote:
>>>>>
>>>>>> Hi Everybody,
>>>>>>
>>>>>> Please review the fix:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>>>>>>
>>>>>> sa.js tries to initialize all possible VM structures regardless of
>>>>>> the type of VM we are running on
>>>>>>
>>>>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>>>>>> Russia * I would love to change the world, but they won't give me
>>>>>> the sources.
>>>>>
>>>>
>>>>
>>>> --
>>>> Dmitry Samersoff
>>>> Oracle Java development team, Saint Petersburg, Russia
>>>> * I would love to change the world, but they won't give me the sources.
>>>
>>
>>
>
>

From Alan.Bateman at oracle.com Thu Feb 6 02:04:13 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 06 Feb 2014 10:04:13 +0000
Subject: 8033758: gcc warnings compiling jdk/src/share/back
Message-ID: <52F35E1D.3040507@oracle.com>

This is a drive-by fix to the JDWP agent to fix 50+ warnings that have been annoying me, see:
https://bugs.openjdk.java.net/browse/JDK-8033758

The bulk of the warnings stem from using a switch statement to switch on JVMTI and JDWP internal "agent" errors. The agent errors are arranged (in util.h) so that they have integer values beyond the range of the JVMTI errors.

There are several ways to deal with this, including suppressing the warning. I toyed with splitting the switch statement into two so that the JVMTI and agent errors are handled separately. It didn't seem to be worth it and the simplest is just to cast the jvmtiError to an int.

The other warning (in ArrayTypeImpl's writeNewObjectArray) is just that the compiler doesn't know if componentClass has been set in getComponentClass. The simplest thing here is just to initialize to NULL.

The proposed changes are below.
-Alan

diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
--- a/src/share/back/ArrayTypeImpl.c
+++ b/src/share/back/ArrayTypeImpl.c
@@ -112,7 +112,7 @@
     WITH_LOCAL_REFS(env, 1) {

         jarray array;
-        jclass componentClass;
+        jclass componentClass = NULL;
         jdwpError serror;

         serror = getComponentClass(env, arrayClass,
diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
--- a/src/share/back/error_messages.c
+++ b/src/share/back/error_messages.c
@@ -140,7 +140,7 @@
 const char *
 jvmtiErrorText(jvmtiError error)
 {
-    switch (error) {
+    switch ((int)error) {
         CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
         CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
         CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
diff --git a/src/share/back/util.c b/src/share/back/util.c
--- a/src/share/back/util.c
+++ b/src/share/back/util.c
@@ -2122,7 +2122,7 @@
 jdwpError
 map2jdwpError(jvmtiError error)
 {
-    switch ( error ) {
+    switch ( (int)error ) {
     case JVMTI_ERROR_NONE:
         return JDWP_ERROR(NONE);
     case AGENT_ERROR_INVALID_THREAD:

From david.holmes at oracle.com Thu Feb 6 02:22:06 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Feb 2014 20:22:06 +1000
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F35E1D.3040507@oracle.com>
References: <52F35E1D.3040507@oracle.com>
Message-ID: <52F3624E.8010404@oracle.com>

Hi Alan,

On 6/02/2014 8:04 PM, Alan Bateman wrote:
>
> This is a drive-by fix to the JDWP agent to fix 50+ warnings that have
> been annoying me, see:
> https://bugs.openjdk.java.net/browse/JDK-8033758
>
> The bulk of the warnings stem from using a switch statement to switch on
> JVMTI and JDWP internal "agent" errors. The agent errors are arranged
> (in util.h) so that they have integer values beyond the range of the
> JVMTI errors.
>
> There are several ways to deal with this, including suppressing the
> warning.
> I toyed with splitting the switch statement into two so that
> the JVMTI and agent errors are handled separately. It didn't seem to be
> worth it and the simplest is just to cast the jvmtiError to an int.

Ok - not the nicest but this isn't worth any additional effort.

> The other warning (in ArrayTypeImpl's writeNewObjectArray) is just that
> the compiler doesn't know if componentClass has been set in
> getComponentClass. The simplest thing here is just to initialize to NULL.

Ok.

David
-----

> The proposed changes are below.
>
> -Alan
>
>
> diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
> --- a/src/share/back/ArrayTypeImpl.c
> +++ b/src/share/back/ArrayTypeImpl.c
> @@ -112,7 +112,7 @@
>      WITH_LOCAL_REFS(env, 1) {
>
>          jarray array;
> -        jclass componentClass;
> +        jclass componentClass = NULL;
>          jdwpError serror;
>
>          serror = getComponentClass(env, arrayClass,
> diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
> --- a/src/share/back/error_messages.c
> +++ b/src/share/back/error_messages.c
> @@ -140,7 +140,7 @@
>  const char *
>  jvmtiErrorText(jvmtiError error)
>  {
> -    switch (error) {
> +    switch ((int)error) {
>          CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
> diff --git a/src/share/back/util.c b/src/share/back/util.c
> --- a/src/share/back/util.c
> +++ b/src/share/back/util.c
> @@ -2122,7 +2122,7 @@
>  jdwpError
>  map2jdwpError(jvmtiError error)
>  {
> -    switch ( error ) {
> +    switch ( (int)error ) {
>      case JVMTI_ERROR_NONE:
>          return JDWP_ERROR(NONE);
>      case AGENT_ERROR_INVALID_THREAD:

From staffan.larsen at oracle.com Thu Feb 6 02:34:00 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 6 Feb 2014 11:34:00 +0100
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F35E1D.3040507@oracle.com>
References: <52F35E1D.3040507@oracle.com>
Message-ID:
<8E79CC47-31B2-4DAB-B8FB-AA56B6B71188@oracle.com>

Looks good!

Thanks,
/Staffan

On 6 feb 2014, at 11:04, Alan Bateman wrote:

>
> This is a drive-by fix to the JDWP agent to fix 50+ warnings that have been annoying me, see:
> https://bugs.openjdk.java.net/browse/JDK-8033758
>
> The bulk of the warnings stem from using a switch statement to switch on JVMTI and JDWP internal "agent" errors. The agent errors are arranged (in util.h) so that they have integer values beyond the range of the JVMTI errors.
>
> There are several ways to deal with this, including suppressing the warning. I toyed with splitting the switch statement into two so that the JVMTI and agent errors are handled separately. It didn't seem to be worth it and the simplest is just to cast the jvmtiError to an int.
>
> The other warning (in ArrayTypeImpl's writeNewObjectArray) is just that the compiler doesn't know if componentClass has been set in getComponentClass. The simplest thing here is just to initialize to NULL.
>
> The proposed changes are below.
>
> -Alan
>
>
> diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
> --- a/src/share/back/ArrayTypeImpl.c
> +++ b/src/share/back/ArrayTypeImpl.c
> @@ -112,7 +112,7 @@
>      WITH_LOCAL_REFS(env, 1) {
>
>          jarray array;
> -        jclass componentClass;
> +        jclass componentClass = NULL;
>          jdwpError serror;
>
>          serror = getComponentClass(env, arrayClass,
> diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
> --- a/src/share/back/error_messages.c
> +++ b/src/share/back/error_messages.c
> @@ -140,7 +140,7 @@
>  const char *
>  jvmtiErrorText(jvmtiError error)
>  {
> -    switch (error) {
> +    switch ((int)error) {
>          CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
> diff --git a/src/share/back/util.c b/src/share/back/util.c
> --- a/src/share/back/util.c
> +++ b/src/share/back/util.c
> @@ -2122,7 +2122,7 @@
>  jdwpError
>  map2jdwpError(jvmtiError error)
>  {
> -    switch ( error ) {
> +    switch ( (int)error ) {
>      case JVMTI_ERROR_NONE:
>          return JDWP_ERROR(NONE);
>      case AGENT_ERROR_INVALID_THREAD:

From serguei.spitsyn at oracle.com Thu Feb 6 02:45:31 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 06 Feb 2014 02:45:31 -0800
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F35E1D.3040507@oracle.com>
References: <52F35E1D.3040507@oracle.com>
Message-ID: <52F367CB.6080905@oracle.com>

Looks good.

Thanks, Alan!
Serguei

On 2/6/14 2:04 AM, Alan Bateman wrote:
>
> This is a drive-by fix to the JDWP agent to fix 50+ warnings that have
> been annoying me, see:
> https://bugs.openjdk.java.net/browse/JDK-8033758
>
> The bulk of the warnings stem from using a switch statement to switch
> on JVMTI and JDWP internal "agent" errors. The agent errors are
> arranged (in util.h) so that they have integer values beyond the range
> of the JVMTI errors.
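The situation described in the quoted paragraph can be reproduced in a few lines of C. This is only a sketch under stated assumptions: demo_error stands in for jvmtiError, and the DEMO_AGENT_ERROR_* macros are hypothetical stand-ins for the agent error codes; the real definitions in util.h may look different.

```c
/* Hypothetical stand-in for jvmtiError: a small public enum. */
typedef enum {
    DEMO_ERROR_NONE = 0,
    DEMO_ERROR_INVALID_THREAD = 10
} demo_error;

/* Hypothetical agent-internal codes placed above the enum's range,
 * analogous to the agent errors described in the message. */
#define DEMO_AGENT_ERROR_BASE     200
#define DEMO_AGENT_ERROR_INTERNAL (DEMO_AGENT_ERROR_BASE + 1)

const char *demo_error_text(demo_error error)
{
    /* Switching on the int value allows case labels that are not
     * enumerators of demo_error; with 'switch (error)' a compiler may
     * warn that such a case value is not in the enumerated type. */
    switch ((int)error) {
    case DEMO_ERROR_NONE:
        return "DEMO_ERROR_NONE";
    case DEMO_ERROR_INVALID_THREAD:
        return "DEMO_ERROR_INVALID_THREAD";
    case DEMO_AGENT_ERROR_INTERNAL:
        return "DEMO_AGENT_ERROR_INTERNAL";
    default:
        return "unknown";
    }
}
```

The cast does not change behaviour; it only changes the type the compiler uses when checking the case labels, which is presumably why it is the smallest possible change for this kind of warning.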
>
> There are several ways to deal with this, including suppressing the
> warning. I toyed with splitting the switch statement into two so that
> the JVMTI and agent errors are handled separately. It didn't seem to
> be worth it and the simplest is just to cast the jvmtiError to an int.
>
> The other warning (in ArrayTypeImpl's writeNewObjectArray) is just
> that the compiler doesn't know if componentClass has been set in
> getComponentClass. The simplest thing here is just to initialize to NULL.
>
> The proposed changes are below.
>
> -Alan
>
>
> diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
> --- a/src/share/back/ArrayTypeImpl.c
> +++ b/src/share/back/ArrayTypeImpl.c
> @@ -112,7 +112,7 @@
>      WITH_LOCAL_REFS(env, 1) {
>
>          jarray array;
> -        jclass componentClass;
> +        jclass componentClass = NULL;
>          jdwpError serror;
>
>          serror = getComponentClass(env, arrayClass,
> diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
> --- a/src/share/back/error_messages.c
> +++ b/src/share/back/error_messages.c
> @@ -140,7 +140,7 @@
>  const char *
>  jvmtiErrorText(jvmtiError error)
>  {
> -    switch (error) {
> +    switch ((int)error) {
>          CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
> diff --git a/src/share/back/util.c b/src/share/back/util.c
> --- a/src/share/back/util.c
> +++ b/src/share/back/util.c
> @@ -2122,7 +2122,7 @@
>  jdwpError
>  map2jdwpError(jvmtiError error)
>  {
> -    switch ( error ) {
> +    switch ( (int)error ) {
>      case JVMTI_ERROR_NONE:
>          return JDWP_ERROR(NONE);
>      case AGENT_ERROR_INVALID_THREAD:

From Alan.Bateman at oracle.com Thu Feb 6 02:46:54 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 06 Feb 2014 10:46:54 +0000
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F3624E.8010404@oracle.com>
References: <52F35E1D.3040507@oracle.com> <52F3624E.8010404@oracle.com>
Message-ID:
<52F3681E.6010302@oracle.com> On 06/02/2014 10:22, David Holmes wrote: > > Ok - not the nicest but this isn't worth any additional effort. Right, it's not worth it. Thanks for the quick review (Staffan too). I've just run the JDI tests on all platforms (as that exercises JDWP) and the tests are passing so I think we are good to go. -Alan From daniel.daugherty at oracle.com Thu Feb 6 06:07:59 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 06 Feb 2014 07:07:59 -0700 Subject: code review round 0 for minor FDS makefile fix (8033714) In-Reply-To: <52F30F7C.7010407@oracle.com> References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com> Message-ID: <52F3973F.3040509@oracle.com> On 2/5/14 9:28 PM, David Holmes wrote: > Hi Dan, > > Looks good to me. Thanks for the review! > (I never run the install targets :( ) Neither do I and apparently neither does JPRT... That's how this slipped through the cracks... Dan > > Thanks, > David > > On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote: >> This code review request is going to three different aliases. >> Don't use Thunderbird's "reply to list" option since it will >> pick just _one_ of the _three_ lists. >> >> >> Greetings, >> >> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) >> makefile fix our way. Here are the bug and webrev URLs: >> >> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/ >> >> 8033714 hotspot 'install_jvm' bld target broken with >> ZIP_DEBUGINFO_FILES=0 >> https://bugs.openjdk.java.net/browse/JDK-8033714 >> >> As you might guess from the bug synopsis, this fix is needed when >> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0). >> Based on the Graal project fix, I've updated a few other places where >> building with FDS disabled is affected. >> >> As always, comments and suggestions are welcome. >> >> Dan From daniel.daugherty at oracle.com Thu Feb 6 06:57:57 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 06 Feb 2014 07:57:57 -0700 Subject: 8033758: gcc warnings compiling jdk/src/share/back In-Reply-To: <52F35E1D.3040507@oracle.com> References: <52F35E1D.3040507@oracle.com> Message-ID: <52F3A2F5.3010807@oracle.com> Looks good to me. Nice to see you back in Serviceability code...:-) Dan On 2/6/14 3:04 AM, Alan Bateman wrote: > > This is a drive-by fix to the JDWP agent to fix 50+ warnings that have > been annoying me, see: > https://bugs.openjdk.java.net/browse/JDK-8033758 > > The bulk of the warnings stem from using a switch statement to switch > on JVMTI and JDWP internal "agent" errors. The agent errors are > arranged (in util.h) so that they have integer values beyond the range > of the JVMTI errors. > > There are several ways to deal with this, including suppressing the > warning. I toyed with splitting the switch statement into two so that > the JVMTI and agent errors are handled separately. It didn't seem to > be worth and the simplest is just to cast the jvmtiError to an int. > > The other warning (in ArrayTypeImpl's writeNewObjectArray) is just > that the compiler doesn't know if componentClass has been set in > getComponentClass. The simplest thing here is just to initialize to NULL. > > The proposed changes are below. 
> > -Alan > > > diff --git a/src/share/back/ArrayTypeImpl.c > b/src/share/back/ArrayTypeImpl.c > --- a/src/share/back/ArrayTypeImpl.c > +++ b/src/share/back/ArrayTypeImpl.c > @@ -112,7 +112,7 @@ > WITH_LOCAL_REFS(env, 1) { > > jarray array; > - jclass componentClass; > + jclass componentClass = NULL; > jdwpError serror; > > serror = getComponentClass(env, arrayClass, > diff --git a/src/share/back/error_messages.c > b/src/share/back/error_messages.c > --- a/src/share/back/error_messages.c > +++ b/src/share/back/error_messages.c > @@ -140,7 +140,7 @@ > const char * > jvmtiErrorText(jvmtiError error) > { > - switch (error) { > + switch ((int)error) { > CASE_RETURN_TEXT(JVMTI_ERROR_NONE) > CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD) > CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP) > diff --git a/src/share/back/util.c b/src/share/back/util.c > --- a/src/share/back/util.c > +++ b/src/share/back/util.c > @@ -2122,7 +2122,7 @@ > jdwpError > map2jdwpError(jvmtiError error) > { > - switch ( error ) { > + switch ( (int)error ) { > case JVMTI_ERROR_NONE: > return JDWP_ERROR(NONE); > case AGENT_ERROR_INVALID_THREAD: From tom.rodriguez at oracle.com Thu Feb 6 08:07:18 2014 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Thu, 6 Feb 2014 08:07:18 -0800 Subject: code review round 0 for minor FDS makefile fix (8033714) In-Reply-To: <52F3973F.3040509@oracle.com> References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com> <52F3973F.3040509@oracle.com> Message-ID: <1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com> Looks good to me too. Thanks for fixing this. tom On Feb 6, 2014, at 6:07 AM, Daniel D. Daugherty wrote: > On 2/5/14 9:28 PM, David Holmes wrote: >> Hi Dan, >> >> Looks good to me. > > Thanks for the review! > > >> (I never run the install targets :( ) > > Neither do I and apparently neither does JPRT... That's how this > slipped through the cracks... > > Dan > > >> >> Thanks, >> David >> >> On 6/02/2014 9:20 AM, Daniel D. 
Daugherty wrote: >>> This code review request is going to three different aliases. >>> Don't use Thunderbird's "reply to list" option since it will >>> pick just _one_ of the _three_ lists. >>> >>> >>> Greetings, >>> >>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) >>> makefile fix our way. Here are the bug and webrev URLs: >>> >>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/ >>> >>> 8033714 hotspot 'install_jvm' bld target broken with >>> ZIP_DEBUGINFO_FILES=0 >>> https://bugs.openjdk.java.net/browse/JDK-8033714 >>> >>> As you might guess from the bug synopsis, this fix is needed when >>> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0). >>> Based on the Graal project fix, I've updated a few other places where >>> building with FDS disabled is affected. >>> >>> As always, comments and suggestions are welcome. >>> >>> Dan > From daniel.daugherty at oracle.com Thu Feb 6 08:32:37 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 06 Feb 2014 09:32:37 -0700 Subject: code review round 0 for minor FDS makefile fix (8033714) In-Reply-To: References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com> <52F3973F.3040509@oracle.com> <1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com> Message-ID: <52F3B925.5030602@oracle.com> On 2/6/14 9:29 AM, Doug Simon wrote: > Not sure if I'm being asked for a review, but if so, looks good. Yes, I was looking for a review. In particular because I tweaked your original fix... Thanks for the review! Dan > > On Feb 6, 2014, at 5:07 PM, Tom Rodriguez wrote: > >> Looks good to me too. Thanks for fixing this. >> >> tom >> >> On Feb 6, 2014, at 6:07 AM, Daniel D. Daugherty wrote: >> >>> On 2/5/14 9:28 PM, David Holmes wrote: >>>> Hi Dan, >>>> >>>> Looks good to me. >>> Thanks for the review! >>> >>> >>>> (I never run the install targets :( ) >>> Neither do I and apparently neither does JPRT... That's how this >>> slipped through the cracks...
>>> >>> Dan >>> >>> >>>> Thanks, >>>> David >>>> >>>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote: >>>>> This code review request is going to three different aliases. >>>>> Don't use Thunderbird's "reply to list" option since it will >>>>> pick just _one_ of the _three_ lists. >>>>> >>>>> >>>>> Greetings, >>>>> >>>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) >>>>> makefile fix our way. Here are the bug and webrev URLs: >>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/ >>>>> >>>>> 8033714 hotspot 'install_jvm' bld target broken with >>>>> ZIP_DEBUGINFO_FILES=0 >>>>> https://bugs.openjdk.java.net/browse/JDK-8033714 >>>>> >>>>> As you might guess from the bug synopsis, this fix is needed when >>>>> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0). >>>>> Based on the Graal project fix, I've updated a few other places where >>>>> building with FDS disabled is affected. >>>>> >>>>> As always, comments and suggestions are welcome. >>>>> >>>>> Dan From doug.simon at oracle.com Thu Feb 6 08:29:02 2014 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 6 Feb 2014 17:29:02 +0100 Subject: code review round 0 for minor FDS makefile fix (8033714) In-Reply-To: <1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com> References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com> <52F3973F.3040509@oracle.com> <1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com> Message-ID: Not sure if I'm being asked for a review, but if so, looks good. On Feb 6, 2014, at 5:07 PM, Tom Rodriguez wrote: > Looks good to me too. Thanks for fixing this. > > tom > > On Feb 6, 2014, at 6:07 AM, Daniel D. Daugherty wrote: > >> On 2/5/14 9:28 PM, David Holmes wrote: >>> Hi Dan, >>> >>> Looks good to me. >> >> Thanks for the review! >> >> >>> (I never run the install targets :( ) >> >> Neither do I and apparently neither does JPRT... That's how this >> slipped through the cracks...
>> >> Dan >> >> >>> >>> Thanks, >>> David >>> >>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote: >>>> This code review request is going to three different aliases. >>>> Don't use Thunderbird's "reply to list" option since it will >>>> pick just _one_ of the _three_ lists. >>>> >>>> >>>> Greetings, >>>> >>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) >>>> makefile fix our way. Here are the bug and webrev URLs: >>>> >>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/ >>>> >>>> 8033714 hotspot 'install_jvm' bld target broken with >>>> ZIP_DEBUGINFO_FILES=0 >>>> https://bugs.openjdk.java.net/browse/JDK-8033714 >>>> >>>> As you might guess from the bug synopsis, this fix is needed when >>>> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0). >>>> Based on the Graal project fix, I've updated a few other places where >>>> building with FDS disabled is affected. >>>> >>>> As always, comments and suggestions are welcome. >>>> >>>> Dan >> > From tim.bell at oracle.com Thu Feb 6 08:53:53 2014 From: tim.bell at oracle.com (Tim Bell) Date: Thu, 06 Feb 2014 08:53:53 -0800 Subject: code review round 0 for minor FDS makefile fix (8033714) In-Reply-To: <52F3B925.5030602@oracle.com> References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com> <52F3973F.3040509@oracle.com> <1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com> <52F3B925.5030602@oracle.com> Message-ID: <52F3BE21.9000100@oracle.com> On 02/ 6/14 08:32 AM, Daniel D. Daugherty wrote: Looks good to me, Dan Tim >>>>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote: >>>>>> This code review request is going to three different aliases. >>>>>> Don't use Thunderbird's "reply to list" option since it will >>>>>> pick just _one_ of the _three_ lists. >>>>>> >>>>>> >>>>>> Greetings, >>>>>> >>>>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) >>>>>> makefile fix our way. 
Here are the bug and webrev URLs: >>>>>> >>>>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/ >>>>>> >>>>>> 8033714 hotspot 'install_jvm' bld target broken with >>>>>> ZIP_DEBUGINFO_FILES=0 >>>>>> https://bugs.openjdk.java.net/browse/JDK-8033714 >>>>>> >>>>>> As you might guess from the bug synopsis, this fix is needed when >>>>>> building without ZIP'ing the debuginfo files >>>>>> (ZIP_DEBUGINFO_FILES=0). >>>>>> Based on the Graal project fix, I've updated a few other places >>>>>> where >>>>>> building with FDS disabled is affected. >>>>>> >>>>>> As always, comments and suggestions are welcome. >>>>>> >>>>>> Dan > From daniel.daugherty at oracle.com Thu Feb 6 09:07:00 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 06 Feb 2014 10:07:00 -0700 Subject: code review round 0 for minor FDS makefile fix (8033714) In-Reply-To: <52F3BE21.9000100@oracle.com> References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com> <52F3973F.3040509@oracle.com> <1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com> <52F3B925.5030602@oracle.com> <52F3BE21.9000100@oracle.com> Message-ID: <52F3C134.2000004@oracle.com> Thanks for the review! Dan On 2/6/14 9:53 AM, Tim Bell wrote: > On 02/ 6/14 08:32 AM, Daniel D. Daugherty wrote: > > Looks good to me, Dan > > Tim > >>>>>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote: >>>>>>> This code review request is going to three different aliases. >>>>>>> Don't use Thunderbird's "reply to list" option since it will >>>>>>> pick just _one_ of the _three_ lists. >>>>>>> >>>>>>> >>>>>>> Greetings, >>>>>>> >>>>>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) >>>>>>> makefile fix our way. 
Here are the bug and webrev URLs: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/ >>>>>>> >>>>>>> >>>>>>> 8033714 hotspot 'install_jvm' bld target broken with >>>>>>> ZIP_DEBUGINFO_FILES=0 >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8033714 >>>>>>> >>>>>>> As you might guess from the bug synopsis, this fix is needed when >>>>>>> building without ZIP'ing the debuginfo files >>>>>>> (ZIP_DEBUGINFO_FILES=0). >>>>>>> Based on the Graal project fix, I've updated a few other places >>>>>>> where >>>>>>> building with FDS disabled is affected. >>>>>>> >>>>>>> As always, comments and suggestions are welcome. >>>>>>> >>>>>>> Dan >> > From henry.jen at oracle.com Thu Feb 6 10:34:09 2014 From: henry.jen at oracle.com (henry.jen at oracle.com) Date: Thu, 06 Feb 2014 18:34:09 +0000 Subject: hg: jdk8/tl/jdk: 8033590: java.util.Comparator::thenComparing has unnecessary type restriction Message-ID: <20140206183432.A898562A7F@hg.openjdk.java.net> Changeset: 7534523b4174 Author: henryjen Date: 2014-02-06 10:30 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/7534523b4174 8033590: java.util.Comparator::thenComparing has unnecessary type restriction Reviewed-by: psandoz ! src/share/classes/java/util/Comparator.java ! test/java/util/Comparator/TypeTest.java From serguei.spitsyn at oracle.com Thu Feb 6 15:17:53 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 06 Feb 2014 15:17:53 -0800 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: <52F01569.3070201@oracle.com> References: <52F01569.3070201@oracle.com> Message-ID: <52F41821.10403@oracle.com> Runtime team, This fix was reviewed by Vladimir K. and me. Just wanted to make sure if you would like to review it as well. If not, then I will push it. 
Thanks, Serguei On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-8025841 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 > > > Summary: > > The fix contributed by Oleg Mazurov to improve profiling data quality. > It moves the "vtable stub" dynamic code notification to the right > place. > I've already reviewed the fix, and it looks good to me. > > Bug report description: > > "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets > scheduled when > a new chunk of memory for subsequent vtable and itable stubs is > allocated. > That chunk is uninitialized (contains zeros or garbage) although > due to the fact > that the actual event delivery is deferred, at least one vtable > comes out right. > > This event should describe an individual vtable/itable stub (base > address and size) > and only after it's been created (memory is actually populated > with code). > Where VM diagnostic messages about vtable/itable stubs are issued > upon > -XX:+PrintAdapterHandlers appears exactly the right place for > JVMTI events as well. > > Getting vtables/itables right is important in the context of > performance analysis as > that dynamically generated code may accumulate quite noticeable > CPU time > (especially itabes), sometimes larger than the actual Java methods > called." > > > Testing: > Oleg tested it in the Oracle Studio Performance Analyzer environment. 
> nsk.jvmti, nsk.jdi, nsk.jdwp, > In progress: Jtreg com/sun/jdi, java/lang/instrument > > > Thanks, > Serguei > From coleen.phillimore at oracle.com Thu Feb 6 15:32:13 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 06 Feb 2014 18:32:13 -0500 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: <52F41821.10403@oracle.com> References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com> Message-ID: <52F41B7D.6090206@oracle.com> Hi, I clicked on this a couple times. It seems okay but isn't there a safer way to identify code blobs that are vtable stubs, without looking at the name (which can change in while creating it). A comment at least when you create "vtable chunks" would be good. It seems that someone might want to rename it "vtable or itable buffers", or something like that. thanks, Coleen On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote: > Runtime team, > > This fix was reviewed by Vladimir K. and me. > Just wanted to make sure if you would like to review it as well. > If not, then I will push it. > > Thanks, > Serguei > > On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-8025841 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 >> >> >> Summary: >> >> The fix contributed by Oleg Mazurov to improve profiling data quality. >> It moves the "vtable stub" dynamic code notification to the right >> place. >> I've already reviewed the fix, and it looks good to me. >> >> Bug report description: >> >> "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets >> scheduled when >> a new chunk of memory for subsequent vtable and itable stubs is >> allocated. >> That chunk is uninitialized (contains zeros or garbage) although >> due to the fact >> that the actual event delivery is deferred, at least one vtable >> comes out right. 
>> >> This event should describe an individual vtable/itable stub (base >> address and size) >> and only after it's been created (memory is actually populated >> with code). >> Where VM diagnostic messages about vtable/itable stubs are issued >> upon >> -XX:+PrintAdapterHandlers appears exactly the right place for >> JVMTI events as well. >> >> Getting vtables/itables right is important in the context of >> performance analysis as >> that dynamically generated code may accumulate quite noticeable >> CPU time >> (especially itabes), sometimes larger than the actual Java >> methods called." >> >> >> Testing: >> Oleg tested it in the Oracle Studio Performance Analyzer environment. >> nsk.jvmti, nsk.jdi, nsk.jdwp, >> In progress: Jtreg com/sun/jdi, java/lang/instrument >> >> >> Thanks, >> Serguei >> > From oleg.mazurov at oracle.com Thu Feb 6 15:52:28 2014 From: oleg.mazurov at oracle.com (Oleg Mazurov) Date: Thu, 6 Feb 2014 15:52:28 -0800 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: <52F41B7D.6090206@oracle.com> References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com> <52F41B7D.6090206@oracle.com> Message-ID: My understanding was that a buffer blob was just that - a buffer. Could potentially contain code fragments of different kinds. Thus, is_buffer_blob() was the closest type available. Agree that a dependency on its name is not reliable, though testing will reveal if the condition turns false for "vtable chunks" due to a name change (I had to deal with that particular test, Serguei should be able to identify it). Adding a comment to where the name is defined (vtableStubs.cpp) that such a dependency exists is a good idea.
Thanks, -- Oleg On Feb 6, 2014, at 3:32 PM, Coleen Phillimore wrote: > > Hi, I clicked on this a couple times. It seems okay but isn't there a safer way to identify code blobs that are vtable stubs, without looking at the name (which can change in while creating it). A comment at least when you create "vtable chunks" would be good. It seems that someone might want to rename it "vtable or itable buffers", or something like that. > > thanks, > Coleen > > On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote: >> Runtime team, >> >> This fix was reviewed by Vladimir K. and me. >> Just wanted to make sure if you would like to review it as well. >> If not, then I will push it. >> >> Thanks, >> Serguei >> >> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-8025841 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 >>> >>> Summary: >>> >>> The fix contributed by Oleg Mazurov to improve profiling data quality. >>> It moves the "vtable stub" dynamic code notification to the right place. >>> I've already reviewed the fix, and it looks good to me. >>> >>> Bug report description: >>> >>> "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled when >>> a new chunk of memory for subsequent vtable and itable stubs is allocated. >>> That chunk is uninitialized (contains zeros or garbage) although due to the fact >>> that the actual event delivery is deferred, at least one vtable comes out right. >>> >>> This event should describe an individual vtable/itable stub (base address and size) >>> and only after it's been created (memory is actually populated with code). >>> Where VM diagnostic messages about vtable/itable stubs are issued upon >>> -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI events as well. 
>>> >>> Getting vtables/itables right is important in the context of performance analysis as >>> that dynamically generated code may accumulate quite noticeable CPU time >>> (especially itabes), sometimes larger than the actual Java methods called." >>> >>> >>> Testing: >>> Oleg tested it in the Oracle Studio Performance Analyzer environment. >>> nsk.jvmti, nsk.jdi, nsk.jdwp, >>> In progress: Jtreg com/sun/jdi, java/lang/instrument >>> >>> >>> Thanks, >>> Serguei >>> >> > From coleen.phillimore at oracle.com Thu Feb 6 16:23:05 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 06 Feb 2014 19:23:05 -0500 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com> <52F41B7D.6090206@oracle.com> Message-ID: <52F42769.3040603@oracle.com> Okay, thanks for adding a comment. Coleen On 2/6/14 6:52 PM, Oleg Mazurov wrote: > My understanding was that a buffer blob was just that - a buffer. > Could potentially contain code fragments of different kinds. > Thus, is_buffer_blob() was the closest type available. Agree that a > dependency on its name is not reliable, though testing > will reveal if the condition turns false for "vtable chunks" due to a > name change (I had to deal with that particular test, Serguei > should be able to identify it). Adding a comment to where the name is > defined (vtableStubs.cpp) that such a dependency exists > is a good idea. > Thanks, > > -- Oleg > > On Feb 6, 2014, at 3:32 PM, Coleen Phillimore wrote: > >> >> Hi, I clicked on this a couple times. It seems okay but isn't there >> a safer way to identify code blobs that are vtable stubs, without >> looking at the name (which can change in while creating it).
A >> comment at least when you create "vtable chunks" would be good. It >> seems that someone might want to rename it "vtable or itable >> buffers", or something like that. >> >> thanks, >> Coleen >> >> On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote: >>> Runtime team, >>> >>> This fix was reviewed by Vladimir K. and me. >>> Just wanted to make sure if you would like to review it as well. >>> If not, then I will push it. >>> >>> Thanks, >>> Serguei >>> >>> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote: >>>> Please, review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-8025841 >>>> >>>> >>>> Open webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 >>>> >>>> >>>> Summary: >>>> >>>> The fix contributed by Oleg Mazurov to improve profiling data >>>> quality. >>>> It moves the "vtable stub" dynamic code notification to the right >>>> place. >>>> I've already reviewed the fix, and it looks good to me. >>>> >>>> Bug report description: >>>> >>>> "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets >>>> scheduled when >>>> a new chunk of memory for subsequent vtable and itable stubs is >>>> allocated. >>>> That chunk is uninitialized (contains zeros or garbage) >>>> although due to the fact >>>> that the actual event delivery is deferred, at least one vtable >>>> comes out right. >>>> >>>> This event should describe an individual vtable/itable stub >>>> (base address and size) >>>> and only after it's been created (memory is actually populated >>>> with code). >>>> Where VM diagnostic messages about vtable/itable stubs are >>>> issued upon >>>> -XX:+PrintAdapterHandlers appears exactly the right place for >>>> JVMTI events as well. 
>>>> >>>> Getting vtables/itables right is important in the context of >>>> performance analysis as >>>> that dynamically generated code may accumulate quite noticeable >>>> CPU time >>>> (especially itabes), sometimes larger than the actual Java >>>> methods called." >>>> >>>> >>>> Testing: >>>> Oleg tested it in the Oracle Studio Performance Analyzer >>>> environment. >>>> nsk.jvmti, nsk.jdi, nsk.jdwp, >>>> In progress: Jtreg com/sun/jdi, java/lang/instrument >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>> >> > From staffan.larsen at oracle.com Fri Feb 7 02:46:07 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 7 Feb 2014 11:46:07 +0100 Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and RandomAccessFile Message-ID: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be to non-native and instead calling private native methods simplifies instrumentation. webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8033911 Thanks, /Staffan From Alan.Bateman at oracle.com Fri Feb 7 02:56:30 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 07 Feb 2014 10:56:30 +0000 Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and RandomAccessFile In-Reply-To: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> Message-ID: <52F4BBDE.4010005@oracle.com> On 07/02/2014 10:46, Staffan Larsen wrote: > A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native.
This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be to non-native and instead calling private native methods simplifies instrumentation. > > webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8033911 > I assume you know this already but both JVM TI and java.lang.instrument do have support for hooking into the resolution of native methods. It is of course more complicated and not for the fainthearted. The proposed changes look okay. The only thing is that it might not be complete but perhaps it's not too interesting to instrument methods such as skip or available. -Alan. From dmitry.samersoff at oracle.com Fri Feb 7 03:07:23 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 07 Feb 2014 15:07:23 +0400 Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and RandomAccessFile In-Reply-To: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> Message-ID: <52F4BE6B.8040109@oracle.com> Staffan, As far as you touching this. Is it possible to change all native methods in these two classes to have 0 at the end of name? i.e. readBytes => readBytes0 it's pure cosmetic, but fairly simplify core dump reading and later grep-ing. -Dmitry On 2014-02-07 14:46, Staffan Larsen wrote: > A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be to non-native and instead calling private native methods simplifies instrumentation. > > webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8033911 > > Thanks, > /Staffan > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
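[Editor's sketch for JDK-8033911] The request in this thread describes turning public native methods into ordinary Java methods that delegate to private native ones, so that a bytecode agent has a method body to rewrite. The shape might look roughly like the following. SketchStream, its wrapperCalls counter, and the in-memory read0() stand-in are assumptions made for illustration only; in the JDK the private method would be declared native and implemented in C, and the actual webrev may differ.

```java
// Hypothetical sketch, not the JDK source: shows the "public wrapper,
// private worker" shape discussed in the thread.
class SketchStream {
    private final byte[] data;   // stand-in for the underlying file
    private int pos = 0;
    int wrapperCalls = 0;        // illustrative only: marks where an agent could inject code

    SketchStream(byte[] data) {
        this.data = data;
    }

    // Before the change this would have been "public native int read()",
    // which has no bytecode body for an agent to rewrite.
    public int read() {
        wrapperCalls++;          // an agent could insert timing or logging here
        return read0();
    }

    // In the real classes this would be "private native int read0()".
    // Here it is plain Java so the sketch runs without a native library.
    private int read0() {
        return pos < data.length ? (data[pos++] & 0xff) : -1;
    }
}

class SketchStreamDemo {
    public static void main(String[] args) {
        SketchStream in = new SketchStream(new byte[] {10, 20});
        System.out.println(in.read());        // 10
        System.out.println(in.read());        // 20
        System.out.println(in.read());        // -1 (end of data)
        System.out.println(in.wrapperCalls);  // 3
    }
}
```

The point of the wrapper is only that read() now has a Java body; the work itself still happens in the private method, so callers see no behavioural difference.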
From staffan.larsen at oracle.com Fri Feb 7 03:27:46 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 7 Feb 2014 12:27:46 +0100 Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and RandomAccessFile In-Reply-To: <52F4BBDE.4010005@oracle.com> References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> <52F4BBDE.4010005@oracle.com> Message-ID: On 7 feb 2014, at 11:56, Alan Bateman wrote: > On 07/02/2014 10:46, Staffan Larsen wrote: >> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be to non-native and instead calling private native methods simplifies instrumentation. >> >> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8033911 >> > I assume you know this already but both JVM TI and java.lang.instrument do have support for hooking into the resolution of native methods. It is of course more complicated and not for the fainthearted. Yes, but a large drawback of using SetNativeMethodPrefix is that it requires schema changes to the class file (adding a method). This is not possible when dynamically attaching an instrumentation agent. > The proposed changes look okay. The only thing is that it might not be complete but perhaps it's not too interesting to instrument methods such as skip or available. That was my thinking, too. Thanks, /Staffan From staffan.larsen at oracle.com Fri Feb 7 03:28:12 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 7 Feb 2014 12:28:12 +0100 Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and RandomAccessFile In-Reply-To: <52F4BE6B.8040109@oracle.com> References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> <52F4BE6B.8040109@oracle.com> Message-ID: I would prefer that to be a different change. 
Thanks, /Staffan On 7 feb 2014, at 12:07, Dmitry Samersoff wrote: > Staffan, > > As far as you touching this. > > Is it possible to change all native methods in these two classes to have > 0 at the end of name? > > i.e. readBytes => readBytes0 > > it's pure cosmetic, but fairly simplify core dump reading and later > grep-ing. > > -Dmitry > > On 2014-02-07 14:46, Staffan Larsen wrote: >> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be to non-native and instead calling private native methods simplifies instrumentation. >> >> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8033911 >> >> Thanks, >> /Staffan >> > > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources. From dmitry.samersoff at oracle.com Fri Feb 7 03:30:23 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 07 Feb 2014 15:30:23 +0400 Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and RandomAccessFile In-Reply-To: References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> <52F4BE6B.8040109@oracle.com> Message-ID: <52F4C3CF.4090505@oracle.com> Staffan, OK! Looks good for me. -Dmitry On 2014-02-07 15:28, Staffan Larsen wrote: > I would prefer that to be a different change. > > Thanks, > /Staffan > > On 7 feb 2014, at 12:07, Dmitry Samersoff wrote: > >> Staffan, >> >> As far as you touching this. >> >> Is it possible to change all native methods in these two classes to have >> 0 at the end of name? >> >> i.e. readBytes => readBytes0 >> >> it's pure cosmetic, but fairly simplify core dump reading and later >> grep-ing. 
>> >> -Dmitry >> >> On 2014-02-07 14:46, Staffan Larsen wrote: >>> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be to non-native and instead calling private native methods simplifies instrumentation. >>> >>> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8033911 >>> >>> Thanks, >>> /Staffan >>> >> >> >> -- >> Dmitry Samersoff >> Oracle Java development team, Saint Petersburg, Russia >> * I would love to change the world, but they won't give me the sources. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From jaroslav.bachorik at oracle.com Fri Feb 7 03:38:59 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 07 Feb 2014 12:38:59 +0100 Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and RandomAccessFile In-Reply-To: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> Message-ID: <3d89f5ba-3c6e-4571-b7bf-19cbad4e5b5c@email.android.com> Looks good. Additional benefit is the compliance with the secure coding guide. -JB- On February 7, 2014 11:46:07 AM CET, Staffan Larsen wrote: >A few of the public read and write methods in FileInputStream and >RandomAccessFile are declared native. This means that it is hard to >instrument them using byte code instrumentation. Changing the public >methods to be to non-native and instead calling private native methods >simplifies instrumentation. > >webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/ >bug: https://bugs.openjdk.java.net/browse/JDK-8033911 > >Thanks, >/Staffan -- Sent from my Android device with K-9 Mail. Please excuse my brevity. 
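For readers following the JDK-8033911 thread, here is a minimal sketch of the pattern being reviewed. It is not the actual patch. The class name DelegatingStream and the method name read0 are hypothetical, and the private method is plain Java standing in for what would be a private native method in the JDK, so that the sketch runs without a native library.

```java
// Hypothetical stand-in class; this is not the real java.io.FileInputStream.
final class DelegatingStream {
    private final byte[] data;
    private int pos;

    DelegatingStream(byte[] data) {
        this.data = data.clone();
    }

    // The public entry point is an ordinary bytecode method, so a bytecode
    // instrumentation agent could rewrite its body (for example, to time the call).
    public int read() {
        return read0();
    }

    // Assumption: in the JDK this would be declared "private native".
    // It is simulated in Java here so the example is self-contained.
    private int read0() {
        return pos < data.length ? (data[pos++] & 0xff) : -1;
    }
}
```

The point of the split is that only the private method has no bytecode body. The public method that callers use can be transformed like any other Java method, without adding methods to the class.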
From staffan.larsen at oracle.com Fri Feb 7 04:07:04 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 7 Feb 2014 13:07:04 +0100 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes Message-ID: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> Instrumentation agents that want to instrument FileInputStream/FileOutputStream to see which files are being accessed do not currently have access to the file system path of the stream. This is because the path is never stored in the stream class, only the file descriptor is. (This is also true for RandomAccessFile and FileChannel). An agent could instrument the respective constructors to store the path. The problem is where to store it. To add a field, the instrumentation agent needs to change the schema of the class. This is not possible at runtime but can be done at class-loading time. However for a j.l.instrument agent these classes are already defined when the agent is first called. For a native JVMTI agent the problem becomes parsing and modifying byte codes in a native agent which is error prone and requires a lot of code to maintain. If instead the stream classes were modified to store a reference to the path, it would be readily available for agents at a minimum of cost to the libraries. This is what this patch does. FileInputStream, FileOutputStream, RandomAccessFile and FileChannelImpl are modified to record the path they operate on in a private field. There are no accessors added to retrieve the path - it is purely stored for instrumentation purposes. The path is intentionally not resolved to be an absolute path since that would potentially add unwanted overhead. If a stream is created from a file descriptor, no path will be stored.
The overhead for this path will be keeping the path String alive for a longer period of time. I hope this will not cause any problems. A consumer of this feature will be Java Flight Recorder, but the implementation is usable by other agents as well. webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8033917 Thanks, /Staffan From Alan.Bateman at oracle.com Fri Feb 7 04:27:53 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 07 Feb 2014 12:27:53 +0000 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> Message-ID: <52F4D149.8000900@oracle.com> On 07/02/2014 12:07, Staffan Larsen wrote: > Instrumentation agents that want to instrument FileInputStream/FileOutputStream to see which files are being accessed do not currently have access to the file system path of the stream. This is because the path is never stored in the stream class, only the file descriptor is. (This is also true for RandomAccessFile and FileChannel). > > An agent could instrument the respective constructors to store the path. The problem is where to store it. To add a field, the instrumentation agent needs to change the schema of the class. This is not possible at runtime but can be done at class-loading time. However for a j.l.instrument agent these classes are already defined when the agent is first called. For a native JVMTI agent the problem becomes parsing and modifying byte codes in a native agent which is error prone and requires a lot of code to maintain. > > If instead the stream classes were modified to store a reference to the path, it would be readily available for agents at a minimum of cost to the libraries. This is what this patch does. 
FileInputStream, FileOutputStream, RandomAccessFile and FileChannelImpl are modified to record the path they operate on in a private field. There are no accessors added to retrieve the path - it is purely stored for instrumentation purposes. The path is intentionally not resolved to be an absolute path since that would potentially add unwanted overhead. If a stream is created from a file descriptor, no path will be stored. > > The overhead for this path will be keeping the path String alive for a longer period of time. I hope this will not cause any problems. > > A consumer of this feature will be Java Flight Recorder, but the implementation is usable by other agents as well. > > webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8033917 > I have reservations about doing this as hints of code making use of private fields which isn't good. For the comments in FileInputStream and other then it might be best to keep the line lengths consistent with the existing code if you can (it makes future side-by-side reviews a bit easier too). In WindowsChannelFactory then you've re-order and expand imports. The ordering of the import groups in this area has been Java SE, JDK-specific and then finally the JDK-internal. It's not a big deal of course. Personally I prefer the original static imports but I know some people don't. -Alan. 
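To make the JDK-8033917 proposal concrete, here is a small sketch under stated assumptions. It is not the webrev. TrackedStream and PathProbe are hypothetical names, the field is assumed to be final, and reflection is used only to show that a private field without an accessor is still reachable by tooling. A real agent would more likely inject bytecode into the class itself.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a stream class; not the real JDK source.
final class TrackedStream {
    // Stored exactly as given (not resolved to an absolute path); no accessor.
    private final String path;

    // Opened by name: remember the path.
    TrackedStream(String name) {
        this.path = name;
    }

    // Created from a file descriptor: no path is known, so none is stored.
    TrackedStream(int fd) {
        this.path = null;
    }
}

// Hypothetical helper showing how tooling might read the private field.
final class PathProbe {
    static String pathOf(Object stream) {
        try {
            Field f = stream.getClass().getDeclaredField("path");
            f.setAccessible(true);
            return (String) f.get(stream);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no readable path field", e);
        }
    }
}
```

The only cost to the library in this sketch is one extra reference per stream, which keeps the path String alive for as long as the stream is reachable.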
From staffan.larsen at oracle.com Fri Feb 7 04:49:55 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 7 Feb 2014 13:49:55 +0100 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <52F4D149.8000900@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> <52F4D149.8000900@oracle.com> Message-ID: On 7 feb 2014, at 13:27, Alan Bateman wrote: > On 07/02/2014 12:07, Staffan Larsen wrote: >> Instrumentation agents that want to instrument FileInputStream/FileOutputStream to see which files are being accessed do not currently have access to the file system path of the stream. This is because the path is never stored in the stream class, only the file descriptor is. (This is also true for RandomAccessFile and FileChannel). >> >> An agent could instrument the respective constructors to store the path. The problem is where to store it. To add a field, the instrumentation agent needs to change the schema of the class. This is not possible at runtime but can be done at class-loading time. However for a j.l.instrument agent these classes are already defined when the agent is first called. For a native JVMTI agent the problem becomes parsing and modifying byte codes in a native agent which is error prone and requires a lot of code to maintain. >> >> If instead the stream classes were modified to store a reference to the path, it would be readily available for agents at a minimum of cost to the libraries. This is what this patch does. FileInputStream, FileOutputStream, RandomAccessFile and FileChannelImpl are modified to record the path they operate on in a private field. There are no accessors added to retrieve the path - it is purely stored for instrumentation purposes. The path is intentionally not resolved to be an absolute path since that would potentially add unwanted overhead. If a stream is created from a file descriptor, no path will be stored. 
>> >> The overhead for this path will be keeping the path String alive for a longer period of time. I hope this will not cause any problems. >> >> A consumer of this feature will be Java Flight Recorder, but the implementation is usable by other agents as well. >> >> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8033917 >> > I have reservations about doing this as hints of code making use of private fields which isn't good. > > For the comments in FileInputStream and other then it might be best to keep the line lengths consistent with the existing code if you can (it makes future side-by-side reviews a bit easier too). I've updated the comments to have shorter lines and javadoc style comments. > In WindowsChannelFactory then you've re-order and expand imports. The ordering of the import groups in this area has been Java SE, JDK-specific and then finally the JDK-internal. It's not a big deal of course. Personally I prefer the original static imports but I know some people don't. I've reverted to the original order, and only expanded the non-static imports. (The danger of IDEs). Updated webrev here: http://cr.openjdk.java.net/~sla/8033917/webrev.01/ Thanks, /Staffan From Alan.Bateman at oracle.com Fri Feb 7 05:22:15 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 07 Feb 2014 13:22:15 +0000 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> <52F4D149.8000900@oracle.com> Message-ID: <52F4DE07.7090003@oracle.com> On 07/02/2014 12:49, Staffan Larsen wrote: > > Updated webrev here: > http://cr.openjdk.java.net/~sla/8033917/webrev.01/ > > Thanks for the updates, looks good. -Alan.
From staffan.larsen at oracle.com Fri Feb 7 05:25:05 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 7 Feb 2014 14:25:05 +0100 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <52F4DE07.7090003@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> <52F4D149.8000900@oracle.com> <52F4DE07.7090003@oracle.com> Message-ID: Thanks Alan! On 7 feb 2014, at 14:22, Alan Bateman wrote: > On 07/02/2014 12:49, Staffan Larsen wrote: >> >> Updated webrev here: http://cr.openjdk.java.net/~sla/8033917/webrev.01/ >> > Thanks for the updates, looks good. > > -Alan. From staffan.larsen at oracle.com Fri Feb 7 05:25:43 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 7 Feb 2014 14:25:43 +0100 Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and RandomAccessFile In-Reply-To: <3d89f5ba-3c6e-4571-b7bf-19cbad4e5b5c@email.android.com> References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com> <3d89f5ba-3c6e-4571-b7bf-19cbad4e5b5c@email.android.com> Message-ID: Alan, Jaroslav, Dmitry: Thanks! On 7 feb 2014, at 12:38, Jaroslav Bachorik wrote: > Looks good. Additional benefit is the compliance with the secure coding guide. > > -JB- > > On February 7, 2014 11:46:07 AM CET, Staffan Larsen wrote: > A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be to non-native and instead calling private native methods simplifies instrumentation. > > webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8033911 > > Thanks, > /Staffan > > -- > Sent from my Android device with K-9 Mail. Please excuse my brevity.
From dmitry.samersoff at oracle.com Fri Feb 7 07:24:00 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 07 Feb 2014 19:24:00 +0400 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> Message-ID: <52F4FA90.2080908@oracle.com> Staffan, FileInputStream.java 55: It's better to initialize path with null. 134: It's better to assign name at one of first lines, in this case we will be able to retrieve file name even if open fails for some reason. 171: It's not necessary (the same is applicable to other files) I'm a bit scared changing signature of public methods of FileChannelImpl but if Alan says it's OK - let's go with it. -Dmitry On 2014-02-07 16:07, Staffan Larsen wrote: > Instrumentation agents that want to instrument > FileInputStream/FileOutputStream to see which files are being > accessed do not currently have access to the file system path of the > stream. This is because the path is never stored in the stream class, > only the file descriptor is. (This is also true for RandomAccessFile > and FileChannel). > > An agent could instrument the respective constructors to store the > path. The problem is where to store it. To add a field, the > instrumentation agent needs to change the schema of the class. This > is not possible at runtime but can be done at class-loading time. > However for a j.l.instrument agent these classes are already defined > when the agent is first called. For a native JVMTI agent the problem > becomes parsing and modifying byte codes in a native agent which is > error prone and requires a lot of code to maintain.
> > If instead the stream classes were modified to store a reference to > the path, it would be readily available for agents at a minimum of > cost to the libraries. This is what this patch does. FileInputStream, > FileOutputStream, RandomAccessFile and FileChannelImpl are modified > to record the path they operate on in a private field. There are no > accessors added to retrieve the path - it is purely stored for > instrumentation purposes. The path is intentionally not resolved to > be an absolute path since that would potentially add unwanted > overhead. If a stream is created from a file descriptor, no path will > be stored. > > The overhead for this path will be keeping the path String alive for > a longer period of time. I hope this will not cause any problems. > > A consumer of this feature will be Java Flight Recorder, but the > implementation is usable by other agents as well. > > webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug: > https://bugs.openjdk.java.net/browse/JDK-8033917 > > Thanks, /Staffan > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From Alan.Bateman at oracle.com Fri Feb 7 07:27:36 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 07 Feb 2014 15:27:36 +0000 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <52F4FA90.2080908@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> <52F4FA90.2080908@oracle.com> Message-ID: <52F4FB68.7040805@oracle.com> On 07/02/2014 15:24, Dmitry Samersoff wrote: > : > > I'm a bit scared changing signature of public methods of FileChannelImpl > but if Alan says it's OK - lets go with it. > This is sun.nio.ch and so JDK-internal, nothing outside of the JDK should be using it directly. 
-Alan From chris.hegarty at oracle.com Fri Feb 7 07:32:03 2014 From: chris.hegarty at oracle.com (Chris Hegarty) Date: Fri, 07 Feb 2014 15:32:03 +0000 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <52F4FA90.2080908@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> <52F4FA90.2080908@oracle.com> Message-ID: <52F4FC73.5080803@oracle.com> On 07/02/14 15:24, Dmitry Samersoff wrote: > Staffan, > > FileInputStream.java > > 55: It's better to initialize path with null. I'm afraid I disagree with this. The default value is already null, why set it to null again? I see this pattern all over the code, but it seems completely redundant to me. -Chris. > 134: It's better to assign name at one of first lines, in this case we > will be able to retrieve file name even if open fails for some reason. > 171: It's not necessary > > (the same is applicable to other files) > > I'm a bit scared changing signature of public methods of FileChannelImpl > but if Alan says it's OK - let's go with it. > > -Dmitry > > > On 2014-02-07 16:07, Staffan Larsen wrote: >> Instrumentation agents that want to instrument >> FileInputStream/FileOutputStream to see which files are being >> accessed do not currently have access to the file system path of the >> stream. This is because the path is never stored in the stream class, >> only the file descriptor is. (This is also true for RandomAccessFile >> and FileChannel). >> >> An agent could instrument the respective constructors to store the >> path. The problem is where to store it. To add a field, the >> instrumentation agent needs to change the schema of the class. This >> is not possible at runtime but can be done at class-loading time. >> However for a j.l.instrument agent these classes are already defined >> when the agent is first called.
For a native JVMTI agent the problem >> becomes parsing and modifying byte codes in a native agent which is >> error prone and requires a lot of code to maintain. >> >> If instead the stream classes were modified to store a reference to >> the path, it would be readily available for agents at a minimum of >> cost to the libraries. This is what this patch does. FileInputStream, >> FileOutputStream, RandomAccessFile and FileChannelImpl are modified >> to record the path they operate on in a private field. There are no >> accessors added to retrieve the path - it is purely stored for >> instrumentation purposes. The path is intentionally not resolved to >> be an absolute path since that would potentially add unwanted >> overhead. If a stream is created from a file descriptor, no path will >> be stored. >> >> The overhead for this path will be keeping the path String alive for >> a longer period of time. I hope this will not cause any problems. >> >> A consumer of this feature will be Java Flight Recorder, but the >> implementation is usable by other agents as well. >> >> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug: >> https://bugs.openjdk.java.net/browse/JDK-8033917 >> >> Thanks, /Staffan >> > > From staffan.larsen at oracle.com Fri Feb 7 07:49:25 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 7 Feb 2014 16:49:25 +0100 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <52F4FA90.2080908@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> <52F4FA90.2080908@oracle.com> Message-ID: <3F1E5072-1745-4EA4-B783-93CAD2643B91@oracle.com> On 7 feb 2014, at 16:24, Dmitry Samersoff wrote: > Staffan, > > FileInputStream.java > > 55: It's better to initialize path with null. I agree with Chris here. The value should be explicitly initialized by all constructors. 
> 134: It's better to assign name at one of first lines, in this case we > will be able to retrieve file name even if open fails for some reason. This is the constructor. If anything fails it will throw an exception, and there won't be an object to look at. > 171: It's not necessary All constructors must initialize the value. Thanks, /Staffan > > (the same is applicable to other files) > > I'm a bit scared changing signature of public methods of FileChannelImpl > but if Alan says it's OK - let's go with it. > > -Dmitry > > > On 2014-02-07 16:07, Staffan Larsen wrote: >> Instrumentation agents that want to instrument >> FileInputStream/FileOutputStream to see which files are being >> accessed do not currently have access to the file system path of the >> stream. This is because the path is never stored in the stream class, >> only the file descriptor is. (This is also true for RandomAccessFile >> and FileChannel). >> >> An agent could instrument the respective constructors to store the >> path. The problem is where to store it. To add a field, the >> instrumentation agent needs to change the schema of the class. This >> is not possible at runtime but can be done at class-loading time. >> However for a j.l.instrument agent these classes are already defined >> when the agent is first called. For a native JVMTI agent the problem >> becomes parsing and modifying byte codes in a native agent which is >> error prone and requires a lot of code to maintain. >> >> If instead the stream classes were modified to store a reference to >> the path, it would be readily available for agents at a minimum of >> cost to the libraries. This is what this patch does. FileInputStream, >> FileOutputStream, RandomAccessFile and FileChannelImpl are modified >> to record the path they operate on in a private field. There are no >> accessors added to retrieve the path - it is purely stored for >> instrumentation purposes.
The path is intentionally not resolved to >> be an absolute path since that would potentially add unwanted >> overhead. If a stream is created from a file descriptor, no path will >> be stored. >> >> The overhead for this path will be keeping the path String alive for >> a longer period of time. I hope this will not cause any problems. >> >> A consumer of this feature will be Java Flight Recorder, but the >> implementation is usable by other agents as well. >> >> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug: >> https://bugs.openjdk.java.net/browse/JDK-8033917 >> >> Thanks, /Staffan >> > > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources. From dmitry.samersoff at oracle.com Fri Feb 7 08:18:07 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 07 Feb 2014 20:18:07 +0400 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <52F4FC73.5080803@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> <52F4FA90.2080908@oracle.com> <52F4FC73.5080803@oracle.com> Message-ID: <52F5073F.3090904@oracle.com> On 2014-02-07 19:32, Chris Hegarty wrote: > On 07/02/14 15:24, Dmitry Samersoff wrote: >> Staffan, >> >> FileInputStream.java >> >> 55: It's better to initialize path with null. > > I'm afraid I disagree with this. The default value is already null, why > set it to null again? I see this pattern all over the code, but it seems > completely redundant to me. Yes, It's NOOP but it makes readers and variety of "security" tools happy. I will not press for it, but as far as rest of the code (e.g. private FileChannel channel = null; ) uses this pattern and initialize variables explicitly, I think it's good to initialize this variable as well. -Dmitry > > -Chris. 
> >> 134: It's better to assign name at one of first lines, in this case we >> will be able to retrieve file name even if open fails for some reason. >> 171: It's not necessary >> >> (the same is applicable to other files) >> >> I'm a bit scared changing signature of public methods of FileChannelImpl >> but if Alan says it's OK - let's go with it. >> >> -Dmitry >> >> >> On 2014-02-07 16:07, Staffan Larsen wrote: >>> Instrumentation agents that want to instrument >>> FileInputStream/FileOutputStream to see which files are being >>> accessed do not currently have access to the file system path of the >>> stream. This is because the path is never stored in the stream class, >>> only the file descriptor is. (This is also true for RandomAccessFile >>> and FileChannel). >>> >>> An agent could instrument the respective constructors to store the >>> path. The problem is where to store it. To add a field, the >>> instrumentation agent needs to change the schema of the class. This >>> is not possible at runtime but can be done at class-loading time. >>> However for a j.l.instrument agent these classes are already defined >>> when the agent is first called. For a native JVMTI agent the problem >>> becomes parsing and modifying byte codes in a native agent which is >>> error prone and requires a lot of code to maintain. >>> >>> If instead the stream classes were modified to store a reference to >>> the path, it would be readily available for agents at a minimum of >>> cost to the libraries. This is what this patch does. FileInputStream, >>> FileOutputStream, RandomAccessFile and FileChannelImpl are modified >>> to record the path they operate on in a private field. There are no >>> accessors added to retrieve the path - it is purely stored for >>> instrumentation purposes. The path is intentionally not resolved to >>> be an absolute path since that would potentially add unwanted >>> overhead. If a stream is created from a file descriptor, no path will >>> be stored.
>>> >>> The overhead for this path will be keeping the path String alive for >>> a longer period of time. I hope this will not cause any problems. >>> >>> A consumer of this feature will be Java Flight Recorder, but the >>> implementation is usable by other agents as well. >>> >>> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug: >>> https://bugs.openjdk.java.net/browse/JDK-8033917 >>> >>> Thanks, /Staffan >>> >> >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From dmitry.samersoff at oracle.com Fri Feb 7 08:19:11 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 07 Feb 2014 20:19:11 +0400 Subject: RFR: JDK-8033917 Keep track of file paths in file streams and channels for instrumentation purposes In-Reply-To: <3F1E5072-1745-4EA4-B783-93CAD2643B91@oracle.com> References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com> <52F4FA90.2080908@oracle.com> <3F1E5072-1745-4EA4-B783-93CAD2643B91@oracle.com> Message-ID: <52F5077F.6000500@oracle.com> Staffan, OK! -Dmitry On 2014-02-07 19:49, Staffan Larsen wrote: > > On 7 feb 2014, at 16:24, Dmitry Samersoff wrote: > >> Staffan, >> >> FileInputStream.java >> >> 55: It's better to initialize path with null. > > I agree with Chris here. The value should be explicitly initialized by all constructors. > >> 134: It's better to assign name at one of first lines, in this case we >> will be able to retrieve file name even if open fails for some reason. > > This is the constructor. If anything fails it will throw an exception, and there won't be an object to look at. > >> 171: It's not necessary > > All constructors must initialize the value. > > Thanks, > /Staffan > >> >> (the same is applicable to other files) >> >> I'm a bit scared changing signature of public methods of FileChannelImpl >> but if Alan says it's OK - let's go with it.
>> >> -Dmitry >> >> >> On 2014-02-07 16:07, Staffan Larsen wrote: >>> Instrumentation agents that want to instrument >>> FileInputStream/FileOutputStream to see which files are being >>> accessed do not currently have access to the file system path of the >>> stream. This is because the path is never stored in the stream class, >>> only the file descriptor is. (This is also true for RandomAccessFile >>> and FileChannel). >>> >>> An agent could instrument the respective constructors to store the >>> path. The problem is where to store it. To add a field, the >>> instrumentation agent needs to change the schema of the class. This >>> is not possible at runtime but can be done at class-loading time. >>> However for a j.l.instrument agent these classes are already defined >>> when the agent is first called. For a native JVMTI agent the problem >>> becomes parsing and modifying byte codes in a native agent which is >>> error prone and requires a lot of code to maintain. >>> >>> If instead the stream classes were modified to store a reference to >>> the path, it would be readily available for agents at a minimum of >>> cost to the libraries. This is what this patch does. FileInputStream, >>> FileOutputStream, RandomAccessFile and FileChannelImpl are modified >>> to record the path they operate on in a private field. There are no >>> accessors added to retrieve the path - it is purely stored for >>> instrumentation purposes. The path is intentionally not resolved to >>> be an absolute path since that would potentially add unwanted >>> overhead. If a stream is created from a file descriptor, no path will >>> be stored. >>> >>> The overhead for this path will be keeping the path String alive for >>> a longer period of time. I hope this will not cause any problems. >>> >>> A consumer of this feature will be Java Flight Recorder, but the >>> implementation is usable by other agents as well. 
>>> >>> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug: >>> https://bugs.openjdk.java.net/browse/JDK-8033917 >>> >>> Thanks, /Staffan >>> >> >> >> -- >> Dmitry Samersoff >> Oracle Java development team, Saint Petersburg, Russia >> * I would love to change the world, but they won't give me the sources. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From dean.long at oracle.com Fri Feb 7 13:06:14 2014 From: dean.long at oracle.com (Dean Long) Date: Fri, 07 Feb 2014 13:06:14 -0800 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com> <52F41B7D.6090206@oracle.com> Message-ID: <52F54AC6.8020804@oracle.com> What's the cost for adding a new BufferBlob subtype? We already have AdapterBlob and MethodHandlesAdapterBlob. dl On 2/6/2014 3:52 PM, Oleg Mazurov wrote: > My understanding was that a buffer blob was just that - a buffer. Could potentially contain code fragments of different kinds. > Thus, is_buffer_blob() was the closest type available. Agree that a dependency on its name is not reliable, though testing > will reveal if the condition turns false for "vtable chunks" due to a name change (I had to deal with that particular test, Serguei > should be able to identify it). Adding a comment to where the name is defined (vtableStubs.cpp) that such a dependency exists > is a good idea. > Thanks, > > -- Oleg > > On Feb 6, 2014, at 3:32 PM, Coleen Phillimore wrote: > >> Hi, I clicked on this a couple times. It seems okay but isn't there a safer way to identify code blobs that are vtable stubs, without looking at the name (which can change in while creating it). A comment at least when you create "vtable chunks" would be good. It seems that someone might want to rename it "vtable or itable buffers", or something like that. 
>> >> thanks, >> Coleen >> >> On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote: >>> Runtime team, >>> >>> This fix was reviewed by Vladimir K. and me. >>> Just wanted to make sure if you would like to review it as well. >>> If not, then I will push it. >>> >>> Thanks, >>> Serguei >>> >>> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote: >>>> Please, review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-8025841 >>>> >>>> >>>> Open webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 >>>> >>>> Summary: >>>> >>>> The fix contributed by Oleg Mazurov to improve profiling data quality. >>>> It moves the "vtable stub" dynamic code notification to the right place. >>>> I've already reviewed the fix, and it looks good to me. >>>> >>>> Bug report description: >>>> >>>> "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled when >>>> a new chunk of memory for subsequent vtable and itable stubs is allocated. >>>> That chunk is uninitialized (contains zeros or garbage) although due to the fact >>>> that the actual event delivery is deferred, at least one vtable comes out right. >>>> >>>> This event should describe an individual vtable/itable stub (base address and size) >>>> and only after it's been created (memory is actually populated with code). >>>> Where VM diagnostic messages about vtable/itable stubs are issued upon >>>> -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI events as well. >>>> >>>> Getting vtables/itables right is important in the context of performance analysis as >>>> that dynamically generated code may accumulate quite noticeable CPU time >>>> (especially itables), sometimes larger than the actual Java methods called." >>>> >>>> >>>> Testing: >>>> Oleg tested it in the Oracle Studio Performance Analyzer environment.
>>>> nsk.jvmti, nsk.jdi, nsk.jdwp, >>>> In progress: Jtreg com/sun/jdi, java/lang/instrument >>>> >>>> >>>> Thanks, >>>> Serguei >>>> From karen.kinnear at oracle.com Fri Feb 7 13:56:33 2014 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 7 Feb 2014 16:56:33 -0500 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52ED3F28.5070806@oracle.com> References: <52ED3F28.5070806@oracle.com> Message-ID: <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> Dan, Greatly impressed. Very subtle. Thanks to you and David and Dice and Serguei for a lot of blood, sweat and tears. Apologies that I haven't been following all the details, so a couple of questions. 1. I like the new fix. I totally appreciate the massive documentation in the bug, it really helps. And the comments in the code. 2. In removing the earlier fix, does that work because - there is no posting of monitor_contended_exit (or any posting) during the exit logic? Or did I just miss finding it? 3. Did I read the code correctly that the Thread::SpinAcquire can make a timed park call on the same thread's _ParkEvent? And that this is used to get on and off the wait queue, i.e. to acquire the WaitSetLock? Is there the same risk that a notify might be eaten here also? If so, I wonder if we want this added unpark to not just be called if JVMTI_EVENT_MONITOR_WAITED is enabled? thanks, Karen On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote: > Greetings, > > I have a fix ready for the following bug: > > 8028073 race condition in ObjectMonitor implementation causing deadlocks > https://bugs.openjdk.java.net/browse/JDK-8028073 > > On the surface, this is a very simple fix that relocates a few lines of > code, relocates and rewrites the comments associated with that code and > adds several new comments. > > Of course, in reality, the issue is much more complicated, but I'm > hoping to make it easy for anyone not acquainted with this issue to > understand what's going on. 
> > Here are the JDK9 webrev URLs: > > OpenJDK: > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ > > Oracle internal: > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ > > The simple summary: > > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, > it is possible for a JVM/TI monitor event handler to accidentally > consume a ParkEvent.unpark() call meant for the Java Monitor layer > - the original code fix was made on 2005.07.04 using this bug ID: > https://bugs.openjdk.java.net/browse/JDK-5030359 > - it's the right fix, but it's in the wrong place > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED > event handler is called because it is that event handler > that can cause the hang > > > Testing > ------- > > - a new StressMonitorWait test has been created that reliably > reproduces the hang in JDK[6789]; see the bug's gory details > for the specific versions where the hang has been reproduced > - the test reliably reproduces the hang in 5 seconds on my > T7600 running Solaris 10u11 X86; 1 minute runs reproduce > the hang reliably on other machines > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT > bits did not reproduce the hang > - JPRT test job > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, > and Windows-X64: > - vm.quick > - Kitchensink (bigapps) > - Weblogic+medrec (bigapps) > - runThese (bigapps) > > > The Gory Details Start Here > --------------------------- > > This is the old location of the block of code that's being moved: > > src/share/vm/runtime/objectMonitor.cpp: > > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > 1499 exit (true, Self) ; // exit the monitor > > 1513 if (node._notified != 0 && _succ == Self) { > 1514 node._event->unpark(); > 1515 } > > > This is the new location of the block of code
that's being moved: > > src/share/vm/runtime/objectMonitor.cpp: > > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > 1601 if (JvmtiExport::should_post_monitor_waited()) { > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT); > > 1604 if (node._notified != 0 && _succ == Self) { > > 1620 node._event->unpark(); > 1621 } > > > The Risks > --------- > > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event > is enabled: > - previously it was always executed > - while the old code was not effective for the hang that is being > fixed with this bug, it is possible that the old code prevented > a different bug in the successor protocol from manifesting > - thorough analysis of the successor protocol did not reveal a > case where the old code was needed in the old location > - Thorough analysis indicates that the other JVM/TI monitor events > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: > - the successor protocol is complicated and the analysis could > be wrong when certain options are used > - comments were added to each location where a JVM/TI monitor > event handler is called documenting why a fix like this one > is not needed there > - if the analysis is wrong, the new comments show where a new > code change would be needed > > > The Scenario > ------------ > > I've created a scenario that reproduces this hang: > > T1 - enters monitor and calls monitor.wait() > T2 - enters the monitor, calls monitor.notify() and exits the monitor > T3 - enters and exits the monitor > T4 - enters the monitor, delays for 5 seconds, exits the monitor > > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. > > Here are the six events necessary to make this hang happen: > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again. 
> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > // KEY-EVENT-3a: T3 made T1 the successor > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck. > > > This bug is intertwined with: > > - The ObjectMonitor successor protocol > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors > > There is a very long successor.notes attachment to JDK-8028073 that > attempts to describe the ObjectMonitor successor protocol. It's good > for putting pretty much anyone to sleep. > > Since this hang reproduces back to JDK6, this bug is taking the easily > backported solution of moving the original fix to the right location. 
> The following new bug has been filed for possible future work in this > area by the Serviceability Team: > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > The Symptoms > ------------ > > With intermittent hangs like this, it is useful to know what to look > for in order to determine if you are running into this issue: > > - if you aren't using a debugger or a profiler or some other > JVM/TI agent, then this hang is not the same as yours > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > event handler, then this hang is not the same as yours > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > JVM/TI RawMonitors, then this hang is not the same as yours > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > back into Java code, then you might just be insane and this > hang might be similar to yours. However, using a Java callback > in an event handler is an even bigger problem/risk so fix that > first. 
> - if you have one or more threads blocked like this and making no > progress, then this hang might be the same as yours: > > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > JavaThread state: _thread_blocked > at java.lang.Object.wait(Native Method) > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > at java.lang.Object.wait(Object.java:502) > at SMW_WorkerThread.run(StressMonitorWait.java:103) > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > JavaThread state: _thread_blocked > at SMW_WorkerThread.run(StressMonitorWait.java:120) > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] > java.lang.Thread.State: BLOCKED (on object monitor) > JavaThread state: _thread_blocked > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > JavaThread state: _thread_blocked > at SMW_WorkerThread.run(StressMonitorWait.java:139) > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > Key symptoms in thread T1: > > - had the object locked: > > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > - did an Object.wait(): > > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > - is blocked on reentry: > > waiting for monitor entry [0xfffffd7fc0231000] > > Key symptoms in thread T2: > > - is blocked waiting to lock the object: > > waiting for monitor entry [0xfffffd7fc0130000] > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > Key symptoms in thread T3: > > - is blocked waiting to lock the object: > > waiting for monitor entry [0xfffffd7fc002f000] > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >
From erik.gahlin at oracle.com Fri Feb 7 14:29:39 2014 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Fri, 07 Feb 2014 23:29:39 +0100 Subject: RFR(S): JDK-8032250 : Add trace event for VM flag changes In-Reply-To: <4213C88C-3617-466C-909B-F8DAA37C8015@oracle.com> References: <4213C88C-3617-466C-909B-F8DAA37C8015@oracle.com> Message-ID: <52F55E53.3010203@oracle.com> > 128 > 129 Field identifiers should use Java naming convention, oldValue and newValue. Erik Staffan Larsen wrote 2014-01-21 14:47: > This is a patch to capture a trace event whenever a manageable VM flag is changed at runtime. > > webrev: http://cr.openjdk.java.net/~sla/8032250/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8032250 > > Thanks, > /Staffan
From daniel.daugherty at oracle.com Fri Feb 7 15:45:48 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 07 Feb 2014 16:45:48 -0700 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> References: <52ED3F28.5070806@oracle.com> <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> Message-ID: <52F5702C.50508@oracle.com> On 2/7/14 2:56 PM, Karen Kinnear wrote: > Dan, > > Greatly impressed. Thanks! And thanks for the review! > Very subtle. Since you've worked with this code, that shouldn't be a surprise... :-) We're down to the subtle bugs... :-( > Thanks to you and David and Dice and Serguei > for a lot of blood, sweat and tears. You're very welcome. > Apologies that I haven't been following all the details, so a couple of questions. No apologies needed. You have other fish to fry... > 1. I like the new fix.
 I totally appreciate the massive documentation in the bug, it really helps. > And the comments in the code. Glad you like the fix and the docs. > 2. In removing the earlier fix, does that work because > - there is no posting of monitor_contended_exit (or any posting) during > the exit logic? Or did I just miss finding it? Dave Dice tracked the earlier fix down the JVM/PI MONITOR_CONTENDED_EXIT event. There's no equivalent event in JVM/TI so the old fix is no longer needed. > 3. Did I read the code correctly that the Thread::SpinAcquire can make a timed park > call on the same thread's _ParkEvent? And that this is used to get on and off the wait queue, > i.e. to acquire the WaitSetLock? > Is there the same risk that a notify might be eaten here also? As far as I can see, Thread::SpinAcquire() does not use a ParkEvent at all. However, Thread::muxAcquire() does use a ParkEvent, but it is a different ParkEvent. From src/share/vm/runtime/thread.hpp: ParkEvent * _ParkEvent ; // for synchronized() ParkEvent * _SleepEvent ; // for Thread.sleep ParkEvent * _MutexEvent ; // for native internal Mutex/Monitor ParkEvent * _MuxEvent ; // for low-level muxAcquire-muxRelease So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire() uses the _MuxEvent. There are some comments in thread.cpp about how _MuxEvent could be eliminated and _ParkEvent shared, but I don't think we ever want to go there. I also filed this RFE: 8033399 add a separate ParkEvent for JVM/TI RawMonitor use https://bugs.openjdk.java.net/browse/JDK-8033399 just in case the Serviceability team wants to migrate JVM/TI RawMonitors to a separate ParkEvent. Please let me know if you concur that I've resolved issue #3. > If so, I wonder if we want this added unpark to not just be called if JVMTI_EVENT_MONITOR_WAITED > is enabled? I don't think we need it, but I've noted its removal as a risk. Again, thanks for the review! Dan > > thanks, > Karen
From oleg.mazurov at oracle.com Fri Feb 7 15:55:53 2014 From: oleg.mazurov at oracle.com (Oleg Mazurov) Date: Fri, 07 Feb 2014 15:55:53 -0800 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: <52F54AC6.8020804@oracle.com> References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com> <52F41B7D.6090206@oracle.com> <52F54AC6.8020804@oracle.com> Message-ID: <52F57289.3080201@oracle.com> The cost might be minimal but that would be a move in the wrong direction in my opinion. A larger problem is that BufferBlobs being just placeholders for dynamic code should not be reported via JVMTI at all: when they are created they usually contain no executable code and actual objects placed into such blobs are reported separately (that wasn't true for vtable/itable stubs before this fix). That the same address, that of a BufferBlob and its first object, could be reported twice is revealed by a comment to a loop that follows the problematic comparison: src/share/vm/prims/jvmtiCodeBlobEvents.cpp: 124 // exclude VtableStubs, which are processed separately 125 if (cb->is_buffer_blob() && strcmp(cb->name(), "vtable chunks") == 0) { 126 return; 127 } 128 129 // check if this starting address has been seen already - the 130 // assumption is that stubs are inserted into the list before the 131 // enclosing BufferBlobs.
 132 address addr = cb->code_begin(); 133 for (int i=0; i<_global_code_blobs->length(); i++) { 134 JvmtiCodeBlobDesc* scb = _global_code_blobs->at(i); 135 if (addr == scb->code_begin()) { 136 return; 137 } 138 } I believe that now that the vtable stub problem is fixed the need for that check is gone and both the strcmp() call and the following loop could be removed altogether, thus stopping further processing for *any* BufferBlob and avoiding an O(n^2) overhead they were causing. The scope of that change is much larger than the original problem entailed and would require not just additional ad hoc testing on the JVMTI consumer side but also thorough static analysis for all BufferBlob uses in the VM code. In fact, I was going to file a linked JIRA issue on that further improvement and if my idea for it holds true there would be no need for a new BufferBlob subtype. -- Oleg
From dean.long at oracle.com Fri Feb 7 16:08:36 2014 From: dean.long at oracle.com (Dean Long) Date: Fri, 07 Feb 2014 16:08:36 -0800 Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code notification is misplaced In-Reply-To: <52F57289.3080201@oracle.com> References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com> <52F41B7D.6090206@oracle.com> <52F54AC6.8020804@oracle.com> <52F57289.3080201@oracle.com> Message-ID: <52F57584.70007@oracle.com> OK. Your further improvement idea sounds promising. dl
From david.holmes at oracle.com Sun Feb 9 19:37:11 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 10 Feb 2014 13:37:11 +1000 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52F5702C.50508@oracle.com> References: <52ED3F28.5070806@oracle.com> <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> <52F5702C.50508@oracle.com> Message-ID: <52F84967.8040807@oracle.com> trimming content ... On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote: > On 2/7/14 2:56 PM, Karen Kinnear wrote: >> 3. Did I read the code correctly that the Thread::SpinAcquire can make >> a timed park >> call on the same thread's _ParkEvent? And that this is used to get on >> and off the wait queue, >> i.e. to acquire the WaitSetLock? >> Is there the same risk that a notify might be eaten here also?
> > As far as I can see, Thread::SpinAcquire() does not use a ParkEvent It sure does: void Thread::SpinAcquire (volatile int * adr, const char * LockName) { if (Atomic::cmpxchg (1, adr, 0) == 0) { return ; // normal fast-path return } // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. TEVENT (SpinAcquire - ctx) ; int ctr = 0 ; int Yields = 0 ; for (;;) { while (*adr != 0) { ++ctr ; if ((ctr & 0xFFF) == 0 || !os::is_MP()) { if (Yields > 5) { // Consider using a simple NakedSleep() instead. // Then SpinAcquire could be called by non-JVM threads Thread::current()->_ParkEvent->park(1) ; So considering Karen's question ... I can't tell for certain. :( I do not think the SpinAcquire on grabbing the wait-set lock to add to the wait-set can be an issue because we will only park in response to the actual wait, and hence only get unparked due to a notify/notifyAll, but at this point we still own the monitor so no notify/notifyAll is possible. However, for the removal from the wait-set a more complex analysis is needed. To do the SpinAcquire we must still be flagged as TS_WAIT - which means we have not been notified, but must be returning due to a timeout (or spurious wakeup?). In such circumstances could we be _succ? I don't think so but I'll leave it to Dan to confirm that part :) David ----- > at all. However, Thread::muxAcquire() does use a ParkEvent, but it > is a different ParkEvent. From src/share/vm/runtime/thread.hpp: > > ParkEvent * _ParkEvent ; // for synchronized() > ParkEvent * _SleepEvent ; // for Thread.sleep > ParkEvent * _MutexEvent ; // for native internal > Mutex/Monitor > ParkEvent * _MuxEvent ; // for low-level > muxAcquire-muxRelease > > So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire() > uses the _MuxEvent. There are some comments in thread.cpp about > how _MuxEvent could be eliminated and _ParkEvent shared, but I don't > think we ever want to go there. 
> > I also filed this RFE: > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > https://bugs.openjdk.java.net/browse/JDK-8033399 > > just in case the Serviceability team wants to migrate JVM/TI RawMonitors > to a separate ParkEvent. > > Please let me know if you concur that I've resolved issue #3. > > >> If so, I wonder if we want this added unpark to not just be called if >> JVMTI_EVENT_MONITOR_WAITED >> is enabled? > > I don't think we need it, but I've noted its removal as a risk. > > Again, thanks for the review! > > Dan > > >> >> thanks, >> Karen >> >> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote: >> >>> Greetings, >>> >>> I have a fix ready for the following bug: >>> >>> 8028073 race condition in ObjectMonitor implementation causing >>> deadlocks >>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>> >>> On the surface, this is a very simple fix that relocates a few lines of >>> code, relocates and rewrites the comments associated with that code and >>> adds several new comments. >>> >>> Of course, in reality, the issue is much more complicated, but I'm >>> hoping to make it easy for anyone not acquainted with this issue to >>> understand what's going on. 
>>>
>>> Here are the JDK9 webrev URLs:
>>>
>>> OpenJDK:
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> Oracle internal:
>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> The simple summary:
>>>
>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>   it is possible for a JVM/TI monitor event handler to accidentally
>>>   consume a ParkEvent.unpark() call meant for the Java Monitor layer
>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>   https://bugs.openjdk.java.net/browse/JDK-5030359
>>> - it's the right fix, but it's in the wrong place
>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>   event handler is called because it is that event handler
>>>   that can cause the hang
>>>
>>>
>>> Testing
>>> -------
>>>
>>> - a new StressMonitorWait test has been created that reliably
>>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>>   for the specific versions where the hang has been reproduced
>>>   - the test reliably reproduces the hang in 5 seconds on my
>>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>     the hang reliably on other machines
>>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>   Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>   bits did not reproduce the hang
>>> - JPRT test job
>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>   and Windows-X64:
>>>   - vm.quick
>>>   - Kitchensink (bigapps)
>>>   - Weblogic+medrec (bigapps)
>>>   - runThese (bigapps)
>>>
>>>
>>> The Gory Details Start Here
>>> ---------------------------
>>>
>>> This is the old location of the block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> 
>>> 1499   exit (true, Self) ; // exit the monitor
>>> 
>>> 1513   if (node._notified != 
0 && _succ == Self) { >>> 1514 node._event->unpark(); >>> 1515 } >>> >>> >>> This is the new location of block of code that's being moved: >>> >>> src/share/vm/runtime/objectMonitor.cpp: >>> >>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >>> >>> 1601 if (JvmtiExport::should_post_monitor_waited()) { >>> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>> OS_TIMEOUT); >>> >>> 1604 if (node._notified != 0 && _succ == Self) { >>> >>> 1620 node._event->unpark(); >>> 1621 } >>> >>> >>> The Risks >>> --------- >>> >>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >>> is enabled: >>> - previously it was always executed >>> - while the old code was not effective for the hang that is being >>> fixed with this bug, it is possible that the old code prevented >>> a different bug in the successor protocol from manifesting >>> - thorough analysis of the successor protocol did not reveal a >>> case where the old code was needed in the old location >>> - Thorough analysis indicates that the other JVM/TI monitor events >>> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>> - the successor protocol is complicated and the analysis could >>> be wrong when certain options are used >>> - comments were added to each location where a JVM/TI monitor >>> event handler is called documenting why a fix like this one >>> is not needed there >>> - if the analysis is wrong, the new comments show where a new >>> code change would be needed >>> >>> >>> The Scenario >>> ------------ >>> >>> I've created a scenario that reproduces this hang: >>> >>> T1 - enters monitor and calls monitor.wait() >>> T2 - enters the monitor, calls monitor.notify() and exits the monitor >>> T3 - enters and exits the monitor >>> T4 - enters the monitor, delays for 5 seconds, exits the monitor >>> >>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. 
>>> >>> Here are the six events necessary to make this hang happen: >>> >>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>> field, but >>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor >>> again. >>> >>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>> >>> // KEY-EVENT-3a: T3 made T1 the successor >>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. >>> >>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>> >>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>> >>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler >>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >>> stuck. >>> >>> >>> This bug is intertwined with: >>> >>> - The ObjectMonitor successor protocol >>> - the sharing of a ParkEvent between Java Monitors and JVM/TI >>> RawMonitors >>> >>> There is a very long successor.notes attachment to JDK-8028073 that >>> attempts to describe the ObjectMonitor successor protocol. It's good >>> for putting pretty much anyone to sleep. >>> >>> Since this hang reproduces back to JDK6, this bug is taking the easily >>> backported solution of moving the original fix to the right location. 
>>> The following new bug has been filed for possible future work in this >>> area by the Serviceability Team: >>> >>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>> >>> >>> The Symptoms >>> ------------ >>> >>> With intermittent hangs like this, it is useful to know what to look >>> for in order to determine if you are running into this issue: >>> >>> - if you aren't using a debugger or a profiler or some other >>> JVM/TI agent, then this hang is not the same as yours >>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>> event handler, then this hang is not the same as yours >>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>> JVM/TI RawMonitors, then this hang is not the same as yours >>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>> back into Java code, then you might just be insane and this >>> hang might be similar to yours. However, using a Java callback >>> in an event handler is an even bigger problem/risk so fix that >>> first. 
>>> - if you have one or more threads blocked like this and making no
>>>   progress, then this hang might be the same as yours:
>>>
>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>> JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at java.lang.Object.wait(Native Method)
>>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>         at java.lang.Object.wait(Object.java:502)
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>> JavaThread state: _thread_blocked
>>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>> JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T1:
>>>
>>> - had the object locked:
>>>
>>>     locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - did an Object.wait():
>>>
>>>     waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - is blocked on reentry:
>>>
>>>     waiting for monitor entry [0xfffffd7fc0231000]
>>>
>>> Key symptoms in thread T2:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>     waiting for monitor entry [0xfffffd7fc0130000]
>>>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T3:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>     waiting for monitor entry [0xfffffd7fc002f000]
>>>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>

From jaroslav.bachorik at oracle.com Mon Feb 10 04:00:56 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 10 Feb 2014 13:00:56 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F10221.6010303@oracle.com>
References: <52F0192B.9060104@oracle.com> <52F0AABC.7000904@oracle.com> <52F0ABC2.4040700@oracle.com> <52F10221.6010303@oracle.com>
Message-ID: <52F8BF78.60803@oracle.com>

On 4.2.2014 16:07, shanliang wrote:
> Jaroslav Bachorik wrote:
>> On 4.2.2014 09:54, shanliang wrote:
>>> Jaroslav,
>>>
>>> Your fix should work in most case, but is it better and more reliable to
>>> wait a VM event as suggested in the bug? even your timeout is adapted to
>>> the test time factory, but the solution still depends to a fixed timeout
>>> and a fixed line out.
>>
>> Well, if I get the test logic correctly it is supposed to test that
>> the agent blocks the port even when no client has connected yet.
>> Connecting to the agent and waiting for the event would change the
>> thing the test checks, actually.
> You are right that the test should not attach a VM before launching the
> second debuggee. Let's hope that 5000 * Utils.TIMEOUT_FACTOR works for
> all testing machines.

Hopefully it should. 5 seconds to start the debuggee under normal circumstances sounds more than enough. For the exceptional circumstances the TIMEOUT_FACTOR should be properly tuned. We will see.

> Looks OK.

Thanks!

May I get an official reviewer to take a look at this, please?
-JB-

>
> Thanks,
> Shanliang
>>
>> -JB-
>>
>>>
>>> Shanliang
>>>
>>> Jaroslav Bachorik wrote:
>>>> Please, review the following test fix:
>>>>
>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>>>>
>>>> The fix prevents the situation when the first debuggee has not managed
>>>> to finish its initialization while the second one is started up thus
>>>> making the port available for the second debuggee and failing the test.
>>>>
>>>> The patch is using the library methods to configure and launch the
>>>> debuggee and the test waits for the well known string to appear in the
>>>> first debuggee output before attempting to launch the second debuggee.
>>>>
>>>> Thanks,
>>>>
>>>> -JB-
>>>
>>
>

From staffan.larsen at oracle.com Mon Feb 10 04:22:41 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 10 Feb 2014 13:22:41 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F8BF78.60803@oracle.com>
References: <52F0192B.9060104@oracle.com> <52F0AABC.7000904@oracle.com> <52F0ABC2.4040700@oracle.com> <52F10221.6010303@oracle.com> <52F8BF78.60803@oracle.com>
Message-ID: <16C6B74D-4E42-4464-B6CE-7338D90E68AE@oracle.com>

Looks good!

Thanks,
/Staffan

On 10 feb 2014, at 13:00, Jaroslav Bachorik wrote:
> On 4.2.2014 16:07, shanliang wrote:
>> Jaroslav Bachorik wrote:
>>> On 4.2.2014 09:54, shanliang wrote:
>>>> Jaroslav,
>>>>
>>>> Your fix should work in most case, but is it better and more reliable to
>>>> wait a VM event as suggested in the bug? even your timeout is adapted to
>>>> the test time factory, but the solution still depends to a fixed timeout
>>>> and a fixed line out.
>>>
>>> Well, if I get the test logic correctly it is supposed to test that
>>> the agent blocks the port even when no client has connected yet.
>>> Connecting to the agent and waiting for the event would change the
>>> thing the test checks, actually.
>> You are right that the test should not attach a VM before launching the
>> second debuggee. Let's hope that 5000 * Utils.TIMEOUT_FACTOR works for
>> all testing machines.
>
> Hopefully it should. 5 seconds to start the debuggee under normal circumstances sounds more than enough. For the exceptional circumstances the TIMEOUT_FACTOR should be properly tuned. We will see.
>
>
>> Looks OK.
>
> Thanks!
>
> May I get an official reviewer to take a look at this, please?
>
> -JB-
>
>>
>> Thanks,
>> Shanliang
>>>
>>> -JB-
>>>
>>>>
>>>> Shanliang
>>>>
>>>> Jaroslav Bachorik wrote:
>>>>> Please, review the following test fix:
>>>>>
>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>>>>>
>>>>> The fix prevents the situation when the first debuggee has not managed
>>>>> to finish its initialization while the second one is started up thus
>>>>> making the port available for the second debuggee and failing the test.
>>>>>
>>>>> The patch is using the library methods to configure and launch the
>>>>> debuggee and the test waits for the well known string to appear in the
>>>>> first debuggee output before attempting to launch the second debuggee.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -JB-

From staffan.larsen at oracle.com Mon Feb 10 05:38:48 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 10 Feb 2014 14:38:48 +0100
Subject: RFR: 6545422 NativeErrors.java uses wrong path name in exec
In-Reply-To: <7a061a24-dd85-4834-b6fd-fb917e0f9a47@default>
References: <7a061a24-dd85-4834-b6fd-fb917e0f9a47@default>
Message-ID: <6FB1E959-DFDD-481E-9020-9EE3230D6C14@oracle.com>

And a looong time later: Looks good!
Thanks, /Staffan On 4 dec 2013, at 15:38, Mattias Tobiasson wrote: > Hi, > Could someone please review this patch? > > webrev: http://cr.openjdk.java.net/~ykantser/6545422/webrev.00 > bug: https://bugs.openjdk.java.net/browse/JDK-6545422 > > Changes: > Uses common testlibrary function to find native2ascii binary. > Check null values instead of throwing NullPointerException at different locations. The old version checked null with "assert", which is not always enabled. > > > Mattias From rob.mckenna at oracle.com Mon Feb 10 06:50:15 2014 From: rob.mckenna at oracle.com (rob.mckenna at oracle.com) Date: Mon, 10 Feb 2014 14:50:15 +0000 Subject: hg: jdk8/tl/jdk: 7152892: some jtreg tests fail with permission denied Message-ID: <20140210145247.DD74462B24@hg.openjdk.java.net> Changeset: da4b0962ad11 Author: robm Date: 2014-02-10 14:35 +0000 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/da4b0962ad11 7152892: some jtreg tests fail with permission denied Reviewed-by: coffeys ! test/java/lang/ClassLoader/Assert.sh ! test/java/rmi/registry/readTest/readTest.sh ! test/java/util/zip/ZipFile/ReadZip.java ! test/sun/net/www/protocol/jar/jarbug/run.sh From daniel.daugherty at oracle.com Mon Feb 10 10:55:41 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 10 Feb 2014 11:55:41 -0700 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52F84967.8040807@oracle.com> References: <52ED3F28.5070806@oracle.com> <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> <52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com> Message-ID: <52F920AD.6050705@oracle.com> On 2/9/14 8:37 PM, David Holmes wrote: > trimming content ... > > On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote: >> On 2/7/14 2:56 PM, Karen Kinnear wrote: >>> 3. Did I read the code correctly that the Thread::SpinAcquire can make >>> a timed park >>> call on the same thread's _ParkEvent? And that this is used to get on >>> and off the wait queue, >>> i.e. 
to acquire the WaitSetLock? >>> Is there the same risk that a notify might be eaten here also? >> >> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent > > It sure does: > > void Thread::SpinAcquire (volatile int * adr, const char * LockName) { > if (Atomic::cmpxchg (1, adr, 0) == 0) { > return ; // normal fast-path return > } > > // Slow-path : We've encountered contention -- Spin/Yield/Block > strategy. > TEVENT (SpinAcquire - ctx) ; > int ctr = 0 ; > int Yields = 0 ; > for (;;) { > while (*adr != 0) { > ++ctr ; > if ((ctr & 0xFFF) == 0 || !os::is_MP()) { > if (Yields > 5) { > // Consider using a simple NakedSleep() instead. > // Then SpinAcquire could be called by non-JVM threads > Thread::current()->_ParkEvent->park(1) ; Ummmm... that's not the code I'm seeing... src/share/vm/runtime/thread.cpp: 4417 void Thread::SpinAcquire (volatile int * adr, const char * LockName) { 4418 if (Atomic::cmpxchg (1, adr, 0) == 0) { 4419 return ; // normal fast-path return 4420 } 4421 4422 // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. 4423 TEVENT (SpinAcquire - ctx) ; 4424 int ctr = 0 ; 4425 int Yields = 0 ; 4426 for (;;) { 4427 while (*adr != 0) { 4428 ++ctr ; 4429 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { 4430 if (Yields > 5) { 4431 os::naked_short_sleep(1); 4432 } else { 4433 os::NakedYield() ; 4434 ++Yields ; 4435 } 4436 } else { 4437 SpinPause() ; 4438 } 4439 } 4440 if (Atomic::cmpxchg (1, adr, 0) == 0) return ; 4441 } 4442 } Mr Simms recently changed the above code via: changeset: 5832:5944dba4badc user: dsimms date: Fri Jan 24 09:28:47 2014 +0100 summary: 8028280: ParkEvent leak when running modified runThese which only loads classes os::naked_short_sleep() is new: - BSD/MacOS X, Linux - uses nanosleep() - Solaris - uses usleep() - Windows - uses Sleep() The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs on 2014.01.29. 
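
As an illustration only, the JDK9 form of Thread::SpinAcquire() quoted just above (the one that sleeps instead of parking) boils down to a spin/yield/sleep loop. The sketch below is portable C++ and not HotSpot code: `spin_acquire`, `spin_release` and the use of `std::atomic` are assumptions made for the example, standing in for Atomic::cmpxchg, SpinPause(), os::NakedYield() and os::naked_short_sleep(). The `!os::is_MP()` uniprocessor check is left out.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Illustrative stand-in for Thread::SpinAcquire(); not HotSpot code.
// 0 means unlocked, 1 means locked, as with _WaitSetLock.
inline void spin_acquire(std::atomic<int>& lock) {
  int expected = 0;
  if (lock.compare_exchange_strong(expected, 1)) {
    return;  // normal fast-path return
  }
  // Slow path: contention, so spin, then yield, then sleep.
  int ctr = 0;
  int yields = 0;
  for (;;) {
    while (lock.load() != 0) {
      ++ctr;
      if ((ctr & 0xFFF) == 0) {
        if (yields > 5) {
          // Sleeping (rather than parking on a shared event) cannot
          // consume a wakeup that was meant for another layer.
          std::this_thread::sleep_for(std::chrono::milliseconds(1));
        } else {
          std::this_thread::yield();
          ++yields;
        }
      }
      // Otherwise keep spinning; HotSpot calls SpinPause() here.
    }
    expected = 0;
    if (lock.compare_exchange_strong(expected, 1)) {
      return;
    }
  }
}

// Illustrative stand-in for Thread::SpinRelease().
inline void spin_release(std::atomic<int>& lock) {
  lock.store(0);
}
```

A caller would bracket the protected region with `spin_acquire(lock)` and `spin_release(lock)`, much as the objectMonitor code brackets its WaitSet updates with SpinAcquire/SpinRelease on _WaitSetLock.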
I don't see any signs that Mr Simms' fix will be backported to
JDK8u/HSX-25u (yet) so this part of the review thread might impact
the backport of my fix to earlier releases.

> So considering Karen's question ... I can't tell for certain. :(
>
> I do not think the SpinAcquire on grabbing the wait-set lock to add to
> the wait-set can be an issue because we will only park in response to
> the actual wait, and hence only get unparked due to a
> notify/notifyAll, but at this point we still own the monitor so no
> notify/notifyAll is possible.
>
> However, for the removal from the wait-set a more complex analysis is
> needed. To do the SpinAcquire we must still be flagged as TS_WAIT -
> which means we have not been notified, but must be returning due to a
> timeout (or spurious wakeup?). In such circumstances could we be
> _succ? I don't think so but I'll leave it to Dan to confirm that part :)

So for HSX-25 and probably older...

There are four Thread::SpinAcquire() calls in the objectMonitor code:

    Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ;
    Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
    Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ;
    Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ;

We can easily rule out the "notify" and "notifyAll" uses since the
current thread owns the Java-level monitor and there are no events
to post in this part of the notify() or notifyAll() protocols.

For the "WaitSet - add" use, the current thread owns the Java-level
monitor and the thread has not been added as a waiter yet so another
thread cannot do the notify-exit-make-successor part of the protocol yet.
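
The concern running through this thread is that a timed park on the shared ParkEvent can eat an unpark() that was posted for the Java monitor layer. A toy model may make that concrete. It is a sketch under stated assumptions, not HotSpot's ParkEvent: `ToyParkEvent`, `timed_park` and `wakeup_survives` are hypothetical names, the event is reduced to a single permit, and the timed park is modelled as one whose timeout has already elapsed so the example stays single-threaded and deterministic.

```cpp
#include <atomic>
#include <cassert>

// Toy model of a one-permit park/unpark event; not HotSpot's ParkEvent.
class ToyParkEvent {
 public:
  // Post the single permit. A second unpark() before a park() is absorbed.
  void unpark() { permit_.store(true); }

  // Model of a timed park whose timeout has already elapsed:
  // returns true if a permit was consumed, false if it "timed out".
  bool timed_park() { return permit_.exchange(false); }

 private:
  std::atomic<bool> permit_{false};
};

// Replays the hazard on one thread. Returns true if the wakeup posted for
// the monitor layer is still there when the monitor layer finally parks.
inline bool wakeup_survives(bool redo_unpark) {
  ToyParkEvent ev;
  ev.unpark();      // exiting thread picks us as successor and unparks us
  ev.timed_park();  // unrelated timed park on the same event eats the permit
  if (redo_unpark) {
    ev.unpark();    // the fix: redo the unpark() after the unrelated park
  }
  return ev.timed_park();  // monitor-layer park: false models the hang
}
```

In this model `wakeup_survives(false)` stands for the hang (the successor parks with no permit left and nobody else will unpark it), while `wakeup_survives(true)` stands for the redone unpark() that the fix places after the JVMTI_EVENT_MONITOR_WAITED handler.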
For the "WaitSet - unlink" use: src/share/vm/runtime/objectMonitor.cpp: 1569 if (node.TState == ObjectWaiter::TS_WAIT) { 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; 1571 if (node.TState == ObjectWaiter::TS_WAIT) { 1572 DequeueSpecificWaiter (&node) ; // unlink from WaitSet 1573 assert(node._notified == 0, "invariant"); 1574 node.TState = ObjectWaiter::TS_RUN ; 1575 } 1576 Thread::SpinRelease (&_WaitSetLock) ; 1577 } It is the call on line 1570 above that gets us into this code: src/share/vm/runtime/thread.cpp: 4435 void Thread::SpinAcquire (volatile int * adr, const char * LockName) { 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { 4437 return ; // normal fast-path return 4438 } 4439 4440 // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. 4441 TEVENT (SpinAcquire - ctx) ; 4442 int ctr = 0 ; 4443 int Yields = 0 ; 4444 for (;;) { 4445 while (*adr != 0) { 4446 ++ctr ; 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { 4448 if (Yields > 5) { 4449 // Consider using a simple NakedSleep() instead. 4450 // Then SpinAcquire could be called by non-JVM threads 4451 Thread::current()->_ParkEvent->park(1) ; 4452 } else { 4453 os::NakedYield() ; 4454 ++Yields ; 4455 } 4456 } else { 4457 SpinPause() ; 4458 } 4459 } 4460 if (Atomic::cmpxchg (1, adr, 0) == 0) return ; 4461 } 4462 } And the above code can consume the unpark() on line 4451. So how the heck do we get to line 1570??? Well, the target thread would have to be both notified and unparked to be executing this code path. When the notify() code runs, the target of the notify() is changed from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default for Knob_MoveNotifyee == 2 so we're in non default mode here... 
Here are the Knob_MoveNotifyee policy values:

    1717      if (Policy == 0) {       // prepend to EntryList
    1728      if (Policy == 1) {       // append to EntryList
    1744      if (Policy == 2) {       // prepend to cxq
    1760      if (Policy == 3) {       // append to cxq

For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism
where we just unpark the target thread and let it run. Part of that
code changes the waiter's state from ObjectWaiter::TS_WAIT to
ObjectWaiter::TS_RUN.

The code works the same for notifyAll() for the thread picked to be
notified. For the Knob_MoveNotifyee == 4 (or higher) case, we just
unpark all the waiters and we have a free-for-all.

So it looks like the code block from lines 1569-1577 is never used...
or is it? Well... you have to remember two things:

1) spurious unpark()
2) timed wait()

The caller might have called wait(0), but that doesn't mean that
the underlying park() mechanism won't have a spurious unpark().
Or better, the caller might have called wait(1) and be running
again after a millisecond.

So in the HSX25 and older system (i.e., without Mr Simms' fix for
8028280), it is possible for this call:

    1570     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;

to consume the unpark(). The gauntlet that has to be traversed
to get to this call:

    4451     Thread::current()->_ParkEvent->park(1) ;

is impressive:

- fast-path acquisition of the _WaitSetLock has to fail:

    4436   if (Atomic::cmpxchg (1, adr, 0) == 0) {
    4437     return ; // normal fast-path return
    4438   }

- if the machine is a uniprocessor, then 6 os::NakedYield()
  call-loop-recheck attempts have to fail:

    4447     if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
    4448       if (Yields > 5) {
    4449         // Consider using a simple NakedSleep() instead.
4450 // Then SpinAcquire could be called by non-JVM threads 4451 Thread::current()->_ParkEvent->park(1) ; 4452 } else { 4453 os::NakedYield() ; 4454 ++Yields ; 4455 } - if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause() attempts, 1 os::NakedYield() attempt} have to fail: 4446 ++ctr ; 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { 4448 if (Yields > 5) { 4449 // Consider using a simple NakedSleep() instead. 4450 // Then SpinAcquire could be called by non-JVM threads 4451 Thread::current()->_ParkEvent->park(1) ; 4452 } else { 4453 os::NakedYield() ; 4454 ++Yields ; 4455 } 4456 } else { 4457 SpinPause() ; 4458 } But it is possible. It is one of those once-in-a-blue moon type windows where everything has to line up just so. So how do we address this issue in HSX-25 and possibly older? If Mr Simms fix for 8028280 is also backported, then there is no issue. If it is not backported, then applying the fix for this bug like so: src/share/vm/runtime/objectMonitor.cpp: 1596 if (JvmtiExport::should_post_monitor_waited()) { 1597 JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT); 1598 } 1604 if (node._notified != 0 && _succ == Self) { 1605 // In this part of the monitor wait-notify-reenter protocol it 1606 // is possible (and normal) for another thread to do a fastpath 1607 // monitor enter-exit while this thread is still trying to get 1608 // to the reenter portion of the protocol. 1609 // 1610 // The ObjectMonitor was notified and the current thread is 1611 // the successor which also means that an unpark() has already 1612 // been done. The JVMTI_EVENT_MONITOR_WAITED event handler can 1613 // consume the unpark() that was done when the successor was 1614 // set because the same ParkEvent is shared between Java 1615 // monitors and JVM/TI RawMonitors (for now). 1616 // 1617 // We redo the unpark() to ensure forward progress, i.e., we 1618 // don't want all pending threads hanging (parked) with none 1619 // entering the unlocked monitor. 
1620 node._event->unpark(); 1621 } Of course the line numbers for the "fix" would be different and the comment would need to be updated to reflect that the: 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; call above could also consume an unpark(), but it should work. If you've read this far, then I'm impressed. If you've read this far and only fallen asleep a couple of times, then I'm still impressed. Summary: I don't think we have an issue in JDK9, but we'll have to do the fix in JDK8/HSX25 and older a little differently. Dan > > David > ----- > >> at all. However, Thread::muxAcquire() does use a ParkEvent, but it >> is a different ParkEvent. From src/share/vm/runtime/thread.hpp: >> >> ParkEvent * _ParkEvent ; // for synchronized() >> ParkEvent * _SleepEvent ; // for Thread.sleep >> ParkEvent * _MutexEvent ; // for native internal >> Mutex/Monitor >> ParkEvent * _MuxEvent ; // for low-level >> muxAcquire-muxRelease >> >> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire() >> uses the _MuxEvent. There are some comments in thread.cpp about >> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't >> think we ever want to go there. >> >> I also filed this RFE: >> >> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >> https://bugs.openjdk.java.net/browse/JDK-8033399 >> >> just in case the Serviceability team wants to migrate JVM/TI RawMonitors >> to a separate ParkEvent. >> >> Please let me know if you concur that I've resolved issue #3. >> >> >>> If so, I wonder if we want this added unpark to not just be called if >>> JVMTI_EVENT_MONITOR_WAITED >>> is enabled? >> >> I don't think we need it, but I've noted its removal as a risk. >> >> Again, thanks for the review! >> >> Dan >> >> >>> >>> thanks, >>> Karen >>> >>> On Feb 1, 2014, at 1:38 PM, Daniel D. 
Daugherty wrote: >>> >>>> Greetings, >>>> >>>> I have a fix ready for the following bug: >>>> >>>> 8028073 race condition in ObjectMonitor implementation causing >>>> deadlocks >>>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>>> >>>> On the surface, this is a very simple fix that relocates a few >>>> lines of >>>> code, relocates and rewrites the comments associated with that code >>>> and >>>> adds several new comments. >>>> >>>> Of course, in reality, the issue is much more complicated, but I'm >>>> hoping to make it easy for anyone not acquainted with this issue to >>>> understand what's going on. >>>> >>>> Here are the JDK9 webrev URLs: >>>> >>>> OpenJDK: >>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>>> >>>> Oracle internal: >>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>>> >>>> >>>> The simple summary: >>>> >>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>>> it is possible for a JVM/TI monitor event handler to accidentally >>>> consume a ParkEvent.unpark() call meant for Java Monitor layer >>>> - the original code fix was made on 2005.07.04 using this bug ID: >>>> https://bugs.openjdk.java.net/browse/JDK-5030359 >>>> - it's the right fix, but it's in the wrong place >>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>>> event handler is called because it is that event handler >>>> that can cause the hang >>>> >>>> >>>> Testing >>>> ------- >>>> >>>> - a new StessMonitorWait test has been created that reliably >>>> reproduces the hang in JDK[6789]; see the bug's gory details >>>> for the specific versions where the hang has been reproduced >>>> - the test reliably reproduces the hang in 5 seconds on my >>>> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>>> the hang reliably on other machines >>>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>>> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>>> bits did not 
reproduce the hang >>>> - JPRT test job >>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>>> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>>> and Windows-X64: >>>> - vm.quick >>>> - Kitchensink (bigapps) >>>> - Weblogic+medrec (bigapps) >>>> - runThese (bigapps) >>>> >>>> >>>> The Gory Details Start Here >>>> --------------------------- >>>> >>>> This is the old location of block of code that's being moved: >>>> >>>> src/share/vm/runtime/objectMonitor.cpp: >>>> >>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, >>>> TRAPS) { >>>> >>>> 1499 exit (true, Self) ; // exit the monitor >>>> >>>> 1513 if (node._notified != 0 && _succ == Self) { >>>> 1514 node._event->unpark(); >>>> 1515 } >>>> >>>> >>>> This is the new location of block of code that's being moved: >>>> >>>> src/share/vm/runtime/objectMonitor.cpp: >>>> >>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, >>>> TRAPS) { >>>> >>>> 1601 if (JvmtiExport::should_post_monitor_waited()) { >>>> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>>> OS_TIMEOUT); >>>> >>>> 1604 if (node._notified != 0 && _succ == Self) { >>>> >>>> 1620 node._event->unpark(); >>>> 1621 } >>>> >>>> >>>> The Risks >>>> --------- >>>> >>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >>>> is enabled: >>>> - previously it was always executed >>>> - while the old code was not effective for the hang that is being >>>> fixed with this bug, it is possible that the old code prevented >>>> a different bug in the successor protocol from manifesting >>>> - thorough analysis of the successor protocol did not reveal a >>>> case where the old code was needed in the old location >>>> - Thorough analysis indicates that the other JVM/TI monitor events >>>> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>>> - the successor protocol is complicated and the analysis could >>>> be wrong when certain options are used >>>> - comments 
were added to each location where a JVM/TI monitor >>>> event handler is called documenting why a fix like this one >>>> is not needed there >>>> - if the analysis is wrong, the new comments show where a new >>>> code change would be needed >>>> >>>> >>>> The Scenario >>>> ------------ >>>> >>>> I've created a scenario that reproduces this hang: >>>> >>>> T1 - enters monitor and calls monitor.wait() >>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor >>>> T3 - enters and exits the monitor >>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor >>>> >>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>>> handler that: enters a raw monitor, waits for 1ms, exits a raw >>>> monitor. >>>> >>>> Here are the six events necessary to make this hang happen: >>>> >>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>>> field, but >>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor >>>> again. >>>> >>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>>> >>>> // KEY-EVENT-3a: T3 made T1 the successor >>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed >>>> it. >>>> >>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>>> >>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>>> >>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event >>>> handler >>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >>>> stuck. 
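The six key events above can be replayed in a small single-threaded model. This is only a sketch under stated assumptions: it treats a ParkEvent as one binary permit (unpark() sets it, park() consumes it), and PermitModel, try_park(), t1_makes_progress() and check_model() are hypothetical names for illustration, not HotSpot code. It tries to show why redoing the unpark() after the event handler returns restores the wakeup:

```cpp
#include <cassert>

// Hypothetical stand-in for a ParkEvent: a single binary permit.
// Assumption: permits do not accumulate, so two unpark() calls still
// leave only one permit to consume.
struct PermitModel {
    bool permit = false;

    void unpark() { permit = true; }

    // Returns true if a permit was available and consumed. In this
    // single-threaded model, false means "a real thread would block here".
    bool try_park() {
        bool had_permit = permit;
        permit = false;
        return had_permit;
    }
};

// Replays the key events in order. Returns true if T1 finds a permit when
// it finally parks, false if T1 would be stuck.
inline bool t1_makes_progress(bool redo_unpark_after_handler) {
    PermitModel t1_event;        // shared by the Java monitor and the raw monitor
    bool t1_is_successor = false;

    // KEY-EVENT-1b / 2a: T3 exits, makes T1 the successor and unparks it.
    t1_is_successor = true;
    t1_event.unpark();

    // KEY-EVENT-2b: the event handler's timed raw-monitor wait parks on the
    // same event and eats the permit.
    t1_event.try_park();

    // The relocated fix, reduced to one flag: redo the unpark() after the
    // handler has run if T1 is still the successor.
    if (redo_unpark_after_handler && t1_is_successor) {
        t1_event.unpark();
    }

    // KEY-EVENT-3b .. 5b: T4 takes the monitor, then sees a successor on
    // exit and takes the quick exit path, so nobody unparks T1 here.

    // KEY-EVENT-6a: T1 parks while waiting to reenter the monitor.
    return t1_event.try_park();
}

// Usage: without the redo T1 is stranded, with it T1 proceeds.
inline void check_model() {
    assert(!t1_makes_progress(false));
    assert(t1_makes_progress(true));
}
```

The real condition in the fix is `node._notified != 0 && _succ == Self`; the model collapses that into a single flag and ignores every other path through the successor protocol.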
>>>> >>>> >>>> This bug is intertwined with: >>>> >>>> - The ObjectMonitor successor protocol >>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI >>>> RawMonitors >>>> >>>> There is a very long successor.notes attachment to JDK-8028073 that >>>> attempts to describe the ObjectMonitor successor protocol. It's good >>>> for putting pretty much anyone to sleep. >>>> >>>> Since this hang reproduces back to JDK6, this bug is taking the easily >>>> backported solution of moving the original fix to the right location. >>>> The following new bug has been filed for possible future work in this >>>> area by the Serviceability Team: >>>> >>>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>>> >>>> >>>> The Symptoms >>>> ------------ >>>> >>>> With intermittent hangs like this, it is useful to know what to look >>>> for in order to determine if you are running into this issue: >>>> >>>> - if you aren't using a debugger or a profiler or some other >>>> JVM/TI agent, then this hang is not the same as yours >>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>>> event handler, then this hang is not the same as yours >>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>>> JVM/TI RawMonitors, then this hang is not the same as yours >>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>>> back into Java code, then you might just be insane and this >>>> hang might be similar to yours. However, using a Java callback >>>> in an event handler is an even bigger problem/risk so fix that >>>> first. 
>>>>
>>>> - if you have one or more threads blocked like this and making no
>>>>   progress, then this hang might be the same as yours:
>>>>
>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>    JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>    JavaThread state: _thread_blocked
>>>>         at java.lang.Object.wait(Native Method)
>>>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>         at java.lang.Object.wait(Object.java:502)
>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>    JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>    JavaThread state: _thread_blocked
>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>    JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>    JavaThread state: _thread_blocked
>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> Key symptoms in thread T1:
>>>>
>>>> - had the object locked:
>>>>
>>>>     locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> - did an Object.wait():
>>>>
>>>>     waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> - is
blocked on reentry:
>>>>
>>>>     waiting for monitor entry [0xfffffd7fc0231000]
>>>>
>>>> Key symptoms in thread T2:
>>>>
>>>> - is blocked waiting to lock the object:
>>>>
>>>>     waiting for monitor entry [0xfffffd7fc0130000]
>>>>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> Key symptoms in thread T3:
>>>>
>>>> - is blocked waiting to lock the object:
>>>>
>>>>     waiting for monitor entry [0xfffffd7fc002f000]
>>>>     waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>

From karen.kinnear at oracle.com Mon Feb 10 12:20:57 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Mon, 10 Feb 2014 15:20:57 -0500
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52F920AD.6050705@oracle.com>
References: <52ED3F28.5070806@oracle.com> <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> <52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com> <52F920AD.6050705@oracle.com>
Message-ID: <5B22D063-1125-4E1D-8D96-C5A5B3E78BA0@oracle.com>

Dan,

Thank you so much. My bad - I was looking at a jdk8 repo, not a jdk9 one.
So I agree that the JDK9 fix as is works. Code change reviewed.

For JDK8:
I don't believe we were planning to backport this to 8 given risks of changes in this area.

I did reach the same conclusion you did, that the WaitSetLock acquirers who already own the
lock don't have this issue, but those that don't already own the lock do have the problem,
and the timed wait could trigger this. And that a JDK8 fix would take the change out of the
jvmti conditional, or need the 8028280 fix, which I also believe we do not plan to backport.

thank you for the detailed walk-through,
Karen

On Feb 10, 2014, at 1:55 PM, Daniel D. Daugherty wrote:

> On 2/9/14 8:37 PM, David Holmes wrote:
>> trimming content ...
>>
>> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote:
>>> On 2/7/14 2:56 PM, Karen Kinnear wrote:
>>>> 3.
Did I read the code correctly that the Thread::SpinAcquire can make >>>> a timed park >>>> call on the same thread's _ParkEvent? And that this is used to get on >>>> and off the wait queue, >>>> i.e. to acquire the WaitSetLock? >>>> Is there the same risk that a notify might be eaten here also? >>> >>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent >> >> It sure does: >> >> void Thread::SpinAcquire (volatile int * adr, const char * LockName) { >> if (Atomic::cmpxchg (1, adr, 0) == 0) { >> return ; // normal fast-path return >> } >> >> // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. >> TEVENT (SpinAcquire - ctx) ; >> int ctr = 0 ; >> int Yields = 0 ; >> for (;;) { >> while (*adr != 0) { >> ++ctr ; >> if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> if (Yields > 5) { >> // Consider using a simple NakedSleep() instead. >> // Then SpinAcquire could be called by non-JVM threads >> Thread::current()->_ParkEvent->park(1) ; > > Ummmm... that's not the code I'm seeing... > > src/share/vm/runtime/thread.cpp: > > 4417 void Thread::SpinAcquire (volatile int * adr, const char * LockName) { > 4418 if (Atomic::cmpxchg (1, adr, 0) == 0) { > 4419 return ; // normal fast-path return > 4420 } > 4421 > 4422 // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. 
> 4423   TEVENT (SpinAcquire - ctx) ;
> 4424   int ctr = 0 ;
> 4425   int Yields = 0 ;
> 4426   for (;;) {
> 4427      while (*adr != 0) {
> 4428         ++ctr ;
> 4429         if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
> 4430            if (Yields > 5) {
> 4431              os::naked_short_sleep(1);
> 4432            } else {
> 4433              os::NakedYield() ;
> 4434              ++Yields ;
> 4435            }
> 4436         } else {
> 4437            SpinPause() ;
> 4438         }
> 4439      }
> 4440      if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
> 4441   }
> 4442 }
>
> Mr Simms recently changed the above code via:
>
> changeset:   5832:5944dba4badc
> user:        dsimms
> date:        Fri Jan 24 09:28:47 2014 +0100
> summary:     8028280: ParkEvent leak when running modified runThese which only loads classes
>
> os::naked_short_sleep() is new:
>
> - BSD/MacOS X, Linux - uses nanosleep()
> - Solaris - uses usleep()
> - Windows - uses Sleep()
>
> The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs
> on 2014.01.29. I don't see any signs that Mr Simms' fix will be backported
> to JDK8u/HSX-25u (yet) so this part of the review thread might impact the
> backport of my fix to earlier releases.
>
>
>> So considering Karen's question ... I can't tell for certain. :(
>>
>> I do not think the SpinAcquire on grabbing the wait-set lock to add to the wait-set can be an issue because we will only park in response to the actual wait, and hence only get unparked due to a notify/notifyAll, but at this point we still own the monitor so no notify/notifyAll is possible.
>>
>> However, for the removal from the wait-set a more complex analysis is needed. To do the SpinAcquire we must still be flagged as TS_WAIT - which means we have not been notified, but must be returning due to a timeout (or spurious wakeup?). In such circumstances could we be _succ? I don't think so but I'll leave it to Dan to confirm that part :)
>
> So for HSX-25 and probably older...
> > There are four Thread::SpinAcquire() calls in the objectMonitor code: > > Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ; > Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; > Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ; > Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ; > > We can easily rule out the "notify" and "notifyAll" uses since the > current thread owns the Java-level monitor and there are no events > to post in this part of the notify() or notifyAll() protocols. > > For the "WaitSet - add" use, the current thread owns the Java-level > monitor and the thread has not been added as a waiter yet so another > thread cannot do the notify-exit-make-successor part of the protocol > yet. > > For the "WaitSet - unlink" use: > > src/share/vm/runtime/objectMonitor.cpp: > > 1569 if (node.TState == ObjectWaiter::TS_WAIT) { > 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; > 1571 if (node.TState == ObjectWaiter::TS_WAIT) { > 1572 DequeueSpecificWaiter (&node) ; // unlink from WaitSet > 1573 assert(node._notified == 0, "invariant"); > 1574 node.TState = ObjectWaiter::TS_RUN ; > 1575 } > 1576 Thread::SpinRelease (&_WaitSetLock) ; > 1577 } > > It is the call on line 1570 above that gets us into this code: > > src/share/vm/runtime/thread.cpp: > > 4435 void Thread::SpinAcquire (volatile int * adr, const char * LockName) { > 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { > 4437 return ; // normal fast-path return > 4438 } > 4439 > 4440 // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. > 4441 TEVENT (SpinAcquire - ctx) ; > 4442 int ctr = 0 ; > 4443 int Yields = 0 ; > 4444 for (;;) { > 4445 while (*adr != 0) { > 4446 ++ctr ; > 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { > 4448 if (Yields > 5) { > 4449 // Consider using a simple NakedSleep() instead. 
> 4450              // Then SpinAcquire could be called by non-JVM threads
> 4451              Thread::current()->_ParkEvent->park(1) ;
> 4452            } else {
> 4453              os::NakedYield() ;
> 4454              ++Yields ;
> 4455            }
> 4456         } else {
> 4457            SpinPause() ;
> 4458         }
> 4459      }
> 4460      if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
> 4461   }
> 4462 }
>
> And the above code can consume the unpark() on line 4451.
>
> So how the heck do we get to line 1570???
>
> Well, the target thread would have to be both notified and unparked
> to be executing this code path. When the notify() code runs, the
> target of the notify() is changed from ObjectWaiter::TS_WAIT to
> ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default
> for Knob_MoveNotifyee == 2 so we're in non default mode here...
>
> Here are the Knob_MoveNotifyee policy values:
>
> 1717     if (Policy == 0) {       // prepend to EntryList
> 1728     if (Policy == 1) {      // append to EntryList
> 1744     if (Policy == 2) {      // prepend to cxq
> 1760     if (Policy == 3) {      // append to cxq
>
> For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism
> where we just unpark the target thread and let it run. Part of
> that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN.
>
> The code works the same for notifyAll() for the thread picked
> to be notified. For the Knob_MoveNotifyee == 4 (or higher) case,
> we just unpark all the waiters and we have a free-for-all.
>
> So it looks like the code block from lines 1569-1577 is never
> used... or is it? Well... you have to remember two things:
>
> 1) spurious unpark()
> 2) timed wait()
>
> The caller might have called wait(0), but that doesn't mean that
> the underlying park() mechanism won't have a spurious unpark().
> Or better, the caller might have called wait(1) and be running
> again after a millisecond.
>
> So in the HSX25 and older system (i.e., without Mr Simms' fix for
> 8028280), it is possible for this call:
>
> 1570       Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>
> to consume the unpark().
The gauntlet that has to be traversed > to get to this call: > > 4451 Thread::current()->_ParkEvent->park(1) ; > > is impressive: > > - fast-path acquisition of the _WaitSetLock has to fail: > > 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { > 4437 return ; // normal fast-path return > 4438 } > > - if the machine is a uniprocessor, then 6 os::NakedYield() > call-loop-recheck attempts have to fail: > > 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { > 4448 if (Yields > 5) { > 4449 // Consider using a simple NakedSleep() instead. > 4450 // Then SpinAcquire could be called by non-JVM threads > 4451 Thread::current()->_ParkEvent->park(1) ; > 4452 } else { > 4453 os::NakedYield() ; > 4454 ++Yields ; > 4455 } > > - if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause() > attempts, 1 os::NakedYield() attempt} have to fail: > > 4446 ++ctr ; > 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { > 4448 if (Yields > 5) { > 4449 // Consider using a simple NakedSleep() instead. > 4450 // Then SpinAcquire could be called by non-JVM threads > 4451 Thread::current()->_ParkEvent->park(1) ; > 4452 } else { > 4453 os::NakedYield() ; > 4454 ++Yields ; > 4455 } > 4456 } else { > 4457 SpinPause() ; > 4458 } > > But it is possible. It is one of those once-in-a-blue moon type > windows where everything has to line up just so. > > So how do we address this issue in HSX-25 and possibly older? > > If Mr Simms fix for 8028280 is also backported, then there is no > issue. 
If it is not backported, then applying the fix for this > bug like so: > > src/share/vm/runtime/objectMonitor.cpp: > > 1596 if (JvmtiExport::should_post_monitor_waited()) { > 1597 JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT); > 1598 } > > 1604 if (node._notified != 0 && _succ == Self) { > 1605 // In this part of the monitor wait-notify-reenter protocol it > 1606 // is possible (and normal) for another thread to do a fastpath > 1607 // monitor enter-exit while this thread is still trying to get > 1608 // to the reenter portion of the protocol. > 1609 // > 1610 // The ObjectMonitor was notified and the current thread is > 1611 // the successor which also means that an unpark() has already > 1612 // been done. The JVMTI_EVENT_MONITOR_WAITED event handler can > 1613 // consume the unpark() that was done when the successor was > 1614 // set because the same ParkEvent is shared between Java > 1615 // monitors and JVM/TI RawMonitors (for now). > 1616 // > 1617 // We redo the unpark() to ensure forward progress, i.e., we > 1618 // don't want all pending threads hanging (parked) with none > 1619 // entering the unlocked monitor. > 1620 node._event->unpark(); > 1621 } > > Of course the line numbers for the "fix" would be different and the comment > would need to be updated to reflect that the: > > 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; > > call above could also consume an unpark(), but it should work. > > If you've read this far, then I'm impressed. If you've read this far > and only fallen asleep a couple of times, then I'm still impressed. > > Summary: I don't think we have an issue in JDK9, but we'll have to do > the fix in JDK8/HSX25 and older a little differently. > > Dan > > >> >> David >> ----- >> >>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it >>> is a different ParkEvent. 
From src/share/vm/runtime/thread.hpp: >>> >>> ParkEvent * _ParkEvent ; // for synchronized() >>> ParkEvent * _SleepEvent ; // for Thread.sleep >>> ParkEvent * _MutexEvent ; // for native internal >>> Mutex/Monitor >>> ParkEvent * _MuxEvent ; // for low-level >>> muxAcquire-muxRelease >>> >>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire() >>> uses the _MuxEvent. There are some comments in thread.cpp about >>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't >>> think we ever want to go there. >>> >>> I also filed this RFE: >>> >>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>> >>> just in case the Serviceability team wants to migrate JVM/TI RawMonitors >>> to a separate ParkEvent. >>> >>> Please let me know if you concur that I've resolved issue #3. >>> >>> >>>> If so, I wonder if we want this added unpark to not just be called if >>>> JVMTI_EVENT_MONITOR_WAITED >>>> is enabled? >>> >>> I don't think we need it, but I've noted its removal as a risk. >>> >>> Again, thanks for the review! >>> >>> Dan >>> >>> >>>> >>>> thanks, >>>> Karen >>>> >>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote: >>>> >>>>> Greetings, >>>>> >>>>> I have a fix ready for the following bug: >>>>> >>>>> 8028073 race condition in ObjectMonitor implementation causing >>>>> deadlocks >>>>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>>>> >>>>> On the surface, this is a very simple fix that relocates a few lines of >>>>> code, relocates and rewrites the comments associated with that code and >>>>> adds several new comments. >>>>> >>>>> Of course, in reality, the issue is much more complicated, but I'm >>>>> hoping to make it easy for anyone not acquainted with this issue to >>>>> understand what's going on. 
>>>>> >>>>> Here are the JDK9 webrev URLs: >>>>> >>>>> OpenJDK: >>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>>>> >>>>> Oracle internal: >>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>>>> >>>>> The simple summary: >>>>> >>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>>>> it is possible for a JVM/TI monitor event handler to accidentally >>>>> consume a ParkEvent.unpark() call meant for Java Monitor layer >>>>> - the original code fix was made on 2005.07.04 using this bug ID: >>>>> https://bugs.openjdk.java.net/browse/JDK-5030359 >>>>> - it's the right fix, but it's in the wrong place >>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>>>> event handler is called because it is that event handler >>>>> that can cause the hang >>>>> >>>>> >>>>> Testing >>>>> ------- >>>>> >>>>> - a new StessMonitorWait test has been created that reliably >>>>> reproduces the hang in JDK[6789]; see the bug's gory details >>>>> for the specific versions where the hang has been reproduced >>>>> - the test reliably reproduces the hang in 5 seconds on my >>>>> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>>>> the hang reliably on other machines >>>>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>>>> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>>>> bits did not reproduce the hang >>>>> - JPRT test job >>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>>>> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>>>> and Windows-X64: >>>>> - vm.quick >>>>> - Kitchensink (bigapps) >>>>> - Weblogic+medrec (bigapps) >>>>> - runThese (bigapps) >>>>> >>>>> >>>>> The Gory Details Start Here >>>>> --------------------------- >>>>> >>>>> This is the old location of block of code that's being moved: >>>>> >>>>> src/share/vm/runtime/objectMonitor.cpp: >>>>> >>>>> 1440 void ObjectMonitor::wait(jlong millis, bool 
interruptible, TRAPS) { >>>>> >>>>> 1499 exit (true, Self) ; // exit the monitor >>>>> >>>>> 1513 if (node._notified != 0 && _succ == Self) { >>>>> 1514 node._event->unpark(); >>>>> 1515 } >>>>> >>>>> >>>>> This is the new location of block of code that's being moved: >>>>> >>>>> src/share/vm/runtime/objectMonitor.cpp: >>>>> >>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >>>>> >>>>> 1601 if (JvmtiExport::should_post_monitor_waited()) { >>>>> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>>>> OS_TIMEOUT); >>>>> >>>>> 1604 if (node._notified != 0 && _succ == Self) { >>>>> >>>>> 1620 node._event->unpark(); >>>>> 1621 } >>>>> >>>>> >>>>> The Risks >>>>> --------- >>>>> >>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >>>>> is enabled: >>>>> - previously it was always executed >>>>> - while the old code was not effective for the hang that is being >>>>> fixed with this bug, it is possible that the old code prevented >>>>> a different bug in the successor protocol from manifesting >>>>> - thorough analysis of the successor protocol did not reveal a >>>>> case where the old code was needed in the old location >>>>> - Thorough analysis indicates that the other JVM/TI monitor events >>>>> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>>>> - the successor protocol is complicated and the analysis could >>>>> be wrong when certain options are used >>>>> - comments were added to each location where a JVM/TI monitor >>>>> event handler is called documenting why a fix like this one >>>>> is not needed there >>>>> - if the analysis is wrong, the new comments show where a new >>>>> code change would be needed >>>>> >>>>> >>>>> The Scenario >>>>> ------------ >>>>> >>>>> I've created a scenario that reproduces this hang: >>>>> >>>>> T1 - enters monitor and calls monitor.wait() >>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor >>>>> T3 - enters and exits the monitor >>>>> 
T4 - enters the monitor, delays for 5 seconds, exits the monitor >>>>> >>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. >>>>> >>>>> Here are the six events necessary to make this hang happen: >>>>> >>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>>>> field, but >>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor >>>>> again. >>>>> >>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>>>> >>>>> // KEY-EVENT-3a: T3 made T1 the successor >>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. >>>>> >>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>>>> >>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>>>> >>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler >>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >>>>> stuck. >>>>> >>>>> >>>>> This bug is intertwined with: >>>>> >>>>> - The ObjectMonitor successor protocol >>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI >>>>> RawMonitors >>>>> >>>>> There is a very long successor.notes attachment to JDK-8028073 that >>>>> attempts to describe the ObjectMonitor successor protocol. It's good >>>>> for putting pretty much anyone to sleep. >>>>> >>>>> Since this hang reproduces back to JDK6, this bug is taking the easily >>>>> backported solution of moving the original fix to the right location. 
>>>>> The following new bug has been filed for possible future work in this >>>>> area by the Serviceability Team: >>>>> >>>>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>>>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>>>> >>>>> >>>>> The Symptoms >>>>> ------------ >>>>> >>>>> With intermittent hangs like this, it is useful to know what to look >>>>> for in order to determine if you are running into this issue: >>>>> >>>>> - if you aren't using a debugger or a profiler or some other >>>>> JVM/TI agent, then this hang is not the same as yours >>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>>>> event handler, then this hang is not the same as yours >>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>>>> JVM/TI RawMonitors, then this hang is not the same as yours >>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>>>> back into Java code, then you might just be insane and this >>>>> hang might be similar to yours. However, using a Java callback >>>>> in an event handler is an even bigger problem/risk so fix that >>>>> first. 
>>>>> - if you have one or more threads blocked like this and making no >>>>> progress, then this hang might be the same as yours: >>>>> >>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] >>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>> JavaThread state: _thread_blocked >>>>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>> JavaThread state: _thread_blocked >>>>> at java.lang.Object.wait(Native Method) >>>>> - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> at java.lang.Object.wait(Object.java:502) >>>>> at SMW_WorkerThread.run(StressMonitorWait.java:103) >>>>> - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] >>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>> JavaThread state: _thread_blocked >>>>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>> JavaThread state: _thread_blocked >>>>> at SMW_WorkerThread.run(StressMonitorWait.java:120) >>>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] >>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>> JavaThread state: _thread_blocked >>>>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>> JavaThread state: _thread_blocked >>>>> at SMW_WorkerThread.run(StressMonitorWait.java:139) >>>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> Key symptoms in thread T1: >>>>> >>>>> - had the object locked: >>>>> >>>>> locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> - did an Object.wait(): >>>>> >>>>> waiting on
<0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> - is blocked on reentry: >>>>> >>>>> waiting for monitor entry [0xfffffd7fc0231000] >>>>> >>>>> Key symtoms in thread T2: >>>>> >>>>> - is blocked waiting to lock the object: >>>>> >>>>> waiting for monitor entry [0xfffffd7fc0130000] >>>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> Key symtoms in thread T3: >>>>> >>>>> - is blocked waiting to lock the object: >>>>> >>>>> waiting for monitor entry [0xfffffd7fc002f000] >>>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>> > From daniel.daugherty at oracle.com Mon Feb 10 12:31:20 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 10 Feb 2014 13:31:20 -0700 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <5B22D063-1125-4E1D-8D96-C5A5B3E78BA0@oracle.com> References: <52ED3F28.5070806@oracle.com> <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> <52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com> <52F920AD.6050705@oracle.com> <5B22D063-1125-4E1D-8D96-C5A5B3E78BA0@oracle.com> Message-ID: <52F93718.5010007@oracle.com> On 2/10/14 1:20 PM, Karen Kinnear wrote: > Dan, > > Thank you so much. My bad - I was looking at a jdk8 repo, not a jdk9 one. No problem... I had the advantage of wanting Mr Simms changes so that I could (more easily) develop the debug code flow hooks that I'm planning to add to the "debug tips and tricks" wiki... > So I agree that the JDK9 fix as is works. Code change reviewed. Thanks for confirmation! > For JDK8: > I don't believe we were planning to backport this to 8 given risks of changes in this area. Ummm.... Not JDK8-GA, but definitely a JDK8-Update... As usual, I plan to do the backport engineering and I'll let someone else worry about the politics... 
:-) > I did reach the same conclusion you did, that the WaitSetLock acquirers who already own > the lock don't have this issue, but those that don't already own the lock do have > the problem, and the timed wait could trigger this. > And that a JDK8 fix would take the change out of the jvmti conditional, or need the 8028280 > fix, which I also believe we do not plan to backport. Yeah, I'll chat with Mr Simms about backporting 8028280... That os::naked_short_sleep() function is so very useful... > thank you for the detailed walk-through, No problem. Thank you for slogging through the details here. Dan > Karen > > On Feb 10, 2014, at 1:55 PM, Daniel D. Daugherty wrote: > >> On 2/9/14 8:37 PM, David Holmes wrote: >>> trimming content ... >>> >>> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote: >>>> On 2/7/14 2:56 PM, Karen Kinnear wrote: >>>>> 3. Did I read the code correctly that the Thread::SpinAcquire can make >>>>> a timed park >>>>> call on the same thread's _ParkEvent? And that this is used to get on >>>>> and off the wait queue, >>>>> i.e. to acquire the WaitSetLock? >>>>> Is there the same risk that a notify might be eaten here also? >>>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent >>> It sure does: >>> >>> void Thread::SpinAcquire (volatile int * adr, const char * LockName) { >>> if (Atomic::cmpxchg (1, adr, 0) == 0) { >>> return ; // normal fast-path return >>> } >>> >>> // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. >>> TEVENT (SpinAcquire - ctx) ; >>> int ctr = 0 ; >>> int Yields = 0 ; >>> for (;;) { >>> while (*adr != 0) { >>> ++ctr ; >>> if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >>> if (Yields > 5) { >>> // Consider using a simple NakedSleep() instead. >>> // Then SpinAcquire could be called by non-JVM threads >>> Thread::current()->_ParkEvent->park(1) ; >> Ummmm... that's not the code I'm seeing... 
>> >> src/share/vm/runtime/thread.cpp: >> >> 4417 void Thread::SpinAcquire (volatile int * adr, const char * LockName) { >> 4418 if (Atomic::cmpxchg (1, adr, 0) == 0) { >> 4419 return ; // normal fast-path return >> 4420 } >> 4421 >> 4422 // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. >> 4423 TEVENT (SpinAcquire - ctx) ; >> 4424 int ctr = 0 ; >> 4425 int Yields = 0 ; >> 4426 for (;;) { >> 4427 while (*adr != 0) { >> 4428 ++ctr ; >> 4429 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> 4430 if (Yields > 5) { >> 4431 os::naked_short_sleep(1); >> 4432 } else { >> 4433 os::NakedYield() ; >> 4434 ++Yields ; >> 4435 } >> 4436 } else { >> 4437 SpinPause() ; >> 4438 } >> 4439 } >> 4440 if (Atomic::cmpxchg (1, adr, 0) == 0) return ; >> 4441 } >> 4442 } >> >> Mr Simms recently changed the above code via: >> >> changeset: 5832:5944dba4badc >> user: dsimms >> date: Fri Jan 24 09:28:47 2014 +0100 >> summary: 8028280: ParkEvent leak when running modified runThese which only loads classes >> >> os::naked_short_sleep() is new: >> >> - BSD/MacOS X, Linux - uses nanosleep() >> - Solaris - uses usleep() >> - Windows - uses Sleep() >> >> The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs >> on 2014.01.29. I don't see any signs that Mr Simm's fix will be backported >> to JDK8u/HSX-25u (yet) so this part of the review thread might impact the >> backport of my fix to earlier releases. >> >> >>> So considering Karen's question ... I can't tell for certain. :( >>> >>> I do not think the SpinAcquire on grabbing the wait-set lock to add to the wait-set can be an issue because we will only park in response to the actual wait, and hence only get unparked due to a notify/notifyAll, but at this point we still own the monitor so no notify/notifyAll is possible. >>> >>> However, for the removal from the wait-set a more complex analysis is needed. 
To do the SpinAcquire we must still be flagged as TS_WAIT - which means we have not been notified, but must be returning due to a timeout (or spurious wakeup?). In such circumstances could we be _succ? I don't think so but I'll leave it to Dan to confirm that part :) >> So for HSX-25 and probably older... >> >> There are four Thread::SpinAcquire() calls in the objectMonitor code: >> >> Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ; >> Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; >> Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ; >> Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ; >> >> We can easily rule out the "notify" and "notifyAll" uses since the >> current thread owns the Java-level monitor and there are no events >> to post in this part of the notify() or notifyAll() protocols. >> >> For the "WaitSet - add" use, the current thread owns the Java-level >> monitor and the thread has not been added as a waiter yet so another >> thread cannot do the notify-exit-make-successor part of the protocol >> yet. >> >> For the "WaitSet - unlink" use: >> >> src/share/vm/runtime/objectMonitor.cpp: >> >> 1569 if (node.TState == ObjectWaiter::TS_WAIT) { >> 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; >> 1571 if (node.TState == ObjectWaiter::TS_WAIT) { >> 1572 DequeueSpecificWaiter (&node) ; // unlink from WaitSet >> 1573 assert(node._notified == 0, "invariant"); >> 1574 node.TState = ObjectWaiter::TS_RUN ; >> 1575 } >> 1576 Thread::SpinRelease (&_WaitSetLock) ; >> 1577 } >> >> It is the call on line 1570 above that gets us into this code: >> >> src/share/vm/runtime/thread.cpp: >> >> 4435 void Thread::SpinAcquire (volatile int * adr, const char * LockName) { >> 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { >> 4437 return ; // normal fast-path return >> 4438 } >> 4439 >> 4440 // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. 
>> 4441 TEVENT (SpinAcquire - ctx) ; >> 4442 int ctr = 0 ; >> 4443 int Yields = 0 ; >> 4444 for (;;) { >> 4445 while (*adr != 0) { >> 4446 ++ctr ; >> 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> 4448 if (Yields > 5) { >> 4449 // Consider using a simple NakedSleep() instead. >> 4450 // Then SpinAcquire could be called by non-JVM threads >> 4451 Thread::current()->_ParkEvent->park(1) ; >> 4452 } else { >> 4453 os::NakedYield() ; >> 4454 ++Yields ; >> 4455 } >> 4456 } else { >> 4457 SpinPause() ; >> 4458 } >> 4459 } >> 4460 if (Atomic::cmpxchg (1, adr, 0) == 0) return ; >> 4461 } >> 4462 } >> >> And the above code can consume the unpark() on line 4451. >> >> So how the heck do we get to line 1570??? >> >> Well, the target thread would have to be both notified and unparked >> to be executing this code path. When the notify() code runs, the >> target of the notify() is changed from ObjectWaiter::TS_WAIT to >> ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default >> for Knob_MoveNotifyee is 2, so we're in non-default mode here... >> >> Here are the Knob_MoveNotifyee policy values: >> >> 1717 if (Policy == 0) { // prepend to EntryList >> 1728 if (Policy == 1) { // append to EntryList >> 1744 if (Policy == 2) { // prepend to cxq >> 1760 if (Policy == 3) { // append to cxq >> >> For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism >> where we just unpark the target thread and let it run. Part of >> that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN. >> >> The code works the same for notifyAll() for the thread picked >> to be notified. For the Knob_MoveNotifyee == 4 (or higher) case, >> we just unpark all the waiters and we have a free-for-all. >> >> So it looks like the code block from lines 1569-1577 is never >> used... or is it? Well...
you have to remember two things: >> >> 1) spurious unpark() >> 2) timed wait() >> >> The caller might have called wait(0), but that doesn't mean that >> the underlying park() mechanism won't have a spurious unpark(). >> Or better, the caller might have called wait(1) and be running >> again after a millisecond. >> >> So in the HSX25 and older system (i.e., without Mr Simms fix for >> 8028280), it is possible for this call: >> >> 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; >> >> to consume the unpark(). The gauntlet that has to be traversed >> to get to this call: >> >> 4451 Thread::current()->_ParkEvent->park(1) ; >> >> is impressive: >> >> - fast-path acquisition of the _WaitSetLock has to fail: >> >> 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { >> 4437 return ; // normal fast-path return >> 4438 } >> >> - if the machine is a uniprocessor, then 6 os::NakedYield() >> call-loop-recheck attempts have to fail: >> >> 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> 4448 if (Yields > 5) { >> 4449 // Consider using a simple NakedSleep() instead. >> 4450 // Then SpinAcquire could be called by non-JVM threads >> 4451 Thread::current()->_ParkEvent->park(1) ; >> 4452 } else { >> 4453 os::NakedYield() ; >> 4454 ++Yields ; >> 4455 } >> >> - if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause() >> attempts, 1 os::NakedYield() attempt} have to fail: >> >> 4446 ++ctr ; >> 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> 4448 if (Yields > 5) { >> 4449 // Consider using a simple NakedSleep() instead. >> 4450 // Then SpinAcquire could be called by non-JVM threads >> 4451 Thread::current()->_ParkEvent->park(1) ; >> 4452 } else { >> 4453 os::NakedYield() ; >> 4454 ++Yields ; >> 4455 } >> 4456 } else { >> 4457 SpinPause() ; >> 4458 } >> >> But it is possible. It is one of those once-in-a-blue moon type >> windows where everything has to line up just so. >> >> So how do we address this issue in HSX-25 and possibly older? 
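The size of that gauntlet can be checked by replaying the control flow of the quoted slow path. What follows is only a minimal sketch under stated assumptions: the lock word is assumed never to become free, `SpinTally` and `count_until_block` are hypothetical names that do not exist in HotSpot, and the `is_mp` parameter stands in for os::is_MP(). It counts what a contended caller does before it first reaches the blocking step (the park(1) in the pre-8028280 code).

```cpp
#include <cassert>

// Hypothetical tally of the work done before the first blocking call.
// Not a HotSpot type; used for illustration only.
struct SpinTally {
  long spin_pauses = 0;  // number of SpinPause() calls
  int  yields      = 0;  // number of os::NakedYield() calls
};

// Replays the branch structure of the quoted Thread::SpinAcquire() slow
// path, assuming the lock word stays held the whole time.
// `is_mp` is an assumption standing in for os::is_MP().
inline SpinTally count_until_block(bool is_mp) {
  SpinTally t;
  int ctr = 0;
  int Yields = 0;
  for (;;) {
    ++ctr;
    if ((ctr & 0xFFF) == 0 || !is_mp) {
      if (Yields > 5) {
        // This is where the older code calls _ParkEvent->park(1).
        return t;
      }
      ++t.yields;  // models os::NakedYield()
      ++Yields;
    } else {
      ++t.spin_pauses;  // models SpinPause()
    }
  }
}
```

With `is_mp == true` the sketch reports 6 yields and 7 * 4095 spin pauses: the six spin-and-yield rounds described above, plus one more round of spinning before the blocking call. With `is_mp == false` it reports 6 yields and no spin pauses.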
>> >> If Mr Simms fix for 8028280 is also backported, then there is no >> issue. If it is not backported, then applying the fix for this >> bug like so: >> >> src/share/vm/runtime/objectMonitor.cpp: >> >> 1596 if (JvmtiExport::should_post_monitor_waited()) { >> 1597 JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT); >> 1598 } >> >> 1604 if (node._notified != 0 && _succ == Self) { >> 1605 // In this part of the monitor wait-notify-reenter protocol it >> 1606 // is possible (and normal) for another thread to do a fastpath >> 1607 // monitor enter-exit while this thread is still trying to get >> 1608 // to the reenter portion of the protocol. >> 1609 // >> 1610 // The ObjectMonitor was notified and the current thread is >> 1611 // the successor which also means that an unpark() has already >> 1612 // been done. The JVMTI_EVENT_MONITOR_WAITED event handler can >> 1613 // consume the unpark() that was done when the successor was >> 1614 // set because the same ParkEvent is shared between Java >> 1615 // monitors and JVM/TI RawMonitors (for now). >> 1616 // >> 1617 // We redo the unpark() to ensure forward progress, i.e., we >> 1618 // don't want all pending threads hanging (parked) with none >> 1619 // entering the unlocked monitor. >> 1620 node._event->unpark(); >> 1621 } >> >> Of course the line numbers for the "fix" would be different and the comment >> would need to be updated to reflect that the: >> >> 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; >> >> call above could also consume an unpark(), but it should work. >> >> If you've read this far, then I'm impressed. If you've read this far >> and only fallen asleep a couple of times, then I'm still impressed. >> >> Summary: I don't think we have an issue in JDK9, but we'll have to do >> the fix in JDK8/HSX25 and older a little differently. >> >> Dan >> >> >>> David >>> ----- >>> >>>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it >>>> is a different ParkEvent. 
From src/share/vm/runtime/thread.hpp: >>>> >>>> ParkEvent * _ParkEvent ; // for synchronized() >>>> ParkEvent * _SleepEvent ; // for Thread.sleep >>>> ParkEvent * _MutexEvent ; // for native internal >>>> Mutex/Monitor >>>> ParkEvent * _MuxEvent ; // for low-level >>>> muxAcquire-muxRelease >>>> >>>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire() >>>> uses the _MuxEvent. There are some comments in thread.cpp about >>>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't >>>> think we ever want to go there. >>>> >>>> I also filed this RFE: >>>> >>>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>>> >>>> just in case the Serviceability team wants to migrate JVM/TI RawMonitors >>>> to a separate ParkEvent. >>>> >>>> Please let me know if you concur that I've resolved issue #3. >>>> >>>> >>>>> If so, I wonder if we want this added unpark to not just be called if >>>>> JVMTI_EVENT_MONITOR_WAITED >>>>> is enabled? >>>> I don't think we need it, but I've noted its removal as a risk. >>>> >>>> Again, thanks for the review! >>>> >>>> Dan >>>> >>>> >>>>> thanks, >>>>> Karen >>>>> >>>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote: >>>>> >>>>>> Greetings, >>>>>> >>>>>> I have a fix ready for the following bug: >>>>>> >>>>>> 8028073 race condition in ObjectMonitor implementation causing >>>>>> deadlocks >>>>>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>>>>> >>>>>> On the surface, this is a very simple fix that relocates a few lines of >>>>>> code, relocates and rewrites the comments associated with that code and >>>>>> adds several new comments. >>>>>> >>>>>> Of course, in reality, the issue is much more complicated, but I'm >>>>>> hoping to make it easy for anyone not acquainted with this issue to >>>>>> understand what's going on. 
>>>>>> >>>>>> Here are the JDK9 webrev URLs: >>>>>> >>>>>> OpenJDK: >>>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>>>>> >>>>>> Oracle internal: >>>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>>>>> >>>>>> The simple summary: >>>>>> >>>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>>>>> it is possible for a JVM/TI monitor event handler to accidentally >>>>>> consume a ParkEvent.unpark() call meant for the Java Monitor layer >>>>>> - the original code fix was made on 2005.07.04 using this bug ID: >>>>>> https://bugs.openjdk.java.net/browse/JDK-5030359 >>>>>> - it's the right fix, but it's in the wrong place >>>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>>>>> event handler is called because it is that event handler >>>>>> that can cause the hang >>>>>> >>>>>> >>>>>> Testing >>>>>> ------- >>>>>> >>>>>> - a new StressMonitorWait test has been created that reliably >>>>>> reproduces the hang in JDK[6789]; see the bug's gory details >>>>>> for the specific versions where the hang has been reproduced >>>>>> - the test reliably reproduces the hang in 5 seconds on my >>>>>> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>>>>> the hang reliably on other machines >>>>>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>>>>> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>>>>> bits did not reproduce the hang >>>>>> - JPRT test job >>>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>>>>> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>>>>> and Windows-X64: >>>>>> - vm.quick >>>>>> - Kitchensink (bigapps) >>>>>> - Weblogic+medrec (bigapps) >>>>>> - runThese (bigapps) >>>>>> >>>>>> >>>>>> The Gory Details Start Here >>>>>> --------------------------- >>>>>> >>>>>> This is the old location of the block of code that's being moved: >>>>>> >>>>>> src/share/vm/runtime/objectMonitor.cpp: >>>>>> >>>>>> 1440
void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >>>>>> >>>>>> 1499 exit (true, Self) ; // exit the monitor >>>>>> >>>>>> 1513 if (node._notified != 0 && _succ == Self) { >>>>>> 1514 node._event->unpark(); >>>>>> 1515 } >>>>>> >>>>>> >>>>>> This is the new location of block of code that's being moved: >>>>>> >>>>>> src/share/vm/runtime/objectMonitor.cpp: >>>>>> >>>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >>>>>> >>>>>> 1601 if (JvmtiExport::should_post_monitor_waited()) { >>>>>> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>>>>> OS_TIMEOUT); >>>>>> >>>>>> 1604 if (node._notified != 0 && _succ == Self) { >>>>>> >>>>>> 1620 node._event->unpark(); >>>>>> 1621 } >>>>>> >>>>>> >>>>>> The Risks >>>>>> --------- >>>>>> >>>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >>>>>> is enabled: >>>>>> - previously it was always executed >>>>>> - while the old code was not effective for the hang that is being >>>>>> fixed with this bug, it is possible that the old code prevented >>>>>> a different bug in the successor protocol from manifesting >>>>>> - thorough analysis of the successor protocol did not reveal a >>>>>> case where the old code was needed in the old location >>>>>> - Thorough analysis indicates that the other JVM/TI monitor events >>>>>> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>>>>> - the successor protocol is complicated and the analysis could >>>>>> be wrong when certain options are used >>>>>> - comments were added to each location where a JVM/TI monitor >>>>>> event handler is called documenting why a fix like this one >>>>>> is not needed there >>>>>> - if the analysis is wrong, the new comments show where a new >>>>>> code change would be needed >>>>>> >>>>>> >>>>>> The Scenario >>>>>> ------------ >>>>>> >>>>>> I've created a scenario that reproduces this hang: >>>>>> >>>>>> T1 - enters monitor and calls monitor.wait() >>>>>> T2 - enters the 
monitor, calls monitor.notify() and exits the monitor >>>>>> T3 - enters and exits the monitor >>>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor >>>>>> >>>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. >>>>>> >>>>>> Here are the six events necessary to make this hang happen: >>>>>> >>>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>>>>> field, but >>>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor >>>>>> again. >>>>>> >>>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >>>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>>>>> >>>>>> // KEY-EVENT-3a: T3 made T1 the successor >>>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. >>>>>> >>>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>>>>> >>>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>>>>> >>>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler >>>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >>>>>> stuck. >>>>>> >>>>>> >>>>>> This bug is intertwined with: >>>>>> >>>>>> - The ObjectMonitor successor protocol >>>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI >>>>>> RawMonitors >>>>>> >>>>>> There is a very long successor.notes attachment to JDK-8028073 that >>>>>> attempts to describe the ObjectMonitor successor protocol. It's good >>>>>> for putting pretty much anyone to sleep. >>>>>> >>>>>> Since this hang reproduces back to JDK6, this bug is taking the easily >>>>>> backported solution of moving the original fix to the right location. 
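The lost wakeup in KEY-EVENT-2 and KEY-EVENT-6 can be pictured with a toy model of a shared one-permit event. This is only a sketch under stated assumptions: `ToyParkEvent`, `try_park` and `t1_wakes_up` are hypothetical names, the model is single-threaded and non-blocking, and it is not how HotSpot's ParkEvent is implemented. It shows only why a permit consumed by the event handler leaves the successor with nothing to wake it, and why redoing the unpark() after the handler restores the wakeup.

```cpp
#include <cassert>

// Toy stand-in for a ParkEvent: a single permit that unpark() sets and a
// successful park consumes. Hypothetical model, not HotSpot's class.
class ToyParkEvent {
 public:
  void unpark() { permit_ = true; }

  // Non-blocking model of park(): returns true if a permit was available
  // (the thread would wake up), false if the thread would stay parked.
  bool try_park() {
    bool had = permit_;
    permit_ = false;
    return had;
  }

 private:
  bool permit_ = false;
};

// Replays the T1 side of the scenario on a single thread.
// `redo_unpark` models re-issuing the unpark() after the event handler.
inline bool t1_wakes_up(bool redo_unpark) {
  ToyParkEvent shared;  // used by both the Java monitor and the raw monitor
  shared.unpark();      // T3 exits and unparks T1, the successor
  shared.try_park();    // the handler's raw monitor wait eats the permit
  if (redo_unpark) {
    shared.unpark();    // the relocated fix: redo the unpark()
  }
  return shared.try_park();  // T1 parks while waiting to reenter
}
```

In this model `t1_wakes_up(false)` returns false (T1 stays parked) and `t1_wakes_up(true)` returns true.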
>>>>>> The following new bug has been filed for possible future work in this >>>>>> area by the Serviceability Team: >>>>>> >>>>>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>>>>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>>>>> >>>>>> >>>>>> The Symptoms >>>>>> ------------ >>>>>> >>>>>> With intermittent hangs like this, it is useful to know what to look >>>>>> for in order to determine if you are running into this issue: >>>>>> >>>>>> - if you aren't using a debugger or a profiler or some other >>>>>> JVM/TI agent, then this hang is not the same as yours >>>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>>>>> event handler, then this hang is not the same as yours >>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>>>>> JVM/TI RawMonitors, then this hang is not the same as yours >>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>>>>> back into Java code, then you might just be insane and this >>>>>> hang might be similar to yours. However, using a Java callback >>>>>> in an event handler is an even bigger problem/risk so fix that >>>>>> first. 
>>>>>> - if you have one or more threads blocked like this and making no >>>>>> progress, then this hang might be the same as yours: >>>>>> >>>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] >>>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>>> JavaThread state: _thread_blocked >>>>>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>>> JavaThread state: _thread_blocked >>>>>> at java.lang.Object.wait(Native Method) >>>>>> - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> at java.lang.Object.wait(Object.java:502) >>>>>> at SMW_WorkerThread.run(StressMonitorWait.java:103) >>>>>> - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] >>>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>>> JavaThread state: _thread_blocked >>>>>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>>> JavaThread state: _thread_blocked >>>>>> at SMW_WorkerThread.run(StressMonitorWait.java:120) >>>>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] >>>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>>> JavaThread state: _thread_blocked >>>>>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>>> JavaThread state: _thread_blocked >>>>>> at SMW_WorkerThread.run(StressMonitorWait.java:139) >>>>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> Key symptoms in thread T1: >>>>>> >>>>>> - had the object locked: >>>>>> >>>>>> locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> - did an
Object.wait(): >>>>>> >>>>>> waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> - is blocked on reentry: >>>>>> >>>>>> waiting for monitor entry [0xfffffd7fc0231000] >>>>>> >>>>>> Key symptoms in thread T2: >>>>>> >>>>>> - is blocked waiting to lock the object: >>>>>> >>>>>> waiting for monitor entry [0xfffffd7fc0130000] >>>>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> Key symptoms in thread T3: >>>>>> >>>>>> - is blocked waiting to lock the object: >>>>>> >>>>>> waiting for monitor entry [0xfffffd7fc002f000] >>>>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> From david.holmes at oracle.com Mon Feb 10 22:37:23 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 11 Feb 2014 16:37:23 +1000 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52F920AD.6050705@oracle.com> References: <52ED3F28.5070806@oracle.com> <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> <52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com> <52F920AD.6050705@oracle.com> Message-ID: <52F9C523.3010403@oracle.com> Dan - thanks for an awesome analysis once again! Yes, mea culpa: the code I was looking at was sans Mr Simms' most fortuitous change. (I'm suffering from repo overload these days.) As per my previous email, I could see that a timed wait, or a spurious wakeup, could take us to the potentially problematic SpinAcquire; however, I could not see how/if that thread could become _succ such that an unpark directed to _succ could be consumed in the SpinAcquire park(). Thanks again, David On 11/02/2014 4:55 AM, Daniel D. Daugherty wrote: > On 2/9/14 8:37 PM, David Holmes wrote: >> trimming content ... >> >> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote: >>> On 2/7/14 2:56 PM, Karen Kinnear wrote: >>>> 3. Did I read the code correctly that the Thread::SpinAcquire can make >>>> a timed park >>>> call on the same thread's _ParkEvent?
And that this is used to get on >>>> and off the wait queue, >>>> i.e. to acquire the WaitSetLock? >>>> Is there the same risk that a notify might be eaten here also? >>> >>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent >> >> It sure does: >> >> void Thread::SpinAcquire (volatile int * adr, const char * LockName) { >> if (Atomic::cmpxchg (1, adr, 0) == 0) { >> return ; // normal fast-path return >> } >> >> // Slow-path : We've encountered contention -- Spin/Yield/Block >> strategy. >> TEVENT (SpinAcquire - ctx) ; >> int ctr = 0 ; >> int Yields = 0 ; >> for (;;) { >> while (*adr != 0) { >> ++ctr ; >> if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> if (Yields > 5) { >> // Consider using a simple NakedSleep() instead. >> // Then SpinAcquire could be called by non-JVM threads >> Thread::current()->_ParkEvent->park(1) ; > > Ummmm... that's not the code I'm seeing... > > src/share/vm/runtime/thread.cpp: > > 4417 void Thread::SpinAcquire (volatile int * adr, const char * > LockName) { > 4418 if (Atomic::cmpxchg (1, adr, 0) == 0) { > 4419 return ; // normal fast-path return > 4420 } > 4421 > 4422 // Slow-path : We've encountered contention -- > Spin/Yield/Block strategy. 
> 4423 TEVENT (SpinAcquire - ctx) ; > 4424 int ctr = 0 ; > 4425 int Yields = 0 ; > 4426 for (;;) { > 4427 while (*adr != 0) { > 4428 ++ctr ; > 4429 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { > 4430 if (Yields > 5) { > 4431 os::naked_short_sleep(1); > 4432 } else { > 4433 os::NakedYield() ; > 4434 ++Yields ; > 4435 } > 4436 } else { > 4437 SpinPause() ; > 4438 } > 4439 } > 4440 if (Atomic::cmpxchg (1, adr, 0) == 0) return ; > 4441 } > 4442 } > > Mr Simms recently changed the above code via: > > changeset: 5832:5944dba4badc > user: dsimms > date: Fri Jan 24 09:28:47 2014 +0100 > summary: 8028280: ParkEvent leak when running modified runThese > which only loads classes > > os::naked_short_sleep() is new: > > - BSD/MacOS X, Linux - uses nanosleep() > - Solaris - uses usleep() > - Windows - uses Sleep() > > The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs > on 2014.01.29. I don't see any signs that Mr Simm's fix will be backported > to JDK8u/HSX-25u (yet) so this part of the review thread might impact the > backport of my fix to earlier releases. > > >> So considering Karen's question ... I can't tell for certain. :( >> >> I do not think the SpinAcquire on grabbing the wait-set lock to add to >> the wait-set can be an issue because we will only park in response to >> the actual wait, and hence only get unparked due to a >> notify/notifyAll, but at this point we still own the monitor so no >> notify/notifyAll is possible. >> >> However, for the removal from the wait-set a more complex analysis is >> needed. To do the SpinAcquire we must still be flagged as TS_WAIT - >> which means we have not been notified, but must be returning due to a >> timeout (or spurious wakeup?). In such circumstances could we be >> _succ? I don't think so but I'll leave it to Dan to confirm that part :) > > So for HSX-25 and probably older... 
> > There are four Thread::SpinAcquire() calls in the objectMonitor code: > > Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ; > Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; > Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ; > Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ; > > We can easily rule out the "notify" and "notifyAll" uses since the > current thread owns the Java-level monitor and there are no events > to post in this part of the notify() or notifyAll() protocols. > > For the "WaitSet - add" use, the current thread owns the Java-level > monitor and the thread has not been added as a waiter yet so another > thread cannot do the notify-exit-make-successor part of the protocol > yet. > > For the "WaitSet - unlink" use: > > src/share/vm/runtime/objectMonitor.cpp: > > 1569 if (node.TState == ObjectWaiter::TS_WAIT) { > 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; > 1571 if (node.TState == ObjectWaiter::TS_WAIT) { > 1572 DequeueSpecificWaiter (&node) ; // unlink > from WaitSet > 1573 assert(node._notified == 0, "invariant"); > 1574 node.TState = ObjectWaiter::TS_RUN ; > 1575 } > 1576 Thread::SpinRelease (&_WaitSetLock) ; > 1577 } > > It is the call on line 1570 above that gets us into this code: > > src/share/vm/runtime/thread.cpp: > > 4435 void Thread::SpinAcquire (volatile int * adr, const char * > LockName) { > 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { > 4437 return ; // normal fast-path return > 4438 } > 4439 > 4440 // Slow-path : We've encountered contention -- > Spin/Yield/Block strategy. > 4441 TEVENT (SpinAcquire - ctx) ; > 4442 int ctr = 0 ; > 4443 int Yields = 0 ; > 4444 for (;;) { > 4445 while (*adr != 0) { > 4446 ++ctr ; > 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { > 4448 if (Yields > 5) { > 4449 // Consider using a simple NakedSleep() instead. 
> 4450 // Then SpinAcquire could be called by non-JVM > threads > 4451 Thread::current()->_ParkEvent->park(1) ; > 4452 } else { > 4453 os::NakedYield() ; > 4454 ++Yields ; > 4455 } > 4456 } else { > 4457 SpinPause() ; > 4458 } > 4459 } > 4460 if (Atomic::cmpxchg (1, adr, 0) == 0) return ; > 4461 } > 4462 } > > And the above code can consume the unpark() on line 4451. > > So how the heck do we get to line 1570??? > > Well, the target thread would have to be both notified and unparked > to be executing this code path. When the notify() code runs, the > target of the notify() is changed from ObjectWaiter::TS_WAIT to > ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default > for Knob_MoveNotifyee is 2, so we're in non-default mode here... > > Here are the Knob_MoveNotifyee policy values: > > 1717 if (Policy == 0) { // prepend to EntryList > 1728 if (Policy == 1) { // append to EntryList > 1744 if (Policy == 2) { // prepend to cxq > 1760 if (Policy == 3) { // append to cxq > > For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism > where we just unpark the target thread and let it run. Part of > that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN. > > The code works the same for notifyAll() for the thread picked > to be notified. For the Knob_MoveNotifyee == 4 (or higher) case, > we just unpark all the waiters and we have a free-for-all. > > So it looks like the code block from lines 1569-1577 is never > used... or is it? Well... you have to remember two things: > > 1) spurious unpark() > 2) timed wait() > > The caller might have called wait(0), but that doesn't mean that > the underlying park() mechanism won't have a spurious unpark(). > Or better, the caller might have called wait(1) and be running > again after a millisecond. > > So in the HSX25 and older system (i.e., without Mr Simms' fix for > 8028280), it is possible for this call: > > 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; > > to consume the unpark().
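The hazard described above can be modeled outside HotSpot. What follows is a minimal Java sketch, not HotSpot code: PermitEvent is a hypothetical stand-in for a ParkEvent with a single binary permit, and oldSlowPath/newSlowPath are illustrative names for the park(1) fallback and the naked short sleep respectively. Under those assumptions it shows how a nested timed park on the shared event can swallow the permit that an earlier unpark() left for the monitor code, while a plain sleep leaves it alone.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SharedPermitSketch {
    /** Hypothetical stand-in for a ParkEvent: a single binary permit, no queueing. */
    static final class PermitEvent {
        private final AtomicBoolean permit = new AtomicBoolean(false);

        /** Models unpark(): makes the permit available; repeated calls do not stack. */
        void unpark() {
            permit.set(true);
        }

        /** Models a short timed park: consumes the permit if one is present. */
        boolean timedPark() {
            return permit.getAndSet(false);
        }

        boolean hasPermit() {
            return permit.get();
        }
    }

    /** Models the older slow path, which parks on the thread's own event. */
    static void oldSlowPath(PermitEvent threadEvent) {
        threadEvent.timedPark(); // can swallow an unpark() meant for the monitor code
    }

    /** Models the newer slow path: a plain short sleep that never touches the event. */
    static void newSlowPath(PermitEvent threadEvent) {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if the permit left by unpark() is still there after the slow path. */
    static boolean permitSurvives(boolean useOldSlowPath) {
        PermitEvent event = new PermitEvent();
        event.unpark(); // another thread made us the successor and unparked us
        if (useOldSlowPath) {
            oldSlowPath(event);
        } else {
            newSlowPath(event);
        }
        return event.hasPermit(); // false: the next real park() would block
    }

    public static void main(String[] args) {
        System.out.println("old slow path keeps permit: " + permitSurvives(true));
        System.out.println("new slow path keeps permit: " + permitSurvives(false));
    }
}
```

In this model the old slow path loses the permit and the new one keeps it, which is the difference the thread attributes to the fix for 8028280. The model deliberately ignores contention, spinning and yielding; it only isolates the permit accounting.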
The gauntlet that has to be traversed > to get to this call: > > 4451 Thread::current()->_ParkEvent->park(1) ; > > is impressive: > > - fast-path acquisition of the _WaitSetLock has to fail: > > 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { > 4437 return ; // normal fast-path return > 4438 } > > - if the machine is a uniprocessor, then 6 os::NakedYield() > call-loop-recheck attempts have to fail: > > 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { > 4448 if (Yields > 5) { > 4449 // Consider using a simple NakedSleep() instead. > 4450 // Then SpinAcquire could be called by non-JVM > threads > 4451 Thread::current()->_ParkEvent->park(1) ; > 4452 } else { > 4453 os::NakedYield() ; > 4454 ++Yields ; > 4455 } > > - if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause() > attempts, 1 os::NakedYield() attempt} have to fail: > > 4446 ++ctr ; > 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { > 4448 if (Yields > 5) { > 4449 // Consider using a simple NakedSleep() instead. > 4450 // Then SpinAcquire could be called by non-JVM > threads > 4451 Thread::current()->_ParkEvent->park(1) ; > 4452 } else { > 4453 os::NakedYield() ; > 4454 ++Yields ; > 4455 } > 4456 } else { > 4457 SpinPause() ; > 4458 } > > But it is possible. It is one of those once-in-a-blue moon type > windows where everything has to line up just so. > > So how do we address this issue in HSX-25 and possibly older? > > If Mr Simms fix for 8028280 is also backported, then there is no > issue. 
If it is not backported, then applying the fix for this > bug like so: > > src/share/vm/runtime/objectMonitor.cpp: > > 1596 if (JvmtiExport::should_post_monitor_waited()) { > 1597 JvmtiExport::post_monitor_waited(jt, this, ret == > OS_TIMEOUT); > 1598 } > > 1604 if (node._notified != 0 && _succ == Self) { > 1605 // In this part of the monitor wait-notify-reenter > protocol it > 1606 // is possible (and normal) for another thread to do a > fastpath > 1607 // monitor enter-exit while this thread is still trying > to get > 1608 // to the reenter portion of the protocol. > 1609 // > 1610 // The ObjectMonitor was notified and the current thread is > 1611 // the successor which also means that an unpark() has > already > 1612 // been done. The JVMTI_EVENT_MONITOR_WAITED event > handler can > 1613 // consume the unpark() that was done when the successor > was > 1614 // set because the same ParkEvent is shared between Java > 1615 // monitors and JVM/TI RawMonitors (for now). > 1616 // > 1617 // We redo the unpark() to ensure forward progress, > i.e., we > 1618 // don't want all pending threads hanging (parked) with > none > 1619 // entering the unlocked monitor. > 1620 node._event->unpark(); > 1621 } > > Of course the line numbers for the "fix" would be different and the comment > would need to be updated to reflect that the: > > 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; > > call above could also consume an unpark(), but it should work. > > If you've read this far, then I'm impressed. If you've read this far > and only fallen asleep a couple of times, then I'm still impressed. > > Summary: I don't think we have an issue in JDK9, but we'll have to do > the fix in JDK8/HSX25 and older a little differently. > > Dan > > >> >> David >> ----- >> >>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it >>> is a different ParkEvent. 
From src/share/vm/runtime/thread.hpp: >>> >>> ParkEvent * _ParkEvent ; // for synchronized() >>> ParkEvent * _SleepEvent ; // for Thread.sleep >>> ParkEvent * _MutexEvent ; // for native internal >>> Mutex/Monitor >>> ParkEvent * _MuxEvent ; // for low-level >>> muxAcquire-muxRelease >>> >>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire() >>> uses the _MuxEvent. There are some comments in thread.cpp about >>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't >>> think we ever want to go there. >>> >>> I also filed this RFE: >>> >>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>> >>> just in case the Serviceability team wants to migrate JVM/TI RawMonitors >>> to a separate ParkEvent. >>> >>> Please let me know if you concur that I've resolved issue #3. >>> >>> >>>> If so, I wonder if we want this added unpark to not just be called if >>>> JVMTI_EVENT_MONITOR_WAITED >>>> is enabled? >>> >>> I don't think we need it, but I've noted its removal as a risk. >>> >>> Again, thanks for the review! >>> >>> Dan >>> >>> >>>> >>>> thanks, >>>> Karen >>>> >>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote: >>>> >>>>> Greetings, >>>>> >>>>> I have a fix ready for the following bug: >>>>> >>>>> 8028073 race condition in ObjectMonitor implementation causing >>>>> deadlocks >>>>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>>>> >>>>> On the surface, this is a very simple fix that relocates a few >>>>> lines of >>>>> code, relocates and rewrites the comments associated with that code >>>>> and >>>>> adds several new comments. >>>>> >>>>> Of course, in reality, the issue is much more complicated, but I'm >>>>> hoping to make it easy for anyone not acquainted with this issue to >>>>> understand what's going on. 
>>>>> >>>>> Here are the JDK9 webrev URLs: >>>>> >>>>> OpenJDK: >>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>>>> >>>>> Oracle internal: >>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>>>> >>>>> >>>>> The simple summary: >>>>> >>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>>>> it is possible for a JVM/TI monitor event handler to accidentally >>>>> consume a ParkEvent.unpark() call meant for the Java Monitor layer >>>>> - the original code fix was made on 2005.07.04 using this bug ID: >>>>> https://bugs.openjdk.java.net/browse/JDK-5030359 >>>>> - it's the right fix, but it's in the wrong place >>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>>>> event handler is called because it is that event handler >>>>> that can cause the hang >>>>> >>>>> >>>>> Testing >>>>> ------- >>>>> >>>>> - a new StressMonitorWait test has been created that reliably >>>>> reproduces the hang in JDK[6789]; see the bug's gory details >>>>> for the specific versions where the hang has been reproduced >>>>> - the test reliably reproduces the hang in 5 seconds on my >>>>> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>>>> the hang reliably on other machines >>>>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>>>> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>>>> bits did not reproduce the hang >>>>> - JPRT test job >>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>>>> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>>>> and Windows-X64: >>>>> - vm.quick >>>>> - Kitchensink (bigapps) >>>>> - Weblogic+medrec (bigapps) >>>>> - runThese (bigapps) >>>>> >>>>> >>>>> The Gory Details Start Here >>>>> --------------------------- >>>>> >>>>> This is the old location of the block of code that's being moved: >>>>> >>>>> src/share/vm/runtime/objectMonitor.cpp: >>>>> >>>>> 1440 void ObjectMonitor::wait(jlong millis, bool 
interruptible, >>>>> TRAPS) { >>>>> >>>>> 1499 exit (true, Self) ; // exit the monitor >>>>> >>>>> 1513 if (node._notified != 0 && _succ == Self) { >>>>> 1514 node._event->unpark(); >>>>> 1515 } >>>>> >>>>> >>>>> This is the new location of block of code that's being moved: >>>>> >>>>> src/share/vm/runtime/objectMonitor.cpp: >>>>> >>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, >>>>> TRAPS) { >>>>> >>>>> 1601 if (JvmtiExport::should_post_monitor_waited()) { >>>>> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>>>> OS_TIMEOUT); >>>>> >>>>> 1604 if (node._notified != 0 && _succ == Self) { >>>>> >>>>> 1620 node._event->unpark(); >>>>> 1621 } >>>>> >>>>> >>>>> The Risks >>>>> --------- >>>>> >>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >>>>> is enabled: >>>>> - previously it was always executed >>>>> - while the old code was not effective for the hang that is being >>>>> fixed with this bug, it is possible that the old code prevented >>>>> a different bug in the successor protocol from manifesting >>>>> - thorough analysis of the successor protocol did not reveal a >>>>> case where the old code was needed in the old location >>>>> - Thorough analysis indicates that the other JVM/TI monitor events >>>>> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>>>> - the successor protocol is complicated and the analysis could >>>>> be wrong when certain options are used >>>>> - comments were added to each location where a JVM/TI monitor >>>>> event handler is called documenting why a fix like this one >>>>> is not needed there >>>>> - if the analysis is wrong, the new comments show where a new >>>>> code change would be needed >>>>> >>>>> >>>>> The Scenario >>>>> ------------ >>>>> >>>>> I've created a scenario that reproduces this hang: >>>>> >>>>> T1 - enters monitor and calls monitor.wait() >>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor >>>>> T3 - enters and exits the 
monitor >>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor >>>>> >>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw >>>>> monitor. >>>>> >>>>> Here are the six events necessary to make this hang happen: >>>>> >>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>>>> field, but >>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor >>>>> again. >>>>> >>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>>>> >>>>> // KEY-EVENT-3a: T3 made T1 the successor >>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed >>>>> it. >>>>> >>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>>>> >>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>>>> >>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event >>>>> handler >>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >>>>> stuck. >>>>> >>>>> >>>>> This bug is intertwined with: >>>>> >>>>> - The ObjectMonitor successor protocol >>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI >>>>> RawMonitors >>>>> >>>>> There is a very long successor.notes attachment to JDK-8028073 that >>>>> attempts to describe the ObjectMonitor successor protocol. It's good >>>>> for putting pretty much anyone to sleep. >>>>> >>>>> Since this hang reproduces back to JDK6, this bug is taking the easily >>>>> backported solution of moving the original fix to the right location. 
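For readers who want to picture the four thread roles from the scenario, here is a minimal Java sketch. It is an assumption-laden illustration, not the StressMonitorWait test attached to the bug: the class and method names are hypothetical, the T4 delay is a parameter instead of 5 seconds, and no JVM/TI agent is involved, so it is not expected to reproduce the hang by itself. It only shows which thread does what to the shared monitor.

```java
public class MonitorScenarioSketch {
    private final Object monitor = new Object();
    private boolean notified = false; // guarded by monitor

    /** T1: enters the monitor and waits until notified. */
    void waiter() {
        synchronized (monitor) {
            while (!notified) {
                try {
                    monitor.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    /** T2: enters the monitor, notifies, and exits. */
    void notifier() {
        synchronized (monitor) {
            notified = true;
            monitor.notify();
        }
    }

    /** T3: enters and immediately exits the monitor. */
    void enterExit() {
        synchronized (monitor) {
            // intentionally empty: the enter/exit pair is the point
        }
    }

    /** T4: enters the monitor, holds it for a while, then exits. */
    void holder(long holdMillis) {
        synchronized (monitor) {
            try {
                Thread.sleep(holdMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Runs the four roles once and returns how many of them finished in time. */
    static int runOnce(long t4HoldMillis, long joinTimeoutMillis) {
        MonitorScenarioSketch s = new MonitorScenarioSketch();
        Thread[] threads = {
            new Thread(s::waiter, "T1"),
            new Thread(s::notifier, "T2"),
            new Thread(s::enterExit, "T3"),
            new Thread(() -> s.holder(t4HoldMillis), "T4"),
        };
        for (Thread t : threads) {
            t.setDaemon(true); // a stuck thread must not keep the JVM alive
            t.start();
        }
        int finished = 0;
        try {
            for (Thread t : threads) {
                t.join(joinTimeoutMillis);
                if (!t.isAlive()) {
                    finished++;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return finished;
    }

    public static void main(String[] args) {
        System.out.println("threads finished: " + runOnce(20, 5000));
    }
}
```

A stress test in the spirit of the one described would loop over runOnce() many times with an agent attached; a count lower than four would then point at threads stuck on reentry.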
>>>>> The following new bug has been filed for possible future work in this >>>>> area by the Serviceability Team: >>>>> >>>>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>>>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>>>> >>>>> >>>>> The Symptoms >>>>> ------------ >>>>> >>>>> With intermittent hangs like this, it is useful to know what to look >>>>> for in order to determine if you are running into this issue: >>>>> >>>>> - if you aren't using a debugger or a profiler or some other >>>>> JVM/TI agent, then this hang is not the same as yours >>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>>>> event handler, then this hang is not the same as yours >>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>>>> JVM/TI RawMonitors, then this hang is not the same as yours >>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>>>> back into Java code, then you might just be insane and this >>>>> hang might be similar to yours. However, using a Java callback >>>>> in an event handler is an even bigger problem/risk so fix that >>>>> first. 
>>>>> - if you have one or more threads blocked like this and making no >>>>> progress, then this hang might be the same as yours: >>>>> >>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] >>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>> JavaThread state: _thread_blocked >>>>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>> JavaThread state: _thread_blocked >>>>> at java.lang.Object.wait(Native Method) >>>>> - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> at java.lang.Object.wait(Object.java:502) >>>>> at SMW_WorkerThread.run(StressMonitorWait.java:103) >>>>> - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] >>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>> JavaThread state: _thread_blocked >>>>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>> JavaThread state: _thread_blocked >>>>> at SMW_WorkerThread.run(StressMonitorWait.java:120) >>>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] >>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>> JavaThread state: _thread_blocked >>>>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>> JavaThread state: _thread_blocked >>>>> at SMW_WorkerThread.run(StressMonitorWait.java:139) >>>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> Key symptoms in thread T1: >>>>> >>>>> - had the object locked: >>>>> >>>>> locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> - did an Object.wait(): >>>>> >>>>> waiting on 
<0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> - is blocked on reentry: >>>>> >>>>> waiting for monitor entry [0xfffffd7fc0231000] >>>>> >>>>> Key symptoms in thread T2: >>>>> >>>>> - is blocked waiting to lock the object: >>>>> >>>>> waiting for monitor entry [0xfffffd7fc0130000] >>>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>>>> Key symptoms in thread T3: >>>>> >>>>> - is blocked waiting to lock the object: >>>>> >>>>> waiting for monitor entry [0xfffffd7fc002f000] >>>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>> >>> > From staffan at larsen.se Tue Feb 11 02:18:09 2014 From: staffan at larsen.se (staffan at larsen.se) Date: Tue, 11 Feb 2014 11:18:09 +0100 Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr Message-ID: Please review this small fix for libjfr/mapfile-vers. Thanks, /Staffan diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers --- a/make/mapfiles/libjfr/mapfile-vers +++ b/make/mapfiles/libjfr/mapfile-vers @@ -34,6 +34,7 @@ Java_oracle_jrockit_jfr_VMJFR_getPeriod; Java_oracle_jrockit_jfr_VMJFR_descriptors; Java_oracle_jrockit_jfr_VMJFR_redefineClass0; + Java_oracle_jrockit_jfr_VMJFR_retransformClasses0; JNI_OnLoad; local: *; From staffan.larsen at oracle.com Tue Feb 11 03:00:47 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 11 Feb 2014 12:00:47 +0100 Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr In-Reply-To: References: Message-ID: Resending (email to build-dev bounced since I used the wrong sender address). On 11 feb 2014, at 11:18, staffan at larsen.se wrote: > Please review this small fix for libjfr/mapfile-vers.
> > Thanks, > /Staffan > > diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers > --- a/make/mapfiles/libjfr/mapfile-vers > +++ b/make/mapfiles/libjfr/mapfile-vers > @@ -34,6 +34,7 @@ > Java_oracle_jrockit_jfr_VMJFR_getPeriod; > Java_oracle_jrockit_jfr_VMJFR_descriptors; > Java_oracle_jrockit_jfr_VMJFR_redefineClass0; > + Java_oracle_jrockit_jfr_VMJFR_retransformClasses0; > JNI_OnLoad; > local: > *; From erik.joelsson at oracle.com Tue Feb 11 03:14:01 2014 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Tue, 11 Feb 2014 12:14:01 +0100 Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr In-Reply-To: References: Message-ID: <52FA05F9.20206@oracle.com> Looks good, but I can't help but wonder why the mapfile for libjfr is in the open. /Erik On 2014-02-11 12:00, Staffan Larsen wrote: > Resending (email to build-dev bounced since I used the wrong sender address). > > On 11 feb 2014, at 11:18, staffan at larsen.se wrote: > >> Please review this small fix for libjfr/mapfile-vers. >> >> Thanks, >> /Staffan >> >> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers >> --- a/make/mapfiles/libjfr/mapfile-vers >> +++ b/make/mapfiles/libjfr/mapfile-vers >> @@ -34,6 +34,7 @@ >> Java_oracle_jrockit_jfr_VMJFR_getPeriod; >> Java_oracle_jrockit_jfr_VMJFR_descriptors; >> Java_oracle_jrockit_jfr_VMJFR_redefineClass0; >> + Java_oracle_jrockit_jfr_VMJFR_retransformClasses0; >> JNI_OnLoad; >> local: >> *; From staffan.larsen at oracle.com Tue Feb 11 04:01:37 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 11 Feb 2014 13:01:37 +0100 Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr In-Reply-To: <52FA05F9.20206@oracle.com> References: <52FA05F9.20206@oracle.com> Message-ID: <7482D938-F9B1-4890-B48D-AF892B6FA3C3@oracle.com> On 11 feb 2014, at 12:14, Erik Joelsson wrote: > Looks good, but I can't help but wonder why the mapfile for libjfr is in the open. Yes, that is unfortunate. 
It used to be in closed source in jdk7, but moved into open with the new build system. /Staffan > > /Erik > > On 2014-02-11 12:00, Staffan Larsen wrote: >> Resending (email to build-dev bounced since I used the wrong sender address). >> >> On 11 feb 2014, at 11:18, staffan at larsen.se wrote: >> >>> Please review this small fix for libjfr/mapfile-vers. >>> >>> Thanks, >>> /Staffan >>> >>> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers >>> --- a/make/mapfiles/libjfr/mapfile-vers >>> +++ b/make/mapfiles/libjfr/mapfile-vers >>> @@ -34,6 +34,7 @@ >>> Java_oracle_jrockit_jfr_VMJFR_getPeriod; >>> Java_oracle_jrockit_jfr_VMJFR_descriptors; >>> Java_oracle_jrockit_jfr_VMJFR_redefineClass0; >>> + Java_oracle_jrockit_jfr_VMJFR_retransformClasses0; >>> JNI_OnLoad; >>> local: >>> *; > From fweimer at redhat.com Tue Feb 11 04:52:03 2014 From: fweimer at redhat.com (Florian Weimer) Date: Tue, 11 Feb 2014 13:52:03 +0100 Subject: Project to improve hs_err files In-Reply-To: References: Message-ID: <52FA1CF3.2070902@redhat.com> On 09/06/2013 01:32 PM, Mattis Castegren wrote: > Please let me know if you have ideas of what information would make > hs_err files more useful, and I will add them to my list. Structured logging to the systemd journal would be an interesting addition. This might be a bit tricky from a signal handler, though: http://lists.freedesktop.org/archives/systemd-devel/2012-November/007341.html http://lists.freedesktop.org/archives/systemd-devel/2014-January/016465.html -- Florian Weimer / Red Hat Product Security Team From staffan.larsen at oracle.com Tue Feb 11 06:12:59 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 11 Feb 2014 15:12:59 +0100 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java Message-ID: Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster. 
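The webrev is not reproduced in this message, so the following is only a sketch of the general idea of replacing timing assumptions with explicit notification between threads. It assumes a CountDownLatch-based handshake, which may or may not be what the actual change to SuspendThreadTest.java uses; the class name LatchHandshakeSketch and the latch names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchHandshakeSketch {
    /**
     * Starts a target thread, waits until it reports that it is running,
     * then tells it that it may exit. Returns true if both steps completed.
     */
    static boolean runHandshake(long timeoutMillis) {
        CountDownLatch targetReady = new CountDownLatch(1);
        CountDownLatch mayExit = new CountDownLatch(1);

        Thread target = new Thread(() -> {
            targetReady.countDown(); // signal: the target is up and running
            try {
                mayExit.await();     // stay alive until the controller is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "target");
        target.setDaemon(true);
        target.start();

        try {
            boolean ready = targetReady.await(timeoutMillis, TimeUnit.MILLISECONDS);
            // A real test would suspend or inspect the target thread here.
            mayExit.countDown();
            target.join(timeoutMillis);
            return ready && !target.isAlive();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("handshake completed: " + runHandshake(5000));
    }
}
```

The controller never sleeps for a fixed interval hoping the target has reached the right state; it blocks until the target says so, which is usually both faster and less sensitive to machine load.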
bug: https://bugs.openjdk.java.net/browse/JDK-6952105 webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ Thanks, /Staffan From shanliang.jiang at oracle.com Tue Feb 11 06:13:46 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 11 Feb 2014 15:13:46 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed Message-ID: <52FA301A.2040301@oracle.com> Hi , The problem could be that FieldMonitor did not have enough time to "addFieldWatch" but the vm to monitor (TestPostFieldModification) was already ended. So we should make sure that TestPostFieldModification exits after FieldMonitor has done necessary. The solution proposed here is that FieldMonitor creates a file after adding field watching, and TestPostFieldModification quits only after finding the file. web: http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ bug: https://bugs.openjdk.java.net/browse/JDK-8007710 Thanks, Shanliang From staffan.larsen at oracle.com Tue Feb 11 07:04:46 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 11 Feb 2014 16:04:46 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FA301A.2040301@oracle.com> References: <52FA301A.2040301@oracle.com> Message-ID: Hi Shanliang, I can?t quite see how the test can fail in this way. When the ClassPrepareEvent happens, the debuggee will be suspended. So when addFieldWatch() is called, the debuggee should not have moved. One problem I do see with the test is that it does not wait for a VMStartEvent before setting up requests. I?m not sure if that could cause the failure in the bug report, though. /Staffan On 11 feb 2014, at 15:13, shanliang wrote: > Hi , > > The problem could be that FieldMonitor did not have enough time to "addFieldWatch" but the vm to monitor (TestPostFieldModification) was already ended. 
> > So we should make sure that TestPostFieldModification exits after FieldMonitor has done necessary. The solution proposed here is that FieldMonitor creates a file after adding field watching, and TestPostFieldModification quits only after finding the file. > > web: > http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ > > bug: > https://bugs.openjdk.java.net/browse/JDK-8007710 > > Thanks, > Shanliang From shanliang.jiang at oracle.com Tue Feb 11 07:21:30 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 11 Feb 2014 16:21:30 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FA301A.2040301@oracle.com> References: <52FA301A.2040301@oracle.com> Message-ID: <52FA3FFA.70502@oracle.com> public web: http://cr.openjdk.java.net/~sjiang/JDK-8007710/00/ Shanliang shanliang wrote: > Hi , > > The problem could be that FieldMonitor did not have enough time to > "addFieldWatch" but the vm to monitor (TestPostFieldModification) was > already ended. > > So we should make sure that TestPostFieldModification exits after > FieldMonitor has done necessary. The solution proposed here is that > FieldMonitor creates a file after adding field watching, and > TestPostFieldModification quits only after finding the file. > > web: > http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ > > bug: > https://bugs.openjdk.java.net/browse/JDK-8007710 > > Thanks, > Shanliang From shanliang.jiang at oracle.com Tue Feb 11 07:31:26 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 11 Feb 2014 16:31:26 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: References: <52FA301A.2040301@oracle.com> Message-ID: <52FA424E.2050202@oracle.com> Staffan Larsen wrote: > Hi Shanliang, > > I can?t quite see how the test can fail in this way. 
When the ClassPrepareEvent happens, the debuggee will be suspended. So when addFieldWatch() is called, the debuggee should not have moved. > I am not expert of jdi so I may miss something here. I checked the failure trace and saw the report exception happen when FieldMonitor received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did call "vm.resume()" before treating events. I reproduced the bug by add sleep(1000) after vm.resume() but before calling eventQueue.remove(); Thanks, Shanliang > One problem I do see with the test is that it does not wait for a VMStartEvent before setting up requests. I?m not sure if that could cause the failure in the bug report, though. > > /Staffan > > On 11 feb 2014, at 15:13, shanliang wrote: > > >> Hi , >> >> The problem could be that FieldMonitor did not have enough time to "addFieldWatch" but the vm to monitor (TestPostFieldModification) was already ended. >> >> So we should make sure that TestPostFieldModification exits after FieldMonitor has done necessary. The solution proposed here is that FieldMonitor creates a file after adding field watching, and TestPostFieldModification quits only after finding the file. >> >> web: >> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >> >> bug: >> https://bugs.openjdk.java.net/browse/JDK-8007710 >> >> Thanks, >> Shanliang >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140211/eca01cf7/attachment-0001.html From jaroslav.bachorik at oracle.com Tue Feb 11 07:43:17 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 11 Feb 2014 16:43:17 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FA424E.2050202@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> Message-ID: <52FA4515.3050904@oracle.com> On 11.2.2014 16:31, shanliang wrote: > Staffan Larsen wrote: >> Hi Shanliang, >> >> I can?t quite see how the test can fail in this way. When the >> ClassPrepareEvent happens, the debuggee will be suspended. So when >> addFieldWatch() is called, the debuggee should not have moved. > I am not expert of jdi so I may miss something here. I checked the > failure trace and saw the report exception happen when FieldMonitor > received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did > call "vm.resume()" before treating events. AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a deamon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. > > I reproduced the bug by add sleep(1000) after vm.resume() but before > calling eventQueue.remove(); It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOuptuptStream() to write and flush a message for the debugee indicating that it can exit. And in the debugee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. Cheers, -JB- > > Thanks, > Shanliang >> One problem I do see with the test is that it does not wait for a >> VMStartEvent before setting up requests. 
I?m not sure if that could >> cause the failure in the bug report, though. >> >> /Staffan >> >> On 11 feb 2014, at 15:13, shanliang wrote: >> >>> Hi , >>> >>> The problem could be that FieldMonitor did not have enough time to >>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>> already ended. >>> >>> So we should make sure that TestPostFieldModification exits after >>> FieldMonitor has done necessary. The solution proposed here is that >>> FieldMonitor creates a file after adding field watching, and >>> TestPostFieldModification quits only after finding the file. >>> >>> web: >>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>> >>> bug: >>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>> >>> Thanks, >>> Shanliang >> > > From shanliang.jiang at oracle.com Tue Feb 11 08:17:55 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 11 Feb 2014 17:17:55 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FA4515.3050904@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> Message-ID: <52FA4D33.4030402@oracle.com> Jaroslav Bachorik wrote: > On 11.2.2014 16:31, shanliang wrote: >> Staffan Larsen wrote: >>> Hi Shanliang, >>> >>> I can?t quite see how the test can fail in this way. When the >>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>> addFieldWatch() is called, the debuggee should not have moved. >> I am not expert of jdi so I may miss something here. I checked the >> failure trace and saw the report exception happen when FieldMonitor >> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >> call "vm.resume()" before treating events. > > AFAICS, calling vm.resume() results in an almost immediate debuggee > death. 
The gc() invoking thread "d" is flagged as a deamon and as such > doesn't prevent the process from exiting. The other thread is not a > daemon but will finish in only few cycles. I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": /** * Continues the execution of the application running in this * virtual machine. All threads are resumed as documented in * {@link ThreadReference#resume}. * * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}. * * @see #suspend */ void resume(); My understanding is that the debuggee resumes to work after this call, instead to die? > >> >> I reproduced the bug by add sleep(1000) after vm.resume() but before >> calling eventQueue.remove(); > > It looks like some kind of synchronization between the debugger and > the debuggee is necessary. But I wonder if you should better use the > process.getOuptuptStream() to write and flush a message for the > debugee indicating that it can exit. And in the debugee you would just > do System.in.read() as the last statement in the main() method. Seems > more robust than involving files. It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance. Thanks, Shanliang > > Cheers, > > -JB- > >> >> Thanks, >> Shanliang >>> One problem I do see with the test is that it does not wait for a >>> VMStartEvent before setting up requests. I?m not sure if that could >>> cause the failure in the bug report, though. >>> >>> /Staffan >>> >>> On 11 feb 2014, at 15:13, shanliang wrote: >>> >>>> Hi , >>>> >>>> The problem could be that FieldMonitor did not have enough time to >>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>> already ended. >>>> >>>> So we should make sure that TestPostFieldModification exits after >>>> FieldMonitor has done necessary. 
The solution proposed here is that >>>> FieldMonitor creates a file after adding field watching, and >>>> TestPostFieldModification quits only after finding the file. >>>> >>>> web: >>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>> >>>> bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>> >>>> Thanks, >>>> Shanliang >>> >> >> > From shanliang.jiang at oracle.com Tue Feb 11 08:37:18 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 11 Feb 2014 17:37:18 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FA4D33.4030402@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> Message-ID: <52FA51BE.4070701@oracle.com> shanliang wrote: > Jaroslav Bachorik wrote: >> On 11.2.2014 16:31, shanliang wrote: >>> Staffan Larsen wrote: >>>> Hi Shanliang, >>>> >>>> I can't quite see how the test can fail in this way. When the >>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>> addFieldWatch() is called, the debuggee should not have moved. >>> I am not expert of jdi so I may miss something here. I checked the >>> failure trace and saw the report exception happen when FieldMonitor >>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor >>> did >>> call "vm.resume()" before treating events. >> >> AFAICS, calling vm.resume() results in an almost immediate debuggee >> death. The gc() invoking thread "d" is flagged as a daemon and as >> such doesn't prevent the process from exiting. The other thread is >> not a daemon but will finish in only few cycles. > I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc > of the method "resume": > /** > * Continues the execution of the application running in this > * virtual machine. All threads are resumed as documented in > * {@link ThreadReference#resume}.
> * > * @throws VMCannotBeModifiedException if the VirtualMachine is > read-only - see {@link VirtualMachine#canBeModified()}. > * > * @see #suspend > */ > void resume(); > My understanding is that the debuggee resumes to work after this call, > instead to die? In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch. Shanliang >> >>> >>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>> calling eventQueue.remove(); >> >> It looks like some kind of synchronization between the debugger and >> the debuggee is necessary. But I wonder if you should better use the >> process.getOutputStream() to write and flush a message for the >> debuggee indicating that it can exit. And in the debuggee you would >> just do System.in.read() as the last statement in the main() method. >> Seems more robust than involving files. > It could work, but creating a file in the testing directory should > have no issue, but yes maybe less performance. > > Thanks, > Shanliang >> >> Cheers, >> >> -JB- >> >>> >>> Thanks, >>> Shanliang >>>> One problem I do see with the test is that it does not wait for a >>>> VMStartEvent before setting up requests. I'm not sure if that could >>>> cause the failure in the bug report, though. >>>> >>>> /Staffan >>>> >>>> On 11 feb 2014, at 15:13, shanliang >>>> wrote: >>>> >>>>> Hi , >>>>> >>>>> The problem could be that FieldMonitor did not have enough time to >>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>> already ended. >>>>> >>>>> So we should make sure that TestPostFieldModification exits after >>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>> FieldMonitor creates a file after adding field watching, and >>>>> TestPostFieldModification quits only after finding the file.
>>>>> >>>>> web: >>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>> >>>>> bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>> >>>>> Thanks, >>>>> Shanliang >>>> >>> >>> >> > From jaroslav.bachorik at oracle.com Tue Feb 11 08:41:01 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 11 Feb 2014 17:41:01 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FA4D33.4030402@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> Message-ID: <52FA529D.3010306@oracle.com> On 11.2.2014 17:17, shanliang wrote: > Jaroslav Bachorik wrote: >> On 11.2.2014 16:31, shanliang wrote: >>> Staffan Larsen wrote: >>>> Hi Shanliang, >>>> >>>> I can't quite see how the test can fail in this way. When the >>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>> addFieldWatch() is called, the debuggee should not have moved. >>> I am not expert of jdi so I may miss something here. I checked the >>> failure trace and saw the report exception happen when FieldMonitor >>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>> call "vm.resume()" before treating events. >> >> AFAICS, calling vm.resume() results in an almost immediate debuggee >> death. The gc() invoking thread "d" is flagged as a daemon and as such >> doesn't prevent the process from exiting. The other thread is not a >> daemon but will finish in only few cycles. > I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of > the method "resume": > /** > * Continues the execution of the application running in this > * virtual machine. All threads are resumed as documented in > * {@link ThreadReference#resume}.
> * > * @throws VMCannotBeModifiedException if the VirtualMachine is > read-only - see {@link VirtualMachine#canBeModified()}. > * > * @see #suspend > */ > void resume(); > My understanding is that the debuggee resumes to work after this call, > instead to die? Yes. It resumes. But only for a few microseconds at best. Those 10 string concatenations will not take long. So the VM will exit almost immediately after receiving the resume command. >> >>> >>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>> calling eventQueue.remove(); >> >> It looks like some kind of synchronization between the debugger and >> the debuggee is necessary. But I wonder if you should better use the >> process.getOutputStream() to write and flush a message for the >> debuggee indicating that it can exit. And in the debuggee you would just >> do System.in.read() as the last statement in the main() method. Seems >> more robust than involving files. > It could work, but creating a file in the testing directory should have > no issue, but yes maybe less performance. ... and possible file system issues, necessity to clean up the file, not ready for parallelization etc. BTW, I have a recollection that you should create test specific files in a path specified by some other system property, not "user.dir". I can try to find the property name. -JB- > > Thanks, > Shanliang >> >> Cheers, >> >> -JB- >> >>> >>> Thanks, >>> Shanliang >>>> One problem I do see with the test is that it does not wait for a >>>> VMStartEvent before setting up requests. I'm not sure if that could >>>> cause the failure in the bug report, though. >>>> >>>> /Staffan >>>> >>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>> >>>>> Hi , >>>>> >>>>> The problem could be that FieldMonitor did not have enough time to >>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>> already ended.
>>>>> >>>>> So we should make sure that TestPostFieldModification exits after >>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>> FieldMonitor creates a file after adding field watching, and >>>>> TestPostFieldModification quits only after finding the file. >>>>> >>>>> web: >>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>> >>>>> bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>> >>>>> Thanks, >>>>> Shanliang >>>> >>> >>> >> > From shanliang.jiang at oracle.com Tue Feb 11 09:30:57 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 11 Feb 2014 18:30:57 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FA51BE.4070701@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> Message-ID: <52FA5E51.3040704@oracle.com> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ Thanks, Shanliang shanliang wrote: > shanliang wrote: >> Jaroslav Bachorik wrote: >>> On 11.2.2014 16:31, shanliang wrote: >>>> Staffan Larsen wrote: >>>>> Hi Shanliang, >>>>> >>>>> I can't quite see how the test can fail in this way. When the >>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>> addFieldWatch() is called, the debuggee should not have moved. >>>> I am not expert of jdi so I may miss something here. I checked the >>>> failure trace and saw the report exception happen when FieldMonitor >>>> received ClassPrepareEvent and was doing addFieldWatch. >>>> FieldMonitor did >>>> call "vm.resume()" before treating events. >>> >>> AFAICS, calling vm.resume() results in an almost immediate debuggee >>> death.
The gc() invoking thread "d" is flagged as a daemon and as >>> such doesn't prevent the process from exiting. The other thread is >>> not a daemon but will finish in only few cycles. >> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc >> of the method "resume": >> /** >> * Continues the execution of the application running in this >> * virtual machine. All threads are resumed as documented in >> * {@link ThreadReference#resume}. >> * >> * @throws VMCannotBeModifiedException if the VirtualMachine is >> read-only - see {@link VirtualMachine#canBeModified()}. >> * >> * @see #suspend >> */ >> void resume(); >> My understanding is that the debuggee resumes to work after this >> call, instead to die? > In fact the problem is here, the vm (TestPostFieldModification) should > not die before FieldMonitor finishes addFieldWatch. > > Shanliang >>> >>>> >>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>> calling eventQueue.remove(); >>> >>> It looks like some kind of synchronization between the debugger and >>> the debuggee is necessary. But I wonder if you should better use the >>> process.getOutputStream() to write and flush a message for the >>> debuggee indicating that it can exit. And in the debuggee you would >>> just do System.in.read() as the last statement in the main() method. >>> Seems more robust than involving files. >> It could work, but creating a file in the testing directory should >> have no issue, but yes maybe less performance. >> >> Thanks, >> Shanliang >>> >>> Cheers, >>> >>> -JB- >>> >>>> >>>> Thanks, >>>> Shanliang >>>>> One problem I do see with the test is that it does not wait for a >>>>> VMStartEvent before setting up requests. I'm not sure if that could >>>>> cause the failure in the bug report, though.
>>>>> >>>>> /Staffan >>>>> >>>>> On 11 feb 2014, at 15:13, shanliang >>>>> wrote: >>>>> >>>>>> Hi , >>>>>> >>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) >>>>>> was >>>>>> already ended. >>>>>> >>>>>> So we should make sure that TestPostFieldModification exits after >>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>> FieldMonitor creates a file after adding field watching, and >>>>>> TestPostFieldModification quits only after finding the file. >>>>>> >>>>>> web: >>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>> >>>>>> bug: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>> >>>>>> Thanks, >>>>>> Shanliang >>>>> >>>> >>>> >>> >> > From mark.reinhold at oracle.com Tue Feb 11 09:42:40 2014 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Tue, 11 Feb 2014 09:42:40 -0800 Subject: Changeset rolled back: jdk8/tl/jdk: 7152892: some jtreg tests fail with permission denied Message-ID: <20140211094240.954088@eggemoggin.niobe.net> This changeset was erroneously pushed to jdk8/tl/jdk: Changeset: da4b0962ad11 Author: robm Date: 2014-02-10 14:35 +0000 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/da4b0962ad11 7152892: some jtreg tests fail with permission denied Reviewed-by: coffeys ! test/java/lang/ClassLoader/Assert.sh ! test/java/rmi/registry/readTest/readTest.sh ! test/java/util/zip/ZipFile/ReadZip.java ! test/sun/net/www/protocol/jar/jarbug/run.sh It has been rolled back on the server and blacklisted in jcheck. If you have a clone of this repository then either strip this changeset or destroy the repository and re-clone it. 
- Mark

From staffan.larsen at oracle.com Tue Feb 11 11:15:43 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 11 Feb 2014 20:15:43 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA5E51.3040704@oracle.com>
References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com>
Message-ID: <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>

I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:

    Note that a target VM launched by a launching connector is not guaranteed to be stable until after the VMStartEvent has been received.

I think adding code that looks something like this will make the test stable:

    VirtualMachine vm = launchTarget(CLASS_NAME);
    EventQueue eventQueue = vm.eventQueue();

    boolean started = false;
    while (!started) {
        EventSet eventSet = eventQueue.remove();
        for (Event event : eventSet) {
            if (event instanceof VMStartEvent) {
                started = true;
            }
            if (event instanceof VMDeathEvent
                    || event instanceof VMDisconnectEvent) {
                throw new Error("VM died before it started...:" + event);
            }
        }
    }

    System.out.println("Vm launched");

There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.

While you are in there you can also remove the unused StringBuffer near the top of main().

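The control flow of the loop above can be tried without launching a debuggee. The following is only a sketch under stated assumptions: `StartGate`, `Kind` and `awaitStart` are hypothetical stand-ins invented for illustration, not JDI types, and a plain `Queue` of lists takes the place of `EventQueue` and `EventSet`. It shows the same rule: keep draining event sets until a start event arrives, and fail fast if the VM dies or disconnects first.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

public class StartGate {
    /** Hypothetical stand-in for the JDI event kinds that matter here. */
    public enum Kind { VM_START, VM_DEATH, VM_DISCONNECT, OTHER }

    /**
     * Drains event sets in order until one contains VM_START.
     * Returns the number of sets consumed. Throws if the VM dies or
     * disconnects first, or if the queue runs dry without a start event.
     */
    public static int awaitStart(Queue<List<Kind>> eventSets) {
        int consumed = 0;
        while (!eventSets.isEmpty()) {
            List<Kind> set = eventSets.remove();
            consumed++;
            boolean started = false;
            for (Kind k : set) {
                if (k == Kind.VM_DEATH || k == Kind.VM_DISCONNECT) {
                    throw new IllegalStateException("VM died before it started: " + k);
                }
                if (k == Kind.VM_START) {
                    started = true;
                }
            }
            if (started) {
                return consumed;
            }
        }
        throw new IllegalStateException("no VM_START event seen");
    }

    public static void main(String[] args) {
        Queue<List<Kind>> q = new ArrayDeque<>();
        q.add(List.of(Kind.OTHER));     // unrelated event arrives first
        q.add(List.of(Kind.VM_START));  // the VM is stable only from here on
        System.out.println("sets consumed before start: " + awaitStart(q));
    }
}
```

Unlike the real `EventQueue.remove()`, which blocks until an event set is available, this stand-in treats an empty queue as an error so that the sketch always terminates.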
Thanks, /Staffan On 11 feb 2014, at 18:30, shanliang wrote: > Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav > http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ > > Thanks, > Shanliang > > shanliang wrote: >> shanliang wrote: >>> Jaroslav Bachorik wrote: >>>> On 11.2.2014 16:31, shanliang wrote: >>>>> Staffan Larsen wrote: >>>>>> Hi Shanliang, >>>>>> >>>>>> I can't quite see how the test can fail in this way. When the >>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>> I am not expert of jdi so I may miss something here. I checked the >>>>> failure trace and saw the report exception happen when FieldMonitor >>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>>>> call "vm.resume()" before treating events. >>>> >>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. >>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": >>> /** >>> * Continues the execution of the application running in this >>> * virtual machine. All threads are resumed as documented in >>> * {@link ThreadReference#resume}. >>> * >>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}. >>> * >>> * @see #suspend >>> */ >>> void resume(); >>> My understanding is that the debuggee resumes to work after this call, instead to die? >> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>> >> Shanliang >>>> >>>>> >>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>> calling eventQueue.remove(); >>>> >>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. >>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance. >>> >>> Thanks, >>> Shanliang >>>> >>>> Cheers, >>>> >>>> -JB- >>>> >>>>> >>>>> Thanks, >>>>> Shanliang >>>>>> One problem I do see with the test is that it does not wait for a >>>>>> VMStartEvent before setting up requests. I'm not sure if that could >>>>>> cause the failure in the bug report, though. >>>>>> >>>>>> /Staffan >>>>>> >>>>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>>>> >>>>>>> Hi , >>>>>>> >>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>>>> already ended. >>>>>>> >>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>> >>>>>>> web: >>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>> >>>>>>> bug: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>> >>>>>>> Thanks, >>>>>>> Shanliang >>>>>> >>>>> >>>>> >>>> >>> >> > From serguei.spitsyn at oracle.com Tue Feb 11 12:23:27 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 11 Feb 2014 12:23:27 -0800 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: References: Message-ID: <52FA86BF.4040002@oracle.com> Staffan, It is nice fix to have. I do not see any issues with it. Thanks, Serguei On 2/11/14 6:12 AM, Staffan Larsen wrote: > Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster. > > bug: https://bugs.openjdk.java.net/browse/JDK-6952105 > webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ > > Thanks, > /Staffan From dmitry.samersoff at oracle.com Tue Feb 11 12:35:52 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 12 Feb 2014 00:35:52 +0400 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: References: Message-ID: <52FA89A8.7060105@oracle.com> Staffan, Nice catch! Looks good for me. -Dmitry On 2014-02-11 18:12, Staffan Larsen wrote: > Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster. > > bug: https://bugs.openjdk.java.net/browse/JDK-6952105 > webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ > > Thanks, > /Staffan > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
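
For readers following the 8007710 thread, the stdin handshake that Jaroslav proposed (the debugger writes and flushes a byte, the debuggee blocks in a read until it arrives) can be sketched in a single JVM. This is a minimal illustration under stated assumptions, not the code from the webrev: `ExitHandshake`, `waitForGo`, `signalGo` and `demo` are hypothetical names, a piped stream pair stands in for the debuggee's stdin, and a flag stands in for the addFieldWatch() call. In the real test the output stream would presumably come from the launched process, as suggested in the thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.atomic.AtomicBoolean;

public class ExitHandshake {
    /** Debuggee side: block until the peer sends one byte or closes the stream. */
    public static int waitForGo(InputStream in) throws IOException {
        return in.read();
    }

    /** Debugger side: send one byte and flush so the peer wakes up promptly. */
    public static void signalGo(OutputStream out) throws IOException {
        out.write('q');
        out.flush();
    }

    /** Runs both roles; returns true if the "debuggee" finished only after the watch was set. */
    public static boolean demo() throws Exception {
        PipedOutputStream toDebuggee = new PipedOutputStream();
        PipedInputStream debuggeeStdin = new PipedInputStream(toDebuggee);
        AtomicBoolean watchInstalled = new AtomicBoolean(false);
        AtomicBoolean exitedAfterWatch = new AtomicBoolean(false);

        Thread debuggee = new Thread(() -> {
            try {
                waitForGo(debuggeeStdin);                   // last statement of "main()"
                exitedAfterWatch.set(watchInstalled.get());
            } catch (IOException e) {
                // leave the flag false so the caller sees the failure
            }
        });
        debuggee.start();

        watchInstalled.set(true);   // stands in for addFieldWatch()
        signalGo(toDebuggee);       // only now may the debuggee exit
        debuggee.join();
        return exitedAfterWatch.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("debuggee exited after watch was installed: " + demo());
    }
}
```

The ordering guarantee comes entirely from the blocking read: the debuggee role cannot pass `waitForGo` until the debugger role has finished its setup and written the byte.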
From serguei.spitsyn at oracle.com Tue Feb 11 12:55:08 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 11 Feb 2014 12:55:08 -0800
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA5E51.3040704@oracle.com>
References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com>
Message-ID: <52FA8E2C.3050605@oracle.com>

The fix looks good. But could you change "impossible" at line 45 to something more adequate, e.g. "caught exception"?

  41         System.out.println("---TestPostFieldModification-run waiting to exit ...");
  42         try {
  43             System.in.read();
  44         } catch (Exception e) {
  45             System.out.println("---TestPostFieldModification-run impossible? "+e);
  46             e.printStackTrace();
  47         }

Thanks,
Serguei

On 2/11/14 9:30 AM, shanliang wrote: > Here is the new fix in which FieldMonitor will write to > TestPostFieldModification, to inform the latter to quit, as suggested > by Jaroslav > http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ > > Thanks, > Shanliang > > shanliang wrote: >> shanliang wrote: >>> Jaroslav Bachorik wrote: >>>> On 11.2.2014 16:31, shanliang wrote: >>>>> Staffan Larsen wrote: >>>>>> Hi Shanliang, >>>>>> >>>>>> I can't quite see how the test can fail in this way. When the >>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>> I am not expert of jdi so I may miss something here. I checked the >>>>> failure trace and saw the report exception happen when FieldMonitor >>>>> received ClassPrepareEvent and was doing addFieldWatch. >>>>> FieldMonitor did >>>>> call "vm.resume()" before treating events. >>>> >>>> AFAICS, calling vm.resume() results in an almost immediate debuggee >>>> death.
The gc() invoking thread "d" is flagged as a daemon and as >>>> such doesn't prevent the process from exiting. The other thread is >>>> not a daemon but will finish in only few cycles. >>> I looked at the class com.sun.jdi.VirtualMachine, here is the >>> Javadoc of the method "resume": >>> /** >>> * Continues the execution of the application running in this >>> * virtual machine. All threads are resumed as documented in >>> * {@link ThreadReference#resume}. >>> * >>> * @throws VMCannotBeModifiedException if the VirtualMachine is >>> read-only - see {@link VirtualMachine#canBeModified()}. >>> * >>> * @see #suspend >>> */ >>> void resume(); >>> My understanding is that the debuggee resumes to work after this >>> call, instead to die? >> In fact the problem is here, the vm (TestPostFieldModification) >> should not die before FieldMonitor finishes addFieldWatch. >> >> Shanliang >>>> >>>>> >>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>> calling eventQueue.remove(); >>>> >>>> It looks like some kind of synchronization between the debugger and >>>> the debuggee is necessary. But I wonder if you should better use >>>> the process.getOutputStream() to write and flush a message for the >>>> debuggee indicating that it can exit. And in the debuggee you would >>>> just do System.in.read() as the last statement in the main() >>>> method. Seems more robust than involving files. >>> It could work, but creating a file in the testing directory should >>> have no issue, but yes maybe less performance. >>> >>> Thanks, >>> Shanliang >>>> >>>> Cheers, >>>> >>>> -JB- >>>> >>>>> >>>>> Thanks, >>>>> Shanliang >>>>>> One problem I do see with the test is that it does not wait for a >>>>>> VMStartEvent before setting up requests. I'm not sure if that could >>>>>> cause the failure in the bug report, though.
>>>>>> >>>>>> /Staffan >>>>>> >>>>>> On 11 feb 2014, at 15:13, shanliang >>>>>> wrote: >>>>>> >>>>>>> Hi , >>>>>>> >>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>> "addFieldWatch" but the vm to monitor >>>>>>> (TestPostFieldModification) was >>>>>>> already ended. >>>>>>> >>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>> TestPostFieldModification quits only after finding the file. >>>>>>> >>>>>>> web: >>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>> >>>>>>> bug: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>> >>>>>>> Thanks, >>>>>>> Shanliang >>>>>> >>>>> >>>>> >>>> >>> >> > From lana.steuck at oracle.com Tue Feb 11 13:40:10 2014 From: lana.steuck at oracle.com (lana.steuck at oracle.com) Date: Tue, 11 Feb 2014 21:40:10 +0000 Subject: hg: jdk8/tl/corba: 7 new changesets Message-ID: <20140211214024.5EA1E62B9A@hg.openjdk.java.net> Changeset: 18c4d03cf516 Author: katleman Date: 2014-01-22 12:53 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/corba/rev/18c4d03cf516 Added tag jdk8-b125 for changeset 7b45151c7a05 ! .hgtags Changeset: 8ceb68fd9e10 Author: katleman Date: 2014-01-22 14:06 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/corba/rev/8ceb68fd9e10 Merge ! .hgtags Changeset: cfa04e69b115 Author: katleman Date: 2014-01-24 15:07 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/corba/rev/cfa04e69b115 Added tag jdk8-b126 for changeset 8ceb68fd9e10 !
.hgtags Changeset: b8c71dae0557 Author: lana Date: 2014-01-29 11:11 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/corba/rev/b8c71dae0557 Merge Changeset: 113e7569b49b Author: katleman Date: 2014-01-30 12:16 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/corba/rev/113e7569b49b Added tag jdk8-b127 for changeset b8c71dae0557 ! .hgtags Changeset: 5c72d74c6805 Author: katleman Date: 2014-02-01 18:21 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/corba/rev/5c72d74c6805 Added tag jdk8-b128 for changeset 113e7569b49b ! .hgtags Changeset: eea0d7dfcbe2 Author: katleman Date: 2014-02-06 17:34 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/corba/rev/eea0d7dfcbe2 Added tag jdk8-b129 for changeset 5c72d74c6805 ! .hgtags From lana.steuck at oracle.com Tue Feb 11 13:40:17 2014 From: lana.steuck at oracle.com (lana.steuck at oracle.com) Date: Tue, 11 Feb 2014 21:40:17 +0000 Subject: hg: jdk8/tl/langtools: 7 new changesets Message-ID: <20140211214140.425E862B9E@hg.openjdk.java.net> Changeset: 9a4dbfe11ed1 Author: katleman Date: 2014-01-22 12:54 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/langtools/rev/9a4dbfe11ed1 Added tag jdk8-b125 for changeset 436176151e85 ! .hgtags Changeset: ba24b6304362 Author: katleman Date: 2014-01-22 14:09 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/langtools/rev/ba24b6304362 Merge ! .hgtags Changeset: 305b97f4651b Author: katleman Date: 2014-01-24 15:08 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/langtools/rev/305b97f4651b Added tag jdk8-b126 for changeset ba24b6304362 ! .hgtags Changeset: bb69217ed812 Author: lana Date: 2014-01-29 11:12 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/langtools/rev/bb69217ed812 Merge Changeset: 09cdd3b493c0 Author: katleman Date: 2014-01-30 12:17 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/langtools/rev/09cdd3b493c0 Added tag jdk8-b127 for changeset bb69217ed812 ! 
.hgtags Changeset: 8fe7202d3c38 Author: katleman Date: 2014-02-01 18:21 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/langtools/rev/8fe7202d3c38 Added tag jdk8-b128 for changeset 09cdd3b493c0 ! .hgtags Changeset: 9d81ae1c417a Author: katleman Date: 2014-02-06 17:35 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/langtools/rev/9d81ae1c417a Added tag jdk8-b129 for changeset 8fe7202d3c38 ! .hgtags From lana.steuck at oracle.com Tue Feb 11 13:40:13 2014 From: lana.steuck at oracle.com (lana.steuck at oracle.com) Date: Tue, 11 Feb 2014 21:40:13 +0000 Subject: hg: jdk8/tl/jaxws: 7 new changesets Message-ID: <20140211214119.DF23C62B9C@hg.openjdk.java.net> Changeset: c0040f0b75e2 Author: katleman Date: 2014-01-22 12:53 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/c0040f0b75e2 Added tag jdk8-b125 for changeset ef71ecbcd7bc ! .hgtags Changeset: 7193a007a159 Author: katleman Date: 2014-01-22 14:07 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/7193a007a159 Merge ! .hgtags Changeset: 3f682f2ea376 Author: katleman Date: 2014-01-24 15:08 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/3f682f2ea376 Added tag jdk8-b126 for changeset 7193a007a159 ! .hgtags Changeset: 8e46fe36e175 Author: lana Date: 2014-01-29 11:11 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/8e46fe36e175 Merge Changeset: de172acc095b Author: katleman Date: 2014-01-30 12:16 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/de172acc095b Added tag jdk8-b127 for changeset 8e46fe36e175 ! .hgtags Changeset: aabc90596123 Author: katleman Date: 2014-02-01 18:21 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/aabc90596123 Added tag jdk8-b128 for changeset de172acc095b ! .hgtags Changeset: 4195c0956930 Author: katleman Date: 2014-02-06 17:35 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/4195c0956930 Added tag jdk8-b129 for changeset aabc90596123 ! 
.hgtags From lana.steuck at oracle.com Tue Feb 11 13:40:17 2014 From: lana.steuck at oracle.com (lana.steuck at oracle.com) Date: Tue, 11 Feb 2014 21:40:17 +0000 Subject: hg: jdk8/tl/jaxp: 7 new changesets Message-ID: <20140211214123.063DA62B9D@hg.openjdk.java.net> Changeset: 6a5af8a36aaf Author: katleman Date: 2014-01-22 12:53 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/6a5af8a36aaf Added tag jdk8-b125 for changeset 83bb924238f8 ! .hgtags Changeset: 390cc275c04c Author: katleman Date: 2014-01-22 14:07 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/390cc275c04c Merge ! .hgtags Changeset: 573c261a2025 Author: katleman Date: 2014-01-24 15:08 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/573c261a2025 Added tag jdk8-b126 for changeset 390cc275c04c ! .hgtags Changeset: b68cdb63a70b Author: lana Date: 2014-01-29 11:11 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/b68cdb63a70b Merge Changeset: b1839922f10c Author: katleman Date: 2014-01-30 12:16 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/b1839922f10c Added tag jdk8-b127 for changeset b68cdb63a70b ! .hgtags Changeset: b7752cea7c81 Author: katleman Date: 2014-02-01 18:21 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/b7752cea7c81 Added tag jdk8-b128 for changeset b1839922f10c ! .hgtags Changeset: 0cb0cd015218 Author: katleman Date: 2014-02-06 17:34 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/0cb0cd015218 Added tag jdk8-b129 for changeset b7752cea7c81 ! .hgtags From lana.steuck at oracle.com Tue Feb 11 13:40:24 2014 From: lana.steuck at oracle.com (lana.steuck at oracle.com) Date: Tue, 11 Feb 2014 21:40:24 +0000 Subject: hg: jdk8/tl/nashorn: 8 new changesets Message-ID: <20140211214101.632E862B9B@hg.openjdk.java.net> Changeset: d336209a0e45 Author: katleman Date: 2014-01-22 12:54 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/d336209a0e45 Added tag jdk8-b125 for changeset 7346abe2ea03 ! 
.hgtags Changeset: 095263db862d Author: katleman Date: 2014-01-22 14:00 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/095263db862d Merge ! .hgtags Changeset: e2522604c7c9 Author: katleman Date: 2014-01-24 15:08 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/e2522604c7c9 Added tag jdk8-b126 for changeset 095263db862d ! .hgtags Changeset: fdfbb745caf0 Author: lana Date: 2014-01-29 11:12 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/fdfbb745caf0 Merge Changeset: 7dfde83426d1 Author: katleman Date: 2014-01-30 12:17 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/7dfde83426d1 Added tag jdk8-b127 for changeset fdfbb745caf0 ! .hgtags Changeset: 73cbad0c5d28 Author: lana Date: 2014-01-31 13:47 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/73cbad0c5d28 Merge Changeset: 9cc3fd32fbab Author: katleman Date: 2014-02-01 18:21 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/9cc3fd32fbab Added tag jdk8-b128 for changeset 73cbad0c5d28 ! .hgtags Changeset: f87eba70e9ee Author: katleman Date: 2014-02-06 17:35 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/f87eba70e9ee Added tag jdk8-b129 for changeset 9cc3fd32fbab ! .hgtags From lana.steuck at oracle.com Tue Feb 11 13:40:17 2014 From: lana.steuck at oracle.com (lana.steuck at oracle.com) Date: Tue, 11 Feb 2014 21:40:17 +0000 Subject: hg: jdk8/tl/hotspot: 23 new changesets Message-ID: <20140211214227.12C4F62B9F@hg.openjdk.java.net> Changeset: 16e0c6c84a91 Author: amurillo Date: 2014-01-13 16:00 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/16e0c6c84a91 8031553: new hotspot build - hs25-b67 Reviewed-by: jcoomes ! 
make/hotspot_version Changeset: 12ad8db39f76 Author: roland Date: 2014-01-14 09:44 +0100 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/12ad8db39f76 8028764: dtrace/hotspot_jni/ALL/ALL001 crashes the vm on Solaris-amd64, SIGSEGV in MarkSweep::follow_stack()+0x8a Summary: C1 generates code to encode compressed oop into tmp register before runtime call for patching where GC may happen Reviewed-by: iveresov, twisti, kvn Contributed-by: mgerdin ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp Changeset: 8b81451dc7f7 Author: twisti Date: 2014-01-16 16:18 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/8b81451dc7f7 8022395: java.util.zip.ZipException: Not in GZIP format in JT_JDK/test/java/util/zip/GZIP tests Reviewed-by: kvn, iveresov ! src/cpu/x86/vm/c1_LIRGenerator_x86.cpp Changeset: 3585183c191a Author: amurillo Date: 2014-01-17 20:24 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/3585183c191a Merge Changeset: 5df2666e4573 Author: amurillo Date: 2014-01-17 20:24 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/5df2666e4573 Added tag hs25-b67 for changeset 3585183c191a ! .hgtags Changeset: 55ff9170e27d Author: katleman Date: 2014-01-22 12:53 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/55ff9170e27d Added tag jdk8-b125 for changeset df333ee12bba ! .hgtags Changeset: c8218f1072a0 Author: katleman Date: 2014-01-22 14:07 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/c8218f1072a0 Merge ! .hgtags Changeset: 9a11d5e679cf Author: katleman Date: 2014-01-24 15:07 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/9a11d5e679cf Added tag jdk8-b126 for changeset c8218f1072a0 ! .hgtags Changeset: c2106608358b Author: amurillo Date: 2014-01-17 20:30 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/c2106608358b 8032015: new hotspot build - hs25-b68 Reviewed-by: jcoomes ! 
make/hotspot_version Changeset: 709018897c81 Author: vlivanov Date: 2014-01-23 01:23 +0400 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/709018897c81 8031695: CHA ignores default methods during analysis leading to incorrect code generation Reviewed-by: jrose, acorn, hseigel, lfoltan ! src/share/vm/code/dependencies.cpp + test/compiler/inlining/DefaultAndConcreteMethodsCHA.java Changeset: f970454708b8 Author: iveresov Date: 2014-01-17 18:09 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/f970454708b8 8032207: C2: assert(VerifyOops || MachNode::size(ra_) <= (3+1)*4) failed: bad fixed size Summary: Fix the sizing of loadUS2L_immI16 and loadI2L_immI Reviewed-by: kvn, azeemj ! src/cpu/sparc/vm/sparc.ad + test/compiler/codegen/LoadWithMask.java Changeset: 984401824c5e Author: iveresov Date: 2014-01-21 20:05 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/984401824c5e 8031743: C2: loadI2L_immI broken for negative memory values Summary: Restrict loadI2L_imm optimizations to positive values of mask Reviewed-by: kvn, dlong ! src/cpu/sparc/vm/sparc.ad ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad + test/compiler/codegen/LoadWithMask2.java Changeset: d45454002494 Author: amurillo Date: 2014-01-23 13:37 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/d45454002494 Merge Changeset: 2c564e329c87 Author: amurillo Date: 2014-01-23 13:37 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/2c564e329c87 Added tag hs25-b68 for changeset d45454002494 ! .hgtags Changeset: 58879cd9f8df Author: amurillo Date: 2014-01-28 09:51 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/58879cd9f8df Merge ! .hgtags Changeset: 7e412f95e310 Author: amurillo Date: 2014-01-23 13:53 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/7e412f95e310 8032608: new hotspot build - hs25-b69 Reviewed-by: jcoomes ! 
make/hotspot_version Changeset: 2185d483f5f8 Author: kvn Date: 2014-01-27 10:20 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/2185d483f5f8 8032566: Crash in JIT when running Scala compiler (and compiling Scala std lib) Summary: Switch off EliminateAutoBox flag by default in jdk8 release. Reviewed-by: iveresov ! src/share/vm/opto/c2_globals.hpp Changeset: 32f017489ba5 Author: amurillo Date: 2014-01-28 15:00 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/32f017489ba5 Merge Changeset: 1d8728efc05f Author: amurillo Date: 2014-01-28 15:00 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/1d8728efc05f Added tag hs25-b69 for changeset 32f017489ba5 ! .hgtags Changeset: 35038da7bb9d Author: lana Date: 2014-01-29 11:11 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/35038da7bb9d Merge Changeset: 874c0b4a946c Author: katleman Date: 2014-01-30 12:16 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/874c0b4a946c Added tag jdk8-b127 for changeset 35038da7bb9d ! .hgtags Changeset: cb39165c4a65 Author: katleman Date: 2014-02-01 18:21 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/cb39165c4a65 Added tag jdk8-b128 for changeset 874c0b4a946c ! .hgtags Changeset: 1dbaf664a611 Author: katleman Date: 2014-02-06 17:34 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/1dbaf664a611 Added tag jdk8-b129 for changeset cb39165c4a65 ! .hgtags From lana.steuck at oracle.com Tue Feb 11 13:40:47 2014 From: lana.steuck at oracle.com (lana.steuck at oracle.com) Date: Tue, 11 Feb 2014 21:40:47 +0000 Subject: hg: jdk8/tl/jdk: 14 new changesets Message-ID: <20140211214445.3A9FE62BA0@hg.openjdk.java.net> Changeset: 75cf17ceb6d1 Author: katleman Date: 2014-01-22 12:54 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/75cf17ceb6d1 Added tag jdk8-b125 for changeset ae303640bc1c ! .hgtags Changeset: 95410515ba5f Author: katleman Date: 2014-01-22 14:08 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/95410515ba5f Merge ! 
.hgtags Changeset: 91bce40d0347 Author: alexsch Date: 2014-01-23 20:36 +0400 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/91bce40d0347 8032063: javax.swing.plaf.metal.MetalFileChooserUI$FilterComboBoxModel extends non-standard API Reviewed-by: pchelko, serb ! src/macosx/classes/com/apple/laf/AquaFileChooserUI.java ! src/share/classes/com/sun/java/swing/plaf/gtk/GTKFileChooserUI.java ! src/share/classes/com/sun/java/swing/plaf/motif/MotifFileChooserUI.java ! src/share/classes/com/sun/java/swing/plaf/windows/WindowsFileChooserUI.java ! src/share/classes/javax/swing/plaf/metal/MetalFileChooserUI.java - src/share/classes/sun/swing/AbstractFilterComboBoxModel.java ! src/share/classes/sun/swing/plaf/synth/SynthFileChooserUIImpl.java Changeset: 6935e7a3a7c9 Author: amurillo Date: 2014-01-23 14:46 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/6935e7a3a7c9 Merge Changeset: a9088d517f2f Author: amurillo Date: 2014-01-23 14:47 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/a9088d517f2f Merge Changeset: a635c394328c Author: katleman Date: 2014-01-24 15:08 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/a635c394328c Added tag jdk8-b126 for changeset a9088d517f2f ! .hgtags Changeset: fbf251b8ef8a Author: lana Date: 2014-01-29 11:11 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/fbf251b8ef8a Merge Changeset: f777d83e0433 Author: katleman Date: 2014-01-30 12:17 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/f777d83e0433 Added tag jdk8-b127 for changeset fbf251b8ef8a ! .hgtags Changeset: f644211c59fd Author: lana Date: 2014-01-31 13:47 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/f644211c59fd Merge Changeset: 3c9473004f38 Author: katleman Date: 2014-02-01 18:21 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/3c9473004f38 Added tag jdk8-b128 for changeset f644211c59fd ! 
.hgtags Changeset: ab6e7bb8ff9f Author: pchelko Date: 2014-01-22 16:15 +0400 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/ab6e7bb8ff9f 7155984: Security problems in regression test java/awt/PrintJob/Security/SecurityDialogTest.java Reviewed-by: anthony, serb ! src/macosx/classes/apple/laf/JRSUIUtils.java Changeset: eef10feca8ca Author: lana Date: 2014-02-06 13:28 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/eef10feca8ca Merge Changeset: 80568a19aab7 Author: lana Date: 2014-02-06 13:29 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/80568a19aab7 Merge Changeset: 43386cc9a017 Author: katleman Date: 2014-02-06 17:35 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/43386cc9a017 Added tag jdk8-b129 for changeset 80568a19aab7 ! .hgtags From daniel.daugherty at oracle.com Tue Feb 11 14:03:23 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 11 Feb 2014 15:03:23 -0700 Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52F9C523.3010403@oracle.com> References: <52ED3F28.5070806@oracle.com> <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com> <52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com> <52F920AD.6050705@oracle.com> <52F9C523.3010403@oracle.com> Message-ID: <52FA9E2B.6040907@oracle.com> On 2/10/14 11:37 PM, David Holmes wrote: > Dan - thanks for an awesome analysis once again! You're welcome. > Yes mea culpa the code I was looking at was sans Mr Simms most > fortuitous change. (I'm suffering from repo overload these days.) No apology needed. It's a good thing that you and Karen were both looking at slightly older versions of the code. If we had not gone down this road, we may not have noticed this bug when backporting the fix to HSX-25 or older. 
> As per my previous email I could see that a timed-wait, or spurious > wakeup, could take us to the potentially problematic SpinAcquire, > however I could not see how/if that thread could become _succ such > that an unpark directed to _succ could be consumed in the SpinAcquire > park(). Yes, a big part of my previous reply was fleshing out what you said in an order that made sense to my brain. I should have made it more clear that I was restating things to make sure I had it right. Sorry, I forgot to address your successor question in my previous reply. Once a thread has called Object.wait() and has been notified, it is eligible to be made the successor by another thread that has acquired and exited the monitor. The window of opportunity for losing an unpark() opens when a thread can be seen on either the entry-queue or the cx-queue. At that point, any thread going through ExitEpilog can make the target thread the successor and call unpark() on it. The call to SpinAcquire() that we're discussing happens right in that window, so the SpinAcquire() call in HSX-25 and older code can consume the unpark(). Hopefully, this resolves the last of this review thread. Dan > > Thanks again, > David > > On 11/02/2014 4:55 AM, Daniel D. Daugherty wrote: >> On 2/9/14 8:37 PM, David Holmes wrote: >>> trimming content ... >>> >>> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote: >>>> On 2/7/14 2:56 PM, Karen Kinnear wrote: >>>>> 3. Did I read the code correctly that the Thread::SpinAcquire can >>>>> make >>>>> a timed park >>>>> call on the same thread's _ParkEvent? And that this is used to get on >>>>> and off the wait queue, >>>>> i.e. to acquire the WaitSetLock? >>>>> Is there the same risk that a notify might be eaten here also? 
>>>> >>>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent >>> >>> It sure does: >>> >>> void Thread::SpinAcquire (volatile int * adr, const char * LockName) { >>> if (Atomic::cmpxchg (1, adr, 0) == 0) { >>> return ; // normal fast-path return >>> } >>> >>> // Slow-path : We've encountered contention -- Spin/Yield/Block >>> strategy. >>> TEVENT (SpinAcquire - ctx) ; >>> int ctr = 0 ; >>> int Yields = 0 ; >>> for (;;) { >>> while (*adr != 0) { >>> ++ctr ; >>> if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >>> if (Yields > 5) { >>> // Consider using a simple NakedSleep() instead. >>> // Then SpinAcquire could be called by non-JVM threads >>> Thread::current()->_ParkEvent->park(1) ; >> >> Ummmm... that's not the code I'm seeing... >> >> src/share/vm/runtime/thread.cpp: >> >> 4417 void Thread::SpinAcquire (volatile int * adr, const char * >> LockName) { >> 4418 if (Atomic::cmpxchg (1, adr, 0) == 0) { >> 4419 return ; // normal fast-path return >> 4420 } >> 4421 >> 4422 // Slow-path : We've encountered contention -- >> Spin/Yield/Block strategy. 
>> 4423 TEVENT (SpinAcquire - ctx) ; >> 4424 int ctr = 0 ; >> 4425 int Yields = 0 ; >> 4426 for (;;) { >> 4427 while (*adr != 0) { >> 4428 ++ctr ; >> 4429 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> 4430 if (Yields > 5) { >> 4431 os::naked_short_sleep(1); >> 4432 } else { >> 4433 os::NakedYield() ; >> 4434 ++Yields ; >> 4435 } >> 4436 } else { >> 4437 SpinPause() ; >> 4438 } >> 4439 } >> 4440 if (Atomic::cmpxchg (1, adr, 0) == 0) return ; >> 4441 } >> 4442 } >> >> Mr Simms recently changed the above code via: >> >> changeset: 5832:5944dba4badc >> user: dsimms >> date: Fri Jan 24 09:28:47 2014 +0100 >> summary: 8028280: ParkEvent leak when running modified runThese >> which only loads classes >> >> os::naked_short_sleep() is new: >> >> - BSD/MacOS X, Linux - uses nanosleep() >> - Solaris - uses usleep() >> - Windows - uses Sleep() >> >> The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to >> JDK9/hs >> on 2014.01.29. I don't see any signs that Mr Simms's fix will be >> backported >> to JDK8u/HSX-25u (yet) so this part of the review thread might impact >> the >> backport of my fix to earlier releases. >> >> >>> So considering Karen's question ... I can't tell for certain. :( >>> >>> I do not think the SpinAcquire on grabbing the wait-set lock to add to >>> the wait-set can be an issue because we will only park in response to >>> the actual wait, and hence only get unparked due to a >>> notify/notifyAll, but at this point we still own the monitor so no >>> notify/notifyAll is possible. >>> >>> However, for the removal from the wait-set a more complex analysis is >>> needed. To do the SpinAcquire we must still be flagged as TS_WAIT - >>> which means we have not been notified, but must be returning due to a >>> timeout (or spurious wakeup?). In such circumstances could we be >>> _succ? I don't think so but I'll leave it to Dan to confirm that >>> part :) >> >> So for HSX-25 and probably older... 
>> >> There are four Thread::SpinAcquire() calls in the objectMonitor code: >> >> Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ; >> Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ; >> Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ; >> Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ; >> >> We can easily rule out the "notify" and "notifyAll" uses since the >> current thread owns the Java-level monitor and there are no events >> to post in this part of the notify() or notifyAll() protocols. >> >> For the "WaitSet - add" use, the current thread owns the Java-level >> monitor and the thread has not been added as a waiter yet so another >> thread cannot do the notify-exit-make-successor part of the protocol >> yet. >> >> For the "WaitSet - unlink" use: >> >> src/share/vm/runtime/objectMonitor.cpp: >> >> 1569 if (node.TState == ObjectWaiter::TS_WAIT) { >> 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - >> unlink") ; >> 1571 if (node.TState == ObjectWaiter::TS_WAIT) { >> 1572 DequeueSpecificWaiter (&node) ; // unlink >> from WaitSet >> 1573 assert(node._notified == 0, "invariant"); >> 1574 node.TState = ObjectWaiter::TS_RUN ; >> 1575 } >> 1576 Thread::SpinRelease (&_WaitSetLock) ; >> 1577 } >> >> It is the call on line 1570 above that gets us into this code: >> >> src/share/vm/runtime/thread.cpp: >> >> 4435 void Thread::SpinAcquire (volatile int * adr, const char * >> LockName) { >> 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { >> 4437 return ; // normal fast-path return >> 4438 } >> 4439 >> 4440 // Slow-path : We've encountered contention -- >> Spin/Yield/Block strategy. >> 4441 TEVENT (SpinAcquire - ctx) ; >> 4442 int ctr = 0 ; >> 4443 int Yields = 0 ; >> 4444 for (;;) { >> 4445 while (*adr != 0) { >> 4446 ++ctr ; >> 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> 4448 if (Yields > 5) { >> 4449 // Consider using a simple NakedSleep() instead. 
>> 4450 // Then SpinAcquire could be called by non-JVM >> threads >> 4451 Thread::current()->_ParkEvent->park(1) ; >> 4452 } else { >> 4453 os::NakedYield() ; >> 4454 ++Yields ; >> 4455 } >> 4456 } else { >> 4457 SpinPause() ; >> 4458 } >> 4459 } >> 4460 if (Atomic::cmpxchg (1, adr, 0) == 0) return ; >> 4461 } >> 4462 } >> >> And the above code can consume the unpark() on line 4451. >> >> So how the heck do we get to line 1570??? >> >> Well, the target thread would have to be both notified and unparked >> to be executing this code path. When the notify() code runs, the >> target of the notify() is changed from ObjectWaiter::TS_WAIT to >> ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default >> for Knob_MoveNotifyee is 2, so we're in non-default mode here... >> >> Here are the Knob_MoveNotifyee policy values: >> >> 1717 if (Policy == 0) { // prepend to EntryList >> 1728 if (Policy == 1) { // append to EntryList >> 1744 if (Policy == 2) { // prepend to cxq >> 1760 if (Policy == 3) { // append to cxq >> >> For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism >> where we just unpark the target thread and let it run. Part of >> that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN. >> >> The code works the same for notifyAll() for the thread picked >> to be notified. For the Knob_MoveNotifyee == 4 (or higher) case, >> we just unpark all the waiters and we have a free-for-all. >> >> So it looks like the code block from lines 1569-1577 is never >> used... or is it? Well... you have to remember two things: >> >> 1) spurious unpark() >> 2) timed wait() >> >> The caller might have called wait(0), but that doesn't mean that >> the underlying park() mechanism won't have a spurious unpark(). >> Or better, the caller might have called wait(1) and be running >> again after a millisecond. 
>> >> So in the HSX25 and older system (i.e., without Mr Simms fix for >> 8028280), it is possible for this call: >> >> 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - >> unlink") ; >> >> to consume the unpark(). The gauntlet that has to be traversed >> to get to this call: >> >> 4451 Thread::current()->_ParkEvent->park(1) ; >> >> is impressive: >> >> - fast-path acquisition of the _WaitSetLock has to fail: >> >> 4436 if (Atomic::cmpxchg (1, adr, 0) == 0) { >> 4437 return ; // normal fast-path return >> 4438 } >> >> - if the machine is a uniprocessor, then 6 os::NakedYield() >> call-loop-recheck attempts have to fail: >> >> 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> 4448 if (Yields > 5) { >> 4449 // Consider using a simple NakedSleep() instead. >> 4450 // Then SpinAcquire could be called by non-JVM >> threads >> 4451 Thread::current()->_ParkEvent->park(1) ; >> 4452 } else { >> 4453 os::NakedYield() ; >> 4454 ++Yields ; >> 4455 } >> >> - if the machine is a multi-processor, then 6 rounds of { 4095 >> SpinPause() >> attempts, 1 os::NakedYield() attempt} have to fail: >> >> 4446 ++ctr ; >> 4447 if ((ctr & 0xFFF) == 0 || !os::is_MP()) { >> 4448 if (Yields > 5) { >> 4449 // Consider using a simple NakedSleep() instead. >> 4450 // Then SpinAcquire could be called by non-JVM >> threads >> 4451 Thread::current()->_ParkEvent->park(1) ; >> 4452 } else { >> 4453 os::NakedYield() ; >> 4454 ++Yields ; >> 4455 } >> 4456 } else { >> 4457 SpinPause() ; >> 4458 } >> >> But it is possible. It is one of those once-in-a-blue moon type >> windows where everything has to line up just so. >> >> So how do we address this issue in HSX-25 and possibly older? >> >> If Mr Simms fix for 8028280 is also backported, then there is no >> issue. 
If it is not backported, then applying the fix for this >> bug like so: >> >> src/share/vm/runtime/objectMonitor.cpp: >> >> 1596 if (JvmtiExport::should_post_monitor_waited()) { >> 1597 JvmtiExport::post_monitor_waited(jt, this, ret == >> OS_TIMEOUT); >> 1598 } >> >> 1604 if (node._notified != 0 && _succ == Self) { >> 1605 // In this part of the monitor wait-notify-reenter >> protocol it >> 1606 // is possible (and normal) for another thread to do a >> fastpath >> 1607 // monitor enter-exit while this thread is still trying >> to get >> 1608 // to the reenter portion of the protocol. >> 1609 // >> 1610 // The ObjectMonitor was notified and the current >> thread is >> 1611 // the successor which also means that an unpark() has >> already >> 1612 // been done. The JVMTI_EVENT_MONITOR_WAITED event >> handler can >> 1613 // consume the unpark() that was done when the successor >> was >> 1614 // set because the same ParkEvent is shared between Java >> 1615 // monitors and JVM/TI RawMonitors (for now). >> 1616 // >> 1617 // We redo the unpark() to ensure forward progress, >> i.e., we >> 1618 // don't want all pending threads hanging (parked) with >> none >> 1619 // entering the unlocked monitor. >> 1620 node._event->unpark(); >> 1621 } >> >> Of course the line numbers for the "fix" would be different and the >> comment >> would need to be updated to reflect that the: >> >> 1570 Thread::SpinAcquire (&_WaitSetLock, "WaitSet - >> unlink") ; >> >> call above could also consume an unpark(), but it should work. >> >> If you've read this far, then I'm impressed. If you've read this far >> and only fallen asleep a couple of times, then I'm still impressed. >> >> Summary: I don't think we have an issue in JDK9, but we'll have to do >> the fix in JDK8/HSX25 and older a little differently. >> >> Dan >> >> >>> >>> David >>> ----- >>> >>>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it >>>> is a different ParkEvent. 
From src/share/vm/runtime/thread.hpp: >>>> >>>> ParkEvent * _ParkEvent ; // for synchronized() >>>> ParkEvent * _SleepEvent ; // for Thread.sleep >>>> ParkEvent * _MutexEvent ; // for native internal >>>> Mutex/Monitor >>>> ParkEvent * _MuxEvent ; // for low-level >>>> muxAcquire-muxRelease >>>> >>>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire() >>>> uses the _MuxEvent. There are some comments in thread.cpp about >>>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't >>>> think we ever want to go there. >>>> >>>> I also filed this RFE: >>>> >>>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>>> >>>> just in case the Serviceability team wants to migrate JVM/TI >>>> RawMonitors >>>> to a separate ParkEvent. >>>> >>>> Please let me know if you concur that I've resolved issue #3. >>>> >>>> >>>>> If so, I wonder if we want this added unpark to not just be called if >>>>> JVMTI_EVENT_MONITOR_WAITED >>>>> is enabled? >>>> >>>> I don't think we need it, but I've noted its removal as a risk. >>>> >>>> Again, thanks for the review! >>>> >>>> Dan >>>> >>>> >>>>> >>>>> thanks, >>>>> Karen >>>>> >>>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote: >>>>> >>>>>> Greetings, >>>>>> >>>>>> I have a fix ready for the following bug: >>>>>> >>>>>> 8028073 race condition in ObjectMonitor implementation causing >>>>>> deadlocks >>>>>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>>>>> >>>>>> On the surface, this is a very simple fix that relocates a few >>>>>> lines of >>>>>> code, relocates and rewrites the comments associated with that code >>>>>> and >>>>>> adds several new comments. >>>>>> >>>>>> Of course, in reality, the issue is much more complicated, but I'm >>>>>> hoping to make it easy for anyone not acquainted with this issue to >>>>>> understand what's going on. 
>>>>>> >>>>>> Here are the JDK9 webrev URLs: >>>>>> >>>>>> OpenJDK: >>>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>>>>> >>>>>> Oracle internal: >>>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>>>>> >>>>>> >>>>>> >>>>>> The simple summary: >>>>>> >>>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>>>>> it is possible for a JVM/TI monitor event handler to accidentally >>>>>> consume a ParkEvent.unpark() call meant for the Java Monitor layer >>>>>> - the original code fix was made on 2005.07.04 using this bug ID: >>>>>> https://bugs.openjdk.java.net/browse/JDK-5030359 >>>>>> - it's the right fix, but it's in the wrong place >>>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>>>>> event handler is called because it is that event handler >>>>>> that can cause the hang >>>>>> >>>>>> >>>>>> Testing >>>>>> ------- >>>>>> >>>>>> - a new StressMonitorWait test has been created that reliably >>>>>> reproduces the hang in JDK[6789]; see the bug's gory details >>>>>> for the specific versions where the hang has been reproduced >>>>>> - the test reliably reproduces the hang in 5 seconds on my >>>>>> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>>>>> the hang reliably on other machines >>>>>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>>>>> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>>>>> bits did not reproduce the hang >>>>>> - JPRT test job >>>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>>>>> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>>>>> and Windows-X64: >>>>>> - vm.quick >>>>>> - Kitchensink (bigapps) >>>>>> - Weblogic+medrec (bigapps) >>>>>> - runThese (bigapps) >>>>>> >>>>>> >>>>>> The Gory Details Start Here >>>>>> --------------------------- >>>>>> >>>>>> This is the old location of the block of code that's being moved: >>>>>> >>>>>> src/share/vm/runtime/objectMonitor.cpp: 
>>>>>> >>>>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, >>>>>> TRAPS) { >>>>>> >>>>>> 1499 exit (true, Self) ; // exit the monitor >>>>>> >>>>>> 1513 if (node._notified != 0 && _succ == Self) { >>>>>> 1514 node._event->unpark(); >>>>>> 1515 } >>>>>> >>>>>> >>>>>> This is the new location of block of code that's being moved: >>>>>> >>>>>> src/share/vm/runtime/objectMonitor.cpp: >>>>>> >>>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, >>>>>> TRAPS) { >>>>>> >>>>>> 1601 if (JvmtiExport::should_post_monitor_waited()) { >>>>>> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>>>>> OS_TIMEOUT); >>>>>> >>>>>> 1604 if (node._notified != 0 && _succ == Self) { >>>>>> >>>>>> 1620 node._event->unpark(); >>>>>> 1621 } >>>>>> >>>>>> >>>>>> The Risks >>>>>> --------- >>>>>> >>>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED >>>>>> event >>>>>> is enabled: >>>>>> - previously it was always executed >>>>>> - while the old code was not effective for the hang that is being >>>>>> fixed with this bug, it is possible that the old code prevented >>>>>> a different bug in the successor protocol from manifesting >>>>>> - thorough analysis of the successor protocol did not reveal a >>>>>> case where the old code was needed in the old location >>>>>> - Thorough analysis indicates that the other JVM/TI monitor events >>>>>> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>>>>> - the successor protocol is complicated and the analysis could >>>>>> be wrong when certain options are used >>>>>> - comments were added to each location where a JVM/TI monitor >>>>>> event handler is called documenting why a fix like this one >>>>>> is not needed there >>>>>> - if the analysis is wrong, the new comments show where a new >>>>>> code change would be needed >>>>>> >>>>>> >>>>>> The Scenario >>>>>> ------------ >>>>>> >>>>>> I've created a scenario that reproduces this hang: >>>>>> >>>>>> T1 - enters monitor and calls 
monitor.wait() >>>>>> T2 - enters the monitor, calls monitor.notify() and exits the >>>>>> monitor >>>>>> T3 - enters and exits the monitor >>>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor >>>>>> >>>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw >>>>>> monitor. >>>>>> >>>>>> Here are the six events necessary to make this hang happen: >>>>>> >>>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>>>>> field, but >>>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the >>>>>> successor >>>>>> again. >>>>>> >>>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the >>>>>> successor >>>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>>>>> >>>>>> // KEY-EVENT-3a: T3 made T1 the successor >>>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed >>>>>> it. >>>>>> >>>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>>>>> >>>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>>>>> >>>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event >>>>>> handler >>>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >>>>>> stuck. >>>>>> >>>>>> >>>>>> This bug is intertwined with: >>>>>> >>>>>> - The ObjectMonitor successor protocol >>>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI >>>>>> RawMonitors >>>>>> >>>>>> There is a very long successor.notes attachment to JDK-8028073 that >>>>>> attempts to describe the ObjectMonitor successor protocol. It's good >>>>>> for putting pretty much anyone to sleep. 
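The lost-wakeup hazard in the key events above can be modelled outside HotSpot. What follows is a minimal sketch, assuming a ParkEvent behaves like a one-permit semaphore; `ToyParkEvent` and `waiter_gets_wakeup` are hypothetical names used only for illustration and are not HotSpot code. It replays the sequence on a single thread so the outcome is deterministic: an exiting thread posts the permit, an event handler's timed wait on the same event consumes it, and the waiter's own park then times out unless the unpark is redone after the handler has run.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Toy stand-in for a ParkEvent: a one-permit (binary) semaphore.
// Hypothetical model for illustration only; not HotSpot's implementation.
class ToyParkEvent {
 public:
  // Post the permit. Repeated unpark() calls do not accumulate.
  void unpark() {
    {
      std::lock_guard<std::mutex> guard(mu_);
      permit_ = true;
    }
    cv_.notify_one();
  }

  // Wait up to `millis` for the permit. Returns true if the permit was
  // consumed, false if the wait timed out.
  bool park(long millis) {
    assert(millis >= 0);
    std::unique_lock<std::mutex> lock(mu_);
    bool got = cv_.wait_for(lock, std::chrono::milliseconds(millis),
                            [this] { return permit_; });
    if (got) permit_ = false;
    return got;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool permit_ = false;
};

// Replays the scenario and returns true if the waiter's final park()
// found a permit, i.e. the wakeup was not lost.
bool waiter_gets_wakeup(bool redo_unpark_after_handler) {
  ToyParkEvent ev;  // one event shared by both "monitor" layers (assumption)
  ev.unpark();      // an exiting thread makes the waiter the successor
  ev.park(1);       // the event handler's timed raw-monitor wait eats the permit
  if (redo_unpark_after_handler) {
    ev.unpark();    // compensate after the handler, in the spirit of the fix
  }
  return ev.park(50);  // the waiter parks before re-entering the monitor
}
```

In this model `waiter_gets_wakeup(false)` returns false because the permit was eaten, and `waiter_gets_wakeup(true)` returns true. Redoing the unpark unconditionally is harmless in the model because the permit is binary, so an extra unpark() cannot accumulate.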
>>>>>> >>>>>> Since this hang reproduces back to JDK6, this bug is taking the >>>>>> easily >>>>>> backported solution of moving the original fix to the right >>>>>> location. >>>>>> The following new bug has been filed for possible future work in >>>>>> this >>>>>> area by the Serviceability Team: >>>>>> >>>>>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>>>>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>>>>> >>>>>> >>>>>> The Symptoms >>>>>> ------------ >>>>>> >>>>>> With intermittent hangs like this, it is useful to know what to look >>>>>> for in order to determine if you are running into this issue: >>>>>> >>>>>> - if you aren't using a debugger or a profiler or some other >>>>>> JVM/TI agent, then this hang is not the same as yours >>>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>>>>> event handler, then this hang is not the same as yours >>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>>>>> JVM/TI RawMonitors, then this hang is not the same as yours >>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>>>>> back into Java code, then you might just be insane and this >>>>>> hang might be similar to yours. However, using a Java callback >>>>>> in an event handler is an even bigger problem/risk so fix that >>>>>> first. 
>>>>>> - if you have one or more threads blocked like this and making no >>>>>> progress, then this hang might be the same as yours: >>>>>> >>>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] >>>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>>> JavaThread state: _thread_blocked >>>>>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>>> JavaThread state: _thread_blocked >>>>>> at java.lang.Object.wait(Native Method) >>>>>> - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> at java.lang.Object.wait(Object.java:502) >>>>>> at SMW_WorkerThread.run(StressMonitorWait.java:103) >>>>>> - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] >>>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>>> JavaThread state: _thread_blocked >>>>>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>>> JavaThread state: _thread_blocked >>>>>> at SMW_WorkerThread.run(StressMonitorWait.java:120) >>>>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] >>>>>> java.lang.Thread.State: BLOCKED (on object monitor) >>>>>> JavaThread state: _thread_blocked >>>>>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>>>>> JavaThread state: _thread_blocked >>>>>> at SMW_WorkerThread.run(StressMonitorWait.java:139) >>>>>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> Key symptoms in thread T1: >>>>>> >>>>>> - had the object locked: >>>>>> >>>>>> locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>>>>> >>>>>> - did an 
Object.wait():
>>>>>>
>>>>>> waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> - is blocked on reentry:
>>>>>>
>>>>>> waiting for monitor entry [0xfffffd7fc0231000]
>>>>>>
>>>>>> Key symptoms in thread T2:
>>>>>>
>>>>>> - is blocked waiting to lock the object:
>>>>>>
>>>>>> waiting for monitor entry [0xfffffd7fc0130000]
>>>>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> Key symptoms in thread T3:
>>>>>>
>>>>>> - is blocked waiting to lock the object:
>>>>>>
>>>>>> waiting for monitor entry [0xfffffd7fc002f000]
>>>>>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>
>>

From lana.steuck at oracle.com Tue Feb 11 13:40:08 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:08 +0000
Subject: hg: jdk8/tl: 8 new changesets
Message-ID: <20140211214010.B63CD62B98@hg.openjdk.java.net>

Changeset: 950921234b10
Author: katleman
Date: 2014-01-22 12:53 -0800
URL: http://hg.openjdk.java.net/jdk8/tl/rev/950921234b10

Added tag jdk8-b125 for changeset 790bbd46b201

! .hgtags

Changeset: 1b5d578f93ef
Author: katleman
Date: 2014-01-22 14:06 -0800
URL: http://hg.openjdk.java.net/jdk8/tl/rev/1b5d578f93ef

Merge

! .hgtags

Changeset: 9ccce5bf1b0e
Author: katleman
Date: 2014-01-24 04:45 -0800
URL: http://hg.openjdk.java.net/jdk8/tl/rev/9ccce5bf1b0e

Merge

Changeset: 4f8fa4724c14
Author: katleman
Date: 2014-01-24 15:07 -0800
URL: http://hg.openjdk.java.net/jdk8/tl/rev/4f8fa4724c14

Added tag jdk8-b126 for changeset 9ccce5bf1b0e

! .hgtags

Changeset: 2e2ffb9e4b69
Author: lana
Date: 2014-01-29 11:11 -0800
URL: http://hg.openjdk.java.net/jdk8/tl/rev/2e2ffb9e4b69

Merge

Changeset: 101e42de4686
Author: katleman
Date: 2014-01-30 12:16 -0800
URL: http://hg.openjdk.java.net/jdk8/tl/rev/101e42de4686

Added tag jdk8-b127 for changeset 2e2ffb9e4b69

!
.hgtags Changeset: 1e5fe8654913 Author: katleman Date: 2014-02-01 18:21 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/rev/1e5fe8654913 Added tag jdk8-b128 for changeset 101e42de4686 ! .hgtags Changeset: 839546caab12 Author: katleman Date: 2014-02-06 17:34 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/rev/839546caab12 Added tag jdk8-b129 for changeset 1e5fe8654913 ! .hgtags From serguei.spitsyn at oracle.com Tue Feb 11 18:57:06 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 11 Feb 2014 18:57:06 -0800 Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets assert(SafepointSynchronize::is_at_safepoint() || JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits)) In-Reply-To: <52F1514C.1050405@oracle.com> References: <52F0CB50.3040703@oracle.com> <52F10BBA.8060504@oracle.com> <52F1514C.1050405@oracle.com> Message-ID: <52FAE302.1040305@oracle.com> On 2/4/14 12:45 PM, serguei.spitsyn at oracle.com wrote: > Thanks, Dan! > Some comments below. > > On 2/4/14 7:48 AM, Daniel D. Daugherty wrote: >> On 2/4/14 4:13 AM, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-8032223 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ >>> >> >> src/share/vm/prims/jvmtiEnvBase.hpp >> No comments. >> >> src/share/vm/prims/jvmtiEnv.cpp >> No comments beyond David's tweak to the comment. > > Will fix it before push. > >> For future work... >> >> Looks like these VM ops also need the liveness check on the >> target JavaThread: >> >> VM_GetOwnedMonitorInfo >> VM_GetStackTrace >> >> JvmtiEnv::GetStackTrace() looks like it has the same >> is_thread_fully_suspended() flaw... > > I know about this. > There are even more issues: > GetCurrentContendedMonitor > GetOwnedMonitorStackDepthInfo > NotifyFramePop > > I'll check if we can use an existing bug to add this extra request. > Otherwise, will file new one. 
Filed this one: https://bugs.openjdk.java.net/browse/JDK-8034249 Thanks, Serguei > > Thanks, > Serguei > >> >> Dan >> >> >> >>> >>> Summary: >>> >>> This is the second round of review for this issue. >>> But it was decided that the JDK-8032223 must be used to cover it >>> instead of the JDK-6471769. >>> The 8032223 was initially closed as a dup of 6471769 but it has >>> been re-open now. >>> >>> There is a general issue in the suspend equivalent condition >>> mechanism: >>> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() >>> may return different results: >>> - 1-st: true >>> - 2-nd: false >>> >>> This suspend equivalent issue is covered by another bug: >>> https://bugs.openjdk.java.net/browse/JDK-6280037 >>> >>> The bug to fix in this review is a specific manifestation of the >>> 6280037 >>> in the JVMTI GetFrameCount() that has a major impact on the SQE >>> nightly. >>> It is on the Test Stabilization radar as well as the 6280037. >>> There are many tests intermittently failing because of this. >>> I've also decided to fix the same issue in the JVMTI >>> GetFrameLocation() as well. >>> >>> The JVMTI GetFrameCount() spec tells: >>> "If this function is called for a thread actively executing >>> bytecodes (for example, >>> not the current thread and not suspended), the information >>> returned is transient." >>> >>> So, it is Ok to call the GetFrameCount() for the non-suspended >>> target thread. >>> To achieve safety, the frame count for non-suspended threads is >>> calculated at a safepoint. >>> It should be Ok and more safe to do the same for suspended threads >>> as well. >>> There should be no big performance impact because it is already on >>> a slow path. >>> It is still important to avoid safepointing when the target thread >>> is current. >>> >>> The bug 6280037 should go out of the Test Stabilization radar >>> (remove the svc-nightly label) >>> as the most of the impacted tests must be covered by the 8032223. 
>>> >>> >>> Testing: >>> In progress: >>> - nsk.jvmti, nsk.jdi, nsk.jdwp >>> - JTreg com/sun/jdi >>> >>> >>> Thanks, >>> Serguei >>> >> > From volker.simonis at gmail.com Wed Feb 12 06:28:50 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 12 Feb 2014 15:28:50 +0100 Subject: Project to improve hs_err files In-Reply-To: <52FB73D0.1070402@oracle.com> References: <52FB73D0.1070402@oracle.com> Message-ID: On Wed, Feb 12, 2014 at 2:14 PM, Nils Eliasson wrote: > On 2013-09-06 13:32, Mattis Castegren wrote: > > ... > > > > * Bigger changes > > - Re-structure hs_err file ... > > Yes, please move the compiler related info together. Having the current > compiler task separated from the compilation events with the heap regions > causes a lot of scrolling. > > > > * Additional Data > > ... > > > Make sure we read the OS-context when hitting asserts in debug-builds. Today > we only get registers, code and stack on actual crashes. Very valuable info > that is missing when investigating many test failures. > I'm afraid that will be not possible. For a crash (i.e. because of a SIGSEGV) the kernel saves the context for the PC which caused the problem. 'assert' and 'guarantee' are macros which evaluate an expression and call report_vm_error if that expression was 'false'. I think the best we could do is to call 'getcontext()' after the evaluation of the expression, but that's not actually the context where the evaluation of the expression failed. But I agree that it may be helpful as well. And you should already get a complete stack trace for 'asserts' and 'guarantees' with the current implementation. Don't you see it? 
Regards, Volker > //Nils From yasu at ysfactory.dip.jp Wed Feb 12 06:32:11 2014 From: yasu at ysfactory.dip.jp (Yasumasa Suenaga) Date: Wed, 12 Feb 2014 23:32:11 +0900 Subject: JDK-7090324: gclog rotation via external tool In-Reply-To: <52F22AE0.1000008@ysfactory.dip.jp> References: <52483BDB.8040206@ysfactory.dip.jp> <52496A21.8080608@ysfactory.dip.jp> <527CDD56.7080106@ysfactory.dip.jp> <52A09642.4030609@ysfactory.dip.jp> <52E26FA2.40909@ysfactory.dip.jp> <0E26045D-F7F9-49BC-AB36-A42C1DC6E64E@oracle.com> <52E91000.9010600@ysfactory.dip.jp> <52E91AAA.3060008@oracle.com> <52E9248D.2090108@ysfactory.dip.jp> <52E9FA5B.6010306@lab.ntt.co.jp> <5B189F0A-7408-47C7-9719-DC2990355209@oracle.com> <52EA33B3.6080409@lab.ntt.co.jp> <52F2151E.4040805@oracle.com> <52F229E0.3010400@ysfactory.dip.jp> <52F22AE0.1000008@ysfactory.dip.jp> Message-ID: <52FB85EB.2090005@ysfactory.dip.jp> Hi all, I've uploaded new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.05/ Erik pointed me that my patch changes current behavior for GCLogFileSize. In current implementation, default value of GCLogFileSize is set to "0" and if user set this value to less than 8K, JVM adjust it to 8K. Below are the scenarios: 1. -Xloggc:test.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 Should result in GCLogFileSize "0" (GC log rotation will be turned off) 2. -Xloggc:test.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=10K Should result in GCLogFileSize 10K 3. -Xloggc:test.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=2K Should result in GCLogFileSize 8K From the result of 3, we can think that GCLogFileSize is set to 8K by default. So I want to change default value of this to 8K in globals.hpp . And I want to treat "0" as special number for rotating by external trigger. From the result of 1, if GCLogFileSIze is set to "0", UseGCLogFileRotation is set to false. Definition of GCLogFileSize in globals.hpp, "0" means "no rotation" . 
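The three scenarios above amount to a small normalization rule for GCLogFileSize. The following is a minimal sketch in Java (HotSpot itself implements this in C++) that models only the behavior described in this mail. The class name, method names, and the MIN_SIZE constant are hypothetical and are not taken from the patch or from arguments.cpp.

```java
public class GcLogFileSizeSketch {
    // 8K floor described in scenario 3 (assumption: 8K means 8 * 1024 bytes).
    static final long MIN_SIZE = 8 * 1024;

    // Returns the size the VM would end up using, per the scenarios above:
    // 0 stays 0, anything else is raised to at least 8K.
    static long effectiveSize(long requested) {
        if (requested == 0) {
            return 0;
        }
        return Math.max(requested, MIN_SIZE);
    }

    // Size-based rotation is only meaningful for a non-zero size (scenario 1).
    static boolean sizeRotationEnabled(long requested) {
        return effectiveSize(requested) != 0;
    }

    public static void main(String[] args) {
        System.out.println(effectiveSize(0));          // scenario 1: 0
        System.out.println(effectiveSize(10 * 1024));  // scenario 2: 10240
        System.out.println(effectiveSize(2 * 1024));   // scenario 3: 8192
    }
}
```

Under this model, treating "0" as "rotate only on an external trigger" does not alter any of the three size-based outcomes listed above.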
Thus I think this changes does not make different behavior from current implementation. ------------------------ product(uintx, GCLogFileSize, 0, \ "GC log file size (default: 0 bytes, no rotation). " \ "It requires UseGCLogFileRotation") \ ------------------------ Could you review this ? Thanks, Yasumasa On 02/05/2014 09:13 PM, Yasumasa Suenaga wrote: > Sorry, I forgot to paste URL of new webrev :-P > http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.04/ > > > Yasumasa > > On 02/05/2014 09:09 PM, Yasumasa Suenaga wrote: >> Hi Erik, >> >> Thank you for reviewing again! >> I've updated new webrev. >> >> On 02/05/2014 07:40 PM, Erik Helin wrote: >>> Hi Yasumasa, >>> >>> I've looked through the latest patch, it is much better! I just have >>> two comments: >>> >>> - ostream.hpp: >>> Why did you add GCLogFileSize != 0 in should_rotate? The old check >>> just checked that _bytes_written > GCLogFileSize. >> >> Default value of GCLogFileSIze is "0" in globals.hpp . >> So if this state is missed, should_rotate() returns true in anytime. >> >> >>> - TestGCLogRotationViaJcmd.java: >>> Could you use the helper class JDKToolLauncher to start jmap? The >>> code would then be slightly easier to read: >>> >>> for (int times = 1; times < NUM_LOGS; times++) { >>> // Run jcmd GC.rotate_log >>> JDKToolLauncher jmap = JDKToolLauncher.create("jmap") >>> .addToolArg(pid) >>> .addToolArg("GC.rotate_log"); >>> ProcessBuilder pb = new ProcessBuilder(jmap.getCommand()); >>> >>> // Make sure we didn't crash >>> OutputAnalyzer output = new OutputAnalyzer(pb.start()); >>> output.shouldHaveExitValue(0); >>> } >> >> I've fixed. Could you check the patch? 
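The should_rotate() exchange above can be illustrated with a small model. This is only a sketch in Java of the condition as described in the thread (forced rotation, or a non-zero size limit that has been exceeded); the real code is C++ in ostream.hpp, and the class and parameter names here are hypothetical.

```java
public class RotateGuardSketch {
    // Rotate when explicitly forced, or when a non-zero size limit has been
    // exceeded. Without the "!= 0" guard, a default limit of 0 would make the
    // size check succeed as soon as anything had been written.
    static boolean shouldRotate(boolean force, long bytesWritten, long gcLogFileSize) {
        return force || (gcLogFileSize != 0 && bytesWritten > gcLogFileSize);
    }

    public static void main(String[] args) {
        System.out.println(shouldRotate(false, 100, 0));     // false: size 0 disables the size check
        System.out.println(shouldRotate(true, 0, 0));        // true: external trigger
        System.out.println(shouldRotate(false, 9000, 8192)); // true: limit exceeded
    }
}
```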
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> Thanks,
>>> Erik
>>>
>>> On 01/30/2014 12:12 PM, Yasumasa Suenaga wrote:
>>>> Hi Staffan,
>>>>
>>>> I've uploaded new webrev:
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.03/
>>>>
>>>> On 2014/01/30 17:23, Staffan Larsen wrote:
>>>>> Would it be possible for the Diagnostic Command to output the
>>>>> location of the rotated log? When invoking the command it would
>>>>> be good to get some kind of feedback.
>>>>
>>>> I changed rotate_log() to redirect messages to jcmd.
>>>> If GC.rotate_log is executed, we can get messages on jcmd console
>>>> as below:
>>>> ------------
>>>> $ jcmd 18976 GC.rotate_log
>>>> 18976:
>>>> 2014-01-30 19:59:39 GC log rotation request has been received. Saved as test.log.0
>>>> 2014-01-30 19:59:39 GC log file created test.log.1
>>>> ------------
>>>>
>>>>
>>>>> test/gc/7090324/Test7090324.java:
>>>>> - I think this needs to have the Oracle copyright notice as well.
>>>>> - Tests should now use descriptive names, not bug numbers:
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Naming+HotSpot+JTReg+Tests
>>>>>
>>>>> - nits: lots of missing spaces before "{", and after "for", "if"
>>>>> - line 47: you don't need to clean up old files, jtreg will give
>>>>> you a fresh scratch directory to run in
>>>>
>>>> I've fixed.
>>>> Could you review again?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>> /Staffan
>>>>>
>>>>>
>>>>>
>>>>> On 30 jan 2014, at 08:08, Yasumasa Suenaga wrote:
>>>>>
>>>>>> Hi Erik, Staffan,
>>>>>>
>>>>>> I've uploaded new webrev. Could you review this ?
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.02/
>>>>>>
>>>>>> This patch includes fixes from comments of Staffan and Erik.
>>>>>>
>>>>>> And I created new test of this patch as Test7090324 .
>>>>>> This test works fine with jtreg.
>>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> On 2014/01/30 0:55, Yasumasa Suenaga wrote: >>>>>>> Hi Erik, >>>>>>> >>>>>>> On 2014/01/30 0:13, Erik Helin wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> (have to use HTML email to get a width of more than 78 chars, >>>>>>>> sorry) >>>>>>>> >>>>>>>> why did you change the code in arguments.cpp in the method >>>>>>>> check_gc_log_consistency? >>>>>>> >>>>>>> In current implementation, check_gclog_consistency() checks three >>>>>>> parameters: >>>>>>> >>>>>>> - GC log filename >>>>>>> - NumberOfGCLogFiles >>>>>>> - GCLogFileSize >>>>>>> >>>>>>> My customer uses external trigger "ONLY" for rotating logs. >>>>>>> If they want to do that, GCLogFileSize does not need. >>>>>>> >>>>>>> >>>>>>>> Next, the gcLogFileStream::rotate_log method now does a lot of >>>>>>>> things. >>>>>>>> Could you separate out the first block into a new method, >>>>>>>> gcLogFileStream::should_rotate(bool force)? >>>>>>>> >>>>>>>> This was, the code would read: >>>>>>>> >>>>>>>>> bool gcLogFileStream::should_rotate(bool force) { >>>>>>>>> return force || _bytes_writen>= GCLogFileSize; >>>>>>>>> } >>>>>>>>> >>>>>>>>> void gcLogFileStream::rotate_log(bool force) { >>>>>>>>> char time_msg[FILENAMEBUFLEN]; >>>>>>>>> char time_str[EXTRACHARLEN]; >>>>>>>>> char current_file_name[FILENAMEBUFLEN]; >>>>>>>>> char renamed_file_name[FILENAMEBUFLEN]; >>>>>>>>> >>>>>>>>> if (!should_rotate(force)) { >>>>>>>>> return; >>>>>>>>> } >>>>>>>>> >>>>>>>>> ... >>>>>>>>> } >>>>>>>> >>>>>>>> Could you please update your patch? >>>>>>> >>>>>>> I will do that. >>>>>>> >>>>>>> >>>>>>>> There is a new empty line in the rotate_log method: >>>>>>>> >>>>>>>>> } >>>>>>>>> + >>>>>>>>> #ifdef ASSERT >>>>>>>> >>>>>>>> could you please remove it? >>>>>>> >>>>>>> I will do that. >>>>>>> >>>>>>> >>>>>>>> The logging change in rotate_log uses a different kind of if/else >>>>>>>> syntax >>>>>>>> than the rest of the file: >>>>>>>> >>>>>>>>> if (force) { >>>>>>>>> ... 
>>>>>>>>> } >>>>>>>>> else { >>>>>>>>> ... >>>>>>>>> } >>>>>>>> >>>>>>>> The other if/else statements in the file uses: >>>>>>>> >>>>>>>>> if (cond) { >>>>>>>>> ... >>>>>>>>> } else { >>>>>>>>> ... >>>>>>>>> } >>>>>>>> >>>>>>>> Could you please update your change to use the same if/else >>>>>>>> syntax? >>>>>>> >>>>>>> I will do that. >>>>>>> >>>>>>> >>>>>>>> This part of the change duplicates the code: >>>>>>>> >>>>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log rotation >>>>>>>> request has been received. Saved as %s\n", >>>>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)), >>>>>>>> + renamed_file_name); >>>>>>>> + } >>>>>>>> + else { >>>>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log file has >>>>>>>> reached the" >>>>>>>> " maximum size. Saved as %s\n", >>>>>>>> - os::local_time_string((char *)time_str, sizeof(time_str)), >>>>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)), >>>>>>>> renamed_file_name); >>>>>>>> >>>>>>>> Could you instead just change the message, as in: >>>>>>>> >>>>>>>>> const char* msg = forced ? "%s GC log rotation request has been >>>>>>>>> received. Saved as %s\n" : >>>>>>>>> "%s GC log file has reached the maximum size. Saved as %s\n"; >>>>>>>>> jio_snprintf(msg, os::local_time_string((char *)time_str, >>>>>>>>> sizeof(time_str)), renamed_file_name); >>>>>>> >>>>>>> I will do that. >>>>>>> >>>>>>> >>>>>>>> The declaration of rotate_log in ostream.hpp still uses the old >>>>>>>> variable name is_force, it should use force, >>>>>>>> just as the definition. >>>>>>> >>>>>>> Sorry, I will fix it. >>>>>>> >>>>>>> >>>>>>>> Finally, could you add a test that tests your change? Have a look >>>>>>>> at the other tests >>>>>>>> in hotspot/test/gc to see how you can do it >>>>>>>> (you might want to use some functionality from >>>>>>>> hotspot/test/testlibrary). >>>>>>> >>>>>>> I found three tests as following: >>>>>>> >>>>>>> [ysuenaga at xelvis test]$ find . 
-iname "*jcmd*" >>>>>>> ./runtime/NMT/JcmdWithNMTDisabled.java >>>>>>> ./runtime/NMT/JcmdScale.java >>>>>>> ./gc/TestG1ZeroPGCTJcmdThreadPrint.java >>>>>>> >>>>>>> I understand that these tests checks output (stdout/stderr) with >>>>>>> OutputAnalyzer. >>>>>>> However, my patch affects target VM. So I guess current test cannot >>>>>>> check >>>>>>> that GC log rotation is succeeded. >>>>>>> >>>>>>> Should I make test which checks exit value of jcmd ? >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>>> Thanks, >>>>>>>> Erik >>>>>>>> >>>>>>>> On 2014-01-29 15:28, Yasumasa Suenaga wrote: >>>>>>>>> Hi Staffan, >>>>>>>>> >>>>>>>>> Thank you for reviewing! >>>>>>>>> I've uploaded new webrev. >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.01/ >>>>>>>>> >>>>>>>>> On 2014/01/29 20:56, Staffan Larsen wrote: >>>>>>>>>> Yasumasa, >>>>>>>>>> >>>>>>>>>> src/share/vm/runtime/arguments.cpp >>>>>>>>>> no comments >>>>>>>>>> >>>>>>>>>> src/share/vm/runtime/safepoint.cpp >>>>>>>>>> I was surprised that gc log size was checked after each safe >>>>>>>>>> point. That seems an uneccssary burden to place on a safe point. >>>>>>>>>> Instead we should switch to a periodic task that checks the gc >>>>>>>>>> log size. However, this is unrelated to you patch, so please >>>>>>>>>> ignore for now. >>>>>>>>> >>>>>>>>> Agree. >>>>>>>>> However, I think that PeriodicTask also is not appropriate for >>>>>>>>> this. >>>>>>>>> >>>>>>>>> Size of GC log file is increased when GC is occurred. >>>>>>>>> So I think rotate function should be called at entry of each GC >>>>>>>>> events >>>>>>>>> e.g. VM_GC_Operation::doit_prologue() etc... >>>>>>>>> >>>>>>>>> >>>>>>>>>> src/share/vm/runtime/vm_operations.hpp >>>>>>>>>> line 402: nit: missing space before { >>>>>>>>> >>>>>>>>> Fixed. >>>>>>>>> >>>>>>>>> >>>>>>>>>> line 405: I think ?force? is a better name than ?is_force? >>>>>>>>> >>>>>>>>> I removed "force" option from DCmd. 
>>>>>>>>> So I removed this from VMOperation.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/share/vm/services/diagnosticCommand.cpp
>>>>>>>>>> line 666: What does this do without the -force option? It looks
>>>>>>>>>> to me that the non-force case will happen after each safe point
>>>>>>>>>> (see above) and that there is no need to ever do this from a
>>>>>>>>>> diagnostic command. Can we remove the option?
>>>>>>>>>
>>>>>>>>> Indeed.
>>>>>>>>> I removed "force" option.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> line 677: "Target VM does not support GC log file rotation."
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> nits: some missing spaces before "{" and after "if"
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/share/vm/services/diagnosticCommand.hpp
>>>>>>>>>> I think RotateGCLogDCmd should require the "control" permission
>>>>>>>>>> when executed via JMX, so please add:
>>>>>>>>>>   static const JavaPermission permission() {
>>>>>>>>>>     JavaPermission p = {"java.lang.management.ManagementPermission",
>>>>>>>>>>                         "control", NULL};
>>>>>>>>>>     return p;
>>>>>>>>>>   }
>>>>>>>>>
>>>>>>>>> Added.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> line 394: Maybe "Force the GC log file to be rotated." is a
>>>>>>>>>> better description?
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/share/vm/utilities/ostream.cpp
>>>>>>>>>> line 662: I think "force" is a better name than "is_force"
>>>>>>>>>> line 668: The comment says exactly the same thing as the code so
>>>>>>>>>> I think it can be skipped
>>>>>>>>>> line 671: "GC log file rotation occurs by external trigger
>>>>>>>>>> ONLY."
>>>>>>>>>> line 675: "not need" -> "no need"
>>>>>>>>>> line 718: "GC log rotation request has been received"
>>>>>>>>>
>>>>>>>>> Fixed them.
>>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>>> src/share/vm/utilities/ostream.hpp >>>>>>>>>> no comments >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> /Staffan >>>>>>>>>> >>>>>>>>>> On 24 jan 2014, at 14:50, Yasumasa >>>>>>>>>> Suenaga wrote: >>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I've created webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.00/ >>>>>>>>>>> >>>>>>>>>>> This patch works fine on current jdk9/hs-rt . >>>>>>>>>>> Could you review this? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I am just an Author. So I need a sponsor. >>>>>>>>>>> Could you help me? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Please cooperate. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2013/12/06 0:05, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> Did someone read my email? >>>>>>>>>>>> I really hope to merge "JDK-7090324: gclog rotation via >>>>>>>>>>>> external tool" . >>>>>>>>>>>> >>>>>>>>>>>> I hear that someone need this RFE. So I want to discuss about >>>>>>>>>>>> this. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> On 2013/11/08 21:47, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> Did someone read my mail? >>>>>>>>>>>>> >>>>>>>>>>>>> I think that this RFE helps us to watch Java heap on >>>>>>>>>>>>> production system. >>>>>>>>>>>>> Also I think this RFE is able to be part of the JEP 158 >>>>>>>>>>>>> (Unified JVM Logging) . >>>>>>>>>>>>> >>>>>>>>>>>>> I want to update this RFE in JDK Bug System, but I don't have >>>>>>>>>>>>> account. >>>>>>>>>>>>> So I've posted email at first. 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2013/09/30 21:10, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> In previous email, I've attached new patch for this RFE. >>>>>>>>>>>>>> It works fine with current hsx. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2013/09/29 23:40, Yasu wrote: >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We are using "logrotate" tool on RHEL for various log >>>>>>>>>>>>>>> rotation. >>>>>>>>>>>>>>> Current HotSpot has gclog rotation function for log size >>>>>>>>>>>>>>> base, >>>>>>>>>>>>>>> however I need to rotate gc log synchronizing with >>>>>>>>>>>>>>> logrotate >>>>>>>>>>>>>>> tool. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So I've created RFE as "JDK-7090324: gclog rotation via >>>>>>>>>>>>>>> external tool" . >>>>>>>>>>>>>>> And Sr. Engineering Manager in Oracle said he use the >>>>>>>>>>>>>>> essence of my patch in one >>>>>>>>>>>>>>> of the jcmd subcommands. >>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-September/003274.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2 years ago, I posted a patch for this RFE. >>>>>>>>>>>>>>> But this patch is too old to apply for current HotSpot. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In last month, a similar discussion was appeared in ML. >>>>>>>>>>>>>>> So I think it's time to discuss this RFE. >>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-August/008029.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please cooperate. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>> >>>> >>> >> > From shanliang.jiang at oracle.com Wed Feb 12 09:04:50 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Wed, 12 Feb 2014 18:04:50 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> Message-ID: <52FBA9B2.6090502@oracle.com> Staffan Larsen wrote: > I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc: > > Note that a target VM launched by a launching connector is not > guaranteed to be stable until after the VMStartEvent has been > received. > I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent. 
> I think adding code that looks something like this will make the test stable: > > VirtualMachine vm = launchTarget(CLASS_NAME); > EventQueue eventQueue = vm.eventQueue(); > > boolean started = false; > while(!started) { > EventSet eventSet = eventQueue.remove(); > for (Event event : eventSet) { > if (event instanceof VMStartEvent) { > started = true; > } > if (event instanceof VMDeathEvent > || event instanceof VMDisconnectEvent) { > throw new Error("VM died before it started...:"+event); > } > } > } > > System.out.println("Vm launched"); > The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent. My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done. > > There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm..classesByName() will just return an empty list anyway. > I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it. > While you are in there you can also remove the unused StringBuffer near the top of main(). 
> Yes it was already removed in version 01 Here is the new webrev: http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/ Thanks, Shanliang > > Thanks, > /Staffan > > On 11 feb 2014, at 18:30, shanliang wrote: > > >> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested bu Jaroslav >> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ >> >> Thanks, >> Shanliang >> >> shanliang wrote: >> >>> shanliang wrote: >>> >>>> Jaroslav Bachorik wrote: >>>> >>>>> On 11.2.2014 16:31, shanliang wrote: >>>>> >>>>>> Staffan Larsen wrote: >>>>>> >>>>>>> Hi Shanliang, >>>>>>> >>>>>>> I can?t quite see how the test can fail in this way. When the >>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>>>> >>>>>> I am not expert of jdi so I may miss something here. I checked the >>>>>> failure trace and saw the report exception happen when FieldMonitor >>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>>>>> call "vm.resume()" before treating events. >>>>>> >>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a deamon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. >>>>> >>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": >>>> /** >>>> * Continues the execution of the application running in this >>>> * virtual machine. All threads are resumed as documented in >>>> * {@link ThreadReference#resume}. >>>> * >>>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}. >>>> * >>>> * @see #suspend >>>> */ >>>> void resume(); >>>> My understanding is that the debuggee resumes to work after this call, instead to die? 
>>>> >>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch. >>> >>> Shanliang >>> >>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>>> calling eventQueue.remove(); >>>>>> >>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOuptuptStream() to write and flush a message for the debugee indicating that it can exit. And in the debugee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. >>>>> >>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance. >>>> >>>> Thanks, >>>> Shanliang >>>> >>>>> Cheers, >>>>> >>>>> -JB- >>>>> >>>>> >>>>>> Thanks, >>>>>> Shanliang >>>>>> >>>>>>> One problem I do see with the test is that it does not wait for a >>>>>>> VMStartEvent before setting up requests. I?m not sure if that could >>>>>>> cause the failure in the bug report, though. >>>>>>> >>>>>>> /Staffan >>>>>>> >>>>>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>>>>> >>>>>>> >>>>>>>> Hi , >>>>>>>> >>>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>>>>> already ended. >>>>>>>> >>>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>>> TestPostFieldModification quits only after finding the file. 
>>>>>>>>
>>>>>>>> web:
>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>
>>>>>>>> bug:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>>
>>>>>>
>
>

From shanliang.jiang at oracle.com Wed Feb 12 09:06:03 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Wed, 12 Feb 2014 18:06:03 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA8E2C.3050605@oracle.com>
References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <52FA8E2C.3050605@oracle.com>
Message-ID: <52FBA9FB.1040002@oracle.com>

serguei.spitsyn at oracle.com wrote:
> The fix looks good.
> But could you change "impossible" at line 45 to something more
> adequate, e.g. "caught exception"? :
>
> 41 System.out.println("---TestPostFieldModification-run waiting to exit ...");
> 42 try {
> 43 System.in.read();
> 44 } catch (Exception e) {
> 45 System.out.println("---TestPostFieldModification-run impossible? "+e);
> 46 e.printStackTrace();
> 47 }
>
Done. Thanks for reviewing.

Shanliang
>
> Thanks,
> Serguei
>
>
> On 2/11/14 9:30 AM, shanliang wrote:
>> Here is the new fix in which FieldMonitor will write to
>> TestPostFieldModification, to inform the latter to quit, as suggested
>> by Jaroslav
>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>
>> Thanks,
>> Shanliang
>>
>> shanliang wrote:
>>> shanliang wrote:
>>>> Jaroslav Bachorik wrote:
>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>> Staffan Larsen wrote:
>>>>>>> Hi Shanliang,
>>>>>>>
>>>>>>> I can't quite see how the test can fail in this way.
When the >>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>>> I am not expert of jdi so I may miss something here. I checked the >>>>>> failure trace and saw the report exception happen when FieldMonitor >>>>>> received ClassPrepareEvent and was doing addFieldWatch. >>>>>> FieldMonitor did >>>>>> call "vm.resume()" before treating events. >>>>> >>>>> AFAICS, calling vm.resume() results in an almost immediate >>>>> debuggee death. The gc() invoking thread "d" is flagged as a >>>>> daemon and as such doesn't prevent the process from exiting. The >>>>> other thread is not a daemon but will finish in only few cycles. >>>> I looked at the class com.sun.jdi.VirtualMachine, here is the >>>> Javadoc of the method "resume": >>>> /** >>>> * Continues the execution of the application running in this >>>> * virtual machine. All threads are resumed as documented in >>>> * {@link ThreadReference#resume}. >>>> * >>>> * @throws VMCannotBeModifiedException if the VirtualMachine is >>>> read-only - see {@link VirtualMachine#canBeModified()}. >>>> * >>>> * @see #suspend >>>> */ >>>> void resume(); >>>> My understanding is that the debuggee resumes to work after this >>>> call, instead to die? >>> In fact the problem is here, the vm (TestPostFieldModification) >>> should not die before FieldMonitor finishes addFieldWatch. >>> >>> Shanliang >>>>> >>>>>> >>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>>> calling eventQueue.remove(); >>>>> >>>>> It looks like some kind of synchronization between the debugger >>>>> and the debuggee is necessary. But I wonder if you should better >>>>> use the process.getOutputStream() to write and flush a message >>>>> for the debuggee indicating that it can exit. And in the debuggee >>>>> you would just do System.in.read() as the last statement in the >>>>> main() method. Seems more robust than involving files.
>>>> It could work, but creating a file in the testing directory should >>>> have no issue, but yes maybe less performance. >>>> >>>> Thanks, >>>> Shanliang >>>>> >>>>> Cheers, >>>>> >>>>> -JB- >>>>> >>>>>> >>>>>> Thanks, >>>>>> Shanliang >>>>>>> One problem I do see with the test is that it does not wait for a >>>>>>> VMStartEvent before setting up requests. I'm not sure if that could >>>>>>> cause the failure in the bug report, though. >>>>>>> >>>>>>> /Staffan >>>>>>> >>>>>>> On 11 feb 2014, at 15:13, shanliang >>>>>>> wrote: >>>>>>> >>>>>>>> Hi , >>>>>>>> >>>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>>> "addFieldWatch" but the vm to monitor >>>>>>>> (TestPostFieldModification) was >>>>>>>> already ended. >>>>>>>> >>>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>>> FieldMonitor has done necessary. The solution proposed here is >>>>>>>> that >>>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>>> TestPostFieldModification quits only after finding the file. >>>>>>>> >>>>>>>> web: >>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>>> >>>>>>>> bug: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Shanliang >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >
From staffan.larsen at oracle.com Wed Feb 12 09:27:42 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 12 Feb 2014 18:27:42 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FBA9B2.6090502@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> Message-ID: I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix? I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens. /Staffan On 12 feb 2014, at 18:04, shanliang wrote: > Staffan Larsen wrote: >> >> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc: >> >> Note that a target VM launched by a launching connector is not >> guaranteed to be stable until after the VMStartEvent has been >> received. >> > I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent.
>> >> I think adding code that looks something like this will make the test stable: >> >> VirtualMachine vm = launchTarget(CLASS_NAME); >> EventQueue eventQueue = vm.eventQueue(); >> >> boolean started = false; >> while(!started) { >> EventSet eventSet = eventQueue.remove(); >> for (Event event : eventSet) { >> if (event instanceof VMStartEvent) { >> started = true; >> } >> if (event instanceof VMDeathEvent >> || event instanceof VMDisconnectEvent) { >> throw new Error("VM died before it started...:"+event); >> } >> } >> } >> >> System.out.println("Vm launched"); >> > The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent. > > My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done. >> >> >> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway. >> > I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it. >> While you are in there you can also remove the unused StringBuffer near the top of main().
>> > Yes it was already removed in version 01 > > Here is the new webrev: > http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/ > > Thanks, > Shanliang >> >> Thanks, >> /Staffan >> >> On 11 feb 2014, at 18:30, shanliang wrote: >> >> >>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav >>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ >>> >>> Thanks, >>> Shanliang >>> >>> shanliang wrote: >>> >>>> shanliang wrote: >>>> >>>>> Jaroslav Bachorik wrote: >>>>> >>>>>> On 11.2.2014 16:31, shanliang wrote: >>>>>> >>>>>>> Staffan Larsen wrote: >>>>>>> >>>>>>>> Hi Shanliang, >>>>>>>> >>>>>>>> I can't quite see how the test can fail in this way. When the >>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>>>>> >>>>>>> I am not expert of jdi so I may miss something here. I checked the >>>>>>> failure trace and saw the report exception happen when FieldMonitor >>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>>>>>> call "vm.resume()" before treating events. >>>>>>> >>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. >>>>>> >>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": >>>>> /** >>>>> * Continues the execution of the application running in this >>>>> * virtual machine. All threads are resumed as documented in >>>>> * {@link ThreadReference#resume}. >>>>> * >>>>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>> * >>>>> * @see #suspend >>>>> */ >>>>> void resume(); >>>>> My understanding is that the debuggee resumes to work after this call, instead to die? >>>>> >>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch. >>>> >>>> Shanliang >>>> >>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>>>> calling eventQueue.remove(); >>>>>>> >>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. >>>>>> >>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance. >>>>> >>>>> Thanks, >>>>> Shanliang >>>>> >>>>>> Cheers, >>>>>> >>>>>> -JB- >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Shanliang >>>>>>> >>>>>>>> One problem I do see with the test is that it does not wait for a >>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could >>>>>>>> cause the failure in the bug report, though. >>>>>>>> >>>>>>>> /Staffan >>>>>>>> >>>>>>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>>>>>> >>>>>>>> >>>>>>>>> Hi , >>>>>>>>> >>>>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>>>>>> already ended. >>>>>>>>> >>>>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>> >>>>>>>>> web: >>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>>>> >>>>>>>>> bug: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Shanliang >>>>>>>>> >>>>>>> >> > > From nils.eliasson at oracle.com Wed Feb 12 09:30:02 2014 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 12 Feb 2014 18:30:02 +0100 Subject: Project to improve hs_err files In-Reply-To: References: <52FB73D0.1070402@oracle.com> Message-ID: <52FBAF9A.1090206@oracle.com> On 2014-02-12 15:28, Volker Simonis wrote: > On Wed, Feb 12, 2014 at 2:14 PM, Nils Eliasson wrote: >> On 2013-09-06 13:32, Mattis Castegren wrote: >> >> ... >> >> >> * Additional Data >> >> ... >> >> >> Make sure we read the OS-context when hitting asserts in debug-builds. Today >> we only get registers, code and stack on actual crashes. Very valuable info >> that is missing when investigating many test failures. >> > I'm afraid that will be not possible. For a crash (i.e. because of a > SIGSEGV) the kernel saves the context for the PC which caused the > problem. 'assert' and 'guarantee' are macros which evaluate an > expression and call report_vm_error if that expression was 'false'. I > think the best we could do is to call 'getcontext()' after the > evaluation of the expression, but that's not actually the context > where the evaluation of the expression failed. But I agree that it may > be helpful as well. getContext() works at least on some platforms. An alternative is to force a crash and let the crash handler do the work as usual. It has been done and worked pretty well. > And you should already get a complete stack trace for 'asserts' and > 'guarantees' with the current implementation. Don't you see it?
I get a stacktrace but no raw stack from the current frame. Regards, Nilx > > Regards, > Volker > >> //Nils From volker.simonis at gmail.com Wed Feb 12 10:03:51 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 12 Feb 2014 19:03:51 +0100 Subject: Project to improve hs_err files In-Reply-To: <52FBAF9A.1090206@oracle.com> References: <52FB73D0.1070402@oracle.com> <52FBAF9A.1090206@oracle.com> Message-ID: On Wed, Feb 12, 2014 at 6:30 PM, Nils Eliasson wrote: > > > On 2014-02-12 15:28, Volker Simonis wrote: >> >> On Wed, Feb 12, 2014 at 2:14 PM, Nils Eliasson >> wrote: >>> >>> On 2013-09-06 13:32, Mattis Castegren wrote: >>> >>> ... >>> >>> >>> * Additional Data >>> >>> ... >>> >>> >>> Make sure we read the OS-context when hitting asserts in debug-builds. >>> Today >>> we only get registers, code and stack on actual crashes. Very valuable >>> info >>> that is missing when investigating many test failures. >>> >> I'm afraid that will be not possible. For a crash (i.e. because of a >> SIGSEGV) the kernel saves the context for the PC which caused the >> problem. 'assert' and 'guarantee' are macros which evaluate an >> expression and call report_vm_error if that expression was 'false'. I >> think the best we could do is to call 'getcontext()' after the >> evaluation of the expression, but that's not actually the context >> where the evaluation of the expression failed. But I agree that it may >> be helpful as well. > > > getContext() works at least on some platforms. An alternative is to force a > crash and let the crash handler do the work as usual. It has been done and > worked pretty well. > This is a good idea, but it may interfere with -XX:SuppressErrorAt. We currently only check if we should ignore an assertion in report_vm_error(). If we would crash before (i.e. right after the evaluation of the expression), -XX:SuppressErrorAt would be useless.
On the other side, if we only crash in report_vm_error(), we would get the context in that function, which would not contain the desired register values in the frame which crashed either. So we'd probably have to think how this could be implemented in the best way. > > >> And you should already get a complete stack trace for 'asserts' and >> 'guarantees' with the current implementation. Don't you see it? > > > I get a stacktrace but no raw stack from the current frame. > > Regards, > Nilx > > >> >> Regards, >> Volker >> >>> //Nils > > From jaroslav.bachorik at oracle.com Thu Feb 13 01:17:21 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 13 Feb 2014 10:17:21 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> Message-ID: <52FC8DA1.9090900@oracle.com> Hi Staffan, On 12.2.2014 18:27, Staffan Larsen wrote: > I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix? I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens. Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin. I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
-JB- > > /Staffan > > On 12 feb 2014, at 18:04, shanliang wrote: > >> Staffan Larsen wrote: >>> >>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc: >>> >>> Note that a target VM launched by a launching connector is not >>> guaranteed to be stable until after the VMStartEvent has been >>> received. >>> >> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent. >>> >>> I think adding code that looks something like this will make the test stable: >>> >>> VirtualMachine vm = launchTarget(CLASS_NAME); >>> EventQueue eventQueue = vm.eventQueue(); >>> >>> boolean started = false; >>> while(!started) { >>> EventSet eventSet = eventQueue.remove(); >>> for (Event event : eventSet) { >>> if (event instanceof VMStartEvent) { >>> started = true; >>> } >>> if (event instanceof VMDeathEvent >>> || event instanceof VMDisconnectEvent) { >>> throw new Error("VM died before it started...:"+event); >>> } >>> } >>> } >>> >>> System.out.println("Vm launched"); >>> >> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent. >> >> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done. >>> >>> >>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>> >> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it. >>> While you are in there you can also remove the unused StringBuffer near the top of main(). >>> >> Yes it was already removed in version 01 >> >> Here is the new webrev: >> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/ >> >> Thanks, >> Shanliang >>> >>> Thanks, >>> /Staffan >>> >>> On 11 feb 2014, at 18:30, shanliang wrote: >>> >>> >>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav >>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ >>>> >>>> Thanks, >>>> Shanliang >>>> >>>> shanliang wrote: >>>> >>>>> shanliang wrote: >>>>> >>>>>> Jaroslav Bachorik wrote: >>>>>> >>>>>>> On 11.2.2014 16:31, shanliang wrote: >>>>>>> >>>>>>>> Staffan Larsen wrote: >>>>>>>> >>>>>>>>> Hi Shanliang, >>>>>>>>> >>>>>>>>> I can't quite see how the test can fail in this way. When the >>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>>>>>> >>>>>>>> I am not expert of jdi so I may miss something here. I checked the >>>>>>>> failure trace and saw the report exception happen when FieldMonitor >>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>>>>>>> call "vm.resume()" before treating events. >>>>>>>> >>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. >>>>>>> >>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": >>>>>> /** >>>>>> * Continues the execution of the application running in this >>>>>> * virtual machine.
All threads are resumed as documented in >>>>>> * {@link ThreadReference#resume}. >>>>>> * >>>>>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}. >>>>>> * >>>>>> * @see #suspend >>>>>> */ >>>>>> void resume(); >>>>>> My understanding is that the debuggee resumes to work after this call, instead to die? >>>>>> >>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch. >>>>> >>>>> Shanliang >>>>> >>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>>>>> calling eventQueue.remove(); >>>>>>>> >>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. >>>>>>> >>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance. >>>>>> >>>>>> Thanks, >>>>>> Shanliang >>>>>> >>>>>>> Cheers, >>>>>>> >>>>>>> -JB- >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Shanliang >>>>>>>> >>>>>>>>> One problem I do see with the test is that it does not wait for a >>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could >>>>>>>>> cause the failure in the bug report, though. >>>>>>>>> >>>>>>>>> /Staffan >>>>>>>>> >>>>>>>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> Hi , >>>>>>>>>> >>>>>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>>>>>>> already ended. >>>>>>>>>> >>>>>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>>>>> FieldMonitor has done necessary.
The solution proposed here is that >>>>>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>>>>> TestPostFieldModification quits only after finding the file. >>>>>>>>>> >>>>>>>>>> web: >>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>>>>> >>>>>>>>>> bug: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Shanliang >>>>>>>>>> >>>>>>>> >>> >> >> > > From staffan.larsen at oracle.com Thu Feb 13 02:04:05 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 13 Feb 2014 11:04:05 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <52FC8DA1.9090900@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> <52FC8DA1.9090900@oracle.com> Message-ID: <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> On 13 feb 2014, at 10:17, Jaroslav Bachorik wrote: > Hi Staffan, > > On 12.2.2014 18:27, Staffan Larsen wrote: >> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix? I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens. > > Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin. > > I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation.
Does it even make sense to hand off an unstable VM? Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread. /S > > -JB- > >> >> /Staffan >> >> On 12 feb 2014, at 18:04, shanliang wrote: >> >>> Staffan Larsen wrote: >>>> >>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc: >>>> >>>> Note that a target VM launched by a launching connector is not >>>> guaranteed to be stable until after the VMStartEvent has been >>>> received. >>>> >>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent. >>>> >>>> I think adding code that looks something like this will make the test stable: >>>> >>>> VirtualMachine vm = launchTarget(CLASS_NAME); >>>> EventQueue eventQueue = vm.eventQueue(); >>>> >>>> boolean started = false; >>>> while(!started) { >>>> EventSet eventSet = eventQueue.remove(); >>>> for (Event event : eventSet) { >>>> if (event instanceof VMStartEvent) { >>>> started = true; >>>> } >>>> if (event instanceof VMDeathEvent >>>> || event instanceof VMDisconnectEvent) { >>>> throw new Error("VM died before it started...:"+event); >>>> } >>>> } >>>> } >>>> >>>> System.out.println("Vm launched"); >>>> >>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent. 
>>> >>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done. >>>> >>>> >>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway. >>>> >>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it. >>>> While you are in there you can also remove the unused StringBuffer near the top of main(). >>>> >>> Yes it was already removed in version 01 >>> >>> Here is the new webrev: >>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/ >>> >>> Thanks, >>> Shanliang >>>> >>>> Thanks, >>>> /Staffan >>>> >>>> On 11 feb 2014, at 18:30, shanliang wrote: >>>> >>>> >>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav >>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ >>>>> >>>>> Thanks, >>>>> Shanliang >>>>> >>>>> shanliang wrote: >>>>> >>>>>> shanliang wrote: >>>>>> >>>>>>> Jaroslav Bachorik wrote: >>>>>>> >>>>>>>> On 11.2.2014 16:31, shanliang wrote: >>>>>>>> >>>>>>>>> Staffan Larsen wrote: >>>>>>>>> >>>>>>>>>> Hi Shanliang, >>>>>>>>>> >>>>>>>>>> I can't quite see how the test can fail in this way. When the >>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>>>>>>> >>>>>>>>> I am not expert of jdi so I may miss something here. I checked the >>>>>>>>> failure trace and saw the report exception happen when FieldMonitor >>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>>>>>>>> call "vm.resume()" before treating events. >>>>>>>>> >>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death.
The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. >>>>>>>> >>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": >>>>>>> /** >>>>>>> * Continues the execution of the application running in this >>>>>>> * virtual machine. All threads are resumed as documented in >>>>>>> * {@link ThreadReference#resume}. >>>>>>> * >>>>>>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}. >>>>>>> * >>>>>>> * @see #suspend >>>>>>> */ >>>>>>> void resume(); >>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die? >>>>>>> >>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch. >>>>>> >>>>>> Shanliang >>>>>> >>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>>>>>> calling eventQueue.remove(); >>>>>>>>> >>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. >>>>>>>> >>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance. >>>>>>> >>>>>>> Thanks, >>>>>>> Shanliang >>>>>>> >>>>>>>> Cheers, >>>>>>>> >>>>>>>> -JB- >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Shanliang >>>>>>>>> >>>>>>>>>> One problem I do see with the test is that it does not wait for a >>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could >>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>> >>>>>>>>>> /Staffan >>>>>>>>>> >>>>>>>>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Hi , >>>>>>>>>>> >>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>>>>>>>> already ended. >>>>>>>>>>> >>>>>>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>>>>>> TestPostFieldModification quits only after finding the file. >>>>>>>>>>> >>>>>>>>>>> web: >>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>>>>>> >>>>>>>>>>> bug: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Shanliang >>>>>>>>>>> >>>>>>>>> >>>> >>> >>> >> >> > From shanliang.jiang at oracle.com Thu Feb 13 02:36:04 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Thu, 13 Feb 2014 11:36:04 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> <52FC8DA1.9090900@oracle.com> <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> Message-ID: <52FCA014.9000106@oracle.com> Staffan Larsen wrote: > On 13 feb 2014, at 10:17, Jaroslav Bachorik wrote: > > >> Hi Staffan, >> >> On 12.2.2014 18:27, Staffan Larsen wrote: >> >>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points.
The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can?t quite explain what goes wrong without the fix? I?d really like to understand that. I?ll try to dig deeper and see if I can understand exactly what happens. >>> >> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin. >> >> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM? >> > > Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread. > The test failed when it received ClassPrepareEvent and did addFieldWatch, that meant the test must receive already VMStartEvent, because VMStartEvent must be the first event, if it was true then the vm must be already stable when failing. Except that the test received ClassPrepareEvent before VMStartEvent then it was doing addFieldWatch with a possibly unstable VM. in this case we might have a serious bug in VirtualMachine implementation, and if this is true the fix proposed to check "start" may make miss ClassPrepareEvent, then the test would test nothing. Shanliang > /S > > >> -JB- >> >> >>> /Staffan >>> >>> On 12 feb 2014, at 18:04, shanliang wrote: >>> >>> >>>> Staffan Larsen wrote: >>>> >>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc: >>>>> >>>>> Note that a target VM launched by a launching connector is not >>>>> guaranteed to be stable until after the VMStartEvent has been >>>>> received. >>>>> >>>>> >>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent. 
>>>> >>>>> I think adding code that looks something like this will make the test stable: >>>>> >>>>> VirtualMachine vm = launchTarget(CLASS_NAME); >>>>> EventQueue eventQueue = vm.eventQueue(); >>>>> >>>>> boolean started = false; >>>>> while(!started) { >>>>> EventSet eventSet = eventQueue.remove(); >>>>> for (Event event : eventSet) { >>>>> if (event instanceof VMStartEvent) { >>>>> started = true; >>>>> } >>>>> if (event instanceof VMDeathEvent >>>>> || event instanceof VMDisconnectEvent) { >>>>> throw new Error("VM died before it started...:"+event); >>>>> } >>>>> } >>>>> } >>>>> >>>>> System.out.println("Vm launched"); >>>>> >>>>> >>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent. >>>> >>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done. >>>> >>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm..classesByName() will just return an empty list anyway. >>>>> >>>>> >>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it. >>>> >>>>> While you are in there you can also remove the unused StringBuffer near the top of main(). 
>>>>> >>>>> >>>> Yes it was already removed in version 01 >>>> >>>> Here is the new webrev: >>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/ >>>> >>>> Thanks, >>>> Shanliang >>>> >>>>> Thanks, >>>>> /Staffan >>>>> >>>>> On 11 feb 2014, at 18:30, shanliang wrote: >>>>> >>>>> >>>>> >>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested bu Jaroslav >>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ >>>>>> >>>>>> Thanks, >>>>>> Shanliang >>>>>> >>>>>> shanliang wrote: >>>>>> >>>>>> >>>>>>> shanliang wrote: >>>>>>> >>>>>>> >>>>>>>> Jaroslav Bachorik wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On 11.2.2014 16:31, shanliang wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> Staffan Larsen wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Hi Shanliang, >>>>>>>>>>> >>>>>>>>>>> I can?t quite see how the test can fail in this way. When the >>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the >>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor >>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>>>>>>>>> call "vm.resume()" before treating events. >>>>>>>>>> >>>>>>>>>> >>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a deamon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. >>>>>>>>> >>>>>>>>> >>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": >>>>>>>> /** >>>>>>>> * Continues the execution of the application running in this >>>>>>>> * virtual machine. All threads are resumed as documented in >>>>>>>> * {@link ThreadReference#resume}. 
>>>>>>>> * >>>>>>>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}. >>>>>>>> * >>>>>>>> * @see #suspend >>>>>>>> */ >>>>>>>> void resume(); >>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die? >>>>>>>> >>>>>>>> >>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch. >>>>>>> >>>>>>> Shanliang >>>>>>> >>>>>>> >>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>>>>>>> calling eventQueue.remove(); >>>>>>>>>> >>>>>>>>>> >>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOuptuptStream() to write and flush a message for the debugee indicating that it can exit. And in the debugee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. >>>>>>>>> >>>>>>>>> >>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Shanliang >>>>>>>> >>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> >>>>>>>>> -JB- >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Shanliang >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> One problem I do see with the test is that it does not wait for a >>>>>>>>>>> VMStartEvent before setting up requests. I?m not sure if that could >>>>>>>>>>> cause the failure in the bug report, though. >>>>>>>>>>> >>>>>>>>>>> /Staffan >>>>>>>>>>> >>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Hi , >>>>>>>>>>>> >>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>>>>>>>>> already ended. 
>>>>>>>>>>>> >>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>>>>>>> TestPostFieldModification quits only after finding the file. >>>>>>>>>>>> >>>>>>>>>>>> web: >>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>>>>>>> >>>>>>>>>>>> bug: >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Shanliang >>>>>>>>>>>> >>>>>>>>>>>> >>>> >>> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140213/2ce2f159/attachment-0001.html From mattias.tobiasson at oracle.com Thu Feb 13 03:57:40 2014 From: mattias.tobiasson at oracle.com (Mattias Tobiasson) Date: Thu, 13 Feb 2014 03:57:40 -0800 (PST) Subject: RFR: 6545422 NativeErrors.java uses wrong path name in exec Message-ID: <850898d0-8094-4769-8ef6-b0b5d8b2dbad@default> Hi, Could you please sponsor and submit this patch? Mattias ----- Original Message ----- From: staffan.larsen at oracle.com To: mattias.tobiasson at oracle.com Cc: serviceability-dev at openjdk.java.net Sent: Monday, February 10, 2014 2:38:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna Subject: Re: RFR: 6545422 NativeErrors.java uses wrong path name in exec And a looong time later: Looks good! Thanks, /Staffan On 4 dec 2013, at 15:38, Mattias Tobiasson wrote: > Hi, > Could someone please review this patch? > > webrev: http://cr.openjdk.java.net/~ykantser/6545422/webrev.00 > bug: https://bugs.openjdk.java.net/browse/JDK-6545422 > > Changes: > Uses common testlibrary function to find native2ascii binary. > Check null values instead of throwing NullPointerException at different locations. The old version checked null with "assert", which is not always enabled. 
> > > Mattias -------------- next part -------------- A non-text attachment was scrubbed... Name: 6545422.patch Type: text/x-patch Size: 6890 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140213/4c2b41df/6545422.patch From Alan.Bateman at oracle.com Thu Feb 13 05:18:03 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 13 Feb 2014 13:18:03 +0000 Subject: 8034856/8034857: More gcc warnings Message-ID: <52FCC60B.6040205@oracle.com> The number of native code warnings in the build is annoying so this is another drive-by fix that eliminates a few of them in the serviceability and security areas. The webrev with the changes is here: http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/ In the pkcs11 code the issue is the function prototypes for the throwXXX functions aren't included. This is fixed by including pkcs11wrapper.h but that exposes another issue with the header file includes that needed to be fixed. In JarFacade the issue is that it uses isspace but doesn't include the ctype.h For LinuxOperatingSystem.c then there are 12 warnings related to fscanf usages where the format specifier is %lld and the code wants to read into a uint64_t. I've changed the format specifier to"%"SCNd64 so that it matches uint64_t and should be okay on both 32 and 64-bit. Thanks, Alan. 
From staffan.larsen at oracle.com Thu Feb 13 05:17:24 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 13 Feb 2014 14:17:24 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FCA014.9000106@oracle.com>
References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> <52FC8DA1.9090900@oracle.com> <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> <52FCA014.9000106@oracle.com>
Message-ID:

I think I understand what happens now.

The test code, simplified, looks like this (with the Thread.sleep() added that causes the test to fail):

    launchTarget();
    addClassWatch();
    vm.resume();
    Thread.sleep(1000);
    while(connected) {
        eventSet = eventQueue.remove();
        for(event : eventSet) {
            if (event instanceof ClassPrepareEvent) {
                addFieldWatch();
            }
        }
        eventSet.resume();
    }

By default all events that happen will cause the debuggee to suspend (see EventRequest.setSuspendPolicy()). Thus when we get to addFieldWatch(), the VM should be suspended and we should be able to create the field watch without problem. But the VM isn't suspended and that is why the test fails.

Why isn't the VM suspended? When we get to the "for(event : eventSet)" the first time there are *two* events already in the queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM is suspended and everything is good. We look at the first eventSet which only contains the VMStartEvent, we ignore the event, but we resume the VM. We then loop and look at the ClassPrepareEvent, but by now the VM is already running and has also terminated. Failure.

Thus, we need to handle the VMStartEvent.
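
The ordering analysed above can be replayed in a toy model. What follows is only a sketch in plain Java, not JDI: ToyDebuggee, Kind, originalOrder() and proposedOrder() are hypothetical names, the suspend count is a simplification of how the target tracks suspensions, and the model assumes the debuggee exits as soon as it runs with nothing left to report (which is what the added Thread.sleep() makes visible in the real test).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model (NOT JDI) of the event ordering discussed in this thread. All names are hypothetical. */
public class SuspendRaceSketch {
    enum Kind { VM_START, CLASS_PREPARE }

    /** Stand-in for the debuggee; every posted event suspends it once. */
    static final class ToyDebuggee {
        final Deque<Kind> queue = new ArrayDeque<>();
        int suspendCount = 0;
        boolean classWatchInstalled = false;
        boolean classPrepared = false;
        boolean exited = false;

        /** Launching posts the start event and leaves the target suspended. */
        void launch() { post(Kind.VM_START); }

        private void post(Kind kind) {
            queue.add(kind);
            suspendCount++;
        }

        /** Undo one suspension; the target runs again once the count reaches zero. */
        void resume() {
            if (suspendCount > 0) suspendCount--;
            if (suspendCount == 0 && !exited) run();
        }

        /** Run until the next reportable event, or exit if there is none. */
        private void run() {
            if (!classPrepared) {
                classPrepared = true;
                if (classWatchInstalled) {
                    post(Kind.CLASS_PREPARE);
                    return;
                }
            }
            exited = true; // assumption: the target finishes almost immediately
        }
    }

    /** Original order: install the class watch, resume, then start reading events. */
    public static boolean originalOrder() {
        ToyDebuggee vm = new ToyDebuggee();
        vm.launch();
        vm.classWatchInstalled = true;
        vm.resume(); // target runs and posts CLASS_PREPARE behind VM_START
        return drain(vm, false);
    }

    /** Proposed order: do nothing until the start event has been handled. */
    public static boolean proposedOrder() {
        ToyDebuggee vm = new ToyDebuggee();
        vm.launch();
        return drain(vm, true);
    }

    /** Returns true if the field watch was added while the target was still alive. */
    static boolean drain(ToyDebuggee vm, boolean handleStart) {
        boolean fieldWatchAdded = false;
        while (!vm.queue.isEmpty()) {
            Kind kind = vm.queue.remove();
            if (kind == Kind.VM_START && handleStart) {
                vm.classWatchInstalled = true;
            }
            if (kind == Kind.CLASS_PREPARE) {
                fieldWatchAdded = !vm.exited; // fails if the target is already gone
            }
            vm.resume(); // models eventSet.resume()
        }
        return fieldWatchAdded;
    }

    public static void main(String[] args) {
        System.out.println("original order: field watch added = " + originalOrder());
        System.out.println("proposed order: field watch added = " + proposedOrder());
    }
}
```

In this model the original order reports false, because resuming after the ignored start event lets the target exit before the ClassPrepareEvent is processed, and the order that handles the start event first reports true.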
I suggest a modification to my previous code:

    launchTarget();
    while(connected) {
        eventSet = eventQueue.remove();
        for(event : eventSet) {
            if (event instanceof VMStartEvent) {
                addClassWatch();
            }
            if (event instanceof ClassPrepareEvent) {
                addFieldWatch();
            }
        }
        eventSet.resume();
    }

This will cause us to have complete control over the state of the debuggee. The first event we see will be the VMStartEvent. The VM will be suspended. We can add a class watch here. Then we resume the VM. The second event we see will be the ClassPrepareEvent with the VM suspended. We can add the field watch. Then we resume the VM and wait for the field watch events.

Thanks,
/Staffan

On 13 feb 2014, at 11:36, shanliang wrote:

> Staffan Larsen wrote:
>>
>> On 13 feb 2014, at 10:17, Jaroslav Bachorik wrote:
>>
>>
>>> Hi Staffan,
>>>
>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>
>>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix? I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>>
>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>
>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>
>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>> > The test failed when it received ClassPrepareEvent and did addFieldWatch, that meant the test must receive already VMStartEvent, because VMStartEvent must be the first event, if it was true then the vm must be already stable when failing. > > Except that the test received ClassPrepareEvent before VMStartEvent then it was doing addFieldWatch with a possibly unstable VM. in this case we might have a serious bug in VirtualMachine implementation, and if this is true the fix proposed to check "start" may make miss ClassPrepareEvent, then the test would test nothing. > > Shanliang >> /S >> >> >>> -JB- >>> >>> >>>> /Staffan >>>> >>>> On 12 feb 2014, at 18:04, shanliang wrote: >>>> >>>> >>>>> Staffan Larsen wrote: >>>>> >>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc: >>>>>> >>>>>> Note that a target VM launched by a launching connector is not >>>>>> guaranteed to be stable until after the VMStartEvent has been >>>>>> received. >>>>>> >>>>>> >>>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent. 
>>>>> >>>>>> I think adding code that looks something like this will make the test stable: >>>>>> >>>>>> VirtualMachine vm = launchTarget(CLASS_NAME); >>>>>> EventQueue eventQueue = vm.eventQueue(); >>>>>> >>>>>> boolean started = false; >>>>>> while(!started) { >>>>>> EventSet eventSet = eventQueue.remove(); >>>>>> for (Event event : eventSet) { >>>>>> if (event instanceof VMStartEvent) { >>>>>> started = true; >>>>>> } >>>>>> if (event instanceof VMDeathEvent >>>>>> || event instanceof VMDisconnectEvent) { >>>>>> throw new Error("VM died before it started...:"+event); >>>>>> } >>>>>> } >>>>>> } >>>>>> >>>>>> System.out.println("Vm launched"); >>>>>> >>>>>> >>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent. >>>>> >>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done. >>>>> >>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm..classesByName() will just return an empty list anyway. >>>>>> >>>>>> >>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it. >>>>> >>>>>> While you are in there you can also remove the unused StringBuffer near the top of main(). 
>>>>>> >>>>>> >>>>> Yes it was already removed in version 01 >>>>> >>>>> Here is the new webrev: >>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/ >>>>> >>>>> Thanks, >>>>> Shanliang >>>>> >>>>>> Thanks, >>>>>> /Staffan >>>>>> >>>>>> On 11 feb 2014, at 18:30, shanliang wrote: >>>>>> >>>>>> >>>>>> >>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested bu Jaroslav >>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ >>>>>>> >>>>>>> Thanks, >>>>>>> Shanliang >>>>>>> >>>>>>> shanliang wrote: >>>>>>> >>>>>>> >>>>>>>> shanliang wrote: >>>>>>>> >>>>>>>> >>>>>>>>> Jaroslav Bachorik wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> On 11.2.2014 16:31, shanliang wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Staffan Larsen wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Hi Shanliang, >>>>>>>>>>>> >>>>>>>>>>>> I can?t quite see how the test can fail in this way. When the >>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the >>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor >>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>>>>>>>>>> call "vm.resume()" before treating events. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a deamon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. >>>>>>>>>> >>>>>>>>>> >>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": >>>>>>>>> /** >>>>>>>>> * Continues the execution of the application running in this >>>>>>>>> * virtual machine. 
All threads are resumed as documented in >>>>>>>>> * {@link ThreadReference#resume}. >>>>>>>>> * >>>>>>>>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}. >>>>>>>>> * >>>>>>>>> * @see #suspend >>>>>>>>> */ >>>>>>>>> void resume(); >>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die? >>>>>>>>> >>>>>>>>> >>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch. >>>>>>>> >>>>>>>> Shanliang >>>>>>>> >>>>>>>> >>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before >>>>>>>>>>> calling eventQueue.remove(); >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOuptuptStream() to write and flush a message for the debugee indicating that it can exit. And in the debugee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. >>>>>>>>>> >>>>>>>>>> >>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Shanliang >>>>>>>>> >>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> >>>>>>>>>> -JB- >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Shanliang >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> One problem I do see with the test is that it does not wait for a >>>>>>>>>>>> VMStartEvent before setting up requests. I?m not sure if that could >>>>>>>>>>>> cause the failure in the bug report, though. 
>>>>>>>>>>>> >>>>>>>>>>>> /Staffan >>>>>>>>>>>> >>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Hi , >>>>>>>>>>>>> >>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>>>>>>>>>> already ended. >>>>>>>>>>>>> >>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>>>>>>>> TestPostFieldModification quits only after finding the file. >>>>>>>>>>>>> >>>>>>>>>>>>> web: >>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>>>>>>>> >>>>>>>>>>>>> bug: >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Shanliang >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>> >>>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140213/9d1ca03f/attachment-0001.html From staffan.larsen at oracle.com Thu Feb 13 05:24:39 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 13 Feb 2014 14:24:39 +0100 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <52FCC60B.6040205@oracle.com> References: <52FCC60B.6040205@oracle.com> Message-ID: Changes look good. /Staffan On 13 feb 2014, at 14:18, Alan Bateman wrote: > > The number of native code warnings in the build is annoying so this is another drive-by fix that eliminates a few of them in the serviceability and security areas. The webrev with the changes is here: > > http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/ > > In the pkcs11 code the issue is the function prototypes for the throwXXX functions aren't included. 
This is fixed by including pkcs11wrapper.h but that exposes another issue with the header file includes that needed to be fixed. > > In JarFacade the issue is that it uses isspace but doesn't include the ctype.h > > For LinuxOperatingSystem.c then there are 12 warnings related to fscanf usages where the format specifier is %lld and the code wants to read into a uint64_t. I've changed the format specifier to"%"SCNd64 so that it matches uint64_t and should be okay on both 32 and 64-bit. > > Thanks, > Alan. From sean.mullan at oracle.com Thu Feb 13 06:05:49 2014 From: sean.mullan at oracle.com (Sean Mullan) Date: Thu, 13 Feb 2014 09:05:49 -0500 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <52FCC60B.6040205@oracle.com> References: <52FCC60B.6040205@oracle.com> Message-ID: <52FCD13D.9040605@oracle.com> Looks fine to me. --Sean On 02/13/2014 08:18 AM, Alan Bateman wrote: > > The number of native code warnings in the build is annoying so this is > another drive-by fix that eliminates a few of them in the serviceability > and security areas. The webrev with the changes is here: > > http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/ > > In the pkcs11 code the issue is the function prototypes for the throwXXX > functions aren't included. This is fixed by including pkcs11wrapper.h > but that exposes another issue with the header file includes that needed > to be fixed. > > In JarFacade the issue is that it uses isspace but doesn't include the > ctype.h > > For LinuxOperatingSystem.c then there are 12 warnings related to fscanf > usages where the format specifier is %lld and the code wants to read > into a uint64_t. I've changed the format specifier to"%"SCNd64 so that > it matches uint64_t and should be okay on both 32 and 64-bit. > > Thanks, > Alan. From daniel.daugherty at oracle.com Thu Feb 13 09:07:40 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 13 Feb 2014 10:07:40 -0700 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> Message-ID: <52FCFBDC.4020602@oracle.com> Adding serviceability-dev at openjdk.java.net since JVM/TI belongs to the Serviceability team. Please make sure that replies to this review thread include both aliases for completeness. Dan On 2/13/14 9:57 AM, Siebenborn, Axel wrote: > > Hi all, > > could I have a review for the following change? > > It adds code to iterate over the slots of the expression stack. > > Found object references are used as heap roots, the same way as object > references in locals slots. > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8034867 > > Webrev: > > http://sapjvm.com/as/webrevs/8034867 > > Thanks, > > Axel > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140213/490f7e03/attachment.html From shanliang.jiang at oracle.com Thu Feb 13 09:15:38 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Thu, 13 Feb 2014 18:15:38 +0100 Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed In-Reply-To: References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> <52FC8DA1.9090900@oracle.com> <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> <52FCA014.9000106@oracle.com> Message-ID: <52FCFDBA.5040108@oracle.com> Staffan, Very nice analysis! 
The fix should be very simple: just remove line

    108     vm.resume

It is an error because at that point the test has not yet handled the events in eventSet. Line

    136     eventSet.resume();

is the right place to resume the threads, after the events have been handled.

Here is the new webrev:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/

Thanks,
Shanliang

Staffan Larsen wrote:
> I think I understand what happens now.
>
> The test code, simplified, looks like this (with the Thread.sleep()
> added that causes the test to fail):
>
> launchTarget();
> addClassWatch();
> vm.resume();
> Thread.sleep(1000);
> while(connected) {
>     eventSet = eventQueue.remove();
>     for(event : eventSet) {
>         if (event instanceof ClassPrepareEvent) {
>             addFieldWatch();
>         }
>     }
>     eventSet.resume();
> }
>
> By default all events that happen will cause the debuggee to suspend
> (see EventRequest.setSuspendPolicy()). Thus when we get to
> addFieldWatch(), the VM should be suspended and we should be able to
> create the field watch without problem. But the VM isn't suspended and
> that is why the test fails.
>
> Why isn't the VM suspended? When we get to the "for(event :
> eventSet)" the first time there are *two* events already in the
> queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM
> is suspended and everything is good. We look at the first eventSet
> which only contains the VMStartEvent, we ignore the event, but we
> resume the VM. We then loop and look at the ClassPrepareEvent, but by
> now the VM is already running and has also terminated. Failure.
>
> Thus, we need to handle the VMStartEvent. I suggest a modification to
> my previous code:
>
> launchTarget();
> while(connected) {
>     eventSet = eventQueue.remove();
>     for(event : eventSet) {
>         if (event instanceof VMStartEvent) {
>             addClassWatch();
>         }
>         if (event instanceof ClassPrepareEvent) {
>             addFieldWatch();
>         }
>     }
>     eventSet.resume();
> }
>
> This will cause us to have complete control over the state of the
> debuggee.
From mikael.vidstedt at oracle.com Thu Feb 13 09:56:48 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Thu, 13 Feb 2014 09:56:48 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <52FCC60B.6040205@oracle.com>
References: <52FCC60B.6040205@oracle.com>
Message-ID: <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>

Alan,

I made the change to JarFacade.c myself last week, only to then see the comment a few lines above where you added the new include. It seems to indicate that including ctype.h on Solaris/SPARC is a bad idea. I have no idea if the comment is still relevant, but that may be worth understanding first.

Cheers,
Mikael

> On Feb 13, 2014, at 5:18, Alan Bateman wrote:
>
> The number of native code warnings in the build is annoying so this is another drive-by fix that eliminates a few of them in the serviceability and security areas. The webrev with the changes is here:
>
> http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/
>
> In the pkcs11 code the issue is the function prototypes for the throwXXX functions aren't included. This is fixed by including pkcs11wrapper.h but that exposes another issue with the header file includes that needed to be fixed.
>
> In JarFacade the issue is that it uses isspace but doesn't include the ctype.h
>
> For LinuxOperatingSystem.c then there are 12 warnings related to fscanf usages where the format specifier is %lld and the code wants to read into a uint64_t. I've changed the format specifier to"%"SCNd64 so that it matches uint64_t and should be okay on both 32 and 64-bit.
>
> Thanks,
> Alan.
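Editorial sketch, not part of the thread: the format-specifier point in the quoted note can be illustrated with nothing more than standard C99 <inttypes.h>. The helper name parse_cpu_ticks and the sample "cpu ..." line are hypothetical and are not taken from LinuxOperatingSystem.c. Note that SCNu64 is the unsigned macro that pairs with uint64_t, while SCNd64 is its signed counterpart for int64_t; the sketch uses sscanf for ease of testing, but fscanf accepts the same conversion specifiers.

```c
#include <inttypes.h>  /* SCNu64 */
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (an assumption for illustration only): parse two
 * unsigned 64-bit tick counters from a line shaped like "cpu 4705 356".
 * Returns 0 on success and -1 if either field could not be read.
 */
static int parse_cpu_ticks(const char *line, uint64_t *user, uint64_t *nice)
{
    /* "%" SCNu64 expands to the correct length modifier for uint64_t on
     * the target platform, whereas a hard-coded "%lld" tells the compiler
     * to expect a pointer to (signed) long long. */
    int matched = sscanf(line, "cpu %" SCNu64 " %" SCNu64, user, nice);
    return matched == 2 ? 0 : -1;
}
```

Because the macro is resolved per platform, the same source should compile without format warnings on both 32-bit and 64-bit builds.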
From Alan.Bateman at oracle.com Thu Feb 13 10:23:30 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 13 Feb 2014 18:23:30 +0000
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
Message-ID: <52FD0DA2.7030201@oracle.com>

On 13/02/2014 17:56, Mikael Vidstedt wrote:
> Alan,
>
> I made the change to JarFacade.c myself last week, only to then see the comment a few lines above where you added the new include. It seems to indicate that including ctype.h on Solaris/SPARC is a bad idea. I have no idea if the comment is still relevant, but that may be worth understanding first.
>
Do you have cycles to look into it? As the code is using isspace already, it's not clear (unless there are different versions). Before pushing the changes I ran the tests on all platforms (including Solaris), and the j.l.i tests include a number of tests that exercise these manifest attributes with non-US characters.

As an aside, the native code warnings coming from the jdk repository are really annoying so this is the reason for the drive-by fixes when I get a few minutes. I think others are doing the same.

-Alan.
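Editorial sketch, not part of the thread: as background to the isspace question, one general C hazard is that the <ctype.h> classification functions are only defined for EOF and for values representable as unsigned char, while plain char may be signed. Nothing in this thread establishes that this is what the JarFacade.c comment refers to; the helper name skip_leading_space below is hypothetical and only shows the usual defensive cast when the input may contain non-ASCII bytes.

```c
#include <ctype.h>

/*
 * Hypothetical helper (an assumption for illustration only): return a
 * pointer to the first byte of s that is not whitespace.
 * The cast to unsigned char matters: where plain char is signed, a byte
 * >= 0x80 (for example from a UTF-8 encoded class name) would otherwise
 * reach isspace() as a negative value, which is undefined behaviour.
 */
static const char *skip_leading_space(const char *s)
{
    while (*s != '\0' && isspace((unsigned char)*s)) {
        s++;
    }
    return s;
}
```

With the cast in place the function behaves the same whether char is signed or unsigned on the target platform.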
From shanliang.jiang at oracle.com Thu Feb 13 12:15:20 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Thu, 13 Feb 2014 21:15:20 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FCFDBA.5040108@oracle.com>
References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> <52FC8DA1.9090900@oracle.com> <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> <52FCA014.9000106@oracle.com> <52FCFDBA.5040108@oracle.com>
Message-ID: <52FD27D8.8060200@oracle.com>

Hi,

Here is Version 4:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/

1) remove the line
   108 vm.resume()
2) call addClassWatch(vm) only when receiving VMStartEvent
3) make sure that the test receives ModificationWatchpointEvent
4) clean

Thanks,
Shanliang

shanliang wrote:
> Staffan,
>
> Very nice analysis!
>
> The fix must be very simple, just remove the line
> 108 vm.resume
> it is an error because here the test does not yet treat the events in
> eventSet.
>
> the line
> 136 eventSet.resume();
> is the right place to resume the threads after event treatment.
>
> Here is the new webrev:
> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/
>
> Thanks,
> Shanliang
>
> Staffan Larsen wrote:
>> I think I understand what happens now.
>>
>> The test code, simplified, looks like this (with the Thread.sleep()
>> added that causes the test to fail):
>>
>> launchTarget();
>> addClassWatch();
>> vm.resume();
>> Thread.sleep(1000);
>> while(connected) {
>>     eventSet = eventQueue.remove();
>>     for(event : eventSet) {
>>         if (event instanceof ClassPrepareEvent) {
>>             addFieldWatch();
>>         }
>>     }
>>     eventSet.resume();
>> }
>>
>> By default all events that happen will cause the debuggee to suspend
>> (see EventRequest.setSuspendPolicy()). Thus when we get to
>> addFieldWatch(), the vm should be suspended and we should be able to
>> create the field watch without problem. But the VM isn't suspended
>> and that is why the test fails.
>>
>> Why isn't the VM suspended? When we get to the "for(event :
>> eventSet)" the first time there are *two* events already in the
>> queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM
>> is suspended and everything is good. We look at the first eventSet
>> which only contains the VMStartEvent, we ignore the event, but we
>> resume the VM. We then loop and look at the ClassPrepareEvent, but by
>> now the VM is already running and has also terminated. Failure.
>>
>> Thus, we need to handle the VMStartEvent. I suggest a modification to
>> my previous code:
>>
>> launchTarget();
>> while(connected) {
>>     eventSet = eventQueue.remove();
>>     for(event : eventSet) {
>>         if (event instanceof VMStartEvent) {
>>             addClassWatch();
>>         }
>>         if (event instanceof ClassPrepareEvent) {
>>             addFieldWatch();
>>         }
>>     }
>>     eventSet.resume();
>> }
>>
>> This will cause us to have complete control over the state of the
>> debuggee. The first event we see will be the VMStartEvent. The VM
>> will be suspended. We can add a class watch here. Then we resume the
>> VM. The second event we see will be the ClassPrepareEvent with the VM
>> suspended. We can add the field watch. Then we resume the VM and wait
>> for the field watch events.
From mikael.vidstedt at oracle.com Thu Feb 13 13:14:45 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Thu, 13 Feb 2014 13:14:45 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <52FD0DA2.7030201@oracle.com>
References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com>
Message-ID: <52FD35C5.1000306@oracle.com>

On 2014-02-13 10:23, Alan Bateman wrote:
> On 13/02/2014 17:56, Mikael Vidstedt wrote:
>> Alan,
>>
>> I made the change to JarFacade.c myself last week, only to then see
>> the comment a few lines above where you added the new include. It
>> seems to indicate that including ctype.h on Solaris/SPARC is a bad
>> idea. I have no idea if the comment is still relevant, but that may
>> be worth understanding first.
>>
> Do you have cycles to look into it? As the code is using isspace
> already then it's not clear (unless there are different versions).
> Before pushing the changes then I ran the tests on all platforms
> (including Solaris) and the j.l.i tests include a number of tests
> exercise these manifest attributes with a non-US characters.

The change in question appears to come from https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the bug gives enough additional information. My speculation (and it's really just a speculation) is that it's not related to isspace per-se, but to something else which gets defined/redefined/undefined by including ctype.h. I guess it would be good to know if we have tests which cover the thing the comment is alluding to (non-ascii in Premain-Class).

> As an aside, the native code warnings coming from the jdk repository
> are really annoying so this is the reason for the drive-by fixes when
> I get a few minutes. I think others are doing the same.

Absolutely support this work!
As a matter of fact I have a couple of changes in a sandbox I should send out for review.

Cheers,
Mikael

From serguei.spitsyn at oracle.com Thu Feb 13 16:01:07 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 13 Feb 2014 16:01:07 -0800
Subject: RFR (S) 8034249: need more workarounds for suspend equivalent condition issue
Message-ID: <52FD5CC3.4090502@oracle.com>

Please, review the fix for:
https://bugs.openjdk.java.net/browse/JDK-8034249

Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1

Summary:

This issue was identified in the review of the 8032223 and it is similar to the 8032223 but impacts different JVMTI functions:
  GetCurrentContendedMonitor, GetOwnedMonitorInfo, GetOwnedMonitorStackDepthInfo, GetStackTrace

There is a general issue in the suspend equivalent condition mechanism:
Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may return different results:
  - 1-st: true
  - 2-nd: false

This suspend equivalent issue is covered by another bug:
https://bugs.openjdk.java.net/browse/JDK-6280037

This fix is to work around the 6280037.
It is safer to collect the necessary information at a safepoint instead of relying on the suspension of the target thread.
Testing:
In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi

Thanks,
Serguei

From staffan.larsen at oracle.com Fri Feb 14 00:45:04 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 14 Feb 2014 09:45:04 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FD27D8.8060200@oracle.com>
References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> <52FC8DA1.9090900@oracle.com> <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> <52FCA014.9000106@oracle.com> <52FCFDBA.5040108@oracle.com> <52FD27D8.8060200@oracle.com>
Message-ID: <28F9E62B-3A20-4BFB-AA40-9568C8BEDBC2@oracle.com>

This version looks good! Thanks for hanging in there.

The only improvement would be to count and verify the number of ModificationWatchpointEvent (there should be 10).

Thanks,
/Staffan

On 13 feb 2014, at 21:15, shanliang wrote:

> Hi,
>
> Here is Version 4:
> http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/
>
> 1) remove the line
>    108 vm.resume()
> 2) call addClassWatch(vm) only when receiving VMStartEvent
> 3) make sure that the test receives ModificationWatchpointEvent
> 4) clean
>
> Thanks,
> Shanliang
>
> shanliang wrote:
>>
>> Staffan,
>>
>> Very nice analysis!
>>
>> The fix must be very simple, just remove the line
>> 108 vm.resume
>> it is an error because here the test does not yet treat the events in eventSet.
>>
>> the line
>> 136 eventSet.resume();
>> is the right place to resume the threads after event treatment.
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/
>>
>> Thanks,
>> Shanliang
>>
>> Staffan Larsen wrote:
>>>
>>> I think I understand what happens now.
>>>
>>> The test code, simplified, looks like this (with the Thread.sleep() added that causes the test to fail):
>>>
>>> launchTarget();
>>> addClassWatch();
>>> vm.resume();
>>> Thread.sleep(1000);
>>> while(connected) {
>>>     eventSet = eventQueue.remove();
>>>     for(event : eventSet) {
>>>         if (event instanceof ClassPrepareEvent) {
>>>             addFieldWatch();
>>>         }
>>>     }
>>>     eventSet.resume();
>>> }
>>>
>>> By default all events that happen will cause the debuggee to suspend (see EventRequest.setSuspendPolicy()). Thus when we get to addFieldWatch(), the vm should be suspended and we should be able to create the field watch without problem. But the VM isn't suspended and that is why the test fails.
>>>
>>> Why isn't the VM suspended? When we get to the "for(event : eventSet)" the first time there are *two* events already in the queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM is suspended and everything is good. We look at the first eventSet which only contains the VMStartEvent, we ignore the event, but we resume the VM. We then loop and look at the ClassPrepareEvent, but by now the VM is already running and has also terminated. Failure.
>>>
>>> Thus, we need to handle the VMStartEvent. I suggest a modification to my previous code:
>>>
>>> launchTarget();
>>> while(connected) {
>>>     eventSet = eventQueue.remove();
>>>     for(event : eventSet) {
>>>         if (event instanceof VMStartEvent) {
>>>             addClassWatch();
>>>         }
>>>         if (event instanceof ClassPrepareEvent) {
>>>             addFieldWatch();
>>>         }
>>>     }
>>>     eventSet.resume();
>>> }
>>>
>>> This will cause us to have complete control over the state of the debuggee. The first event we see will be the VMStartEvent. The VM will be suspended. We can add a class watch here. Then we resume the VM. The second event we see will be the ClassPrepareEvent with the VM suspended. We can add the field watch. Then we resume the VM and wait for the field watch events.
>>>
>>> Thanks,
>>> /Staffan
>>>
>>> On 13 feb 2014, at 11:36, shanliang wrote:
>>>
>>>> Staffan Larsen wrote:
>>>>>
>>>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik wrote:
>>>>>
>>>>>> Hi Staffan,
>>>>>>
>>>>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>>>>
>>>>>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>>>>>
>>>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>>>>
>>>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>>>>
>>>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>>>>
>>>> The test failed when it received ClassPrepareEvent and did addFieldWatch, that meant the test must have already received VMStartEvent, because VMStartEvent must be the first event, if it was true then the vm must be already stable when failing.
>>>>
>>>> Except that the test received ClassPrepareEvent before VMStartEvent then it was doing addFieldWatch with a possibly unstable VM. In this case we might have a serious bug in VirtualMachine implementation, and if this is true the fix proposed to check "start" may make it miss ClassPrepareEvent, then the test would test nothing.
>>>>
>>>> Shanliang
>>>>> /S
>>>>>
>>>>>> -JB-
>>>>>>
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 12 feb 2014, at 18:04, shanliang wrote:
>>>>>>>
>>>>>>>> Staffan Larsen wrote:
>>>>>>>>
>>>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>>>>>
>>>>>>>>> Note that a target VM launched by a launching connector is not
>>>>>>>>> guaranteed to be stable until after the VMStartEvent has been
>>>>>>>>> received.
>>>>>>>>>
>>>>>>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must have already received VMStartEvent.
>>>>>>>>
>>>>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>>>>>
>>>>>>>>> VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>>>> EventQueue eventQueue = vm.eventQueue();
>>>>>>>>>
>>>>>>>>> boolean started = false;
>>>>>>>>> while(!started) {
>>>>>>>>>     EventSet eventSet = eventQueue.remove();
>>>>>>>>>     for (Event event : eventSet) {
>>>>>>>>>         if (event instanceof VMStartEvent) {
>>>>>>>>>             started = true;
>>>>>>>>>         }
>>>>>>>>>         if (event instanceof VMDeathEvent
>>>>>>>>>                 || event instanceof VMDisconnectEvent) {
>>>>>>>>>             throw new Error("VM died before it started...:"+event);
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> System.out.println("Vm launched");
>>>>>>>>>
>>>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>>>>>
>>>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>>>>
>>>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I do not know why the test called addFieldWatch before the ClassPrepareEvent had been received, but yes, the returned list was empty, so I agree to remove it.
>>>>>>>>
>>>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Yes, it was already removed in version 01.
>>>>>>>>
>>>>>>>> Here is the new webrev:
>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> /Staffan
>>>>>>>>>
>>>>>>>>> On 11 feb 2014, at 18:30, shanliang wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Here is the new fix, in which FieldMonitor writes to TestPostFieldModification to tell it to quit, as suggested by Jaroslav:
>>>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Shanliang
>>>>>>>>>>
>>>>>>>>>> shanliang wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> shanliang wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am not a JDI expert so I may be missing something here.
>>>>>>>>>>>>>> I checked the failure trace and saw that the reported exception
>>>>>>>>>>>>>> happened when FieldMonitor received the ClassPrepareEvent and was
>>>>>>>>>>>>>> calling addFieldWatch. FieldMonitor did call "vm.resume()" before
>>>>>>>>>>>>>> handling events.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine; here is the Javadoc of the method "resume":
>>>>>>>>>>>> /**
>>>>>>>>>>>>  * Continues the execution of the application running in this
>>>>>>>>>>>>  * virtual machine. All threads are resumed as documented in
>>>>>>>>>>>>  * {@link ThreadReference#resume}.
>>>>>>>>>>>>  *
>>>>>>>>>>>>  * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>>>>  *
>>>>>>>>>>>>  * @see #suspend
>>>>>>>>>>>>  */
>>>>>>>>>>>> void resume();
>>>>>>>>>>>> My understanding is that the debuggee resumes running after this call, rather than dying?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> In fact this is the problem: the VM (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>>>>>
>>>>>>>>>>> Shanliang
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>> I reproduced the bug by adding sleep(1000) after vm.resume() but before
>>>>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder whether it would be better to use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> It could work. Creating a file in the testing directory should cause no issues either, though it may perform worse.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>
>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>>>>> call "addFieldWatch" before the VM being monitored
>>>>>>>>>>>>>>>> (TestPostFieldModification) had already exited.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits only after
>>>>>>>>>>>>>>>> FieldMonitor has done what is necessary. The solution proposed here is
>>>>>>>>>>>>>>>> that FieldMonitor creates a file after adding the field watch, and
>>>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> web:
>>>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140214/48950ed3/attachment-0001.html
From shanliang.jiang at oracle.com Fri Feb 14 02:00:02 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Fri, 14 Feb 2014 11:00:02 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <28F9E62B-3A20-4BFB-AA40-9568C8BEDBC2@oracle.com>
References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> <52FC8DA1.9090900@oracle.com> <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> <52FCA014.9000106@oracle.com> <52FCFDBA.5040108@oracle.com> <52FD27D8.8060200@oracle.com> <28F9E62B-3A20-4BFB-AA40-9568C8BEDBC2@oracle.com>
Message-ID: <52FDE922.3040301@oracle.com>

Staffan Larsen wrote:
> This version looks good! Thanks for hanging in there.
>
> The only improvement would be to count and verify the number of
> ModificationWatchpointEvent (there should be 10).
Good idea, here is:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/05/

Thanks,
Shanliang
>
> Thanks,
> /Staffan
>
> On 13 feb 2014, at 21:15, shanliang wrote:
>
>> Hi,
>>
>> Here is Version 4:
>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/
>>
>> 1) remove the line
>> 108 vm.resume()
>> 2) call addClassWatch(vm) only when receiving VMStartEvent
>> 3) make sure that the test receives ModificationWatchpointEvent
>> 4) clean
>>
>> Thanks,
>> Shanliang
>>
>> shanliang wrote:
>>> Staffan,
>>>
>>> Very nice analysis!
>>>
>>> The fix must be very simple, just remove the line
>>> 108 vm.resume
>>> it is an error because here the test does not yet treat the events
>>> in eventSet.
>>>
>>> the line
>>> 136 eventSet.resume();
>>> is the right place to resume the threads after event treatment.
>>>
>>> Here is the new webrev:
>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/
>>>
>>> Thanks,
>>> Shanliang
>>>
>>> Staffan Larsen wrote:
>>>> I think I understand what happens now.
>>>>
>>>> The test code, simplified, looks like this (with the Thread.sleep()
>>>> added that causes the test to fail):
>>>>
>>>> launchTarget();
>>>> addClassWatch();
>>>> vm.resume();
>>>> Thread.sleep(1000);
>>>> while(connected) {
>>>>     eventSet = eventQueue.remove();
>>>>     for(event : eventSet) {
>>>>         if (event instanceof ClassPrepareEvent) {
>>>>             addFieldWatch();
>>>>         }
>>>>     }
>>>>     eventSet.resume();
>>>> }
>>>>
>>>> By default all events that happen will cause the debuggee to
>>>> suspend (see EventRequest.setSuspendPolicy()). Thus when we get to
>>>> addFieldWatch(), the VM should be suspended and we should be able
>>>> to create the field watch without problem. But the VM isn't
>>>> suspended and that is why the test fails.
>>>>
>>>> Why isn't the VM suspended? When we get to the "for(event :
>>>> eventSet)" the first time there are *two* events already in the
>>>> queue: the VMStartEvent and a ClassPrepareEvent. At this point the
>>>> VM is suspended and everything is good. We look at the first
>>>> eventSet which only contains the VMStartEvent, we ignore the event,
>>>> but we resume the VM. We then loop and look at the
>>>> ClassPrepareEvent, but by now the VM is already running and has
>>>> also terminated. Failure.
>>>>
>>>> Thus, we need to handle the VMStartEvent. I suggest a modification
>>>> to my previous code:
>>>>
>>>> launchTarget();
>>>> while(connected) {
>>>>     eventSet = eventQueue.remove();
>>>>     for(event : eventSet) {
>>>>         if (event instanceof VMStartEvent) {
>>>>             addClassWatch();
>>>>         }
>>>>         if (event instanceof ClassPrepareEvent) {
>>>>             addFieldWatch();
>>>>         }
>>>>     }
>>>>     eventSet.resume();
>>>> }
>>>>
>>>> This will cause us to have complete control over the state of the
>>>> debuggee. The first event we see will be the VMStartEvent.
>>>> The VM will be suspended. We can add a class watch here. Then we resume
>>>> the VM. The second event we see will be the ClassPrepareEvent with
>>>> the VM suspended. We can add the field watch. Then we resume the VM
>>>> and wait for the field watch events.
>>>>
>>>> Thanks,
>>>> /Staffan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140214/247b058e/attachment-0001.html
From jaroslav.bachorik at oracle.com Fri Feb 14 03:07:09 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 14 Feb 2014 12:07:09 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException
Message-ID: <52FDF8DD.9090602@oracle.com>

This is a round-0 review request.

The reflection code intercepting the exceptions thrown in the invoked methods does not play nicely with JVMTI (which, in this case, propagates to JDI). The reflection code lacks the traditional error handler - therefore, upon throwing the NumberFormatException, the stack is searched for appropriate handlers and none are found. This leaves the "exception_detected" flag set to true while normally it would be reset to false once the exception is handled. The reflection code then goes on and wraps the NumberFormatException into InvocationTargetException and throws it. But, alas, the "exception_detected" flag is still set to true and no JVMTI exception event will be sent out.
The proposed solution is to call thread->jvmti_thread_state()->clear_exception_detected() at the appropriate places in the reflection code to reset the "exception_detected" flag and enable the InvocationTargetException to be properly reported over JVMTI.

Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00

Thanks!

-JB-
From staffan.larsen at oracle.com Fri Feb 14 03:16:10 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 14 Feb 2014 12:16:10 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FDE922.3040301@oracle.com>
References: <52FA301A.2040301@oracle.com> <52FA424E.2050202@oracle.com> <52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com> <52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com> <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com> <52FBA9B2.6090502@oracle.com> <52FC8DA1.9090900@oracle.com> <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com> <52FCA014.9000106@oracle.com> <52FCFDBA.5040108@oracle.com> <52FD27D8.8060200@oracle.com> <28F9E62B-3A20-4BFB-AA40-9568C8BEDBC2@oracle.com> <52FDE922.3040301@oracle.com>
Message-ID:

Good. Just a small spelling error: "Espected" -> "Expected".

/Staffan

On 14 feb 2014, at 11:00, shanliang wrote:

> Staffan Larsen wrote:
>>
>> This version looks good! Thanks for hanging in there.
>>
>> The only improvement would be to count and verify the number of ModificationWatchpointEvent (there should be 10).
> Good idea, here is: > http://cr.openjdk.java.net/~sjiang/JDK-8007710/05/ > > Thanks, > Shanliang >> >> Thanks, >> /Staffan >> >> On 13 feb 2014, at 21:15, shanliang wrote: >> >>> Hi, >>> >>> Here is Version 4: >>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/ >>> >>> 1) remove the line >>> 108 vm.resume() >>> 2) call addClassWatch(vm) only when receiving VMStartEvent >>> 3) make sure that the test receives ModificationWatchpointEvent >>> 4) clean >>> >>> Thanks, >>> Shanliang >>> >>> shanliang wrote: >>>> >>>> Staffan, >>>> >>>> Very nice analysis! >>>> >>>> The fix must be very simple, just remove the line >>>> 108 vm.resume >>>> it is an error because here the test does not yet treat the events in eventSet. >>>> >>>> the line >>>> 136 eventSet.resume(); >>>> is the right place to resume the threads after event treatment. >>>> >>>> Here is the new webrev: >>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/ >>>> >>>> Thanks, >>>> Shanliang >>>> >>>> Staffan Larsen wrote: >>>>> >>>>> I think I understand what happens now. >>>>> >>>>> The test code, simplified, looks like this (with the Thread.sleep() added that causes the test to fail): >>>>> >>>>> launchTarget(); >>>>> addClassWatch(); >>>>> vm.resume(); >>>>> Thread.sleep(1000); >>>>> while(connected) { >>>>> eventSet = eventQueue.remove() >>>>> for(event : eventQueue) { >>>>> if (event instanceof ClassPrepareEvent) { >>>>> addFieldWatch(); >>>>> } >>>>> } >>>>> eventSet.resume(); >>>>> } >>>>> >>>>> By default all events that happen will cause the debuggee to suspend (see EventRequest.setSuspendPolicy()). Thus when we get to addFieldWatch(), the vm should be suspended and we should be able to create the field watch without problem. But the VM isn?t suspended and that is why the test fail. >>>>> >>>>> Why isn?t the VM suspended? When we get to the ?for(event : eventQueue)? the first time there are *two* events already in the queue: the VMStartEvent and a ClassPrepareEvent. 
At this point the VM is suspended and everything is good. We look at the first eventSet which only contains the VMStartEvent, we ignore the event, but we resume the VM. We then loop and look at the ClassPrepareEvent, but by now the VM is already running and has also terminated. Failure. >>>>> >>>>> Thus, we need to handle the VMStartEvent. I suggest a modification to my previous code: >>>>> >>>>> launchTarget(); >>>>> while(connected) { >>>>> eventSet = eventQueue.remove() >>>>> for(event : eventQueue) { >>>>> if (event instanceof VMStartEvent) { >>>>> addClassWatch(); >>>>> } >>>>> if (event instanceof ClassPrepareEvent) { >>>>> addFieldWatch(); >>>>> } >>>>> } >>>>> eventSet.resume(); >>>>> } >>>>> >>>>> This will cause us to have complete control over the state of the debuggee. The first event we see will be the VMStartEvent. The VM will be suspended. We can add a class watch here. Then we resume the VM. The second event we see will be the ClassPrepareEvent with the VM suspended. We can add the field watch. Then we resume the VM and wait for the field watch events. >>>>> >>>>> Thanks, >>>>> /Staffan >>>>> >>>>> On 13 feb 2014, at 11:36, shanliang wrote: >>>>> >>>>>> Staffan Larsen wrote: >>>>>>> >>>>>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik wrote: >>>>>>> >>>>>>> >>>>>>>> Hi Staffan, >>>>>>>> >>>>>>>> On 12.2.2014 18:27, Staffan Larsen wrote: >>>>>>>> >>>>>>>>> I?m still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can?t quite explain what goes wrong without the fix? I?d really like to understand that. I?ll try to dig deeper and see if I can understand exactly what happens. 
>>>>>>>>> >>>>>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin. >>>>>>>> >>>>>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM? >>>>>>>> >>>>>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread. >>>>>>> >>>>>> The test failed when it received ClassPrepareEvent and did addFieldWatch, that meant the test must receive already VMStartEvent, because VMStartEvent must be the first event, if it was true then the vm must be already stable when failing. >>>>>> >>>>>> Except that the test received ClassPrepareEvent before VMStartEvent then it was doing addFieldWatch with a possibly unstable VM. in this case we might have a serious bug in VirtualMachine implementation, and if this is true the fix proposed to check "start" may make miss ClassPrepareEvent, then the test would test nothing. >>>>>> >>>>>> Shanliang >>>>>>> /S >>>>>>> >>>>>>> >>>>>>>> -JB- >>>>>>>> >>>>>>>> >>>>>>>>> /Staffan >>>>>>>>> >>>>>>>>> On 12 feb 2014, at 18:04, shanliang wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> Staffan Larsen wrote: >>>>>>>>>> >>>>>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc: >>>>>>>>>>> >>>>>>>>>>> Note that a target VM launched by a launching connector is not >>>>>>>>>>> guaranteed to be stable until after the VMStartEvent has been >>>>>>>>>>> received. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent. 
>>>>>>>>>> >>>>>>>>>>> I think adding code that looks something like this will make the test stable: >>>>>>>>>>> >>>>>>>>>>> VirtualMachine vm = launchTarget(CLASS_NAME); >>>>>>>>>>> EventQueue eventQueue = vm.eventQueue(); >>>>>>>>>>> >>>>>>>>>>> boolean started = false; >>>>>>>>>>> while(!started) { >>>>>>>>>>> EventSet eventSet = eventQueue.remove(); >>>>>>>>>>> for (Event event : eventSet) { >>>>>>>>>>> if (event instanceof VMStartEvent) { >>>>>>>>>>> started = true; >>>>>>>>>>> } >>>>>>>>>>> if (event instanceof VMDeathEvent >>>>>>>>>>> || event instanceof VMDisconnectEvent) { >>>>>>>>>>> throw new Error("VM died before it started...:"+event); >>>>>>>>>>> } >>>>>>>>>>> } >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> System.out.println("Vm launched"); >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent. >>>>>>>>>> >>>>>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done. >>>>>>>>>> >>>>>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm..classesByName() will just return an empty list anyway. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it. >>>>>>>>>> >>>>>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main(). 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> Yes it was already removed in version 01 >>>>>>>>>> >>>>>>>>>> Here is the new webrev: >>>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/ >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Shanliang >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> /Staffan >>>>>>>>>>> >>>>>>>>>>> On 11 feb 2014, at 18:30, shanliang wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested bu Jaroslav >>>>>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/ >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Shanliang >>>>>>>>>>>> >>>>>>>>>>>> shanliang wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> shanliang wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Jaroslav Bachorik wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Staffan Larsen wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Shanliang, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I can?t quite see how the test can fail in this way. When the >>>>>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when >>>>>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the >>>>>>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor >>>>>>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did >>>>>>>>>>>>>>>> call "vm.resume()" before treating events. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a deamon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume": >>>>>>>>>>>>>> /** >>>>>>>>>>>>>> * Continues the execution of the application running in this >>>>>>>>>>>>>> * virtual machine. All threads are resumed as documented in >>>>>>>>>>>>>> * {@link ThreadReference#resume}. >>>>>>>>>>>>>> * >>>>>>>>>>>>>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}. >>>>>>>>>>>>>> * >>>>>>>>>>>>>> * @see #suspend >>>>>>>>>>>>>> */ >>>>>>>>>>>>>> void resume(); >>>>>>>>>>>>>> My understanding is that the debuggee resumes work after this call, rather than dying? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> In fact the problem is here: the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch. >>>>>>>>>>>>> >>>>>>>>>>>>> Shanliang >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>> I reproduced the bug by adding sleep(1000) after vm.resume() but before >>>>>>>>>>>>>>>> calling eventQueue.remove(); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, though it may perform slightly worse.
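For illustration only, the stream-based handshake discussed above might look roughly like the sketch below. It is not taken from any of the webrevs. The class and method names (ExitHandshake, signalMayExit, awaitMayExit) are hypothetical, and the sketch assumes the debugger would pass the launched process's output stream while the debuggee would pass System.in.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; not part of the actual test or any webrev.
final class ExitHandshake {
    private ExitHandshake() {}

    // Debugger side: call once the field watch has been installed.
    // In the test, 'toDebuggee' would be the stream to the debuggee's stdin.
    static void signalMayExit(OutputStream toDebuggee) throws IOException {
        toDebuggee.write('q');
        toDebuggee.flush(); // without the flush the byte may stay buffered
    }

    // Debuggee side: call as the last statement of main().
    // In the test, 'fromDebugger' would be System.in.
    // Returns true if a byte arrived, false on end-of-stream.
    static boolean awaitMayExit(InputStream fromDebugger) throws IOException {
        return fromDebugger.read() != -1;
    }
}
```

The blocking read keeps the debuggee alive until the debugger either writes the byte or closes the stream, so no polling or temporary file is involved.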
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Shanliang >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -JB- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Shanliang >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a >>>>>>>>>>>>>>>>> VMStartEvent before setting up requests. I?m not sure if that could >>>>>>>>>>>>>>>>> cause the failure in the bug report, though. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /Staffan >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi , >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to >>>>>>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was >>>>>>>>>>>>>>>>>> already ended. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after >>>>>>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that >>>>>>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and >>>>>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> web: >>>>>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> bug: >>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Shanliang >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140214/a4346110/attachment-0001.html From dmitry.samersoff at oracle.com Fri Feb 14 06:52:10 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 14 Feb 2014 18:52:10 +0400 Subject: RR(S): JDK-7189721 nightly tests failed on JDI attaching Message-ID: <52FE2D9A.4030700@oracle.com> Hi Everyone, Please, review the changes. http://cr.openjdk.java.net/~dsamersoff/JDK-7189721/webrev.01/ Summary: If for some reason elf section with section names is not loaded to cache it attempts to read data using NULL section pointer. -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From daniel.daugherty at oracle.com Fri Feb 14 07:45:49 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 14 Feb 2014 08:45:49 -0700 Subject: RFR (S) 8034249: need more workarounds for suspend equivalent condition issue In-Reply-To: <52FD5CC3.4090502@oracle.com> References: <52FD5CC3.4090502@oracle.com> Message-ID: <52FE3A2D.808@oracle.com> On 2/13/14 5:01 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-8034249 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1 > src/share/vm/prims/jvmtiEnvBase.hpp line 360 and 446 are a bit long src/share/vm/prims/jvmtiEnv.cpp No comments. Thumbs up. 
Dan > > Summary: > > This issue was identified in the review of the 8032223 and it is > similar to the 8032223 > but impacts different JVMTI functions: > GetCurrentContendedMonitor, GetOwnedMonitorInfo, > GetOwnedMonitorStackDepthInfo, GetStackTrace > > There is a general issue in the suspend equivalent condition mechanism: > Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() > may return different results: > - 1-st: true > - 2-nd: false > > This suspend equivalent issue is covered by another bug: > https://bugs.openjdk.java.net/browse/JDK-6280037 > > This fix is to work around the 6280037. > It is more safe to collect the necesary information at a safepoint > instead of > relying on the suspension of the target thread. > > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi > > > Thanks, > Serguei From daniel.daugherty at oracle.com Fri Feb 14 08:38:58 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 14 Feb 2014 09:38:58 -0700 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <52FDF8DD.9090602@oracle.com> References: <52FDF8DD.9090602@oracle.com> Message-ID: <52FE46A2.20705@oracle.com> > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 Nice job on such an old bug. src/share/vm/prims/jvmtiExport.hpp No comments. src/share/vm/prims/jvmtiExport.cpp line 2170: state->clear_exception_detected(); HotSpot indent is two spaces. src/share/vm/runtime/reflection.cpp line 948: JvmtiExport::clear_detected_exception((JavaThread*) THREAD); line 1085: JvmtiExport::clear_detected_exception((JavaThread*) THREAD); HotSpot indent is two spaces. So there are two existing tests for this failure mode: nsk/jdi/ExceptionEvent/_itself_/exevent006 nsk/jdi/ExceptionEvent/_itself_/exevent008 so I'm presuming that you've run the above tests and they are now happy, but you haven't really stated that. 
What other pre-integration testing do you have planned? Dan On 2/14/14 4:07 AM, Jaroslav Bachorik wrote: > This is a round-0 review request. > > The reflection code intercepting the exceptions thrown in the invoked > methods does not play nicely with JVMTI (which, in this case, > propagates to JDI). > > The reflection code lacks the traditional error handler - therefore, > upon throwing the NumberFormatException, the stack is searched for > appropriate handlers and none are found. This leaves the > "exception_detected" flag set to true while normally it would be reset > to false once the exception is handled. The reflection code then goes > on and wraps the NumberFormatException into InvocationTargetException > and throws it. But, alas, the "exception_detected" flag is still set > to true and no JVMTI exception event will be sent out. > > The proposed solution is to call > thread->jvmti_thread_state()->clear_exception_detected() at the > appropriate places in the reflection code to reset the > "exception_detected" flag and enable the InvocationTargetException be > properly reported over JVMTI. > > Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 > > Thanks! > > -JB- > > From serguei.spitsyn at oracle.com Fri Feb 14 09:31:37 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 14 Feb 2014 09:31:37 -0800 Subject: RFR (S) 8034249: need more workarounds for suspend equivalent condition issue In-Reply-To: <52FE3A2D.808@oracle.com> References: <52FD5CC3.4090502@oracle.com> <52FE3A2D.808@oracle.com> Message-ID: <52FE52F9.5070701@oracle.com> On 2/14/14 7:45 AM, Daniel D. 
Daugherty wrote: > On 2/13/14 5:01 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-8034249 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1 >> > > src/share/vm/prims/jvmtiEnvBase.hpp > line 360 and 446 are a bit long I'll split the lines. > > src/share/vm/prims/jvmtiEnv.cpp > No comments. > > Thumbs up. Thank you for review! Serguei > > Dan > > > >> >> Summary: >> >> This issue was identified in the review of the 8032223 and it is >> similar to the 8032223 >> but impacts different JVMTI functions: >> GetCurrentContendedMonitor, GetOwnedMonitorInfo, >> GetOwnedMonitorStackDepthInfo, GetStackTrace >> >> There is a general issue in the suspend equivalent condition >> mechanism: >> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() >> may return different results: >> - 1-st: true >> - 2-nd: false >> >> This suspend equivalent issue is covered by another bug: >> https://bugs.openjdk.java.net/browse/JDK-6280037 >> >> This fix is to work around the 6280037. >> It is more safe to collect the necesary information at a safepoint >> instead of >> relying on the suspension of the target thread. >> >> >> Testing: >> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >> >> >> Thanks, >> Serguei > From serguei.spitsyn at oracle.com Fri Feb 14 10:46:30 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 14 Feb 2014 10:46:30 -0800 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <52FDF8DD.9090602@oracle.com> References: <52FDF8DD.9090602@oracle.com> Message-ID: <52FE6486.8000603@oracle.com> Jaroslav, It looks good in general modulo indent comments from Dan. But I have a doubt that acquiring the JvmtiThreadState_lock is needed or right thing to do in the JvmtiExport::clear_detected_exception(). 
It seems, both clear_exception_detected() and set_exception_detected() are always called on current thread and so, it has to be safe to do without acquiring any locks. And I'm repeating my question about pre-integration testing (Dan is asking about the same). Thanks, Serguei On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: > This is a round-0 review request. > > The reflection code intercepting the exceptions thrown in the invoked > methods does not play nicely with JVMTI (which, in this case, > propagates to JDI). > > The reflection code lacks the traditional error handler - therefore, > upon throwing the NumberFormatException, the stack is searched for > appropriate handlers and none are found. This leaves the > "exception_detected" flag set to true while normally it would be reset > to false once the exception is handled. The reflection code then goes > on and wraps the NumberFormatException into InvocationTargetException > and throws it. But, alas, the "exception_detected" flag is still set > to true and no JVMTI exception event will be sent out. > > The proposed solution is to call > thread->jvmti_thread_state()->clear_exception_detected() at the > appropriate places in the reflection code to reset the > "exception_detected" flag and enable the InvocationTargetException be > properly reported over JVMTI. > > Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 > > Thanks! > > -JB- -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140214/be23acd6/attachment.html From daniel.daugherty at oracle.com Fri Feb 14 12:33:12 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 14 Feb 2014 13:33:12 -0700 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <52FE6486.8000603@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> Message-ID: <52FE7D88.8000701@oracle.com> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: > Jaroslav, > > It looks good in general modulo indent comments from Dan. > > But I have a doubt that acquiring the JvmtiThreadState_lock is needed > or right thing to do in the JvmtiExport::clear_detected_exception(). > It seems, both clear_exception_detected() and set_exception_detected() > are always > called on current thread and so, it has to be safe to do without > acquiring any locks. My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff can also be queried/modified by other threads so grabbing the associated lock is a good idea. Dan > > And I'm repeating my question about pre-integration testing (Dan is > asking about the same). > > Thanks, > Serguei > > > On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >> This is a round-0 review request. >> >> The reflection code intercepting the exceptions thrown in the invoked >> methods does not play nicely with JVMTI (which, in this case, >> propagates to JDI). >> >> The reflection code lacks the traditional error handler - therefore, >> upon throwing the NumberFormatException, the stack is searched for >> appropriate handlers and none are found. This leaves the >> "exception_detected" flag set to true while normally it would be >> reset to false once the exception is handled. The reflection code >> then goes on and wraps the NumberFormatException into >> InvocationTargetException and throws it. But, alas, the >> "exception_detected" flag is still set to true and no JVMTI exception >> event will be sent out. 
>> >> The proposed solution is to call >> thread->jvmti_thread_state()->clear_exception_detected() at the >> appropriate places in the reflection code to reset the >> "exception_detected" flag and enable the InvocationTargetException be >> properly reported over JVMTI. >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >> >> Thanks! >> >> -JB- > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140214/3a4dd4bc/attachment.html From serguei.spitsyn at oracle.com Fri Feb 14 14:13:50 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 14 Feb 2014 14:13:50 -0800 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <52FE7D88.8000701@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> Message-ID: <52FE951E.1060002@oracle.com> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: > On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >> Jaroslav, >> >> It looks good in general modulo indent comments from Dan. >> >> But I have a doubt that acquiring the JvmtiThreadState_lock is needed >> or right thing to do in the JvmtiExport::clear_detected_exception(). >> It seems, both clear_exception_detected() and >> set_exception_detected() are always >> called on current thread and so, it has to be safe to do without >> acquiring any locks. > > My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff > can also be queried/modified by other threads so grabbing the > associated lock is a good idea. The lock synchronization is cooperative. It does not help much if the lock is not acquired in other places. 
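To illustrate the general point about cooperative locking (this is a plain Java sketch, not HotSpot code, and it makes no claim about how JvmtiThreadState is actually accessed): a lock only excludes threads that take it. The class GuardedFlag and its methods below are hypothetical names chosen for the example.

```java
// Hypothetical stand-in for a piece of per-thread state; not HotSpot code.
final class GuardedFlag {
    private final Object lock = new Object();
    private volatile boolean detected = false;

    // Cooperating writer: acquires the lock before touching the flag.
    void clearLocked() {
        synchronized (lock) {
            detected = false;
        }
    }

    // Non-cooperating writer: modifies the same flag without the lock.
    void setUnlocked() {
        detected = true;
    }

    boolean isDetected() {
        return detected;
    }

    // Runs 'writer' on another thread while the calling thread holds the
    // lock, and reports whether the writer finished within the timeout.
    boolean finishesWhileLockHeld(Runnable writer, long timeoutMillis)
            throws InterruptedException {
        Thread t = new Thread(writer);
        t.setDaemon(true);
        synchronized (lock) {
            t.start();
            t.join(timeoutMillis); // waits on the thread, 'lock' stays held
            return !t.isAlive();
        }
    }
}
```

With the lock held, a call through setUnlocked() still completes and changes the flag, while a call through clearLocked() blocks until the lock is released. Adding the lock to one accessor therefore protects nothing unless every other accessor takes it as well.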
I can be wrong, but I've not found yet any place in the code where the clear_exception_detected() and set_exception_detected() are called under protection of the JvmtiThreadState_lock. Thanks, Serguei > > Dan > > >> >> And I'm repeating my question about pre-integration testing (Dan is >> asking about the same). >> >> Thanks, >> Serguei >> >> >> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>> This is a round-0 review request. >>> >>> The reflection code intercepting the exceptions thrown in the >>> invoked methods does not play nicely with JVMTI (which, in this >>> case, propagates to JDI). >>> >>> The reflection code lacks the traditional error handler - therefore, >>> upon throwing the NumberFormatException, the stack is searched for >>> appropriate handlers and none are found. This leaves the >>> "exception_detected" flag set to true while normally it would be >>> reset to false once the exception is handled. The reflection code >>> then goes on and wraps the NumberFormatException into >>> InvocationTargetException and throws it. But, alas, the >>> "exception_detected" flag is still set to true and no JVMTI >>> exception event will be sent out. >>> >>> The proposed solution is to call >>> thread->jvmti_thread_state()->clear_exception_detected() at the >>> appropriate places in the reflection code to reset the >>> "exception_detected" flag and enable the InvocationTargetException >>> be properly reported over JVMTI. >>> >>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>> >>> Thanks! >>> >>> -JB- >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140214/6fdcf9dd/attachment.html From Alan.Bateman at oracle.com Sun Feb 16 09:20:47 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Sun, 16 Feb 2014 17:20:47 +0000 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <52FD35C5.1000306@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> Message-ID: <5300F36F.9010205@oracle.com> On 13/02/2014 21:14, Mikael Vidstedt wrote: > : > > The change in question appears to come from > https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the > bug gives enough additional information. My speculation (and it's > really just a speculation) is that it's not related to isspace per-se, > but to something else which gets defined/redefined/undefined by > including ctype.h. I guess it would be good to know if we have tests > which cover the thing the comment is alluding to (non-ascii in > Premain-Class). Thanks for pointing this out. I looked at it again and the issue is that isspace is a macro and depends on the locale. By not including ctype.h then it means we get linked to the libc function instead. One approach is to include ctype.h and then #undef isspace, another is to define function prototype ourselves. I think the latter is a little bit better because it would avoid accidental usage of other local sensitive char classifiers. Attached is the patch that I propose. I have deliberate moved to to after other includes so we get a chance to #undef in the event that it gets included by something else. On tests then PremainClassTest.java is good enough to find this on Solaris. -Alan diff --git a/src/share/instrument/JarFacade.c b/src/share/instrument/JarFacade.c --- a/src/share/instrument/JarFacade.c +++ b/src/share/instrument/JarFacade.c @@ -23,17 +23,20 @@ * questions. */ -#ifdef _WIN32 -/* - * Win* needs this include. 
However, Linux and Solaris do not. - * Having this include on Solaris SPARC breaks having non US-ASCII - * characters in the value of the Premain-Class attribute. - */ -#include -#endif /* _WIN32 */ #include #include -#include + +/** + * ctype.h is required on Windows. For other platforms we use a function + * prototype to ensure that we use the libc isspace function rather than + * the isspace macro (due to isspace being locale sensitive) + */ +#ifdef _WIN32 + #include +#else + #undef isspace + extern int isspace(int c); +#endif /* _WIN32 */ #include "jni.h" #include "manifest_info.h" From martinrb at google.com Sun Feb 16 14:20:30 2014 From: martinrb at google.com (Martin Buchholz) Date: Sun, 16 Feb 2014 14:20:30 -0800 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <5300F36F.9010205@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> Message-ID: Those locale-dependent APIs - more trouble than they're worth. More often than not, you want a locale-independent version. So just define your own is_ASCII_space etc. like everybody else has done and move on. On Sun, Feb 16, 2014 at 9:20 AM, Alan Bateman wrote: > On 13/02/2014 21:14, Mikael Vidstedt wrote: > >> : >> >> The change in question appears to come from >> https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the >> bug gives enough additional information. My speculation (and it's really >> just a speculation) is that it's not related to isspace per-se, but to >> something else which gets defined/redefined/undefined by including ctype.h. >> I guess it would be good to know if we have tests which cover the thing the >> comment is alluding to (non-ascii in Premain-Class). >> > Thanks for pointing this out. I looked at it again and the issue is that > isspace is a macro and depends on the locale. 
By not including ctype.h then > it means we get linked to the libc function instead. One approach is to > include ctype.h and then #undef isspace, another is to define function > prototype ourselves. I think the latter is a little bit better because it > would avoid accidental usage of other local sensitive char classifiers. > Attached is the patch that I propose. I have deliberate moved to to after > other includes so we get a chance to #undef in the event that it gets > included by something else. > > On tests then PremainClassTest.java is good enough to find this on Solaris. > > -Alan > > > diff --git a/src/share/instrument/JarFacade.c b/src/share/instrument/ > JarFacade.c > --- a/src/share/instrument/JarFacade.c > +++ b/src/share/instrument/JarFacade.c > @@ -23,17 +23,20 @@ > * questions. > */ > > -#ifdef _WIN32 > -/* > - * Win* needs this include. However, Linux and Solaris do not. > - * Having this include on Solaris SPARC breaks having non US-ASCII > - * characters in the value of the Premain-Class attribute. > - */ > -#include > -#endif /* _WIN32 */ > #include > #include > -#include > + > +/** > + * ctype.h is required on Windows. For other platforms we use a function > + * prototype to ensure that we use the libc isspace function rather than > + * the isspace macro (due to isspace being locale sensitive) > + */ > +#ifdef _WIN32 > + #include > +#else > + #undef isspace > + extern int isspace(int c); > +#endif /* _WIN32 */ > > #include "jni.h" > #include "manifest_info.h" > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140216/2af02169/attachment.html From david.holmes at oracle.com Sun Feb 16 17:30:02 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 17 Feb 2014 11:30:02 +1000 Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr In-Reply-To: <7482D938-F9B1-4890-B48D-AF892B6FA3C3@oracle.com> References: <52FA05F9.20206@oracle.com> <7482D938-F9B1-4890-B48D-AF892B6FA3C3@oracle.com> Message-ID: <5301661A.2070807@oracle.com> On 11/02/2014 10:01 PM, Staffan Larsen wrote: > > On 11 feb 2014, at 12:14, Erik Joelsson wrote: > >> Looks good, but I can't help but wonder why the mapfile for libjfr is in the open. > > Yes, that is unfortunate. It used to be in closed source in jdk7, but moved into open with the new build system. That should be fixed. Looks like we need a make/closed/lib/Serviceability.gmk that contains the BUILD_LIBJFR definition. David > /Staffan > >> >> /Erik >> >> On 2014-02-11 12:00, Staffan Larsen wrote: >>> Resending (email to build-dev bounced since I used the wrong sender address). >>> >>> On 11 feb 2014, at 11:18, staffan at larsen.se wrote: >>> >>>> Please review this small fix for libjfr/mapfile-vers. 
>>>> >>>> Thanks, >>>> /Staffan >>>> >>>> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers >>>> --- a/make/mapfiles/libjfr/mapfile-vers >>>> +++ b/make/mapfiles/libjfr/mapfile-vers >>>> @@ -34,6 +34,7 @@ >>>> Java_oracle_jrockit_jfr_VMJFR_getPeriod; >>>> Java_oracle_jrockit_jfr_VMJFR_descriptors; >>>> Java_oracle_jrockit_jfr_VMJFR_redefineClass0; >>>> + Java_oracle_jrockit_jfr_VMJFR_retransformClasses0; >>>> JNI_OnLoad; >>>> local: >>>> *; >> > From mikael.vidstedt at oracle.com Sun Feb 16 21:51:20 2014 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Sun, 16 Feb 2014 21:51:20 -0800 Subject: 8034856/8034857: More gcc warnings In-Reply-To: References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> Message-ID: <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> I'm inclined to agree with this. Since the code depends on a specific behavior of isspace which does not match what the system provided function does I too think it would be more robust to implement our own version of it. Cheers, Mikael > On Feb 16, 2014, at 14:20, Martin Buchholz wrote: > > Those locale-dependent APIs - more trouble than they're worth. More often than not, you want a locale-independent version. > > So just define your own is_ASCII_space etc. like everybody else has done and move on. > > >> On Sun, Feb 16, 2014 at 9:20 AM, Alan Bateman wrote: >>> On 13/02/2014 21:14, Mikael Vidstedt wrote: >>> : >>> >>> The change in question appears to come from https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the bug gives enough additional information. My speculation (and it's really just a speculation) is that it's not related to isspace per-se, but to something else which gets defined/redefined/undefined by including ctype.h. 
I guess it would be good to know if we have tests which cover the thing the comment is alluding to (non-ascii in Premain-Class). >> Thanks for pointing this out. I looked at it again and the issue is that isspace is a macro and depends on the locale. By not including ctype.h then it means we get linked to the libc function instead. One approach is to include ctype.h and then #undef isspace, another is to define function prototype ourselves. I think the latter is a little bit better because it would avoid accidental usage of other local sensitive char classifiers. Attached is the patch that I propose. I have deliberate moved to to after other includes so we get a chance to #undef in the event that it gets included by something else. >> >> On tests then PremainClassTest.java is good enough to find this on Solaris. >> >> -Alan >> >> >> diff --git a/src/share/instrument/JarFacade.c b/src/share/instrument/JarFacade.c >> --- a/src/share/instrument/JarFacade.c >> +++ b/src/share/instrument/JarFacade.c >> @@ -23,17 +23,20 @@ >> * questions. >> */ >> >> -#ifdef _WIN32 >> -/* >> - * Win* needs this include. However, Linux and Solaris do not. >> - * Having this include on Solaris SPARC breaks having non US-ASCII >> - * characters in the value of the Premain-Class attribute. >> - */ >> -#include >> -#endif /* _WIN32 */ >> #include >> #include >> -#include >> + >> +/** >> + * ctype.h is required on Windows. For other platforms we use a function >> + * prototype to ensure that we use the libc isspace function rather than >> + * the isspace macro (due to isspace being locale sensitive) >> + */ >> +#ifdef _WIN32 >> + #include >> +#else >> + #undef isspace >> + extern int isspace(int c); >> +#endif /* _WIN32 */ >> >> #include "jni.h" >> #include "manifest_info.h" > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140216/8c63f17b/attachment.html From staffan.larsen at oracle.com Sun Feb 16 23:30:53 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Mon, 17 Feb 2014 08:30:53 +0100 Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr In-Reply-To: <5301661A.2070807@oracle.com> References: <52FA05F9.20206@oracle.com> <7482D938-F9B1-4890-B48D-AF892B6FA3C3@oracle.com> <5301661A.2070807@oracle.com> Message-ID: <7422AD5A-1830-4902-A3B6-7ADBFE4C5721@oracle.com> On 17 feb 2014, at 02:30, David Holmes wrote: > On 11/02/2014 10:01 PM, Staffan Larsen wrote: >> >> On 11 feb 2014, at 12:14, Erik Joelsson wrote: >> >>> Looks good, but I can't help but wonder why the mapfile for libjfr is in the open. >> >> Yes, that is unfortunate. It used to be in closed source in jdk7, but moved into open with the new build system. > > That should be fixed. Looks like we need a make/closed/lib/Serviceability.gmk that contains the BUILD_LIBJFR definition. Yes, I've filed a bug about it. Thanks, /Staffan > > David > >> /Staffan >> >>> >>> /Erik >>> >>> On 2014-02-11 12:00, Staffan Larsen wrote: >>>> Resending (email to build-dev bounced since I used the wrong sender address). >>>> >>>> On 11 feb 2014, at 11:18, staffan at larsen.se wrote: >>>> >>>>> Please review this small fix for libjfr/mapfile-vers.
>>>>> >>>>> Thanks, >>>>> /Staffan >>>>> >>>>> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers >>>>> --- a/make/mapfiles/libjfr/mapfile-vers >>>>> +++ b/make/mapfiles/libjfr/mapfile-vers >>>>> @@ -34,6 +34,7 @@ >>>>> Java_oracle_jrockit_jfr_VMJFR_getPeriod; >>>>> Java_oracle_jrockit_jfr_VMJFR_descriptors; >>>>> Java_oracle_jrockit_jfr_VMJFR_redefineClass0; >>>>> + Java_oracle_jrockit_jfr_VMJFR_retransformClasses0; >>>>> JNI_OnLoad; >>>>> local: >>>>> *; >>> >> From jaroslav.bachorik at oracle.com Sun Feb 16 23:53:07 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 17 Feb 2014 08:53:07 +0100 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <52FE46A2.20705@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE46A2.20705@oracle.com> Message-ID: <5301BFE3.7050400@oracle.com> Hi Dan, On 14.2.2014 17:38, Daniel D. Daugherty wrote: > > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 > > Nice job on such an old bug. > > src/share/vm/prims/jvmtiExport.hpp > No comments. > > src/share/vm/prims/jvmtiExport.cpp > line 2170: state->clear_exception_detected(); > HotSpot indent is two spaces. > > src/share/vm/runtime/reflection.cpp > line 948: JvmtiExport::clear_detected_exception((JavaThread*) THREAD); > line 1085: JvmtiExport::clear_detected_exception((JavaThread*) > THREAD); > HotSpot indent is two spaces. Indentation will be fixed. > > So there are two existing tests for this failure mode: > > nsk/jdi/ExceptionEvent/_itself_/exevent006 > nsk/jdi/ExceptionEvent/_itself_/exevent008 > > so I'm presuming that you've run the above tests and they are now > happy, but you haven't really stated that. What other pre-integration > testing do you have planned? Yes, those two tests are not failing any more. 
The full testing would include running both JTREG (testsets 'core' and 'svc') and aurora testbase at least for 'runtime', 'nsk/jvmti' and 'nsk/jdi'. -JB- > > Dan > > > On 2/14/14 4:07 AM, Jaroslav Bachorik wrote: >> This is a round-0 review request. >> >> The reflection code intercepting the exceptions thrown in the invoked >> methods does not play nicely with JVMTI (which, in this case, >> propagates to JDI). >> >> The reflection code lacks the traditional error handler - therefore, >> upon throwing the NumberFormatException, the stack is searched for >> appropriate handlers and none are found. This leaves the >> "exception_detected" flag set to true while normally it would be reset >> to false once the exception is handled. The reflection code then goes >> on and wraps the NumberFormatException into InvocationTargetException >> and throws it. But, alas, the "exception_detected" flag is still set >> to true and no JVMTI exception event will be sent out. >> >> The proposed solution is to call >> thread->jvmti_thread_state()->clear_exception_detected() at the >> appropriate places in the reflection code to reset the >> "exception_detected" flag and enable the InvocationTargetException be >> properly reported over JVMTI. >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >> >> Thanks! >> >> -JB- >> >> > From jaroslav.bachorik at oracle.com Mon Feb 17 00:04:59 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 17 Feb 2014 09:04:59 +0100 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <52FE951E.1060002@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> Message-ID: <5301C2AB.2070600@oracle.com> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: > On 2/14/14 12:33 PM, Daniel D. 
Daugherty wrote: >> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>> Jaroslav, >>> >>> It looks good in general modulo indent comments from Dan. >>> >>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed >>> or right thing to do in the JvmtiExport::clear_detected_exception(). >>> It seems, both clear_exception_detected() and >>> set_exception_detected() are always >>> called on current thread and so, it has to be safe to do without >>> acquiring any locks. >> >> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >> can also be queried/modified by other threads so grabbing the >> associated lock is a good idea. > > The lock synchronization is cooperative. > It does not help much if the lock is not acquired in other places. > I can be wrong, but I've not found yet any place in the code where the > clear_exception_detected() and set_exception_detected() are called > under protection of the JvmtiThreadState_lock. I copied the locking over from "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also supposed to work only with the current thread but acquires the lock nonetheless. But if you are sure that the lock is not required I have no objections removing it. -JB- > > Thanks, > Serguei > >> >> Dan >> >> >>> >>> And I'm repeating my question about pre-integration testing (Dan is >>> asking about the same). >>> >>> Thanks, >>> Serguei >>> >>> >>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>> This is a round-0 review request. >>>> >>>> The reflection code intercepting the exceptions thrown in the >>>> invoked methods does not play nicely with JVMTI (which, in this >>>> case, propagates to JDI). >>>> >>>> The reflection code lacks the traditional error handler - therefore, >>>> upon throwing the NumberFormatException, the stack is searched for >>>> appropriate handlers and none are found. 
This leaves the >>>> "exception_detected" flag set to true while normally it would be >>>> reset to false once the exception is handled. The reflection code >>>> then goes on and wraps the NumberFormatException into >>>> InvocationTargetException and throws it. But, alas, the >>>> "exception_detected" flag is still set to true and no JVMTI >>>> exception event will be sent out. >>>> >>>> The proposed solution is to call >>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>> appropriate places in the reflection code to reset the >>>> "exception_detected" flag and enable the InvocationTargetException >>>> be properly reported over JVMTI. >>>> >>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>> >>>> Thanks! >>>> >>>> -JB- >>> >> > > From fredrik.arvidsson at oracle.com Mon Feb 17 01:46:42 2014 From: fredrik.arvidsson at oracle.com (Fredrik Arvidsson) Date: Mon, 17 Feb 2014 10:46:42 +0100 Subject: RFR(XS): JDK-4515292: ReferenceType.isStatic() returns true for arrays Message-ID: <5301DA82.2070001@oracle.com> Hi please review this minor JDI fix. This bug was found when investigating failing JDI tests. According to the JavaDoc all calls to ReferenceType when it is an array type should return false for isStatic(). Webrev: http://cr.openjdk.java.net/~farvidsson/4515292/webrev.00/ Jira: https://bugs.openjdk.java.net/browse/JDK-4515292 Cheers /Fredrik From jaroslav.bachorik at oracle.com Mon Feb 17 03:01:49 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 17 Feb 2014 12:01:49 +0100 Subject: RFR(XS): JDK-4515292: ReferenceType.isStatic() returns true for arrays In-Reply-To: <5301DA82.2070001@oracle.com> References: <5301DA82.2070001@oracle.com> Message-ID: <5301EC1D.5010407@oracle.com> Looks good! (not a reviewer) -JB- On 17.2.2014 10:46, Fredrik Arvidsson wrote: > Hi please review this minor JDI fix. 
> > This bug was found when investigating failing JDI tests. According to > the JavaDoc all calls to ReferenceType when it is an array type should > return false for isStatic(). > > Webrev: http://cr.openjdk.java.net/~farvidsson/4515292/webrev.00/ > > Jira: https://bugs.openjdk.java.net/browse/JDK-4515292 > > Cheers > /Fredrik From Alan.Bateman at oracle.com Mon Feb 17 07:08:02 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 17 Feb 2014 15:08:02 +0000 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> Message-ID: <530225D2.4040709@oracle.com> On 17/02/2014 05:51, Mikael Vidstedt wrote: > > I'm inclined to agree with this. Since the code depends on a specific > behavior of isspace which does not match what the system provided > function does I too think it would be more robust to implement our own > version of it. I completely agree that changing this code to use its own isspace is the right thing, it just seems a bit much for a drive-by fixed to gcc warnings. Do either of you want to take it? -Alan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140217/9d7b6232/attachment.html From serguei.spitsyn at oracle.com Mon Feb 17 11:44:35 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 17 Feb 2014 11:44:35 -0800 Subject: RFR(XS): JDK-4515292: ReferenceType.isStatic() returns true for arrays In-Reply-To: <5301DA82.2070001@oracle.com> References: <5301DA82.2070001@oracle.com> Message-ID: <530266A3.50406@oracle.com> Looks good. Thank you for fixing it! 
Serguei On 2/17/14 1:46 AM, Fredrik Arvidsson wrote: > Hi please review this minor JDI fix. > > This bug was found when investigating failing JDI tests. According to > the JavaDoc all calls to ReferenceType when it is an array type should > return false for isStatic(). > > Webrev: http://cr.openjdk.java.net/~farvidsson/4515292/webrev.00/ > > Jira: https://bugs.openjdk.java.net/browse/JDK-4515292 > > Cheers > /Fredrik From mikael.vidstedt at oracle.com Mon Feb 17 19:59:04 2014 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Mon, 17 Feb 2014 19:59:04 -0800 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <530225D2.4040709@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> <530225D2.4040709@oracle.com> Message-ID: <5302DA88.9060306@oracle.com> On 2014-02-17 07:08, Alan Bateman wrote: > On 17/02/2014 05:51, Mikael Vidstedt wrote: >> >> I'm inclined to agree with this. Since the code depends on a specific >> behavior of isspace which does not match what the system provided >> function does I too think it would be more robust to implement our >> own version of it. > I completely agree that changing this code to use its own isspace is > the right thing, it just seems a bit much for a drive-by fixed to gcc > warnings. Do either of you want to take it? How about: http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ Cheers, Mikael -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140217/0cf3246b/attachment.html From david.holmes at oracle.com Mon Feb 17 21:02:29 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 18 Feb 2014 15:02:29 +1000 Subject: RFR (S) 8034249: need more workarounds for suspend equivalent condition issue In-Reply-To: <52FD5CC3.4090502@oracle.com> References: <52FD5CC3.4090502@oracle.com> Message-ID: <5302E965.3090505@oracle.com> Hi Serguei, This looks good to me. I wonder if we will reach a point where we can delete is_thread_fully_suspended? ;-) David On 14/02/2014 10:01 AM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-8034249 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1 > > > Summary: > > This issue was identified in the review of the 8032223 and it is > similar to the 8032223 > but impacts different JVMTI functions: > GetCurrentContendedMonitor, GetOwnedMonitorInfo, > GetOwnedMonitorStackDepthInfo, GetStackTrace > > There is a general issue in the suspend equivalent condition mechanism: > Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may > return different results: > - 1-st: true > - 2-nd: false > > This suspend equivalent issue is covered by another bug: > https://bugs.openjdk.java.net/browse/JDK-6280037 > > This fix is to work around the 6280037. > It is more safe to collect the necesary information at a safepoint > instead of > relying on the suspension of the target thread. 
> > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi > > > Thanks, > Serguei From david.holmes at oracle.com Mon Feb 17 21:28:15 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 18 Feb 2014 15:28:15 +1000 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <52FDF8DD.9090602@oracle.com> References: <52FDF8DD.9090602@oracle.com> Message-ID: <5302EF6F.9090204@oracle.com> Hi Jaroslav, It seems to me that this issue extends to other places in the VM. In particular class initialization in instanceKlass.cpp - anywhere that one exception is "caught" in the VM and then wrapped with, or replaced by, another exception, will only notify JVMTI of the original exception. David On 14/02/2014 9:07 PM, Jaroslav Bachorik wrote: > This is a round-0 review request. > > The reflection code intercepting the exceptions thrown in the invoked > methods does not play nicely with JVMTI (which, in this case, propagates > to JDI). > > The reflection code lacks the traditional error handler - therefore, > upon throwing the NumberFormatException, the stack is searched for > appropriate handlers and none are found. This leaves the > "exception_detected" flag set to true while normally it would be reset > to false once the exception is handled. The reflection code then goes on > and wraps the NumberFormatException into InvocationTargetException and > throws it. But, alas, the "exception_detected" flag is still set to true > and no JVMTI exception event will be sent out. > > The proposed solution is to call > thread->jvmti_thread_state()->clear_exception_detected() at the > appropriate places in the reflection code to reset the > "exception_detected" flag and enable the InvocationTargetException be > properly reported over JVMTI. > > Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 > > Thanks! 
> > -JB- From serguei.spitsyn at oracle.com Mon Feb 17 21:33:08 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 17 Feb 2014 21:33:08 -0800 Subject: RFR (S) 8034249: need more workarounds for suspend equivalent condition issue In-Reply-To: <5302E965.3090505@oracle.com> References: <52FD5CC3.4090502@oracle.com> <5302E965.3090505@oracle.com> Message-ID: <5302F094.7090006@oracle.com> Thank you a lot, David! On 2/17/14 9:02 PM, David Holmes wrote: > Hi Serguei, > > This looks good to me. > > I wonder if we will reach a point where we can delete > is_thread_fully_suspended? ;-) I know what you mean by this. :) There are still some space to improve safety with the safepoint mechanizm. Of course, the is_thread_fully_suspended() is still needed for external JVMTI/JDI purposes. Thanks, Serguei > > David > > On 14/02/2014 10:01 AM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-8034249 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1 >> >> >> >> Summary: >> >> This issue was identified in the review of the 8032223 and it is >> similar to the 8032223 >> but impacts different JVMTI functions: >> GetCurrentContendedMonitor, GetOwnedMonitorInfo, >> GetOwnedMonitorStackDepthInfo, GetStackTrace >> >> There is a general issue in the suspend equivalent condition >> mechanism: >> Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may >> return different results: >> - 1-st: true >> - 2-nd: false >> >> This suspend equivalent issue is covered by another bug: >> https://bugs.openjdk.java.net/browse/JDK-6280037 >> >> This fix is to work around the 6280037. >> It is more safe to collect the necesary information at a safepoint >> instead of >> relying on the suspension of the target thread. 
>> >> >> Testing: >> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >> >> >> Thanks, >> Serguei From Alan.Bateman at oracle.com Tue Feb 18 00:33:40 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 18 Feb 2014 08:33:40 +0000 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <5302DA88.9060306@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> <530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com> Message-ID: <53031AE4.2050700@oracle.com> On 18/02/2014 03:59, Mikael Vidstedt wrote: > > How about: > > http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ > > Cheers, > Mikael > I checked the java.lang.instrument spec and for the Boot-Class-Path attribute then it doesn't say any more than "space". It might be worth checking the manifest parsing code (parse_manfiest.c) to see how continuations are handled as I suspect \r and \n can't appear in the attribute value (in which case the check might really only need to be for space and \t. Otherwise replacing isspace is good and your isspaceAscii is likely to match the libc isspace (at runtime). This code isn't performance sensitive but maybe check space first would be a bit better. Also the library native code using 4 space indent rather than hotspot's 2. I created JDK-8035054 a few days ago to track this. Thanks for taking it as I am busy with a number of other things at the moment. -Alan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140218/8c866848/attachment.html From jaroslav.bachorik at oracle.com Tue Feb 18 00:47:36 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 18 Feb 2014 09:47:36 +0100 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5302EF6F.9090204@oracle.com> References: <52FDF8DD.9090602@oracle.com> <5302EF6F.9090204@oracle.com> Message-ID: <53031E28.1060007@oracle.com> Hi David, On 18.2.2014 06:28, David Holmes wrote: > Hi Jaroslav, > > It seems to me that this issue extends to other places in the VM. In > particular class initialization in instanceKlass.cpp - anywhere that one > exception is "caught" in the VM and then wrapped with, or replaced by, > another exception, will only notify JVMTI of the original exception. Thanks for pointing this out. Turns out there is another location in jvm.cpp which needs the same treatment. BTW, what is your take on the necessity to grab the JvmtiThreadState_lock before cleaning the detected exception in the jvmti thread state? -JB- > > David > > On 14/02/2014 9:07 PM, Jaroslav Bachorik wrote: >> This is a round-0 review request. >> >> The reflection code intercepting the exceptions thrown in the invoked >> methods does not play nicely with JVMTI (which, in this case, propagates >> to JDI). >> >> The reflection code lacks the traditional error handler - therefore, >> upon throwing the NumberFormatException, the stack is searched for >> appropriate handlers and none are found. This leaves the >> "exception_detected" flag set to true while normally it would be reset >> to false once the exception is handled. The reflection code then goes on >> and wraps the NumberFormatException into InvocationTargetException and >> throws it. But, alas, the "exception_detected" flag is still set to true >> and no JVMTI exception event will be sent out. 
>> >> The proposed solution is to call >> thread->jvmti_thread_state()->clear_exception_detected() at the >> appropriate places in the reflection code to reset the >> "exception_detected" flag and enable the InvocationTargetException be >> properly reported over JVMTI. >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >> >> Thanks! >> >> -JB- From dmitry.samersoff at oracle.com Tue Feb 18 00:58:39 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Tue, 18 Feb 2014 12:58:39 +0400 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <5302DA88.9060306@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> <530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com> Message-ID: <530320BF.2070607@oracle.com> Mikael, > http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ 1. I agree that ctypes isspace usually cause more problems than solve and it's good to have our own version. 2. one of possible implementation is #define isspaceASCII(c) (strchr(SPACE_CHARS,c) != NULL) -Dmitry On 2014-02-18 07:59, Mikael Vidstedt wrote: > > On 2014-02-17 07:08, Alan Bateman wrote: >> On 17/02/2014 05:51, Mikael Vidstedt wrote: >>> >>> I'm inclined to agree with this. Since the code depends on a specific >>> behavior of isspace which does not match what the system provided >>> function does I too think it would be more robust to implement our >>> own version of it. >> I completely agree that changing this code to use its own isspace is >> the right thing, it just seems a bit much for a drive-by fixed to gcc >> warnings. Do either of you want to take it? 
> > How about: > > http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ > > Cheers, > Mikael > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From david.holmes at oracle.com Tue Feb 18 01:39:42 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 18 Feb 2014 19:39:42 +1000 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <53031E28.1060007@oracle.com> References: <52FDF8DD.9090602@oracle.com> <5302EF6F.9090204@oracle.com> <53031E28.1060007@oracle.com> Message-ID: <53032A5E.2000705@oracle.com> On 18/02/2014 6:47 PM, Jaroslav Bachorik wrote: > Hi David, > > On 18.2.2014 06:28, David Holmes wrote: >> Hi Jaroslav, >> >> It seems to me that this issue extends to other places in the VM. In >> particular class initialization in instanceKlass.cpp - anywhere that one >> exception is "caught" in the VM and then wrapped with, or replaced by, >> another exception, will only notify JVMTI of the original exception. > > Thanks for pointing this out. Turns out there is another location in > jvm.cpp which needs the same treatment. > > BTW, what is your take on the necessity to grab the > JvmtiThreadState_lock before cleaning the detected exception in the > jvmti thread state? I would need to analyze all of the code that accesses it to determine that. My initial thought was that it seemed unnecessary and I did look at some of the code which seemed to indicate other threads would only access it at a safepoint. But there may be other access points that I'm not aware of. David > -JB- > >> >> David >> >> On 14/02/2014 9:07 PM, Jaroslav Bachorik wrote: >>> This is a round-0 review request. >>> >>> The reflection code intercepting the exceptions thrown in the invoked >>> methods does not play nicely with JVMTI (which, in this case, propagates >>> to JDI). 
>>> >>> The reflection code lacks the traditional error handler - therefore, >>> upon throwing the NumberFormatException, the stack is searched for >>> appropriate handlers and none are found. This leaves the >>> "exception_detected" flag set to true while normally it would be reset >>> to false once the exception is handled. The reflection code then goes on >>> and wraps the NumberFormatException into InvocationTargetException and >>> throws it. But, alas, the "exception_detected" flag is still set to true >>> and no JVMTI exception event will be sent out. >>> >>> The proposed solution is to call >>> thread->jvmti_thread_state()->clear_exception_detected() at the >>> appropriate places in the reflection code to reset the >>> "exception_detected" flag and enable the InvocationTargetException be >>> properly reported over JVMTI. >>> >>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>> >>> Thanks! >>> >>> -JB- > From fredrik.arvidsson at oracle.com Tue Feb 18 01:43:36 2014 From: fredrik.arvidsson at oracle.com (Fredrik Arvidsson) Date: Tue, 18 Feb 2014 10:43:36 +0100 Subject: RFR(XS): JDK-8034203: Change JavaDoc for com.sun.jdi.request.EventRequest.setEnabled(boolean val) Message-ID: <53032B48.2060003@oracle.com> Hi Please help me review this small change: This is a minor change to the JavaDoc in the JDI API. The problem was discovered when trying to fix a failing test. Webrev: 8034203 Bug: JDK-8034203 CCC: 8034203 Thanks /Fredrik -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140218/25306b71/attachment.html From staffan.larsen at oracle.com Tue Feb 18 02:08:16 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 18 Feb 2014 11:08:16 +0100 Subject: RFR(XS): JDK-4515292: ReferenceType.isStatic() returns true for arrays In-Reply-To: <5301DA82.2070001@oracle.com> References: <5301DA82.2070001@oracle.com> Message-ID: <7A8179FC-A3DF-4C35-BE95-143305F1C253@oracle.com> Looks good! Thanks, /Staffan On 17 feb 2014, at 10:46, Fredrik Arvidsson wrote: > Hi please review this minor JDI fix. > > This bug was found when investigating failing JDI tests. According to the JavaDoc all calls to ReferenceType when it is an array type should return false for isStatic(). > > Webrev: http://cr.openjdk.java.net/~farvidsson/4515292/webrev.00/ > Jira: https://bugs.openjdk.java.net/browse/JDK-4515292 > > Cheers > /Fredrik From serguei.spitsyn at oracle.com Tue Feb 18 02:18:32 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 18 Feb 2014 02:18:32 -0800 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5301C2AB.2070600@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> Message-ID: <53033378.2060300@oracle.com> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: > On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>> Jaroslav, >>>> >>>> It looks good in general modulo indent comments from Dan. >>>> >>>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed >>>> or right thing to do in the JvmtiExport::clear_detected_exception(). 
>>>> It seems, both clear_exception_detected() and >>>> set_exception_detected() are always >>>> called on current thread and so, it has to be safe to do without >>>> acquiring any locks. >>> >>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>> can also be queried/modified by other threads so grabbing the >>> associated lock is a good idea. >> >> The lock synchronization is cooperative. >> It does not help much if the lock is not acquired in other places. >> I can be wrong, but I've not found yet any place in the code where the >> clear_exception_detected() and set_exception_detected() are called >> under protection of the JvmtiThreadState_lock. > > I copied the locking over from > "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also > supposed to work only with the current thread but acquires the lock > nonetheless. But if you are sure that the lock is not required I have > no objections removing it. I'm suggesting to remove it, as it is not used in other places in the code. It is going to be confusing if it is used in one place and missed in others. Thanks, Serguei > > -JB- > >> >> Thanks, >> Serguei >> >>> >>> Dan >>> >>> >>>> >>>> And I'm repeating my question about pre-integration testing (Dan is >>>> asking about the same). >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>> This is a round-0 review request. >>>>> >>>>> The reflection code intercepting the exceptions thrown in the >>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>> case, propagates to JDI). >>>>> >>>>> The reflection code lacks the traditional error handler - therefore, >>>>> upon throwing the NumberFormatException, the stack is searched for >>>>> appropriate handlers and none are found. This leaves the >>>>> "exception_detected" flag set to true while normally it would be >>>>> reset to false once the exception is handled. 
The reflection code >>>>> then goes on and wraps the NumberFormatException into >>>>> InvocationTargetException and throws it. But, alas, the >>>>> "exception_detected" flag is still set to true and no JVMTI >>>>> exception event will be sent out. >>>>> >>>>> The proposed solution is to call >>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>> appropriate places in the reflection code to reset the >>>>> "exception_detected" flag and enable the InvocationTargetException >>>>> be properly reported over JVMTI. >>>>> >>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>> >>>>> Thanks! >>>>> >>>>> -JB- >>>> >>> >> >> > From staffan.larsen at oracle.com Tue Feb 18 02:19:24 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 18 Feb 2014 11:19:24 +0100 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: References: Message-ID: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> Still looking for Reviewer for this change. Thanks, /Staffan On 11 feb 2014, at 15:12, Staffan Larsen wrote: > Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster. > > bug: https://bugs.openjdk.java.net/browse/JDK-6952105 > webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ > > Thanks, > /Staffan From serguei.spitsyn at oracle.com Tue Feb 18 03:42:20 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 18 Feb 2014 03:42:20 -0800 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> Message-ID: <5303471C.4060405@oracle.com> The fix looks good. 
Thanks, Serguei On 2/18/14 2:19 AM, Staffan Larsen wrote: > Still looking for Reviewer for this change. > > Thanks, > /Staffan > > On 11 feb 2014, at 15:12, Staffan Larsen wrote: > >> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster. >> >> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >> >> Thanks, >> /Staffan From david.holmes at oracle.com Tue Feb 18 04:09:49 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 18 Feb 2014 22:09:49 +1000 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> Message-ID: <53034D8D.7050908@oracle.com> Hi Staffan, If you get a spurious wakeup from wait(): 151 try { 152 synchronized (bkptSignal) { 153 bkptSignal.wait(5000); 154 } 155 } catch (InterruptedException ee) { 156 } 157 if (prevBkptCount == bkptCount) { 158 failure("failure: test hung"); you could report failure. But that is far less likely than the current problem using sleep. David On 18/02/2014 8:19 PM, Staffan Larsen wrote: > Still looking for Reviewer for this change. > > Thanks, > /Staffan > > On 11 feb 2014, at 15:12, Staffan Larsen wrote: > >> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster. 
>> >> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >> >> Thanks, >> /Staffan > From staffan.larsen at oracle.com Tue Feb 18 05:03:07 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 18 Feb 2014 14:03:07 +0100 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: <53034D8D.7050908@oracle.com> References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> <53034D8D.7050908@oracle.com> Message-ID: On 18 feb 2014, at 13:09, David Holmes wrote: > Hi Staffan, > > If you get a spurious wakeup from wait(): > > 151 try { > 152 synchronized (bkptSignal) { > 153 bkptSignal.wait(5000); > 154 } > 155 } catch (InterruptedException ee) { > 156 } > 157 if (prevBkptCount == bkptCount) { > 158 failure("failure: test hung"); > > you could report failure. But that is far less likely than the current problem using sleep. Right. Adding "continue;" inside the catch(InterruptedException) block should guard against that. /Staffan > > David > > On 18/02/2014 8:19 PM, Staffan Larsen wrote: >> Still looking for Reviewer for this change. >> >> Thanks, >> /Staffan >> >> On 11 feb 2014, at 15:12, Staffan Larsen wrote: >> >>> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster. >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >>> >>> Thanks, >>> /Staffan >> From jaroslav.bachorik at oracle.com Tue Feb 18 08:22:36 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 18 Feb 2014 17:22:36 +0100 Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object Message-ID: <530388CC.9080807@oracle.com> Please, review the following test change.
Issue : https://bugs.openjdk.java.net/browse/JDK-8034168 Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00 The test fails because of falsely evaluating the thread being parked as actually waiting on a monitor. This is because there is no difference in java thread state for those two situations. The test is using Phaser for synchronization between the checked and checking thread to make sure an appropriate code section is entered before performing asserts. Then it checks the checked thread state and waits till it becomes WAITING. Unfortunately, when Phaser needs to wait it parks the thread and sets the thread state to WAITING. From now on the test is in a completely random state and the result will largely depend on timing - thus failing intermittently. The solution is to use an additional volatile variable to prevent falsely indicating the park() induced WAITING state. Thanks, -JB- From martinrb at google.com Tue Feb 18 09:06:39 2014 From: martinrb at google.com (Martin Buchholz) Date: Tue, 18 Feb 2014 09:06:39 -0800 Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object In-Reply-To: <530388CC.9080807@oracle.com> References: <530388CC.9080807@oracle.com> Message-ID: Not checking any details, but tests that want to wait for a particular thread state are a good reason to use volatile boolean flag; ... while (!flag) Thread.yield(); I prefer calling Thread.yield to sleeping in this special case, in part because I don't want to rely on the implementation of sleep, while yield is semantically a no-op. (Also sleeping 100ms is a long time for a computer) On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik < jaroslav.bachorik at oracle.com> wrote: > Please, review the following test change. > > Issue : https://bugs.openjdk.java.net/browse/JDK-8034168 > Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00 > > The test fails because of falsely evaluating the thread being parked as > actually waiting on a monitor. 
This is because there is no difference in > java thread state for those two situations. The test is using Phaser for > synchronization between the checked and checking thread to make sure an > appropriate code section is entered before performing asserts. Then it > checks the checked thread state and waits till it becomes WAITING. > Unfortunately, when Phaser needs to wait it parks the thread and sets the > thread state to WAITING. From now on the test is in a completely random > state and the result will largely depend on timing - thus failing > intermittently. > > The solution is to use an additional volatile variable to prevent falsely > indicating the park() induced WAITING state. > > Thanks, > > -JB- > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140218/9f0cd912/attachment.html From mikael.vidstedt at oracle.com Tue Feb 18 11:45:33 2014 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 18 Feb 2014 11:45:33 -0800 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <53031AE4.2050700@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> <530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com> <53031AE4.2050700@oracle.com> Message-ID: <5303B85D.10505@oracle.com> On 2014-02-18 00:33, Alan Bateman wrote: > On 18/02/2014 03:59, Mikael Vidstedt wrote: >> >> How about: >> >> http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ >> >> Cheers, >> Mikael >> > I checked the java.lang.instrument spec and for the Boot-Class-Path > attribute then it doesn't say any more than "space". 
It might be worth > checking the manifest parsing code (parse_manifest.c) to see how > continuations are handled as I suspect \r and \n can't appear in the > attribute value (in which case the check might really only need to be > for space and \t). That makes sense, and in fact parse_manifest.c does not even appear to allow for \t, so I'm more and more starting to think that a reasonable implementation in this context would be: static int isNormalSpace(int c) { return c == ' '; } In which case it probably shouldn't even be a separate function to start with. I would like to get a second opinion on the implications of only checking for ' ' (0x20) though. If we want to allow both ' ' and \t we should probably call the function isblankAscii. > Otherwise replacing isspace is good and your isspaceAscii is likely to > match the libc isspace (at runtime). This code isn't performance > sensitive but maybe checking for space first would be a bit better. Also the > library native code uses 4 space indent rather than hotspot's 2. Will fix indentation. I seriously doubt that the performance difference warrants the more complicated code. > I created JDK-8035054 a few days ago to track this. Thanks for taking > it as I am busy with a number of other things at the moment. Always for you, sir! ;) /Mikael -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140218/ad9ae5c7/attachment.html From david.holmes at oracle.com Tue Feb 18 11:49:58 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 19 Feb 2014 05:49:58 +1000 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> <53034D8D.7050908@oracle.com> Message-ID: <5303B966.20709@oracle.com> On 18/02/2014 11:03 PM, Staffan Larsen wrote: > > On 18 feb 2014, at 13:09, David Holmes wrote: > >> Hi Staffan, >> >> If you get a spurious wakeup from wait(): >> >> 151 try { >> 152 synchronized (bkptSignal) { >> 153 bkptSignal.wait(5000); >> 154 } >> 155 } catch (InterruptedException ee) { >> 156 } >> 157 if (prevBkptCount == bkptCount) { >> 158 failure("failure: test hung"); >> >> you could report failure. But that is far less likely than the current problem using sleep. > > Right. Adding ?continue;? inside the catch(InterruptedException) block should guard against that. No, a spurious wakeup is not an interrupt - the wait() will simply return. David > > /Staffan > >> >> David >> >> On 18/02/2014 8:19 PM, Staffan Larsen wrote: >>> Still looking for Reviewer for this change. >>> >>> Thanks, >>> /Staffan >>> >>> On 11 feb 2014, at 15:12, Staffan Larsen wrote: >>> >>>> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster. >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >>>> >>>> Thanks, >>>> /Staffan >>> > From sitnikov.vladimir at gmail.com Tue Feb 18 11:57:46 2014 From: sitnikov.vladimir at gmail.com (Vladimir Sitnikov) Date: Tue, 18 Feb 2014 23:57:46 +0400 Subject: JDK-8005604, instance size/metaspace data in HPROF Message-ID: Seems like HPROF does not include information on non-heap memory. 
It would be nice to have the accurate class size (e.g. with @Contended), space wastage, permgen/metaspace data (e.g. constant pool size or whatever is in the C-heap), and code cache blobs in the heap dump. Alexey Shipilëv raised a similar question regarding instance size in CR 8005604: HPROF should report the actual instance size [1], [2]; however, it seems no progress has been made. Creating a brand-new format does not look like an easy solution, and it is not clear how new features would be added later. It is also not clear why a new format would be better in terms of the effort needed to update tooling. What do you think if we keep the HPROF format completely intact and dump the additional information via synthetic classes and objects? For instance, to dump the precise instance size of a java class, we include a "vm.native.synthetic.ClassInfo { int instanceSize; int spaceLoss; }" class in the generated HPROF file. For each java class we dump an "instance" of this ClassInfo. This will reveal the true instance size and the space loss (e.g. alignment, @Contended, etc) to the heap analyzer. The resulting HPROF will be fully compatible with existing analyzers, so no significant harm is done. To associate java.lang.Class with its vm.native metadata, a new synthetic static reference from java.lang.Class to vm.native.synthetic.ClassInfo is added. Another approach is to reference from vm.native to java.lang.Class; however, that will be harder to analyze (more clicks, more complex queries). Pros: 1) We can dump the real instance size or any new information while keeping the HPROF format intact. 2) Current tools will parse and display the dump just fine. With moderate effort tools can be improved to use this "metadata" for calculations (and proper accounting of "used heap size"). 3) The query languages of memory analyzers (e.g. map-reduce-javascript in VisualVM, OQL/SQL in Eclipse MAT, etc) can leverage this new data. The synthetic-class approach does not require updating the analyzer for that. 4) More details (e.g.
code cache blobs or whatever) can be added to the dump in a similar way while still keeping forward and backward compatibility in terms of file format. Cons: 1) I am not sure if iteration over permgen/metaspace/codecache is safe in terms of crashes (or whatever). It is not a good idea to crash when writing an HPROF dump (especially during OutOfMemoryError); however, I believe at least basic information (e.g. instance size) should be available in a safe way. 2) Existing analyzers will charge "heap size" for synthetic classes/objects. I do not think it is a big deal; however, if we dump permgen/metaspace/codecache, then the amount of synthetic objects might be significant enough to confuse unsuspecting engineers. [1] https://bugs.openjdk.java.net/browse/JDK-8005604 [2] http://mail.openjdk.java.net/pipermail/serviceability-dev/2012-December/007852.html -- Regards, Vladimir Sitnikov From shanliang.jiang at oracle.com Tue Feb 18 13:17:40 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 18 Feb 2014 22:17:40 +0100 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: <5303B966.20709@oracle.com> References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> <53034D8D.7050908@oracle.com> <5303B966.20709@oracle.com> Message-ID: <5303CDF4.1050702@oracle.com> I am looking at the old file: 143 while (bkptCount < maxBkpts) { 144 prevBkptCount = bkptCount; Suppose the following execution sequence: 1) when Line 143 was executed by thread1, we had bkptCount == maxBkpts - 1; 2) bkptCount++ was executed by thread2; 3) Line 144 was executed by thread1. In this case it is certain that the line 152 failure("failure: test hung"); would be called.
It is good to add: synchronized (bkptSignal) in the fix, but we need to put Line 143 and 144 into synchronization too. To deal with a spurious wakeup, we might do like this: long stopTime = System.currentTimeMillis() + 5000; do { try { bkptSignal.wait(100); } catch (InterruptedException e){} } while(prevBkptCount == bkptCount && System.currentTimeMillis() < stopTime); Shanliang David Holmes wrote: > On 18/02/2014 11:03 PM, Staffan Larsen wrote: >> >> On 18 feb 2014, at 13:09, David Holmes wrote: >> >>> Hi Staffan, >>> >>> If you get a spurious wakeup from wait(): >>> >>> 151 try { >>> 152 synchronized (bkptSignal) { >>> 153 bkptSignal.wait(5000); >>> 154 } >>> 155 } catch (InterruptedException ee) { >>> 156 } >>> 157 if (prevBkptCount == bkptCount) { >>> 158 failure("failure: test hung"); >>> >>> you could report failure. But that is far less likely than the >>> current problem using sleep. >> >> Right. Adding ?continue;? inside the catch(InterruptedException) >> block should guard against that. > > No, a spurious wakeup is not an interrupt - the wait() will simply > return. > > David >> >> /Staffan >> >>> >>> David >>> >>> On 18/02/2014 8:19 PM, Staffan Larsen wrote: >>>> Still looking for Reviewer for this change. >>>> >>>> Thanks, >>>> /Staffan >>>> >>>> On 11 feb 2014, at 15:12, Staffan Larsen >>>> wrote: >>>> >>>>> Updated the test to use proper synchronization and notification >>>>> between threads. Should be more stable and much faster. 
>>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >>>>> >>>>> Thanks, >>>>> /Staffan >>>> >> From jaroslav.bachorik at oracle.com Tue Feb 18 23:29:58 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 19 Feb 2014 08:29:58 +0100 Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object In-Reply-To: References: <530388CC.9080807@oracle.com> Message-ID: <53045D76.5080907@oracle.com> On 18.2.2014 18:06, Martin Buchholz wrote: > Not checking any details, but tests that want to wait for a particular > thread state are a good reason to use > > volatile boolean flag; > ... > while (!flag) Thread.yield(); > > I prefer calling Thread.yield to sleeping in this special case, in part > because I don't want to rely on the implementation of sleep, while yield is > semantically a no-op. (Also sleeping 100ms is a long time for a computer) There were discussions for a similar fix regarding Thread.yield(). The concern was that using Thread.yield() in a tight loop might very easily lead to starvation on single core machines. Therefore Thread.sleep(10) is used to be sure the flag setting thread has actually a chance to progress. -JB- > > > > On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik < > jaroslav.bachorik at oracle.com> wrote: > >> Please, review the following test change. >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00 >> >> The test fails because of falsely evaluating the thread being parked as >> actually waiting on a monitor. This is because there is no difference in >> java thread state for those two situations. The test is using Phaser for >> synchronization between the checked and checking thread to make sure an >> appropriate code section is entered before performing asserts. Then it >> checks the checked thread state and waits till it becomes WAITING. 
>> Unfortunately, when Phaser needs to wait it parks the thread and sets the >> thread state to WAITING. From now on the test is in a completely random >> state and the result will largely depend on timing - thus failing >> intermittently. >> >> The solution is to use an additional volatile variable to prevent falsely >> indicating the park() induced WAITING state. >> >> Thanks, >> >> -JB- >> > From shanliang.jiang at oracle.com Wed Feb 19 00:17:45 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Wed, 19 Feb 2014 09:17:45 +0100 Subject: Codereview request: 8035195 demo/jvmti/mtrace/TraceJFrame.java can't connect to X11 Message-ID: <530468A9.7080509@oracle.com> Hi, The failure was from a bad DISPLAY setting, this must be not intended, so we re-throw the exception and tell to make sure that the DISPLAY is correct. The fix is only to add a catch of InternalError at the beginning, the rest code is not modified but re-formated to remove some spaces. webrev: http://cr.openjdk.java.net/~sjiang/JDK-8035195/00 / bug: https://bugs.openjdk.java.net/browse/JDK-8035195 Thanks, Shanliang From david.holmes at oracle.com Tue Feb 18 22:09:34 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 19 Feb 2014 16:09:34 +1000 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: <5303CDF4.1050702@oracle.com> References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> <53034D8D.7050908@oracle.com> <5303B966.20709@oracle.com> <5303CDF4.1050702@oracle.com> Message-ID: <53044A9E.6050507@oracle.com> On 19/02/2014 7:17 AM, shanliang wrote: > I am looking at the old file: > 143 while (bkptCount < maxBkpts) { > 144 prevBkptCount = bkptCount; > > suppose the following execution sequence: > 1) when Line 143 was called by Thread1, we had bkptCount == maxBkpts - 1; > > 2) bkptCount++ was executed by thread2; > > 3) Line 144 was called by thread1, > > in this case it was sure that the line > 152 failure("failure: 
test hung"); > would be called. Yes I was looking at that race too. The comments suggest that we should never reach a point where we get to maxBkpts, so this failure would be very rare and would likely indicate a real problem. > It is good to add: > synchronized (bkptSignal) > in the fix, but we need to put Line 143 and 144 into synchronization too. > > To deal with a spurious wakeup, we might do like this: > long stopTime = System.currentTimeMillis() + 5000; > do { > try { > bkptSignal.wait(100); > } catch (InterruptedException e){} > } while(prevBkptCount == bkptCount && System.currentTimeMillis() > < stopTime); It is better to use System.nanoTime() rather than the non-monotonic currentTimeMillis(). And you really want a while loop rather than do-while so we don't always do that 100ms wait. David > Shanliang > > David Holmes wrote: >> On 18/02/2014 11:03 PM, Staffan Larsen wrote: >>> >>> On 18 feb 2014, at 13:09, David Holmes wrote: >>> >>>> Hi Staffan, >>>> >>>> If you get a spurious wakeup from wait(): >>>> >>>> 151 try { >>>> 152 synchronized (bkptSignal) { >>>> 153 bkptSignal.wait(5000); >>>> 154 } >>>> 155 } catch (InterruptedException ee) { >>>> 156 } >>>> 157 if (prevBkptCount == bkptCount) { >>>> 158 failure("failure: test hung"); >>>> >>>> you could report failure. But that is far less likely than the >>>> current problem using sleep. >>> >>> Right. Adding ?continue;? inside the catch(InterruptedException) >>> block should guard against that. >> >> No, a spurious wakeup is not an interrupt - the wait() will simply >> return. >> >> David >>> >>> /Staffan >>> >>>> >>>> David >>>> >>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote: >>>>> Still looking for Reviewer for this change. >>>>> >>>>> Thanks, >>>>> /Staffan >>>>> >>>>> On 11 feb 2014, at 15:12, Staffan Larsen >>>>> wrote: >>>>> >>>>>> Updated the test to use proper synchronization and notification >>>>>> between threads. Should be more stable and much faster. 
>>>>>> >>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >>>>>> >>>>>> Thanks, >>>>>> /Staffan >>>>> >>> > From staffan.larsen at oracle.com Wed Feb 19 01:01:59 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 19 Feb 2014 10:01:59 +0100 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: <53044A9E.6050507@oracle.com> References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> <53034D8D.7050908@oracle.com> <5303B966.20709@oracle.com> <5303CDF4.1050702@oracle.com> <53044A9E.6050507@oracle.com> Message-ID: <6B94A897-2385-472A-B0CA-B0FC56D1D818@oracle.com> Thanks for the feedback! I chose to use yet another variable to avoid the spurious wakeups. I've also increased the range of the synchronized statement to avoid the race. http://cr.openjdk.java.net/~sla/6952105/webrev.01/ Thanks, /Staffan On 19 feb 2014, at 07:09, David Holmes wrote: > On 19/02/2014 7:17 AM, shanliang wrote: >> I am looking at the old file: >> 143 while (bkptCount < maxBkpts) { >> 144 prevBkptCount = bkptCount; >> >> suppose the following execution sequence: >> 1) when Line 143 was called by Thread1, we had bkptCount == maxBkpts - 1; >> >> 2) bkptCount++ was executed by thread2; >> >> 3) Line 144 was called by thread1, >> >> in this case it was sure that the line >> 152 failure("failure: test hung"); >> would be called. > > Yes I was looking at that race too. The comments suggest that we should never reach a point where we get to maxBkpts, so this failure would be very rare and would likely indicate a real problem. > >> It is good to add: >> synchronized (bkptSignal) >> in the fix, but we need to put Line 143 and 144 into synchronization too.
>> >> To deal with a spurious wakeup, we might do like this: >> long stopTime = System.currentTimeMillis() + 5000; >> do { >> try { >> bkptSignal.wait(100); >> } catch (InterruptedException e){} >> } while(prevBkptCount == bkptCount && System.currentTimeMillis() >> < stopTime); > > It is better to use System.nanoTime() rather than the non-monotonic currentTimeMillis(). And you really want a while loop rather than do-while so we don't always do that 100ms wait. > > David > >> Shanliang >> >> David Holmes wrote: >>> On 18/02/2014 11:03 PM, Staffan Larsen wrote: >>>> >>>> On 18 feb 2014, at 13:09, David Holmes wrote: >>>> >>>>> Hi Staffan, >>>>> >>>>> If you get a spurious wakeup from wait(): >>>>> >>>>> 151 try { >>>>> 152 synchronized (bkptSignal) { >>>>> 153 bkptSignal.wait(5000); >>>>> 154 } >>>>> 155 } catch (InterruptedException ee) { >>>>> 156 } >>>>> 157 if (prevBkptCount == bkptCount) { >>>>> 158 failure("failure: test hung"); >>>>> >>>>> you could report failure. But that is far less likely than the >>>>> current problem using sleep. >>>> >>>> Right. Adding ?continue;? inside the catch(InterruptedException) >>>> block should guard against that. >>> >>> No, a spurious wakeup is not an interrupt - the wait() will simply >>> return. >>> >>> David >>>> >>>> /Staffan >>>> >>>>> >>>>> David >>>>> >>>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote: >>>>>> Still looking for Reviewer for this change. >>>>>> >>>>>> Thanks, >>>>>> /Staffan >>>>>> >>>>>> On 11 feb 2014, at 15:12, Staffan Larsen >>>>>> wrote: >>>>>> >>>>>>> Updated the test to use proper synchronization and notification >>>>>>> between threads. Should be more stable and much faster. >>>>>>> >>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >>>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >>>>>>> >>>>>>> Thanks, >>>>>>> /Staffan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140219/352fb18b/attachment-0001.html From david.holmes at oracle.com Wed Feb 19 01:38:51 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 19 Feb 2014 19:38:51 +1000 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: <6B94A897-2385-472A-B0CA-B0FC56D1D818@oracle.com> References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> <53034D8D.7050908@oracle.com> <5303B966.20709@oracle.com> <5303CDF4.1050702@oracle.com> <53044A9E.6050507@oracle.com> <6B94A897-2385-472A-B0CA-B0FC56D1D818@oracle.com> Message-ID: <53047BAB.5020904@oracle.com> On 19/02/2014 7:01 PM, Staffan Larsen wrote: > Thanks for the feedback! > > I chose to use yet another variable to avoid the spurious wakeups. I?ve > also increased the range of the synchronized statement to avoid the race. > > http://cr.openjdk.java.net/~sla/6952105/webrev.01/ Slightly simpler to just do: bkptSignal.wait(5000); if (!signalSent) continue; but what you have works. Also signalSent doesn't need to be volatile as it is only accessed within the sync blocks. Thanks, David > Thanks, > /Staffan > > On 19 feb 2014, at 07:09, David Holmes wrote: > >> On 19/02/2014 7:17 AM, shanliang wrote: >>> I am looking at the old file: >>> 143 while (bkptCount < maxBkpts) { >>> 144 prevBkptCount = bkptCount; >>> >>> suppose the following execution sequence: >>> 1) when Line 143 was called by Thread1, we had bkptCount == >>> maxBkpts - 1; >>> >>> 2) bkptCount++ was executed by thread2; >>> >>> 3) Line 144 was called by thread1, >>> >>> in this case it was sure that the line >>> 152 failure("failure: test hung"); >>> would be called. >> >> Yes I was looking at that race too. The comments suggest that we >> should never reach a point where we get to maxBkpts, so this failure >> would be very rare and would likely indicate a real problem. 
>> >>> It is good to add: >>> synchronized (bkptSignal) >>> in the fix, but we need to put Line 143 and 144 into synchronization too. >>> >>> To deal with a spurious wakeup, we might do like this: >>> long stopTime = System.currentTimeMillis() + 5000; >>> do { >>> try { >>> bkptSignal.wait(100); >>> } catch (InterruptedException e){} >>> } while(prevBkptCount == bkptCount && System.currentTimeMillis() >>> < stopTime); >> >> It is better to use System.nanoTime() rather than the non-monotonic >> currentTimeMillis(). And you really want a while loop rather than >> do-while so we don't always do that 100ms wait. >> >> David >> >>> Shanliang >>> >>> David Holmes wrote: >>>> On 18/02/2014 11:03 PM, Staffan Larsen wrote: >>>>> >>>>> On 18 feb 2014, at 13:09, David Holmes wrote: >>>>> >>>>>> Hi Staffan, >>>>>> >>>>>> If you get a spurious wakeup from wait(): >>>>>> >>>>>> 151 try { >>>>>> 152 synchronized (bkptSignal) { >>>>>> 153 bkptSignal.wait(5000); >>>>>> 154 } >>>>>> 155 } catch (InterruptedException ee) { >>>>>> 156 } >>>>>> 157 if (prevBkptCount == bkptCount) { >>>>>> 158 failure("failure: test hung"); >>>>>> >>>>>> you could report failure. But that is far less likely than the >>>>>> current problem using sleep. >>>>> >>>>> Right. Adding ?continue;? inside the catch(InterruptedException) >>>>> block should guard against that. >>>> >>>> No, a spurious wakeup is not an interrupt - the wait() will simply >>>> return. >>>> >>>> David >>>>> >>>>> /Staffan >>>>> >>>>>> >>>>>> David >>>>>> >>>>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote: >>>>>>> Still looking for Reviewer for this change. >>>>>>> >>>>>>> Thanks, >>>>>>> /Staffan >>>>>>> >>>>>>> On 11 feb 2014, at 15:12, Staffan Larsen >>>>>>> wrote: >>>>>>> >>>>>>>> Updated the test to use proper synchronization and notification >>>>>>>> between threads. Should be more stable and much faster. 
>>>>>>>> >>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >>>>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >>>>>>>> >>>>>>>> Thanks, >>>>>>>> /Staffan > From staffan.larsen at oracle.com Wed Feb 19 01:42:51 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 19 Feb 2014 10:42:51 +0100 Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often, com/sun/jdi/SuspendThreadTest.java In-Reply-To: <53047BAB.5020904@oracle.com> References: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com> <53034D8D.7050908@oracle.com> <5303B966.20709@oracle.com> <5303CDF4.1050702@oracle.com> <53044A9E.6050507@oracle.com> <6B94A897-2385-472A-B0CA-B0FC56D1D818@oracle.com> <53047BAB.5020904@oracle.com> Message-ID: <626ADA78-77A9-44C3-B2D5-C42798C4F46F@oracle.com> On 19 feb 2014, at 10:38, David Holmes wrote: > On 19/02/2014 7:01 PM, Staffan Larsen wrote: >> Thanks for the feedback! >> >> I chose to use yet another variable to avoid the spurious wakeups. I've >> also increased the range of the synchronized statement to avoid the race. >> >> http://cr.openjdk.java.net/~sla/6952105/webrev.01/ > > Slightly simpler to just do: > > bkptSignal.wait(5000); > if (!signalSent) > continue; > > but what you have works. > > Also signalSent doesn't need to be volatile as it is only accessed within the sync blocks. True. And true for bkptCount as well now, except for one usage in a println. I'll remove the volatile on signalSent, but keep it on bkptCount.
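The pattern this thread converges on (a flag checked in a loop under the monitor, plus a monotonic deadline) can be sketched roughly as follows. This is only an illustration and not the code in the webrev: the class BkptWaiter and its methods breakpointHit(), awaitBreakpoint() and count() are hypothetical names, and only bkptSignal, signalSent and bkptCount are taken from the messages above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of SuspendThreadTest.java.
final class BkptWaiter {
    private final Object bkptSignal = new Object();
    // Both fields are only touched inside synchronized (bkptSignal),
    // so neither needs to be volatile in this sketch.
    private boolean signalSent = false;
    private int bkptCount = 0;

    /** Called by the event-handling thread when a breakpoint is hit. */
    void breakpointHit() {
        synchronized (bkptSignal) {
            bkptCount++;
            signalSent = true;
            bkptSignal.notifyAll();
        }
    }

    /** Returns true if a signal arrived before the timeout, false otherwise. */
    boolean awaitBreakpoint(long timeoutMillis) {
        // nanoTime() is monotonic, unlike currentTimeMillis().
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (bkptSignal) {
            // A while loop: a spurious wakeup simply re-checks the flag,
            // and no waiting happens at all if the signal was already sent.
            while (!signalSent) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // genuine timeout
                }
                try {
                    // wait(0) would block forever, so round up to at least 1 ms.
                    bkptSignal.wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            signalSent = false; // consume the signal for the next round
            return true;
        }
    }

    int count() {
        synchronized (bkptSignal) {
            return bkptCount;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BkptWaiter waiter = new BkptWaiter();
        Thread handler = new Thread(waiter::breakpointHit);
        handler.start();
        System.out.println("signalled: " + waiter.awaitBreakpoint(5000));
        handler.join();
        System.out.println("second wait: " + waiter.awaitBreakpoint(50));
        System.out.println("count: " + waiter.count());
    }
}
```

Because the flag is set and read under the same monitor, a notification sent before the waiter reaches wait() is not lost, which is the race described earlier for lines 143 and 144.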
Thanks, /Staffan > > Thanks, > David > >> Thanks, >> /Staffan >> >> On 19 feb 2014, at 07:09, David Holmes wrote: >> >>> On 19/02/2014 7:17 AM, shanliang wrote: >>>> I am looking at the old file: >>>> 143 while (bkptCount < maxBkpts) { >>>> 144 prevBkptCount = bkptCount; >>>> >>>> suppose the following execution sequence: >>>> 1) when Line 143 was called by Thread1, we had bkptCount == >>>> maxBkpts - 1; >>>> >>>> 2) bkptCount++ was executed by thread2; >>>> >>>> 3) Line 144 was called by thread1, >>>> >>>> in this case it was sure that the line >>>> 152 failure("failure: test hung"); >>>> would be called. >>> >>> Yes I was looking at that race too. The comments suggest that we >>> should never reach a point where we get to maxBkpts, so this failure >>> would be very rare and would likely indicate a real problem. >>> >>>> It is good to add: >>>> synchronized (bkptSignal) >>>> in the fix, but we need to put Line 143 and 144 into synchronization too. >>>> >>>> To deal with a spurious wakeup, we might do like this: >>>> long stopTime = System.currentTimeMillis() + 5000; >>>> do { >>>> try { >>>> bkptSignal.wait(100); >>>> } catch (InterruptedException e){} >>>> } while(prevBkptCount == bkptCount && System.currentTimeMillis() >>>> < stopTime); >>> >>> It is better to use System.nanoTime() rather than the non-monotonic >>> currentTimeMillis(). And you really want a while loop rather than >>> do-while so we don't always do that 100ms wait. 
>>> >>> David >>> >>>> Shanliang >>>> >>>> David Holmes wrote: >>>>> On 18/02/2014 11:03 PM, Staffan Larsen wrote: >>>>>> >>>>>> On 18 feb 2014, at 13:09, David Holmes wrote: >>>>>> >>>>>>> Hi Staffan, >>>>>>> >>>>>>> If you get a spurious wakeup from wait(): >>>>>>> >>>>>>> 151 try { >>>>>>> 152 synchronized (bkptSignal) { >>>>>>> 153 bkptSignal.wait(5000); >>>>>>> 154 } >>>>>>> 155 } catch (InterruptedException ee) { >>>>>>> 156 } >>>>>>> 157 if (prevBkptCount == bkptCount) { >>>>>>> 158 failure("failure: test hung"); >>>>>>> >>>>>>> you could report failure. But that is far less likely than the >>>>>>> current problem using sleep. >>>>>> >>>>>> Right. Adding ?continue;? inside the catch(InterruptedException) >>>>>> block should guard against that. >>>>> >>>>> No, a spurious wakeup is not an interrupt - the wait() will simply >>>>> return. >>>>> >>>>> David >>>>>> >>>>>> /Staffan >>>>>> >>>>>>> >>>>>>> David >>>>>>> >>>>>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote: >>>>>>>> Still looking for Reviewer for this change. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> /Staffan >>>>>>>> >>>>>>>> On 11 feb 2014, at 15:12, Staffan Larsen >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Updated the test to use proper synchronization and notification >>>>>>>>> between threads. Should be more stable and much faster. 
>>>>>>>>> >>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105 >>>>>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/ >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> /Staffan >> From taras.ledkov at oracle.com Wed Feb 19 03:05:45 2014 From: taras.ledkov at oracle.com (taras ledkov) Date: Wed, 19 Feb 2014 15:05:45 +0400 Subject: Review request for 7195249: Some jtreg tests use hard coded ports In-Reply-To: <52F23A97.7020807@oracle.com> References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> <52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> <52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com> Message-ID: <53049009.9020204@oracle.com> Hi, Imports are fixed: http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ On 05.02.2014 17:20, Jaroslav Bachorik wrote: > Hi Taras, > > thanks for taking care of this. > > The changes look fine to me. > > One minor nit is unused imports of the library classes in > "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". > It does not use any of those classes as its base class > "AbstractFilePermissionTest" does all the heavy lifting. > > Cheers, > > -JB- > > On 5.2.2014 13:42, taras ledkov wrote: >> Hi, >> >> So please take a look at the review against JDK9. >> The reviewed patch had not been integrated into JDK8. >> >> Port to JDK9 is identical. The difference: the ProcessTools.java has >> been already patched by Jaroslav. >> >> Webrev for jdk part: >> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ >> >> Webrev for hs part: >> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >> >> >> On 21.01.2014 13:45, Jaroslav Bachorik wrote: >>> Hi Taras, >>> >>> On 21.1.2014 10:30, taras ledkov wrote: >>>> Hi Jaroslav, >>>> >>>> Could you please review the last changes? >>>> Are you OK? 
>>> >>> Yes, the change looks ok. But I think we will need to get back to this >>> problem eventually and implement a central port dispatcher if we want to >>> be 100% sure the port conflicts wouldn't occur. But your changes reduce >>> the chance significantly. >>> >>> Thanks for taking care of this. >>> >>> -JB- >>> >>>> >>>> On 20.01.2014 19:21, Staffan Larsen wrote: >>>>> Sorry for not replying earlier. Yes, I?m ok with these changes. >>>>> >>>>> Thanks, >>>>> /Staffan >>>>> >>>>> On 20 jan 2014, at 16:07, taras ledkov >>>>> wrote: >>>>> >>>>>> Hi Staffan, >>>>>> >>>>>> I fixed the tests according with your comments. >>>>>> Are you OK? >>>>>> >>>>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Please take a look at the new review. >>>>>>> >>>>>>> Webrev for jdk part: >>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>>>> >>>>>>> Webrev for hs part: >>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>>>> >>>>>>> My answers are inline: >>>>>>> >>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>>>> Hi Taras, >>>>>>>> >>>>>>>> Thanks for doing this clean up and conversion of tests into Java. >>>>>>>> Here?s a couple of comments: >>>>>>>> >>>>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>>>> This test could be simplified by not specifying an address at all. >>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp, >>>>>>>> there >>>>>>>> is no reason to specify an address. If address is unspecified (and >>>>>>>> server=y), the connector will pick an address and print it to the >>>>>>>> command line. Thus the only change that needs to be done is to >>>>>>>> remove >>>>>>>> ",address=8888? from the @run command. >>>>>>> fixed >>>>>>> >>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>>>> These tests do not compile cleanly with an empty JTwork >>>>>>>> directory. 
It >>>>>>>> seems that having one @build for each class does not work well - >>>>>>>> when >>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving >>>>>>>> all >>>>>>>> classes to one @build statement solved this problem for me. >>>>>>> fixed >>>>>>> >>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>>>> 187 Future stdoutTask = stdout.process(); >>>>>>>> 188 Future stderrTask = stderr.process(); >>>>>>>> The stdoutTask and stderrTask variables are unused. >>>>>>> fixed >>>>>>> >>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>>>> At first I thought something was wrong with this file - the diff is >>>>>>>> very weird. Then I realized you renamed an old file and created a >>>>>>>> new >>>>>>>> file using the old name. >>>>>>> You are right. I did it to keep the test name. >>>>>>> >>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks like >>>>>>>> you >>>>>>>> delete the files at the beginning of the test in any case. >>>>>>> I think yes. In the first place, this functionality was in the old >>>>>>> code. >>>>>>> In the second place, a file without write permission may be a >>>>>>> problem >>>>>>> for a further cleanup (not by the test, for example for the tests >>>>>>> launcher scripts etc.) >>>>>>> >>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest" >>>>>>>> confusing. They are both Paths. One is used directly by the >>>>>>>> sub-classes, the other has a getter method. >>>>>>> fixed >>>>>>> >>>>>>>> - Lines 57-58: Don't swallow exceptions, add an >>>>>>>> ex.printStackTrace(). >>>>>>>> (Same thing for all other places where you call Integer.parseInt()) >>>>>>> fixed >>>>>>> >>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>>>> This file is never used as far as I can see.
>>>>>>> It is used by PasswordFilePermissionTest & >>>>>>> SSLConfigFilePermissionTest >>>>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>>>> AbstractFilePermissionTest : 162). >>>>>>> >>>>>>>> Thanks, >>>>>>>> /Staffan >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Please take a look at the review with fixed issues about trying to >>>>>>>>> launch test that needs free port several times. >>>>>>>>> >>>>>>>>> Webrev for jdk part: >>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/ >>>>>>>>> >>>>>>>>> Webrev for hs part: >>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/ >>>>>>>>> >>>>>>>>> Pay your attention to new method ProcessTools.startProcess(String, >>>>>>>>> ProcessBuilder, Consumer) that is used to analyze all >>>>>>>>> output >>>>>>>>> of a sub-process. It has common part with >>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>> Predicate, >>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment. >>>>>>>>> >>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>> Predicate, long, TimeUnit) may be changed by adding >>>>>>>>> LinePump >>>>>>>>> to stderr if there is not serious reason for restricting the >>>>>>>>> warm-up >>>>>>>>> analysis to stdout stream. >>>>>>>>> >>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I've consulted with Serviceability engineers (add them to CC >>>>>>>>>> list) and >>>>>>>>>> they would like to see tests to solve these problem so far: >>>>>>>>>> >>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Katja >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote: >>>>>>>>>>> Guys. >>>>>>>>>>> >>>>>>>>>>> Let me try to sum up what was said before and maybe suggest a >>>>>>>>>>> compromise.
>>>>>>>>>>> >>>>>>>>>>> 1. There is a desire to have a support port allocation on the >>>>>>>>>>> level of >>>>>>>>>>> a JTReg suite execution. Taras created a bug for that >>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it >>>>>>>>>>> is a >>>>>>>>>>> test harness API or a library API does not really matter from >>>>>>>>>>> usage >>>>>>>>>>> point of view. >>>>>>>>>>> >>>>>>>>>>> 2. There is no way to make the tests absolutely stable, whatever >>>>>>>>>>> port >>>>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>>>> perform >>>>>>>>>>> the test logic with different ports until the test succeeds. >>>>>>>>>>> >>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of course, >>>>>>>>>>> but >>>>>>>>>>> better be used in conjunction with a meaningful port selection >>>>>>>>>>> algorithm. >>>>>>>>>>> >>>>>>>>>>> At the same time, copying a loop-until-success login from one >>>>>>>>>>> test to >>>>>>>>>>> another may be not the best solution. Library could help with >>>>>>>>>>> that I >>>>>>>>>>> believe. There only need to be an API method which takes >>>>>>>>>>> behavior as a >>>>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>>>> public runOnAFreePort(Function) >>>>>>>>>>> or similar. There could be arguments of how/whether to implement >>>>>>>>>>> it, >>>>>>>>>>> the solution would not work for shell tests, etc, but still ... >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>>>> >>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching >>>>>>>>>>> agreement in >>>>>>>>>>> the library, etc. >>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>> 3. Wait for the library to be ready and only then integrate the >>>>>>>>>>> changes. >>>>>>>>>>> >>>>>>>>>>> Please let us know which one is closer to your heart. 
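[Editorial sketch] The "run it until it succeeds" library method suggested above (runOnAFreePort) could look roughly like the following. This is only a minimal sketch under stated assumptions: the class FreePortRetry, the interface PortAction, the helper probePort and the maxAttempts parameter are all hypothetical names, not part of jdk.testlibrary. It also assumes the test logic reports a port conflict as java.net.BindException; a real test may have to recognize other signals, such as the java.rmi.server.ExportException mentioned elsewhere in this thread.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; not an existing test library API.
final class FreePortRetry {

    /** Test logic that needs a port. Assumed to throw BindException on a port conflict. */
    interface PortAction<T> {
        T run(int port) throws IOException;
    }

    /**
     * Asks the OS for an ephemeral port by binding to port 0.
     * The result is only a hint: the port may be taken again once the probe socket is closed.
     */
    static int probePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    /** Runs the action on freshly probed ports until it stops failing with BindException. */
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        BindException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = probePort();
            try {
                return action.run(port);
            } catch (BindException e) {
                // The port was taken between the probe and the actual use; try another one.
                last = e;
            }
        }
        // Every attempt hit a conflict; rethrow the most recent one.
        throw last;
    }
}
```

A caller would pass the port-dependent part of the test as the action, for example FreePortRetry.runOnAFreePort(port -> startServerOn(port), 5), where startServerOn is again a placeholder for the test's own logic. Bounding the number of attempts keeps a persistently failing test from looping forever.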
>>>>>>>>>>> >>>>>>>>>>> I personally prefer #1 for the reason that the changes already >>>>>>>>>>> supposed to make the tests more stable and also there are many >>>>>>>>>>> more >>>>>>>>>>> tests tests which use ports, so the scope of the problem is >>>>>>>>>>> bigger >>>>>>>>>>> than these. >>>>>>>>>>> >>>>>>>>>>> Shura >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Taras, >>>>>>>>>>>> >>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort() >>>>>>>>>>>> does not >>>>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>>>> process. >>>>>>>>>>>> Unfortunately I don't think the library can do more. However, >>>>>>>>>>>> there is a >>>>>>>>>>>> solution. >>>>>>>>>>>> >>>>>>>>>>>> Please, look at the *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to >>>>>>>>>>>> start a >>>>>>>>>>>> process with a free port and then check if >>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has been >>>>>>>>>>>> thrown. >>>>>>>>>>>> If yes, you have to retry. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Katja >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>> >>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is >>>>>>>>>>>>> the >>>>>>>>>>>>> library responsibility to implements it, would not you agree? >>>>>>>>>>>>> >>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is. >>>>>>>>>>>>> >>>>>>>>>>>>> Should we decide to replace logic of the port selection, we >>>>>>>>>>>>> could do >>>>>>>>>>>>> it later in the library. >>>>>>>>>>>>> >>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>>>> Roger, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the >>>>>>>>>>>>>>> port is >>>>>>>>>>>>>>> free. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>>>> accessible >>>>>>>>>>>>>>> for >>>>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>>>> discussions >>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and SO_LINGER >>>>>>>>>>>>>>> for >>>>>>>>>>>>>>> BSD. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So from stability point of view it's better to just return >>>>>>>>>>>>>>> random >>>>>>>>>>>>>>> number >>>>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When >>>>>>>>>>>>>> there are >>>>>>>>>>>>>> more >>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port, >>>>>>>>>>>>>> nothing >>>>>>>>>>>>>> prevents the random function to return the same port to >>>>>>>>>>>>>> all of >>>>>>>>>>>>>> them. >>>>>>>>>>>>>> Also, two subsequent requests can return the same port and >>>>>>>>>>>>>> cause >>>>>>>>>>>>>> problems with timing when a port used by a previous test is >>>>>>>>>>>>>> not >>>>>>>>>>>>>> fully >>>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry >>>>>>>>>>>>>> pointed out >>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it >>>>>>>>>>>>>> later >>>>>>>>>>>>>> there >>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated will >>>>>>>>>>>>>> remain >>>>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The only fail proof solution would be a port allocating >>>>>>>>>>>>>> service >>>>>>>>>>>>>> provided >>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease >>>>>>>>>>>>>> the >>>>>>>>>>>>>> chance >>>>>>>>>>>>>> of intermittent failures due to a port being in use. 
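[Editorial sketch] The point about keeping hold of the allocated socket can be illustrated with a small, hedged example. The class HeldPortDemo and the method canBind are made-up names for illustration. The sketch assumes default ServerSocket options and a wildcard bind address: while the first socket stays open, a second bind to the same port is expected to fail with BindException, and nothing is promised once the first socket is closed.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Illustrative only: shows why a port is reserved only while its socket stays open.
final class HeldPortDemo {

    /** Returns true if a new listening socket could be bound to the port right now. */
    static boolean canBind(int port) throws IOException {
        try (ServerSocket attempt = new ServerSocket(port)) {
            return true;
        } catch (BindException e) {
            // Some socket is already listening on this port.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket held = new ServerSocket(0)) {
            int port = held.getLocalPort();
            // The port is reserved for as long as 'held' is open.
            System.out.println("while held, canBind(" + port + ") = " + canBind(port));
        }
        // After close() any process may grab the port, so a later check proves nothing.
    }
}
```

This is why handing the open socket itself to the code under test is the only race-free option, and why a bare port number has to be paired with conflict handling in the caller.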
>>>>>>>>>>>>>> >>>>>>>>>>>>>> -JB- >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>>>> IOException { >>>>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>>>> 143 >>>>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>>>> 146 >>>>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>>>> 148 try { >>>>>>>>>>>>>>> 149 serverSocket = new ServerSocket(0); >>>>>>>>>>>>>>> 150 port = serverSocket.getLocalPort(); >>>>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>>>> 153 } >>>>>>>>>>>>>>> 154 } >>>>>>>>>>>>>>> 155 >>>>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>>>> 157 } >>>>>>>>>>>>>>> 158 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> fyi, The jdk.testlibrary.Utils.getFreePort() method will >>>>>>>>>>>>>>>> Open an >>>>>>>>>>>>>>>> free >>>>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>>>> the port number. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have the >>>>>>>>>>>>>>>> system >>>>>>>>>>>>>>>> assign >>>>>>>>>>>>>>>> the port #. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Roger >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>>>> 2. Open socket >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. 
you have to pass >>>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it >>>>>>>>>>>>>>>>> has no >>>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>>>> as soon as you close socket someone can take the port. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So just choose a random number within the range above and >>>>>>>>>>>>>>>>> let >>>>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>>>> code opening socket to handle port conflict. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> There are two webrevs: >>>>>>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Please take a look at some notes: >>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & Jaroslav >>>>>>>>>>>>>>>>>> Bachorik >>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & >>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>>>> looked >>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for >>>>>>>>>>>>>>>>>> them: >>>>>>>>>>>>>>>>>> AbstractFilePermissionTest >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've renamed to >>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. 
The java code to replace old >>>>>>>>>>>>>>>>>> shell >>>>>>>>>>>>>>>>>> script >>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called RmiRegistrySslTest.java, >>>>>>>>>>>>>>>>>> hence the >>>>>>>>>>>>>>>>>> huge >>>>>>>>>>>>>>>>>> diff. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines similar >>>>>>>>>>>>>>>>>> to the >>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided >>>>>>>>>>>>>>>>>> to not >>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please >>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>> know if >>>>>>>>>>>>>>>>>> this is somehow not acceptable >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is added to >>>>>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>>>>> repository is taken from this patch: >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - These tests will need additional changes when test >>>>>>>>>>>>>>>>>> library >>>>>>>>>>>>>>>>>> process >>>>>>>>>>>>>>>>>> tools will support command line options inheritance >>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> With best regards, >>>>>>>>> Taras Ledkov >>>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>>> skype: taras_ledkov 
>>>>>>>>> Phone: 7(812)3346-157 >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> With best regards, >>>>>> Taras Ledkov >>>>>> Mail-To: taras.ledkov at oracle.com >>>>>> skype: taras_ledkov >>>>>> Phone: 7(812)3346-157 >>>>> >>>> >>> >> > -- With best regards, Taras Ledkov Mail-To: taras.ledkov at oracle.com skype: taras_ledkov Phone: 7(812)3346-157 From jaroslav.bachorik at oracle.com Wed Feb 19 07:59:08 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 19 Feb 2014 16:59:08 +0100 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <53033378.2060300@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> Message-ID: <5304D4CC.1020007@oracle.com> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: > On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>> Jaroslav, >>>>> >>>>> It looks good in general modulo indent comments from Dan. >>>>> >>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed >>>>> or right thing to do in the JvmtiExport::clear_detected_exception(). >>>>> It seems, both clear_exception_detected() and >>>>> set_exception_detected() are always >>>>> called on current thread and so, it has to be safe to do without >>>>> acquiring any locks. >>>> >>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>> can also be queried/modified by other threads so grabbing the >>>> associated lock is a good idea. >>> >>> The lock synchronization is cooperative. >>> It does not help much if the lock is not acquired in other places. 
>>> I can be wrong, but I've not found yet any place in the code where the >>> clear_exception_detected() and set_exception_detected() are called >>> under protection of the JvmtiThreadState_lock. >> >> I copied the locking over from >> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also >> supposed to work only with the current thread but acquires the lock >> nonetheless. But if you are sure that the lock is not required I have >> no objections removing it. > > I'm suggesting to remove it, as it is not used in other places in the code. > It is going to be confusing if it is used in one place and missed in > others. I've removed the lock and applied the same cleanup logic to other places where exceptions are rewrapped. Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 JPRT run: http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ Aurora Adhoc: http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full (still running at the moment; no failures so far) > > Thanks, > Serguei > >> >> -JB- >> >>> >>> Thanks, >>> Serguei >>> >>>> >>>> Dan >>>> >>>> >>>>> >>>>> And I'm repeating my question about pre-integration testing (Dan is >>>>> asking about the same). >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>> This is a round-0 review request. >>>>>> >>>>>> The reflection code intercepting the exceptions thrown in the >>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>> case, propagates to JDI). >>>>>> >>>>>> The reflection code lacks the traditional error handler - therefore, >>>>>> upon throwing the NumberFormatException, the stack is searched for >>>>>> appropriate handlers and none are found. This leaves the >>>>>> "exception_detected" flag set to true while normally it would be >>>>>> reset to false once the exception is handled. 
The reflection code >>>>>> then goes on and wraps the NumberFormatException into >>>>>> InvocationTargetException and throws it. But, alas, the >>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>> exception event will be sent out. >>>>>> >>>>>> The proposed solution is to call >>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>> appropriate places in the reflection code to reset the >>>>>> "exception_detected" flag and enable the InvocationTargetException >>>>>> be properly reported over JVMTI. >>>>>> >>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>> >>>>>> Thanks! >>>>>> >>>>>> -JB- >>>>> >>>> >>> >>> >> > From martinrb at google.com Wed Feb 19 08:56:03 2014 From: martinrb at google.com (Martin Buchholz) Date: Wed, 19 Feb 2014 08:56:03 -0800 Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object In-Reply-To: <53045D76.5080907@oracle.com> References: <530388CC.9080807@oracle.com> <53045D76.5080907@oracle.com> Message-ID: The jsr166 tck tests make systematic use of Thread.yield, e.g. /** * Spin-waits up to the specified number of milliseconds for the given * thread to enter a wait state: BLOCKED, WAITING, or TIMED_WAITING. 
*/ void waitForThreadToEnterWaitState(Thread thread, long timeoutMillis) { long startTime = System.nanoTime(); for (;;) { Thread.State s = thread.getState(); if (s == Thread.State.BLOCKED || s == Thread.State.WAITING || s == Thread.State.TIMED_WAITING) return; else if (s == Thread.State.TERMINATED) fail("Unexpected thread termination"); else if (millisElapsedSince(startTime) > timeoutMillis) { threadAssertTrue(thread.isAlive()); return; } Thread.yield(); } } On Tue, Feb 18, 2014 at 11:29 PM, Jaroslav Bachorik < jaroslav.bachorik at oracle.com> wrote: > On 18.2.2014 18:06, Martin Buchholz wrote: > >> Not checking any details, but tests that want to wait for a particular >> thread state are a good reason to use >> >> volatile boolean flag; >> ... >> while (!flag) Thread.yield(); >> >> I prefer calling Thread.yield to sleeping in this special case, in part >> because I don't want to rely on the implementation of sleep, while yield >> is >> semantically a no-op. (Also sleeping 100ms is a long time for a computer) >> > > There were discussions for a similar fix regarding Thread.yield(). The > concern was that using Thread.yield() in a tight loop might very easily > lead to starvation on single core machines. Therefore Thread.sleep(10) is > used to be sure the flag setting thread has actually a chance to progress. > > -JB- > > > >> >> >> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik < >> jaroslav.bachorik at oracle.com> wrote: >> >> Please, review the following test change. >>> >>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168 >>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00 >>> >>> The test fails because of falsely evaluating the thread being parked as >>> actually waiting on a monitor. This is because there is no difference in >>> java thread state for those two situations. 
The test is using Phaser for >>> synchronization between the checked and checking thread to make sure an >>> appropriate code section is entered before performing asserts. Then it >>> checks the checked thread state and waits till it becomes WAITING. >>> Unfortunately, when Phaser needs to wait it parks the thread and sets the >>> thread state to WAITING. From now on the test is in a completely random >>> state and the result will largely depend on timing - thus failing >>> intermittently. >>> >>> The solution is to use an additional volatile variable to prevent falsely >>> indicating the park() induced WAITING state. >>> >>> Thanks, >>> >>> -JB- >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140219/d1b80e06/attachment.html From daniel.daugherty at oracle.com Wed Feb 19 09:01:39 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 19 Feb 2014 10:01:39 -0700 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5304D4CC.1020007@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com> Message-ID: <5304E373.1050003@oracle.com> > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 src/share/vm/prims/jvmtiExport.hpp No comments. src/share/vm/prims/jvmtiExport.cpp No comments. src/share/vm/oops/instanceKlass.cpp No comments. src/share/vm/prims/jvm.cpp No comments. src/share/vm/runtime/reflection.cpp lines 948, 1085: HotSpot indent is two spaces Thumbs up. 
Dan On 2/19/14 8:59 AM, Jaroslav Bachorik wrote: > On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: >> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>>> Jaroslav, >>>>>> >>>>>> It looks good in general modulo indent comments from Dan. >>>>>> >>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is >>>>>> needed >>>>>> or right thing to do in the JvmtiExport::clear_detected_exception(). >>>>>> It seems, both clear_exception_detected() and >>>>>> set_exception_detected() are always >>>>>> called on current thread and so, it has to be safe to do without >>>>>> acquiring any locks. >>>>> >>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>>> can also be queried/modified by other threads so grabbing the >>>>> associated lock is a good idea. >>>> >>>> The lock synchronization is cooperative. >>>> It does not help much if the lock is not acquired in other places. >>>> I can be wrong, but I've not found yet any place in the code where the >>>> clear_exception_detected() and set_exception_detected() are called >>>> under protection of the JvmtiThreadState_lock. >>> >>> I copied the locking over from >>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also >>> supposed to work only with the current thread but acquires the lock >>> nonetheless. But if you are sure that the lock is not required I have >>> no objections removing it. >> >> I'm suggesting to remove it, as it is not used in other places in the >> code. >> It is going to be confusing if it is used in one place and missed in >> others. > > I've removed the lock and applied the same cleanup logic to other > places where exceptions are rewrapped. 
> > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 > JPRT run: > http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ > Aurora Adhoc: > http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full > (still running at the moment; no failures so far) > >> >> Thanks, >> Serguei >> >>> >>> -JB- >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> And I'm repeating my question about pre-integration testing (Dan is >>>>>> asking about the same). >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>>> This is a round-0 review request. >>>>>>> >>>>>>> The reflection code intercepting the exceptions thrown in the >>>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>>> case, propagates to JDI). >>>>>>> >>>>>>> The reflection code lacks the traditional error handler - >>>>>>> therefore, >>>>>>> upon throwing the NumberFormatException, the stack is searched for >>>>>>> appropriate handlers and none are found. This leaves the >>>>>>> "exception_detected" flag set to true while normally it would be >>>>>>> reset to false once the exception is handled. The reflection code >>>>>>> then goes on and wraps the NumberFormatException into >>>>>>> InvocationTargetException and throws it. But, alas, the >>>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>>> exception event will be sent out. >>>>>>> >>>>>>> The proposed solution is to call >>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>>> appropriate places in the reflection code to reset the >>>>>>> "exception_detected" flag and enable the InvocationTargetException >>>>>>> be properly reported over JVMTI. >>>>>>> >>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>>> >>>>>>> Thanks! 
>>>>>>> >>>>>>> -JB- >>>>>> >>>>> >>>> >>>> >>> >> > > > From Alan.Bateman at oracle.com Wed Feb 19 09:07:04 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 19 Feb 2014 17:07:04 +0000 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <5303B85D.10505@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> <530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com> <53031AE4.2050700@oracle.com> <5303B85D.10505@oracle.com> Message-ID: <5304E4B8.6040908@oracle.com> On 18/02/2014 19:45, Mikael Vidstedt wrote: > > That makes sense, and in fact parse_manifest.c does not even appear to > allow for \t, so I'm more and more starting to think that a reasonable > implementation in this context would be: > > static int isNormalSpace(int c) { return c == ' '; } > > In which case it probably shouldn't even be a separate function to > start with. I would like to get a second opinion on the implications > of only checking for ' ' (0x20) though. > > If we want to allow both ' ' and \t we should probably call the > function isblankAscii. Thanks again for taking this. On \t then if it's nor handled by the parsing code then isNormalSpace should be fine. -Alan. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140219/aaa8c604/attachment.html From serguei.spitsyn at oracle.com Wed Feb 19 10:20:38 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Feb 2014 10:20:38 -0800 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5304D4CC.1020007@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com> Message-ID: <5304F5F6.5070500@oracle.com> On 2/19/14 7:59 AM, Jaroslav Bachorik wrote: > On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: >> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>>> Jaroslav, >>>>>> >>>>>> It looks good in general modulo indent comments from Dan. >>>>>> >>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is >>>>>> needed >>>>>> or right thing to do in the JvmtiExport::clear_detected_exception(). >>>>>> It seems, both clear_exception_detected() and >>>>>> set_exception_detected() are always >>>>>> called on current thread and so, it has to be safe to do without >>>>>> acquiring any locks. >>>>> >>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>>> can also be queried/modified by other threads so grabbing the >>>>> associated lock is a good idea. >>>> >>>> The lock synchronization is cooperative. >>>> It does not help much if the lock is not acquired in other places. >>>> I can be wrong, but I've not found yet any place in the code where the >>>> clear_exception_detected() and set_exception_detected() are called >>>> under protection of the JvmtiThreadState_lock. 
>>> >>> I copied the locking over from >>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also >>> supposed to work only with the current thread but acquires the lock >>> nonetheless. But if you are sure that the lock is not required I have >>> no objections removing it. >> >> I'm suggesting to remove it, as it is not used in other places in the >> code. >> It is going to be confusing if it is used in one place and missed in >> others. > > I've removed the lock and applied the same cleanup logic to other > places where exceptions are rewrapped. > > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 > JPRT run: > http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ > Aurora Adhoc: > http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full > (still running at the moment; no failures so far) > It looks good to me. Thank you for changes in the fix! Thanks, Serguei >> >> Thanks, >> Serguei >> >>> >>> -JB- >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> And I'm repeating my question about pre-integration testing (Dan is >>>>>> asking about the same). >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>>> This is a round-0 review request. >>>>>>> >>>>>>> The reflection code intercepting the exceptions thrown in the >>>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>>> case, propagates to JDI). >>>>>>> >>>>>>> The reflection code lacks the traditional error handler - >>>>>>> therefore, >>>>>>> upon throwing the NumberFormatException, the stack is searched for >>>>>>> appropriate handlers and none are found. This leaves the >>>>>>> "exception_detected" flag set to true while normally it would be >>>>>>> reset to false once the exception is handled. 
The reflection code >>>>>>> then goes on and wraps the NumberFormatException into >>>>>>> InvocationTargetException and throws it. But, alas, the >>>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>>> exception event will be sent out. >>>>>>> >>>>>>> The proposed solution is to call >>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>>> appropriate places in the reflection code to reset the >>>>>>> "exception_detected" flag and enable the InvocationTargetException >>>>>>> be properly reported over JVMTI. >>>>>>> >>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> -JB- >>>>>> >>>>> >>>> >>>> >>> >> > From mikael.vidstedt at oracle.com Wed Feb 19 10:22:06 2014 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 19 Feb 2014 10:22:06 -0800 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <5304E4B8.6040908@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> <530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com> <53031AE4.2050700@oracle.com> <5303B85D.10505@oracle.com> <5304E4B8.6040908@oracle.com> Message-ID: <5304F64E.4000206@oracle.com> On 2014-02-19 09:07, Alan Bateman wrote: > On 18/02/2014 19:45, Mikael Vidstedt wrote: >> >> That makes sense, and in fact parse_manifest.c does not even appear >> to allow for \t, so I'm more and more starting to think that a >> reasonable implementation in this context would be: >> >> static int isNormalSpace(int c) { return c == ' '; } >> >> In which case it probably shouldn't even be a separate function to >> start with. I would like to get a second opinion on the implications >> of only checking for ' ' (0x20) though. 
>> >> If we want to allow both ' ' and \t we should probably call the >> function isblankAscii. > Thanks again for taking this. As for \t: if it's not handled by the > parsing code then isNormalSpace should be fine. Since I'm not exactly an expert on the code in question I would certainly appreciate it if somebody could verify that for me. I'm looking at parse_nv_pair (lines 430-542) in: http://hg.openjdk.java.net/jdk9/dev/jdk/file/c766ec3e4877/src/share/bin/parse_manifest.c The documented grammar in the comment only mentions "SPACE" and the code below doesn't make any references to \t. As a matter of fact, it only checks for one single, mandatory SPACE after the colon (enforced at line 535-536) and doesn't care to remove any space characters at the end of the value. The while loop only deals with continuations. If additional spaces do exist they will as far as I can tell be part of the value. Are they trimmed later? I'm assuming it would be nice to have both parsers (parse_manifest & JarFacade) behave the same way? Here's what it would look like to only check for space, but still eat any additional spaces, which doesn't match what parse_manifest/parse_nv_pair does: http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/ Cheers, Mikael From coleen.phillimore at oracle.com Wed Feb 19 11:28:26 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Feb 2014 14:28:26 -0500 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> Message-ID: <530505DA.2070305@oracle.com> This change looks good. 
Do you need a sponsor? thanks, Coleen On 2/13/14 11:57 AM, Siebenborn, Axel wrote: > > Hi all, > > could I have a review for the following change? > > It adds code to iterate over the slots of the expression stack. > > Found object references are used as heap roots, the same way as object > references in locals slots. > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8034867 > > Webrev: > > http://sapjvm.com/as/webrevs/8034867 > > Thanks, > > Axel > From serguei.spitsyn at oracle.com Wed Feb 19 21:30:20 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Feb 2014 21:30:20 -0800 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: <530505DA.2070305@oracle.com> References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> Message-ID: <530592EC.20207@oracle.com> It looks good to me too. But does Axel need to sign an OCA? This is a link to follow: http://openjdk.java.net/contribute/ Thanks, Serguei On 2/19/14 11:28 AM, Coleen Phillimore wrote: > > This change looks good. Do you need a sponsor? > thanks, > Coleen > > On 2/13/14 11:57 AM, Siebenborn, Axel wrote: >> >> Hi all, >> >> could I have a review for the following change? >> >> It adds code to iterate over the slots of the expression stack. >> >> Found object references are used as heap roots, the same way as >> object references in locals slots. >> >> Bug: >> >> https://bugs.openjdk.java.net/browse/JDK-8034867 >> >> Webrev: >> >> http://sapjvm.com/as/webrevs/8034867 >> >> Thanks, >> >> Axel >> > 
From staffan.larsen at oracle.com Thu Feb 20 00:05:21 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 20 Feb 2014 09:05:21 +0100 Subject: Codereview request: 8035195 demo/jvmti/mtrace/TraceJFrame.java can't connect to X11 In-Reply-To: <530468A9.7080509@oracle.com> References: <530468A9.7080509@oracle.com> Message-ID: <2B50B541-4A32-4D76-A514-95B34E29C73D@oracle.com> Looks like an OK change although it does not fix the issue with test failures that we see. For that, someone in SQE needs to look at the environment and make sure the tests are run correctly. I suggest you leave the current bug open and create a new one for this test output change. /Staffan On 19 feb 2014, at 09:17, shanliang wrote: > Hi, > > The failure was from a bad DISPLAY setting, this must be not intended, so we re-throw the exception and tell to make sure that the DISPLAY is correct. > > The fix is only to add a catch of InternalError at the beginning, the rest code is not modified but re-formated to remove some spaces. > > webrev: > http://cr.openjdk.java.net/~sjiang/JDK-8035195/00 / > > bug: > https://bugs.openjdk.java.net/browse/JDK-8035195 > > > Thanks, > Shanliang From volker.simonis at gmail.com Thu Feb 20 00:21:25 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 20 Feb 2014 09:21:25 +0100 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: <530592EC.20207@oracle.com> References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> Message-ID: Axel is covered by the SAP OCA. Thanks, Volker On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com wrote: > It looks good to me too. > But does Axel need to sign an OCA ? 
> > This is a link to follow: > http://openjdk.java.net/contribute/ > > > Thanks, > Serguei > > > > On 2/19/14 11:28 AM, Coleen Phillimore wrote: > > > This change looks good. Do you need a sponsor? > thanks, > Coleen > > On 2/13/14 11:57 AM, Siebenborn, Axel wrote: > > Hi all, > > could I have a review for the following change? > > > > It adds code to iterate over the slots of the expression stack. > > Found object references are used as heap roots, the same way as object > references in locals slots. > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8034867 > > > > Webrev: > > http://sapjvm.com/as/webrevs/8034867 > > > > Thanks, > > Axel > > > From serguei.spitsyn at oracle.com Thu Feb 20 00:44:13 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 20 Feb 2014 00:44:13 -0800 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> Message-ID: <5305C05D.6070708@oracle.com> Ok Thanks! Serguei On 2/20/14 12:21 AM, Volker Simonis wrote: > Axel is covered by the SAP OCA. > > Thanks, > Volker > > > On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com > wrote: >> It looks good to me too. >> But does Axel need to sign an OCA ? >> >> This is a link to follow: >> http://openjdk.java.net/contribute/ >> >> >> Thanks, >> Serguei >> >> >> >> On 2/19/14 11:28 AM, Coleen Phillimore wrote: >> >> >> This change looks good. Do you need a sponsor? >> thanks, >> Coleen >> >> On 2/13/14 11:57 AM, Siebenborn, Axel wrote: >> >> Hi all, >> >> could I have a review for the following change? >> >> >> >> It adds code to iterate over the slots of the expression stack. >> >> Found object references are used as heap roots, the same way as object >> references in locals slots. 
>> >> >> >> Bug: >> >> https://bugs.openjdk.java.net/browse/JDK-8034867 >> >> >> >> Webrev: >> >> http://sapjvm.com/as/webrevs/8034867 >> >> >> >> Thanks, >> >> Axel >> >> >> From jaroslav.bachorik at oracle.com Thu Feb 20 00:47:10 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 20 Feb 2014 09:47:10 +0100 Subject: Codereview request: 8035195 demo/jvmti/mtrace/TraceJFrame.java can't connect to X11 In-Reply-To: <2B50B541-4A32-4D76-A514-95B34E29C73D@oracle.com> References: <530468A9.7080509@oracle.com> <2B50B541-4A32-4D76-A514-95B34E29C73D@oracle.com> Message-ID: <5305C10E.4060704@oracle.com> On 20.2.2014 09:05, Staffan Larsen wrote: > Looks like an OK change although it does not fix the issue with test failures that we see. For that, someone in SQE needs to look at the environment and make sure the tests is run correctly. > > I suggest you leave the current bug open and create a new one for this test output change. +1 -JB- > > /Staffan > > On 19 feb 2014, at 09:17, shanliang wrote: > >> Hi, >> >> The failure was from a bad DISPLAY setting, this must be not intended, so we re-throw the exception and tell to make sure that the DISPLAY is correct. >> >> The fix is only to add a catch of InternalError at the beginning, the rest code is not modified but re-formated to remove some spaces. 
>> >> webrev: >> http://cr.openjdk.java.net/~sjiang/JDK-8035195/00 / >> >> bug: >> https://bugs.openjdk.java.net/browse/JDK-8035195 >> >> >> Thanks, >> Shanliang > From jaroslav.bachorik at oracle.com Thu Feb 20 00:49:06 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 20 Feb 2014 09:49:06 +0100 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5304E373.1050003@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com> <5304E373.1050003@oracle.com> Message-ID: <5305C182.6020802@oracle.com> On 19.2.2014 18:01, Daniel D. Daugherty wrote: > > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 > > src/share/vm/prims/jvmtiExport.hpp > No comments. > > src/share/vm/prims/jvmtiExport.cpp > No comments. > > src/share/vm/oops/instanceKlass.cpp > No comments. > > src/share/vm/prims/jvm.cpp > No comments. > > src/share/vm/runtime/reflection.cpp > lines 948, 1085: HotSpot indent is two spaces > > Thumbs up. > Thanks Dan! Could I have a second HS reviewer to take a look at this, please? -JB- > Dan > > > On 2/19/14 8:59 AM, Jaroslav Bachorik wrote: >> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: >>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>>>> Jaroslav, >>>>>>> >>>>>>> It looks good in general modulo indent comments from Dan. >>>>>>> >>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is >>>>>>> needed >>>>>>> or right thing to do in the JvmtiExport::clear_detected_exception(). 
>>>>>>> It seems, both clear_exception_detected() and >>>>>>> set_exception_detected() are always >>>>>>> called on current thread and so, it has to be safe to do without >>>>>>> acquiring any locks. >>>>>> >>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>>>> can also be queried/modified by other threads so grabbing the >>>>>> associated lock is a good idea. >>>>> >>>>> The lock synchronization is cooperative. >>>>> It does not help much if the lock is not acquired in other places. >>>>> I can be wrong, but I've not found yet any place in the code where the >>>>> clear_exception_detected() and set_exception_detected() are called >>>>> under protection of the JvmtiThreadState_lock. >>>> >>>> I copied the locking over from >>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also >>>> supposed to work only with the current thread but acquires the lock >>>> nonetheless. But if you are sure that the lock is not required I have >>>> no objections removing it. >>> >>> I'm suggesting to remove it, as it is not used in other places in the >>> code. >>> It is going to be confusing if it is used in one place and missed in >>> others. >> >> I've removed the lock and applied the same cleanup logic to other >> places where exceptions are rewrapped. >> >> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 >> JPRT run: >> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ >> >> Aurora Adhoc: >> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full >> (still running at the moment; no failures so far) >> >>> >>> Thanks, >>> Serguei >>> >>>> >>>> -JB- >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> And I'm repeating my question about pre-integration testing (Dan is >>>>>>> asking about the same). 
>>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>>>> This is a round-0 review request. >>>>>>>> >>>>>>>> The reflection code intercepting the exceptions thrown in the >>>>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>>>> case, propagates to JDI). >>>>>>>> >>>>>>>> The reflection code lacks the traditional error handler - >>>>>>>> therefore, >>>>>>>> upon throwing the NumberFormatException, the stack is searched for >>>>>>>> appropriate handlers and none are found. This leaves the >>>>>>>> "exception_detected" flag set to true while normally it would be >>>>>>>> reset to false once the exception is handled. The reflection code >>>>>>>> then goes on and wraps the NumberFormatException into >>>>>>>> InvocationTargetException and throws it. But, alas, the >>>>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>>>> exception event will be sent out. >>>>>>>> >>>>>>>> The proposed solution is to call >>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>>>> appropriate places in the reflection code to reset the >>>>>>>> "exception_detected" flag and enable the InvocationTargetException >>>>>>>> be properly reported over JVMTI. >>>>>>>> >>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>>>> >>>>>>>> Thanks! 
>>>>>>>> >>>>>>>> -JB- >>>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> >> >> > From axel.siebenborn at sap.com Thu Feb 20 01:05:32 2014 From: axel.siebenborn at sap.com (Siebenborn, Axel) Date: Thu, 20 Feb 2014 09:05:32 +0000 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> Message-ID: <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap> Yes, I need a sponsor for this change. Thanks for the reviews, Axel > -----Original Message----- > From: Volker Simonis [mailto:volker.simonis at gmail.com] > Sent: Donnerstag, 20. Februar 2014 09:21 > To: serguei.spitsyn at oracle.com > Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net; > serviceability-dev at openjdk.java.net; Siebenborn, Axel > Subject: Re: RFR(XS): JDK-8034867 - Object references from expression stack > are not visible as roots via jvmti FollowReferences and HeapDump > > Axel is covered by the SAP OCA. > > Thanks, > Volker > > > On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com > wrote: > > It looks good to me too. > > But does Axel need to sign an OCA ? > > > > This is a link to follow: > > http://openjdk.java.net/contribute/ > > > > > > Thanks, > > Serguei > > > > > > > > On 2/19/14 11:28 AM, Coleen Phillimore wrote: > > > > > > This change looks good. Do you need a sponsor? > > thanks, > > Coleen > > > > On 2/13/14 11:57 AM, Siebenborn, Axel wrote: > > > > Hi all, > > > > could I have a review for the following change? > > > > > > > > It adds code to iterate over the slots of the expression stack. > > > > Found object references are used as heap roots, the same way as object > > references in locals slots. 
> > > > > > > > Bug: > > > > https://bugs.openjdk.java.net/browse/JDK-8034867 > > > > > > > > Webrev: > > > > http://sapjvm.com/as/webrevs/8034867 > > > > > > > > Thanks, > > > > Axel > > > > > > From serguei.spitsyn at oracle.com Thu Feb 20 01:28:36 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 20 Feb 2014 01:28:36 -0800 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5305C182.6020802@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com> <5304E373.1050003@oracle.com> <5305C182.6020802@oracle.com> Message-ID: <5305CAC4.2050308@oracle.com> On 2/20/14 12:49 AM, Jaroslav Bachorik wrote: > On 19.2.2014 18:01, Daniel D. Daugherty wrote: >> > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 >> >> src/share/vm/prims/jvmtiExport.hpp >> No comments. >> >> src/share/vm/prims/jvmtiExport.cpp >> No comments. >> >> src/share/vm/oops/instanceKlass.cpp >> No comments. >> >> src/share/vm/prims/jvm.cpp >> No comments. >> >> src/share/vm/runtime/reflection.cpp >> lines 948, 1085: HotSpot indent is two spaces >> >> Thumbs up. >> > > Thanks Dan! > > Could I have a second HS reviewer to take a look at this, please? You need just one review with the reviewer status and at least two reviews total. You already got reviews from Dan and me which is enough. But it still might be worth to wait for David to review this version as he already reviewed prev. one. Thanks, Serguei > > -JB- > >> Dan >> >> >> On 2/19/14 8:59 AM, Jaroslav Bachorik wrote: >>> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: >>>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >>>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>>>>> On 2/14/14 12:33 PM, Daniel D. 
Daugherty wrote: >>>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Jaroslav, >>>>>>>> >>>>>>>> It looks good in general modulo indent comments from Dan. >>>>>>>> >>>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is >>>>>>>> needed >>>>>>>> or right thing to do in the >>>>>>>> JvmtiExport::clear_detected_exception(). >>>>>>>> It seems, both clear_exception_detected() and >>>>>>>> set_exception_detected() are always >>>>>>>> called on current thread and so, it has to be safe to do without >>>>>>>> acquiring any locks. >>>>>>> >>>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>>>>> can also be queried/modified by other threads so grabbing the >>>>>>> associated lock is a good idea. >>>>>> >>>>>> The lock synchronization is cooperative. >>>>>> It does not help much if the lock is not acquired in other places. >>>>>> I can be wrong, but I've not found yet any place in the code >>>>>> where the >>>>>> clear_exception_detected() and set_exception_detected() are called >>>>>> under protection of the JvmtiThreadState_lock. >>>>> >>>>> I copied the locking over from >>>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is >>>>> also >>>>> supposed to work only with the current thread but acquires the lock >>>>> nonetheless. But if you are sure that the lock is not required I have >>>>> no objections removing it. >>>> >>>> I'm suggesting to remove it, as it is not used in other places in the >>>> code. >>>> It is going to be confusing if it is used in one place and missed in >>>> others. >>> >>> I've removed the lock and applied the same cleanup logic to other >>> places where exceptions are rewrapped. 
>>> >>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 >>> JPRT run: >>> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ >>> >>> >>> Aurora Adhoc: >>> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full >>> >>> (still running at the moment; no failures so far) >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> -JB- >>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> And I'm repeating my question about pre-integration testing >>>>>>>> (Dan is >>>>>>>> asking about the same). >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>>>>> This is a round-0 review request. >>>>>>>>> >>>>>>>>> The reflection code intercepting the exceptions thrown in the >>>>>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>>>>> case, propagates to JDI). >>>>>>>>> >>>>>>>>> The reflection code lacks the traditional error handler - >>>>>>>>> therefore, >>>>>>>>> upon throwing the NumberFormatException, the stack is searched >>>>>>>>> for >>>>>>>>> appropriate handlers and none are found. This leaves the >>>>>>>>> "exception_detected" flag set to true while normally it would be >>>>>>>>> reset to false once the exception is handled. The reflection code >>>>>>>>> then goes on and wraps the NumberFormatException into >>>>>>>>> InvocationTargetException and throws it. But, alas, the >>>>>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>>>>> exception event will be sent out. >>>>>>>>> >>>>>>>>> The proposed solution is to call >>>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>>>>> appropriate places in the reflection code to reset the >>>>>>>>> "exception_detected" flag and enable the >>>>>>>>> InvocationTargetException >>>>>>>>> be properly reported over JVMTI. 
>>>>>>>>> >>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>>>>> >>>>>>>>> Thanks! >>>>>>>>> >>>>>>>>> -JB- >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> >>> >> > From david.holmes at oracle.com Thu Feb 20 02:40:48 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 20 Feb 2014 20:40:48 +1000 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5304D4CC.1020007@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com> Message-ID: <5305DBB0.3080809@oracle.com> Hi Jaroslav, instanceKlass.cpp: Comment is wrong: 913 // JVMTI internal flag reset is needed in order to report InvocationTargetException It will be ExceptionInInitializerError You added this: 917 this_oop->set_initialization_state_and_notify(initialization_error, THREAD); 918 CLEAR_PENDING_EXCEPTION; // ignore any exception thrown, class initialization error is thrown below + 919 // JVMTI has already reported the pending exception + 920 // JVMTI internal flag reset is needed in order to report InvocationTargetException + 921 JvmtiExport::clear_detected_exception((JavaThread*)THREAD); but there are a number of places where set_initialization_state_and_notify is called when a pending exception has been cleared, and then CLEAR_PENDING_EXCEPTION is called again, but you didn't modify those other locations. They will rethrow the original exception so I suppose that is okay from JVMTI's perspective. But the flip-side of this is that if set_initialization_state_and_notify does throw an exception, JVMTI will never see it. --- jvm.cpp Comment is wrong again - not InvocationTargetException. 
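For readers following the thread, the flag behaviour under review can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not HotSpot code, and ThreadStateModel, post_exception_event and rewrap are hypothetical names invented for the illustration. Only the name clear_exception_detected and the two exception class names are taken from the messages above.

```cpp
#include <string>
#include <vector>

// Toy model only, NOT HotSpot code. All type and function names here are
// hypothetical, apart from clear_exception_detected, which mirrors the name
// used in the thread.
struct ThreadStateModel {
  bool exception_detected = false;
  std::vector<std::string> reported;  // events an agent would have seen

  // Report an exception unless one is already marked as detected.
  bool post_exception_event(const std::string& name) {
    if (exception_detected) return false;  // event is suppressed
    exception_detected = true;
    reported.push_back(name);
    return true;
  }

  void clear_exception_detected() { exception_detected = false; }
};

// Simulates throwing an inner exception that finds no handler and is then
// rewrapped and rethrown, with or without resetting the flag in between.
std::vector<std::string> rewrap(bool clear_flag_before_rewrap) {
  ThreadStateModel state;
  state.post_exception_event("NumberFormatException");
  if (clear_flag_before_rewrap) state.clear_exception_detected();
  state.post_exception_event("InvocationTargetException");
  return state.reported;
}
```

In this model, rewrap(false) records only the inner exception, which corresponds to the missing event described in the original review request, while rewrap(true) records both. Whether the real flag handling has further conditions is not something this sketch claims to capture.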
--- David ------ On 20/02/2014 1:59 AM, Jaroslav Bachorik wrote: > On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: >> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>>> Jaroslav, >>>>>> >>>>>> It looks good in general modulo indent comments from Dan. >>>>>> >>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed >>>>>> or right thing to do in the JvmtiExport::clear_detected_exception(). >>>>>> It seems, both clear_exception_detected() and >>>>>> set_exception_detected() are always >>>>>> called on current thread and so, it has to be safe to do without >>>>>> acquiring any locks. >>>>> >>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>>> can also be queried/modified by other threads so grabbing the >>>>> associated lock is a good idea. >>>> >>>> The lock synchronization is cooperative. >>>> It does not help much if the lock is not acquired in other places. >>>> I can be wrong, but I've not found yet any place in the code where the >>>> clear_exception_detected() and set_exception_detected() are called >>>> under protection of the JvmtiThreadState_lock. >>> >>> I copied the locking over from >>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also >>> supposed to work only with the current thread but acquires the lock >>> nonetheless. But if you are sure that the lock is not required I have >>> no objections removing it. >> >> I'm suggesting to remove it, as it is not used in other places in the >> code. >> It is going to be confusing if it is used in one place and missed in >> others. > > I've removed the lock and applied the same cleanup logic to other places > where exceptions are rewrapped. 
> > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 > JPRT run: > http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ > > Aurora Adhoc: > http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full > (still running at the moment; no failures so far) > >> >> Thanks, >> Serguei >> >>> >>> -JB- >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> And I'm repeating my question about pre-integration testing (Dan is >>>>>> asking about the same). >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>>> This is a round-0 review request. >>>>>>> >>>>>>> The reflection code intercepting the exceptions thrown in the >>>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>>> case, propagates to JDI). >>>>>>> >>>>>>> The reflection code lacks the traditional error handler - therefore, >>>>>>> upon throwing the NumberFormatException, the stack is searched for >>>>>>> appropriate handlers and none are found. This leaves the >>>>>>> "exception_detected" flag set to true while normally it would be >>>>>>> reset to false once the exception is handled. The reflection code >>>>>>> then goes on and wraps the NumberFormatException into >>>>>>> InvocationTargetException and throws it. But, alas, the >>>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>>> exception event will be sent out. >>>>>>> >>>>>>> The proposed solution is to call >>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>>> appropriate places in the reflection code to reset the >>>>>>> "exception_detected" flag and enable the InvocationTargetException >>>>>>> be properly reported over JVMTI. >>>>>>> >>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>>> >>>>>>> Thanks! 
>>>>>>> >>>>>>> -JB- >>>>>> >>>>> >>>> >>>> >>> >> > From staffan.larsen at oracle.com Thu Feb 20 03:22:31 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 20 Feb 2014 12:22:31 +0100 Subject: JDK-8022203 - Intermittent test failures in demo/jvmti/hprof In-Reply-To: <5305DBF0.5070902@oracle.com> References: <5302016A.1020906@oracle.com> <6A4175A3-DAEC-42D2-99E6-42960B6122E1@oracle.com> <5305AC54.2010102@oracle.com> <27761889-3A44-4A4B-932C-91D25EEE94B5@oracle.com> <5305BA1F.4050700@oracle.com> <5305DBF0.5070902@oracle.com> Message-ID: Looks good! Thanks, /Staffan On 20 feb 2014, at 11:41, Sergei Kovalev wrote: > > Hello everyone, > > I'm working on https://bugs.openjdk.java.net/browse/JDK-8022203. > > As discussed, I have prepared a fix for the issue: http://cr.openjdk.java.net/~anazarov/8022203/webrev.02/ > > The failure is caused by slippage in IO operations due to file system performance. To avoid this situation we will use a unique file name for each profiling dump. In this case there is zero probability that tests overlap on the output file. By default the file name for dump output is built from the library name plus a .java. prefix and a .txt suffix. The hprof tool supports a "file" option that provides the output file name. Therefore we can just add a unique file name to the options string for each individual test that uses hprof. > > Testing done: > - single run of tests > - Stability run of all modified tests > - Modified stability run. I made some test modifications that helped me to increase the reproducibility rate. It took 500-1000 iterations instead of 10,000 in the original case. The test with the modified options string ran 10,000 iterations. > > Could you please approve the fix? 
> > -- > With best regards, > Sergei > From jaroslav.bachorik at oracle.com Thu Feb 20 05:41:37 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 20 Feb 2014 14:41:37 +0100 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5305DBB0.3080809@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com> <5305DBB0.3080809@oracle.com> Message-ID: <53060611.3040209@oracle.com> On 20.2.2014 11:40, David Holmes wrote: > Hi Jaroslav, > > instanceKlass.cpp: > > Comment is wrong: > > 913 // JVMTI internal flag reset is needed in order to report > InvocationTargetException > > It will be ExceptionInInitializerError Will fix. A copy-paste error. > > You added this: > > 917 > this_oop->set_initialization_state_and_notify(initialization_error, > THREAD); > 918 CLEAR_PENDING_EXCEPTION; // ignore any exception thrown, > class initialization error is thrown below > + 919 // JVMTI has already reported the pending exception > + 920 // JVMTI internal flag reset is needed in order to report > InvocationTargetException > + 921 JvmtiExport::clear_detected_exception((JavaThread*)THREAD); > > but there are a number of places where > set_initialization_state_and_notify is called when a pending exception > has been cleared, and then CLEAR_PENDING_EXCEPTION is called again, but > you didn't modify those other locations. They will rethrow the original > exception so I suppose that is okay from JVMTI's perspective. But the > flip-side of this is that if set_initialization_state_and_notify does > throw an exception, JVMTI will never see it. I don't know if it is supposed to see it. It seems that any exception thrown from set_initialization_state_and_notify is thoroughly ignored and hidden from the outer world. 
Perhaps someone more experienced in JVMTI than me would like to chime in here? Serguei? -JB- > > --- > > jvm.cpp > > Comment is wrong again - not InvocationTargetException. > > --- > > David > ------ > > > > > On 20/02/2014 1:59 AM, Jaroslav Bachorik wrote: >> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: >>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>>>> Jaroslav, >>>>>>> >>>>>>> It looks good in general modulo indent comments from Dan. >>>>>>> >>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is >>>>>>> needed >>>>>>> or right thing to do in the JvmtiExport::clear_detected_exception(). >>>>>>> It seems, both clear_exception_detected() and >>>>>>> set_exception_detected() are always >>>>>>> called on current thread and so, it has to be safe to do without >>>>>>> acquiring any locks. >>>>>> >>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>>>> can also be queried/modified by other threads so grabbing the >>>>>> associated lock is a good idea. >>>>> >>>>> The lock synchronization is cooperative. >>>>> It does not help much if the lock is not acquired in other places. >>>>> I can be wrong, but I've not found yet any place in the code where the >>>>> clear_exception_detected() and set_exception_detected() are called >>>>> under protection of the JvmtiThreadState_lock. >>>> >>>> I copied the locking over from >>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also >>>> supposed to work only with the current thread but acquires the lock >>>> nonetheless. But if you are sure that the lock is not required I have >>>> no objections removing it. >>> >>> I'm suggesting to remove it, as it is not used in other places in the >>> code. >>> It is going to be confusing if it is used in one place and missed in >>> others. 
>> >> I've removed the lock and applied the same cleanup logic to other places >> where exceptions are rewrapped. >> >> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 >> JPRT run: >> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ >> >> >> Aurora Adhoc: >> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full >> >> (still running at the moment; no failures so far) >> >>> >>> Thanks, >>> Serguei >>> >>>> >>>> -JB- >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> And I'm repeating my question about pre-integration testing (Dan is >>>>>>> asking about the same). >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>>>> This is a round-0 review request. >>>>>>>> >>>>>>>> The reflection code intercepting the exceptions thrown in the >>>>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>>>> case, propagates to JDI). >>>>>>>> >>>>>>>> The reflection code lacks the traditional error handler - >>>>>>>> therefore, >>>>>>>> upon throwing the NumberFormatException, the stack is searched for >>>>>>>> appropriate handlers and none are found. This leaves the >>>>>>>> "exception_detected" flag set to true while normally it would be >>>>>>>> reset to false once the exception is handled. The reflection code >>>>>>>> then goes on and wraps the NumberFormatException into >>>>>>>> InvocationTargetException and throws it. But, alas, the >>>>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>>>> exception event will be sent out. >>>>>>>> >>>>>>>> The proposed solution is to call >>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>>>> appropriate places in the reflection code to reset the >>>>>>>> "exception_detected" flag and enable the InvocationTargetException >>>>>>>> be properly reported over JVMTI. 
>>>>>>>> >>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>>>> >>>>>>>> Thanks! >>>>>>>> >>>>>>>> -JB- >>>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> From jaroslav.bachorik at oracle.com Thu Feb 20 06:16:55 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 20 Feb 2014 15:16:55 +0100 Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object In-Reply-To: References: <530388CC.9080807@oracle.com> <53045D76.5080907@oracle.com> Message-ID: <53060E57.6010300@oracle.com> This was discussed when reviewing ThreadMXBean/SynchronizationStatistics.java Regarding a busy wait checking the thread states and issuing Thread.yield() now and then David Holmes wrote: "Not elegant and not completely reliable either. Probably adequate on a multi-core system but single-core and with some schedulers it could just be a busy spin." [1] As far as I understand, the only benefit of using Thread.yield() instead of Thread.sleep() would be 10-100ms shorter execution time, right? -JB- [1] http://mail.openjdk.java.net/pipermail/jmx-dev/2013-October/000484.html On 19.2.2014 17:56, Martin Buchholz wrote: > The jsr166 tck tests make systematic use of Thread.yield, e.g. > > > /** > * Spin-waits up to the specified number of milliseconds for the given > * thread to enter a wait state: BLOCKED, WAITING, or TIMED_WAITING. 
> */ > void waitForThreadToEnterWaitState(Thread thread, long timeoutMillis) { > long startTime = System.nanoTime(); > for (;;) { > Thread.State s = thread.getState(); > if (s == Thread.State.BLOCKED || > s == Thread.State.WAITING || > s == Thread.State.TIMED_WAITING) > return; > else if (s == Thread.State.TERMINATED) > fail("Unexpected thread termination"); > else if (millisElapsedSince(startTime) > timeoutMillis) { > threadAssertTrue(thread.isAlive()); > return; > } > Thread.yield(); > } > } > > > > On Tue, Feb 18, 2014 at 11:29 PM, Jaroslav Bachorik < > jaroslav.bachorik at oracle.com> wrote: > >> On 18.2.2014 18:06, Martin Buchholz wrote: >> >>> Not checking any details, but tests that want to wait for a particular >>> thread state are a good reason to use >>> >>> volatile boolean flag; >>> ... >>> while (!flag) Thread.yield(); >>> >>> I prefer calling Thread.yield to sleeping in this special case, in part >>> because I don't want to rely on the implementation of sleep, while yield >>> is >>> semantically a no-op. (Also sleeping 100ms is a long time for a computer) >>> >> >> There were discussions for a similar fix regarding Thread.yield(). The >> concern was that using Thread.yield() in a tight loop might very easily >> lead to starvation on single core machines. Therefore Thread.sleep(10) is >> used to be sure the flag setting thread has actually a chance to progress. >> >> -JB- >> >> >> >>> >>> >>> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik < >>> jaroslav.bachorik at oracle.com> wrote: >>> >>> Please, review the following test change. >>>> >>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168 >>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00 >>>> >>>> The test fails because of falsely evaluating the thread being parked as >>>> actually waiting on a monitor. This is because there is no difference in >>>> java thread state for those two situations. 
The test is using Phaser for >>>> synchronization between the checked and checking thread to make sure an >>>> appropriate code section is entered before performing asserts. Then it >>>> checks the checked thread state and waits till it becomes WAITING. >>>> Unfortunately, when Phaser needs to wait it parks the thread and sets the >>>> thread state to WAITING. From now on the test is in a completely random >>>> state and the result will largely depend on timing - thus failing >>>> intermittently. >>>> >>>> The solution is to use an additional volatile variable to prevent falsely >>>> indicating the park() induced WAITING state. >>>> >>>> Thanks, >>>> >>>> -JB- >>>> >>>> >>> >> > From jaroslav.bachorik at oracle.com Thu Feb 20 06:21:29 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 20 Feb 2014 15:21:29 +0100 Subject: RFR 8035395: sun/management/jmxremote/startstop/JMXStartStopTest.java fails intermittently: Port already in use Message-ID: <53060F69.7050704@oracle.com> Please, review this test fix. Issue : https://bugs.openjdk.java.net/browse/JDK-8035395 Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00 Currently, the test is using two fixed ports to start JMX connector and RMI registry when necessary. It can not deal with situations when the ports are not available. The patch is adding the ability to obtain ports from the ephemeral range and use them instead of the hardcoded ones. It also tries to minimize the chance of another process stealing the ports by holding the corresponding SocketServers open till right before the port is actually needed. 
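A rough sketch of the port-reservation idea described above. It is only an illustration and not the code from the webrev: the class name `EphemeralPortHolder` and its methods are hypothetical. It assumes that `new ServerSocket(0)` lets the OS pick a free ephemeral port, keeps that socket open as a placeholder, and closes it only right before the port is handed over to whatever needs it.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical helper (not from the webrev): reserves an ephemeral port until it is needed. */
class EphemeralPortHolder implements AutoCloseable {
    private final ServerSocket placeholder;
    private final int port;

    EphemeralPortHolder() throws IOException {
        // Port 0 asks the OS to choose a currently free port.
        placeholder = new ServerSocket(0);
        port = placeholder.getLocalPort();
    }

    /** The reserved port number. */
    int getPort() {
        return port;
    }

    /**
     * Closes the placeholder socket and returns the port number.
     * Call this immediately before starting the component that binds the port;
     * another process could still grab the port in the short window after this call.
     */
    int release() throws IOException {
        placeholder.close();
        return port;
    }

    @Override
    public void close() throws IOException {
        placeholder.close();
    }
}
```

Since the placeholder never accepts a connection, closing it should not leave the port in TIME_WAIT, but the window between `release()` and the real bind is not closed entirely; the sketch only narrows it.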
Thanks, -JB- From sergei.kovalev at oracle.com Thu Feb 20 02:41:52 2014 From: sergei.kovalev at oracle.com (Sergei Kovalev) Date: Thu, 20 Feb 2014 14:41:52 +0400 Subject: JDK-8022203 - Intermittent test failures in demo/jvmti/hprof In-Reply-To: References: <5302016A.1020906@oracle.com> <6A4175A3-DAEC-42D2-99E6-42960B6122E1@oracle.com> <5305AC54.2010102@oracle.com> <27761889-3A44-4A4B-932C-91D25EEE94B5@oracle.com> <5305BA1F.4050700@oracle.com> Message-ID: <5305DBF0.5070902@oracle.com> Hello everyone, I'm working on https://bugs.openjdk.java.net/browse/JDK-8022203. As discussed, I prepared a fix for the issue: http://cr.openjdk.java.net/~anazarov/8022203/webrev.02/ The failure is caused by slippage in IO operations due to file system performance. To avoid this situation we will use a unique file name for the profiling dump. In this case there is zero probability that tests overlap on the output file. By default the file name for the dump output is built from the library name plus the .java. prefix and .txt suffix. The hprof tool supports a "file" option that is used for providing the output file name. Therefore we can just add a unique file name to the options string for each individual test that uses hprof.
Testing done:
- Single run of tests
- Stability run of all modified tests
- Modified stability run. I made some test modifications that helped me increase the reproducibility rate. It took 500-1000 iterations instead of 10,000 in the original case. The test with the modified options string ran 10,000 iterations.
Could you please approve the fix?
-- With best regards, Sergei From martinrb at google.com Thu Feb 20 09:04:25 2014 From: martinrb at google.com (Martin Buchholz) Date: Thu, 20 Feb 2014 09:04:25 -0800 Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object In-Reply-To: <53060E57.6010300@oracle.com> References: <530388CC.9080807@oracle.com> <53045D76.5080907@oracle.com> <53060E57.6010300@oracle.com> Message-ID: I think David is too pessimistic about Thread.yield being ineffective on Java SE implementations (OTOH David is a Java Embedded expert). In practice an implementation that never thread switched out of a yield() loop would not pass the tck. As for theory: it's true that Thread.yield has no progress guarantees, but then neither does Thread.sleep. A perverse implementation can always starve any thread it feels like. Anyways, there is nothing wrong with your sleep loop. Except maybe you want to time out eventually. On Thu, Feb 20, 2014 at 6:16 AM, Jaroslav Bachorik < jaroslav.bachorik at oracle.com> wrote: > This was discussed when reviewing ThreadMXBean/ > SynchronizationStatistics.java > > Regarding a busy wait checking the thread states and issuing > Thread.yield() now and then David Holmes wrote: > "Not elegant and not completely reliable either. Probably adequate on a > multi-core system but single-core and with some schedulers it could just > be a busy spin." [1] > > As far as I understand, the only benefit of using Thread.yield() instead > of Thread.sleep() would be 10-100ms shorter execution time, right? > > -JB- > > [1] http://mail.openjdk.java.net/pipermail/jmx-dev/2013- > October/000484.html > > > On 19.2.2014 17:56, Martin Buchholz wrote: > >> The jsr166 tck tests make systematic use of Thread.yield, e.g. >> >> >> /** >> * Spin-waits up to the specified number of milliseconds for the >> given >> * thread to enter a wait state: BLOCKED, WAITING, or TIMED_WAITING. 
>> */ >> void waitForThreadToEnterWaitState(Thread thread, long >> timeoutMillis) { >> long startTime = System.nanoTime(); >> for (;;) { >> Thread.State s = thread.getState(); >> if (s == Thread.State.BLOCKED || >> s == Thread.State.WAITING || >> s == Thread.State.TIMED_WAITING) >> return; >> else if (s == Thread.State.TERMINATED) >> fail("Unexpected thread termination"); >> else if (millisElapsedSince(startTime) > timeoutMillis) { >> threadAssertTrue(thread.isAlive()); >> return; >> } >> Thread.yield(); >> } >> } >> >> >> >> On Tue, Feb 18, 2014 at 11:29 PM, Jaroslav Bachorik < >> jaroslav.bachorik at oracle.com> wrote: >> >> On 18.2.2014 18:06, Martin Buchholz wrote: >>> >>> Not checking any details, but tests that want to wait for a particular >>>> thread state are a good reason to use >>>> >>>> volatile boolean flag; >>>> ... >>>> while (!flag) Thread.yield(); >>>> >>>> I prefer calling Thread.yield to sleeping in this special case, in part >>>> because I don't want to rely on the implementation of sleep, while yield >>>> is >>>> semantically a no-op. (Also sleeping 100ms is a long time for a >>>> computer) >>>> >>>> >>> There were discussions for a similar fix regarding Thread.yield(). The >>> concern was that using Thread.yield() in a tight loop might very easily >>> lead to starvation on single core machines. Therefore Thread.sleep(10) is >>> used to be sure the flag setting thread has actually a chance to >>> progress. >>> >>> -JB- >>> >>> >>> >>> >>>> >>>> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik < >>>> jaroslav.bachorik at oracle.com> wrote: >>>> >>>> Please, review the following test change. >>>> >>>>> >>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168 >>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00 >>>>> >>>>> The test fails because of falsely evaluating the thread being parked as >>>>> actually waiting on a monitor. 
This is because there is no difference >>>>> in >>>>> java thread state for those two situations. The test is using Phaser >>>>> for >>>>> synchronization between the checked and checking thread to make sure an >>>>> appropriate code section is entered before performing asserts. Then it >>>>> checks the checked thread state and waits till it becomes WAITING. >>>>> Unfortunately, when Phaser needs to wait it parks the thread and sets >>>>> the >>>>> thread state to WAITING. From now on the test is in a completely random >>>>> state and the result will largely depend on timing - thus failing >>>>> intermittently. >>>>> >>>>> The solution is to use an additional volatile variable to prevent >>>>> falsely >>>>> indicating the park() induced WAITING state. >>>>> >>>>> Thanks, >>>>> >>>>> -JB- >>>>> >>>>> >>>>> >>>> >>> >> > From staffan.larsen at oracle.com Thu Feb 20 09:42:05 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 20 Feb 2014 18:42:05 +0100 Subject: jmx-dev RFR 8035395: sun/management/jmxremote/startstop/JMXStartStopTest.java fails intermittently: Port already in use In-Reply-To: <53060F69.7050704@oracle.com> References: <53060F69.7050704@oracle.com> Message-ID: Looks good. You have a commented out call to debugPortUsage(pa); which you can remove before pushing. Thanks, /Staffan On 20 feb 2014, at 15:21, Jaroslav Bachorik wrote: > Please, review this test fix. > > Issue : https://bugs.openjdk.java.net/browse/JDK-8035395 > Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00 > > Currently, the test is using two fixed ports to start JMX connector and RMI registry when necessary. It can not deal with situations when the ports are not available.
The patch is adding the ability to obtain ports from the ephemeral range and use them instead of the hardcoded ones. It also tries to minimize the chance of another process stealing the ports by holding the corresponding SocketServers open till right before the port is actually needed. > > Thanks, > > -JB- From shanliang.jiang at oracle.com Thu Feb 20 10:20:08 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Thu, 20 Feb 2014 19:20:08 +0100 Subject: RFR 8035395: sun/management/jmxremote/startstop/JMXStartStopTest.java fails intermittently: Port already in use In-Reply-To: <53060F69.7050704@oracle.com> References: <53060F69.7050704@oracle.com> Message-ID: <53064758.7090009@oracle.com> Jaroslav, The failed tests were: 1, 7, 8, 9 but the tests using this port (port2: 50235) were 1, 3, 4, 6, 7, 8, 9 and tests 2,4,6 were passed. so I think that the problem might be that the port was not fully released when a test was trying to use it, even the port was closed by the previous test. Your solution is to create a Server socket on a free port, then release it when a test needs it. I suspect whether we will fall into same issue here: the port would not be fully released when using it? Shanliang Jaroslav Bachorik wrote: > Please, review this test fix. > > Issue : https://bugs.openjdk.java.net/browse/JDK-8035395 > Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00 > > Currently, the test is using two fixed ports to start JMX connector > and RMI registry when necessary. It can not deal with situations when > the ports are not available. The patch is adding the ability to obtain > ports from the ephemeral range and use them instead of the hardcoded > ones. It also tries to minimize the chance of another process stealing > the ports by holding the corresponding SocketServers open till right > before the port is actually needed. 
> > Thanks, > > -JB- From serguei.spitsyn at oracle.com Thu Feb 20 12:04:42 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 20 Feb 2014 12:04:42 -0800 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap> References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap> Message-ID: <53065FDA.8050608@oracle.com> Hi Axel, We decided that I will sponsor your change. My current plan is to test and push it over weekend. Thanks, Serguei On 2/20/14 1:05 AM, Siebenborn, Axel wrote: > Yes, I need a sponsor for this change. > Thanks for the reviews, > Axel > > >> -----Original Message----- >> From: Volker Simonis [mailto:volker.simonis at gmail.com] >> Sent: Donnerstag, 20. Februar 2014 09:21 >> To: serguei.spitsyn at oracle.com >> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net; >> serviceability-dev at openjdk.java.net; Siebenborn, Axel >> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression stack >> are not visible as roots via jvmti FollowReferences and HeapDump >> >> Axel is covered by the SAP OCA. >> >> Thanks, >> Volker >> >> >> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com >> wrote: >>> It looks good to me too. >>> But does Axel need to sign an OCA ? >>> >>> This is a link to follow: >>> http://openjdk.java.net/contribute/ >>> >>> >>> Thanks, >>> Serguei >>> >>> >>> >>> On 2/19/14 11:28 AM, Coleen Phillimore wrote: >>> >>> >>> This change looks good. Do you need a sponsor? >>> thanks, >>> Coleen >>> >>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote: >>> >>> Hi all, >>> >>> could I have a review for the following change? 
>>> >>> >>> >>> It adds code to iterate over the slots of the expression stack. >>> >>> Found object references are used as heap roots, the same way as object >>> references in locals slots. >>> >>> >>> >>> Bug: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8034867 >>> >>> >>> >>> Webrev: >>> >>> http://sapjvm.com/as/webrevs/8034867 >>> >>> >>> >>> Thanks, >>> >>> Axel >>> >>> >>> From david.holmes at oracle.com Thu Feb 20 23:24:57 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 21 Feb 2014 17:24:57 +1000 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <53060611.3040209@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com> <5305DBB0.3080809@oracle.com> <53060611.3040209@oracle.com> Message-ID: <5306FF49.4010607@oracle.com> On 20/02/2014 11:41 PM, Jaroslav Bachorik wrote: > On 20.2.2014 11:40, David Holmes wrote: >> Hi Jaroslav, >> >> instanceKlass.cpp: >> >> Comment is wrong: >> >> 913 // JVMTI internal flag reset is needed in order to report >> InvocationTargetException >> >> It will be ExceptionInInitializerError > > Will fix. Copypaste ... 
> >> >> You added this: >> >> 917 >> this_oop->set_initialization_state_and_notify(initialization_error, >> THREAD); >> 918 CLEAR_PENDING_EXCEPTION; // ignore any exception thrown, >> class initialization error is thrown below >> + 919 // JVMTI has already reported the pending exception >> + 920 // JVMTI internal flag reset is needed in order to report >> InvocationTargetException >> + 921 JvmtiExport::clear_detected_exception((JavaThread*)THREAD); >> >> but there are a number of places where >> set_initialization_state_and_notify is called when a pending exception >> has been cleared, and then CLEAR_PENDING_EXCEPTION is called again, but >> you didn't modify those other locations. They will rethrow the original >> exception so I suppose that is okay from JVMTI's perspective. But the >> flip-side of this is that if set_initialization_state_and_notify does >> throw an exception, JVMTI will never see it. > > I don't know if it supposed to see it. It seems that any exception > thrown from set_initialization_state_and_notify is thoroughly ignored > and hidden from the outer world. Perhaps someone more experienced in > JVMTI than me would like to chime in here? Serguei? set_initialization_state_and_notify is often called when an exception has already occurred during the class loading/initialization process. It is that original exception that we want to propagate but meanwhile we have to perform this action to update the state and wakeup any waiters. So we cache the original exception, clear it, do the state update and then clear any pending exception (I think the only exception possible here is OOME!), then rethrow the original. If this action did indeed throw OOME then we might not be able to wake up the waiter(s) and that might lead to a hang. While a debug VM could use TraceExceptions to (hopefully) spot the OOME, in a product VM it would be invisible, even if a JVMTI agent was tracking exceptions. So I think it should be visible to JVMTI. 
I would like to hear other opinions though. However this is going beyond the scope of fixing these particular tests so I'm fine if this is simply recorded in another bug for future clean up. Thanks, David > -JB- > >> >> --- >> >> jvm.cpp >> >> Comment is wrong again - not InvocationTargetException. >> >> --- >> >> David >> ------ >> >> >> >> >> On 20/02/2014 1:59 AM, Jaroslav Bachorik wrote: >>> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: >>>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >>>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Jaroslav, >>>>>>>> >>>>>>>> It looks good in general modulo indent comments from Dan. >>>>>>>> >>>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is >>>>>>>> needed >>>>>>>> or right thing to do in the >>>>>>>> JvmtiExport::clear_detected_exception(). >>>>>>>> It seems, both clear_exception_detected() and >>>>>>>> set_exception_detected() are always >>>>>>>> called on current thread and so, it has to be safe to do without >>>>>>>> acquiring any locks. >>>>>>> >>>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>>>>> can also be queried/modified by other threads so grabbing the >>>>>>> associated lock is a good idea. >>>>>> >>>>>> The lock synchronization is cooperative. >>>>>> It does not help much if the lock is not acquired in other places. >>>>>> I can be wrong, but I've not found yet any place in the code where >>>>>> the >>>>>> clear_exception_detected() and set_exception_detected() are called >>>>>> under protection of the JvmtiThreadState_lock. >>>>> >>>>> I copied the locking over from >>>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also >>>>> supposed to work only with the current thread but acquires the lock >>>>> nonetheless. 
But if you are sure that the lock is not required I have >>>>> no objections removing it. >>>> >>>> I'm suggesting to remove it, as it is not used in other places in the >>>> code. >>>> It is going to be confusing if it is used in one place and missed in >>>> others. >>> >>> I've removed the lock and applied the same cleanup logic to other places >>> where exceptions are rewrapped. >>> >>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 >>> JPRT run: >>> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ >>> >>> >>> >>> Aurora Adhoc: >>> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full >>> >>> >>> (still running at the moment; no failures so far) >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> -JB- >>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> And I'm repeating my question about pre-integration testing (Dan is >>>>>>>> asking about the same). >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>>>>> This is a round-0 review request. >>>>>>>>> >>>>>>>>> The reflection code intercepting the exceptions thrown in the >>>>>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>>>>> case, propagates to JDI). >>>>>>>>> >>>>>>>>> The reflection code lacks the traditional error handler - >>>>>>>>> therefore, >>>>>>>>> upon throwing the NumberFormatException, the stack is searched for >>>>>>>>> appropriate handlers and none are found. This leaves the >>>>>>>>> "exception_detected" flag set to true while normally it would be >>>>>>>>> reset to false once the exception is handled. The reflection code >>>>>>>>> then goes on and wraps the NumberFormatException into >>>>>>>>> InvocationTargetException and throws it. But, alas, the >>>>>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>>>>> exception event will be sent out. 
>>>>>>>>> >>>>>>>>> The proposed solution is to call >>>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>>>>> appropriate places in the reflection code to reset the >>>>>>>>> "exception_detected" flag and enable the InvocationTargetException >>>>>>>>> be properly reported over JVMTI. >>>>>>>>> >>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>>>>> >>>>>>>>> Thanks! >>>>>>>>> >>>>>>>>> -JB- >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> > From jaroslav.bachorik at oracle.com Thu Feb 20 23:35:30 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 21 Feb 2014 08:35:30 +0100 Subject: RFR 8035395: sun/management/jmxremote/startstop/JMXStartStopTest.java fails intermittently: Port already in use In-Reply-To: <53064758.7090009@oracle.com> References: <53060F69.7050704@oracle.com> <53064758.7090009@oracle.com> Message-ID: <530701C2.1020003@oracle.com> Hi Shanliang, On 20.2.2014 19:20, shanliang wrote: > Jaroslav, > > The failed tests were: > 1, 7, 8, 9 > > but the tests using this port (port2: 50235) were > 1, 3, 4, 6, 7, 8, 9 > > and tests 2,4,6 were passed. > > so I think that the problem might be that the port was not fully > released when a test was trying to use it, even the port was closed by > the previous test. I don't think this is the case. Firstly, test 1 fails. There are no previous tests possibly holding on to the required port. So the port must have been taken by some foreign process. The port number being from the ephemeral range doesn't help either, quite contrary. Secondly, a port can not be partially bound - either a process binds to a port or not. Also, when a process exits all the bound ports must be released. Since we are waiting for the exit code of the launched test application before proceeding all the ports used by that application must be released before the main test routine can continue. 
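The all-or-nothing nature of binding can be checked with a small experiment. The sketch below is illustrative only: `PortBindingCheck` and `canBind` are made-up names, and the behaviour described is what is expected on typical platforms rather than something guaranteed for every OS and socket option combination. It probes a port by trying to open a listening `ServerSocket` on it and reports whether the bind succeeded.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical probe: a port is either bindable right now or it is not. */
class PortBindingCheck {
    /** Returns true if a listening socket can currently be bound to the given port. */
    static boolean canBind(int port) {
        // The probe socket is closed again immediately by try-with-resources.
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Typically a BindException when another socket is already listening on the port.
            return false;
        }
    }
}
```

While another listening socket owns the port, `canBind` is expected to return false; once that socket is closed without ever having accepted a connection, it is expected to return true again.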
> > Your solution is to create a Server socket on a free port, then release > it when a test needs it. I suspect whether we will fall into same issue > here: the port would not be fully released when using it? No. SocketServer.close() is called synchronously right before the port is going to be used. This call unbinds the socket and returns. At the moment of the return the port is free. I've run the tests locally and via JPRT and they are all passing. Thanks, -JB- > > Shanliang > > Jaroslav Bachorik wrote: >> Please, review this test fix. >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00 >> >> Currently, the test is using two fixed ports to start JMX connector >> and RMI registry when necessary. It can not deal with situations when >> the ports are not available. The patch is adding the ability to obtain >> ports from the ephemeral range and use them instead of the hardcoded >> ones. It also tries to minimize the chance of another process stealing >> the ports by holding the corresponding SocketServers open till right >> before the port is actually needed. >> >> Thanks, >> >> -JB- > From shanliang.jiang at oracle.com Fri Feb 21 01:26:26 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Fri, 21 Feb 2014 10:26:26 +0100 Subject: RFR 8035395: sun/management/jmxremote/startstop/JMXStartStopTest.java fails intermittently: Port already in use In-Reply-To: <530701C2.1020003@oracle.com> References: <53060F69.7050704@oracle.com> <53064758.7090009@oracle.com> <530701C2.1020003@oracle.com> Message-ID: <53071BC2.3060307@oracle.com> Jaroslav Bachorik wrote: > Hi Shanliang, > > On 20.2.2014 19:20, shanliang wrote: >> Jaroslav, >> >> The failed tests were: >> 1, 7, 8, 9 >> >> but the tests using this port (port2: 50235) were >> 1, 3, 4, 6, 7, 8, 9 >> >> and tests 2,4,6 were passed. 
>> >> so I think that the problem might be that the port was not fully >> released when a test was trying to use it, even the port was closed by >> the previous test. > > I don't think this is the case. > > Firstly, test 1 fails. There are no previous tests possibly holding on > to the required port. So the port must have been taken by some foreign > process. The port number being from the ephemeral range doesn't help > either, quite contrary. > > Secondly, a port can not be partially bound - either a process binds > to a port or not. Also, when a process exits all the bound ports must > be released. Since we are waiting for the exit code of the launched > test application before proceeding all the ports used by that > application must be released before the main test routine can continue. Why did exception "Port already in use error: " happen for Test1, then the port was free for 3/4/6, and then the exception appeared again for 7, 8, 9? A port is possibly unavailable after being closed, because it can be in the state TIME_WAIT. Your fix created a server socket but no client would connect to it, then the port could be available immediately after close(), not need to enter TIME_WAIT state, if so hopeful the fix could work. Shanliang > >> >> Your solution is to create a Server socket on a free port, then release >> it when a test needs it. I suspect whether we will fall into same issue >> here: the port would not be fully released when using it? > > No. SocketServer.close() is called synchronously right before the port > is going to be used. This call unbinds the socket and returns. At the > moment of the return the port is free. I've run the tests locally and > via JPRT and they are all passing. > > Thanks, > > -JB- > >> >> Shanliang >> >> Jaroslav Bachorik wrote: >>> Please, review this test fix. 
>>> >>> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395 >>> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00 >>> >>> Currently, the test is using two fixed ports to start JMX connector >>> and RMI registry when necessary. It can not deal with situations when >>> the ports are not available. The patch is adding the ability to obtain >>> ports from the ephemeral range and use them instead of the hardcoded >>> ones. It also tries to minimize the chance of another process stealing >>> the ports by holding the corresponding SocketServers open till right >>> before the port is actually needed. >>> >>> Thanks, >>> >>> -JB- >> > From jaroslav.bachorik at oracle.com Fri Feb 21 01:50:04 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 21 Feb 2014 10:50:04 +0100 Subject: RFR 8035395: sun/management/jmxremote/startstop/JMXStartStopTest.java fails intermittently: Port already in use In-Reply-To: <53071BC2.3060307@oracle.com> References: <53060F69.7050704@oracle.com> <53064758.7090009@oracle.com> <530701C2.1020003@oracle.com> <53071BC2.3060307@oracle.com> Message-ID: <5307214C.2080009@oracle.com> On 21.2.2014 10:26, shanliang wrote: > Jaroslav Bachorik wrote: >> Hi Shanliang, >> >> On 20.2.2014 19:20, shanliang wrote: >>> Jaroslav, >>> >>> The failed tests were: >>> 1, 7, 8, 9 >>> >>> but the tests using this port (port2: 50235) were >>> 1, 3, 4, 6, 7, 8, 9 >>> >>> and tests 2,4,6 were passed. >>> >>> so I think that the problem might be that the port was not fully >>> released when a test was trying to use it, even the port was closed by >>> the previous test. >> >> I don't think this is the case. >> >> Firstly, test 1 fails. There are no previous tests possibly holding on >> to the required port. So the port must have been taken by some foreign >> process. The port number being from the ephemeral range doesn't help >> either, quite contrary. 
>>
>> Secondly, a port can not be partially bound - either a process binds
>> to a port or not. Also, when a process exits all the bound ports must
>> be released. Since we are waiting for the exit code of the launched
>> test application before proceeding all the ports used by that
>> application must be released before the main test routine can continue.
> Why did exception "Port already in use error: " happen for Test1, then
> the port was free for 3/4/6, and then the exception appeared again for
> 7, 8, 9?

Test1: fails; it tries to start JMX connector on port 50235 and check the connection afterward
Test2: passes; does not use port 50235
Test3: passes; checks for not being able to connect to port 50235
Test4: fails; the same as Test1
Test5: passes; the same as Test2
Test6: fails; it tries to start RMI registry on port 50235 and fails
Test7: fails; the same as Test1
Test8: fails; the same as Test1
Test9: fails; the same as Test1
Test10: passes; the same as Test2
Test11: passes; the same as Test2
Test12: passes; the same as Test2
Test13: passes; the same as Test2

The port doesn't mysteriously become used and unused. It is still occupied by a different process. Some of the tests don't fail simply because they don't use the port.

>
> A port is possibly unavailable after being closed, because it can be in
> the state TIME_WAIT.
>
> Your fix created a server socket but no client would connect to it, then
> the port could be available immediately after close(), not need to enter
> TIME_WAIT state, if so hopeful the fix could work.

ServerSocket does not accept any incoming connection and as such the socket should not go to TIME_WAIT state when it is closed. The JPRT results would indicate that this is indeed the case.

-JB-

>
> Shanliang
>>
>>>
>>> Your solution is to create a Server socket on a free port, then release
>>> it when a test needs it. I suspect whether we will fall into same issue
>>> here: the port would not be fully released when using it?
>>
>> No.
SocketServer.close() is called synchronously right before the port >> is going to be used. This call unbinds the socket and returns. At the >> moment of the return the port is free. I've run the tests locally and >> via JPRT and they are all passing. >> >> Thanks, >> >> -JB- >> >>> >>> Shanliang >>> >>> Jaroslav Bachorik wrote: >>>> Please, review this test fix. >>>> >>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395 >>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00 >>>> >>>> Currently, the test is using two fixed ports to start JMX connector >>>> and RMI registry when necessary. It can not deal with situations when >>>> the ports are not available. The patch is adding the ability to obtain >>>> ports from the ephemeral range and use them instead of the hardcoded >>>> ones. It also tries to minimize the chance of another process stealing >>>> the ports by holding the corresponding SocketServers open till right >>>> before the port is actually needed. >>>> >>>> Thanks, >>>> >>>> -JB- >>> >> > From jaroslav.bachorik at oracle.com Fri Feb 21 02:13:39 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 21 Feb 2014 11:13:39 +0100 Subject: RFR 8033787: [TESTBUG] Few jdk regression management tests should be improved with -XX:+UsePerfData usage Message-ID: <530726D3.6020504@oracle.com> Please, review this very simple test fix. Issue : https://bugs.openjdk.java.net/browse/JDK-8033787 Webrev: http://cr.openjdk.java.net/~jbachorik/8033787/webrev.00 The fix just adds -XX:+UsePerfData parameter to the @run section of the mentioned test. 
-JB- From david.holmes at oracle.com Fri Feb 21 03:16:22 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 21 Feb 2014 21:16:22 +1000 Subject: RFR 8033787: [TESTBUG] Few jdk regression management tests should be improved with -XX:+UsePerfData usage In-Reply-To: <530726D3.6020504@oracle.com> References: <530726D3.6020504@oracle.com> Message-ID: <53073586.3050203@oracle.com> Looks okay to me. David On 21/02/2014 8:13 PM, Jaroslav Bachorik wrote: > Please, review this very simple test fix. > > Issue : https://bugs.openjdk.java.net/browse/JDK-8033787 > Webrev: http://cr.openjdk.java.net/~jbachorik/8033787/webrev.00 > > The fix just adds -XX:+UsePerfData parameter to the @run section of the > mentioned test. > > -JB- From shanliang.jiang at oracle.com Fri Feb 21 04:42:11 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Fri, 21 Feb 2014 13:42:11 +0100 Subject: RFR 8035395: sun/management/jmxremote/startstop/JMXStartStopTest.java fails intermittently: Port already in use In-Reply-To: <5307214C.2080009@oracle.com> References: <53060F69.7050704@oracle.com> <53064758.7090009@oracle.com> <530701C2.1020003@oracle.com> <53071BC2.3060307@oracle.com> <5307214C.2080009@oracle.com> Message-ID: <530749A3.7060802@oracle.com> Jaroslav Bachorik wrote: > On 21.2.2014 10:26, shanliang wrote: >> Jaroslav Bachorik wrote: >>> Hi Shanliang, >>> >>> On 20.2.2014 19:20, shanliang wrote: >>>> Jaroslav, >>>> >>>> The failed tests were: >>>> 1, 7, 8, 9 >>>> >>>> but the tests using this port (port2: 50235) were >>>> 1, 3, 4, 6, 7, 8, 9 >>>> >>>> and tests 2,4,6 were passed. >>>> >>>> so I think that the problem might be that the port was not fully >>>> released when a test was trying to use it, even the port was closed by >>>> the previous test. >>> >>> I don't think this is the case. >>> >>> Firstly, test 1 fails. There are no previous tests possibly holding on >>> to the required port. So the port must have been taken by some foreign >>> process. 
>>> The port number being from the ephemeral range doesn't help
>>> either, quite contrary.
>>>
>>> Secondly, a port can not be partially bound - either a process binds
>>> to a port or not. Also, when a process exits all the bound ports must
>>> be released. Since we are waiting for the exit code of the launched
>>> test application before proceeding all the ports used by that
>>> application must be released before the main test routine can continue.
>> Why did exception "Port already in use error: " happen for Test1, then
>> the port was free for 3/4/6, and then the exception appeared again for
>> 7, 8, 9?
>
> Test1: fails; it tries to start JMX connector on port 50235 and check
> the connection afterward
> Test2: passes; does not use port 50235
> Test3: passes; checks for not being able to connect to port 50235
> Test4: fails; the same as Test1
> Test5: passes; the same as Test2
> Test6: fails; it tries to start RMI registry on port 50235 and fails
> Test7: fails; the same as Test1
> Test8: fails; the same as Test1
> Test9: fails; the same as Test1
> Test10: passes; the same as Test2
> Test11: passes; the same as Test2
> Test12: passes; the same as Test2
> Test13: passes; the same as Test2
>
> The port doesn't mysteriously become used and unused. It is still
> occupied by a different process. Some of the tests don't fail simply
> because they don't use the port.

Yes, right: some tests passed because they did not expect a JMX connection. I am not sure it is possible to distinguish between a port issue and a missing JMX server issue.

The fix looks OK.

Thanks,
Shanliang

>
>>
>> A port is possibly unavailable after being closed, because it can be in
>> the state TIME_WAIT.
>>
>> Your fix created a server socket but no client would connect to it, then
>> the port could be available immediately after close(), not need to enter
>> TIME_WAIT state, if so hopeful the fix could work.
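The pattern being debated in this thread (take a port from the ephemeral range, keep a listening socket bound to it, close that socket right before the port is needed) can be sketched roughly as follows. This is only an illustration, not the code from the webrev: it assumes the class meant by "SocketServer" is java.net.ServerSocket, and PortReservation and reserveAndRebind are hypothetical names. The placeholder socket never accepts a connection, so no connection state should be left behind when it is closed.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not taken from the webrev under review.
final class PortReservation {

    // Reserves an ephemeral port, releases it, and immediately binds it again.
    // Returns the port number on success.
    static int reserveAndRebind() {
        try {
            // Port 0 asks the OS to pick a free port from the ephemeral range.
            ServerSocket placeholder = new ServerSocket(0);
            int port = placeholder.getLocalPort();

            // The placeholder never calls accept(); closing it unbinds the port.
            placeholder.close();

            // Stand-in for the "real" user of the port (JMX connector, RMI registry).
            try (ServerSocket real = new ServerSocket(port)) {
                return real.getLocalPort();
            }
        } catch (IOException e) {
            // A BindException here would mean some other process grabbed the port
            // in the small window between close() and the second bind.
            throw new IllegalStateException(e);
        }
    }
}
```

The window between close() and the second bind is not eliminated, only made as short as possible, which matches the "minimize the chance" wording of the review request.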
> > SocketServer does not accept any incoming connection and as such the > socket should not go to TIME_WAIT state when it is closed. The JPRT > results would indicate that this is indeed the case. > > -JB- > >> >> Shanliang >>> >>>> >>>> Your solution is to create a Server socket on a free port, then >>>> release >>>> it when a test needs it. I suspect whether we will fall into same >>>> issue >>>> here: the port would not be fully released when using it? >>> >>> No. SocketServer.close() is called synchronously right before the port >>> is going to be used. This call unbinds the socket and returns. At the >>> moment of the return the port is free. I've run the tests locally and >>> via JPRT and they are all passing. >>> >>> Thanks, >>> >>> -JB- >>> >>>> >>>> Shanliang >>>> >>>> Jaroslav Bachorik wrote: >>>>> Please, review this test fix. >>>>> >>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395 >>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00 >>>>> >>>>> Currently, the test is using two fixed ports to start JMX connector >>>>> and RMI registry when necessary. It can not deal with situations when >>>>> the ports are not available. The patch is adding the ability to >>>>> obtain >>>>> ports from the ephemeral range and use them instead of the hardcoded >>>>> ones. It also tries to minimize the chance of another process >>>>> stealing >>>>> the ports by holding the corresponding SocketServers open till right >>>>> before the port is actually needed. >>>>> >>>>> Thanks, >>>>> >>>>> -JB- >>>> >>> >> > From staffan.larsen at oracle.com Fri Feb 21 06:10:48 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 21 Feb 2014 15:10:48 +0100 Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to Message-ID: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. 
I would appreciate some extra scrutiny here since I am unfamiliar with this code path. I have also added a test that causes a JVM crash without the fix. The updates to the test library is all code copied from the jdk version of the test library. webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8035150 Thanks, /Staffan From dmitry.samersoff at oracle.com Fri Feb 21 08:08:43 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 21 Feb 2014 20:08:43 +0400 Subject: PING Re: RR(S): JDK-7189721 nightly tests failed on JDI attaching In-Reply-To: <52FE2D9A.4030700@oracle.com> References: <52FE2D9A.4030700@oracle.com> Message-ID: <53077A0B.5010401@oracle.com> On 2014-02-14 18:52, Dmitry Samersoff wrote: > Hi Everyone, > > Please, review the changes. > > http://cr.openjdk.java.net/~dsamersoff/JDK-7189721/webrev.01/ > > Summary: > > If for some reason elf section with section names is not loaded to cache > it attempts to read data using NULL section pointer. > > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From daniel.daugherty at oracle.com Fri Feb 21 19:40:15 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 21 Feb 2014 20:40:15 -0700 Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <52ED3F28.5070806@oracle.com> References: <52ED3F28.5070806@oracle.com> Message-ID: <53081C1F.2070008@oracle.com> Greetings, This is a code review request for the JDK8u-hs-dev backport of the following ObjectMonitor-JVM/TI hang fix: 8028073 race condition in ObjectMonitor implementation causing deadlocks https://bugs.openjdk.java.net/browse/JDK-8028073 Here is the JDK8u-hs-dev webrev URL: http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ This is _almost_ a straight forward backport of the JDK9 fix. 
The only difference to the fix was discussed at the end of the JDK9 review and was determined to only be needed in versions of HotSpot without the fix for 8028280: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 8028280 has not yet been backported to JDK8u-hs-dev. The easiest way to review the backport is to download the two patch files from the webrevs and compare them with something like: jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch The same testing has been performed on the JDK8u-hs-dev version as with the JDK9-hs-runtime version. Thanks, in advance, for any comments, questions or suggestions. Dan On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: > Greetings, > > I have a fix ready for the following bug: > > 8028073 race condition in ObjectMonitor implementation causing deadlocks > https://bugs.openjdk.java.net/browse/JDK-8028073 > > On the surface, this is a very simple fix that relocates a few lines of > code, relocates and rewrites the comments associated with that code and > adds several new comments. > > Of course, in reality, the issue is much more complicated, but I'm > hoping to make it easy for anyone not acquainted with this issue to > understand what's going on. 
> > Here are the JDK9 webrev URLs: > > OpenJDK: > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ > > Oracle internal: > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ > > The simple summary: > > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, > it is possible for a JVM/TI monitor event handler to accidentally > consume a ParkEvent.unpark() call meant for Java Monitor layer > - the original code fix was made on 2005.07.04 using this bug ID: > https://bugs.openjdk.java.net/browse/JDK-5030359 > - it's the right fix, but it's in the wrong place > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED > event handler is called because it is that event handler > that can cause the hang > > > Testing > ------- > > - a new StessMonitorWait test has been created that reliably > reproduces the hang in JDK[6789]; see the bug's gory details > for the specific versions where the hang has been reproduced > - the test reliably reproduces the hang in 5 seconds on my > T7600 running Solaris 10u11 X86; 1 minute runs reproduce > the hang reliably on other machines > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT > bits did not reproduce the hang > - JPRT test job > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, > and Windows-X64: > - vm.quick > - Kitchensink (bigapps) > - Weblogic+medrec (bigapps) > - runThese (bigapps) > > > The Gory Details Start Here > --------------------------- > > This is the old location of block of code that's being moved: > > src/share/vm/runtime/objectMonitor.cpp: > > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > 1499 exit (true, Self) ; // exit the monitor > > 1513 if (node._notified != 0 && _succ == Self) { > 1514 node._event->unpark(); > 1515 } > > > This is the new location of block of code 
that's being moved: > > src/share/vm/runtime/objectMonitor.cpp: > > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > 1601 if (JvmtiExport::should_post_monitor_waited()) { > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT); > > 1604 if (node._notified != 0 && _succ == Self) { > > 1620 node._event->unpark(); > 1621 } > > > The Risks > --------- > > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event > is enabled: > - previously it was always executed > - while the old code was not effective for the hang that is being > fixed with this bug, it is possible that the old code prevented > a different bug in the successor protocol from manifesting > - thorough analysis of the successor protocol did not reveal a > case where the old code was needed in the old location > - Thorough analysis indicates that the other JVM/TI monitor events > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: > - the successor protocol is complicated and the analysis could > be wrong when certain options are used > - comments were added to each location where a JVM/TI monitor > event handler is called documenting why a fix like this one > is not needed there > - if the analysis is wrong, the new comments show where a new > code change would be needed > > > The Scenario > ------------ > > I've created a scenario that reproduces this hang: > > T1 - enters monitor and calls monitor.wait() > T2 - enters the monitor, calls monitor.notify() and exits the monitor > T3 - enters and exits the monitor > T4 - enters the monitor, delays for 5 seconds, exits the monitor > > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. > > Here are the six events necessary to make this hang happen: > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again. 
> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > // KEY-EVENT-3a: T3 made T1 the successor > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck. > > > This bug is intertwined with: > > - The ObjectMonitor successor protocol > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors > > There is a very long successor.notes attachment to JDK-8028073 that > attempts to describe the ObjectMonitor successor protocol. It's good > for putting pretty much anyone to sleep. > > Since this hang reproduces back to JDK6, this bug is taking the easily > backported solution of moving the original fix to the right location. 
> The following new bug has been filed for possible future work in this > area by the Serviceability Team: > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > The Symptoms > ------------ > > With intermittent hangs like this, it is useful to know what to look > for in order to determine if you are running into this issue: > > - if you aren't using a debugger or a profiler or some other > JVM/TI agent, then this hang is not the same as yours > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > event handler, then this hang is not the same as yours > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > JVM/TI RawMonitors, then this hang is not the same as yours > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > back into Java code, then you might just be insane and this > hang might be similar to yours. However, using a Java callback > in an event handler is an even bigger problem/risk so fix that > first. 
> - if you have one or more threads blocked like this and making no
>   progress, then this hang might be the same as yours:
>
> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>    JavaThread state: _thread_blocked
> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>    JavaThread state: _thread_blocked
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>         at java.lang.Object.wait(Object.java:502)
>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>    JavaThread state: _thread_blocked
> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>    JavaThread state: _thread_blocked
>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>    JavaThread state: _thread_blocked
> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>    JavaThread state: _thread_blocked
>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> Key symptoms in thread T1:
>
> - had the object locked:
>
>     locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> - did an Object.wait():
>
>     waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> - is blocked on reentry:
>
>     waiting for monitor entry [0xfffffd7fc0231000]
>
> Key symptoms in thread T2:
>
> - is blocked waiting to lock the object:
>
>     waiting for monitor
entry [0xfffffd7fc0130000] > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > Key symptoms in thread T3: > > - is blocked waiting to lock the object: > > waiting for monitor entry [0xfffffd7fc002f000] > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) From erik.gahlin at oracle.com Sat Feb 22 07:56:51 2014 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Sat, 22 Feb 2014 16:56:51 +0100 Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object In-Reply-To: <530388CC.9080807@oracle.com> References: <530388CC.9080807@oracle.com> Message-ID: <5308C8C3.4020907@oracle.com> Looks good. /Erik Jaroslav Bachorik skrev 2/18/14 5:22 PM: > Please, review the following test change. > > Issue : https://bugs.openjdk.java.net/browse/JDK-8034168 > Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00 > > The test fails because of falsely evaluating the thread being parked > as actually waiting on a monitor. This is because there is no > difference in java thread state for those two situations. The test is > using Phaser for synchronization between the checked and checking > thread to make sure an appropriate code section is entered before > performing asserts. Then it checks the checked thread state and waits > till it becomes WAITING. Unfortunately, when Phaser needs to wait it > parks the thread and sets the thread state to WAITING. From now on the > test is in a completely random state and the result will largely > depend on timing - thus failing intermittently. > > The solution is to use an additional volatile variable to prevent > falsely indicating the park() induced WAITING state. 
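The idea of the additional volatile variable can be sketched as below. This is a simplified illustration under stated assumptions, not the actual Locks.java change: the Phaser used by the real test is left out, and WaitStateCheck, inWaitSection and observeWaiter are hypothetical names. The checked thread sets the flag only after it holds the monitor and immediately before calling wait(), so the checking thread ignores any WAITING state it sees while the flag is still false.

```java
// Hypothetical sketch; the real test coordinates its threads with a Phaser.
final class WaitStateCheck {
    private static final Object LOCK = new Object();

    // Set by the checked thread just before it calls LOCK.wait().
    private static volatile boolean inWaitSection;

    // Starts a thread that waits on LOCK and returns the state observed
    // once the flag confirms the thread has reached the wait() section.
    static Thread.State observeWaiter() {
        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {
                inWaitSection = true;
                try {
                    while (inWaitSection) {   // loop guards against spurious wakeups
                        LOCK.wait();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        try {
            waiter.start();
            Thread.State seen;
            boolean flag;
            do {
                Thread.sleep(10);
                flag = inWaitSection;       // read the flag first ...
                seen = waiter.getState();   // ... then the thread state
            } while (!flag || seen != Thread.State.WAITING);

            // Release the waiter so the sketch terminates cleanly.
            synchronized (LOCK) {
                inWaitSection = false;
                LOCK.notifyAll();
            }
            waiter.join();
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Reading the flag before the state matters: a WAITING state sampled after the flag was seen as true cannot come from an earlier park() in the synchronization helper.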
> > Thanks, > > -JB- From Alan.Bateman at oracle.com Sun Feb 23 01:19:33 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Sun, 23 Feb 2014 09:19:33 +0000 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <5304F64E.4000206@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> <530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com> <53031AE4.2050700@oracle.com> <5303B85D.10505@oracle.com> <5304E4B8.6040908@oracle.com> <5304F64E.4000206@oracle.com> Message-ID: <5309BD25.2030407@oracle.com> On 19/02/2014 18:22, Mikael Vidstedt wrote: > : > > The documented grammar in the comment only mentions "SPACE" and the > code below doesn't make any references to \t. As a matter of fact, it > only checks for one single, mandatory SPACE after the colon (enforced > at line 535-536) and doesn't care to remove any space characters at > the end of the value. The while loop only deals with continuations. If > additional spaces do exist they will as far as I can tell be part of > the value. Are they trimmed later? I'm assuming it would be nice to > have both parsers (parse_manifest & JarFacade) behave the same way? > > Here's what it would look like to only check for space, but still eat > any additional spaces which doesn't match what > parse_manifest/parse_nv_pair does: > > http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/ > Sorry for the delay getting back to you on this (I've been busy with other things). I checked the JAR File Specification, which is turn references RFC 822 as the "inspiration" for the name-value pairs. The SPACE token is just ASCII SP. So I agree it's just ASCII SP that needs to be handled here, not LWSP-char which includes ASCII HT. 
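To make the SP versus HT distinction concrete, here is a minimal Java sketch. It is not the parse_manifest or JarFacade code; ManifestHeaderSketch and parseHeader are hypothetical names, and continuation lines are ignored. It requires exactly one ASCII SP after the colon and then trims only further ASCII SP characters, so a tab is kept as part of the value.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical illustration of "name: value" handling with ASCII SP only.
final class ManifestHeaderSketch {

    static Map.Entry<String, String> parseHeader(String line) {
        int colon = line.indexOf(':');
        // The grammar requires a single SPACE (ASCII SP) right after the colon.
        if (colon <= 0 || colon + 1 >= line.length() || line.charAt(colon + 1) != ' ') {
            throw new IllegalArgumentException("expected \"name: value\": " + line);
        }
        String name = line.substring(0, colon);

        // Trim leading and trailing ASCII SP only; HT and other characters stay.
        int start = colon + 2;
        int end = line.length();
        while (start < end && line.charAt(start) == ' ') {
            start++;
        }
        while (end > start && line.charAt(end - 1) == ' ') {
            end--;
        }
        return new SimpleEntry<>(name, line.substring(start, end));
    }
}
```

With this sketch, "Agent-Class:   Foo  " yields the value "Foo", while a tab after the mandatory space remains in the value and a tab in place of the space is rejected.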
Looking at JDK-6274276 then the trimming was done to avoid hard-to-diagnose problems with leading/trailing spaces. It's possible that this is inconsistent with other areas where JAR file attributes are used. I would suggest leaving it as is for now as this is potentially changing behavior in several areas. -Alan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140223/d6a19933/attachment.html From serguei.spitsyn at oracle.com Sun Feb 23 23:32:33 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sun, 23 Feb 2014 23:32:33 -0800 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: <53065FDA.8050608@oracle.com> References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap> <53065FDA.8050608@oracle.com> Message-ID: <530AF591.9070401@oracle.com> The fix has been integrated into the jdk9/hs-rt repo. I've also filed a qe-test/hotspot Enhancement to integrate the test attached to the bug report into the VM SQE test base: https://bugs.openjdk.java.net/browse/INTJDK-7609362 Thanks, Serguei On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote: > Hi Axel, > > We decided that I will sponsor your change. > My current plan is to test and push it over weekend. > > Thanks, > Serguei > > > On 2/20/14 1:05 AM, Siebenborn, Axel wrote: >> Yes, I need a sponsor for this change. >> Thanks for the reviews, >> Axel >> >> >>> -----Original Message----- >>> From: Volker Simonis [mailto:volker.simonis at gmail.com] >>> Sent: Donnerstag, 20. 
Februar 2014 09:21 >>> To: serguei.spitsyn at oracle.com >>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net; >>> serviceability-dev at openjdk.java.net; Siebenborn, Axel >>> Subject: Re: RFR(XS): JDK-8034867 - Object references from >>> expression stack >>> are not visible as roots via jvmti FollowReferences and HeapDump >>> >>> Axel is covered by the SAP OCA. >>> >>> Thanks, >>> Volker >>> >>> >>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com >>> wrote: >>>> It looks good to me too. >>>> But does Axel need to sign an OCA ? >>>> >>>> This is a link to follow: >>>> http://openjdk.java.net/contribute/ >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> >>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote: >>>> >>>> >>>> This change looks good. Do you need a sponsor? >>>> thanks, >>>> Coleen >>>> >>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote: >>>> >>>> Hi all, >>>> >>>> could I have a review for the following change? >>>> >>>> >>>> >>>> It adds code to iterate over the slots of the expression stack. >>>> >>>> Found object references are used as heap roots, the same way as object >>>> references in locals slots. >>>> >>>> >>>> >>>> Bug: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8034867 >>>> >>>> >>>> >>>> Webrev: >>>> >>>> http://sapjvm.com/as/webrevs/8034867 >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Axel >>>> >>>> >>>> > From staffan.larsen at oracle.com Mon Feb 24 00:07:00 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Mon, 24 Feb 2014 09:07:00 +0100 Subject: RFR(XS): JDK-8034203: Change JavaDoc for com.sun.jdi.request.EventRequest.setEnabled(boolean val) In-Reply-To: <53032B48.2060003@oracle.com> References: <53032B48.2060003@oracle.com> Message-ID: <72947A99-0450-45A2-9190-9DCBAF0C3CDE@oracle.com> Looks good! Thanks, /Staffan On 18 feb 2014, at 10:43, Fredrik Arvidsson wrote: > Hi > > Please help me review this small change: > > This is a minor change to the JavaDoc in the JDI API. 
The problem was discovered when trying to fix a failing test. > > Webrev: 8034203 > Bug: JDK-8034203 > CCC: 8034203 > > Thanks > /Fredrik -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140224/b2edc044/attachment.html From volker.simonis at gmail.com Mon Feb 24 00:39:36 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 24 Feb 2014 09:39:36 +0100 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: <530AF591.9070401@oracle.com> References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap> <53065FDA.8050608@oracle.com> <530AF591.9070401@oracle.com> Message-ID: Hi Serguei, thanks a lot for sponsoring this change. It seems that https://bugs.openjdk.java.net/browse/INTJDK-7609362 is not visible. Could you please make it readable for everybody. Thank you and best regards, Volker On Mon, Feb 24, 2014 at 8:32 AM, serguei.spitsyn at oracle.com wrote: > The fix has been integrated into the jdk9/hs-rt repo. > > I've also filed a qe-test/hotspot Enhancement to integrate the test > attached to the bug report into the VM SQE test base: > https://bugs.openjdk.java.net/browse/INTJDK-7609362 > > Thanks, > Serguei > > > > On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote: >> >> Hi Axel, >> >> We decided that I will sponsor your change. >> My current plan is to test and push it over weekend. >> >> Thanks, >> Serguei >> >> >> On 2/20/14 1:05 AM, Siebenborn, Axel wrote: >>> >>> Yes, I need a sponsor for this change. >>> Thanks for the reviews, >>> Axel >>> >>> >>>> -----Original Message----- >>>> From: Volker Simonis [mailto:volker.simonis at gmail.com] >>>> Sent: Donnerstag, 20. 
Februar 2014 09:21 >>>> To: serguei.spitsyn at oracle.com >>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net; >>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel >>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression >>>> stack >>>> are not visible as roots via jvmti FollowReferences and HeapDump >>>> >>>> Axel is covered by the SAP OCA. >>>> >>>> Thanks, >>>> Volker >>>> >>>> >>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com >>>> wrote: >>>>> >>>>> It looks good to me too. >>>>> But does Axel need to sign an OCA ? >>>>> >>>>> This is a link to follow: >>>>> http://openjdk.java.net/contribute/ >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> >>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote: >>>>> >>>>> >>>>> This change looks good. Do you need a sponsor? >>>>> thanks, >>>>> Coleen >>>>> >>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote: >>>>> >>>>> Hi all, >>>>> >>>>> could I have a review for the following change? >>>>> >>>>> >>>>> >>>>> It adds code to iterate over the slots of the expression stack. >>>>> >>>>> Found object references are used as heap roots, the same way as object >>>>> references in locals slots. 
>>>>> >>>>> >>>>> >>>>> Bug: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8034867 >>>>> >>>>> >>>>> >>>>> Webrev: >>>>> >>>>> http://sapjvm.com/as/webrevs/8034867 >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Axel >>>>> >>>>> >>>>> >> > From jaroslav.bachorik at oracle.com Mon Feb 24 01:04:30 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 24 Feb 2014 10:04:30 +0100 Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008 tests fail with InvocationTargetException In-Reply-To: <5306FF49.4010607@oracle.com> References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com> <52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com> <5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com> <5305DBB0.3080809@oracle.com> <53060611.3040209@oracle.com> <5306FF49.4010607@oracle.com> Message-ID: <530B0B1E.3060508@oracle.com> On 21.2.2014 08:24, David Holmes wrote: > On 20/02/2014 11:41 PM, Jaroslav Bachorik wrote: >> On 20.2.2014 11:40, David Holmes wrote: >>> Hi Jaroslav, >>> >>> instanceKlass.cpp: >>> >>> Comment is wrong: >>> >>> 913 // JVMTI internal flag reset is needed in order to report >>> InvocationTargetException >>> >>> It will be ExceptionInInitializerError >> >> Will fix. Copypaste ... 
>> >>> >>> You added this: >>> >>> 917 >>> this_oop->set_initialization_state_and_notify(initialization_error, >>> THREAD); >>> 918 CLEAR_PENDING_EXCEPTION; // ignore any exception thrown, >>> class initialization error is thrown below >>> + 919 // JVMTI has already reported the pending exception >>> + 920 // JVMTI internal flag reset is needed in order to report >>> InvocationTargetException >>> + 921 JvmtiExport::clear_detected_exception((JavaThread*)THREAD); >>> >>> but there are a number of places where >>> set_initialization_state_and_notify is called when a pending exception >>> has been cleared, and then CLEAR_PENDING_EXCEPTION is called again, but >>> you didn't modify those other locations. They will rethrow the original >>> exception so I suppose that is okay from JVMTI's perspective. But the >>> flip-side of this is that if set_initialization_state_and_notify does >>> throw an exception, JVMTI will never see it. >> >> I don't know if it supposed to see it. It seems that any exception >> thrown from set_initialization_state_and_notify is thoroughly ignored >> and hidden from the outer world. Perhaps someone more experienced in >> JVMTI than me would like to chime in here? Serguei? > > set_initialization_state_and_notify is often called when an exception > has already occurred during the class loading/initialization process. It > is that original exception that we want to propagate but meanwhile we > have to perform this action to update the state and wakeup any waiters. > So we cache the original exception, clear it, do the state update and > then clear any pending exception (I think the only exception possible > here is OOME!), then rethrow the original. If this action did indeed > throw OOME then we might not be able to wake up the waiter(s) and that > might lead to a hang. While a debug VM could use TraceExceptions to > (hopefully) spot the OOME, in a product VM it would be invisible, even > if a JVMTI agent was tracking exceptions. 
So I think it should be > visible to JVMTI. I would like to hear other opinions though. > > However this is going beyond the scope of fixing these particular tests > so I'm fine if this is simply recorded in another bug for future clean up. Created https://bugs.openjdk.java.net/browse/JDK-8035646 Thanks for the review, everyone. -JB- > > Thanks, > David > >> -JB- >> >>> >>> --- >>> >>> jvm.cpp >>> >>> Comment is wrong again - not InvocationTargetException. >>> >>> --- >>> >>> David >>> ------ >>> >>> >>> >>> >>> On 20/02/2014 1:59 AM, Jaroslav Bachorik wrote: >>>> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote: >>>>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote: >>>>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote: >>>>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote: >>>>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Jaroslav, >>>>>>>>> >>>>>>>>> It looks good in general modulo indent comments from Dan. >>>>>>>>> >>>>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is >>>>>>>>> needed >>>>>>>>> or right thing to do in the >>>>>>>>> JvmtiExport::clear_detected_exception(). >>>>>>>>> It seems, both clear_exception_detected() and >>>>>>>>> set_exception_detected() are always >>>>>>>>> called on current thread and so, it has to be safe to do without >>>>>>>>> acquiring any locks. >>>>>>>> >>>>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff >>>>>>>> can also be queried/modified by other threads so grabbing the >>>>>>>> associated lock is a good idea. >>>>>>> >>>>>>> The lock synchronization is cooperative. >>>>>>> It does not help much if the lock is not acquired in other places. >>>>>>> I can be wrong, but I've not found yet any place in the code where >>>>>>> the >>>>>>> clear_exception_detected() and set_exception_detected() are called >>>>>>> under protection of the JvmtiThreadState_lock. 
>>>>>> >>>>>> I copied the locking over from >>>>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is >>>>>> also >>>>>> supposed to work only with the current thread but acquires the lock >>>>>> nonetheless. But if you are sure that the lock is not required I have >>>>>> no objections removing it. >>>>> >>>>> I'm suggesting to remove it, as it is not used in other places in the >>>>> code. >>>>> It is going to be confusing if it is used in one place and missed in >>>>> others. >>>> >>>> I've removed the lock and applied the same cleanup logic to other >>>> places >>>> where exceptions are rewrapped. >>>> >>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02 >>>> JPRT run: >>>> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ >>>> >>>> >>>> >>>> >>>> Aurora Adhoc: >>>> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full >>>> >>>> >>>> >>>> (still running at the moment; no failures so far) >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> -JB- >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> And I'm repeating my question about pre-integration testing >>>>>>>>> (Dan is >>>>>>>>> asking about the same). >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote: >>>>>>>>>> This is a round-0 review request. >>>>>>>>>> >>>>>>>>>> The reflection code intercepting the exceptions thrown in the >>>>>>>>>> invoked methods does not play nicely with JVMTI (which, in this >>>>>>>>>> case, propagates to JDI). >>>>>>>>>> >>>>>>>>>> The reflection code lacks the traditional error handler - >>>>>>>>>> therefore, >>>>>>>>>> upon throwing the NumberFormatException, the stack is searched >>>>>>>>>> for >>>>>>>>>> appropriate handlers and none are found. 
This leaves the >>>>>>>>>> "exception_detected" flag set to true while normally it would be >>>>>>>>>> reset to false once the exception is handled. The reflection code >>>>>>>>>> then goes on and wraps the NumberFormatException into >>>>>>>>>> InvocationTargetException and throws it. But, alas, the >>>>>>>>>> "exception_detected" flag is still set to true and no JVMTI >>>>>>>>>> exception event will be sent out. >>>>>>>>>> >>>>>>>>>> The proposed solution is to call >>>>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the >>>>>>>>>> appropriate places in the reflection code to reset the >>>>>>>>>> "exception_detected" flag and enable the >>>>>>>>>> InvocationTargetException >>>>>>>>>> to be properly reported over JVMTI. >>>>>>>>>> >>>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697 >>>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00 >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>>>>> >>>>>>>>>> -JB- >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> From staffan.larsen at oracle.com Mon Feb 24 01:16:52 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Mon, 24 Feb 2014 10:16:52 +0100 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap> <53065FDA.8050608@oracle.com> <530AF591.9070401@oracle.com> Message-ID: <3D7EB0F4-128E-4933-88DA-7B5849BE2672@oracle.com> On 24 feb 2014, at 09:39, Volker Simonis wrote: > Hi Serguei, > > thanks a lot for sponsoring this change. > > It seems that https://bugs.openjdk.java.net/browse/INTJDK-7609362 is > not visible. > Could you please make it readable for everybody. Since it is filed on an internal test suite it's not possible to make the bug visible to everyone.
/Staffan > > Thank you and best regards, > Volker > > > On Mon, Feb 24, 2014 at 8:32 AM, serguei.spitsyn at oracle.com > wrote: >> The fix has been integrated into the jdk9/hs-rt repo. >> >> I've also filed a qe-test/hotspot Enhancement to integrate the test >> attached to the bug report into the VM SQE test base: >> https://bugs.openjdk.java.net/browse/INTJDK-7609362 >> >> Thanks, >> Serguei >> >> >> >> On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote: >>> >>> Hi Axel, >>> >>> We decided that I will sponsor your change. >>> My current plan is to test and push it over weekend. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 2/20/14 1:05 AM, Siebenborn, Axel wrote: >>>> >>>> Yes, I need a sponsor for this change. >>>> Thanks for the reviews, >>>> Axel >>>> >>>> >>>>> -----Original Message----- >>>>> From: Volker Simonis [mailto:volker.simonis at gmail.com] >>>>> Sent: Donnerstag, 20. Februar 2014 09:21 >>>>> To: serguei.spitsyn at oracle.com >>>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net; >>>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel >>>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression >>>>> stack >>>>> are not visible as roots via jvmti FollowReferences and HeapDump >>>>> >>>>> Axel is covered by the SAP OCA. >>>>> >>>>> Thanks, >>>>> Volker >>>>> >>>>> >>>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com >>>>> wrote: >>>>>> >>>>>> It looks good to me too. >>>>>> But does Axel need to sign an OCA ? >>>>>> >>>>>> This is a link to follow: >>>>>> http://openjdk.java.net/contribute/ >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> >>>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote: >>>>>> >>>>>> >>>>>> This change looks good. Do you need a sponsor? >>>>>> thanks, >>>>>> Coleen >>>>>> >>>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> could I have a review for the following change? 
>>>>>> >>>>>> >>>>>> >>>>>> It adds code to iterate over the slots of the expression stack. >>>>>> >>>>>> Found object references are used as heap roots, the same way as object >>>>>> references in locals slots. >>>>>> >>>>>> >>>>>> >>>>>> Bug: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8034867 >>>>>> >>>>>> >>>>>> >>>>>> Webrev: >>>>>> >>>>>> http://sapjvm.com/as/webrevs/8034867 >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Axel >>>>>> >>>>>> >>>>>> >>> >> From markus.gronlund at oracle.com Mon Feb 24 01:42:42 2014 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Mon, 24 Feb 2014 01:42:42 -0800 (PST) Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to In-Reply-To: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> Message-ID: Hi Staffan, I would think this is the correct fix. The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp. In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass - ConstantPool::resolve_constant_at_impl() for example. Thanks Markus -----Original Message----- From: Staffan Larsen Sent: den 21 februari 2014 15:11 To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path. I have also added a test that causes a JVM crash without the fix. 
The updates to the test library are all code copied from the jdk version of the test library. webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8035150 Thanks, /Staffan From volker.simonis at gmail.com Mon Feb 24 01:43:21 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 24 Feb 2014 10:43:21 +0100 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: <3D7EB0F4-128E-4933-88DA-7B5849BE2672@oracle.com> References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap> <53065FDA.8050608@oracle.com> <530AF591.9070401@oracle.com> <3D7EB0F4-128E-4933-88DA-7B5849BE2672@oracle.com> Message-ID: On Mon, Feb 24, 2014 at 10:16 AM, Staffan Larsen wrote: > > On 24 feb 2014, at 09:39, Volker Simonis wrote: > >> Hi Serguei, >> >> thanks a lot for sponsoring this change. >> >> It seems that https://bugs.openjdk.java.net/browse/INTJDK-7609362 is >> not visible. >> Could you please make it readable for everybody. > > Since it is filed on an internal test suite it's not possible to make the bug visible to everyone. > OK, thanks. I mis-read Serguei's mail. I thought he wanted to create a jdk regression test (although that's a little complicated because of the need to create a native shared library). Regards, Volker > /Staffan > >> >> Thank you and best regards, >> Volker >> >> >> On Mon, Feb 24, 2014 at 8:32 AM, serguei.spitsyn at oracle.com >> wrote: >>> The fix has been integrated into the jdk9/hs-rt repo.
>>> >>> I've also filed a qe-test/hotspot Enhancement to integrate the test >>> attached to the bug report into the VM SQE test base: >>> https://bugs.openjdk.java.net/browse/INTJDK-7609362 >>> >>> Thanks, >>> Serguei >>> >>> >>> >>> On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote: >>>> >>>> Hi Axel, >>>> >>>> We decided that I will sponsor your change. >>>> My current plan is to test and push it over weekend. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 2/20/14 1:05 AM, Siebenborn, Axel wrote: >>>>> >>>>> Yes, I need a sponsor for this change. >>>>> Thanks for the reviews, >>>>> Axel >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Volker Simonis [mailto:volker.simonis at gmail.com] >>>>>> Sent: Donnerstag, 20. Februar 2014 09:21 >>>>>> To: serguei.spitsyn at oracle.com >>>>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net; >>>>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel >>>>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression >>>>>> stack >>>>>> are not visible as roots via jvmti FollowReferences and HeapDump >>>>>> >>>>>> Axel is covered by the SAP OCA. >>>>>> >>>>>> Thanks, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com >>>>>> wrote: >>>>>>> >>>>>>> It looks good to me too. >>>>>>> But does Axel need to sign an OCA ? >>>>>>> >>>>>>> This is a link to follow: >>>>>>> http://openjdk.java.net/contribute/ >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote: >>>>>>> >>>>>>> >>>>>>> This change looks good. Do you need a sponsor? >>>>>>> thanks, >>>>>>> Coleen >>>>>>> >>>>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote: >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> could I have a review for the following change? >>>>>>> >>>>>>> >>>>>>> >>>>>>> It adds code to iterate over the slots of the expression stack. 
>>>>>>> >>>>>>> Found object references are used as heap roots, the same way as object >>>>>>> references in locals slots. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Bug: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8034867 >>>>>>> >>>>>>> >>>>>>> >>>>>>> Webrev: >>>>>>> >>>>>>> http://sapjvm.com/as/webrevs/8034867 >>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Axel >>>>>>> >>>>>>> >>>>>>> >>>> >>> > From staffan.larsen at oracle.com Mon Feb 24 02:17:41 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Mon, 24 Feb 2014 11:17:41 +0100 Subject: RFR(XS): JDK-8034867 - Object references from expression stack are not visible as roots via jvmti FollowReferences and HeapDump In-Reply-To: References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap> <530505DA.2070305@oracle.com> <530592EC.20207@oracle.com> <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap> <53065FDA.8050608@oracle.com> <530AF591.9070401@oracle.com> <3D7EB0F4-128E-4933-88DA-7B5849BE2672@oracle.com> Message-ID: <244E1F17-0104-411E-AE16-CF7027E46A1E@oracle.com> On 24 feb 2014, at 10:43, Volker Simonis wrote: > On Mon, Feb 24, 2014 at 10:16 AM, Staffan Larsen > wrote: >> >> On 24 feb 2014, at 09:39, Volker Simonis wrote: >> >>> Hi Serguei, >>> >>> thanks a lot for sponsoring this change. >>> >>> It seems that https://bugs.openjdk.java.net/browse/INTJDK-7609362 is >>> not visible. >>> Could you please make it readable for everybody. >> >> Since it is filed on an internal test suite it's not possible to make the bug visible to everyone. >> > > OK, thanks. I mis-read Serguei's mail. I thought he wanted to create a > jdk regression test (although that's a little complicated because of > the need to create a native shared library). Yes, that is the root problem. Working on fixing it?
/Staffan > > Regards, > Volker > >> /Staffan >> >>> >>> Thank you and best regards, >>> Volker >>> >>> >>> On Mon, Feb 24, 2014 at 8:32 AM, serguei.spitsyn at oracle.com >>> wrote: >>>> The fix has been integrated into the jdk9/hs-rt repo. >>>> >>>> I've also filed a qe-test/hotspot Enhancement to integrate the test >>>> attached to the bug report into the VM SQE test base: >>>> https://bugs.openjdk.java.net/browse/INTJDK-7609362 >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> >>>> On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote: >>>>> >>>>> Hi Axel, >>>>> >>>>> We decided that I will sponsor your change. >>>>> My current plan is to test and push it over weekend. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 2/20/14 1:05 AM, Siebenborn, Axel wrote: >>>>>> >>>>>> Yes, I need a sponsor for this change. >>>>>> Thanks for the reviews, >>>>>> Axel >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Volker Simonis [mailto:volker.simonis at gmail.com] >>>>>>> Sent: Donnerstag, 20. Februar 2014 09:21 >>>>>>> To: serguei.spitsyn at oracle.com >>>>>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net; >>>>>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel >>>>>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression >>>>>>> stack >>>>>>> are not visible as roots via jvmti FollowReferences and HeapDump >>>>>>> >>>>>>> Axel is covered by the SAP OCA. >>>>>>> >>>>>>> Thanks, >>>>>>> Volker >>>>>>> >>>>>>> >>>>>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com >>>>>>> wrote: >>>>>>>> >>>>>>>> It looks good to me too. >>>>>>>> But does Axel need to sign an OCA ? >>>>>>>> >>>>>>>> This is a link to follow: >>>>>>>> http://openjdk.java.net/contribute/ >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote: >>>>>>>> >>>>>>>> >>>>>>>> This change looks good. Do you need a sponsor? 
>>>>>>>> thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote: >>>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> could I have a review for the following change? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> It adds code to iterate over the slots of the expression stack. >>>>>>>> >>>>>>>> Found object references are used as heap roots, the same way as object >>>>>>>> references in locals slots. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Bug: >>>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8034867 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Webrev: >>>>>>>> >>>>>>>> http://sapjvm.com/as/webrevs/8034867 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Axel From mikael.vidstedt at oracle.com Mon Feb 24 11:22:33 2014 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Mon, 24 Feb 2014 11:22:33 -0800 Subject: 8034856/8034857: More gcc warnings In-Reply-To: <5309BD25.2030407@oracle.com> References: <52FCC60B.6040205@oracle.com> <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com> <52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com> <5300F36F.9010205@oracle.com> <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com> <530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com> <53031AE4.2050700@oracle.com> <5303B85D.10505@oracle.com> <5304E4B8.6040908@oracle.com> <5304F64E.4000206@oracle.com> <5309BD25.2030407@oracle.com> Message-ID: <530B9BF9.40102@oracle.com> On 2014-02-23 01:19, Alan Bateman wrote: > On 19/02/2014 18:22, Mikael Vidstedt wrote: >> : >> >> The documented grammar in the comment only mentions "SPACE" and the >> code below doesn't make any references to \t. As a matter of fact, it >> only checks for one single, mandatory SPACE after the colon (enforced >> at line 535-536) and doesn't care to remove any space characters at >> the end of the value.
The while loop only deals with continuations. >> If additional spaces do exist they will as far as I can tell be part >> of the value. Are they trimmed later? I'm assuming it would be nice >> to have both parsers (parse_manifest & JarFacade) behave the same way? >> >> Here's what it would look like to only check for space, but still eat >> any additional spaces which doesn't match what >> parse_manifest/parse_nv_pair does: >> >> http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/ >> > Sorry for the delay getting back to you on this (I've been busy with > other things). > > I checked the JAR File Specification, which in turn references RFC 822 > as the "inspiration" for the name-value pairs. The SPACE token is just > ASCII SP. So I agree it's just ASCII SP that needs to be handled here, > not LWSP-char which includes ASCII HT. > > Looking at JDK-6274276 then the trimming was done to avoid > hard-to-diagnose problems with leading/trailing spaces. It's possible > that this is inconsistent with other areas where JAR file attributes > are used. I would suggest leaving it as is for now as this is > potentially changing behavior in several areas. That sounds reasonable. I'll keep the webrev.01 approach - only check for and trim ASCII SP. Thanks for the review! Cheers, Mikael From kevin.walls at oracle.com Tue Feb 25 01:54:31 2014 From: kevin.walls at oracle.com (Kevin Walls) Date: Tue, 25 Feb 2014 09:54:31 +0000 Subject: RR(S): 8028623 and 8032466: serviceability agent hashcode changes.
In-Reply-To: <52E0DDE2.6050601@oracle.com> References: <52E0033B.80109@oracle.com> <05BFE552-8255-4C60-AE82-981E0FD9E5AD@oracle.com> <52E0DDE2.6050601@oracle.com> Message-ID: <530C6857.6000601@oracle.com> Hi - I'm going to backport this (8028623 hashcode change, and testcase tweak 8032466) to jdk8u (and after that, to jdk7u). The same changesets work and test fine. 8u webrev, planning to push to http://hg.openjdk.java.net/jdk8u/hs-dev/hotspot: http://cr.openjdk.java.net/~kevinw/8028623/8028623.8u.001/ This email is a further review/request, or if that is redundant it's just a warning in case anybody thinks that is a bad idea for some reason. Feel free to speak, but I would rather like to presume the same reviewers are OK with it... Thanks Kevin On 23/01/14 09:16, Kevin Walls wrote: > Thanks Dmitry, Staffan! > > On 23/01/14 07:05, Staffan Larsen wrote: >> Looks good! >> >> Thanks, >> /Staffan >> >> On 22 jan 2014, at 18:43, Kevin Walls wrote: >> >>> Hi, >>> >>> Can I just get a review of this jtreg tag change in a testcase. The >>> testcase specifically contains a utf8 character and to make sure it >>> compiles everywhere we need to specify -encoding on the @compile >>> line, and then add an @run line: >>> >>> webrev >>> http://cr.openjdk.java.net/~kevinw/8032466/webrev.00/ >>> >>> bug >>> https://bugs.openjdk.java.net/browse/JDK-8032466 >>> >>> (if the testcase runs fine without this change, in jtreg locally you >>> can make it fail by setting -encoding ascii) >>> >>> Thanks! 
>>> Kevin > From jaroslav.bachorik at oracle.com Tue Feb 25 05:45:13 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 25 Feb 2014 14:45:13 +0100 Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object In-Reply-To: References: <530388CC.9080807@oracle.com> <53045D76.5080907@oracle.com> <53060E57.6010300@oracle.com> Message-ID: <530C9E69.9070501@oracle.com> On 20.2.2014 18:04, Martin Buchholz wrote: > I think David is too pessimistic about Thread.yield being ineffective on > Java SE implementations (OTOH David is a Java Embedded expert). In > practice an implementation that never thread switched out of a yield() loop > would not pass the tck. As for theory: it's true that Thread.yield has no > progress guarantees, but then neither does Thread.sleep. A perverse > implementation can always starve any thread it feels like. > > Anyways, there is nothing wrong with your sleep loop. Except maybe you > want to time out eventually. The test harness should take care of timeout - at least that's what I can see as the preferred approach in the recent test fixes. Since David doesn't seem to have a strong opinion about Thread.sleep() vs. Thread.yield() here and I am rather impartial to that as well I can change the fix to use Thread.yield() instead if you are sure it won't cause any troubles. Thanks, -JB- > > > On Thu, Feb 20, 2014 at 6:16 AM, Jaroslav Bachorik < > jaroslav.bachorik at oracle.com> wrote: > >> This was discussed when reviewing ThreadMXBean/ >> SynchronizationStatistics.java >> >> Regarding a busy wait checking the thread states and issuing >> Thread.yield() now and then David Holmes wrote: >> "Not elegant and not completely reliable either. Probably adequate on a >> multi-core system but single-core and with some schedulers it could just >> be a busy spin." [1] >> >> As far as I understand, the only benefit of using Thread.yield() instead >> of Thread.sleep() would be 10-100ms shorter execution time, right? 
>> >> -JB- >> >> [1] http://mail.openjdk.java.net/pipermail/jmx-dev/2013- >> October/000484.html >> >> >> On 19.2.2014 17:56, Martin Buchholz wrote: >> >>> The jsr166 tck tests make systematic use of Thread.yield, e.g. >>> >>> >>> /** >>> * Spin-waits up to the specified number of milliseconds for the >>> given >>> * thread to enter a wait state: BLOCKED, WAITING, or TIMED_WAITING. >>> */ >>> void waitForThreadToEnterWaitState(Thread thread, long >>> timeoutMillis) { >>> long startTime = System.nanoTime(); >>> for (;;) { >>> Thread.State s = thread.getState(); >>> if (s == Thread.State.BLOCKED || >>> s == Thread.State.WAITING || >>> s == Thread.State.TIMED_WAITING) >>> return; >>> else if (s == Thread.State.TERMINATED) >>> fail("Unexpected thread termination"); >>> else if (millisElapsedSince(startTime) > timeoutMillis) { >>> threadAssertTrue(thread.isAlive()); >>> return; >>> } >>> Thread.yield(); >>> } >>> } >>> >>> >>> >>> On Tue, Feb 18, 2014 at 11:29 PM, Jaroslav Bachorik < >>> jaroslav.bachorik at oracle.com> wrote: >>> >>> On 18.2.2014 18:06, Martin Buchholz wrote: >>>> >>>> Not checking any details, but tests that want to wait for a particular >>>>> thread state are a good reason to use >>>>> >>>>> volatile boolean flag; >>>>> ... >>>>> while (!flag) Thread.yield(); >>>>> >>>>> I prefer calling Thread.yield to sleeping in this special case, in part >>>>> because I don't want to rely on the implementation of sleep, while yield >>>>> is >>>>> semantically a no-op. (Also sleeping 100ms is a long time for a >>>>> computer) >>>>> >>>>> >>>> There were discussions for a similar fix regarding Thread.yield(). The >>>> concern was that using Thread.yield() in a tight loop might very easily >>>> lead to starvation on single core machines. Therefore Thread.sleep(10) is >>>> used to be sure the flag setting thread has actually a chance to >>>> progress. 
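The sleep-based polling described in the reply above can be sketched roughly as follows. This is only an illustration and not code from the webrev: the class name `SleepPollSketch` and the helper `pollUntil` are hypothetical, and the interval passed in stands for the 10 ms mentioned in the thread. Sleeping between checks, rather than spinning on Thread.yield(), is meant to give the thread that sets the flag a chance to run.

```java
import java.util.function.BooleanSupplier;

final class SleepPollSketch {
    /**
     * Polls the condition, sleeping between checks so that the thread
     * expected to make the condition true gets a chance to run.
     * Returns the number of sleeps that were needed. There is no timeout
     * here; the assumption is that the test harness enforces one.
     */
    static int pollUntil(BooleanSupplier condition, long sleepMillis) {
        int sleeps = 0;
        while (!condition.getAsBoolean()) {
            sleeps++;
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while polling", e);
            }
        }
        return sleeps;
    }

    public static void main(String[] args) {
        long deadline = System.nanoTime() + 50_000_000L; // roughly 50 ms from now
        int sleeps = pollUntil(() -> System.nanoTime() >= deadline, 10);
        System.out.println("condition became true after " + sleeps + " sleeps");
    }
}
```

Whether sleep or yield is the better choice is exactly what the thread debates; the sketch only shows the sleep variant.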
>>>> >>>> -JB- >>>> >>>> >>>> >>>> >>>>> >>>>> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik < >>>>> jaroslav.bachorik at oracle.com> wrote: >>>>> >>>>> Please, review the following test change. >>>>> >>>>>> >>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168 >>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00 >>>>>> >>>>>> The test fails because of falsely evaluating the thread being parked as >>>>>> actually waiting on a monitor. This is because there is no difference >>>>>> in >>>>>> java thread state for those two situations. The test is using Phaser >>>>>> for >>>>>> synchronization between the checked and checking thread to make sure an >>>>>> appropriate code section is entered before performing asserts. Then it >>>>>> checks the checked thread state and waits till it becomes WAITING. >>>>>> Unfortunately, when Phaser needs to wait it parks the thread and sets >>>>>> the >>>>>> thread state to WAITING. From now on the test is in a completely random >>>>>> state and the result will largely depend on timing - thus failing >>>>>> intermittently. >>>>>> >>>>>> The solution is to use an additional volatile variable to prevent >>>>>> falsely >>>>>> indicating the park() induced WAITING state. 
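A minimal sketch of the volatile-flag idea described above, under stated assumptions: it is not the actual Locks.java change, and the class `MonitorWaitSketch` with its fields `inWaitSection` and `released` is hypothetical. The checked thread sets the flag just before it calls wait() on the monitor, and the checking thread ignores any WAITING state it sees before the flag is set, since that state could have been caused by park() inside a synchronizer such as Phaser.

```java
final class MonitorWaitSketch {
    private static final Object LOCK = new Object();
    // Set by the checked thread right before it calls LOCK.wait().
    private static volatile boolean inWaitSection;
    // Set by the checking thread to let the checked thread finish.
    private static volatile boolean released;

    /** Starts a waiter thread and returns the state seen once it waits on LOCK. */
    static Thread.State observeWaiter() {
        inWaitSection = false;
        released = false;
        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {
                inWaitSection = true;
                while (!released) {
                    try {
                        LOCK.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        waiter.start();
        try {
            Thread.State seen;
            for (;;) {
                // Read the flag first: a WAITING state observed before the
                // flag is set could come from park() rather than wait().
                if (inWaitSection) {
                    seen = waiter.getState();
                    if (seen == Thread.State.WAITING) {
                        break;
                    }
                }
                Thread.sleep(10);
            }
            // Wake the waiter up and let it terminate.
            synchronized (LOCK) {
                released = true;
                LOCK.notifyAll();
            }
            waiter.join();
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while checking", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(observeWaiter());
    }
}
```

In this sketch the flag is set while the monitor is held, so once the checker sees both the flag and the WAITING state, the waiter can only be inside LOCK.wait().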
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> -JB- >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> > From jaroslav.bachorik at oracle.com Tue Feb 25 05:52:07 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 25 Feb 2014 14:52:07 +0100 Subject: Review request for 7195249: Some jtreg tests use hard coded ports In-Reply-To: <53049009.9020204@oracle.com> References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> <52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> <52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com> <53049009.9020204@oracle.com> Message-ID: <530CA007.5080006@oracle.com> Thumbs up. (not a "reviewer", though) -JB- On 19.2.2014 12:05, taras ledkov wrote: > Hi, > > Imports are fixed: > > http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ > > On 05.02.2014 17:20, Jaroslav Bachorik wrote: >> Hi Taras, >> >> thanks for taking care of this. >> >> The changes look fine to me. >> >> One minor nit is unused imports of the library classes in >> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". >> >> It does not use any of those classes as its base class >> "AbstractFilePermissionTest" does all the heavy lifting. >> >> Cheers, >> >> -JB- >> >> On 5.2.2014 13:42, taras ledkov wrote: >>> Hi, >>> >>> So please take a look at the review against JDK9. >>> The reviewed patch had not been integrated into JDK8. >>> >>> Port to JDK9 is identical. The difference: the ProcessTools.java has >>> been already patched by Jaroslav. 
>>> >>> Webrev for jdk part: >>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ >>> >>> Webrev for hs part: >>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >>> >>> >>> On 21.01.2014 13:45, Jaroslav Bachorik wrote: >>>> Hi Taras, >>>> >>>> On 21.1.2014 10:30, taras ledkov wrote: >>>>> Hi Jaroslav, >>>>> >>>>> Could you please review the last changes? >>>>> Are you OK? >>>> >>>> Yes, the change looks ok. But I think we will need to get back to this >>>> problem eventually and implement a central port dispatcher if we >>>> want to >>>> be 100% sure the port conflicts wouldn't occur. But your changes reduce >>>> the chance significantly. >>>> >>>> Thanks for taking care of this. >>>> >>>> -JB- >>>> >>>>> >>>>> On 20.01.2014 19:21, Staffan Larsen wrote: >>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes. >>>>>> >>>>>> Thanks, >>>>>> /Staffan >>>>>> >>>>>> On 20 jan 2014, at 16:07, taras ledkov >>>>>> wrote: >>>>>> >>>>>>> Hi Staffan, >>>>>>> >>>>>>> I fixed the tests according with your comments. >>>>>>> Are you OK? >>>>>>> >>>>>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Please take a look at the new review. >>>>>>>> >>>>>>>> Webrev for jdk part: >>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>>>>> >>>>>>>> Webrev for hs part: >>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>>>>> >>>>>>>> My answers are inline: >>>>>>>> >>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>>>>> Hi Taras, >>>>>>>>> >>>>>>>>> Thanks for doing this clean up and conversion of tests into Java. >>>>>>>>> Here's a couple of comments: >>>>>>>>> >>>>>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>>>>> This test could be simplified by not specifying an address at all. >>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp, >>>>>>>>> there >>>>>>>>> is no reason to specify an address.
If address is unspecified (and >>>>>>>>> server=y), the connector will pick an address and print it to the >>>>>>>>> command line. Thus the only change that needs to be done is to >>>>>>>>> remove >>>>>>>>> ",address=8888" from the @run command. >>>>>>>> fixed >>>>>>>> >>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>>>>> These tests do not compile cleanly with an empty JTwork >>>>>>>>> directory. It >>>>>>>>> seems that having one @build for each class does not work well - >>>>>>>>> when >>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving >>>>>>>>> all >>>>>>>>> classes to one @build statement solved this problem for me. >>>>>>>> fixed >>>>>>>> >>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>>>>> 187 Future stdoutTask = stdout.process(); >>>>>>>>> 188 Future stderrTask = stderr.process(); >>>>>>>>> The stdoutTask and stderrTask variables are unused. >>>>>>>> fixed >>>>>>>> >>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>>>>> At first I thought something was wrong with this file - the >>>>>>>>> diff is >>>>>>>>> very weird. Then I realized you renamed an old file and created a >>>>>>>>> new >>>>>>>>> file using the old name. >>>>>>>> You are right. I did it to keep the test name. >>>>>>>> >>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks like >>>>>>>>> you >>>>>>>>> delete the files at the beginning of the test in any case. >>>>>>>> I think yes. In the first place, this functionality was in the old >>>>>>>> code. >>>>>>>> In the second place, a file without write permission may be a >>>>>>>> problem >>>>>>>> for a further cleanup (not by the test, for example for the tests >>>>>>>> launcher scripts etc.)
>>>>>>>> >>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest" >>>>>>>>> confusing. They are both Paths. One is used directly by the >>>>>>>>> sub-classes, the other has a getter method. >>>>>>>> fixed >>>>>>>> >>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an >>>>>>>>> ex.printStackTrace(). >>>>>>>>> (Same thing for all other places where you call >>>>>>>>> Integer.parseInt()) >>>>>>>> fixed >>>>>>>> >>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>>>>> This file is never used as far as I can see. >>>>>>>> It is used by PasswordFilePermissionTest & >>>>>>>> SSLConfigFilePermissionTest >>>>>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>>>>> AbstractFilePermissionTest : 162). >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> /Staffan >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Please take a look at the review with fixed issues about >>>>>>>>>> trying to >>>>>>>>>> launch test that needs free port several times. >>>>>>>>>> >>>>>>>>>> Webrev for jdk part: >>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/ >>>>>>>>>> >>>>>>>>>> Webrev for hs part: >>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/ >>>>>>>>>> >>>>>>>>>> Pay your attention to new method >>>>>>>>>> ProcessTools.startProcess(String, >>>>>>>>>> ProcessBuilder, Consumer) that is used to analyze all >>>>>>>>>> output >>>>>>>>>> of a sub-process. It has common part with >>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>> Predicate, >>>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment.
>>>>>>>>>> >>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>> Predicate, long, TimeUnit) may be changed by adding >>>>>>>>>> LinePump >>>>>>>>>> to stderr if there is not serious reason for restricting the >>>>>>>>>> warm-up >>>>>>>>>> analysis to stdout stream. >>>>>>>>>> >>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote: >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I've consulted with Serviceability engineers (add them to CC >>>>>>>>>>> list) and >>>>>>>>>>> they would like to see tests to solve these problem so far: >>>>>>>>>>> >>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Katja >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote: >>>>>>>>>>>> Guys. >>>>>>>>>>>> >>>>>>>>>>>> Let me try to sum up what was said before and may be suggest a >>>>>>>>>>>> compromise. >>>>>>>>>>>> >>>>>>>>>>>> 1. There is a desire to have a support port allocation on the >>>>>>>>>>>> level of >>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that >>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it >>>>>>>>>>>> is a >>>>>>>>>>>> test harness API or a library API does not really matter from >>>>>>>>>>>> usage >>>>>>>>>>>> point of view. >>>>>>>>>>>> >>>>>>>>>>>> 2. There is no way to make the tests absolutely stable, >>>>>>>>>>>> whatever >>>>>>>>>>>> port >>>>>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>>>>> perform >>>>>>>>>>>> the test logic with different ports until the test succeeds. >>>>>>>>>>>> >>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of >>>>>>>>>>>> course, >>>>>>>>>>>> but >>>>>>>>>>>> better be used in conjunction with a meaningful port selection >>>>>>>>>>>> algorithm. >>>>>>>>>>>> >>>>>>>>>>>> At the same time, copying a loop-until-success logic from one >>>>>>>>>>>> test to >>>>>>>>>>>> another may be not the best solution.
Library could help with >>>>>>>>>>>> that I >>>>>>>>>>>> believe. There only need to be an API method which takes >>>>>>>>>>>> behavior as a >>>>>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>>>>> public runOnAFreePort(Function) >>>>>>>>>>>> or similar. There could be arguments of how/whether to >>>>>>>>>>>> implement >>>>>>>>>>>> it, >>>>>>>>>>>> the solution would not work for shell tests, etc, but still ... >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>>>>> >>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching >>>>>>>>>>>> agreement in >>>>>>>>>>>> the library, etc. >>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate the >>>>>>>>>>>> changes. >>>>>>>>>>>> >>>>>>>>>>>> Please let us know which one is closer to your heart. >>>>>>>>>>>> >>>>>>>>>>>> I personally prefer #1 for the reason that the changes already >>>>>>>>>>>> supposed to make the tests more stable and also there are many >>>>>>>>>>>> more >>>>>>>>>>>> tests tests which use ports, so the scope of the problem is >>>>>>>>>>>> bigger >>>>>>>>>>>> than these. >>>>>>>>>>>> >>>>>>>>>>>> Shura >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Taras, >>>>>>>>>>>>> >>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort() >>>>>>>>>>>>> does not >>>>>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>>>>> process. >>>>>>>>>>>>> Unfortunately I don't think the library can do more. However, >>>>>>>>>>>>> there is a >>>>>>>>>>>>> solution. >>>>>>>>>>>>> >>>>>>>>>>>>> Please, look at the *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to >>>>>>>>>>>>> start a >>>>>>>>>>>>> process with a free port and then check if >>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has >>>>>>>>>>>>> been >>>>>>>>>>>>> thrown. 
>>>>>>>>>>>>> If yes, you have to retry. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Katja >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is >>>>>>>>>>>>>> the >>>>>>>>>>>>>> library responsibility to implements it, would not you agree? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we >>>>>>>>>>>>>> could do >>>>>>>>>>>>>> it later in the library. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>> Roger, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the >>>>>>>>>>>>>>>> port is >>>>>>>>>>>>>>>> free. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>>>>> accessible >>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>>>>> discussions >>>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and SO_LINGER >>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>> BSD. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So from stability point of view it's better to just return >>>>>>>>>>>>>>>> random >>>>>>>>>>>>>>>> number >>>>>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When >>>>>>>>>>>>>>> there are >>>>>>>>>>>>>>> more >>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port, >>>>>>>>>>>>>>> nothing >>>>>>>>>>>>>>> prevents the random function to return the same port to >>>>>>>>>>>>>>> all of >>>>>>>>>>>>>>> them. 
>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and >>>>>>>>>>>>>>> cause >>>>>>>>>>>>>>> problems with timing when a port used by a previous test is >>>>>>>>>>>>>>> not >>>>>>>>>>>>>>> fully >>>>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry >>>>>>>>>>>>>>> pointed out >>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it >>>>>>>>>>>>>>> later >>>>>>>>>>>>>>> there >>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated >>>>>>>>>>>>>>> will >>>>>>>>>>>>>>> remain >>>>>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The only fail proof solution would be a port allocating >>>>>>>>>>>>>>> service >>>>>>>>>>>>>>> provided >>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> chance >>>>>>>>>>>>>>> of intermittent failures due to a port being in use. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -JB- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>>>>> IOException { >>>>>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>>>>> 143 >>>>>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>>>>> 146 >>>>>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>>>>> 148 try { >>>>>>>>>>>>>>>> 149 serverSocket = new ServerSocket(0); >>>>>>>>>>>>>>>> 150 port = serverSocket.getLocalPort(); >>>>>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>>>>> 153 } >>>>>>>>>>>>>>>> 154 } >>>>>>>>>>>>>>>> 155 >>>>>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>>>>> 157 } >>>>>>>>>>>>>>>> 158 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> fyi, The 
jdk.testlibrary.Utils.getFreePort() method will >>>>>>>>>>>>>>>>> Open an >>>>>>>>>>>>>>>>> free >>>>>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>>>>> the port number. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have the >>>>>>>>>>>>>>>>> system >>>>>>>>>>>>>>>>> assign >>>>>>>>>>>>>>>>> the port #. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Roger >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>>>>> 2. Open socket >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass >>>>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it >>>>>>>>>>>>>>>>>> has no >>>>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>>>>> as soon as you close socket someone can take the port. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So just choose a random number within the range above and >>>>>>>>>>>>>>>>>> let >>>>>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>>>>> code opening socket to handle port conflict. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. 
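The two port-selection approaches discussed above (letting the system assign a port by binding to 0, and picking a random port between 49152 and 65535 with a retry on bind failure) could look roughly like the sketch below. It is an illustration under assumptions, not the jdk.testlibrary code: the class and method names are hypothetical, and both methods hand back the open ServerSocket rather than a port number, since a port is only reserved while the socket stays open.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch only; names are hypothetical, not part of any test library.
class FreePortSketch {
    static final int DYNAMIC_MIN = 49152;
    static final int DYNAMIC_MAX = 65535;

    // Let the operating system choose the port by binding to port 0.
    static ServerSocket openOnSystemAssignedPort() throws IOException {
        return new ServerSocket(0);
    }

    // Pick a random port in the dynamic range and try to bind it;
    // on failure pick another one, up to maxAttempts times.
    static ServerSocket openOnRandomDynamicPort(int maxAttempts) throws IOException {
        IOException lastFailure = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = ThreadLocalRandom.current().nextInt(DYNAMIC_MIN, DYNAMIC_MAX + 1);
            try {
                return new ServerSocket(port); // caller owns the open socket
            } catch (IOException e) {
                lastFailure = e;               // typically a bind conflict; retry
            }
        }
        throw new IOException("no port bound after " + maxAttempts + " attempts", lastFailure);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket a = openOnSystemAssignedPort();
             ServerSocket b = openOnRandomDynamicPort(20)) {
            System.out.println("system-assigned port: " + a.getLocalPort());
            System.out.println("random dynamic port:  " + b.getLocalPort());
        }
    }
}
```

When the port number has to be passed to another process as a property value, the socket cannot be kept open, and the retry then has to wrap the whole process launch instead.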
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> There are two webrevs: >>>>>>>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Please take a look at some notes: >>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & Jaroslav >>>>>>>>>>>>>>>>>>> Bachorik >>>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & >>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>>>>> looked >>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for >>>>>>>>>>>>>>>>>>> them: >>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've >>>>>>>>>>>>>>>>>>> renamed to >>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace old >>>>>>>>>>>>>>>>>>> shell >>>>>>>>>>>>>>>>>>> script >>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called RmiRegistrySslTest.java, >>>>>>>>>>>>>>>>>>> hence the >>>>>>>>>>>>>>>>>>> huge >>>>>>>>>>>>>>>>>>> diff. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines similar >>>>>>>>>>>>>>>>>>> to the >>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided >>>>>>>>>>>>>>>>>>> to not >>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. 
Please >>>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>>> know if >>>>>>>>>>>>>>>>>>> this is somehow not acceptable >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is >>>>>>>>>>>>>>>>>>> added to >>>>>>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>>>>>> repository is taken from this patch: >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test >>>>>>>>>>>>>>>>>>> library >>>>>>>>>>>>>>>>>>> process >>>>>>>>>>>>>>>>>>> tools will support command line options inheritance >>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html) >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> With best regards, >>>>>>>>>> Taras Ledkov >>>>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>>>> skype: taras_ledkov >>>>>>>>>> Phone: 7(812)3346-157 >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> With best regards, >>>>>>> Taras Ledkov >>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>> skype: taras_ledkov >>>>>>> Phone: 7(812)3346-157 >>>>>> >>>>> >>>> >>> >> > From daniel.daugherty at oracle.com Tue Feb 25 08:03:36 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Tue, 25 Feb 2014 09:03:36 -0700 Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <53081C1F.2070008@oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> Message-ID: <530CBED8.4050509@oracle.com> Ping! Still haven't heard from anyone on this backport... Dan On 2/21/14 8:40 PM, Daniel D. Daugherty wrote: > Greetings, > > This is a code review request for the JDK8u-hs-dev backport of the > following ObjectMonitor-JVM/TI hang fix: > > 8028073 race condition in ObjectMonitor implementation causing > deadlocks > https://bugs.openjdk.java.net/browse/JDK-8028073 > > Here is the JDK8u-hs-dev webrev URL: > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ > > This is _almost_ a straight forward backport of the JDK9 fix. The only > difference to the fix was discussed at the end of the JDK9 review and > was determined to only be needed in versions of HotSpot without the > fix for 8028280: > > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html > > > 8028280 has not yet been backported to JDK8u-hs-dev. > > The easiest way to review the backport is to download the two patch > files from the webrevs and compare them with something like: > > jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch > > The same testing has been performed on the JDK8u-hs-dev version as > with the JDK9-hs-runtime version. > > Thanks, in advance, for any comments, questions or suggestions. > > Dan > > > On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: > > Greetings, > > > > I have a fix ready for the following bug: > > > > 8028073 race condition in ObjectMonitor implementation causing > deadlocks > > https://bugs.openjdk.java.net/browse/JDK-8028073 > > > > On the surface, this is a very simple fix that relocates a few lines of > > code, relocates and rewrites the comments associated with that code and > > adds several new comments. 
> > > > Of course, in reality, the issue is much more complicated, but I'm > > hoping to make it easy for anyone not acquainted with this issue to > > understand what's going on. > > > > Here are the JDK9 webrev URLs: > > > > OpenJDK: > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ > > > > Oracle internal: > > > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ > > > > The simple summary: > > > > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, > > it is possible for a JVM/TI monitor event handler to accidentally > > consume a ParkEvent.unpark() call meant for Java Monitor layer > > - the original code fix was made on 2005.07.04 using this bug ID: > > https://bugs.openjdk.java.net/browse/JDK-5030359 > > - it's the right fix, but it's in the wrong place > > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED > > event handler is called because it is that event handler > > that can cause the hang > > > > > > Testing > > ------- > > > > - a new StessMonitorWait test has been created that reliably > > reproduces the hang in JDK[6789]; see the bug's gory details > > for the specific versions where the hang has been reproduced > > - the test reliably reproduces the hang in 5 seconds on my > > T7600 running Solaris 10u11 X86; 1 minute runs reproduce > > the hang reliably on other machines > > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, > > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT > > bits did not reproduce the hang > > - JPRT test job > > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, > > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, > > and Windows-X64: > > - vm.quick > > - Kitchensink (bigapps) > > - Weblogic+medrec (bigapps) > > - runThese (bigapps) > > > > > > The Gory Details Start Here > > --------------------------- > > > > This is the old location of block of code that's being moved: > > > > 
src/share/vm/runtime/objectMonitor.cpp: > > > > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, > TRAPS) { > > > > 1499 exit (true, Self) ; // exit the monitor > > > > 1513 if (node._notified != 0 && _succ == Self) { > > 1514 node._event->unpark(); > > 1515 } > > > > > > This is the new location of block of code that's being moved: > > > > src/share/vm/runtime/objectMonitor.cpp: > > > > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, > TRAPS) { > > > > 1601 if (JvmtiExport::should_post_monitor_waited()) { > > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == > OS_TIMEOUT); > > > > 1604 if (node._notified != 0 && _succ == Self) { > > > > 1620 node._event->unpark(); > > 1621 } > > > > > > The Risks > > --------- > > > > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event > > is enabled: > > - previously it was always executed > > - while the old code was not effective for the hang that is being > > fixed with this bug, it is possible that the old code prevented > > a different bug in the successor protocol from manifesting > > - thorough analysis of the successor protocol did not reveal a > > case where the old code was needed in the old location > > - Thorough analysis indicates that the other JVM/TI monitor events > > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: > > - the successor protocol is complicated and the analysis could > > be wrong when certain options are used > > - comments were added to each location where a JVM/TI monitor > > event handler is called documenting why a fix like this one > > is not needed there > > - if the analysis is wrong, the new comments show where a new > > code change would be needed > > > > > > The Scenario > > ------------ > > > > I've created a scenario that reproduces this hang: > > > > T1 - enters monitor and calls monitor.wait() > > T2 - enters the monitor, calls monitor.notify() and exits the monitor > > T3 - enters and exits the monitor > > T4 - 
enters the monitor, delays for 5 seconds, exits the monitor > > > > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a > > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. > > > > Here are the six events necessary to make this hang happen: > > > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ > field, but > > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the > successor again. > > > > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > > > // KEY-EVENT-3a: T3 made T1 the successor > > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be > stuck. > > > > > > This bug is intertwined with: > > > > - The ObjectMonitor successor protocol > > - the sharing of a ParkEvent between Java Monitors and JVM/TI > RawMonitors > > > > There is a very long successor.notes attachment to JDK-8028073 that > > attempts to describe the ObjectMonitor successor protocol. It's good > > for putting pretty much anyone to sleep. > > > > Since this hang reproduces back to JDK6, this bug is taking the easily > > backported solution of moving the original fix to the right location. 
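For readers who want to picture the four thread roles in the scenario, here is a rough Java sketch. It is not the StressMonitorWait test, whose source is not shown in this thread, and all names in it are hypothetical. On its own it is not expected to hang: the hang also needs a JVM/TI agent whose JVMTI_EVENT_MONITOR_WAITED handler uses a raw monitor, and that agent is native code that is not reproduced here.

```java
// Illustrative sketch only; not the StressMonitorWait test. Names are hypothetical.
class MonitorScenarioSketch {
    private static final Object MONITOR = new Object();
    private static volatile boolean t1Waiting; // T1 is inside the monitor, about to wait
    private static boolean notified;           // guarded by MONITOR

    static boolean runOnce(long t4HoldMillis) throws InterruptedException {
        t1Waiting = false;
        synchronized (MONITOR) {
            notified = false;
        }

        // T1 - enters the monitor and calls monitor.wait()
        Thread t1 = new Thread(() -> {
            synchronized (MONITOR) {
                t1Waiting = true;
                while (!notified) {
                    try {
                        MONITOR.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "T1");

        // T2 - enters the monitor, calls monitor.notify() and exits the monitor
        Thread t2 = new Thread(() -> {
            while (!t1Waiting) {
                Thread.yield();
            }
            synchronized (MONITOR) {
                notified = true;
                MONITOR.notify();
            }
        }, "T2");

        // T3 - enters and exits the monitor
        Thread t3 = new Thread(() -> {
            synchronized (MONITOR) {
                // nothing to do; entering and exiting is the point
            }
        }, "T3");

        // T4 - enters the monitor, delays, exits the monitor
        Thread t4 = new Thread(() -> {
            synchronized (MONITOR) {
                try {
                    Thread.sleep(t4HoldMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "T4");

        Thread[] all = {t1, t2, t3, t4};
        for (Thread t : all) {
            t.start();
        }
        for (Thread t : all) {
            t.join();
        }
        return true; // reached only if no thread got stuck
    }

    public static void main(String[] args) throws InterruptedException {
        // The scenario text uses a 5 second delay for T4; a shorter one keeps the demo quick.
        System.out.println("all threads finished: " + runOnce(200));
    }
}
```

The interleaving in KEY-EVENT-1 through KEY-EVENT-6 is a matter of timing inside the VM, so a plain Java program like this can only set up the roles, not force the exact ordering.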
> > The following new bug has been filed for possible future work in this > > area by the Serviceability Team: > > > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > > > > The Symptoms > > ------------ > > > > With intermittent hangs like this, it is useful to know what to look > > for in order to determine if you are running into this issue: > > > > - if you aren't using a debugger or a profiler or some other > > JVM/TI agent, then this hang is not the same as yours > > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > > event handler, then this hang is not the same as yours > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > > JVM/TI RawMonitors, then this hang is not the same as yours > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > > back into Java code, then you might just be insane and this > > hang might be similar to yours. However, using a Java callback > > in an event handler is an even bigger problem/risk so fix that > > first. 
> > - if you one or more threads blocked like this and making no > > progress, then this hang might be the same as yours: > > > > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting > for monitor e > > ntry [0xfffffd7fc0231000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint > _has_called_back 0 _at_p > > oll_safepoint 0 > > JavaThread state: _thread_blocked > > at java.lang.Object.wait(Native Method) > > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > at java.lang.Object.wait(Object.java:502) > > at SMW_WorkerThread.run(StressMonitorWait.java:103) > > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting > for monitor e > > ntry [0xfffffd7fc0130000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint > _has_called_back 0 _at_p > > oll_safepoint 0 > > JavaThread state: _thread_blocked > > at SMW_WorkerThread.run(StressMonitorWait.java:120) > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting > for monitor e > > ntry [0xfffffd7fc002f000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint > _has_called_back 0 _at_p > > oll_safepoint 0 > > JavaThread state: _thread_blocked > > at SMW_WorkerThread.run(StressMonitorWait.java:139) > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > Key symptoms in thread T1: > > > > - had the object locked: > > > > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > - did an Object.wait(): > > > > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > - is blocked on reentry: > > > > waiting for monitor entry 
[0xfffffd7fc0231000] > > > > Key symptoms in thread T2: > > > > - is blocked waiting to lock the object: > > > > waiting for monitor entry [0xfffffd7fc0130000] > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > Key symptoms in thread T3: > > > > - is blocked waiting to lock the object: > > > > waiting for monitor entry [0xfffffd7fc002f000] > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > From ron.durbin at oracle.com Tue Feb 25 12:00:07 2014 From: ron.durbin at oracle.com (Ron Durbin) Date: Tue, 25 Feb 2014 12:00:07 -0800 (PST) Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <530CBED8.4050509@oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <530CBED8.4050509@oracle.com> Message-ID: <1a31f034-a44c-4c0c-b350-ea626ed81147@default> Code looks good. Thx for the local in person review. Thx Ron > -----Original Message----- > From: Daniel D. Daugherty > Sent: Tuesday, February 25, 2014 9:04 AM > To: David Holmes; Serguei Spitsyn; Dave Dice; Karen Kinnear > Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) > > Ping! Still haven't heard from anyone on this backport... > > Dan > > > On 2/21/14 8:40 PM, Daniel D. Daugherty wrote: > > Greetings, > > > > This is a code review request for the JDK8u-hs-dev backport of the > > following ObjectMonitor-JVM/TI hang fix: > > > > 8028073 race condition in ObjectMonitor implementation causing > > deadlocks > > https://bugs.openjdk.java.net/browse/JDK-8028073 > > > > Here is the JDK8u-hs-dev webrev URL: > > > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ > > > > This is _almost_ a straight forward backport of the JDK9 fix. 
The only > > difference to the fix was discussed at the end of the JDK9 review and > > was determined to only be needed in versions of HotSpot without the > > fix for 8028280: > > > > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html > > > > > > 8028280 has not yet been backported to JDK8u-hs-dev. > > > > The easiest way to review the backport is to download the two patch > > files from the webrevs and compare them with something like: > > > > jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch > > > > The same testing has been performed on the JDK8u-hs-dev version as > > with the JDK9-hs-runtime version. > > > > Thanks, in advance, for any comments, questions or suggestions. > > > > Dan > > > > > > On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: > > > Greetings, > > > > > > I have a fix ready for the following bug: > > > > > > 8028073 race condition in ObjectMonitor implementation causing > > deadlocks > > > https://bugs.openjdk.java.net/browse/JDK-8028073 > > > > > > On the surface, this is a very simple fix that relocates a few lines of > > > code, relocates and rewrites the comments associated with that code and > > > adds several new comments. > > > > > > Of course, in reality, the issue is much more complicated, but I'm > > > hoping to make it easy for anyone not acquainted with this issue to > > > understand what's going on. 
> > > > > > Here are the JDK9 webrev URLs: > > > > > > OpenJDK: > > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ > > > > > > Oracle internal: > > > > > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ > > > > > > The simple summary: > > > > > > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, > > > it is possible for a JVM/TI monitor event handler to accidentally > > > consume a ParkEvent.unpark() call meant for Java Monitor layer > > > - the original code fix was made on 2005.07.04 using this bug ID: > > > https://bugs.openjdk.java.net/browse/JDK-5030359 > > > - it's the right fix, but it's in the wrong place > > > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED > > > event handler is called because it is that event handler > > > that can cause the hang > > > > > > > > > Testing > > > ------- > > > > > > - a new StressMonitorWait test has been created that reliably > > > reproduces the hang in JDK[6789]; see the bug's gory details > > > for the specific versions where the hang has been reproduced > > > - the test reliably reproduces the hang in 5 seconds on my > > > T7600 running Solaris 10u11 X86; 1 minute runs reproduce > > > the hang reliably on other machines > > > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, > > > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT > > > bits did not reproduce the hang > > > - JPRT test job > > > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, > > > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, > > > and Windows-X64: > > > - vm.quick > > > - Kitchensink (bigapps) > > > - Weblogic+medrec (bigapps) > > > - runThese (bigapps) > > > > > > > > > The Gory Details Start Here > > > --------------------------- > > > > > > This is the old location of block of code that's being moved: > > > > > > src/share/vm/runtime/objectMonitor.cpp: > > > > > > 1440 void ObjectMonitor::wait(jlong millis, bool
interruptible, > > TRAPS) { > > > > > > 1499 exit (true, Self) ; // exit the monitor > > > > > > 1513 if (node._notified != 0 && _succ == Self) { > > > 1514 node._event->unpark(); > > > 1515 } > > > > > > > > > This is the new location of block of code that's being moved: > > > > > > src/share/vm/runtime/objectMonitor.cpp: > > > > > > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, > > TRAPS) { > > > > > > 1601 if (JvmtiExport::should_post_monitor_waited()) { > > > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == > > OS_TIMEOUT); > > > > > > 1604 if (node._notified != 0 && _succ == Self) { > > > > > > 1620 node._event->unpark(); > > > 1621 } > > > > > > > > > The Risks > > > --------- > > > > > > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event > > > is enabled: > > > - previously it was always executed > > > - while the old code was not effective for the hang that is being > > > fixed with this bug, it is possible that the old code prevented > > > a different bug in the successor protocol from manifesting > > > - thorough analysis of the successor protocol did not reveal a > > > case where the old code was needed in the old location > > > - Thorough analysis indicates that the other JVM/TI monitor events > > > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: > > > - the successor protocol is complicated and the analysis could > > > be wrong when certain options are used > > > - comments were added to each location where a JVM/TI monitor > > > event handler is called documenting why a fix like this one > > > is not needed there > > > - if the analysis is wrong, the new comments show where a new > > > code change would be needed > > > > > > > > > The Scenario > > > ------------ > > > > > > I've created a scenario that reproduces this hang: > > > > > > T1 - enters monitor and calls monitor.wait() > > > T2 - enters the monitor, calls monitor.notify() and exits the monitor > > > T3 - enters and exits the monitor 
> > > T4 - enters the monitor, delays for 5 seconds, exits the monitor > > > > > > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a > > > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. > > > > > > Here are the six events necessary to make this hang happen: > > > > > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ > > field, but > > > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the > > successor again. > > > > > > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > > > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > > > > > // KEY-EVENT-3a: T3 made T1 the successor > > > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > > > > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > > > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > > > > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > > > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > > > > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > > > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > > > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be > > stuck. > > > > > > > > > This bug is intertwined with: > > > > > > - The ObjectMonitor successor protocol > > > - the sharing of a ParkEvent between Java Monitors and JVM/TI > > RawMonitors > > > > > > There is a very long successor.notes attachment to JDK-8028073 that > > > attempts to describe the ObjectMonitor successor protocol. It's good > > > for putting pretty much anyone to sleep. > > > > > > Since this hang reproduces back to JDK6, this bug is taking the easily > > > backported solution of moving the original fix to the right location. 
> > > The following new bug has been filed for possible future work in this > > > area by the Serviceability Team: > > > > > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > > > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > > > > > > > The Symptoms > > > ------------ > > > > > > With intermittent hangs like this, it is useful to know what to look > > > for in order to determine if you are running into this issue: > > > > > > - if you aren't using a debugger or a profiler or some other > > > JVM/TI agent, then this hang is not the same as yours > > > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > > > event handler, then this hang is not the same as yours > > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > > > JVM/TI RawMonitors, then this hang is not the same as yours > > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > > > back into Java code, then you might just be insane and this > > > hang might be similar to yours. However, using a Java callback > > > in an event handler is an even bigger problem/risk so fix that > > > first. 
> > > - if you have one or more threads blocked like this and making no > > > progress, then this hang might be the same as yours: > > > > > > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] > > > java.lang.Thread.State: BLOCKED (on object monitor) > > > JavaThread state: _thread_blocked > > > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > > JavaThread state: _thread_blocked > > > at java.lang.Object.wait(Native Method) > > > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > at java.lang.Object.wait(Object.java:502) > > > at SMW_WorkerThread.run(StressMonitorWait.java:103) > > > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > > > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] > > > java.lang.Thread.State: BLOCKED (on object monitor) > > > JavaThread state: _thread_blocked > > > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > > JavaThread state: _thread_blocked > > > at SMW_WorkerThread.run(StressMonitorWait.java:120) > > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > > > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] > > > java.lang.Thread.State: BLOCKED (on object monitor) > > > JavaThread state: _thread_blocked > > > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > > JavaThread state: _thread_blocked > > > at SMW_WorkerThread.run(StressMonitorWait.java:139) > > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > > > Key symptoms in thread T1: > > > > > > - had the object locked: > > > > > > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > > > - did an Object.wait(): > > > > > > waiting on <0xfffffd7e6a2b6ff0>
(a java.lang.String) > > > > > > - is blocked on reentry: > > > > > > waiting for monitor entry [0xfffffd7fc0231000] > > > > > > Key symptoms in thread T2: > > > > > > - is blocked waiting to lock the object: > > > > > > waiting for monitor entry [0xfffffd7fc0130000] > > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > > > Key symptoms in thread T3: > > > > > > - is blocked waiting to lock the object: > > > > > > waiting for monitor entry [0xfffffd7fc002f000] > > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > > > > From serguei.spitsyn at oracle.com Tue Feb 25 12:43:41 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 25 Feb 2014 12:43:41 -0800 Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") Message-ID: <530D007D.4040602@oracle.com> Please, review the fix for: https://bugs.openjdk.java.net/browse/JDK-6471769 Open webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 Summary: This is another Test Stabilization issue. The fix is very similar to other JVMTI stabilization fixes. It is to use safepoints for updating the PopFrame data instead of relying on the suspend equivalent condition mechanism (JvmtiEnv::is_thread_fully_suspended()) which is not adequate from the reliability point of view. Testing: In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi Thanks, Serguei From daniel.daugherty at oracle.com Tue Feb 25 13:51:32 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 25 Feb 2014 14:51:32 -0700 Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <1a31f034-a44c-4c0c-b350-ea626ed81147@default> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <530CBED8.4050509@oracle.com> <1a31f034-a44c-4c0c-b350-ea626ed81147@default> Message-ID: <530D1064.6080304@oracle.com> Thanks Ron! 
Dan

On 2/25/14 1:00 PM, Ron Durbin wrote:
> Code looks good.
> Thx for the local in person review.
>
> Thx Ron

From serguei.spitsyn at oracle.com Tue Feb 25 13:54:49 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 25 Feb 2014 13:54:49 -0800
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <530CBED8.4050509@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <530CBED8.4050509@oracle.com>
Message-ID: <530D1129.4060605@oracle.com>

Will review it today.
You know, your fix is tricky. :)

Thanks,
Serguei

On 2/25/14 8:03 AM, Daniel D. Daugherty wrote:
> Ping! Still haven't heard from anyone on this backport...
>
> Dan

From daniel.daugherty at oracle.com Tue Feb 25 14:57:20 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 25 Feb 2014 15:57:20 -0700
Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <530D007D.4040602@oracle.com>
References: <530D007D.4040602@oracle.com>
Message-ID: <530D1FD0.8070308@oracle.com>

On 2/25/14 1:43 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
> https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1
>

src/share/vm/runtime/vm_operations.hpp
    No comments.

src/share/vm/prims/jvmtiEnvBase.hpp
    No comments.

src/share/vm/prims/jvmtiEnv.cpp
    No comments.

src/share/vm/prims/jvmtiEnvThreadState.cpp
    No comments.

src/share/vm/prims/jvmtiThreadState.cpp
    line 66: _cur_stack_depth = UNKNOWN_STACK_DEPTH;
        This looks like the key piece of this fix with respect to the
        assert() in the bug report. I suspect that the first call to
        JvmtiThreadState::cur_stack_depth() is racing with another
        thread that happens to do something else that inits or sets
        _cur_stack_depth to an acceptable value.

    line 251: "must be current thread or at safepont");
    line 284: "must be current thread or at safepont");
        typo: 'safepont' -> 'safepoint'

Thumbs up! No need to re-review the typo fixes.

Dan

>
> Summary:
>
> This is another Test Stabilization issue.
> The fix is very similar to other JVMTI stabilization fixes. > It is to use safepoints for updating the PopFrame data instead of > relying on the > suspend equivalent condition mechanism > (JvmtiEnv::is_thread_fully_suspended()) > which is not adequate from the reliability point of view. > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi > > > Thanks, > Serguei > From Karen.Kinnear at Oracle.com Tue Feb 25 15:11:10 2014 From: Karen.Kinnear at Oracle.com (Karen Kinnear) Date: Tue, 25 Feb 2014 18:11:10 -0500 Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <53081C1F.2070008@oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> Message-ID: <8100537C-485E-4699-926B-CB1A4BF36DA7@Oracle.com> Dan, Code looks good. This makes sense to me. Thank you for the detailed comments and testing. thanks, Karen p.s. sorry - you would think getting three copies of the review request would mean I would not completely overlook this in my emails :-) On Feb 21, 2014, at 10:40 PM, Daniel D. Daugherty wrote: > Greetings, > > This is a code review request for the JDK8u-hs-dev backport of the > following ObjectMonitor-JVM/TI hang fix: > > 8028073 race condition in ObjectMonitor implementation causing deadlocks > https://bugs.openjdk.java.net/browse/JDK-8028073 > > Here is the JDK8u-hs-dev webrev URL: > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ > > This is _almost_ a straight forward backport of the JDK9 fix. The only > difference to the fix was discussed at the end of the JDK9 review and > was determined to only be needed in versions of HotSpot without the > fix for 8028280: > > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html > > 8028280 has not yet been backported to JDK8u-hs-dev. 
> > The easiest way to review the backport is to download the two patch > files from the webrevs and compare them with something like: > > jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch > > The same testing has been performed on the JDK8u-hs-dev version as > with the JDK9-hs-runtime version. > > Thanks, in advance, for any comments, questions or suggestions. > > Dan > > > On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: > > Greetings, > > > > I have a fix ready for the following bug: > > > > 8028073 race condition in ObjectMonitor implementation causing deadlocks > > https://bugs.openjdk.java.net/browse/JDK-8028073 > > > > On the surface, this is a very simple fix that relocates a few lines of > > code, relocates and rewrites the comments associated with that code and > > adds several new comments. > > > > Of course, in reality, the issue is much more complicated, but I'm > > hoping to make it easy for anyone not acquainted with this issue to > > understand what's going on. 
> > > > Here are the JDK9 webrev URLs: > > > > OpenJDK: > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ > > > > Oracle internal: > > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ > > > > The simple summary: > > > > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, > > it is possible for a JVM/TI monitor event handler to accidentally > > consume a ParkEvent.unpark() call meant for the Java Monitor layer > > - the original code fix was made on 2005.07.04 using this bug ID: > > https://bugs.openjdk.java.net/browse/JDK-5030359 > > - it's the right fix, but it's in the wrong place > > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED > > event handler is called because it is that event handler > > that can cause the hang > > > > > > Testing > > ------- > > > > - a new StressMonitorWait test has been created that reliably > > reproduces the hang in JDK[6789]; see the bug's gory details > > for the specific versions where the hang has been reproduced > > - the test reliably reproduces the hang in 5 seconds on my > > T7600 running Solaris 10u11 X86; 1 minute runs reproduce > > the hang reliably on other machines > > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, > > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT > > bits did not reproduce the hang > > - JPRT test job > > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, > > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, > > and Windows-X64: > > - vm.quick > > - Kitchensink (bigapps) > > - Weblogic+medrec (bigapps) > > - runThese (bigapps) > > > > > > The Gory Details Start Here > > --------------------------- > > > > This is the old location of block of code that's being moved: > > > > src/share/vm/runtime/objectMonitor.cpp: > > > > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > > > 1499 exit (true, Self) ; // exit the monitor > > > > 1513 if (node._notified !=
0 && _succ == Self) { > > 1514 node._event->unpark(); > > 1515 } > > > > > > This is the new location of block of code that's being moved: > > > > src/share/vm/runtime/objectMonitor.cpp: > > > > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > > > 1601 if (JvmtiExport::should_post_monitor_waited()) { > > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT); > > > > 1604 if (node._notified != 0 && _succ == Self) { > > > > 1620 node._event->unpark(); > > 1621 } > > > > > > The Risks > > --------- > > > > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event > > is enabled: > > - previously it was always executed > > - while the old code was not effective for the hang that is being > > fixed with this bug, it is possible that the old code prevented > > a different bug in the successor protocol from manifesting > > - thorough analysis of the successor protocol did not reveal a > > case where the old code was needed in the old location > > - Thorough analysis indicates that the other JVM/TI monitor events > > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: > > - the successor protocol is complicated and the analysis could > > be wrong when certain options are used > > - comments were added to each location where a JVM/TI monitor > > event handler is called documenting why a fix like this one > > is not needed there > > - if the analysis is wrong, the new comments show where a new > > code change would be needed > > > > > > The Scenario > > ------------ > > > > I've created a scenario that reproduces this hang: > > > > T1 - enters monitor and calls monitor.wait() > > T2 - enters the monitor, calls monitor.notify() and exits the monitor > > T3 - enters and exits the monitor > > T4 - enters the monitor, delays for 5 seconds, exits the monitor > > > > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a > > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. 
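The scenario above can be sketched as a small, single-threaded C++ model of the shared ParkEvent. This is only an illustration under simplifying assumptions, not HotSpot code: `ParkEventModel`, `try_park()` and `t1_wakes_up()` are hypothetical names, the permit is a plain bool, and the non-blocking `try_park()` returns false at the point where a real park() would block.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a ParkEvent: a single permit.
struct ParkEventModel {
    bool permit = false;

    // Post the permit; posting twice still leaves only one permit.
    void unpark() { permit = true; }

    // Non-blocking stand-in for park(): consume the permit if present.
    // A false return marks the point where a real park() would block.
    bool try_park() {
        bool had_permit = permit;
        permit = false;
        return had_permit;
    }
};

// Returns true if T1 still finds a permit when it finally parks to
// reenter the monitor, i.e. the modeled hang does not occur.
bool t1_wakes_up(bool repost_after_handler) {
    ParkEventModel event;          // shared by the Java monitor and the raw monitor
    bool notified = true;          // T2 notified T1 (models node._notified != 0)
    bool t1_is_successor = true;   // T3 made T1 the successor (models _succ == Self)

    event.unpark();                // T3's unpark() aimed at the successor T1

    // The MONITOR_WAITED handler's timed raw monitor wait parks on the
    // same event and consumes the permit meant for the Java monitor.
    event.try_park();

    // Relocated fix: re-post the unpark after the handler has returned.
    if (repost_after_handler && notified && t1_is_successor) {
        event.unpark();
    }

    // T4 takes the quick exit path, so nobody else will unpark T1.
    return event.try_park();
}

// Small self-check of both variants of the model.
void demo() {
    assert(!t1_wakes_up(false));   // permit was eaten: T1 would block
    assert(t1_wakes_up(true));     // permit re-posted: T1 proceeds
}
```

Calling `t1_wakes_up(false)` models the code before the move, where the handler eats the only permit; `t1_wakes_up(true)` models the relocated unpark. The real successor protocol has many more states than this model captures.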
> > > > Here are the six events necessary to make this hang happen: > > > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but > > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again. > > > > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > > > // KEY-EVENT-3a: T3 made T1 the successor > > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck. > > > > > > This bug is intertwined with: > > > > - The ObjectMonitor successor protocol > > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors > > > > There is a very long successor.notes attachment to JDK-8028073 that > > attempts to describe the ObjectMonitor successor protocol. It's good > > for putting pretty much anyone to sleep. > > > > Since this hang reproduces back to JDK6, this bug is taking the easily > > backported solution of moving the original fix to the right location. 
> > The following new bug has been filed for possible future work in this > > area by the Serviceability Team: > > > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > > > > The Symptoms > > ------------ > > > > With intermittent hangs like this, it is useful to know what to look > > for in order to determine if you are running into this issue: > > > > - if you aren't using a debugger or a profiler or some other > > JVM/TI agent, then this hang is not the same as yours > > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > > event handler, then this hang is not the same as yours > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > > JVM/TI RawMonitors, then this hang is not the same as yours > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > > back into Java code, then you might just be insane and this > > hang might be similar to yours. However, using a Java callback > > in an event handler is an even bigger problem/risk so fix that > > first. 
> > - if you have one or more threads blocked like this and making no > > progress, then this hang might be the same as yours: > > > > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at java.lang.Object.wait(Native Method) > > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > at java.lang.Object.wait(Object.java:502) > > at SMW_WorkerThread.run(StressMonitorWait.java:103) > > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at SMW_WorkerThread.run(StressMonitorWait.java:120) > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at SMW_WorkerThread.run(StressMonitorWait.java:139) > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > Key symptoms in thread T1: > > > > - had the object locked: > > > > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > - did an Object.wait(): > > > > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > - is blocked on reentry: > > > > waiting for monitor entry [0xfffffd7fc0231000] > > >
> Key symptoms in thread T2: > > > > - is blocked waiting to lock the object: > > > > waiting for monitor entry [0xfffffd7fc0130000] > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > Key symptoms in thread T3: > > > > - is blocked waiting to lock the object: > > > > waiting for monitor entry [0xfffffd7fc002f000] > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > From daniel.daugherty at oracle.com Tue Feb 25 15:26:16 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 25 Feb 2014 16:26:16 -0700 Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <8100537C-485E-4699-926B-CB1A4BF36DA7@Oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <8100537C-485E-4699-926B-CB1A4BF36DA7@Oracle.com> Message-ID: <530D2698.9030503@oracle.com> On 2/25/14 4:11 PM, Karen Kinnear wrote: > Dan, > > Code looks good. This makes sense to me. Thank you for the detailed comments and testing. Thanks for the review! > thanks, > Karen > > p.s. sorry - you would think getting three copies of the review request would mean > I would not completely overlook this in my emails :-) No worries. I'm using the backports as fillers for my other tasks... sometimes I need a break from the Win7 DBI... :-) Dan > On Feb 21, 2014, at 10:40 PM, Daniel D. Daugherty wrote: > >> Greetings, >> >> This is a code review request for the JDK8u-hs-dev backport of the >> following ObjectMonitor-JVM/TI hang fix: >> >> 8028073 race condition in ObjectMonitor implementation causing deadlocks >> https://bugs.openjdk.java.net/browse/JDK-8028073 >> >> Here is the JDK8u-hs-dev webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ >> >> This is _almost_ a straight forward backport of the JDK9 fix. 
The only >> difference to the fix was discussed at the end of the JDK9 review and >> was determined to only be needed in versions of HotSpot without the >> fix for 8028280: >> >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html >> >> 8028280 has not yet been backported to JDK8u-hs-dev. >> >> The easiest way to review the backport is to download the two patch >> files from the webrevs and compare them with something like: >> >> jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch >> >> The same testing has been performed on the JDK8u-hs-dev version as >> with the JDK9-hs-runtime version. >> >> Thanks, in advance, for any comments, questions or suggestions. >> >> Dan >> >> >> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> I have a fix ready for the following bug: >>> >>> 8028073 race condition in ObjectMonitor implementation causing deadlocks >>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>> >>> On the surface, this is a very simple fix that relocates a few lines of >>> code, relocates and rewrites the comments associated with that code and >>> adds several new comments. >>> >>> Of course, in reality, the issue is much more complicated, but I'm >>> hoping to make it easy for anyone not acquainted with this issue to >>> understand what's going on. 
>>> >>> Here are the JDK9 webrev URLs: >>> >>> OpenJDK: >>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>> >>> Oracle internal: >>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>> >>> The simple summary: >>> >>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>> it is possible for a JVM/TI monitor event handler to accidentally >>> consume a ParkEvent.unpark() call meant for the Java Monitor layer >>> - the original code fix was made on 2005.07.04 using this bug ID: >>> https://bugs.openjdk.java.net/browse/JDK-5030359 >>> - it's the right fix, but it's in the wrong place >>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>> event handler is called because it is that event handler >>> that can cause the hang >>> >>> >>> Testing >>> ------- >>> >>> - a new StressMonitorWait test has been created that reliably >>> reproduces the hang in JDK[6789]; see the bug's gory details >>> for the specific versions where the hang has been reproduced >>> - the test reliably reproduces the hang in 5 seconds on my >>> T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>> the hang reliably on other machines >>> - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>> Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>> bits did not reproduce the hang >>> - JPRT test job >>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>> Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>> and Windows-X64: >>> - vm.quick >>> - Kitchensink (bigapps) >>> - Weblogic+medrec (bigapps) >>> - runThese (bigapps) >>> >>> >>> The Gory Details Start Here >>> --------------------------- >>> >>> This is the old location of block of code that's being moved: >>> >>> src/share/vm/runtime/objectMonitor.cpp: >>> >>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >>> >>> 1499 exit (true, Self) ; // exit the monitor >>> >>> 1513 if (node._notified !=
0 && _succ == Self) { >>> 1514 node._event->unpark(); >>> 1515 } >>> >>> >>> This is the new location of block of code that's being moved: >>> >>> src/share/vm/runtime/objectMonitor.cpp: >>> >>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >>> >>> 1601 if (JvmtiExport::should_post_monitor_waited()) { >>> 1602 JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT); >>> >>> 1604 if (node._notified != 0 && _succ == Self) { >>> >>> 1620 node._event->unpark(); >>> 1621 } >>> >>> >>> The Risks >>> --------- >>> >>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >>> is enabled: >>> - previously it was always executed >>> - while the old code was not effective for the hang that is being >>> fixed with this bug, it is possible that the old code prevented >>> a different bug in the successor protocol from manifesting >>> - thorough analysis of the successor protocol did not reveal a >>> case where the old code was needed in the old location >>> - Thorough analysis indicates that the other JVM/TI monitor events >>> do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>> - the successor protocol is complicated and the analysis could >>> be wrong when certain options are used >>> - comments were added to each location where a JVM/TI monitor >>> event handler is called documenting why a fix like this one >>> is not needed there >>> - if the analysis is wrong, the new comments show where a new >>> code change would be needed >>> >>> >>> The Scenario >>> ------------ >>> >>> I've created a scenario that reproduces this hang: >>> >>> T1 - enters monitor and calls monitor.wait() >>> T2 - enters the monitor, calls monitor.notify() and exits the monitor >>> T3 - enters and exits the monitor >>> T4 - enters the monitor, delays for 5 seconds, exits the monitor >>> >>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. 
>>> >>> Here are the six events necessary to make this hang happen: >>> >>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but >>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again. >>> >>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>> >>> // KEY-EVENT-3a: T3 made T1 the successor >>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. >>> >>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>> >>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>> >>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler >>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck. >>> >>> >>> This bug is intertwined with: >>> >>> - The ObjectMonitor successor protocol >>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors >>> >>> There is a very long successor.notes attachment to JDK-8028073 that >>> attempts to describe the ObjectMonitor successor protocol. It's good >>> for putting pretty much anyone to sleep. >>> >>> Since this hang reproduces back to JDK6, this bug is taking the easily >>> backported solution of moving the original fix to the right location. 
>>> The following new bug has been filed for possible future work in this >>> area by the Serviceability Team: >>> >>> 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>> https://bugs.openjdk.java.net/browse/JDK-8033399 >>> >>> >>> The Symptoms >>> ------------ >>> >>> With intermittent hangs like this, it is useful to know what to look >>> for in order to determine if you are running into this issue: >>> >>> - if you aren't using a debugger or a profiler or some other >>> JVM/TI agent, then this hang is not the same as yours >>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>> event handler, then this hang is not the same as yours >>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>> JVM/TI RawMonitors, then this hang is not the same as yours >>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>> back into Java code, then you might just be insane and this >>> hang might be similar to yours. However, using a Java callback >>> in an event handler is an even bigger problem/risk so fix that >>> first. 
>>> - if you have one or more threads blocked like this and making no >>> progress, then this hang might be the same as yours: >>> >>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] >>> java.lang.Thread.State: BLOCKED (on object monitor) >>> JavaThread state: _thread_blocked >>> Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>> JavaThread state: _thread_blocked >>> at java.lang.Object.wait(Native Method) >>> - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> at java.lang.Object.wait(Object.java:502) >>> at SMW_WorkerThread.run(StressMonitorWait.java:103) >>> - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> >>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] >>> java.lang.Thread.State: BLOCKED (on object monitor) >>> JavaThread state: _thread_blocked >>> Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>> JavaThread state: _thread_blocked >>> at SMW_WorkerThread.run(StressMonitorWait.java:120) >>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> >>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] >>> java.lang.Thread.State: BLOCKED (on object monitor) >>> JavaThread state: _thread_blocked >>> Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>> JavaThread state: _thread_blocked >>> at SMW_WorkerThread.run(StressMonitorWait.java:139) >>> - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> >>> Key symptoms in thread T1: >>> >>> - had the object locked: >>> >>> locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> >>> - did an Object.wait(): >>> >>> waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> >>> - is blocked on reentry: >>> >>> waiting for monitor entry [0xfffffd7fc0231000] >>>
>>> Key symptoms in thread T2: >>> >>> - is blocked waiting to lock the object: >>> >>> waiting for monitor entry [0xfffffd7fc0130000] >>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> >>> Key symptoms in thread T3: >>> >>> - is blocked waiting to lock the object: >>> >>> waiting for monitor entry [0xfffffd7fc002f000] >>> waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) From daniel.daugherty at oracle.com Tue Feb 25 15:26:43 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 25 Feb 2014 16:26:43 -0700 Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <530D1129.4060605@oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <530CBED8.4050509@oracle.com> <530D1129.4060605@oracle.com> Message-ID: <530D26B3.6000702@oracle.com> On 2/25/14 2:54 PM, serguei.spitsyn at oracle.com wrote: > Will review it today. > You know, your fix is tricky. :) Thanks. I will wait for your review. Dan > > Thanks, > Serguei > > On 2/25/14 8:03 AM, Daniel D. Daugherty wrote: >> Ping! Still haven't heard from anyone on this backport... >> >> Dan >> >> >> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> This is a code review request for the JDK8u-hs-dev backport of the >>> following ObjectMonitor-JVM/TI hang fix: >>> >>> 8028073 race condition in ObjectMonitor implementation causing >>> deadlocks >>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>> >>> Here is the JDK8u-hs-dev webrev URL: >>> >>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ >>> >>> This is _almost_ a straight forward backport of the JDK9 fix. 
The only >>> difference to the fix was discussed at the end of the JDK9 review and >>> was determined to only be needed in versions of HotSpot without the >>> fix for 8028280: >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html >>> >>> >>> 8028280 has not yet been backported to JDK8u-hs-dev. >>> >>> The easiest way to review the backport is to download the two patch >>> files from the webrevs and compare them with something like: >>> >>> jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch >>> >>> The same testing has been performed on the JDK8u-hs-dev version as >>> with the JDK9-hs-runtime version. >>> >>> Thanks, in advance, for any comments, questions or suggestions. >>> >>> Dan >>> >>> >>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: >>> > Greetings, >>> > >>> > I have a fix ready for the following bug: >>> > >>> > 8028073 race condition in ObjectMonitor implementation causing >>> deadlocks >>> > https://bugs.openjdk.java.net/browse/JDK-8028073 >>> > >>> > On the surface, this is a very simple fix that relocates a few >>> lines of >>> > code, relocates and rewrites the comments associated with that >>> code and >>> > adds several new comments. >>> > >>> > Of course, in reality, the issue is much more complicated, but I'm >>> > hoping to make it easy for anyone not acquainted with this issue to >>> > understand what's going on. 
>>> > >>> > Here are the JDK9 webrev URLs: >>> > >>> > OpenJDK: >>> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>> > >>> > Oracle internal: >>> > >>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>> >>> > >>> > The simple summary: >>> > >>> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>> > it is possible for a JVM/TI monitor event handler to accidentally >>> > consume a ParkEvent.unpark() call meant for the Java Monitor layer >>> > - the original code fix was made on 2005.07.04 using this bug ID: >>> > https://bugs.openjdk.java.net/browse/JDK-5030359 >>> > - it's the right fix, but it's in the wrong place >>> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>> > event handler is called because it is that event handler >>> > that can cause the hang >>> > >>> > >>> > Testing >>> > ------- >>> > >>> > - a new StressMonitorWait test has been created that reliably >>> > reproduces the hang in JDK[6789]; see the bug's gory details >>> > for the specific versions where the hang has been reproduced >>> > - the test reliably reproduces the hang in 5 seconds on my >>> > T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>> > the hang reliably on other machines >>> > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>> > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>> > bits did not reproduce the hang >>> > - JPRT test job >>> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>> > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>> > and Windows-X64: >>> > - vm.quick >>> > - Kitchensink (bigapps) >>> > - Weblogic+medrec (bigapps) >>> > - runThese (bigapps) >>> > >>> > >>> > The Gory Details Start Here >>> > --------------------------- >>> > >>> > This is the old location of block of code that's being moved: >>> > >>> > src/share/vm/runtime/objectMonitor.cpp: >>> > >>> > 1440 void ObjectMonitor::wait(jlong millis, bool
interruptible, >>> TRAPS) { >>> > >>> > 1499 exit (true, Self) ; // exit the monitor >>> > >>> > 1513 if (node._notified != 0 && _succ == Self) { >>> > 1514 node._event->unpark(); >>> > 1515 } >>> > >>> > >>> > This is the new location of block of code that's being moved: >>> > >>> > src/share/vm/runtime/objectMonitor.cpp: >>> > >>> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, >>> TRAPS) { >>> > >>> > 1601 if (JvmtiExport::should_post_monitor_waited()) { >>> > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>> OS_TIMEOUT); >>> > >>> > 1604 if (node._notified != 0 && _succ == Self) { >>> > >>> > 1620 node._event->unpark(); >>> > 1621 } >>> > >>> > >>> > The Risks >>> > --------- >>> > >>> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED >>> event >>> > is enabled: >>> > - previously it was always executed >>> > - while the old code was not effective for the hang that is being >>> > fixed with this bug, it is possible that the old code prevented >>> > a different bug in the successor protocol from manifesting >>> > - thorough analysis of the successor protocol did not reveal a >>> > case where the old code was needed in the old location >>> > - Thorough analysis indicates that the other JVM/TI monitor events >>> > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>> > - the successor protocol is complicated and the analysis could >>> > be wrong when certain options are used >>> > - comments were added to each location where a JVM/TI monitor >>> > event handler is called documenting why a fix like this one >>> > is not needed there >>> > - if the analysis is wrong, the new comments show where a new >>> > code change would be needed >>> > >>> > >>> > The Scenario >>> > ------------ >>> > >>> > I've created a scenario that reproduces this hang: >>> > >>> > T1 - enters monitor and calls monitor.wait() >>> > T2 - enters the monitor, calls monitor.notify() and exits the monitor >>> > T3 - enters and exits the 
monitor >>> > T4 - enters the monitor, delays for 5 seconds, exits the monitor >>> > >>> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>> > handler that: enters a raw monitor, waits for 1ms, exits a raw >>> monitor. >>> > >>> > Here are the six events necessary to make this hang happen: >>> > >>> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>> field, but >>> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the >>> successor again. >>> > >>> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the >>> successor >>> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>> > >>> > // KEY-EVENT-3a: T3 made T1 the successor >>> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 >>> grabbed it. >>> > >>> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>> > >>> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>> > >>> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event >>> handler >>> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to >>> be stuck. >>> > >>> > >>> > This bug is intertwined with: >>> > >>> > - The ObjectMonitor successor protocol >>> > - the sharing of a ParkEvent between Java Monitors and JVM/TI >>> RawMonitors >>> > >>> > There is a very long successor.notes attachment to JDK-8028073 that >>> > attempts to describe the ObjectMonitor successor protocol. It's good >>> > for putting pretty much anyone to sleep. >>> > >>> > Since this hang reproduces back to JDK6, this bug is taking the >>> easily >>> > backported solution of moving the original fix to the right location. 
>>> > The following new bug has been filed for possible future work in this >>> > area by the Serviceability Team: >>> > >>> > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>> > https://bugs.openjdk.java.net/browse/JDK-8033399 >>> > >>> > >>> > The Symptoms >>> > ------------ >>> > >>> > With intermittent hangs like this, it is useful to know what to look >>> > for in order to determine if you are running into this issue: >>> > >>> > - if you aren't using a debugger or a profiler or some other >>> > JVM/TI agent, then this hang is not the same as yours >>> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>> > event handler, then this hang is not the same as yours >>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>> > JVM/TI RawMonitors, then this hang is not the same as yours >>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>> > back into Java code, then you might just be insane and this >>> > hang might be similar to yours. However, using a Java callback >>> > in an event handler is an even bigger problem/risk so fix that >>> > first. 
>>> > - if you one or more threads blocked like this and making no >>> > progress, then this hang might be the same as yours: >>> > >>> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting >>> for monitor e >>> > ntry [0xfffffd7fc0231000] >>> > java.lang.Thread.State: BLOCKED (on object monitor) >>> > JavaThread state: _thread_blocked >>> > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint >>> _has_called_back 0 _at_p >>> > oll_safepoint 0 >>> > JavaThread state: _thread_blocked >>> > at java.lang.Object.wait(Native Method) >>> > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > at java.lang.Object.wait(Object.java:502) >>> > at SMW_WorkerThread.run(StressMonitorWait.java:103) >>> > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting >>> for monitor e >>> > ntry [0xfffffd7fc0130000] >>> > java.lang.Thread.State: BLOCKED (on object monitor) >>> > JavaThread state: _thread_blocked >>> > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint >>> _has_called_back 0 _at_p >>> > oll_safepoint 0 >>> > JavaThread state: _thread_blocked >>> > at SMW_WorkerThread.run(StressMonitorWait.java:120) >>> > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting >>> for monitor e >>> > ntry [0xfffffd7fc002f000] >>> > java.lang.Thread.State: BLOCKED (on object monitor) >>> > JavaThread state: _thread_blocked >>> > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint >>> _has_called_back 0 _at_p >>> > oll_safepoint 0 >>> > JavaThread state: _thread_blocked >>> > at SMW_WorkerThread.run(StressMonitorWait.java:139) >>> > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > Key symptoms in thread T1: >>> > >>> > - had the object locked: >>> > >>> > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > - did an Object.wait(): >>> > >>> > waiting on <0xfffffd7e6a2b6ff0> 
(a java.lang.String) >>> > >>> > - is blocked on reentry: >>> > >>> > waiting for monitor entry [0xfffffd7fc0231000] >>> > >>> > Key symptoms in thread T2: >>> > >>> > - is blocked waiting to lock the object: >>> > >>> > waiting for monitor entry [0xfffffd7fc0130000] >>> > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > Key symptoms in thread T3: >>> > >>> > - is blocked waiting to lock the object: >>> > >>> > waiting for monitor entry [0xfffffd7fc002f000] >>> > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> >>> >>> >> > From serguei.spitsyn at oracle.com Tue Feb 25 15:35:54 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 25 Feb 2014 15:35:54 -0800 Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <530D1FD0.8070308@oracle.com> References: <530D007D.4040602@oracle.com> <530D1FD0.8070308@oracle.com> Message-ID: <530D28DA.1070308@oracle.com> On 2/25/14 2:57 PM, Daniel D. Daugherty wrote: > On 2/25/14 1:43 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6471769 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >> > > src/share/vm/runtime/vm_operations.hpp > No comments. > > src/share/vm/prims/jvmtiEnvBase.hpp > No comments. > > src/share/vm/prims/jvmtiEnv.cpp > No comments. > > src/share/vm/prims/jvmtiEnvThreadState.cpp > No comments. > > src/share/vm/prims/jvmtiThreadState.cpp > line 66: _cur_stack_depth = UNKNOWN_STACK_DEPTH; > This looks like the key piece of this fix with respect to the > assert() in the bug report. I suspect that the first call to > JvmtiThreadState::cur_stack_depth() is racing with another > thread that happens to do something else that inits or sets > _cur_stack_depth to an acceptable value. 
Another potential cause of the issue is that cur_stack_depth() is called
from update_for_pop_top_frame(), which used to be called under the
suspend equivalent condition.
The update_for_pop_top_frame() call is now made at a safepoint instead.

>
> line 251: "must be current thread or at safepont");
> line 284: "must be current thread or at safepont");
> typo: 'safepont' -> 'safepoint'
>
> Thumbs up! No need to re-review the typo fixes.

Sure. I'll fix it before pushing.

Thanks, Dan!
Serguei

>
> Dan
>
>
>>
>> Summary:
>>
>> This is another Test Stabilization issue.
>> The fix is very similar to other JVMTI stabilization fixes.
>> It is to use safepoints for updating the PopFrame data instead of
>> relying on the
>> suspend equivalent condition mechanism
>> (JvmtiEnv::is_thread_fully_suspended())
>> which is not adequate from the reliability point of view.
>>
>> Testing:
>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>>
>

From serguei.spitsyn at oracle.com Tue Feb 25 16:20:40 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 25 Feb 2014 16:20:40 -0800
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <530CBED8.4050509@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <530CBED8.4050509@oracle.com>
Message-ID: <530D3358.5090304@oracle.com>

Dan,

The fix looks good to me.
I like the comments.
They help to understand this aspect of the protocol.

Thanks,
Serguei

On 2/25/14 8:03 AM, Daniel D. Daugherty wrote:
> Ping! Still haven't heard from anyone on this backport...
>
> Dan
>
>
> On 2/21/14 8:40 PM, Daniel D.
Daugherty wrote: >> Greetings, >> >> This is a code review request for the JDK8u-hs-dev backport of the >> following ObjectMonitor-JVM/TI hang fix: >> >> 8028073 race condition in ObjectMonitor implementation causing >> deadlocks >> https://bugs.openjdk.java.net/browse/JDK-8028073 >> >> Here is the JDK8u-hs-dev webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ >> >> This is _almost_ a straight forward backport of the JDK9 fix. The only >> difference to the fix was discussed at the end of the JDK9 review and >> was determined to only be needed in versions of HotSpot without the >> fix for 8028280: >> >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html >> >> >> 8028280 has not yet been backported to JDK8u-hs-dev. >> >> The easiest way to review the backport is to download the two patch >> files from the webrevs and compare them with something like: >> >> jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch >> >> The same testing has been performed on the JDK8u-hs-dev version as >> with the JDK9-hs-runtime version. >> >> Thanks, in advance, for any comments, questions or suggestions. >> >> Dan >> >> >> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: >> > Greetings, >> > >> > I have a fix ready for the following bug: >> > >> > 8028073 race condition in ObjectMonitor implementation causing >> deadlocks >> > https://bugs.openjdk.java.net/browse/JDK-8028073 >> > >> > On the surface, this is a very simple fix that relocates a few >> lines of >> > code, relocates and rewrites the comments associated with that code >> and >> > adds several new comments. >> > >> > Of course, in reality, the issue is much more complicated, but I'm >> > hoping to make it easy for anyone not acquainted with this issue to >> > understand what's going on. 
>> > >> > Here are the JDK9 webrev URLs: >> > >> > OpenJDK: >> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >> > >> > Oracle internal: >> > >> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >> > >> > The simple summary: >> > >> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >> > it is possible for a JVM/TI monitor event handler to accidentally >> > consume a ParkEvent.unpark() call meant for Java Monitor layer >> > - the original code fix was made on 2005.07.04 using this bug ID: >> > https://bugs.openjdk.java.net/browse/JDK-5030359 >> > - it's the right fix, but it's in the wrong place >> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >> > event handler is called because it is that event handler >> > that can cause the hang >> > >> > >> > Testing >> > ------- >> > >> > - a new StessMonitorWait test has been created that reliably >> > reproduces the hang in JDK[6789]; see the bug's gory details >> > for the specific versions where the hang has been reproduced >> > - the test reliably reproduces the hang in 5 seconds on my >> > T7600 running Solaris 10u11 X86; 1 minute runs reproduce >> > the hang reliably on other machines >> > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >> > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >> > bits did not reproduce the hang >> > - JPRT test job >> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >> > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >> > and Windows-X64: >> > - vm.quick >> > - Kitchensink (bigapps) >> > - Weblogic+medrec (bigapps) >> > - runThese (bigapps) >> > >> > >> > The Gory Details Start Here >> > --------------------------- >> > >> > This is the old location of block of code that's being moved: >> > >> > src/share/vm/runtime/objectMonitor.cpp: >> > >> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, >> TRAPS) { >> > >> > 1499 exit (true, 
Self) ; // exit the monitor >> > >> > 1513 if (node._notified != 0 && _succ == Self) { >> > 1514 node._event->unpark(); >> > 1515 } >> > >> > >> > This is the new location of block of code that's being moved: >> > >> > src/share/vm/runtime/objectMonitor.cpp: >> > >> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, >> TRAPS) { >> > >> > 1601 if (JvmtiExport::should_post_monitor_waited()) { >> > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >> OS_TIMEOUT); >> > >> > 1604 if (node._notified != 0 && _succ == Self) { >> > >> > 1620 node._event->unpark(); >> > 1621 } >> > >> > >> > The Risks >> > --------- >> > >> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >> > is enabled: >> > - previously it was always executed >> > - while the old code was not effective for the hang that is being >> > fixed with this bug, it is possible that the old code prevented >> > a different bug in the successor protocol from manifesting >> > - thorough analysis of the successor protocol did not reveal a >> > case where the old code was needed in the old location >> > - Thorough analysis indicates that the other JVM/TI monitor events >> > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >> > - the successor protocol is complicated and the analysis could >> > be wrong when certain options are used >> > - comments were added to each location where a JVM/TI monitor >> > event handler is called documenting why a fix like this one >> > is not needed there >> > - if the analysis is wrong, the new comments show where a new >> > code change would be needed >> > >> > >> > The Scenario >> > ------------ >> > >> > I've created a scenario that reproduces this hang: >> > >> > T1 - enters monitor and calls monitor.wait() >> > T2 - enters the monitor, calls monitor.notify() and exits the monitor >> > T3 - enters and exits the monitor >> > T4 - enters the monitor, delays for 5 seconds, exits the monitor >> > >> > A JVM/TI agent that enables 
JVMTI_EVENT_MONITOR_WAITED and has a >> > handler that: enters a raw monitor, waits for 1ms, exits a raw >> monitor. >> > >> > Here are the six events necessary to make this hang happen: >> > >> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >> field, but >> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the >> successor again. >> > >> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >> > >> > // KEY-EVENT-3a: T3 made T1 the successor >> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed >> it. >> > >> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >> > >> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so >> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >> > >> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but >> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event >> handler >> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >> stuck. >> > >> > >> > This bug is intertwined with: >> > >> > - The ObjectMonitor successor protocol >> > - the sharing of a ParkEvent between Java Monitors and JVM/TI >> RawMonitors >> > >> > There is a very long successor.notes attachment to JDK-8028073 that >> > attempts to describe the ObjectMonitor successor protocol. It's good >> > for putting pretty much anyone to sleep. >> > >> > Since this hang reproduces back to JDK6, this bug is taking the easily >> > backported solution of moving the original fix to the right location. 
>> > The following new bug has been filed for possible future work in this >> > area by the Serviceability Team: >> > >> > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >> > https://bugs.openjdk.java.net/browse/JDK-8033399 >> > >> > >> > The Symptoms >> > ------------ >> > >> > With intermittent hangs like this, it is useful to know what to look >> > for in order to determine if you are running into this issue: >> > >> > - if you aren't using a debugger or a profiler or some other >> > JVM/TI agent, then this hang is not the same as yours >> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >> > event handler, then this hang is not the same as yours >> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >> > JVM/TI RawMonitors, then this hang is not the same as yours >> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >> > back into Java code, then you might just be insane and this >> > hang might be similar to yours. However, using a Java callback >> > in an event handler is an even bigger problem/risk so fix that >> > first. 
>> > - if you one or more threads blocked like this and making no >> > progress, then this hang might be the same as yours: >> > >> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting >> for monitor e >> > ntry [0xfffffd7fc0231000] >> > java.lang.Thread.State: BLOCKED (on object monitor) >> > JavaThread state: _thread_blocked >> > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint >> _has_called_back 0 _at_p >> > oll_safepoint 0 >> > JavaThread state: _thread_blocked >> > at java.lang.Object.wait(Native Method) >> > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > at java.lang.Object.wait(Object.java:502) >> > at SMW_WorkerThread.run(StressMonitorWait.java:103) >> > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting >> for monitor e >> > ntry [0xfffffd7fc0130000] >> > java.lang.Thread.State: BLOCKED (on object monitor) >> > JavaThread state: _thread_blocked >> > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint >> _has_called_back 0 _at_p >> > oll_safepoint 0 >> > JavaThread state: _thread_blocked >> > at SMW_WorkerThread.run(StressMonitorWait.java:120) >> > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting >> for monitor e >> > ntry [0xfffffd7fc002f000] >> > java.lang.Thread.State: BLOCKED (on object monitor) >> > JavaThread state: _thread_blocked >> > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint >> _has_called_back 0 _at_p >> > oll_safepoint 0 >> > JavaThread state: _thread_blocked >> > at SMW_WorkerThread.run(StressMonitorWait.java:139) >> > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > Key symptoms in thread T1: >> > >> > - had the object locked: >> > >> > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > - did an Object.wait(): >> > >> > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > - is blocked on 
reentry: >> > >> > waiting for monitor entry [0xfffffd7fc0231000] >> > >> > Key symptoms in thread T2: >> > >> > - is blocked waiting to lock the object: >> > >> > waiting for monitor entry [0xfffffd7fc0130000] >> > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > Key symptoms in thread T3: >> > >> > - is blocked waiting to lock the object: >> > >> > waiting for monitor entry [0xfffffd7fc002f000] >> > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> >> >> > From daniel.daugherty at oracle.com Tue Feb 25 16:28:25 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 25 Feb 2014 17:28:25 -0700 Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <530D3358.5090304@oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <530CBED8.4050509@oracle.com> <530D3358.5090304@oracle.com> Message-ID: <530D3529.9080007@oracle.com> Thanks for the review! Dan On 2/25/14 5:20 PM, serguei.spitsyn at oracle.com wrote: > Dan, > > The fix looks good to me. > I like the comments. > They help to understand this aspect of the protocol. > > Thanks, > Serguei > > > On 2/25/14 8:03 AM, Daniel D. Daugherty wrote: >> Ping! Still haven't heard from anyone on this backport... >> >> Dan >> >> >> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> This is a code review request for the JDK8u-hs-dev backport of the >>> following ObjectMonitor-JVM/TI hang fix: >>> >>> 8028073 race condition in ObjectMonitor implementation causing >>> deadlocks >>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>> >>> Here is the JDK8u-hs-dev webrev URL: >>> >>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ >>> >>> This is _almost_ a straight forward backport of the JDK9 fix. 
The only >>> difference to the fix was discussed at the end of the JDK9 review and >>> was determined to only be needed in versions of HotSpot without the >>> fix for 8028280: >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html >>> >>> >>> 8028280 has not yet been backported to JDK8u-hs-dev. >>> >>> The easiest way to review the backport is to download the two patch >>> files from the webrevs and compare them with something like: >>> >>> jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch >>> >>> The same testing has been performed on the JDK8u-hs-dev version as >>> with the JDK9-hs-runtime version. >>> >>> Thanks, in advance, for any comments, questions or suggestions. >>> >>> Dan
From daniel.daugherty at oracle.com Tue Feb 25 16:48:07 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 25 Feb 2014 17:48:07 -0700 Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to In-Reply-To: References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> Message-ID: <530D39C7.6080009@oracle.com> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to() switch looks like the right thing to do. The usual questions: - why wasn't this failure mode seen before JDK8? - was this failure caught somewhere else before JDK8 and changes in JDK8 exposed a new code path? Reasoning about this from a 30,000 foot view, I don't see any reason why you can't redefine a class that has a constant pool ref that refers to a class in error. You won't be able to use the error'ed class, but there's no reason it can't be in there... Or does that violate the rule that you can't redefine a class that isn't fully linked (what ever that means...)??? So what does your new test on JDK7 or JDK6? Just curious... Dan On 2/24/14 2:42 AM, Markus Gronlund wrote: > Hi Staffan, > > I would think this is the correct fix.
> > The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp. > > In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass - ConstantPool::resolve_constant_at_impl() for example. > > Thanks > Markus > > > -----Original Message----- > From: Staffan Larsen > Sent: den 21 februari 2014 15:11 > To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net > Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to > > This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path. > > I have also added a test that causes a JVM crash without the fix. > > The updates to the test library is all code copied from the jdk version of the test library. > > webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8035150 > > Thanks, > /Staffan From christian.thalinger at oracle.com Tue Feb 25 17:55:20 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 25 Feb 2014 17:55:20 -0800 Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <530D007D.4040602@oracle.com> References: <530D007D.4040602@oracle.com> Message-ID: <254DB943-0858-49A3-8BF3-3323F9BA4722@oracle.com> Looks good. 
On Feb 25, 2014, at 12:43 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6471769 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 > > Summary: > > This is another Test Stabilization issue. > The fix is very similar to other JVMTI stabilization fixes. > It is to use safepoints for updating the PopFrame data instead of relying on the > suspend equivalent condition mechanism (JvmtiEnv::is_thread_fully_suspended()) > which is not adequate from the reliability point of view. > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi > > > Thanks, > Serguei > From serguei.spitsyn at oracle.com Tue Feb 25 18:11:10 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 25 Feb 2014 18:11:10 -0800 Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <254DB943-0858-49A3-8BF3-3323F9BA4722@oracle.com> References: <530D007D.4040602@oracle.com> <254DB943-0858-49A3-8BF3-3323F9BA4722@oracle.com> Message-ID: <530D4D3E.3030707@oracle.com> Thanks, Christian! Serguei On 2/25/14 5:55 PM, Christian Thalinger wrote: > Looks good. > > On Feb 25, 2014, at 12:43 PM, serguei.spitsyn at oracle.com wrote: > >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6471769 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >> >> Summary: >> >> This is another Test Stabilization issue. >> The fix is very similar to other JVMTI stabilization fixes. >> It is to use safepoints for updating the PopFrame data instead of relying on the >> suspend equivalent condition mechanism (JvmtiEnv::is_thread_fully_suspended()) >> which is not adequate from the reliability point of view. 
>> >> Testing: >> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >> >> >> Thanks, >> Serguei >> From staffan.larsen at oracle.com Wed Feb 26 00:31:59 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 26 Feb 2014 09:31:59 +0100 Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to In-Reply-To: <530D39C7.6080009@oracle.com> References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> <530D39C7.6080009@oracle.com> Message-ID: On 26 feb 2014, at 01:48, Daniel D. Daugherty wrote: > I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with > JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to() > switch looks like the right thing to do. Good - thanks. > > The usual questions: > > - why wasn't this failure mode seen before JDK8? No tests for this ? ;) > - was this failure caught somewhere else before JDK8 and changes > in JDK8 exposed a new code path? > > Reasoning about this from a 30,000 foot view, I don't see any reason > why you can't redefine a class that has a constant pool ref that > refers to a class in error. You won't be able to use the error'ed > class, but there's no reason it can't be in there... Or does that > violate the rule that you can't redefine a class that isn't fully > linked (what ever that means...)??? > > So what does your new test on JDK7 or JDK6? Just curious... The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I don't know why it passes on jdk7, do you think it's important to track it down? /Staffan > > Dan > > > On 2/24/14 2:42 AM, Markus Gronlund wrote: >> Hi Staffan, >> >> I would think this is the correct fix.
>> >> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp. >> >> In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass - ConstantPool::resolve_constant_at_impl() for example. >> >> Thanks >> Markus >> >> >> -----Original Message----- >> From: Staffan Larsen >> Sent: den 21 februari 2014 15:11 >> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net >> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to >> >> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path. >> >> I have also added a test that causes a JVM crash without the fix. >> >> The updates to the test library is all code copied from the jdk version of the test library. >> >> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8035150 >> >> Thanks, >> /Staffan > From daniel.daugherty at oracle.com Wed Feb 26 06:03:55 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 26 Feb 2014 07:03:55 -0700 Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to In-Reply-To: References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> <530D39C7.6080009@oracle.com> Message-ID: <530DF44B.7060209@oracle.com> On 2/26/14 1:31 AM, Staffan Larsen wrote: > On 26 feb 2014, at 01:48, Daniel D. Daugherty wrote: > >> I concur with Markus. 
Pairing JVM_CONSTANT_UnresolvedClassInError with >> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to() >> switch looks like the right thing to do. > Good - thanks. > >> The usual questions: >> >> - why wasn't this failure mode seen before JDK8? > No tests for this ? ;) I should have been more clear... :-) Why hasn't the NetBeans profiler run into this before? That profiler is a wonderful test for the RedefineClasses/RetransformClasses stuff... > >> - was this failure caught somewhere else before JDK8 and changes >> in JDK8 exposed a new code path? >> >> Reasoning about this from a 30,000 foot view, I don't see any reason >> why you can't redefine a class that has a constant pool ref that >> refers to a class in error. You won't be able to use the error'ed >> class, but there's no reason it can't be in there... Or does that >> violate the rule that you can't redefine a class that isn't fully >> linked (what ever that means...)??? >> >> So what does your new test on JDK7 or JDK6? Just curious... > The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I don't know why it passes on jdk7, do you think it's important to track it down? The fact that it passes on JDK7 is the useful piece of data. Figuring out why is much less important. BTW, which JDK7 version? One of the updates or GA/FCS? Dan > > /Staffan > >> Dan >> >> >> On 2/24/14 2:42 AM, Markus Gronlund wrote: >>> Hi Staffan, >>> >>> I would think this is the correct fix. >>> >>> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp.
>>> >>> In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass - ConstantPool::resolve_constant_at_impl() for example. >>> >>> Thanks >>> Markus >>> >>> >>> -----Original Message----- >>> From: Staffan Larsen >>> Sent: den 21 februari 2014 15:11 >>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net >>> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to >>> >>> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path. >>> >>> I have also added a test that causes a JVM crash without the fix. >>> >>> The updates to the test library is all code copied from the jdk version of the test library. >>> >>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150 >>> >>> Thanks, >>> /Staffan From staffan.larsen at oracle.com Wed Feb 26 06:15:43 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 26 Feb 2014 15:15:43 +0100 Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to In-Reply-To: <530DF44B.7060209@oracle.com> References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> <530D39C7.6080009@oracle.com> <530DF44B.7060209@oracle.com> Message-ID: On 26 feb 2014, at 15:03, Daniel D. Daugherty wrote: > On 2/26/14 1:31 AM, Staffan Larsen wrote: >> On 26 feb 2014, at 01:48, Daniel D. Daugherty wrote: >> >>> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with >>> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to() >>> switch looks like the right thing to do. >> Good - thanks. >> >>> The usual questions: >>> >>> - why wasn't this failure mode seen before JDK8? >> No tests for this ? ;) > > I should have been more clear... 
:-) Why hasn't the NetBeans profiler > run into this before? That profiler is a wonderful test for the > RedefineClasses/RetransformClasses stuff... Ah, ok. No idea... > > >> >>> - was this failure caught somewhere else before JDK8 and changes >>> in JDK8 exposed a new code path? >>> >>> Reasoning about this from a 30,000 foot view, I don't see any reason >>> why you can't redefine a class that has a constant pool ref that >>> refers to a class in error. You won't be able to use the error'ed >>> class, but there's no reason it can't be in there... Or does that >>> violate the rule that you can't redefine a class that isn't fully >>> linked (what ever that means...)??? >>> >>> So what does your new test on JDK7 or JDK6? Just curious... >> The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I don't know why it passes on jdk7, do you think it's important to track it down? > > The fact that it passes on JDK7 is the useful piece of data. > Figuring out why is much less important. BTW, which JDK7 > version? One of the updates or GA/FCS? I used 7u45, but now I tested with 7u4 as well - passes there, too. Are you ok with pushing the change? Thanks, /Staffan > > Dan > > >> >> /Staffan >> >>> Dan >>> >>> >>> On 2/24/14 2:42 AM, Markus Gronlund wrote: >>>> Hi Staffan, >>>> >>>> I would think this is the correct fix. >>>> >>>> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp. >>>> >>>> In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass - ConstantPool::resolve_constant_at_impl() for example.
>>>> >>>> Thanks >>>> Markus >>>> >>>> >>>> -----Original Message----- >>>> From: Staffan Larsen >>>> Sent: den 21 februari 2014 15:11 >>>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net >>>> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to >>>> >>>> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path. >>>> >>>> I have also added a test that causes a JVM crash without the fix. >>>> >>>> The updates to the test library is all code copied from the jdk version of the test library. >>>> >>>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/ >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150 >>>> >>>> Thanks, >>>> /Staffan From taras.ledkov at oracle.com Wed Feb 26 06:24:28 2014 From: taras.ledkov at oracle.com (taras ledkov) Date: Wed, 26 Feb 2014 18:24:28 +0400 Subject: Review request for 7195249: Some jtreg tests use hard coded ports In-Reply-To: <530CA007.5080006@oracle.com> References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> <52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> <52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com> <53049009.9020204@oracle.com> <530CA007.5080006@oracle.com> Message-ID: <530DF91C.5010000@oracle.com> Hi, Alan, Mandy could you please review the fix: https://bugs.openjdk.java.net/browse/JDK-7195249. I had the discussion with Jaroslav and Staffan and they have approved my fix, but they are not reviewers.
Webrev for jdk part: http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ Webrev for hs part: http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ On 25.02.2014 17:52, Jaroslav Bachorik wrote: > Thumbs up. (not a "reviewer", though) > > -JB- > > On 19.2.2014 12:05, taras ledkov wrote: >> Hi, >> >> Imports are fixed: >> >> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ >> >> On 05.02.2014 17:20, Jaroslav Bachorik wrote: >>> Hi Taras, >>> >>> thanks for taking care of this. >>> >>> The changes look fine to me. >>> >>> One minor nit is unused imports of the library classes in >>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". >>> >>> >>> It does not use any of those classes as its base class >>> "AbstractFilePermissionTest" does all the heavy lifting. >>> >>> Cheers, >>> >>> -JB- >>> >>> On 5.2.2014 13:42, taras ledkov wrote: >>>> Hi, >>>> >>>> So please take a look at the review against JDK9. >>>> The reviewed patch had not been integrated into JDK8. >>>> >>>> Port to JDK9 is identical. The difference: the ProcessTools.java has >>>> been already patched by Jaroslav. >>>> >>>> Webrev for jdk part: >>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ >>>> >>>> Webrev for hs part: >>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >>>> >>>> >>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote: >>>>> Hi Taras, >>>>> >>>>> On 21.1.2014 10:30, taras ledkov wrote: >>>>>> Hi Jaroslav, >>>>>> >>>>>> Could you please review the last changes? >>>>>> Are you OK? >>>>> >>>>> Yes, the change looks ok. But I think we will need to get back to this >>>>> problem eventually and implement a central port dispatcher if we >>>>> want to >>>>> be 100% sure the port conflicts wouldn't occur. But your changes >>>>> reduce >>>>> the chance significantly. >>>>> >>>>> Thanks for taking care of this. >>>>> >>>>> -JB- >>>>> >>>>>> >>>>>> On 20.01.2014 19:21, Staffan Larsen wrote: >>>>>>> Sorry for not replying earlier. 
Yes, I'm ok with these changes. >>>>>>> >>>>>>> Thanks, >>>>>>> /Staffan >>>>>>> >>>>>>> On 20 jan 2014, at 16:07, taras ledkov >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Staffan, >>>>>>>> >>>>>>>> I fixed the tests according with your comments. >>>>>>>> Are you OK? >>>>>>>> >>>>>>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Please take a look at the new review. >>>>>>>>> >>>>>>>>> Webrev for jdk part: >>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>>>>>> >>>>>>>>> Webrev for hs part: >>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>>>>>> >>>>>>>>> My answers are inline: >>>>>>>>> >>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>>>>>> Hi Taras, >>>>>>>>>> >>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java. >>>>>>>>>> Here's a couple of comments: >>>>>>>>>> >>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>>>>>> This test could be simplified by not specifying an address at >>>>>>>>>> all. >>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp, >>>>>>>>>> there >>>>>>>>>> is no reason to specify an address. If address is unspecified >>>>>>>>>> (and >>>>>>>>>> server=y), the connector will pick an address and print it to the >>>>>>>>>> command line. Thus the only change that needs to be done is to >>>>>>>>>> remove >>>>>>>>>> ",address=8888" from the @run command. >>>>>>>>> fixed >>>>>>>>> >>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>>>>>> These tests do not compile cleanly with an empty JTwork >>>>>>>>>> directory. It >>>>>>>>>> seems that having one @build for each class does not work well - >>>>>>>>>> when >>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving >>>>>>>>>> all >>>>>>>>>> classes to one @build statement solved this problem for me.
>>>>>>>>> fixed >>>>>>>>> >>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>>>>>> 187 Future stdoutTask = stdout.process(); >>>>>>>>>> 188 Future stderrTask = stderr.process(); >>>>>>>>>> The stdoutTask and stderrTask variables are unused. >>>>>>>>> fixed >>>>>>>>> >>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>>>>>> At first I thought something was wrong with this file - the >>>>>>>>>> diff is >>>>>>>>>> very weird. Then I realized you renamed an old file and created a >>>>>>>>>> new >>>>>>>>>> file using the old name. >>>>>>>>> You are right. I did it to keep the test name. >>>>>>>>> >>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks >>>>>>>>>> like >>>>>>>>>> you >>>>>>>>>> delete the files at the beginning of the test in any case. >>>>>>>>> I think yes. In the first place, this functionality was in the old >>>>>>>>> code. >>>>>>>>> In the second place, a file without write permission may be a >>>>>>>>> problem >>>>>>>>> for a further cleanup (not by the test, for example for the tests >>>>>>>>> launcher scripts etc.) >>>>>>>>> >>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest" >>>>>>>>>> confusing. They are both Paths. One is used directly by the >>>>>>>>>> sub-classes, the other has a getter method. >>>>>>>>> fixed >>>>>>>>> >>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an >>>>>>>>>> ex.printStackTrace(). >>>>>>>>>> (Same thing for all other places where you call >>>>>>>>>> Integer.parseInt()) >>>>>>>>> fixed >>>>>>>>> >>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>>>>>> This file is never used as far as I can see.
>>>>>>>>> It is used by PasswordFilePermissionTest & >>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>>>>>> AbstractFilePermissionTest : 162). >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> /Staffan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> Please take a look at the review with fixed issues about >>>>>>>>>>> trying to >>>>>>>>>>> launch test that needs free port several times. >>>>>>>>>>> >>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/ >>>>>>>>>>> >>>>>>>>>>> Webrev for hs part: >>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/ >>>>>>>>>>> >>>>>>>>>>> Pay your attention to new method >>>>>>>>>>> ProcessTools.startProcess(String, >>>>>>>>>>> ProcessBuilder, Consumer) that is used to analyze all >>>>>>>>>>> output >>>>>>>>>>> of a sub-process. It has common part with >>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>>> Predicate, >>>>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment. >>>>>>>>>>> >>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>>> Predicate, long, TimeUnit) may be changed by adding >>>>>>>>>>> LinePump >>>>>>>>>>> to stderr if there is not serious reason for restricting the >>>>>>>>>>> warm-up >>>>>>>>>>> analysis to stdout stream. >>>>>>>>>>> >>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote: >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> I've consulted with Serviceability engineers (add them to CC >>>>>>>>>>>> list) and >>>>>>>>>>>> they would like to see tests to solve these problem so far: >>>>>>>>>>>> >>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Katja >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote: >>>>>>>>>>>>> Guys. >>>>>>>>>>>>> >>>>>>>>>>>>> Let me try to sum up what was said before and may be suggest a >>>>>>>>>>>>> compromise. >>>>>>>>>>>>> >>>>>>>>>>>>> 1. There is a desire to have a support port allocation on the >>>>>>>>>>>>> level of >>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that >>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it >>>>>>>>>>>>> is a >>>>>>>>>>>>> test harness API or a library API does not really matter from >>>>>>>>>>>>> usage >>>>>>>>>>>>> point of view. >>>>>>>>>>>>> >>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable, >>>>>>>>>>>>> whatever >>>>>>>>>>>>> port >>>>>>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>>>>>> perform >>>>>>>>>>>>> the test logic with different ports until the test succeeds. >>>>>>>>>>>>> >>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of >>>>>>>>>>>>> course, >>>>>>>>>>>>> but >>>>>>>>>>>>> better be used in conjunction with a meaningful port selection >>>>>>>>>>>>> algorithm. >>>>>>>>>>>>> >>>>>>>>>>>>> At the same time, copying a loop-until-success login from one >>>>>>>>>>>>> test to >>>>>>>>>>>>> another may be not the best solution. Library could help with >>>>>>>>>>>>> that I >>>>>>>>>>>>> believe. There only need to be an API method which takes >>>>>>>>>>>>> behavior as a >>>>>>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>>>>>> public runOnAFreePort(Function) >>>>>>>>>>>>> or similar. There could be arguments of how/whether to >>>>>>>>>>>>> implement >>>>>>>>>>>>> it, >>>>>>>>>>>>> the solution would not work for shell tests, etc, but still >>>>>>>>>>>>> ... >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>>>>>> >>>>>>>>>>>>> 1. Integrate tests as is. 
Get to it later after reaching >>>>>>>>>>>>> agreement in >>>>>>>>>>>>> the library, etc. >>>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate >>>>>>>>>>>>> the >>>>>>>>>>>>> changes. >>>>>>>>>>>>> >>>>>>>>>>>>> Please let us know which one is closer to your heart. >>>>>>>>>>>>> >>>>>>>>>>>>> I personally prefer #1 for the reason that the changes already >>>>>>>>>>>>> supposed to make the tests more stable and also there are many >>>>>>>>>>>>> more >>>>>>>>>>>>> tests tests which use ports, so the scope of the problem is >>>>>>>>>>>>> bigger >>>>>>>>>>>>> than these. >>>>>>>>>>>>> >>>>>>>>>>>>> Shura >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort() >>>>>>>>>>>>>> does not >>>>>>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>>>>>> process. >>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However, >>>>>>>>>>>>>> there is a >>>>>>>>>>>>>> solution. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please, look at the >>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to >>>>>>>>>>>>>> start a >>>>>>>>>>>>>> process with a free port and then check if >>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has >>>>>>>>>>>>>> been >>>>>>>>>>>>>> thrown. >>>>>>>>>>>>>> If yes, you have to retry. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Katja >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> library responsibility to implements it, would not you >>>>>>>>>>>>>>> agree? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is. 
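A rough sketch, in Java, of the retry helper proposed above (an API method that takes the test behavior as a parameter and reruns it until it succeeds). None of this comes from the webrevs: the class PortRetry, the interface PortAction, the method signature and the maxAttempts parameter are made up for illustration. It picks a random port in the 49152-65535 range mentioned in this thread and assumes the test logic reports a conflict as java.net.BindException; a test that sees the conflict some other way (for example the java.rmi.server.ExportException check in JstatdTest) would need its own detection.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; not part of jdk.testlibrary or of the reviewed patch.
final class PortRetry {

    /** Test logic that needs a port and throws BindException if it is taken. */
    interface PortAction<T> {
        T run(int port) throws IOException;
    }

    // Dynamic/private port range referred to in the discussion.
    static final int MIN_PORT = 49152;
    static final int MAX_PORT = 65535;

    /** Runs the action on randomly chosen ports until it succeeds or attempts run out. */
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts) throws IOException {
        BindException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return action.run(port);
            } catch (BindException e) {
                last = e; // port was busy: pick another one and try again
            }
        }
        throw new IOException("no usable port after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        // The "test logic" here just opens and closes a server socket on the given port.
        Integer used = runOnAFreePort(port -> {
            try (ServerSocket server = new ServerSocket(port)) {
                return server.getLocalPort();
            }
        }, 5);
        System.out.println("test logic ran on port " + used);
    }
}
```

The point of passing the behavior in is that the bind attempt itself is the check, so there is no window between "port looked free" and "port is used".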
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we >>>>>>>>>>>>>>> could do >>>>>>>>>>>>>>> it later in the library. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>> Roger, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the >>>>>>>>>>>>>>>>> port is >>>>>>>>>>>>>>>>> free. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>>>>>> accessible >>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>>>>>> discussions >>>>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and >>>>>>>>>>>>>>>>> SO_LINGER >>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>> BSD. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So from stability point of view it's better to just return >>>>>>>>>>>>>>>>> random >>>>>>>>>>>>>>>>> number >>>>>>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When >>>>>>>>>>>>>>>> there are >>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port, >>>>>>>>>>>>>>>> nothing >>>>>>>>>>>>>>>> prevents the random function to return the same port to >>>>>>>>>>>>>>>> all of >>>>>>>>>>>>>>>> them. >>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and >>>>>>>>>>>>>>>> cause >>>>>>>>>>>>>>>> problems with timing when a port used by a previous test is >>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>> fully >>>>>>>>>>>>>>>> ready to be assigned to a different socket. 
And as Dmitry >>>>>>>>>>>>>>>> pointed out >>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it >>>>>>>>>>>>>>>> later >>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated >>>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>> remain >>>>>>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating >>>>>>>>>>>>>>>> service >>>>>>>>>>>>>>>> provided >>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> chance >>>>>>>>>>>>>>>> of intermittent failures due to a port being in use. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -JB- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>>>>>> IOException { >>>>>>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>>>>>> 143 >>>>>>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>>>>>> 146 >>>>>>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>>>>>> 148 try { >>>>>>>>>>>>>>>>> 149 serverSocket = new ServerSocket(0); >>>>>>>>>>>>>>>>> 150 port = serverSocket.getLocalPort(); >>>>>>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>>>>>> 153 } >>>>>>>>>>>>>>>>> 154 } >>>>>>>>>>>>>>>>> 155 >>>>>>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>>>>>> 157 } >>>>>>>>>>>>>>>>> 158 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> fyi, The jdk.testlibrary.Utils.getFreePort() method will >>>>>>>>>>>>>>>>>> Open an >>>>>>>>>>>>>>>>>> free >>>>>>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>>>>>> the port number. 
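For reference, a hedged variant of the probe quoted in [1]. In the quoted code, if new ServerSocket(0) throws, serverSocket is still null when the finally block runs, so close() would raise a NullPointerException and hide the original IOException. try-with-resources avoids that. This is only a sketch: FreePortProbe is a made-up name, and the sleep and loop of the original are dropped. It does nothing about the race described in this thread, since the port can be taken by someone else as soon as the probe socket is closed.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical class name; mirrors the idea of Utils.getFreePort() only loosely.
final class FreePortProbe {

    /** Asks the OS for an unused port, closes the probe socket and returns the number. */
    static int getFreePort() throws IOException {
        // Port 0 lets the system assign a port; the socket is closed on exit
        // from the try block, even if getLocalPort() were to fail.
        try (ServerSocket serverSocket = new ServerSocket(0)) {
            return serverSocket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("probed port: " + getFreePort());
    }
}
```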
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> system >>>>>>>>>>>>>>>>>> assign >>>>>>>>>>>>>>>>>> the port #. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Roger >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>>>>>> 2. Open socket >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass >>>>>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it >>>>>>>>>>>>>>>>>>> has no >>>>>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>>>>>> as soon as you close socket someone can take the port. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So just choose a random number within the range above >>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>> let >>>>>>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>>>>>> code opening socket to handle port conflict. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. 
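A minimal sketch of the "return the socket, not the number" idea from the steps above, combined with the advice to pass 0 and let the system assign the port. HeldPort and openOnAnyPort are hypothetical names, and binding to the loopback address is an assumption made here to keep the example local. It only helps when the test itself owns the listening socket; it does not help when the port number has to be handed to another process as a property value.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper for illustration only.
final class HeldPort {

    /**
     * Binds to a system-assigned port and returns the still-open socket.
     * The caller keeps the port for as long as the socket stays open
     * and is responsible for closing it.
     */
    static ServerSocket openOnAnyPort() throws IOException {
        return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket held = openOnAnyPort()) {
            // The port number is read from the live socket instead of being probed first.
            System.out.println("holding port " + held.getLocalPort());
        }
    }
}
```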
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> There are two webrevs: >>>>>>>>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Please take a look at some notes: >>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & >>>>>>>>>>>>>>>>>>>> Jaroslav >>>>>>>>>>>>>>>>>>>> Bachorik >>>>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & >>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>>>>>> looked >>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for >>>>>>>>>>>>>>>>>>>> them: >>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've >>>>>>>>>>>>>>>>>>>> renamed to >>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace >>>>>>>>>>>>>>>>>>>> old >>>>>>>>>>>>>>>>>>>> shell >>>>>>>>>>>>>>>>>>>> script >>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called >>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java, >>>>>>>>>>>>>>>>>>>> hence the >>>>>>>>>>>>>>>>>>>> huge >>>>>>>>>>>>>>>>>>>> diff. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines >>>>>>>>>>>>>>>>>>>> similar >>>>>>>>>>>>>>>>>>>> to the >>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided >>>>>>>>>>>>>>>>>>>> to not >>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. 
Please >>>>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>>>> know if >>>>>>>>>>>>>>>>>>>> this is somehow not acceptable >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is >>>>>>>>>>>>>>>>>>>> added to >>>>>>>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>>>>>>> repository is taken from this patch: >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test >>>>>>>>>>>>>>>>>>>> library >>>>>>>>>>>>>>>>>>>> process >>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance >>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html) >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> With best regards, >>>>>>>>>>> Taras Ledkov >>>>>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>>>>> skype: taras_ledkov >>>>>>>>>>> Phone: 7(812)3346-157 >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> With best regards, >>>>>>>> Taras Ledkov >>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>> skype: taras_ledkov >>>>>>>> Phone: 7(812)3346-157 >>>>>>> >>>>>> >>>>> >>>> >>> >> > -- With best regards, Taras Ledkov Mail-To: 
taras.ledkov at oracle.com skype: taras_ledkov Phone: 7(812)3346-157 From daniel.daugherty at oracle.com Wed Feb 26 06:28:33 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 26 Feb 2014 07:28:33 -0700 Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to In-Reply-To: References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> <530D39C7.6080009@oracle.com> <530DF44B.7060209@oracle.com> Message-ID: <530DFA11.90303@oracle.com> On 2/26/14 7:15 AM, Staffan Larsen wrote: > > On 26 feb 2014, at 15:03, Daniel D. Daugherty > > wrote: > >> On 2/26/14 1:31 AM, Staffan Larsen wrote: >>> On 26 feb 2014, at 01:48, Daniel D. Daugherty >>> > >>> wrote: >>> >>>> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with >>>> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to() >>>> switch looks like the right thing to do. >>> Good - thanks. >>> >>>> The usual questions: >>>> >>>> - why wasn't this failure mode seen before JDK8? >>> No tests for this ? ;) >> >> I should have been more clear... :-) Why hasn't the NetBeans profiler >> run into this before? That profiler is a wonderful test for the >> RedefineClasses/RetransformClasses stuff? > > Ah, ok. No idea... > >> >> >>> >>>> - was this failure caught somewhere else before JDK8 and changes >>>> in JDK8 exposed a new code path? >>>> >>>> Reasoning about this from a 30,000 foot view, I don't see any reason >>>> why you can't redefine a class that has a constant pool ref that >>>> refers to a class in error. You won't be able to use the error'ed >>>> class, but there's no reason it can't be in there... Or does that >>>> violate the rule that you can't redefine a class that isn't fully >>>> linked (what ever that means...)??? >>>> >>>> So what does your new test on JDK7 or JDK6? Just curious? >>> The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I >>> don't know why it passes on jdk7, do you think it's important to >>> track it down?
>> >> The fact that it passes on JDK7 is the useful piece of data. >> Figuring out why is much less important. BTW, which JDK7 >> version? One of the updates or GA/FCS? > > I used 7u45, but now I tested with 7u4 as well - passes there, too. Sounds like the change/breakage is limited to JDK8 so that's a relief. > > Are you ok with pushing the change? Very much so. Dan > > Thanks, > /Staffan > >> >> Dan >> >> >>> >>> /Staffan >>> >>>> Dan >>>> >>>> >>>> On 2/24/14 2:42 AM, Markus Gronlund wrote: >>>>> Hi Staffan, >>>>> >>>>> I would think this is the correct fix. >>>>> >>>>> The other two constant pool "error" tags, besides >>>>> UnresolvedClassInError, which signal constant pool resolution >>>>> errors are MethodTypeInError and MethodHandleInError - these error >>>>> tags are associated with their corresponding "success" tags in >>>>> switch targets in ConstantPool::copy_entry_to(), as well as in >>>>> additional routines in constantPool.cpp. >>>>> >>>>> In addition, in other routines in ConstantPool.cpp, the error tag >>>>> JVM_CONSTANT_UnresolvedClassInError is associated with >>>>> JVM_CONSTANT_UnresolvedClass - >>>>> ConstantPool::resolve_constant_at_impl() for example. >>>>> >>>>> Thanks >>>>> Markus >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Staffan Larsen >>>>> Sent: den 21 februari 2014 15:11 >>>>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net >>>>> >>>>> serviceability-dev at openjdk.java.net >>>>> >>>>> Subject: RFR: 8035150 ShouldNotReachHere() in >>>>> ConstantPool::copy_entry_to >>>>> >>>>> This is an attempt to solve a crash while redefining a class that >>>>> has unresolved class references in its constant pool. I would >>>>> appreciate some extra scrutiny here since I am unfamiliar with >>>>> this code path. >>>>> >>>>> I have also added a test that causes a JVM crash without the fix. >>>>> >>>>> The updates to the test library is all code copied from the jdk >>>>> version of the test library. 
>>>>> >>>>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/ >>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150 >>>>> >>>>> Thanks, >>>>> /Staffan > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140226/79157738/attachment-0001.html From staffan.larsen at oracle.com Wed Feb 26 06:33:22 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 26 Feb 2014 15:33:22 +0100 Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to In-Reply-To: <530DFA11.90303@oracle.com> References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com> <530D39C7.6080009@oracle.com> <530DF44B.7060209@oracle.com> <530DFA11.90303@oracle.com> Message-ID: <4BECFB25-0704-40DE-AFC7-0948DD6531AF@oracle.com> Thanks Dan and Markus! On 26 feb 2014, at 15:28, Daniel D. Daugherty wrote: > On 2/26/14 7:15 AM, Staffan Larsen wrote: >> >> On 26 feb 2014, at 15:03, Daniel D. Daugherty wrote: >> >>> On 2/26/14 1:31 AM, Staffan Larsen wrote: >>>> On 26 feb 2014, at 01:48, Daniel D. Daugherty wrote: >>>> >>>>> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with >>>>> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to() >>>>> switch looks like the right thing to do. >>>> Good - thanks. >>>> >>>>> The usual questions: >>>>> >>>>> - why wasn't this failure mode seen before JDK8? >>>> No tests for this ? ;) >>> >>> I should have been more clear... :-) Why hasn't the NetBeans profiler >>> run into this before? That profiler is a wonderful test for the >>> RedefineClasses/RetransformClasses stuff? >> >> Ah, ok. No idea... >> >>> >>> >>>> >>>>> - was this failure caught somewhere else before JDK8 and changes >>>>> in JDK8 exposed a new code path? >>>>> >>>>> Reasoning about this from a 30,000 foot view, I don't see any reason >>>>> why you can't redefine a class that has a constant pool ref that >>>>> refers to a class in error. 
You won't be able to use the error'ed >>>>> class, but there's no reason it can't be in there... Or does that >>>>> violate the rule that you can't redefine a class that isn't fully >>>>> linked (what ever that means...)??? >>>>> >>>>> So what does your new test on JDK7 or JDK6? Just curious? >>>> The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I don't know why it passes on jdk7, do you think it's important to track it down? >>> >>> The fact that it passes on JDK7 is the useful piece of data. >>> Figuring out why is much less important. BTW, which JDK7 >>> version? One of the updates or GA/FCS? >> >> I used 7u45, but now I tested with 7u4 as well - passes there, too. > > Sounds like the change/breakage is limited to JDK8 so that's > a relief. > > >> >> Are you ok with pushing the change? > > Very much so. > > Dan > > >> >> Thanks, >> /Staffan >> >>> >>> Dan >>> >>> >>>> >>>> /Staffan >>>> >>>>> Dan >>>>> >>>>> >>>>> On 2/24/14 2:42 AM, Markus Gronlund wrote: >>>>>> Hi Staffan, >>>>>> >>>>>> I would think this is the correct fix. >>>>>> >>>>>> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp. >>>>>> >>>>>> In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass - ConstantPool::resolve_constant_at_impl() for example.
>>>>>> >>>>>> Thanks >>>>>> Markus >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: Staffan Larsen >>>>>> Sent: den 21 februari 2014 15:11 >>>>>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net >>>>>> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to >>>>>> >>>>>> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path. >>>>>> >>>>>> I have also added a test that causes a JVM crash without the fix. >>>>>> >>>>>> The updates to the test library is all code copied from the jdk version of the test library. >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/ >>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150 >>>>>> >>>>>> Thanks, >>>>>> /Staffan >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140226/6df7515d/attachment.html From staffan.larsen at oracle.com Wed Feb 26 06:36:11 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 26 Feb 2014 15:36:11 +0100 Subject: Review request for 7195249: Some jtreg tests use hard coded ports In-Reply-To: <530DF91C.5010000@oracle.com> References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> <52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> <52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com> <53049009.9020204@oracle.com> <530CA007.5080006@oracle.com> <530DF91C.5010000@oracle.com> Message-ID: <2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com> On 26 feb 2014, at 15:24, taras ledkov wrote: > Hi, > > Alan, Mandy could you please review the fix: https://bugs.openjdk.java.net/browse/JDK-7195249. > > I had the discussion with Jaroslav and Staffan and they have approved my fix, but they are not reviewers. I am a Reviewer. Thanks, /Staffan > > Webrev for jdk part: > http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ > > Webrev for hs part: > http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ > > On 25.02.2014 17:52, Jaroslav Bachorik wrote: >> Thumbs up. (not a "reviewer", though) >> >> -JB- >> >> On 19.2.2014 12:05, taras ledkov wrote: >>> Hi, >>> >>> Imports are fixed: >>> >>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ >>> >>> On 05.02.2014 17:20, Jaroslav Bachorik wrote: >>>> Hi Taras, >>>> >>>> thanks for taking care of this. >>>> >>>> The changes look fine to me. >>>> >>>> One minor nit is unused imports of the library classes in >>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". 
>>>> >>>> >>>> It does not use any of those classes as its base class >>>> "AbstractFilePermissionTest" does all the heavy lifting. >>>> >>>> Cheers, >>>> >>>> -JB- >>>> >>>> On 5.2.2014 13:42, taras ledkov wrote: >>>>> Hi, >>>>> >>>>> So please take a look at the review against JDK9. >>>>> The reviewed patch had not been integrated into JDK8. >>>>> >>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has >>>>> been already patched by Jaroslav. >>>>> >>>>> Webrev for jdk part: >>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ >>>>> >>>>> Webrev for hs part: >>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >>>>> >>>>> >>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote: >>>>>> Hi Taras, >>>>>> >>>>>> On 21.1.2014 10:30, taras ledkov wrote: >>>>>>> Hi Jaroslav, >>>>>>> >>>>>>> Could you please review the last changes? >>>>>>> Are you OK? >>>>>> >>>>>> Yes, the change looks ok. But I think we will need to get back to this >>>>>> problem eventually and implement a central port dispatcher if we >>>>>> want to >>>>>> be 100% sure the port conflicts wouldn't occur. But your changes >>>>>> reduce >>>>>> the chance significantly. >>>>>> >>>>>> Thanks for taking care of this. >>>>>> >>>>>> -JB- >>>>>> >>>>>>> >>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote: >>>>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> /Staffan >>>>>>>> >>>>>>>> On 20 jan 2014, at 16:07, taras ledkov >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Staffan, >>>>>>>>> >>>>>>>>> I fixed the tests according with your comments. >>>>>>>>> Are you OK? >>>>>>>>> >>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Please take a look at the new review.
>>>>>>>>>> >>>>>>>>>> Webrev for jdk part: >>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>>>>>>> >>>>>>>>>> Webrev for hs part: >>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>>>>>>> >>>>>>>>>> My answers are inline: >>>>>>>>>> >>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>>>>>>> Hi Taras, >>>>>>>>>>> >>>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java. >>>>>>>>>>> Here's a couple of comments: >>>>>>>>>>> >>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>>>>>>> This test could be simplified by not specifying an address at >>>>>>>>>>> all. >>>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp, >>>>>>>>>>> there >>>>>>>>>>> is no reason to specify an address. If address is unspecified >>>>>>>>>>> (and >>>>>>>>>>> server=y), the connector will pick an address and print it to the >>>>>>>>>>> command line. Thus the only change that needs to be done is to >>>>>>>>>>> remove >>>>>>>>>>> ",address=8888" from the @run command. >>>>>>>>>> fixed >>>>>>>>>> >>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>>>>>>> These tests do not compile cleanly with an empty JTwork >>>>>>>>>>> directory. It >>>>>>>>>>> seems that having one @build for each class does not work well - >>>>>>>>>>> when >>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving >>>>>>>>>>> all >>>>>>>>>>> classes to one @build statement solved this problem for me. >>>>>>>>>> fixed >>>>>>>>>> >>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>>>>>>> 187 Future stdoutTask = stdout.process(); >>>>>>>>>>> 188 Future stderrTask = stderr.process(); >>>>>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>>>>>> fixed >>>>>>>>>> >>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>>>>>>> At first I thought something was wrong with this file - the >>>>>>>>>>> diff is >>>>>>>>>>> very weird. Then I realized you renamed an old file and created a >>>>>>>>>>> new >>>>>>>>>>> file using the old name. >>>>>>>>>> You are right. I did it to keep the test name. >>>>>>>>>> >>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>>>>>>> >>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks >>>>>>>>>>> like >>>>>>>>>>> you >>>>>>>>>>> delete the files at the beginning of the test in any case. >>>>>>>>>> I think yes. In the first place, this functionality was in the old >>>>>>>>>> code. >>>>>>>>>> In the second place, a file without write permission may be a >>>>>>>>>> problem >>>>>>>>>> for a further cleanup (not by the test, for example for the tests >>>>>>>>>> launcher scripts etc.) >>>>>>>>>> >>>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest" >>>>>>>>>>> confusing. They are both Paths. One is used directly by the >>>>>>>>>>> sub-classes, the other has a getter method. >>>>>>>>>> fixed >>>>>>>>>> >>>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an >>>>>>>>>>> ex.printStackTrace(). >>>>>>>>>>> (Same thing for all other places where you call >>>>>>>>>>> Integer.parseInt()) >>>>>>>>>> fixed >>>>>>>>>> >>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>>>>>>> This file is never used as far as I can see. >>>>>>>>>> It is used by PasswordFilePermissionTest & >>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>>>>>>> AbstractFilePermissionTest : 162).
>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> /Staffan >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> Please take a look at the review with fixed issues about >>>>>>>>>>>> trying to >>>>>>>>>>>> launch test that needs free port several times. >>>>>>>>>>>> >>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/ >>>>>>>>>>>> >>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/ >>>>>>>>>>>> >>>>>>>>>>>> Pay your attention to new method >>>>>>>>>>>> ProcessTools.startProcess(String, >>>>>>>>>>>> ProcessBuilder, Consumer) that is used to analyze all >>>>>>>>>>>> output >>>>>>>>>>>> of a sub-process. It has common part with >>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>>>> Predicate, >>>>>>>>>>>> long, TumeUnit) that is used to determine the warm-up moment. >>>>>>>>>>>> >>>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>>>> Predicate, long, TumeUnit) may be changed by adding >>>>>>>>>>>> LinePump >>>>>>>>>>>> to stderr if there is not serious reason for restricting the >>>>>>>>>>>> warm-up >>>>>>>>>>>> analysis to stdout stream. >>>>>>>>>>>> >>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote: >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> I've consulted with Serviceability engineers (add them to CC >>>>>>>>>>>>> list) and >>>>>>>>>>>>> they would like to see tests to solve these problem so far: >>>>>>>>>>>>> >>>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Katja >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote: >>>>>>>>>>>>>> Guys. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Let me try to sum up what was said before and may be suggest a >>>>>>>>>>>>>> compromise. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. There is a desire to have a support port allocation on the >>>>>>>>>>>>>> level of >>>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that >>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it >>>>>>>>>>>>>> is a >>>>>>>>>>>>>> test harness API or a library API does not really matter from >>>>>>>>>>>>>> usage >>>>>>>>>>>>>> point of view. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable, >>>>>>>>>>>>>> whatever >>>>>>>>>>>>>> port >>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>>>>>>> perform >>>>>>>>>>>>>> the test logic with different ports until the test succeeds. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of >>>>>>>>>>>>>> course, >>>>>>>>>>>>>> but >>>>>>>>>>>>>> better be used in conjunction with a meaningful port selection >>>>>>>>>>>>>> algorithm. >>>>>>>>>>>>>> >>>>>>>>>>>>>> At the same time, copying a loop-until-success login from one >>>>>>>>>>>>>> test to >>>>>>>>>>>>>> another may be not the best solution. Library could help with >>>>>>>>>>>>>> that I >>>>>>>>>>>>>> believe. There only need to be an API method which takes >>>>>>>>>>>>>> behavior as a >>>>>>>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>>>>>>> public runOnAFreePort(Function) >>>>>>>>>>>>>> or similar. There could be arguments of how/whether to >>>>>>>>>>>>>> implement >>>>>>>>>>>>>> it, >>>>>>>>>>>>>> the solution would not work for shell tests, etc, but still >>>>>>>>>>>>>> ... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching >>>>>>>>>>>>>> agreement in >>>>>>>>>>>>>> the library, etc. >>>>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>>>> 3. 
Wait for the library to be ready and only then integrate >>>>>>>>>>>>>> the >>>>>>>>>>>>>> changes. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please let us know which one is closer to your heart. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I personally prefer #1 for the reason that the changes already >>>>>>>>>>>>>> supposed to make the tests more stable and also there are many >>>>>>>>>>>>>> more >>>>>>>>>>>>>> tests tests which use ports, so the scope of the problem is >>>>>>>>>>>>>> bigger >>>>>>>>>>>>>> than these. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Shura >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort() >>>>>>>>>>>>>>> does not >>>>>>>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>>>>>>> process. >>>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However, >>>>>>>>>>>>>>> there is a >>>>>>>>>>>>>>> solution. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please, look at the >>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to >>>>>>>>>>>>>>> start a >>>>>>>>>>>>>>> process with a free port and then check if >>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has >>>>>>>>>>>>>>> been >>>>>>>>>>>>>>> thrown. >>>>>>>>>>>>>>> If yes, you have to retry. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Katja >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> library responsibility to implements it, would not you >>>>>>>>>>>>>>>> agree? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is. 
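As a rough illustration of the loop-until-success idea discussed above (Shura's proposed runOnAFreePort and Katja's retry when "Port already in use" is reported), a library helper might look something like the sketch below. Everything in it is an assumption made for illustration: the PortRetry class, the PortTask interface, the portSource parameter and the maxAttempts bound are hypothetical, and an in-process java.net.BindException stands in for the java.rmi.server.ExportException that the jstatd test looks for.

```java
import java.io.IOException;
import java.net.BindException;
import java.util.function.IntSupplier;

// Hypothetical helper; no such class exists in the test library discussed here.
final class PortRetry {
    /** Test logic that needs a port and may discover it is already taken. */
    interface PortTask<T> {
        T run(int port) throws IOException;
    }

    /**
     * Runs the task with ports obtained from portSource until it stops
     * failing with BindException, or until maxAttempts is exhausted.
     * Other IOExceptions are treated as real failures and propagate.
     */
    static <T> T runOnAFreePort(IntSupplier portSource, PortTask<T> task, int maxAttempts)
            throws IOException {
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int port = portSource.getAsInt();
            try {
                return task.run(port);
            } catch (BindException e) {
                // Someone else took the port between selection and use: retry.
                last = e;
            }
        }
        throw new IOException("no usable port after " + maxAttempts + " attempts", last);
    }
}
```

Passing the port selection in as a parameter keeps the retry loop independent of whichever selection policy the library ends up with. As noted in the thread, an approach like this would not help the shell tests.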
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we >>>>>>>>>>>>>>>> could do >>>>>>>>>>>>>>>> it later in the library. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>> Roger, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the >>>>>>>>>>>>>>>>>> port is >>>>>>>>>>>>>>>>>> free. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>>>>>>> accessible >>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>>>>>>> discussions >>>>>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and >>>>>>>>>>>>>>>>>> SO_LINGER >>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>> BSD. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So from stability point of view it's better to just return >>>>>>>>>>>>>>>>>> random >>>>>>>>>>>>>>>>>> number >>>>>>>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When >>>>>>>>>>>>>>>>> there are >>>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port, >>>>>>>>>>>>>>>>> nothing >>>>>>>>>>>>>>>>> prevents the random function to return the same port to >>>>>>>>>>>>>>>>> all of >>>>>>>>>>>>>>>>> them. >>>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and >>>>>>>>>>>>>>>>> cause >>>>>>>>>>>>>>>>> problems with timing when a port used by a previous test is >>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>> fully >>>>>>>>>>>>>>>>> ready to be assigned to a different socket. 
And as Dmitry >>>>>>>>>>>>>>>>> pointed out >>>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it >>>>>>>>>>>>>>>>> later >>>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated >>>>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>> remain >>>>>>>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating >>>>>>>>>>>>>>>>> service >>>>>>>>>>>>>>>>> provided >>>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>> chance >>>>>>>>>>>>>>>>> of intermittent failures due to a port being in use. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -JB- >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>>>>>>> IOException { >>>>>>>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>>>>>>> 143 >>>>>>>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>>>>>>> 146 >>>>>>>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>>>>>>> 148 try { >>>>>>>>>>>>>>>>>> 149 serverSocket = new ServerSocket(0); >>>>>>>>>>>>>>>>>> 150 port = serverSocket.getLocalPort(); >>>>>>>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>>>>>>> 153 } >>>>>>>>>>>>>>>>>> 154 } >>>>>>>>>>>>>>>>>> 155 >>>>>>>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>>>>>>> 157 } >>>>>>>>>>>>>>>>>> 158 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> fyi, The jdk.testlibrary.Utils.getFreePort() method will >>>>>>>>>>>>>>>>>>> Open an >>>>>>>>>>>>>>>>>>> free >>>>>>>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>>>>>>> the port number. 
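A side note on the getFreePort() listing quoted above: if the ServerSocket constructor throws, serverSocket is still null when the finally block runs, so close() raises a NullPointerException that hides the original IOException. A minimal sketch of the same open-close-return behaviour using try-with-resources is shown below; the class name FreePortProbe is hypothetical, and the sketch does not address the race described in the thread, since the port can still be taken after the socket is closed.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical stand-in for the quoted Utils.getFreePort(); illustration only.
final class FreePortProbe {
    /**
     * Asks the OS for an unused port by binding to port 0, then releases it.
     * The returned number is only a hint: the port is no longer held once
     * this method returns, so another process may claim it first.
     */
    static int getFreePort() throws IOException {
        // try-with-resources closes the socket only if it was actually created,
        // so a failing constructor surfaces as the original IOException.
        try (ServerSocket serverSocket = new ServerSocket(0)) {
            return serverSocket.getLocalPort();
        }
    }
}
```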
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> system >>>>>>>>>>>>>>>>>>> assign >>>>>>>>>>>>>>>>>>> the port #. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Roger >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>>>>>>> 2. Open socket >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass >>>>>>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it >>>>>>>>>>>>>>>>>>>> has no >>>>>>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>>>>>>> as soon as you close socket someone can take the port. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above >>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>> let >>>>>>>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>>>>>>> code opening socket to handle port conflict. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> There are two webrevs: >>>>>>>>>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Please take a look at some notes: >>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & >>>>>>>>>>>>>>>>>>>>> Jaroslav >>>>>>>>>>>>>>>>>>>>> Bachorik >>>>>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & >>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>>>>>>> looked >>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for >>>>>>>>>>>>>>>>>>>>> them: >>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've >>>>>>>>>>>>>>>>>>>>> renamed to >>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace >>>>>>>>>>>>>>>>>>>>> old >>>>>>>>>>>>>>>>>>>>> shell >>>>>>>>>>>>>>>>>>>>> script >>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called >>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java, >>>>>>>>>>>>>>>>>>>>> hence the >>>>>>>>>>>>>>>>>>>>> huge >>>>>>>>>>>>>>>>>>>>> diff. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines >>>>>>>>>>>>>>>>>>>>> similar >>>>>>>>>>>>>>>>>>>>> to the >>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided >>>>>>>>>>>>>>>>>>>>> to not >>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. 
Please >>>>>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>>>>> know if >>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is >>>>>>>>>>>>>>>>>>>>> added to >>>>>>>>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>>>>>>>> repository is taken from this patch: >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test >>>>>>>>>>>>>>>>>>>>> library >>>>>>>>>>>>>>>>>>>>> process >>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance >>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> With best regards, >>>>>>>>>>>> Taras Ledkov >>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>>>>>> skype: taras_ledkov >>>>>>>>>>>> Phone: 7(812)3346-157 >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> With best regards, >>>>>>>>> Taras Ledkov >>>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>>> skype: taras_ledkov >>>>>>>>> Phone: 7(812)3346-157 >>>>>>>> 
>>>>>>> >>>>>> >>>>> >>>> >>> >> > > -- > With best regards, > Taras Ledkov > Mail-To: taras.ledkov at oracle.com > skype: taras_ledkov > Phone: 7(812)3346-157 From brent.christian at oracle.com Wed Feb 26 11:34:51 2014 From: brent.christian at oracle.com (Brent Christian) Date: Wed, 26 Feb 2014 11:34:51 -0800 Subject: RFR 6835233 : Fedora 9 jdk regression test failed: java/lang/instrument/ParallelTransformerLoader.sh Message-ID: <530E41DB.20801@oracle.com> File under "chipping away at test stabilization issues." https://bugs.openjdk.java.net/browse/JDK-6835233 I've done some repeated runs of this test on my Linux machine. The test fails every time with 6u3. It fails intermittently on 7 (after 145 iterations for 7u45, and 62 iterations for 7u60b07). I have not been able to reproduce the failure on 8 or 9, running 1000 iterations each on 8b115, 8b129, and 9b02. I would like to resolve this bug by removing the "@ignore" tag for JDK 9, and bring the test back into rotation. If the failure comes back, I'll submit a new issue for further investigation. The change is: # @bug 5088398 -# @ignore until bug 6835233 dealt with # @summary Test parallel class loading by parallel transformers. Thanks, -Brent From sean.coffey at oracle.com Wed Feb 26 15:07:08 2014 From: sean.coffey at oracle.com (sean.coffey at oracle.com) Date: Wed, 26 Feb 2014 23:07:08 +0000 Subject: hg: jdk8/tl/corba: 8035618: Four api/org_omg/CORBA TCK tests fail under plugin only Message-ID: <20140226230711.5750362F74@hg.openjdk.java.net> Changeset: 0683ee308085 Author: coffeys Date: 2014-02-26 23:04 +0000 URL: http://hg.openjdk.java.net/jdk8/tl/corba/rev/0683ee308085 8035618: Four api/org_omg/CORBA TCK tests fail under plugin only Reviewed-by: mchung, chegar ! 
src/share/classes/com/sun/corba/se/spi/orb/ORB.java From staffan.larsen at oracle.com Wed Feb 26 22:58:48 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 27 Feb 2014 07:58:48 +0100 Subject: RFR 6835233 : Fedora 9 jdk regression test failed: java/lang/instrument/ParallelTransformerLoader.sh In-Reply-To: <530E41DB.20801@oracle.com> References: <530E41DB.20801@oracle.com> Message-ID: <5A25AA54-BD79-4590-8230-0C875CAF181D@oracle.com> Looks good! Thanks, /Staffan On 26 feb 2014, at 20:34, Brent Christian wrote: > File under "chipping away at test stabilization issues." > > https://bugs.openjdk.java.net/browse/JDK-6835233 > > I've done some repeated runs of this test on my Linux machine. The test fails every time with 6u3. It fails intermittently on 7 (after 145 iterations for 7u45, and 62 iterations for 7u60b07). I have not been able to reproduce the failure on 8 or 9, running 1000 iterations each on 8b115, 8b129, and 9b02. > > I would like to resolve this bug by removing the "@ignore" tag for JDK 9, and bring the test back into rotation. If the failure comes back, I'll submit a new issue for further investigation. > > The change is: > > # @bug 5088398 > -# @ignore until bug 6835233 dealt with > # @summary Test parallel class loading by parallel transformers. > > Thanks, > -Brent From Alan.Bateman at oracle.com Wed Feb 26 23:47:57 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 27 Feb 2014 07:47:57 +0000 Subject: RFR 6835233 : Fedora 9 jdk regression test failed: java/lang/instrument/ParallelTransformerLoader.sh In-Reply-To: <530E41DB.20801@oracle.com> References: <530E41DB.20801@oracle.com> Message-ID: <530EEDAD.9050400@oracle.com> On 26/02/2014 19:34, Brent Christian wrote: > : > > The change is: > > # @bug 5088398 > -# @ignore until bug 6835233 dealt with > # @summary Test parallel class loading by parallel transformers. 
This looks okay to me too, I assume that if there is any residual issue that it will show up quickly once the test is running again. -Alan From serguei.spitsyn at oracle.com Thu Feb 27 00:25:00 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 27 Feb 2014 00:25:00 -0800 Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <530D007D.4040602@oracle.com> References: <530D007D.4040602@oracle.com> Message-ID: <530EF65C.5040707@oracle.com> Please, review the fix for: https://bugs.openjdk.java.net/browse/JDK-6471769 Open webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 Summary: It is the 2-nd round of review because the JTREG com/sun/jdi tests discovered a regression in the first round change. The issue was in the JvmtiEventController::clear_frame_pop() lock synchronization that is not allowed at safepoints. As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM operation for safety. Also, I've removed the lock synchronization from the 3 impacted JvmtiEventController:: functions: set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). Testing: In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi Thanks, Serguei On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6471769 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 > > > Summary: > > This is another Test Stabilization issue. > The fix is very similar to other JVMTI stabilization fixes. > It is to use safepoints for updating the PopFrame data instead of > relying on the > suspend equivalent condition mechanism > (JvmtiEnv::is_thread_fully_suspended()) > which is not adequate from the reliability point of view. 
> > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi > > > Thanks, > Serguei > From mattias.tobiasson at oracle.com Thu Feb 27 02:43:57 2014 From: mattias.tobiasson at oracle.com (Mattias Tobiasson) Date: Thu, 27 Feb 2014 02:43:57 -0800 (PST) Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace Message-ID: <4b9ed302-c006-4829-b732-2ffc123d84f4@default> Hi, Could you please review this test fix. The test verifies that MemoryPoolMXBean sends a notification when used memory has reached the threshold. The flag thresholdExceeded marks if we have reached the memory threshold. When the flag is set, the test slows down to give time for the notification to be received. The problem is that thresholdExceeded is overwritten every time in the loop. Instead it should be set if any pool has reached the threshold. This means that the test continues to allocate memory at full speed, and we may get an OutOfMemory before we get the notification. bug: https://bugs.openjdk.java.net/browse/JDK-8031065 webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ Mattias From staffan.larsen at oracle.com Thu Feb 27 03:38:15 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 27 Feb 2014 12:38:15 +0100 Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace In-Reply-To: <4b9ed302-c006-4829-b732-2ffc123d84f4@default> References: <4b9ed302-c006-4829-b732-2ffc123d84f4@default> Message-ID: <0FF85AAC-03C4-49BA-8F6F-4FCEA09138EA@oracle.com> Looks good! Thanks, /Staffan On 27 feb 2014, at 11:43, Mattias Tobiasson wrote: > Hi, > Could you please review this test fix. > > The test verifies that MemoryPoolMXBean sends a notification when used memory has reached the threshold. > The flag thresholdExceeded marks if we have reached the memory threshold. When the flag is set, the test slows down to give time for the notification to be received. 
> The problem is that thresholdExceeded is overwritten every time in the loop. Instead it should be set if any pool has reached the threshold. This means that the test continues to allocate memory at full speed, and we may get an OutOfMemory before we get the notification. > > bug: > https://bugs.openjdk.java.net/browse/JDK-8031065 > > webrev: > http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ > > Mattias From taras.ledkov at oracle.com Thu Feb 27 03:39:38 2014 From: taras.ledkov at oracle.com (taras ledkov) Date: Thu, 27 Feb 2014 15:39:38 +0400 Subject: Review request for 7195249: Some jtreg tests use hard coded ports In-Reply-To: <2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com> References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> <52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> <52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com> <53049009.9020204@oracle.com> <530CA007.5080006@oracle.com> <530DF91C.5010000@oracle.com> <2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com> Message-ID: <530F23FA.6070805@oracle.com> Staffan, Excuse me. I didn't know about that. I looked at the http://openjdk.java.net/census#serviceability. Now I'm in need of the second review. On 26.02.2014 18:36, Staffan Larsen wrote: > > On 26 feb 2014, at 15:24, taras ledkov wrote: > >> Hi, >> >> Alan, Mandy could you please review the fix: https://bugs.openjdk.java.net/browse/JDK-7195249. >> >> I had the discussion with Jaroslav and Staffan and they have approved my fix, but they are not reviewers. > > I am a Reviewer. > > Thanks, > /Staffan > >> >> Webrev for jdk part: >> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ >> >> Webrev for hs part: >> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >> >> On 25.02.2014 17:52, Jaroslav Bachorik wrote: >>> Thumbs up. 
(not a "reviewer", though) >>> >>> -JB- >>> >>> On 19.2.2014 12:05, taras ledkov wrote: >>>> Hi, >>>> >>>> Imports are fixed: >>>> >>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ >>>> >>>> On 05.02.2014 17:20, Jaroslav Bachorik wrote: >>>>> Hi Taras, >>>>> >>>>> thanks for taking care of this. >>>>> >>>>> The changes look fine to me. >>>>> >>>>> One minor nit is unused imports of the library classes in >>>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". >>>>> >>>>> >>>>> It does not use any of those classes as its base class >>>>> "AbstractFilePermissionTest" does all the heavy lifting. >>>>> >>>>> Cheers, >>>>> >>>>> -JB- >>>>> >>>>> On 5.2.2014 13:42, taras ledkov wrote: >>>>>> Hi, >>>>>> >>>>>> So please take a look at the review against JDK9. >>>>>> The reviewed patch had not been integrated into JDK8. >>>>>> >>>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has >>>>>> been already patched by Jaroslav. >>>>>> >>>>>> Webrev for jdk part: >>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ >>>>>> >>>>>> Webrev for hs part: >>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >>>>>> >>>>>> >>>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote: >>>>>>> Hi Taras, >>>>>>> >>>>>>> On 21.1.2014 10:30, taras ledkov wrote: >>>>>>>> Hi Jaroslav, >>>>>>>> >>>>>>>> Could you please review the last changes? >>>>>>>> Are you OK? >>>>>>> >>>>>>> Yes, the change looks ok. But I think we will need to get back to this >>>>>>> problem eventually and implement a central port dispatcher if we >>>>>>> want to >>>>>>> be 100% sure the port conflicts wouldn't occur. But your changes >>>>>>> reduce >>>>>>> the chance significantly. >>>>>>> >>>>>>> Thanks for taking care of this. >>>>>>> >>>>>>> -JB- >>>>>>> >>>>>>>> >>>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote: >>>>>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> /Staffan >>>>>>>>> >>>>>>>>> On 20 jan 2014, at 16:07, taras ledkov >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi Staffan, >>>>>>>>>> >>>>>>>>>> I fixed the tests according with your comments. >>>>>>>>>> Are you OK? >>>>>>>>>> >>>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> Please take a look at the new review. >>>>>>>>>>> >>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>>>>>>>> >>>>>>>>>>> Webrev for hs part: >>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>>>>>>>> >>>>>>>>>>> My answers are inline: >>>>>>>>>>> >>>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>>>>>>>> Hi Taras, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java. >>>>>>>>>>>> Here's a couple of comments: >>>>>>>>>>>> >>>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>>>>>>>> This test could be simplified by not specifying an address at >>>>>>>>>>>> all. >>>>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp, >>>>>>>>>>>> there >>>>>>>>>>>> is no reason to specify an address. If address is unspecified >>>>>>>>>>>> (and >>>>>>>>>>>> server=y), the connector will pick an address and print it to the >>>>>>>>>>>> command line. Thus the only change that needs to be done is to >>>>>>>>>>>> remove >>>>>>>>>>>> ",address=8888" from the @run command. >>>>>>>>>>> fixed >>>>>>>>>>> >>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>>>>>>>> These tests do not compile cleanly with an empty JTwork >>>>>>>>>>>> directory. It >>>>>>>>>>>> seems that having one @build for each class does not work well - >>>>>>>>>>>> when >>>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger.
Moving >>>>>>>>>>>> all >>>>>>>>>>>> classes to one @build statement solved this problem for me. >>>>>>>>>>> fixed >>>>>>>>>>> >>>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>>>>>>>> 187 Future stdoutTask = stdout.process(); >>>>>>>>>>>> 188 Future stderrTask = stderr.process(); >>>>>>>>>>>> The stdoutTask and stderrTask variables are unused. >>>>>>>>>>> fixed >>>>>>>>>>> >>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>>>>>>>> At first I thought something was wrong with this file - the >>>>>>>>>>>> diff is >>>>>>>>>>>> very weird. Then I realized you renamed an old file and created a >>>>>>>>>>>> new >>>>>>>>>>>> file using the old name. >>>>>>>>>>> You are right. I did it to keep the test name. >>>>>>>>>>> >>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>>>>>>>> >>>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks >>>>>>>>>>>> like >>>>>>>>>>>> you >>>>>>>>>>>> delete the files at the beginning of the test in any case. >>>>>>>>>>> I think yes. In the first place, this functionality was at the old >>>>>>>>>>> code. >>>>>>>>>>> In the second place, a file without write permission may be a >>>>>>>>>>> problem >>>>>>>>>>> for a further cleanup (not by the test, for example for the tests >>>>>>>>>>> launcher scripts etc.) >>>>>>>>>>> >>>>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest" >>>>>>>>>>>> confusing. They are both Paths. One is used directly by the >>>>>>>>>>>> sub-classes, the other has a getter method. >>>>>>>>>>> fixed >>>>>>>>>>> >>>>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an >>>>>>>>>>>> ex.printStackTrace().
>>>>>>>>>>>> (Same thing for all other places where you call >>>>>>>>>>>> Integer.parseInt()) >>>>>>>>>>> fixed >>>>>>>>>>> >>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>>>>>>>> This file is never used as far as I can see. >>>>>>>>>>> It is used by PasswordFilePermissionTest & >>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>>>>>>>> AbstractFilePermissionTest : 162). >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> /Staffan >>>>>>>>>>>> >>>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> Please take a look at the review with fixed issues about >>>>>>>>>>>>> trying to >>>>>>>>>>>>> launch test that needs free port several times. >>>>>>>>>>>>> >>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/ >>>>>>>>>>>>> >>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/ >>>>>>>>>>>>> >>>>>>>>>>>>> Pay your attention to new method >>>>>>>>>>>>> ProcessTools.startProcess(String, >>>>>>>>>>>>> ProcessBuilder, Consumer) that is used to analyze all >>>>>>>>>>>>> output >>>>>>>>>>>>> of a sub-process. It has common part with >>>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>>>>> Predicate, >>>>>>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment. >>>>>>>>>>>>> >>>>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>>>>> Predicate, long, TimeUnit) may be changed by adding >>>>>>>>>>>>> LinePump >>>>>>>>>>>>> to stderr if there is not serious reason for restricting the >>>>>>>>>>>>> warm-up >>>>>>>>>>>>> analysis to stdout stream.
>>>>>>>>>>>>> >>>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote: >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I've consulted with Serviceability engineers (add them to CC >>>>>>>>>>>>>> list) and >>>>>>>>>>>>>> they would like to see tests to solve these problem so far: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Katja >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote: >>>>>>>>>>>>>>> Guys. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Let me try to sum up what was said before and may be suggest a >>>>>>>>>>>>>>> compromise. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. There is a desire to have a support port allocation on the >>>>>>>>>>>>>>> level of >>>>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that >>>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it >>>>>>>>>>>>>>> is a >>>>>>>>>>>>>>> test harness API or a library API does not really matter from >>>>>>>>>>>>>>> usage >>>>>>>>>>>>>>> point of view. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable, >>>>>>>>>>>>>>> whatever >>>>>>>>>>>>>>> port >>>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>>>>>>>> perform >>>>>>>>>>>>>>> the test logic with different ports until the test succeeds. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of >>>>>>>>>>>>>>> course, >>>>>>>>>>>>>>> but >>>>>>>>>>>>>>> better be used in conjunction with a meaningful port selection >>>>>>>>>>>>>>> algorithm. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> At the same time, copying a loop-until-success login from one >>>>>>>>>>>>>>> test to >>>>>>>>>>>>>>> another may be not the best solution. Library could help with >>>>>>>>>>>>>>> that I >>>>>>>>>>>>>>> believe. 
There only need to be an API method which takes >>>>>>>>>>>>>>> behavior as a >>>>>>>>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>>>>>>>> public runOnAFreePort(Function) >>>>>>>>>>>>>>> or similar. There could be arguments of how/whether to >>>>>>>>>>>>>>> implement >>>>>>>>>>>>>>> it, >>>>>>>>>>>>>>> the solution would not work for shell tests, etc, but still >>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching >>>>>>>>>>>>>>> agreement in >>>>>>>>>>>>>>> the library, etc. >>>>>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> changes. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please let us know which one is closer to your heart. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I personally prefer #1 for the reason that the changes already >>>>>>>>>>>>>>> supposed to make the tests more stable and also there are many >>>>>>>>>>>>>>> more >>>>>>>>>>>>>>> tests tests which use ports, so the scope of the problem is >>>>>>>>>>>>>>> bigger >>>>>>>>>>>>>>> than these. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Shura >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort() >>>>>>>>>>>>>>>> does not >>>>>>>>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>>>>>>>> process. >>>>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However, >>>>>>>>>>>>>>>> there is a >>>>>>>>>>>>>>>> solution. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please, look at the >>>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. 
In brief, the test will try to >>>>>>>>>>>>>>>> start a >>>>>>>>>>>>>>>> process with a free port and then check if >>>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has >>>>>>>>>>>>>>>> been >>>>>>>>>>>>>>>> thrown. >>>>>>>>>>>>>>>> If yes, you have to retry. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Katja >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>> library responsibility to implements it, would not you >>>>>>>>>>>>>>>>> agree? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we >>>>>>>>>>>>>>>>> could do >>>>>>>>>>>>>>>>> it later in the library. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>>> Roger, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the >>>>>>>>>>>>>>>>>>> port is >>>>>>>>>>>>>>>>>>> free. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>>>>>>>> accessible >>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>>>>>>>> discussions >>>>>>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and >>>>>>>>>>>>>>>>>>> SO_LINGER >>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>> BSD. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So from stability point of view it's better to just return >>>>>>>>>>>>>>>>>>> random >>>>>>>>>>>>>>>>>>> number >>>>>>>>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. 
When >>>>>>>>>>>>>>>>>> there are >>>>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port, >>>>>>>>>>>>>>>>>> nothing >>>>>>>>>>>>>>>>>> prevents the random function to return the same port to >>>>>>>>>>>>>>>>>> all of >>>>>>>>>>>>>>>>>> them. >>>>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and >>>>>>>>>>>>>>>>>> cause >>>>>>>>>>>>>>>>>> problems with timing when a port used by a previous test is >>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>> fully >>>>>>>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry >>>>>>>>>>>>>>>>>> pointed out >>>>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it >>>>>>>>>>>>>>>>>> later >>>>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated >>>>>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>>> remain >>>>>>>>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating >>>>>>>>>>>>>>>>>> service >>>>>>>>>>>>>>>>>> provided >>>>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> chance >>>>>>>>>>>>>>>>>> of intermittent failures due to a port being in use. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -JB- >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>>>>>>>> IOException { >>>>>>>>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>>>>>>>> 143 >>>>>>>>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>>>>>>>> 146 >>>>>>>>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>>>>>>>> 148 try { >>>>>>>>>>>>>>>>>>> 149 serverSocket = new ServerSocket(0); >>>>>>>>>>>>>>>>>>> 150 port = serverSocket.getLocalPort(); >>>>>>>>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>>>>>>>> 153 } >>>>>>>>>>>>>>>>>>> 154 } >>>>>>>>>>>>>>>>>>> 155 >>>>>>>>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>>>>>>>> 157 } >>>>>>>>>>>>>>>>>>> 158 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> fyi, The jdk.testlibrary.Utils.getFreePort() method will >>>>>>>>>>>>>>>>>>>> Open an >>>>>>>>>>>>>>>>>>>> free >>>>>>>>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>>>>>>>> the port number. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have >>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>> system >>>>>>>>>>>>>>>>>>>> assign >>>>>>>>>>>>>>>>>>>> the port #. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Roger >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>>>>>>>> 2. 
Open socket >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass >>>>>>>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it >>>>>>>>>>>>>>>>>>>>> has no >>>>>>>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>>>>>>>> as soon as you close socket someone can take the port. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above >>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>> let >>>>>>>>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>>>>>>>> code opening socket to handle port conflict. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> There are two webrevs: >>>>>>>>>>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Please take a look at some notes: >>>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & >>>>>>>>>>>>>>>>>>>>>> Jaroslav >>>>>>>>>>>>>>>>>>>>>> Bachorik >>>>>>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & >>>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>>>>>>>> looked >>>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for >>>>>>>>>>>>>>>>>>>>>> them: >>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've >>>>>>>>>>>>>>>>>>>>>> renamed to >>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace >>>>>>>>>>>>>>>>>>>>>> old >>>>>>>>>>>>>>>>>>>>>> shell >>>>>>>>>>>>>>>>>>>>>> script >>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called >>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java, >>>>>>>>>>>>>>>>>>>>>> hence the >>>>>>>>>>>>>>>>>>>>>> huge >>>>>>>>>>>>>>>>>>>>>> diff. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines >>>>>>>>>>>>>>>>>>>>>> similar >>>>>>>>>>>>>>>>>>>>>> to the >>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided >>>>>>>>>>>>>>>>>>>>>> to not >>>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. 
Please >>>>>>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>>>>>> know if >>>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is >>>>>>>>>>>>>>>>>>>>>> added to >>>>>>>>>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>>>>>>>>> repository is taken from this patch: >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test >>>>>>>>>>>>>>>>>>>>>> library >>>>>>>>>>>>>>>>>>>>>> process >>>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance >>>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html) >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> With best regards, >>>>>>>>>>>>> Taras Ledkov >>>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>>>>>>> skype: taras_ledkov >>>>>>>>>>>>> Phone: 7(812)3346-157 >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> With best regards, >>>>>>>>>> Taras Ledkov >>>>>>>>>> Mail-To: taras.ledkov at oracle.com 
>>>>>>>>>> skype: taras_ledkov >>>>>>>>>> Phone: 7(812)3346-157 >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >> -- >> With best regards, >> Taras Ledkov >> Mail-To: taras.ledkov at oracle.com >> skype: taras_ledkov >> Phone: 7(812)3346-157 > -- With best regards, Taras Ledkov Mail-To: taras.ledkov at oracle.com skype: taras_ledkov Phone: 7(812)3346-157 From daniel.fuchs at oracle.com Thu Feb 27 03:49:18 2014 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Thu, 27 Feb 2014 12:49:18 +0100 Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace In-Reply-To: <4b9ed302-c006-4829-b732-2ffc123d84f4@default> References: <4b9ed302-c006-4829-b732-2ffc123d84f4@default> Message-ID: <530F263E.8090408@oracle.com> On 2/27/14 11:43 AM, Mattias Tobiasson wrote: > Hi, > Could you please review this test fix. > > The test verifies that MemoryPoolMXBean sends a notification when used memory has reached the threshold. > The flag thresholdExceeded marks if we have reached the memory threshold. When the flag is set, the test slows down to give time for the notification to be received. > The problem is that thresholdExceeded is overwritten every time in the loop. Instead it should be set if any pool has reached the threshold. This means that the test continues to allocate memory at full speed, and we may get an OutOfMemory before we get the notification. > Hi Mattias, I wonder whether you should also stop calling loadNext() once thresholdExceeded is true? 
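For readers following this thread, the flag problem Mattias describes can be illustrated with a small, self-contained sketch. It is not the code from LowMemoryTest2 or from the webrev; the class name, method names, and the plain arrays standing in for memory pools are all hypothetical. It only contrasts a flag that is overwritten on every loop iteration with one that is set as soon as any pool reaches its threshold.

```java
public class ThresholdFlagSketch {

    // Problem pattern (illustrative): the flag is reassigned on every
    // iteration, so only the last pool examined decides the final value.
    static boolean lastPoolWins(long[] used, long[] threshold) {
        boolean thresholdExceeded = false;
        for (int i = 0; i < used.length; i++) {
            thresholdExceeded = used[i] >= threshold[i];
        }
        return thresholdExceeded;
    }

    // Intended pattern (illustrative): the flag becomes true once any
    // pool has reached its threshold and is never reset by later pools.
    static boolean anyPoolExceeded(long[] used, long[] threshold) {
        boolean thresholdExceeded = false;
        for (int i = 0; i < used.length; i++) {
            if (used[i] >= threshold[i]) {
                thresholdExceeded = true;
            }
        }
        return thresholdExceeded;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: the first pool is over its threshold,
        // the second one is not.
        long[] used = {90, 10};
        long[] threshold = {50, 50};
        System.out.println("overwritten flag: " + lastPoolWins(used, threshold));
        System.out.println("accumulated flag: " + anyPoolExceeded(used, threshold));
    }
}
```

With the overwritten flag, a pool that is still below its threshold can hide another pool that has already crossed it, which matches the symptom described: the test keeps allocating at full speed instead of slowing down to wait for the notification.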
best regards,

-- daniel

> bug:
> https://bugs.openjdk.java.net/browse/JDK-8031065
>
> webrev:
> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
>
> Mattias
>

From staffan.larsen at oracle.com Thu Feb 27 03:50:13 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 27 Feb 2014 12:50:13 +0100
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <530F23FA.6070805@oracle.com>
References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> <52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> <52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com> <53049009.9020204@oracle.com> <530CA007.5080006@oracle.com> <530DF91C.5010000@oracle.com> <2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com> <530F23FA.6070805@oracle.com>
Message-ID: <840B8EC7-DAF9-44BB-9D53-33FA0DB11F9D@oracle.com>

On 27 feb 2014, at 12:39, taras ledkov wrote:

> Staffan, Excuse me.

No problem.

> I didn't know about that.
> I looked at the http://openjdk.java.net/census#serviceability.

In this case the relevant project is jdk9 since that is where the change should happen: http://openjdk.java.net/census#jdk9
The "serviceability" group is mostly dormant.

> Now I'm in need of the second review.

For the jdk9 project the only requirement is one review from an official Reviewer. You currently have my review (an official Reviewer) and Jaroslav's review (not an official Reviewer). So you are all set.

Since you are not a Committer you cannot push your changes yourself, but need a sponsor to do that. I can do that for you if you send me the "hg export" output for the changeset that passes jcheck.
Thanks, /Staffan > > On 26.02.2014 18:36, Staffan Larsen wrote: >> >> On 26 feb 2014, at 15:24, taras ledkov wrote: >> >>> Hi, >>> >>> Alan, Mandy could you please review the fix: https://bugs.openjdk.java.net/browse/JDK-7195249. >>> >>> I had the discussion with Jaroslav and Staffan and they have approved my fix, but they are not reviewers. >> >> I am a Reviewer. >> >> Thanks, >> /Staffan >> >>> >>> Webrev for jdk part: >>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ >>> >>> Webrev for hs part: >>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >>> >>> On 25.02.2014 17:52, Jaroslav Bachorik wrote: >>>> Thumbs up. (not a "reviewer", though) >>>> >>>> -JB- >>>> >>>> On 19.2.2014 12:05, taras ledkov wrote: >>>>> Hi, >>>>> >>>>> Imports are fixed: >>>>> >>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ >>>>> >>>>> On 05.02.2014 17:20, Jaroslav Bachorik wrote: >>>>>> Hi Taras, >>>>>> >>>>>> thanks for taking care of this. >>>>>> >>>>>> The changes look fine to me. >>>>>> >>>>>> One minor nit is unused imports of the library classes in >>>>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". >>>>>> >>>>>> >>>>>> It does not use any of those classes as its base class >>>>>> "AbstractFilePermissionTest" does all the heavy lifting. >>>>>> >>>>>> Cheers, >>>>>> >>>>>> -JB- >>>>>> >>>>>> On 5.2.2014 13:42, taras ledkov wrote: >>>>>>> Hi, >>>>>>> >>>>>>> So please take a look at the review against JDK9. >>>>>>> The reviewed patch had not been integrated into JDK8. >>>>>>> >>>>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has >>>>>>> been already patched by Jaroslav. 
>>>>>>> >>>>>>> Webrev for jdk part: >>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ >>>>>>> >>>>>>> Webrev for hs part: >>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >>>>>>> >>>>>>> >>>>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote: >>>>>>>> Hi Taras, >>>>>>>> >>>>>>>> On 21.1.2014 10:30, taras ledkov wrote: >>>>>>>>> Hi Jaroslav, >>>>>>>>> >>>>>>>>> Could you please review the last changes? >>>>>>>>> Are you OK? >>>>>>>> >>>>>>>> Yes, the change looks ok. But I think we will need to get back to this >>>>>>>> problem eventually and implement a central port dispatcher if we >>>>>>>> want to >>>>>>>> be 100% sure the port conflicts wouldn't occur. But your changes >>>>>>>> reduce >>>>>>>> the chance significantly. >>>>>>>> >>>>>>>> Thanks for taking care of this. >>>>>>>> >>>>>>>> -JB- >>>>>>>> >>>>>>>>> >>>>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote: >>>>>>>>>> Sorry for not replying earlier. Yes, I?m ok with these changes. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> /Staffan >>>>>>>>>> >>>>>>>>>> On 20 jan 2014, at 16:07, taras ledkov >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Staffan, >>>>>>>>>>> >>>>>>>>>>> I fixed the tests according with your comments. >>>>>>>>>>> Are you OK? >>>>>>>>>>> >>>>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> Please take a look at the new review. >>>>>>>>>>>> >>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>>>>>>>>> >>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>>>>>>>>> >>>>>>>>>>>> My answers are inline: >>>>>>>>>>>> >>>>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>>>>>>>>> Hi Taras, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java. 
>>>>>>>>>>>>> Here?s a couple of comments: >>>>>>>>>>>>> >>>>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>>>>>>>>> This test could be simplified by not specifying an address at >>>>>>>>>>>>> all. >>>>>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp, >>>>>>>>>>>>> there >>>>>>>>>>>>> is no reason to specify an address. If address is unspecified >>>>>>>>>>>>> (and >>>>>>>>>>>>> server=y), the connector will pick an address and print it to the >>>>>>>>>>>>> command line. Thus the only change that needs to be done is to >>>>>>>>>>>>> remove >>>>>>>>>>>>> ",address=8888? from the @run command. >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>>>>>>>>> These tests do not compile cleanly with an empty JTwork >>>>>>>>>>>>> directory. It >>>>>>>>>>>>> seems that having one @build for each class does not work well - >>>>>>>>>>>>> when >>>>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving >>>>>>>>>>>>> all >>>>>>>>>>>>> classes to one @build statement solved this problem for me. >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>>>>>>>>> 187 Future stdoutTask = stdout.process(); >>>>>>>>>>>>> 188 Future stderrTask = stderr.process(); >>>>>>>>>>>>> The stdoutTask and stderrTask variables are unused. >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>>>>>>>>> At first I thought something was wrong with this file - the >>>>>>>>>>>>> diff is >>>>>>>>>>>>> very weird. Then I realized you renamed an old file and created a >>>>>>>>>>>>> new >>>>>>>>>>>>> file using the old name. >>>>>>>>>>>> You are right. I did it to keep the test name. 
>>>>>>>>>>>> >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks >>>>>>>>>>>>> like >>>>>>>>>>>>> you >>>>>>>>>>>>> delete the files at the beginning of the test in any case. >>>>>>>>>>>> I think yes. n the first place, this functionality was at the old >>>>>>>>>>>> code. >>>>>>>>>>>> In the second place, a file without write permission may be a >>>>>>>>>>>> problem >>>>>>>>>>>> for a further cleanup (not by the test, for example for the tests >>>>>>>>>>>> launcher scripts etc.) >>>>>>>>>>>> >>>>>>>>>>>>> - I find the names and usage of ?mgmt? and ?file2PermissionTest? >>>>>>>>>>>>> confusing. They are both Paths. One is used directly by the >>>>>>>>>>>>> sub-classes, the other has a getter method. >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> - Lines 57-58: Don?t swallow exceptions, add an >>>>>>>>>>>>> ex.printStackTrace(). >>>>>>>>>>>>> (Same thing for all other places where you call >>>>>>>>>>>>> Integer.parseInt()) >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>>>>>>>>> This file is never used as far as I can see. >>>>>>>>>>>> It is used by PasswordFilePermissionTest & >>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>>>>>>>>> AbstractFilePermissionTest : 162). >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> /Staffan >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please take a look at the review with fixed issues about >>>>>>>>>>>>>> trying to >>>>>>>>>>>>>> launch test that needs free port several times. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Pay your attention to new method >>>>>>>>>>>>>> ProcessTools.startProcess(String, >>>>>>>>>>>>>> ProcessBuilder, Consumer) that is used to analyze all >>>>>>>>>>>>>> output >>>>>>>>>>>>>> of a sub-process. It has common part with >>>>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>>>>>> Predicate, >>>>>>>>>>>>>> long, TumeUnit) that is used to determine the warm-up moment. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder, >>>>>>>>>>>>>> Predicate, long, TumeUnit) may be changed by adding >>>>>>>>>>>>>> LinePump >>>>>>>>>>>>>> to stderr if there is not serious reason for restricting the >>>>>>>>>>>>>> warm-up >>>>>>>>>>>>>> analysis to stdout stream. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've consulted with Serviceability engineers (add them to CC >>>>>>>>>>>>>>> list) and >>>>>>>>>>>>>>> they would like to see tests to solve these problem so far: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Katja >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote: >>>>>>>>>>>>>>>> Guys. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Let me try to sum up what was said before and may be suggest a >>>>>>>>>>>>>>>> compromise. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. There is a desire to have a support port allocation on the >>>>>>>>>>>>>>>> level of >>>>>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that >>>>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). 
Whether it >>>>>>>>>>>>>>>> is a >>>>>>>>>>>>>>>> test harness API or a library API does not really matter from >>>>>>>>>>>>>>>> usage >>>>>>>>>>>>>>>> point of view. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable, >>>>>>>>>>>>>>>> whatever >>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>>>>>>>>> perform >>>>>>>>>>>>>>>> the test logic with different ports until the test succeeds. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of >>>>>>>>>>>>>>>> course, >>>>>>>>>>>>>>>> but >>>>>>>>>>>>>>>> better be used in conjunction with a meaningful port selection >>>>>>>>>>>>>>>> algorithm. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> At the same time, copying a loop-until-success login from one >>>>>>>>>>>>>>>> test to >>>>>>>>>>>>>>>> another may be not the best solution. Library could help with >>>>>>>>>>>>>>>> that I >>>>>>>>>>>>>>>> believe. There only need to be an API method which takes >>>>>>>>>>>>>>>> behavior as a >>>>>>>>>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>>>>>>>>> public runOnAFreePort(Function) >>>>>>>>>>>>>>>> or similar. There could be arguments of how/whether to >>>>>>>>>>>>>>>> implement >>>>>>>>>>>>>>>> it, >>>>>>>>>>>>>>>> the solution would not work for shell tests, etc, but still >>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching >>>>>>>>>>>>>>>> agreement in >>>>>>>>>>>>>>>> the library, etc. >>>>>>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> changes. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please let us know which one is closer to your heart. 
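The runOnAFreePort(Function) idea mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the PortAction interface, the maxAttempts parameter and the choice of BindException as the retry signal are hypothetical and are not part of jdk.testlibrary. Candidate ports are drawn at random from the range 49152 to 65535, as Dmitry proposed in this thread, and the action itself is expected to bind the port.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; not an existing test library API.
class FreePortRetrySketch {

    // The behavior to run against a candidate port (assumed shape).
    @FunctionalInterface
    interface PortAction<T> {
        T run(int port) throws IOException;
    }

    static final int MIN_PORT = 49152;
    static final int MAX_PORT = 65535;

    // Runs the action with randomly chosen ports until it stops failing
    // with BindException or the attempt limit is reached.
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        BindException lastFailure = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return action.run(port);
            } catch (BindException e) {
                // Port was taken between selection and use; try another one.
                lastFailure = e;
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) throws IOException {
        // Example action: bind the port, report it, release it.
        Integer port = FreePortRetrySketch.<Integer>runOnAFreePort(p -> {
            try (ServerSocket socket = new ServerSocket(p)) {
                return socket.getLocalPort();
            }
        }, 5);
        System.out.println("bound and released port " + port);
    }
}
```

The attempt limit keeps a genuinely broken test from looping forever. As noted in the thread, retrying only lowers the chance of a port clash and cannot remove it.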
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I personally prefer #1 for the reason that the changes already >>>>>>>>>>>>>>>> supposed to make the tests more stable and also there are many >>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>> tests tests which use ports, so the scope of the problem is >>>>>>>>>>>>>>>> bigger >>>>>>>>>>>>>>>> than these. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Shura >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort() >>>>>>>>>>>>>>>>> does not >>>>>>>>>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>>>>>>>>> process. >>>>>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However, >>>>>>>>>>>>>>>>> there is a >>>>>>>>>>>>>>>>> solution. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please, look at the >>>>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to >>>>>>>>>>>>>>>>> start a >>>>>>>>>>>>>>>>> process with a free port and then check if >>>>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has >>>>>>>>>>>>>>>>> been >>>>>>>>>>>>>>>>> thrown. >>>>>>>>>>>>>>>>> If yes, you have to retry. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Katja >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> library responsibility to implements it, would not you >>>>>>>>>>>>>>>>>> agree? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we >>>>>>>>>>>>>>>>>> could do >>>>>>>>>>>>>>>>>> it later in the library. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>>>> Roger, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the >>>>>>>>>>>>>>>>>>>> port is >>>>>>>>>>>>>>>>>>>> free. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>>>>>>>>> accessible >>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>>>>>>>>> discussions >>>>>>>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and >>>>>>>>>>>>>>>>>>>> SO_LINGER >>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>> BSD. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> So from stability point of view it's better to just return >>>>>>>>>>>>>>>>>>>> random >>>>>>>>>>>>>>>>>>>> number >>>>>>>>>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When >>>>>>>>>>>>>>>>>>> there are >>>>>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port, >>>>>>>>>>>>>>>>>>> nothing >>>>>>>>>>>>>>>>>>> prevents the random function to return the same port to >>>>>>>>>>>>>>>>>>> all of >>>>>>>>>>>>>>>>>>> them. >>>>>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and >>>>>>>>>>>>>>>>>>> cause >>>>>>>>>>>>>>>>>>> problems with timing when a port used by a previous test is >>>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>> fully >>>>>>>>>>>>>>>>>>> ready to be assigned to a different socket. 
And as Dmitry >>>>>>>>>>>>>>>>>>> pointed out >>>>>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it >>>>>>>>>>>>>>>>>>> later >>>>>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated >>>>>>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>>>> remain >>>>>>>>>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating >>>>>>>>>>>>>>>>>>> service >>>>>>>>>>>>>>>>>>> provided >>>>>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> chance >>>>>>>>>>>>>>>>>>> of intermittent failures due to a port being in use. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -JB- >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>>>>>>>>> IOException { >>>>>>>>>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>>>>>>>>> 143 >>>>>>>>>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>>>>>>>>> 146 >>>>>>>>>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>>>>>>>>> 148 try { >>>>>>>>>>>>>>>>>>>> 149 serverSocket = new ServerSocket(0); >>>>>>>>>>>>>>>>>>>> 150 port = serverSocket.getLocalPort(); >>>>>>>>>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>>>>>>>>> 153 } >>>>>>>>>>>>>>>>>>>> 154 } >>>>>>>>>>>>>>>>>>>> 155 >>>>>>>>>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>>>>>>>>> 157 } >>>>>>>>>>>>>>>>>>>> 158 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> fyi, The jdk.testlibrary.Utils.getFreePort() method will >>>>>>>>>>>>>>>>>>>>> Open an 
>>>>>>>>>>>>>>>>>>>>> free >>>>>>>>>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>>>>>>>>> the port number. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have >>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> system >>>>>>>>>>>>>>>>>>>>> assign >>>>>>>>>>>>>>>>>>>>> the port #. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Roger >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>>>>>>>>> 2. Open socket >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass >>>>>>>>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it >>>>>>>>>>>>>>>>>>>>>> has no >>>>>>>>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>>>>>>>>> as soon as you close socket someone can take the port. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above >>>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>> let >>>>>>>>>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>>>>>>>>> code opening socket to handle port conflict. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> There are two webrevs: >>>>>>>>>>>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Please take a look at some notes: >>>>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & >>>>>>>>>>>>>>>>>>>>>>> Jaroslav >>>>>>>>>>>>>>>>>>>>>>> Bachorik >>>>>>>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & >>>>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>>>>>>>>> looked >>>>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for >>>>>>>>>>>>>>>>>>>>>>> them: >>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've >>>>>>>>>>>>>>>>>>>>>>> renamed to >>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace >>>>>>>>>>>>>>>>>>>>>>> old >>>>>>>>>>>>>>>>>>>>>>> shell >>>>>>>>>>>>>>>>>>>>>>> script >>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called >>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java, >>>>>>>>>>>>>>>>>>>>>>> hence the >>>>>>>>>>>>>>>>>>>>>>> huge >>>>>>>>>>>>>>>>>>>>>>> diff. 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines >>>>>>>>>>>>>>>>>>>>>>> similar >>>>>>>>>>>>>>>>>>>>>>> to the >>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided >>>>>>>>>>>>>>>>>>>>>>> to not >>>>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please >>>>>>>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>>>>>>> know if >>>>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is >>>>>>>>>>>>>>>>>>>>>>> added to >>>>>>>>>>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>>>>>>>>>> repository is taken from this patch: >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test >>>>>>>>>>>>>>>>>>>>>>> library >>>>>>>>>>>>>>>>>>>>>>> process >>>>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance >>>>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html) >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> With best regards,
>>>>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> With best regards,
>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>> --
>>> With best regards,
>>> Taras Ledkov
>>> Mail-To: taras.ledkov at oracle.com
>>> skype: taras_ledkov
>>> Phone: 7(812)3346-157
>>
>
> --
> With best regards,
> Taras Ledkov
> Mail-To: taras.ledkov at oracle.com
> skype: taras_ledkov
> Phone: 7(812)3346-157

From jaroslav.bachorik at oracle.com Thu Feb 27 03:56:13 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 27 Feb 2014 12:56:13 +0100
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <530F23FA.6070805@oracle.com>
References: <529EF58F.5000701@oracle.com> <52A58687.6020708@oracle.com> <52A5953A.5040102@oracle.com> <52A7061E.8040002@oracle.com> <52BC2A7D.3070403@oracle.com> <52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com> <52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com> <52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com> <53049009.9020204@oracle.com> <530CA007.5080006@oracle.com> <530DF91C.5010000@oracle.com> <2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com> <530F23FA.6070805@oracle.com>
Message-ID: <530F27DD.3000704@oracle.com>

On 27.2.2014 12:39, taras ledkov wrote:
> Staffan, Excuse me.
> I didn't know about that.
> I looked at the http://openjdk.java.net/census#serviceability.
>
> Now I'm in need of the second review.

You only need one Reviewer to approve. The other approvals may come from non-Reviewers.
-JB- > > On 26.02.2014 18:36, Staffan Larsen wrote: >> >> On 26 feb 2014, at 15:24, taras ledkov wrote: >> >>> Hi, >>> >>> Alan, Mandy could you please review the fix: >>> https://bugs.openjdk.java.net/browse/JDK-7195249. >>> >>> I had the discussion with Jaroslav and Staffan and they have approved >>> my fix, but they are not reviewers. >> >> I am a Reviewer. >> >> Thanks, >> /Staffan >> >>> >>> Webrev for jdk part: >>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ >>> >>> Webrev for hs part: >>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >>> >>> On 25.02.2014 17:52, Jaroslav Bachorik wrote: >>>> Thumbs up. (not a "reviewer", though) >>>> >>>> -JB- >>>> >>>> On 19.2.2014 12:05, taras ledkov wrote: >>>>> Hi, >>>>> >>>>> Imports are fixed: >>>>> >>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/ >>>>> >>>>> On 05.02.2014 17:20, Jaroslav Bachorik wrote: >>>>>> Hi Taras, >>>>>> >>>>>> thanks for taking care of this. >>>>>> >>>>>> The changes look fine to me. >>>>>> >>>>>> One minor nit is unused imports of the library classes in >>>>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". >>>>>> >>>>>> >>>>>> >>>>>> It does not use any of those classes as its base class >>>>>> "AbstractFilePermissionTest" does all the heavy lifting. >>>>>> >>>>>> Cheers, >>>>>> >>>>>> -JB- >>>>>> >>>>>> On 5.2.2014 13:42, taras ledkov wrote: >>>>>>> Hi, >>>>>>> >>>>>>> So please take a look at the review against JDK9. >>>>>>> The reviewed patch had not been integrated into JDK8. >>>>>>> >>>>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has >>>>>>> been already patched by Jaroslav. 
>>>>>>> >>>>>>> Webrev for jdk part: >>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/ >>>>>>> >>>>>>> Webrev for hs part: >>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/ >>>>>>> >>>>>>> >>>>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote: >>>>>>>> Hi Taras, >>>>>>>> >>>>>>>> On 21.1.2014 10:30, taras ledkov wrote: >>>>>>>>> Hi Jaroslav, >>>>>>>>> >>>>>>>>> Could you please review the last changes? >>>>>>>>> Are you OK? >>>>>>>> >>>>>>>> Yes, the change looks ok. But I think we will need to get back >>>>>>>> to this >>>>>>>> problem eventually and implement a central port dispatcher if we >>>>>>>> want to >>>>>>>> be 100% sure the port conflicts wouldn't occur. But your changes >>>>>>>> reduce >>>>>>>> the chance significantly. >>>>>>>> >>>>>>>> Thanks for taking care of this. >>>>>>>> >>>>>>>> -JB- >>>>>>>> >>>>>>>>> >>>>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote: >>>>>>>>>> Sorry for not replying earlier. Yes, I?m ok with these changes. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> /Staffan >>>>>>>>>> >>>>>>>>>> On 20 jan 2014, at 16:07, taras ledkov >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Staffan, >>>>>>>>>>> >>>>>>>>>>> I fixed the tests according with your comments. >>>>>>>>>>> Are you OK? >>>>>>>>>>> >>>>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote: >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> Please take a look at the new review. >>>>>>>>>>>> >>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/ >>>>>>>>>>>> >>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/ >>>>>>>>>>>> >>>>>>>>>>>> My answers are inline: >>>>>>>>>>>> >>>>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote: >>>>>>>>>>>>> Hi Taras, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for doing this clean up and conversion of tests into >>>>>>>>>>>>> Java. 
>>>>>>>>>>>>> Here?s a couple of comments: >>>>>>>>>>>>> >>>>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java: >>>>>>>>>>>>> This test could be simplified by not specifying an address at >>>>>>>>>>>>> all. >>>>>>>>>>>>> Since the test never connects to the JVM started with >>>>>>>>>>>>> -Xrunjdwp, >>>>>>>>>>>>> there >>>>>>>>>>>>> is no reason to specify an address. If address is unspecified >>>>>>>>>>>>> (and >>>>>>>>>>>>> server=y), the connector will pick an address and print it >>>>>>>>>>>>> to the >>>>>>>>>>>>> command line. Thus the only change that needs to be done is to >>>>>>>>>>>>> remove >>>>>>>>>>>>> ",address=8888? from the @run command. >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh: >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh: >>>>>>>>>>>>> >>>>>>>>>>>>> These tests do not compile cleanly with an empty JTwork >>>>>>>>>>>>> directory. It >>>>>>>>>>>>> seems that having one @build for each class does not work >>>>>>>>>>>>> well - >>>>>>>>>>>>> when >>>>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. >>>>>>>>>>>>> Moving >>>>>>>>>>>>> all >>>>>>>>>>>>> classes to one @build statement solved this problem for me. >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java: >>>>>>>>>>>>> 187 Future stdoutTask = stdout.process(); >>>>>>>>>>>>> 188 Future stderrTask = stderr.process(); >>>>>>>>>>>>> The stdoutTask and stderrTask variables are unused. >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java: >>>>>>>>>>>>> >>>>>>>>>>>>> At first I thought something was wrong with this file - the >>>>>>>>>>>>> diff is >>>>>>>>>>>>> very weird. Then I realized you renamed an old file and >>>>>>>>>>>>> created a >>>>>>>>>>>>> new >>>>>>>>>>>>> file using the old name. >>>>>>>>>>>> You are right. I did it to keep the test name. 
>>>>>>>>>>>> >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks >>>>>>>>>>>>> like >>>>>>>>>>>>> you >>>>>>>>>>>>> delete the files at the beginning of the test in any case. >>>>>>>>>>>> I think yes. n the first place, this functionality was at >>>>>>>>>>>> the old >>>>>>>>>>>> code. >>>>>>>>>>>> In the second place, a file without write permission may be a >>>>>>>>>>>> problem >>>>>>>>>>>> for a further cleanup (not by the test, for example for the >>>>>>>>>>>> tests >>>>>>>>>>>> launcher scripts etc.) >>>>>>>>>>>> >>>>>>>>>>>>> - I find the names and usage of ?mgmt? and >>>>>>>>>>>>> ?file2PermissionTest? >>>>>>>>>>>>> confusing. They are both Paths. One is used directly by the >>>>>>>>>>>>> sub-classes, the other has a getter method. >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> - Lines 57-58: Don?t swallow exceptions, add an >>>>>>>>>>>>> ex.printStackTrace(). >>>>>>>>>>>>> (Same thing for all other places where you call >>>>>>>>>>>>> Integer.parseInt()) >>>>>>>>>>>> fixed >>>>>>>>>>>> >>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java: >>>>>>>>>>>>> This file is never used as far as I can see. >>>>>>>>>>>> It is used by PasswordFilePermissionTest & >>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method, >>>>>>>>>>>> AbstractFilePermissionTest : 162). >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> /Staffan >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov >>>>>>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please take a look at the review with fixed issues about >>>>>>>>>>>>>> trying to >>>>>>>>>>>>>> launch test that needs free port several times. 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please pay attention to the new method
>>>>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, Consumer) that is
>>>>>>>>>>>>>> used to analyze all output of a sub-process. It has a common part with
>>>>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, Predicate,
>>>>>>>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>>>> Predicate, long, TimeUnit) may be changed by adding a LinePump
>>>>>>>>>>>>>> to stderr if there is no serious reason for restricting the
>>>>>>>>>>>>>> warm-up analysis to the stdout stream.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've consulted with Serviceability engineers (add them to CC
>>>>>>>>>>>>>>> list) and they would like to see the tests solve this problem
>>>>>>>>>>>>>>> as follows for now:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>>>>>>>> Guys.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Let me try to sum up what was said before and maybe suggest a
>>>>>>>>>>>>>>>> compromise.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. There is a desire to have support for port allocation at the
>>>>>>>>>>>>>>>> level of a JTReg suite execution. Taras created a bug for that
>>>>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249).
>>>>>>>>>>>>>>>> Whether it >>>>>>>>>>>>>>>> is a >>>>>>>>>>>>>>>> test harness API or a library API does not really matter >>>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>> usage >>>>>>>>>>>>>>>> point of view. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable, >>>>>>>>>>>>>>>> whatever >>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to >>>>>>>>>>>>>>>> perform >>>>>>>>>>>>>>>> the test logic with different ports until the test >>>>>>>>>>>>>>>> succeeds. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of >>>>>>>>>>>>>>>> course, >>>>>>>>>>>>>>>> but >>>>>>>>>>>>>>>> better be used in conjunction with a meaningful port >>>>>>>>>>>>>>>> selection >>>>>>>>>>>>>>>> algorithm. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> At the same time, copying a loop-until-success login >>>>>>>>>>>>>>>> from one >>>>>>>>>>>>>>>> test to >>>>>>>>>>>>>>>> another may be not the best solution. Library could help >>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>> that I >>>>>>>>>>>>>>>> believe. There only need to be an API method which takes >>>>>>>>>>>>>>>> behavior as a >>>>>>>>>>>>>>>> parameter and run it until it succeeds. Something like: >>>>>>>>>>>>>>>> public runOnAFreePort(Function) >>>>>>>>>>>>>>>> or similar. There could be arguments of how/whether to >>>>>>>>>>>>>>>> implement >>>>>>>>>>>>>>>> it, >>>>>>>>>>>>>>>> the solution would not work for shell tests, etc, but still >>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> With the tests in question though, we have a few options. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching >>>>>>>>>>>>>>>> agreement in >>>>>>>>>>>>>>>> the library, etc. >>>>>>>>>>>>>>>> 2. Implement loops in every test. >>>>>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> changes. 
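The runOnAFreePort(Function) idea mentioned above, an API method that takes the behavior as a parameter and runs it until it succeeds, might be sketched roughly as follows. This is only an illustration under stated assumptions: the class PortRetry, the PortAction interface, pickCandidatePort() and the maxAttempts parameter are hypothetical names, not part of jdk.testlibrary, and the sketch retries only when the action fails with java.net.BindException. A test that detects "Port already in use" in some other way (for example by scanning sub-process output) would need its own failure check.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

/** Hypothetical helper illustrating a "run until it succeeds on some port" API. */
final class PortRetry {

    /** Behavior that needs a port number; hypothetical, not a testlibrary type. */
    @FunctionalInterface
    interface PortAction<T> {
        T run(int port) throws Exception;
    }

    private PortRetry() {}

    /**
     * Asks the OS for a port that is unused right now. The port can be taken
     * by another process as soon as the probe socket is closed, which is
     * exactly why the caller retries.
     */
    static int pickCandidatePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    /**
     * Runs the action with a fresh candidate port until it completes without
     * a BindException, or until maxAttempts attempts have failed.
     */
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BindException lastFailure = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = pickCandidatePort();
            try {
                return action.run(port);
            } catch (BindException e) {
                // The port was grabbed between the probe and the real use; try another one.
                lastFailure = e;
            }
        }
        throw lastFailure;
    }
}
```

A test would then wrap its port-dependent logic in a lambda, for example PortRetry.runOnAFreePort(port -> startServer(port), 5), where startServer stands for whatever the test actually launches. As noted in the thread, this kind of helper would not cover shell tests.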
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please let us know which one is closer to your heart. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I personally prefer #1 for the reason that the changes >>>>>>>>>>>>>>>> already >>>>>>>>>>>>>>>> supposed to make the tests more stable and also there >>>>>>>>>>>>>>>> are many >>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>> tests tests which use ports, so the scope of the problem is >>>>>>>>>>>>>>>> bigger >>>>>>>>>>>>>>>> than these. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Shura >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I agree with the previous comments, that >>>>>>>>>>>>>>>>> Utils.getFreePort() >>>>>>>>>>>>>>>>> does not >>>>>>>>>>>>>>>>> guarantee the port will be still free when you start your >>>>>>>>>>>>>>>>> process. >>>>>>>>>>>>>>>>> Unfortunately I don't think the library can do more. >>>>>>>>>>>>>>>>> However, >>>>>>>>>>>>>>>>> there is a >>>>>>>>>>>>>>>>> solution. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please, look at the >>>>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java >>>>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to >>>>>>>>>>>>>>>>> start a >>>>>>>>>>>>>>>>> process with a free port and then check if >>>>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has >>>>>>>>>>>>>>>>> been >>>>>>>>>>>>>>>>> thrown. >>>>>>>>>>>>>>>>> If yes, you have to retry. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Katja >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote: >>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, >>>>>>>>>>>>>>>>>> it is >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> library responsibility to implements it, would not you >>>>>>>>>>>>>>>>>> agree? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests >>>>>>>>>>>>>>>>>> as is. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Should we decide to replace logic of the port >>>>>>>>>>>>>>>>>> selection, we >>>>>>>>>>>>>>>>>> could do >>>>>>>>>>>>>>>>>> it later in the library. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote: >>>>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>>>> Roger, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee >>>>>>>>>>>>>>>>>>>> that the >>>>>>>>>>>>>>>>>>>> port is >>>>>>>>>>>>>>>>>>>> free. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not >>>>>>>>>>>>>>>>>>>> accessible >>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>> some time - it depends to system setup, take a look to >>>>>>>>>>>>>>>>>>>> discussions >>>>>>>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and >>>>>>>>>>>>>>>>>>>> SO_LINGER >>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>> BSD. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> So from stability point of view it's better to just >>>>>>>>>>>>>>>>>>>> return >>>>>>>>>>>>>>>>>>>> random >>>>>>>>>>>>>>>>>>>> number >>>>>>>>>>>>>>>>>>>> between 49152 and 65535. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. >>>>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>>>> there are >>>>>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free >>>>>>>>>>>>>>>>>>> port, >>>>>>>>>>>>>>>>>>> nothing >>>>>>>>>>>>>>>>>>> prevents the random function to return the same port to >>>>>>>>>>>>>>>>>>> all of >>>>>>>>>>>>>>>>>>> them. >>>>>>>>>>>>>>>>>>> Also, two subsequent requests can return the same >>>>>>>>>>>>>>>>>>> port and >>>>>>>>>>>>>>>>>>> cause >>>>>>>>>>>>>>>>>>> problems with timing when a port used by a previous >>>>>>>>>>>>>>>>>>> test is >>>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>> fully >>>>>>>>>>>>>>>>>>> ready to be assigned to a different socket. 
And as >>>>>>>>>>>>>>>>>>> Dmitry >>>>>>>>>>>>>>>>>>> pointed out >>>>>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and >>>>>>>>>>>>>>>>>>> use it >>>>>>>>>>>>>>>>>>> later >>>>>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated >>>>>>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>>>> remain >>>>>>>>>>>>>>>>>>> unallocated also for the next few milliseconds. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating >>>>>>>>>>>>>>>>>>> service >>>>>>>>>>>>>>>>>>> provided >>>>>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) >>>>>>>>>>>>>>>>>>> decrease >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> chance >>>>>>>>>>>>>>>>>>> of intermittent failures due to a port being in use. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -JB- >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 141 public static int getFreePort() throws >>>>>>>>>>>>>>>>>>>> InterruptedException, >>>>>>>>>>>>>>>>>>>> IOException { >>>>>>>>>>>>>>>>>>>> 142 int port = -1; >>>>>>>>>>>>>>>>>>>> 143 >>>>>>>>>>>>>>>>>>>> 144 while (port <= 0) { >>>>>>>>>>>>>>>>>>>> 145 Thread.sleep(100); >>>>>>>>>>>>>>>>>>>> 146 >>>>>>>>>>>>>>>>>>>> 147 ServerSocket serverSocket = null; >>>>>>>>>>>>>>>>>>>> 148 try { >>>>>>>>>>>>>>>>>>>> 149 serverSocket = new >>>>>>>>>>>>>>>>>>>> ServerSocket(0); >>>>>>>>>>>>>>>>>>>> 150 port = >>>>>>>>>>>>>>>>>>>> serverSocket.getLocalPort(); >>>>>>>>>>>>>>>>>>>> 151 } finally { >>>>>>>>>>>>>>>>>>>> 152 serverSocket.close(); >>>>>>>>>>>>>>>>>>>> 153 } >>>>>>>>>>>>>>>>>>>> 154 } >>>>>>>>>>>>>>>>>>>> 155 >>>>>>>>>>>>>>>>>>>> 156 return port; >>>>>>>>>>>>>>>>>>>> 157 } >>>>>>>>>>>>>>>>>>>> 158 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote: >>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> 
fyi, The jdk.testlibrary.Utils.getFreePort() >>>>>>>>>>>>>>>>>>>>> method will >>>>>>>>>>>>>>>>>>>>> Open an >>>>>>>>>>>>>>>>>>>>> free >>>>>>>>>>>>>>>>>>>>> Socket, close it and return >>>>>>>>>>>>>>>>>>>>> the port number. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have >>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> system >>>>>>>>>>>>>>>>>>>>> assign >>>>>>>>>>>>>>>>>>>>> the port #. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Roger >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote: >>>>>>>>>>>>>>>>>>>>>> Taras, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 1. Chose random number between 49152 and 65535 >>>>>>>>>>>>>>>>>>>>>> 2. Open socket >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1 >>>>>>>>>>>>>>>>>>>>>> if socket OK - return *socket* >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have >>>>>>>>>>>>>>>>>>>>>> to pass >>>>>>>>>>>>>>>>>>>>>> port >>>>>>>>>>>>>>>>>>>>>> number as >>>>>>>>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it >>>>>>>>>>>>>>>>>>>>>> has no >>>>>>>>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>>>>>>>> - as >>>>>>>>>>>>>>>>>>>>>> as soon as you close socket someone can take the >>>>>>>>>>>>>>>>>>>>>> port. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above >>>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>> let >>>>>>>>>>>>>>>>>>>>>> networking >>>>>>>>>>>>>>>>>>>>>> code opening socket to handle port conflict. 
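The two-step recipe above (choose a random number between 49152 and 65535, open the socket, repeat on failure, and return the socket rather than the port number) could look roughly like the sketch below. The class DynamicPort, the method openOnRandomPort and the maxAttempts bound are assumptions made for illustration; nothing with these names exists in the test library discussed in this thread.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch: bind to a random port in the dynamic range and keep the socket open. */
final class DynamicPort {
    static final int MIN_PORT = 49152;
    static final int MAX_PORT = 65535;

    private DynamicPort() {}

    /**
     * Returns an open ServerSocket bound to a random port in [49152, 65535].
     * The caller owns the socket and must close it; because the socket stays
     * open, nobody else can take the port in the meantime.
     */
    static ServerSocket openOnRandomPort(int maxAttempts) throws IOException {
        IOException lastFailure = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // nextInt's upper bound is exclusive, hence MAX_PORT + 1.
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                // Most likely the port is in use; pick another random one.
                lastFailure = e;
            }
        }
        throw new IOException("could not bind a port after " + maxAttempts + " attempts", lastFailure);
    }
}
```

When only the port number can be handed over (for example as a property value to another process), this sketch does not help: as the message says, closing the socket first reopens the race, so the process that really opens the port has to handle the conflict.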
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> -Dmitry >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote: >>>>>>>>>>>>>>>>>>>>>>> Hi Everyone, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I am working on bug >>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> There are two webrevs: >>>>>>>>>>>>>>>>>>>>>>> Webrev for jdk part: >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Webrev for hs part: >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Please take a look at some notes: >>>>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & >>>>>>>>>>>>>>>>>>>>>>> Jaroslav >>>>>>>>>>>>>>>>>>>>>>> Bachorik >>>>>>>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & >>>>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest >>>>>>>>>>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>>>>>>>>> looked >>>>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was >>>>>>>>>>>>>>>>>>>>>>> created for >>>>>>>>>>>>>>>>>>>>>>> them: >>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've >>>>>>>>>>>>>>>>>>>>>>> renamed to >>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. 
The java code to replace >>>>>>>>>>>>>>>>>>>>>>> old >>>>>>>>>>>>>>>>>>>>>>> shell >>>>>>>>>>>>>>>>>>>>>>> script >>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called >>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java, >>>>>>>>>>>>>>>>>>>>>>> hence the >>>>>>>>>>>>>>>>>>>>>>> huge >>>>>>>>>>>>>>>>>>>>>>> diff. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines >>>>>>>>>>>>>>>>>>>>>>> similar >>>>>>>>>>>>>>>>>>>>>>> to the >>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless >>>>>>>>>>>>>>>>>>>>>>> decided >>>>>>>>>>>>>>>>>>>>>>> to not >>>>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. >>>>>>>>>>>>>>>>>>>>>>> Please >>>>>>>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>>>>>>> know if >>>>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is >>>>>>>>>>>>>>>>>>>>>>> added to >>>>>>>>>>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>>>>>>>>>> repository is taken from this patch: >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test >>>>>>>>>>>>>>>>>>>>>>> library >>>>>>>>>>>>>>>>>>>>>>> process >>>>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance >>>>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html) >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> With best regards, >>>>>>>>>>>>>> Taras Ledkov >>>>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>>>>>>>> skype: taras_ledkov >>>>>>>>>>>>>> Phone: 7(812)3346-157 >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> With best regards, >>>>>>>>>>> Taras Ledkov >>>>>>>>>>> Mail-To: taras.ledkov at oracle.com >>>>>>>>>>> skype: taras_ledkov >>>>>>>>>>> Phone: 7(812)3346-157 >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>> -- >>> With best regards, >>> Taras Ledkov >>> Mail-To: taras.ledkov at oracle.com >>> skype: taras_ledkov >>> Phone: 7(812)3346-157 >> > From shanliang.jiang at oracle.com Thu Feb 27 03:59:47 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Thu, 27 Feb 2014 12:59:47 +0100 Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace In-Reply-To: <530F263E.8090408@oracle.com> References: <4b9ed302-c006-4829-b732-2ffc123d84f4@default> <530F263E.8090408@oracle.com> Message-ID: <530F28B3.4020809@oracle.com> Daniel Fuchs wrote: > On 2/27/14 11:43 AM, Mattias Tobiasson wrote: >> Hi, >> Could you please review this test fix. >> >> The test verifies that MemoryPoolMXBean sends a notification when >> used memory has reached the threshold. >> The flag thresholdExceeded marks if we have reached the memory >> threshold. When the flag is set, the test slows down to give time for >> the notification to be received. >> The problem is that thresholdExceeded is overwritten every time in >> the loop. 
Instead it should be set if any pool has reached the >> threshold. This means that the test continues to allocate memory at >> full speed, and we may get an OutOfMemory before we get the >> notification. >> > > Hi Mattias, > > I wonder whether you should also stop calling loadNext() once > thresholdExceeded is true? Yes I am thinking this too. Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 will always be skipped. Shanliang > > best regards, > > -- daniel > >> bug: >> https://bugs.openjdk.java.net/browse/JDK-8031065 >> >> webrev: >> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ >> >> Mattias >> > From mattias.tobiasson at oracle.com Thu Feb 27 04:16:10 2014 From: mattias.tobiasson at oracle.com (Mattias Tobiasson) Date: Thu, 27 Feb 2014 04:16:10 -0800 (PST) Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace Message-ID: <0b126035-d2ba-461a-854f-943e4c3bd0ce@default> Hi, thanks for the fast reviews. I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended. I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space. About line 172, you are correct. I will just remove that line. Thanks! Mattias ----- Original Message ----- From: shanliang.jiang at oracle.com To: daniel.fuchs at oracle.com Cc: mattias.tobiasson at oracle.com, serviceability-dev at openjdk.java.net Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace Daniel Fuchs wrote: > On 2/27/14 11:43 AM, Mattias Tobiasson wrote: >> Hi, >> Could you please review this test fix. 
>> >> The test verifies that MemoryPoolMXBean sends a notification when >> used memory has reached the threshold. >> The flag thresholdExceeded marks if we have reached the memory >> threshold. When the flag is set, the test slows down to give time for >> the notification to be received. >> The problem is that thresholdExceeded is overwritten every time in >> the loop. Instead it should be set if any pool has reached the >> threshold. This means that the test continues to allocate memory at >> full speed, and we may get an OutOfMemory before we get the >> notification. >> > > Hi Mattias, > > I wonder whether you should also stop calling loadNext() once > thresholdExceeded is true? Yes I am thinking this too. Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 will always be skipped. Shanliang > > best regards, > > -- daniel > >> bug: >> https://bugs.openjdk.java.net/browse/JDK-8031065 >> >> webrev: >> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ >> >> Mattias >> > From dmitry.samersoff at oracle.com Thu Feb 27 04:53:47 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Thu, 27 Feb 2014 16:53:47 +0400 Subject: PING Re: RR(S): JDK-7189721 nightly tests failed on JDI attaching In-Reply-To: <53077A0B.5010401@oracle.com> References: <52FE2D9A.4030700@oracle.com> <53077A0B.5010401@oracle.com> Message-ID: <530F355B.1010601@oracle.com> Still waiting for review. On 2014-02-21 20:08, Dmitry Samersoff wrote: > On 2014-02-14 18:52, Dmitry Samersoff wrote: >> Hi Everyone, >> >> Please, review the changes. >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-7189721/webrev.01/ >> >> Summary: >> >> If for some reason elf section with section names is not loaded to cache >> it attempts to read data using NULL section pointer. -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
From jaroslav.bachorik at oracle.com Thu Feb 27 04:58:22 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 27 Feb 2014 13:58:22 +0100 Subject: RFR 8035668: [test] Check that jstat does not cause problems when monitored application exits before jstat finishes its work Message-ID: <530F366E.2040806@oracle.com> Please, review the addition of the jstat related test. Issue : https://bugs.openjdk.java.net/browse/JDK-8035668 Webrev: http://cr.openjdk.java.net/~jbachorik/8035668/webrev.00/ The test logic is simple - first the application to be monitored is started. When the application is ready jstat is launched - it periodically (each 100ms) polls the monitored application status. At this moment the monitored application is signalled the possibility to exit - it waits for 500ms and exits. There should not be any errors neither in the monitored application nor jstat (checked by the exit status of their processes) and they should not hang (detected by the harness timeout). Thanks, -JB- From roger.riggs at oracle.com Thu Feb 27 07:34:01 2014 From: roger.riggs at oracle.com (roger riggs) Date: Thu, 27 Feb 2014 10:34:01 -0500 Subject: RFR 9: 8035889: jdk testlibrary - add printing of values of failed assertions In-Reply-To: <530EAE5F.3050308@oracle.com> References: <530E4FD8.9040605@oracle.com> <530E9C6C.8080409@oracle.com> <530EAC78.4000301@Oracle.com> <530EAE5F.3050308@oracle.com> Message-ID: <530F5AE9.8000405@oracle.com> Hi Mandy, I updated the webrev: http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ Alan suggested copying serviceability-dev so they have a chance to review if desired. I want to investigate if it is possible to use the TestNG Assert classes without the TestNG execution framework. It would be necessary to compile/run against TestNG.jar but it might not need the entire mechanism. 
Thanks, Roger On 2/26/2014 10:17 PM, Mandy Chung wrote: > On 2/26/2014 7:09 PM, Roger Riggs wrote: >> Hi Mandy, >> >> Yes, it might be more productive to switch the tests to TestNG. >> But it did provide support in cases where TestNG could not be used, >> for example in a directory of existing tests that had custom reporting. >> >> But I remember there is a problem with TestNG having a dependency for XML >> which is not supported in Profile1 and a number of tests had to be >> disabled >> in that configuration. Will XML always be available. Do we need to >> solve >> or work around that problem with TestNG? >> > > This is a good point. When we want to test just the base module for > example, how can we run TestNG tests? We need to address that certainly. > > My comment on TestNG is a question for new tests using this Asserts > class. Your patch is fine to go (after taking out @library tag if I > got it correct). > > Mandy > >> Thanks, Roger >> >> On 2/26/14 9:01 PM, Mandy Chung wrote: >>> Hi Roger, >>> >>> On 2/26/2014 12:34 PM, roger riggs wrote: >>>> The testlibrary for the jdk should be printing the values in the >>>> failed >>>> assertions to make debugging easier and quicker. >>>> >>>> The webrev adds the printing of the failed assertions and added >>>> methods >>>> for formatting and unconditional fail methods. >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ >>>> >>> >>> AssertsTest.java: line 28: @library doesn't look like it's needed. >>> There is no jdk/test/testlibrary directory and I think >>> jdk.testlibrary.* are found as relative to $test.src. >>> >>> Otherwise, the change looks okay. >>> >>> Now that jtreg supports TestNG and I wonder if this class should >>> retire some day (there are only about 10 existing tests using this >>> class). Are you writing new tests using this Asserts class? 
>>> >>> Mandy >>> >>>> Bug: >>>> 8035889: jdk testlibrary - add printing of values of failed >>>> assertions >>>> >>>> Thanks, Roger >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8035889 >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140227/a01db940/attachment.html From shanliang.jiang at oracle.com Thu Feb 27 08:12:49 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Thu, 27 Feb 2014 17:12:49 +0100 Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace In-Reply-To: <0b126035-d2ba-461a-854f-943e4c3bd0ce@default> References: <0b126035-d2ba-461a-854f-943e4c3bd0ce@default> Message-ID: <530F6401.8010708@oracle.com> Mattias Tobiasson wrote: > Hi, thanks for the fast reviews. > > I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended. > I prefer to do like this too :) > I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space. > You do not need to add a timeout, only change Line 151 for(;;) to while(!listenerInvoked) { and remove 160 -- 162 in case that an expected notification is not arrived, the testing harness has a timeout to stop the test. This way makes the test more robust, but I am OK with the current fix. Thanks, Shanliang > > About line 172, you are correct. I will just remove that line. Thanks! 
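The thresholdExceeded problem discussed in this thread (the flag being overwritten on every loop iteration instead of being set once any pool has reached its threshold) can be illustrated with a small standalone sketch. It is not the code of the test under review: the class ThresholdFlag and both method names are hypothetical, and the per-pool results are passed in as plain booleans rather than read from MemoryPoolMXBean instances.

```java
import java.util.List;

/** Hypothetical illustration of an overwritten flag versus an accumulated flag. */
final class ThresholdFlag {

    private ThresholdFlag() {}

    /** Problematic pattern: the flag ends up reflecting only the last pool that was checked. */
    static boolean overwritten(List<Boolean> poolExceeded) {
        boolean thresholdExceeded = false;
        for (boolean exceeded : poolExceeded) {
            thresholdExceeded = exceeded;
        }
        return thresholdExceeded;
    }

    /** Intended pattern: once any pool has crossed its threshold, the flag stays set. */
    static boolean accumulated(List<Boolean> poolExceeded) {
        boolean thresholdExceeded = false;
        for (boolean exceeded : poolExceeded) {
            thresholdExceeded |= exceeded;
        }
        return thresholdExceeded;
    }
}
```

With per-pool results of false, true, false, the first method reports false and the second reports true. That is the difference between continuing to allocate at full speed and slowing down to give the notification time to arrive.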
> > Mattias > > ----- Original Message ----- > From: shanliang.jiang at oracle.com > To: daniel.fuchs at oracle.com > Cc: mattias.tobiasson at oracle.com, serviceability-dev at openjdk.java.net > Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna > Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace > > Daniel Fuchs wrote: > >> On 2/27/14 11:43 AM, Mattias Tobiasson wrote: >> >>> Hi, >>> Could you please review this test fix. >>> >>> The test verifies that MemoryPoolMXBean sends a notification when >>> used memory has reached the threshold. >>> The flag thresholdExceeded marks if we have reached the memory >>> threshold. When the flag is set, the test slows down to give time for >>> the notification to be received. >>> The problem is that thresholdExceeded is overwritten every time in >>> the loop. Instead it should be set if any pool has reached the >>> threshold. This means that the test continues to allocate memory at >>> full speed, and we may get an OutOfMemory before we get the >>> notification. >>> >>> >> Hi Mattias, >> >> I wonder whether you should also stop calling loadNext() once >> thresholdExceeded is true? >> > Yes I am thinking this too. > > Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 > will always be skipped. > > Shanliang > >> best regards, >> >> -- daniel >> >> >>> bug: >>> https://bugs.openjdk.java.net/browse/JDK-8031065 >>> >>> webrev: >>> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ >>> >>> Mattias >>> >>> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140227/aa886465/attachment-0001.html From pavel.punegov at oracle.com Thu Feb 27 08:20:54 2014 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Thu, 27 Feb 2014 20:20:54 +0400 Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error: Broken pipe" messages Message-ID: <1701612.OzZc2qZ2Ld@work-dell> Please review the fix for: https://bugs.openjdk.java.net/browse/JDK-6946101 webrev: http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/ This change fixes two issues with the tests: 1. Fix incorrect 'quit' command sending to JDB when JDB process was finished. 2. Improve JDB unexpected exit detection and process synchronization. Description of fix: 1. Add allowExit parameter to cmd() to show that the given command can finish JDB. E.g., 'cont' command make JDB execute debuggee to the end. If allowExit wasn't set for a command then assume that it can't finish execution, and fail the test if it did. 2. Make test fail if it tries to send 'quit' or 'exit' commands. This makes it impossible to send quit/exit from test by mistake. Scaffold will finish JDB by itself if JDB didn't finish before be a command with allowExit set. Add dofinish() function to be the only method that may exit JDB. 3. Add proper synchronization into waitForFinish(). On all systems except SunOS use wait (from bash). On Solaris find the shell subprocess and wait for its finish. It replaces wait used on all other systems, because it doesn't work on sh/ksh as in bash. 4. Fix tests: add allowExit to tests where it's needed. -- Thanks, Pavel Punegov From daniel.daugherty at oracle.com Thu Feb 27 11:09:06 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 27 Feb 2014 12:09:06 -0700 Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <530EF65C.5040707@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> Message-ID: <530F8D52.2040400@oracle.com> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6471769 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 > src/share/vm/runtime/vm_operations.hpp No comments. src/share/vm/prims/jvmtiEnvBase.hpp No comments. src/share/vm/prims/jvmtiEnv.cpp No comments. src/share/vm/prims/jvmtiEnvThreadState.cpp No comments. src/share/vm/prims/jvmtiEventController.cpp JvmtiEventController::set_frame_pop() is called by JvmtiEnvThreadState::set_frame_pop() which is called by JvmtiEnv::NotifyFramePop(). The "MutexLocker mu(JvmtiThreadState_lock)" in JvmtiEventController::set_frame_pop() protected the work done by JvmtiEventControllerPrivate::set_frame_pop(): ets->get_frame_pops()->set(fpop); recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); Since multiple threads can call JVM/TI NotifyFramePop() on the same target thread, what keeps the threads from messing with the list of frame pops simultaneously or messing with the thread enabled events bits in parallel? I suspect that this might also be an issue for JvmtiEventController::clear_frame_pop() and JvmtiEventController::clear_to_frame_pop() also. src/share/vm/prims/jvmtiThreadState.cpp No comments. Dan > > Summary: > > It is the 2-nd round of review because the JTREG com/sun/jdi tests > discovered a regression > in the first round change. The issue was in the > JvmtiEventController::clear_frame_pop() > lock synchronization that is not allowed at safepoints. > > As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM > operation for safety. 
> Also, I've removed the lock synchronization from the 3 impacted > JvmtiEventController:: > functions: set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). > > Testing: > In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi > > > Thanks, > Serguei > > > On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6471769 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >> >> >> Summary: >> >> This is another Test Stabilization issue. >> The fix is very similar to other JVMTI stabilization fixes. >> It is to use safepoints for updating the PopFrame data instead of >> relying on the >> suspend equivalent condition mechanism >> (JvmtiEnv::is_thread_fully_suspended()) >> which is not adequate from the reliability point of view. >> >> Testing: >> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >> >> >> Thanks, >> Serguei >> > From serguei.spitsyn at oracle.com Thu Feb 27 12:28:35 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 27 Feb 2014 12:28:35 -0800 Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <530F8D52.2040400@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> Message-ID: <530F9FF3.7030209@oracle.com> Dan, Thank you a lot for reviewing this! On 2/27/14 11:09 AM, Daniel D. Daugherty wrote: > On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6471769 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >> > > src/share/vm/runtime/vm_operations.hpp > No comments. > > src/share/vm/prims/jvmtiEnvBase.hpp > No comments. > > src/share/vm/prims/jvmtiEnv.cpp > No comments. 
> > src/share/vm/prims/jvmtiEnvThreadState.cpp > No comments. > > src/share/vm/prims/jvmtiEventController.cpp > JvmtiEventController::set_frame_pop() is called by > JvmtiEnvThreadState::set_frame_pop() which is called by > JvmtiEnv::NotifyFramePop(). > > The "MutexLocker mu(JvmtiThreadState_lock)" in > JvmtiEventController::set_frame_pop() protected the work > done by JvmtiEventControllerPrivate::set_frame_pop(): > > ets->get_frame_pops()->set(fpop); > recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); Your check is the right thing to do, thanks! I should have explained this more clearly in this 2-nd review request. The approach I've taken here is that all these code paths are executed on the target thread or at a safepoint. This is true for all 3 functions: set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). The updated assert guards ensure that this is the case. It could be a good idea to add a No_Safepoint_Verifier to PopFrame() and NotifyFramePop() to make sure the current/target thread does not go to a safepoint until it has returned from update_for_pop_top_frame() and set_frame_pop(), respectively. A No_Safepoint_Verifier may also be needed in JvmtiExport::post_method_exit(). These are all the places where these functions are called: prims/jvmtiEnv.cpp: state->env_thread_state(this)->set_frame_pop(frame_number); // JvmtiEnv::NotifyFramePop() prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // JvmtiExport::post_method_exit() prims/jvmtiThreadState.cpp: ets->clear_frame_pop(popframe_number); // JvmtiThreadState::update_for_pop_top_frame() The function JvmtiEnvThreadState::clear_to_frame_pop() is currently never called. Thanks, Serguei > > Since multiple threads can call JVM/TI NotifyFramePop() on the > same target thread, what keeps the threads from messing with > the list of frame pops simultaneously or messing with the > thread enabled events bits in parallel?
> > I suspect that this might also be an issue for > JvmtiEventController::clear_frame_pop() and > JvmtiEventController::clear_to_frame_pop() also. > > src/share/vm/prims/jvmtiThreadState.cpp > No comments. > > Dan > > >> >> Summary: >> >> It is the 2-nd round of review because the JTREG com/sun/jdi tests >> discovered a regression >> in the first round change. The issue was in the >> JvmtiEventController::clear_frame_pop() >> lock synchronization that is not allowed at safepoints. >> >> As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM >> operation for safety. >> Also, I've removed the lock synchronization from the 3 impacted >> JvmtiEventController:: >> functions: set_frame_pop(), clear_frame_pop() and >> clear_to_frame_pop(). >> >> Testing: >> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >> >> >> Thanks, >> Serguei >> >> >> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>> >>> >>> Summary: >>> >>> This is another Test Stabilization issue. >>> The fix is very similar to other JVMTI stabilization fixes. >>> It is to use safepoints for updating the PopFrame data instead of >>> relying on the >>> suspend equivalent condition mechanism >>> (JvmtiEnv::is_thread_fully_suspended()) >>> which is not adequate from the reliability point of view. 
>>> >>> Testing: >>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>> >>> >>> Thanks, >>> Serguei >>> >> > From mandy.chung at oracle.com Thu Feb 27 12:32:58 2014 From: mandy.chung at oracle.com (Mandy Chung) Date: Thu, 27 Feb 2014 12:32:58 -0800 Subject: RFR 9: 8035889: jdk testlibrary - add printing of values of failed assertions In-Reply-To: <530F5AE9.8000405@oracle.com> References: <530E4FD8.9040605@oracle.com> <530E9C6C.8080409@oracle.com> <530EAC78.4000301@Oracle.com> <530EAE5F.3050308@oracle.com> <530F5AE9.8000405@oracle.com> Message-ID: <530FA0F9.5000106@oracle.com> Looks okay to me. Mandy On 2/27/14 7:34 AM, roger riggs wrote: > Hi Mandy, > > I updated the webrev: > http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ > > Alan suggested copying serviceability-dev so they have a chance to > review if desired. > > I want to investigate if it is possible to use the TestNG Assert > classes without > the TestNG execution framework. > It would be necessary to compile/run against TestNG.jar but it might not > need the entire mechanism. > > Thanks, Roger > > On 2/26/2014 10:17 PM, Mandy Chung wrote: >> On 2/26/2014 7:09 PM, Roger Riggs wrote: >>> Hi Mandy, >>> >>> Yes, it might be more productive to switch the tests to TestNG. >>> But it did provide support in cases where TestNG could not be used, >>> for example in a directory of existing tests that had custom reporting. >>> >>> But I remember there is a problem with TestNG having a dependency >>> for XML >>> which is not supported in Profile1 and a number of tests had to be >>> disabled >>> in that configuration. Will XML always be available. Do we need to >>> solve >>> or work around that problem with TestNG? >>> >> >> This is a good point. When we want to test just the base module for >> example, how can we run TestNG tests? We need to address that certainly. >> >> My comment on TestNG is a question for new tests using this Asserts >> class. 
Your patch is fine to go (after taking out @library tag if I >> got it correct). >> >> Mandy >> >>> Thanks, Roger >>> >>> On 2/26/14 9:01 PM, Mandy Chung wrote: >>>> Hi Roger, >>>> >>>> On 2/26/2014 12:34 PM, roger riggs wrote: >>>>> The testlibrary for the jdk should be printing the values in the >>>>> failed >>>>> assertions to make debugging easier and quicker. >>>>> >>>>> The webrev adds the printing of the failed assertions and added >>>>> methods >>>>> for formatting and unconditional fail methods. >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ >>>>> >>>>> >>>> >>>> AssertsTest.java: line 28: @library doesn't look like it's needed. >>>> There is no jdk/test/testlibrary directory and I think >>>> jdk.testlibrary.* are found as relative to $test.src. >>>> >>>> Otherwise, the change looks okay. >>>> >>>> Now that jtreg supports TestNG and I wonder if this class should >>>> retire some day (there are only about 10 existing tests using this >>>> class). Are you writing new tests using this Asserts class? >>>> >>>> Mandy >>>> >>>>> Bug: >>>>> 8035889: jdk testlibrary - add printing of values of failed >>>>> assertions >>>>> >>>>> Thanks, Roger >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8035889 >>>>> >>>>> >>>> >>> >> >
From serguei.spitsyn at oracle.com Thu Feb 27 13:03:27 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 27 Feb 2014 13:03:27 -0800 Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <530F9FF3.7030209@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> Message-ID: <530FA81F.8040309@oracle.com> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: > Dan, > > Thank you a lot for reviewing this! > > On 2/27/14 11:09 AM, Daniel D. Daugherty wrote: >> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>> >> >> src/share/vm/runtime/vm_operations.hpp >> No comments. >> >> src/share/vm/prims/jvmtiEnvBase.hpp >> No comments. >> >> src/share/vm/prims/jvmtiEnv.cpp >> No comments. >> >> src/share/vm/prims/jvmtiEnvThreadState.cpp >> No comments. >> >> src/share/vm/prims/jvmtiEventController.cpp >> JvmtiEventController::set_frame_pop() is called by >> JvmtiEnvThreadState::set_frame_pop() which is called by >> JvmtiEnv::NotifyFramePop(). >> >> The "MutexLocker mu(JvmtiThreadState_lock)" in >> JvmtiEventController::set_frame_pop() protected the work >> done by JvmtiEventControllerPrivate::set_frame_pop(): >> >> ets->get_frame_pops()->set(fpop); >> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); > > Your check is the right thing to do, thanks! > I had to explain this more clearly in this 2-nd review request. > > The approach I've taken here is that all this code paths are executed > on the target thread or at a safepoint.
> > It is true for all 3 functions: > set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). > > And the updated assert guards ensure that it is the case. > > It could be a good idea to add a No_Safepoint_Verifier for PopFrame() > and NotifyFramePop() > to make sure the current/target thread does not go to safepoint until > it is returned from > update_for_pop_top_frame() and set_frame_pop() correspondingly. > A No_Safepoint_Verifier can be also needed in the > JvmtiExport::post_method_exit(). > > These are all places where these functions are called: > prims/jvmtiEnv.cpp: > state->env_thread_state(this)->set_frame_pop(frame_number); // > JvmtiEnv::NotifyFramePop() > prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // > JvmtiExport::post_method_exit() > prims/jvmtiThreadState.cpp: > ets->clear_frame_pop(popframe_number); // > JvmtiThreadState::update_for_pop_top_frame() > > The function JvmtiEnvThreadState::clear_to_frame_pop() is never called > now. There is still a concern about recompute_thread_enabled(). If it is normally always protected with the JvmtiThreadState_lock then the approach above is not going to work. I'm trying to check this now. Thanks, Serguei > > Thanks, > Serguei > > > > > >> >> Since multiple threads can call JVM/TI NotifyFramePop() on the >> same target thread, what keeps the threads from messing with >> the list of frame pops simultaneously or messing with the >> thread enabled events bits in parallel? >> >> I suspect that this might also be an issue for >> JvmtiEventController::clear_frame_pop() and >> JvmtiEventController::clear_to_frame_pop() also. >> >> src/share/vm/prims/jvmtiThreadState.cpp >> No comments. >> >> Dan >> >> >>> >>> Summary: >>> >>> It is the 2-nd round of review because the JTREG com/sun/jdi tests >>> discovered a regression >>> in the first round change. The issue was in the >>> JvmtiEventController::clear_frame_pop() >>> lock synchronization that is not allowed at safepoints. 
>>> >>> As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM >>> operation for safety. >>> Also, I've removed the lock synchronization from the 3 impacted >>> JvmtiEventController:: >>> functions: set_frame_pop(), clear_frame_pop() and >>> clear_to_frame_pop(). >>> >>> Testing: >>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>> Please, review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>> >>>> >>>> Open webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>> >>>> >>>> Summary: >>>> >>>> This is another Test Stabilization issue. >>>> The fix is very similar to other JVMTI stabilization fixes. >>>> It is to use safepoints for updating the PopFrame data instead of >>>> relying on the >>>> suspend equivalent condition mechanism >>>> (JvmtiEnv::is_thread_fully_suspended()) >>>> which is not adequate from the reliability point of view. >>>> >>>> Testing: >>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>> >> > From serguei.spitsyn at oracle.com Thu Feb 27 14:00:18 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 27 Feb 2014 14:00:18 -0800 Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <530FA81F.8040309@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> <530FA81F.8040309@oracle.com> Message-ID: <530FB572.4010507@oracle.com> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote: > On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: >> Dan, >> >> Thank you a lot for reviewing this! >> >> On 2/27/14 11:09 AM, Daniel D. 
Daugherty wrote: >>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>>> Please, review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>> >>>> >>>> Open webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>>> >>> >>> src/share/vm/runtime/vm_operations.hpp >>> No comments. >>> >>> src/share/vm/prims/jvmtiEnvBase.hpp >>> No comments. >>> >>> src/share/vm/prims/jvmtiEnv.cpp >>> No comments. >>> >>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>> No comments. >>> >>> src/share/vm/prims/jvmtiEventController.cpp >>> JvmtiEventController::set_frame_pop() is called by >>> JvmtiEnvThreadState::set_frame_pop() which is called by >>> JvmtiEnv::NotifyFramePop(). >>> >>> The "MutexLocker mu(JvmtiThreadState_lock)" in >>> JvmtiEventController::set_frame_pop() protected the work >>> done by JvmtiEventControllerPrivate::set_frame_pop(): >>> >>> ets->get_frame_pops()->set(fpop); >>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); >> >> Your check is the right thing to do, thanks! >> I had to explain this more clearly in this 2-nd review request. >> >> The approach I've taken here is that all this code paths are executed >> on the target thread or at a safepoint. >> >> It is true for all 3 functions: >> set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). >> >> And the updated assert guards ensure that it is the case. >> >> It could be a good idea to add a No_Safepoint_Verifier for PopFrame() >> and NotifyFramePop() >> to make sure the current/target thread does not go to safepoint until >> it is returned from >> update_for_pop_top_frame() and set_frame_pop() correspondingly. >> A No_Safepoint_Verifier can be also needed in the >> JvmtiExport::post_method_exit(). 
>> >> These are all places where these functions are called: >> prims/jvmtiEnv.cpp: >> state->env_thread_state(this)->set_frame_pop(frame_number); // >> JvmtiEnv::NotifyFramePop() >> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // >> JvmtiExport::post_method_exit() >> prims/jvmtiThreadState.cpp: >> ets->clear_frame_pop(popframe_number); // >> JvmtiThreadState::update_for_pop_top_frame() >> >> The function JvmtiEnvThreadState::clear_to_frame_pop() is never >> called now. > > There is still a concern about recompute_thread_enabled(). > If it is normally always protected with the JvmtiThreadState_lock > then the approach above is not going to work. > I'm trying to check this now. Dan, I came to the conclusion that these 3 functions still must be protected by the JvmtiThreadState_lock when they are called outside of a safepoint. It is a little bit ugly but has to be safe though. Please, let me know if you see any problems with that. I'll send a new webrev soon. Thanks, Serguei > > Thanks, > Serguei > > >> >> Thanks, >> Serguei >> >> >> >> >> >>> >>> Since multiple threads can call JVM/TI NotifyFramePop() on the >>> same target thread, what keeps the threads from messing with >>> the list of frame pops simultaneously or messing with the >>> thread enabled events bits in parallel? >>> >>> I suspect that this might also be an issue for >>> JvmtiEventController::clear_frame_pop() and >>> JvmtiEventController::clear_to_frame_pop() also. >>> >>> src/share/vm/prims/jvmtiThreadState.cpp >>> No comments. >>> >>> Dan >>> >>> >>>> >>>> Summary: >>>> >>>> It is the 2-nd round of review because the JTREG com/sun/jdi >>>> tests discovered a regression >>>> in the first round change. The issue was in the >>>> JvmtiEventController::clear_frame_pop() >>>> lock synchronization that is not allowed at safepoints. >>>> >>>> As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM >>>> operation for safety.
>>>> Also, I've removed the lock synchronization from the 3 impacted >>>> JvmtiEventController:: >>>> functions: set_frame_pop(), clear_frame_pop() and >>>> clear_to_frame_pop(). >>>> >>>> Testing: >>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>>> Please, review the fix for: >>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>> >>>>> >>>>> Open webrev: >>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>>> >>>>> >>>>> Summary: >>>>> >>>>> This is another Test Stabilization issue. >>>>> The fix is very similar to other JVMTI stabilization fixes. >>>>> It is to use safepoints for updating the PopFrame data instead >>>>> of relying on the >>>>> suspend equivalent condition mechanism >>>>> (JvmtiEnv::is_thread_fully_suspended()) >>>>> which is not adequate from the reliability point of view. >>>>> >>>>> Testing: >>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>> >>> >> > From daniel.daugherty at oracle.com Thu Feb 27 14:47:15 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 27 Feb 2014 15:47:15 -0700 Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error: Broken pipe" messages In-Reply-To: <1701612.OzZc2qZ2Ld@work-dell> References: <1701612.OzZc2qZ2Ld@work-dell> Message-ID: <530FC073.3060905@oracle.com> On 2/27/14 9:20 AM, Pavel Punegov wrote: > Please review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6946101 > > webrev: > http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/ test/com/sun/jdi/ShellScaffold.sh line 531: # allows JDB to exit" stray double-quote at end of comment line 563: dofail "It's not allowed to send quit and exit commands from the test" 'and' should be 'or' line 819: # Kill debugger, it could be hang Typo: 'hang' -> 'hung' test/com/sun/jdi/ArrayLengthDumpTest.sh test/com/sun/jdi/CatchAllTest.sh test/com/sun/jdi/CatchCaughtTest.sh test/com/sun/jdi/CatchPatternTest.sh test/com/sun/jdi/CommandCommentDelimiter.sh test/com/sun/jdi/DeferredStepTest.sh test/com/sun/jdi/DeoptimizeWalk.sh test/com/sun/jdi/EvalArgs.sh test/com/sun/jdi/GetLocalVariables3Test.sh test/com/sun/jdi/GetLocalVariables4Test.sh test/com/sun/jdi/JdbExprTest.sh test/com/sun/jdi/JdbLockTest.sh test/com/sun/jdi/JdbMethodExitTest.sh test/com/sun/jdi/JdbMissStep.sh test/com/sun/jdi/MixedSuspendTest.sh test/com/sun/jdi/NotAField.sh test/com/sun/jdi/NullLocalVariable.sh test/com/sun/jdi/Redefine-g.sh test/com/sun/jdi/RedefineAnnotation.sh test/com/sun/jdi/RedefineChangeClassOrder.sh test/com/sun/jdi/RedefineClasses.sh test/com/sun/jdi/RedefineException.sh test/com/sun/jdi/RedefineFinal.sh test/com/sun/jdi/RedefineImplementor.sh test/com/sun/jdi/RedefineIntConstantToLong.sh test/com/sun/jdi/RedefineMulti.sh test/com/sun/jdi/RedefinePop.sh test/com/sun/jdi/RedefineTTYLineNumber.sh test/com/sun/jdi/StringConvertTest.sh test/com/sun/jdi/WatchFramePop.sh I _think_ I understand the new test driver style: - get rid of all explicit 'cmd quit' usages because mydojdbCmds() now wraps the test's dojdbCmds 
with a 'quit' cmd - any test that previously ended with a 'cmd cont' is presumed to be OK if that 'cmd cont' caused jdb to execute off the end of main(); sounds reasonable to me - a perfect example of the new logic to catch an errant run off the end is test/com/sun/jdi/WatchFramePop.sh - the last jdb cmd is 'next' - and jdb is NOT supposed to run off the end - the new logic should catch this nicely; I _think_ the old logic would only catch a run off the end if someone manually checked the test result Thumbs up! Dan > > > This change fixes two issues with the tests: > 1. Fix incorrect 'quit' command sending to JDB when JDB process was finished. > 2. Improve JDB unexpected exit detection and process synchronization. > > Description of fix: > 1. Add allowExit parameter to cmd() to show that the given command can finish > JDB. E.g., 'cont' command make JDB execute debuggee to the end. > If allowExit wasn't set for a command then assume that it can't finish > execution, and fail the test if it did. > > 2. Make test fail if it tries to send 'quit' or 'exit' commands. This makes > it impossible to send quit/exit from test by mistake. Scaffold will > finish JDB by itself if JDB didn't finish before be a command with allowExit > set. Add dofinish() function to be the only method that may exit JDB. > > 3. Add proper synchronization into waitForFinish(). On all systems except > SunOS use wait (from bash). On Solaris find the shell subprocess and wait for > its finish. It replaces wait used on all other systems, because it > doesn't work on sh/ksh as in bash. > > 4. Fix tests: add allowExit to tests where it's needed.
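The allowExit rule from the fix description quoted above can be sketched roughly as follows. This is only an illustration and not the code from the webrev: jdb_alive and send_cmd are hypothetical stand-ins for the scaffold's real process check and for writing to JDB's input, and the "cmd allowExit <command>" calling convention is an assumption about how the new parameter is passed.

```shell
#!/bin/sh
# Illustrative sketch only; not the actual ShellScaffold.sh implementation.

jdb_alive=1   # hypothetical flag: 1 while the simulated JDB is still running

# Hypothetical stand-in for sending a command to JDB.
# Here 'cont' simulates the debuggee running to the end, so JDB exits.
send_cmd() {
    case "$1" in
        cont) jdb_alive=0 ;;
    esac
}

# cmd [allowExit] command...
# Prints a verdict and returns 0 on success, 1 if the test must fail.
cmd() {
    allow_exit=0
    if [ "$1" = "allowExit" ]; then
        allow_exit=1
        shift
    fi

    # Tests may never send quit/exit themselves; the scaffold finishes JDB.
    case "$1" in
        quit|exit)
            echo "FAIL: tests may not send quit or exit"
            return 1 ;;
    esac

    # Do not write to a JDB that is already gone (avoids the broken pipe).
    if [ "$jdb_alive" -eq 0 ]; then
        echo "FAIL: JDB already finished"
        return 1
    fi

    send_cmd "$@"

    # A command that ends JDB is only acceptable when allowExit was given.
    if [ "$jdb_alive" -eq 0 ] && [ "$allow_exit" -eq 0 ]; then
        echo "FAIL: JDB exited after '$1' without allowExit"
        return 1
    fi

    echo "OK: $1"
    return 0
}
```

With this sketch, "cmd allowExit cont" is accepted even though the simulated JDB exits, while a plain "cmd cont" is reported as a failure, which matches the WatchFramePop.sh case Dan describes where JDB is not supposed to run off the end.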
> From serguei.spitsyn at oracle.com Thu Feb 27 19:50:18 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 27 Feb 2014 19:50:18 -0800 Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <530FB572.4010507@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> <530FA81F.8040309@oracle.com> <530FB572.4010507@oracle.com> Message-ID: <5310077A.7010004@oracle.com> Please, review the fix for: https://bugs.openjdk.java.net/browse/JDK-6471769 Open webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 Summary: This is another attempt to fix the JTREG com/sun/jdi test regression discovered in the first round change. The fix is to avoid lock synchronization at safepoints (jvmtiEventController.cpp). Thanks to Dan for catching the problem in the 2-nd round of review! Testing: All tests passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi Thanks, Serguei On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote: > On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote: >> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: >>> Dan, >>> >>> Thank you a lot for reviewing this! >>> >>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote: >>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>>>> Please, review the fix for: >>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>> >>>>> >>>>> Open webrev: >>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>>>> >>>> >>>> src/share/vm/runtime/vm_operations.hpp >>>> No comments. >>>> >>>> src/share/vm/prims/jvmtiEnvBase.hpp >>>> No comments. >>>> >>>> src/share/vm/prims/jvmtiEnv.cpp >>>> No comments. >>>> >>>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>>> No comments.
>>>> >>>> src/share/vm/prims/jvmtiEventController.cpp >>>> JvmtiEventController::set_frame_pop() is called by >>>> JvmtiEnvThreadState::set_frame_pop() which is called by >>>> JvmtiEnv::NotifyFramePop(). >>>> >>>> The "MutexLocker mu(JvmtiThreadState_lock)" in >>>> JvmtiEventController::set_frame_pop() protected the work >>>> done by JvmtiEventControllerPrivate::set_frame_pop(): >>>> >>>> ets->get_frame_pops()->set(fpop); >>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); >>> >>> Your check is the right thing to do, thanks! >>> I had to explain this more clearly in this 2-nd review request. >>> >>> The approach I've taken here is that all this code paths are executed >>> on the target thread or at a safepoint. >>> >>> It is true for all 3 functions: >>> set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). >>> >>> And the updated assert guards ensure that it is the case. >>> >>> It could be a good idea to add a No_Safepoint_Verifier for >>> PopFrame() and NotifyFramePop() >>> to make sure the current/target thread does not go to safepoint >>> until it is returned from >>> update_for_pop_top_frame() and set_frame_pop() correspondingly. >>> A No_Safepoint_Verifier can be also needed in the >>> JvmtiExport::post_method_exit(). >>> >>> These are all places where these functions are called: >>> prims/jvmtiEnv.cpp: >>> state->env_thread_state(this)->set_frame_pop(frame_number); // >>> JvmtiEnv::NotifyFramePop() >>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // >>> JvmtiExport::post_method_exit() >>> prims/jvmtiThreadState.cpp: >>> ets->clear_frame_pop(popframe_number); // >>> JvmtiThreadState::update_for_pop_top_frame() >>> >>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never >>> called now. >> >> There is still a concern about recompute_thread_enabled(). >> If it is normally always protected with the JvmtiThreadState_lock >> then the approach above is not going to work. >> I'm trying to check this now. 
> > Dan, > > I came to a conclusion that these 3 functions still must be protected > by the JvmtiThreadState_lock when they are called out of a safepoint. > It is a little bit ugly but has to be safe though. > > Please, let me know if you see eny problems with that. > I'll send a new webrev soon. > > Thanks, > Serguei > > >> >> Thanks, >> Serguei >> >> >>> >>> Thanks, >>> Serguei >>> >>> >>> >>> >>> >>>> >>>> Since multiple threads can call JVM/TI NotifyFramePop() on the >>>> same target thread, what keeps the threads from messing with >>>> the list of frame pops simultaneously or messing with the >>>> thread enabled events bits in parallel? >>>> >>>> I suspect that this might also be an issue for >>>> JvmtiEventController::clear_frame_pop() and >>>> JvmtiEventController::clear_to_frame_pop() also. >>>> >>>> src/share/vm/prims/jvmtiThreadState.cpp >>>> No comments. >>>> >>>> Dan >>>> >>>> >>>>> >>>>> Summary: >>>>> >>>>> It is the 2-nd round of review because the JTREG com/sun/jdi >>>>> tests discovered a regression >>>>> in the first round change. The issue was in the >>>>> JvmtiEventController::clear_frame_pop() >>>>> lock synchronization that is not allowed at safepoints. >>>>> >>>>> As a result I've changed the JvmtiEnv::NotifyFramePop to use a >>>>> VM operation for safety. >>>>> Also, I've removed the lock synchronization from the 3 impacted >>>>> JvmtiEventController:: >>>>> functions: set_frame_pop(), clear_frame_pop() and >>>>> clear_to_frame_pop(). >>>>> >>>>> Testing: >>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Please, review the fix for: >>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>> >>>>>> >>>>>> Open webrev: >>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>>>> >>>>>> >>>>>> Summary: >>>>>> >>>>>> This is another Test Stabilization issue. 
>>>>>> The fix is very similar to other JVMTI stabilization fixes. >>>>>> It is to use safepoints for updating the PopFrame data instead >>>>>> of relying on the >>>>>> suspend equivalent condition mechanism >>>>>> (JvmtiEnv::is_thread_fully_suspended()) >>>>>> which is not adequate from the reliability point of view. >>>>>> >>>>>> Testing: >>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>> >>>> >>> >> > From david.holmes at oracle.com Thu Feb 27 22:04:50 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 28 Feb 2014 16:04:50 +1000 Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <5310077A.7010004@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> <530FA81F.8040309@oracle.com> <530FB572.4010507@oracle.com> <5310077A.7010004@oracle.com> Message-ID: <53102702.4070603@oracle.com> Hi Serguei, On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6471769 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 > > > Summary: > > It is another attempt to fix the JTREG com/sun/jdi tests regression > discovered in the first round change. > The fix is to avoid lock synchronization at > safepoints(jvmtiEventController.cpp). > Thanks to Dan for catching the problem in the 2-nd round of review! The basic approach here seems sound. I find the checking for cur->is_VMThread() somewhat overly conservative - if we are at a safepoint, and executing this code, then we must be the VMThread. But ok. You could also use MutexLockerEx to avoid the need for locked and unlocked paths to a common call, but that's just stylistic. 
Though if you are grabbing the current thread anyway you can also use the MutexLocker calls that take the thread arg - to avoid a second look-up of the current thread. David ----- > Testing: > All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi > > > Thanks, > Serguei > > > On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote: >> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote: >>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: >>>> Dan, >>>> >>>> Thank you a lot for reviewing this! >>>> >>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote: >>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>>>>> Please, review the fix for: >>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>> >>>>>> >>>>>> Open webrev: >>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>>>>> >>>>> >>>>> src/share/vm/runtime/vm_operations.hpp >>>>> No comments. >>>>> >>>>> src/share/vm/prims/jvmtiEnvBase.hpp >>>>> No comments. >>>>> >>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>> No comments. >>>>> >>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>>>> No comments. >>>>> >>>>> src/share/vm/prims/jvmtiEventController.cpp >>>>> JvmtiEventController::set_frame_pop() is called by >>>>> JvmtiEnvThreadState::set_frame_pop() which is called by >>>>> JvmtiEnv::NotifyFramePop(). >>>>> >>>>> The "MutexLocker mu(JvmtiThreadState_lock)" in >>>>> JvmtiEventController::set_frame_pop() protected the work >>>>> done by JvmtiEventControllerPrivate::set_frame_pop(): >>>>> >>>>> ets->get_frame_pops()->set(fpop); >>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); >>>> >>>> Your check is the right thing to do, thanks! >>>> I had to explain this more clearly in this 2-nd review request. >>>> >>>> The approach I've taken here is that all this code paths are executed >>>> on the target thread or at a safepoint. 
>>>> >>>> It is true for all 3 functions: >>>> set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). >>>> >>>> And the updated assert guards ensure that it is the case. >>>> >>>> It could be a good idea to add a No_Safepoint_Verifier for >>>> PopFrame() and NotifyFramePop() >>>> to make sure the current/target thread does not go to safepoint >>>> until it is returned from >>>> update_for_pop_top_frame() and set_frame_pop() correspondingly. >>>> A No_Safepoint_Verifier can be also needed in the >>>> JvmtiExport::post_method_exit(). >>>> >>>> These are all places where these functions are called: >>>> prims/jvmtiEnv.cpp: >>>> state->env_thread_state(this)->set_frame_pop(frame_number); // >>>> JvmtiEnv::NotifyFramePop() >>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // >>>> JvmtiExport::post_method_exit() >>>> prims/jvmtiThreadState.cpp: >>>> ets->clear_frame_pop(popframe_number); // >>>> JvmtiThreadState::update_for_pop_top_frame() >>>> >>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never >>>> called now. >>> >>> There is still a concern about recompute_thread_enabled(). >>> If it is normally always protected with the JvmtiThreadState_lock >>> then the approach above is not going to work. >>> I'm trying to check this now. >> >> Dan, >> >> I came to a conclusion that these 3 functions still must be protected >> by the JvmtiThreadState_lock when they are called out of a safepoint. >> It is a little bit ugly but has to be safe though. >> >> Please, let me know if you see eny problems with that. >> I'll send a new webrev soon. >> >> Thanks, >> Serguei >> >> >>> >>> Thanks, >>> Serguei >>> >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> >>>> >>>> >>>>> >>>>> Since multiple threads can call JVM/TI NotifyFramePop() on the >>>>> same target thread, what keeps the threads from messing with >>>>> the list of frame pops simultaneously or messing with the >>>>> thread enabled events bits in parallel? 
>>>>> >>>>> I suspect that this might also be an issue for >>>>> JvmtiEventController::clear_frame_pop() and >>>>> JvmtiEventController::clear_to_frame_pop() also. >>>>> >>>>> src/share/vm/prims/jvmtiThreadState.cpp >>>>> No comments. >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> Summary: >>>>>> >>>>>> It is the 2-nd round of review because the JTREG com/sun/jdi >>>>>> tests discovered a regression >>>>>> in the first round change. The issue was in the >>>>>> JvmtiEventController::clear_frame_pop() >>>>>> lock synchronization that is not allowed at safepoints. >>>>>> >>>>>> As a result I've changed the JvmtiEnv::NotifyFramePop to use a >>>>>> VM operation for safety. >>>>>> Also, I've removed the lock synchronization from the 3 impacted >>>>>> JvmtiEventController:: >>>>>> functions: set_frame_pop(), clear_frame_pop() and >>>>>> clear_to_frame_pop(). >>>>>> >>>>>> Testing: >>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> Please, review the fix for: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>> >>>>>>> >>>>>>> Open webrev: >>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> >>>>>>> This is another Test Stabilization issue. >>>>>>> The fix is very similar to other JVMTI stabilization fixes. >>>>>>> It is to use safepoints for updating the PopFrame data instead >>>>>>> of relying on the >>>>>>> suspend equivalent condition mechanism >>>>>>> (JvmtiEnv::is_thread_fully_suspended()) >>>>>>> which is not adequate from the reliability point of view. 
>>>>>>> >>>>>>> Testing: >>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>> >>>>> >>>> >>> >> > From staffan.larsen at oracle.com Fri Feb 28 00:31:29 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 28 Feb 2014 09:31:29 +0100 Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error: Broken pipe" messages In-Reply-To: <530FC073.3060905@oracle.com> References: <1701612.OzZc2qZ2Ld@work-dell> <530FC073.3060905@oracle.com> Message-ID: <47FE9723-6797-4135-A103-E1E42675FE99@oracle.com> Very nice change - looks good! test/com/sun/jdi/ShellScaffold.sh line 1000: # mydojdbCmds() didn't finished because it waits for JDB message nit: finished -> finish Just a note that this should be pushed through jdk9/dev and not jdk9/hs-comp (where the webrev was made). Thanks, /Staffan On 27 feb 2014, at 23:47, Daniel D. Daugherty wrote: > On 2/27/14 9:20 AM, Pavel Punegov wrote: >> Please review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6946101 >> >> webrev: >> http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/ > > test/com/sun/jdi/ShellScaffold.sh > line 531: # allows JDB to exit" > stray double-quote at end of comment > > line 563: dofail "It's not allowed to send quit and exit commands from the test" > 'and' should be 'or' > > line 819: # Kill debugger, it could be hang > Typo: 'hang' -> 'hung' > > test/com/sun/jdi/ArrayLengthDumpTest.sh > test/com/sun/jdi/CatchAllTest.sh > test/com/sun/jdi/CatchCaughtTest.sh > test/com/sun/jdi/CatchPatternTest.sh > test/com/sun/jdi/CommandCommentDelimiter.sh > test/com/sun/jdi/DeferredStepTest.sh > test/com/sun/jdi/DeoptimizeWalk.sh > test/com/sun/jdi/EvalArgs.sh > test/com/sun/jdi/GetLocalVariables3Test.sh > test/com/sun/jdi/GetLocalVariables4Test.sh > test/com/sun/jdi/JdbExprTest.sh > test/com/sun/jdi/JdbLockTest.sh > test/com/sun/jdi/JdbMethodExitTest.sh > test/com/sun/jdi/JdbMissStep.sh > 
test/com/sun/jdi/MixedSuspendTest.sh > test/com/sun/jdi/NotAField.sh > test/com/sun/jdi/NullLocalVariable.sh > test/com/sun/jdi/Redefine-g.sh > test/com/sun/jdi/RedefineAnnotation.sh > test/com/sun/jdi/RedefineChangeClassOrder.sh > test/com/sun/jdi/RedefineClasses.sh > test/com/sun/jdi/RedefineException.sh > test/com/sun/jdi/RedefineFinal.sh > test/com/sun/jdi/RedefineImplementor.sh > test/com/sun/jdi/RedefineIntConstantToLong.sh > test/com/sun/jdi/RedefineMulti.sh > test/com/sun/jdi/RedefinePop.sh > test/com/sun/jdi/RedefineTTYLineNumber.sh > test/com/sun/jdi/StringConvertTest.sh > test/com/sun/jdi/WatchFramePop.sh > I _think_ I understand the new test driver style: > > - get rid of all explicit 'cmd quit' usages because mydojdbCmds() > now wraps the test's dojdbCmds with a 'quit' cmd > - any test that previously ended with a 'cmd cont' is presumed to > be OK of that 'cmd cont' caused jdb to execute off the end of > main(); sounds reasonable to me > - perfect example of the new logic to catch an errant run off the > end is test/com/sun/jdi/WatchFramePop.sh > - the last jdb cmd is 'next' > - and jdb is NOT supposed to run off the end > - the new logic should catch this nicely; I _think_ the old > logic would only catch a run off the end if someone manually > checked the test result > > > Thumbs up! > > Dan > > >> >> >> This change fixes two issues with the tests: >> 1. Fix incorrect 'quit' command sending to JDB when JDB process was finished. >> 2. Improve JDB unexpected exit detection and process synchronization. >> >> Description of fix: >> 1. Add allowExit parameter to cmd() to show that the given command can finish >> JDB. E.g., 'cont' command make JDB execute debuggee to the end. >> If allowExit wasn't set for a command then assume that it can't finish >> execution, and fail the test if it did. >> >> 2. Make test fail if it tries to send 'quit' or 'exit' commands. This makes >> it impossible to send quit/exit from test by mistake. 
Scaffold will >> finish JDB by itself if JDB didn't finish before be a command with allowExit >> set. Add dofinish() function to be the only method that may exit JDB. >> >> 3. Add proper synchronization into waitForFinish(). On all systems except >> SunOS use wait (from bash). On Solaris find the shell subprocess and wait for >> its finish. It replaces wait used on all other systems, because it >> doesn't work on sh/ksh as in bash. >> >> 4. Fix tests: add allowExit to tests where it's needed. >> > From staffan.larsen at oracle.com Fri Feb 28 01:43:20 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 28 Feb 2014 10:43:20 +0100 Subject: RFR 9: 8035889: jdk testlibrary - add printing of values of failed assertions In-Reply-To: <530F5AE9.8000405@oracle.com> References: <530E4FD8.9040605@oracle.com> <530E9C6C.8080409@oracle.com> <530EAC78.4000301@Oracle.com> <530EAE5F.3050308@oracle.com> <530F5AE9.8000405@oracle.com> Message-ID: <6A397A15-56DD-41FA-885F-4A3BF2415CB6@oracle.com> Looks good to me! Thanks, /Staffan On 27 feb 2014, at 16:34, roger riggs wrote: > Hi Mandy, > > I updated the webrev: > http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ > > Alan suggested copying serviceability-dev so they have a chance to review if desired. > > I want to investigate if it is possible to use the TestNG Assert classes without > the TestNG execution framework. > It would be necessary to compile/run against TestNG.jar but it might not > need the entire mechanism. > > Thanks, Roger > > On 2/26/2014 10:17 PM, Mandy Chung wrote: >> On 2/26/2014 7:09 PM, Roger Riggs wrote: >>> Hi Mandy, >>> >>> Yes, it might be more productive to switch the tests to TestNG. >>> But it did provide support in cases where TestNG could not be used, >>> for example in a directory of existing tests that had custom reporting. 
>>> >>> But I remember there is a problem with TestNG having a dependency for XML >>> which is not supported in Profile1 and a number of tests had to be disabled >>> in that configuration. Will XML always be available. Do we need to solve >>> or work around that problem with TestNG? >>> >> >> This is a good point. When we want to test just the base module for example, how can we run TestNG tests? We need to address that certainly. >> >> My comment on TestNG is a question for new tests using this Asserts class. Your patch is fine to go (after taking out @library tag if I got it correct). >> >> Mandy >> >>> Thanks, Roger >>> >>> On 2/26/14 9:01 PM, Mandy Chung wrote: >>>> Hi Roger, >>>> >>>> On 2/26/2014 12:34 PM, roger riggs wrote: >>>>> The testlibrary for the jdk should be printing the values in the failed >>>>> assertions to make debugging easier and quicker. >>>>> >>>>> The webrev adds the printing of the failed assertions and added methods >>>>> for formatting and unconditional fail methods. >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ >>>>> >>>> >>>> AssertsTest.java: line 28: @library doesn't look like it's needed. There is no jdk/test/testlibrary directory and I think jdk.testlibrary.* are found as relative to $test.src. >>>> >>>> Otherwise, the change looks okay. >>>> >>>> Now that jtreg supports TestNG and I wonder if this class should retire some day (there are only about 10 existing tests using this class). Are you writing new tests using this Asserts class? >>>> >>>> Mandy >>>> >>>>> Bug: >>>>> 8035889: jdk testlibrary - add printing of values of failed assertions >>>>> >>>>> Thanks, Roger >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8035889 >>>>> >>>>> >>>> >>> >> >
From staffan.larsen at oracle.com Fri Feb 28 01:47:02 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 28 Feb 2014 10:47:02 +0100 Subject: RFR 8035668: [test] Check that jstat does not cause problems when monitored application exits before jstat finishes its work In-Reply-To: <530F366E.2040806@oracle.com> References: <530F366E.2040806@oracle.com> Message-ID: <939A40A3-C202-49B6-B454-841D972A727E@oracle.com> You are missing the copyright-header. Otherwise looks good. Thanks, /Staffan On 27 feb 2014, at 13:58, Jaroslav Bachorik wrote: > Please, review the addition of the jstat related test. > > Issue : https://bugs.openjdk.java.net/browse/JDK-8035668 > Webrev: http://cr.openjdk.java.net/~jbachorik/8035668/webrev.00/ > > The test logic is simple - first the application to be monitored is started. When the application is ready jstat is launched - it periodically (each 100ms) polls the monitored application status. At this moment the monitored application is signalled the possibility to exit - it waits for 500ms and exits. There should not be any errors neither in the monitored application nor jstat (checked by the exit status of their processes) and they should not hang (detected by the harness timeout). > > Thanks, > > -JB- From mattias.tobiasson at oracle.com Fri Feb 28 02:33:08 2014 From: mattias.tobiasson at oracle.com (Mattias Tobiasson) Date: Fri, 28 Feb 2014 02:33:08 -0800 (PST) Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace Message-ID: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default> Hi, I have updated the test and now stop allocating when we have reached the threshold. Since we now do all allocations first and then just wait for the notification, I have split the loop into two separate loops to make it clearer.
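[Editorial sketch, not part of the original message.] The "allocate first, then only wait" structure described above could look roughly like the following. This is a minimal illustration under stated assumptions, not the code from the webrev: the class name TwoPhaseWaitSketch, the method run, and all of its parameters are hypothetical, and the explicit timeout is only there to keep the sketch self-contained (elsewhere in this thread the reviewers rely on the test harness timeout instead).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper; none of these names come from LowMemoryTest2.
final class TwoPhaseWaitSketch {

    // Loop 1: keep allocating until the supplied check says the threshold was reached.
    // Loop 2 (here a single blocking wait): stop allocating and wait for the listener.
    // Returns the number of allocation steps performed.
    static int run(BooleanSupplier thresholdReached, Runnable allocateStep,
                   CountDownLatch notified, long timeoutMillis) {
        int steps = 0;
        while (!thresholdReached.getAsBoolean()) {
            allocateStep.run();
            steps++;
        }
        try {
            // No further allocation happens while waiting for the notification.
            if (!notified.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("no notification within " + timeoutMillis + " ms");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return steps;
    }

    public static void main(String[] args) {
        // Stand-ins for real allocation and a real notification listener.
        int[] used = {0};
        CountDownLatch notified = new CountDownLatch(1);
        int steps = run(
            () -> used[0] >= 5,
            () -> { if (++used[0] == 5) notified.countDown(); },
            notified, 1000);
        System.out.println("allocation steps: " + steps);
    }
}
```

Passing the check and the allocation step in as parameters keeps the two phases visibly separate and lets the structure be exercised without actually filling a memory pool.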
To detect if we have reached the threshold I now check MemoryPoolMXBean.getUsageThresholdCount() > 0 instead of checking isUsageThresholdExceeded(). The reason for that is because the notification event is not generated immediately when isUsageThresholdExceeded() = true. The notification is only generated at the next GC. So that is the reason for why the old test kept allocating after it had reached the threshold (to trigger another GC). getUsageThresholdCount() is updated at the same time as the event is generated. So after getUsageThresholdCount() > 0, I can just wait for the notification without more allocations. webrev: http://cr.openjdk.java.net/~mtobiass/8031065/webrev.01 bug: https://bugs.openjdk.java.net/browse/JDK-8031065 Mattias ----- Original Message ----- From: shanliang.jiang at oracle.com To: mattias.tobiasson at oracle.com Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com Sent: Thursday, February 27, 2014 5:12:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace Mattias Tobiasson wrote: Hi, thanks for the fast reviews. I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended. I prefer to do like this too :) I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space. You do not need to add a timeout, only change Line 151 for(;;) to while(!listenerInvoked) { and remove 160 -- 162 in case that an expected notification is not arrived, the testing harness has a timeout to stop the test. This way makes the test more robust, but I am OK with the current fix. Thanks, Shanliang About line 172, you are correct. I will just remove that line. 
Thanks! Mattias ----- Original Message ----- From: shanliang.jiang at oracle.com To: daniel.fuchs at oracle.com Cc: mattias.tobiasson at oracle.com , serviceability-dev at openjdk.java.net Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace Daniel Fuchs wrote: On 2/27/14 11:43 AM, Mattias Tobiasson wrote: Hi, Could you please review this test fix. The test verifies that MemoryPoolMXBean sends a notification when used memory has reached the threshold. The flag thresholdExceeded marks if we have reached the memory threshold. When the flag is set, the test slows down to give time for the notification to be received. The problem is that thresholdExceeded is overwritten every time in the loop. Instead it should be set if any pool has reached the threshold. This means that the test continues to allocate memory at full speed, and we may get an OutOfMemory before we get the notification. Hi Mattias, I wonder whether you should also stop calling loadNext() once thresholdExceeded is true? Yes I am thinking this too. Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 will always be skipped. Shanliang best regards, -- daniel bug: https://bugs.openjdk.java.net/browse/JDK-8031065 webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ Mattias From shanliang.jiang at oracle.com Fri Feb 28 02:58:11 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Fri, 28 Feb 2014 11:58:11 +0100 Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace In-Reply-To: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default> References: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default> Message-ID: <53106BC3.2060505@oracle.com> Looks good! 
It could be improved to not use the variable thresholdExceeded: change Line 146 to while(true) { and remove Line 143 and 158 Thanks, Shanliang Mattias Tobiasson wrote: > Hi, > I have updated the test and now stop allocating when we have reached the threshold. > Since we now do all allocations first and then just wait for the notification, I have split the loop into two separate loops to make it clearer. > > To detect if we have reached the threshold I now check MemoryPoolMXBean.getUsageThresholdCount() > 0 instead of checking isUsageThresholdExceeded(). > The reason for that is because the notification event is not generated immediately when isUsageThresholdExceeded() = true. The notification is only generated at the next GC. So that is the reason for why the old test kept allocating after it had reached the threshold (to trigger another GC). > > getUsageThresholdCount() is updated at the same time as the event is generated. So after getUsageThresholdCount() > 0, I can just wait for the notification without more allocations. > > webrev: > http://cr.openjdk.java.net/~mtobiass/8031065/webrev.01 > > bug: > https://bugs.openjdk.java.net/browse/JDK-8031065 > > Mattias > > ----- Original Message ----- > From: shanliang.jiang at oracle.com > To: mattias.tobiasson at oracle.com > Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com > Sent: Thursday, February 27, 2014 5:12:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna > Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace > > > Mattias Tobiasson wrote: > > Hi, thanks for the fast reviews. > > I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended. I prefer to do like this too :) > > > I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. 
Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space. You do not need to add a timeout, only change Line 151 > for(;;) > to > while(!listenerInvoked) { > > and remove 160 -- 162 > > in case that an expected notification is not arrived, the testing harness has a timeout to stop the test. > > This way makes the test more robust, but I am OK with the current fix. > > Thanks, > Shanliang > > > About line 172, you are correct. I will just remove that line. Thanks! > > Mattias > > ----- Original Message ----- > From: shanliang.jiang at oracle.com To: daniel.fuchs at oracle.com Cc: mattias.tobiasson at oracle.com , serviceability-dev at openjdk.java.net Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna > Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace > > Daniel Fuchs wrote: > > On 2/27/14 11:43 AM, Mattias Tobiasson wrote: > > Hi, > Could you please review this test fix. > > The test verifies that MemoryPoolMXBean sends a notification when > used memory has reached the threshold. > The flag thresholdExceeded marks if we have reached the memory > threshold. When the flag is set, the test slows down to give time for > the notification to be received. > The problem is that thresholdExceeded is overwritten every time in > the loop. Instead it should be set if any pool has reached the > threshold. This means that the test continues to allocate memory at > full speed, and we may get an OutOfMemory before we get the > notification. Hi Mattias, > > I wonder whether you should also stop calling loadNext() once > thresholdExceeded is true? Yes I am thinking this too. > > Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 > will always be skipped. 
> > Shanliang > > best regards, > > -- daniel > > bug: https://bugs.openjdk.java.net/browse/JDK-8031065 webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ Mattias > From daniel.fuchs at oracle.com Fri Feb 28 03:05:47 2014 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Fri, 28 Feb 2014 12:05:47 +0100 Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace In-Reply-To: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default> References: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default> Message-ID: <53106D8B.9040407@oracle.com> Hi Mattias, The new version looks good! best regards, -- daniel On 2/28/14 11:33 AM, Mattias Tobiasson wrote: > Hi, > I have updated the test and now stop allocating when we have reached the threshold. > Since we now do all allocations first and then just wait for the notification, I have split the loop into two separate loops to make it clearer. > > To detect if we have reached the threshold I now check MemoryPoolMXBean.getUsageThresholdCount() > 0 instead of checking isUsageThresholdExceeded(). > The reason for that is because the notification event is not generated immediately when isUsageThresholdExceeded() = true. The notification is only generated at the next GC. So that is the reason for why the old test kept allocating after it had reached the threshold (to trigger another GC). > > getUsageThresholdCount() is updated at the same time as the event is generated. So after getUsageThresholdCount() > 0, I can just wait for the notification without more allocations. 
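[Editorial sketch, not part of the original message.] The check described in the paragraph above can be illustrated with the standard java.lang.management API. The class name ThresholdCheckSketch and the helper anyThresholdCrossed are hypothetical and are not taken from the test under review; the sketch neither sets a usage threshold nor allocates memory, it only shows the two MemoryPoolMXBean queries side by side.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.List;

// Hypothetical helper; the names are illustrative only.
final class ThresholdCheckSketch {

    // True once any pool has recorded at least one usage-threshold crossing.
    // Pools without usage-threshold support are skipped, because
    // getUsageThresholdCount() throws UnsupportedOperationException for them.
    static boolean anyThresholdCrossed(List<MemoryPoolMXBean> pools) {
        for (MemoryPoolMXBean pool : pools) {
            if (pool.isUsageThresholdSupported() && pool.getUsageThresholdCount() > 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<MemoryPoolMXBean> pools = ManagementFactory.getMemoryPoolMXBeans();
        for (MemoryPoolMXBean pool : pools) {
            if (!pool.isUsageThresholdSupported()) {
                continue;
            }
            // Print both values next to each other for comparison.
            System.out.println(pool.getName()
                + " exceeded=" + pool.isUsageThresholdExceeded()
                + " count=" + pool.getUsageThresholdCount());
        }
        System.out.println("any threshold crossed: " + anyThresholdCrossed(pools));
    }
}
```

In a test, a loop would call anyThresholdCrossed(...) after each allocation step and stop allocating as soon as it returns true.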
> > webrev: > http://cr.openjdk.java.net/~mtobiass/8031065/webrev.01 > > bug: > https://bugs.openjdk.java.net/browse/JDK-8031065 > > Mattias > > ----- Original Message ----- > From: shanliang.jiang at oracle.com > To: mattias.tobiasson at oracle.com > Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com > Sent: Thursday, February 27, 2014 5:12:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna > Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace > > > Mattias Tobiasson wrote: > > Hi, thanks for the fast reviews. > > I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended. I prefer to do like this too :) > > > I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space. You do not need to add a timeout, only change Line 151 > for(;;) > to > while(!listenerInvoked) { > > and remove 160 -- 162 > > in case that an expected notification is not arrived, the testing harness has a timeout to stop the test. > > This way makes the test more robust, but I am OK with the current fix. > > Thanks, > Shanliang > > > About line 172, you are correct. I will just remove that line. Thanks! > > Mattias > > ----- Original Message ----- > From: shanliang.jiang at oracle.com To: daniel.fuchs at oracle.com Cc: mattias.tobiasson at oracle.com , serviceability-dev at openjdk.java.net Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna > Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace > > Daniel Fuchs wrote: > > On 2/27/14 11:43 AM, Mattias Tobiasson wrote: > > Hi, > Could you please review this test fix. 
> > The test verifies that MemoryPoolMXBean sends a notification when > used memory has reached the threshold. > The flag thresholdExceeded marks if we have reached the memory > threshold. When the flag is set, the test slows down to give time for > the notification to be received. > The problem is that thresholdExceeded is overwritten every time in > the loop. Instead it should be set if any pool has reached the > threshold. This means that the test continues to allocate memory at > full speed, and we may get an OutOfMemory before we get the > notification. Hi Mattias, > > I wonder whether you should also stop calling loadNext() once > thresholdExceeded is true? Yes I am thinking this too. > > Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 > will always be skipped. > > Shanliang > > best regards, > > -- daniel > > bug: https://bugs.openjdk.java.net/browse/JDK-8031065 webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ Mattias > From mattias.tobiasson at oracle.com Fri Feb 28 03:47:11 2014 From: mattias.tobiasson at oracle.com (Mattias Tobiasson) Date: Fri, 28 Feb 2014 03:47:11 -0800 (PST) Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace Message-ID: <69c2acbd-c1a0-47fa-b255-9d110fdeb135@default> Thanks for the review. Unfortunately I can not remove the thresholdExceeded, because the "break" only leaves the loop of memory pools. I know nested loops are not perfect, but I did not want to change too much from the original test. Mattias ----- Original Message ----- From: shanliang.jiang at oracle.com To: mattias.tobiasson at oracle.com Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com Sent: Friday, February 28, 2014 11:58:20 AM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace Looks good! 
It could be improved to not use the variable thresholdExceeded: change Line 146 to while(true) { and remove Line 143 and 158 Thanks, Shanliang Mattias Tobiasson wrote: > Hi, > I have updated the test and now stop allocating when we have reached the threshold. > Since we now do all allocations first and then just wait for the notification, I have split the loop into two separate loops to make it clearer. > > To detect if we have reached the threshold I now check MemoryPoolMXBean.getUsageThresholdCount() > 0 instead of checking isUsageThresholdExceeded(). > The reason for that is because the notification event is not generated immediately when isUsageThresholdExceeded() = true. The notification is only generated at the next GC. So that is the reason for why the old test kept allocating after it had reached the threshold (to trigger another GC). > > getUsageThresholdCount() is updated at the same time as the event is generated. So after getUsageThresholdCount() > 0, I can just wait for the notification without more allocations. > > webrev: > http://cr.openjdk.java.net/~mtobiass/8031065/webrev.01 > > bug: > https://bugs.openjdk.java.net/browse/JDK-8031065 > > Mattias > > ----- Original Message ----- > From: shanliang.jiang at oracle.com > To: mattias.tobiasson at oracle.com > Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com > Sent: Thursday, February 27, 2014 5:12:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna > Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace > > > Mattias Tobiasson wrote: > > Hi, thanks for the fast reviews. > > I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended. I prefer to do like this too :) > > > I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. 
Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space. You do not need to add a timeout, only change Line 151 > for(;;) > to > while(!listenerInvoked) { > > and remove 160 -- 162 > > in case that an expected notification is not arrived, the testing harness has a timeout to stop the test. > > This way makes the test more robust, but I am OK with the current fix. > > Thanks, > Shanliang > > > About line 172, you are correct. I will just remove that line. Thanks! > > Mattias > > ----- Original Message ----- > From: shanliang.jiang at oracle.com To: daniel.fuchs at oracle.com Cc: mattias.tobiasson at oracle.com , serviceability-dev at openjdk.java.net Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna > Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace > > Daniel Fuchs wrote: > > On 2/27/14 11:43 AM, Mattias Tobiasson wrote: > > Hi, > Could you please review this test fix. > > The test verifies that MemoryPoolMXBean sends a notification when > used memory has reached the threshold. > The flag thresholdExceeded marks if we have reached the memory > threshold. When the flag is set, the test slows down to give time for > the notification to be received. > The problem is that thresholdExceeded is overwritten every time in > the loop. Instead it should be set if any pool has reached the > threshold. This means that the test continues to allocate memory at > full speed, and we may get an OutOfMemory before we get the > notification. Hi Mattias, > > I wonder whether you should also stop calling loadNext() once > thresholdExceeded is true? Yes I am thinking this too. > > Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 > will always be skipped. 
> > Shanliang > > best regards, > > -- daniel > > bug: https://bugs.openjdk.java.net/browse/JDK-8031065 webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/ Mattias > From mattias.tobiasson at oracle.com Fri Feb 28 04:56:11 2014 From: mattias.tobiasson at oracle.com (Mattias Tobiasson) Date: Fri, 28 Feb 2014 04:56:11 -0800 (PST) Subject: RFR(XS) 6625574: MemoryMXBean/Pending.java fails: "pending finalization = 0 but expected > 0" Message-ID: <78525bc3-f9ea-466d-abd0-b3ea762078bf@default> Hi, Could you please review this fix? The test often fails when run with command line "-Xcomp" The test creates some objects and stores them in a local variable in the function. The test expects those objects to survive until they are set to null. The problem seems to be that the optimizer realizes that the objects are never used and removes them before the test expects it. The solution is to move the local variable out to a public static variable. Then the optimizer will not remove it before expected. webrev: http://cr.openjdk.java.net/~mtobiass/6625574/webrev.00 bug: https://bugs.openjdk.java.net/browse/JDK-6625574 Mattias From staffan.larsen at oracle.com Fri Feb 28 04:58:38 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 28 Feb 2014 13:58:38 +0100 Subject: RFR(XS) 6625574: MemoryMXBean/Pending.java fails: "pending finalization = 0 but expected > 0" In-Reply-To: <78525bc3-f9ea-466d-abd0-b3ea762078bf@default> References: <78525bc3-f9ea-466d-abd0-b3ea762078bf@default> Message-ID: <3F7E2914-034A-4737-8F87-64F5EDCF81DE@oracle.com> Looks good! Thanks, /Staffan On 28 feb 2014, at 13:56, Mattias Tobiasson wrote: > Hi, > Could you please review this fix? > > The test often fails when run with command line "-Xcomp" > > The test creates some objects and stores them in a local variable in the function. The test expects those objects to survive until they are set to null. 
The problem seems to be that the optimizer realizes that the objects are never used and removes them before the test expects it. > > The solution is to move the local variable out to a public static variable. Then the optimizer will not remove it before expected. > > webrev: > http://cr.openjdk.java.net/~mtobiass/6625574/webrev.00 > > bug: > https://bugs.openjdk.java.net/browse/JDK-6625574 > > Mattias From pavel.punegov at oracle.com Fri Feb 28 06:36:23 2014 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Fri, 28 Feb 2014 18:36:23 +0400 Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error: Broken pipe" messages In-Reply-To: <47FE9723-6797-4135-A103-E1E42675FE99@oracle.com> References: <1701612.OzZc2qZ2Ld@work-dell> <530FC073.3060905@oracle.com> <47FE9723-6797-4135-A103-E1E42675FE99@oracle.com> Message-ID: <1838555.9rITELihyr@work-dell> New webrev, thanks to Igor I.: http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.01/ Fixed typos/grammar. Added the following lines to catch the situation where jdb exits because its input stream closed prematurely (a break in the while loop before the check): 996 # jdb exited because its input stream closed prematurely 997 jdbFailIfPresent "Input stream closed" This could happen if the dojdbCmds() subshell was killed or exited prematurely. Without this check the test fails with no complaint about the input stream. On Fri, 28 of Feb 2014 09:31:29 Staffan Larsen wrote: > Very nice change - looks good! > > test/com/sun/jdi/ShellScaffold.sh > line 1000: # mydojdbCmds() didn't finished because it waits for JDB > message nit: finished -> finish Fixed > Just a note that this should be pushed through jdk9/dev and not jdk9/hs-comp > (where the webrev was made). Igor made a webrev based on jdk9/dev and will sponsor me. On 27 feb 2014, at 23:47, Daniel D.
Daugherty wrote: >> On 2/27/14 9:20 AM, Pavel Punegov wrote: >>> Please review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6946101 >>> >>> webrev: >>> http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/ >> >> test/com/sun/jdi/ShellScaffold.sh >> >> line 531: # allows JDB to exit" >> >> stray double-quote at end of comment >> >> line 563: dofail "It's not allowed to send quit and exit commands from >> the test"> >> 'and' should be 'or' >> >> line 819: # Kill debugger, it could be hang >> >> Typo: 'hang' -> 'hung' >> Fixed >> I _think_ I understand the new test driver style: >> >> - get rid of all explicit 'cmd quit' usages because mydojdbCmds() >> now wraps the test's dojdbCmds with a 'quit' cmd Yep >> - any test that previously ended with a 'cmd cont' is presumed to >> be OK of that 'cmd cont' caused jdb to execute off the end of >> main(); sounds reasonable to me The test JdbMethodExitTest.sh doesn't have allowExit set for the last 'cmd cont'. This 'cont' should run jdb to breakpoint set with bkpt(); See java file inside the test: 183 // test trace method exit 184 traceExit1(); 185 bkpt(); 186 187 } >> - perfect example of the new logic to catch an errant run off the >> end is test/com/sun/jdi/WatchFramePop.sh >> - the last jdb cmd is 'next' >> - and jdb is NOT supposed to run off the end >> - the new logic should catch this nicely; I _think_ the old >> logic would only catch a run off the end if someone manually >> checked the test result My testing (see in the bug comments) shows that old logic can only catch the message absence with failIfNotPresent() functions. If I remove these checks tests will silently pass even though we have sent 'cmd exit' inside the test. 
-- Thanks, Pavel Punegov From staffan.larsen at oracle.com Fri Feb 28 06:39:19 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 28 Feb 2014 15:39:19 +0100 Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error: Broken pipe" messages In-Reply-To: <1838555.9rITELihyr@work-dell> References: <1701612.OzZc2qZ2Ld@work-dell> <530FC073.3060905@oracle.com> <47FE9723-6797-4135-A103-E1E42675FE99@oracle.com> <1838555.9rITELihyr@work-dell> Message-ID: <867D8A1C-031A-49F2-8AB4-ECD3BAEE8153@oracle.com> Still good. Thanks, /Staffan On 28 feb 2014, at 15:36, Pavel Punegov wrote: > New wevrev, thanks to Igor I.: > http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.01/ > > Fixed typos/grammar > Added next string to catch the situation when jdb exited with > input stream closed prematurely (break in a while loop before the check) : > 996 # jdb exited because its input stream closed prematurely > 997 jdbFailIfPresent "Input stream closed" > > It could happen if dojdbCmds() subshell was killed or exited prematurely. > Without this check test fails with no complains about Input stream. > > > On Fri, 28 of Feb 2014 09:31:29 Staffan Larsen wrote: >> Very nice change - looks good! >> >> test/com/sun/jdi/ShellScaffold.sh >> line 1000: # mydojdbCmds() didn't finished because it waits for JDB >> message nit: finished -> finish > Fixed > >> Just a note that this should be pushed through jdk9/dev and not jdk9/hs-comp >> (where the webrev was made). > Igor made a webrev based on jdk9/dev and will sponsor me. > > > On 27 feb 2014, at 23:47, Daniel D. 
Daugherty > wrote: >>> On 2/27/14 9:20 AM, Pavel Punegov wrote: >>>> Please review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-6946101 >>>> >>>> webrev: >>>> http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/ >>> >>> test/com/sun/jdi/ShellScaffold.sh >>> >>> line 531: # allows JDB to exit" >>> >>> stray double-quote at end of comment >>> >>> line 563: dofail "It's not allowed to send quit and exit commands from >>> the test"> >>> 'and' should be 'or' >>> >>> line 819: # Kill debugger, it could be hang >>> >>> Typo: 'hang' -> 'hung' >>> > Fixed > >>> I _think_ I understand the new test driver style: >>> >>> - get rid of all explicit 'cmd quit' usages because mydojdbCmds() >>> now wraps the test's dojdbCmds with a 'quit' cmd > Yep > >>> - any test that previously ended with a 'cmd cont' is presumed to >>> be OK of that 'cmd cont' caused jdb to execute off the end of >>> main(); sounds reasonable to me > The test JdbMethodExitTest.sh doesn't have allowExit set for the last > 'cmd cont'. This 'cont' should run jdb to breakpoint set with bkpt(); > See java file inside the test: > 183 // test trace method exit > 184 traceExit1(); > 185 bkpt(); > 186 > 187 } > >>> - perfect example of the new logic to catch an errant run off the >>> end is test/com/sun/jdi/WatchFramePop.sh >>> - the last jdb cmd is 'next' >>> - and jdb is NOT supposed to run off the end >>> - the new logic should catch this nicely; I _think_ the old >>> logic would only catch a run off the end if someone manually >>> checked the test result > My testing (see in the bug comments) shows that old logic can only catch the > message absence with failIfNotPresent() functions. If I remove these checks > tests will silently pass even though we have sent 'cmd exit' inside the test. 
> > -- > Thanks, > Pavel Punegov From stuart.marks at oracle.com Fri Feb 28 09:27:26 2014 From: stuart.marks at oracle.com (Stuart Marks) Date: Fri, 28 Feb 2014 09:27:26 -0800 Subject: RFR 6835233 : Fedora 9 jdk regression test failed: java/lang/instrument/ParallelTransformerLoader.sh In-Reply-To: <530E41DB.20801@oracle.com> References: <530E41DB.20801@oracle.com> Message-ID: <5310C6FE.6040406@oracle.com> On 2/26/14 11:34 AM, Brent Christian wrote: > File under "chipping away at test stabilization issues." > > https://bugs.openjdk.java.net/browse/JDK-6835233 > > I would like to resolve this bug by removing the "@ignore" tag for JDK 9, and > bring the test back into rotation. If the failure comes back, I'll submit a new > issue for further investigation. +1 from me on "chipping away at test stabilization issues." :-) I guess there is some risk of adding new intermittent failures, but tackling @ignore'd tests is important too. Thanks for keeping an eye on this. s'marks From robert.field at oracle.com Fri Feb 28 10:45:50 2014 From: robert.field at oracle.com (robert.field at oracle.com) Date: Fri, 28 Feb 2014 18:45:50 +0000 Subject: hg: jdk8/tl/jdk: 8035777: Consistent Lambda construction Message-ID: <20140228184805.22B6D6239C@hg.openjdk.java.net> Changeset: 183a8c520b4a Author: rfield Date: 2014-02-28 10:43 -0800 URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/183a8c520b4a 8035777: Consistent Lambda construction Reviewed-by: ahgross, briangoetz, dlsmith ! src/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java !
src/share/classes/java/lang/invoke/TypeConvertingMethodAdapter.java + test/java/lang/invoke/lambda/LambdaReceiver.java + test/java/lang/invoke/lambda/LambdaReceiverBridge.java + test/java/lang/invoke/lambda/LambdaReceiver_anotherpkg/LambdaReceiver_A.java + test/java/lang/invoke/lambda/LambdaReturn.java + test/java/lang/invoke/lambda/MetafactoryArityTest.java + test/java/lang/invoke/lambda/MetafactoryParameterCastTest.java + test/java/lang/invoke/lambda/MetafactorySamReturnTest.java From brent.christian at oracle.com Fri Feb 28 11:01:55 2014 From: brent.christian at oracle.com (Brent Christian) Date: Fri, 28 Feb 2014 11:01:55 -0800 Subject: RFR 6835233 : Fedora 9 jdk regression test failed: java/lang/instrument/ParallelTransformerLoader.sh In-Reply-To: <5310C6FE.6040406@oracle.com> References: <530E41DB.20801@oracle.com> <5310C6FE.6040406@oracle.com> Message-ID: <5310DD23.1040706@oracle.com> On 2/28/14 9:27 AM, Stuart Marks wrote: > > I guess there is some risk of adding new intermittent failures, but > tackling @ignore'd tests is important too. Right - the main risk is that we will see this test fail again at some point in the future. I'll be keeping an eye out for that. Thanks for the review, guys. -Brent From daniel.daugherty at oracle.com Fri Feb 28 11:22:51 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 28 Feb 2014 12:22:51 -0700 Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error: Broken pipe" messages In-Reply-To: <1838555.9rITELihyr@work-dell> References: <1701612.OzZc2qZ2Ld@work-dell> <530FC073.3060905@oracle.com> <47FE9723-6797-4135-A103-E1E42675FE99@oracle.com> <1838555.9rITELihyr@work-dell> Message-ID: <5310E20B.5080900@oracle.com> On 2/28/14 7:36 AM, Pavel Punegov wrote: > New wevrev, thanks to Igor I.: > http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.01/ Thumbs up. 
Dan > > Fixed typos/grammar > Added next string to catch the situation when jdb exited with > input stream closed prematurely (break in a while loop before the check) : > 996 # jdb exited because its input stream closed prematurely > 997 jdbFailIfPresent "Input stream closed" > > It could happen if dojdbCmds() subshell was killed or exited prematurely. > Without this check test fails with no complains about Input stream. > > > On Fri, 28 of Feb 2014 09:31:29 Staffan Larsen wrote: >> Very nice change - looks good! >> >> test/com/sun/jdi/ShellScaffold.sh >> line 1000: # mydojdbCmds() didn't finished because it waits for JDB >> message nit: finished -> finish > Fixed > >> Just a note that this should be pushed through jdk9/dev and not jdk9/hs-comp >> (where the webrev was made). > Igor made a webrev based on jdk9/dev and will sponsor me. > > > On 27 feb 2014, at 23:47, Daniel D. Daugherty > wrote: >>> On 2/27/14 9:20 AM, Pavel Punegov wrote: >>>> Please review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-6946101 >>>> >>>> webrev: >>>> http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/ >>> test/com/sun/jdi/ShellScaffold.sh >>> >>> line 531: # allows JDB to exit" >>> >>> stray double-quote at end of comment >>> >>> line 563: dofail "It's not allowed to send quit and exit commands from >>> the test"> >>> 'and' should be 'or' >>> >>> line 819: # Kill debugger, it could be hang >>> >>> Typo: 'hang' -> 'hung' >>> > Fixed > >>> I _think_ I understand the new test driver style: >>> >>> - get rid of all explicit 'cmd quit' usages because mydojdbCmds() >>> now wraps the test's dojdbCmds with a 'quit' cmd > Yep > >>> - any test that previously ended with a 'cmd cont' is presumed to >>> be OK of that 'cmd cont' caused jdb to execute off the end of >>> main(); sounds reasonable to me > The test JdbMethodExitTest.sh doesn't have allowExit set for the last > 'cmd cont'. 
This 'cont' should run jdb to breakpoint set with bkpt(); > See java file inside the test: > 183 // test trace method exit > 184 traceExit1(); > 185 bkpt(); > 186 > 187 } > >>> - perfect example of the new logic to catch an errant run off the >>> end is test/com/sun/jdi/WatchFramePop.sh >>> - the last jdb cmd is 'next' >>> - and jdb is NOT supposed to run off the end >>> - the new logic should catch this nicely; I _think_ the old >>> logic would only catch a run off the end if someone manually >>> checked the test result > My testing (see in the bug comments) shows that old logic can only catch the > message absence with failIfNotPresent() functions. If I remove these checks > tests will silently pass even though we have sent 'cmd exit' inside the test. > From serguei.spitsyn at oracle.com Fri Feb 28 11:55:02 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 28 Feb 2014 11:55:02 -0800 Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <53102702.4070603@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> <530FA81F.8040309@oracle.com> <530FB572.4010507@oracle.com> <5310077A.7010004@oracle.com> <53102702.4070603@oracle.com> Message-ID: <5310E996.9010800@oracle.com> On 2/27/14 10:04 PM, David Holmes wrote: > Hi Serguei, > > On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6471769 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 >> >> >> >> Summary: >> >> It is another attempt to fix the JTREG com/sun/jdi tests regression >> discovered in the first round change. >> The fix is to avoid lock synchronization at >> safepoints(jvmtiEventController.cpp). >> Thanks to Dan for catching the problem in the 2-nd round of review! 
> > The basic approach here seems sound. Thank you for reviewing the fix! > > I find the checking for cur->is_VMThread() somewhat overly > conservative - if we are at a safepoint, and executing this code, then > we must be the VMThread. But ok. Agreed and simplified. Thanks! > > You could also use MutexLockerEx to avoid the need for locked and > unlocked paths to a common call, but that's just stylistic. Though if > you are grabbing the current thread anyway you can also use the > MutexLocker calls that take the thread arg - to avoid a second look-up > of the current thread. Thank you for reminding. I keep forgetting about it. Will check what is better here, just do not want to rerun the whole testing. But I'm in favor to make it simpler. :) Thanks, Serguei > > David > ----- > >> Testing: >> All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >> >> >> Thanks, >> Serguei >> >> >> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote: >>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote: >>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: >>>>> Dan, >>>>> >>>>> Thank you a lot for reviewing this! >>>>> >>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote: >>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>>>>>> Please, review the fix for: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>> >>>>>>> >>>>>>> Open webrev: >>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>>>>>> >>>>>>> >>>>>> >>>>>> src/share/vm/runtime/vm_operations.hpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/prims/jvmtiEnvBase.hpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>>>>> No comments. 
>>>>>> >>>>>> src/share/vm/prims/jvmtiEventController.cpp >>>>>> JvmtiEventController::set_frame_pop() is called by >>>>>> JvmtiEnvThreadState::set_frame_pop() which is called by >>>>>> JvmtiEnv::NotifyFramePop(). >>>>>> >>>>>> The "MutexLocker mu(JvmtiThreadState_lock)" in >>>>>> JvmtiEventController::set_frame_pop() protected the work >>>>>> done by JvmtiEventControllerPrivate::set_frame_pop(): >>>>>> >>>>>> ets->get_frame_pops()->set(fpop); >>>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); >>>>> >>>>> Your check is the right thing to do, thanks! >>>>> I had to explain this more clearly in this 2-nd review request. >>>>> >>>>> The approach I've taken here is that all this code paths are executed >>>>> on the target thread or at a safepoint. >>>>> >>>>> It is true for all 3 functions: >>>>> set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). >>>>> >>>>> And the updated assert guards ensure that it is the case. >>>>> >>>>> It could be a good idea to add a No_Safepoint_Verifier for >>>>> PopFrame() and NotifyFramePop() >>>>> to make sure the current/target thread does not go to safepoint >>>>> until it is returned from >>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly. >>>>> A No_Safepoint_Verifier can be also needed in the >>>>> JvmtiExport::post_method_exit(). >>>>> >>>>> These are all places where these functions are called: >>>>> prims/jvmtiEnv.cpp: >>>>> state->env_thread_state(this)->set_frame_pop(frame_number); // >>>>> JvmtiEnv::NotifyFramePop() >>>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // >>>>> JvmtiExport::post_method_exit() >>>>> prims/jvmtiThreadState.cpp: >>>>> ets->clear_frame_pop(popframe_number); // >>>>> JvmtiThreadState::update_for_pop_top_frame() >>>>> >>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never >>>>> called now. >>>> >>>> There is still a concern about recompute_thread_enabled(). 
>>>> If it is normally always protected with the JvmtiThreadState_lock >>>> then the approach above is not going to work. >>>> I'm trying to check this now. >>> >>> Dan, >>> >>> I came to a conclusion that these 3 functions still must be protected >>> by the JvmtiThreadState_lock when they are called out of a safepoint. >>> It is a little bit ugly but has to be safe though. >>> >>> Please, let me know if you see eny problems with that. >>> I'll send a new webrev soon. >>> >>> Thanks, >>> Serguei >>> >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> Since multiple threads can call JVM/TI NotifyFramePop() on the >>>>>> same target thread, what keeps the threads from messing with >>>>>> the list of frame pops simultaneously or messing with the >>>>>> thread enabled events bits in parallel? >>>>>> >>>>>> I suspect that this might also be an issue for >>>>>> JvmtiEventController::clear_frame_pop() and >>>>>> JvmtiEventController::clear_to_frame_pop() also. >>>>>> >>>>>> src/share/vm/prims/jvmtiThreadState.cpp >>>>>> No comments. >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> >>>>>>> It is the 2-nd round of review because the JTREG com/sun/jdi >>>>>>> tests discovered a regression >>>>>>> in the first round change. The issue was in the >>>>>>> JvmtiEventController::clear_frame_pop() >>>>>>> lock synchronization that is not allowed at safepoints. >>>>>>> >>>>>>> As a result I've changed the JvmtiEnv::NotifyFramePop to use a >>>>>>> VM operation for safety. >>>>>>> Also, I've removed the lock synchronization from the 3 impacted >>>>>>> JvmtiEventController:: >>>>>>> functions: set_frame_pop(), clear_frame_pop() and >>>>>>> clear_to_frame_pop(). 
>>>>>>> >>>>>>> Testing: >>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Please, review the fix for: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>> >>>>>>>> >>>>>>>> Open webrev: >>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Summary: >>>>>>>> >>>>>>>> This is another Test Stabilization issue. >>>>>>>> The fix is very similar to other JVMTI stabilization fixes. >>>>>>>> It is to use safepoints for updating the PopFrame data instead >>>>>>>> of relying on the >>>>>>>> suspend equivalent condition mechanism >>>>>>>> (JvmtiEnv::is_thread_fully_suspended()) >>>>>>>> which is not adequate from the reliability point of view. >>>>>>>> >>>>>>>> Testing: >>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> From daniel.daugherty at oracle.com Fri Feb 28 13:12:57 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 28 Feb 2014 14:12:57 -0700 Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <5310E996.9010800@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> <530FA81F.8040309@oracle.com> <530FB572.4010507@oracle.com> <5310077A.7010004@oracle.com> <53102702.4070603@oracle.com> <5310E996.9010800@oracle.com> Message-ID: <5310FBD9.5090607@oracle.com> On 2/28/14 12:55 PM, serguei.spitsyn at oracle.com wrote: > On 2/27/14 10:04 PM, David Holmes wrote: >> Hi Serguei, >> >> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 >>> Thumbs up! (including the tweaks in the .4 version) src/share/vm/runtime/vm_operations.hpp No comments. src/share/vm/prims/jvmtiEnvBase.hpp line 375: bool allow_nested_vm_operations() const { return true; } Does VM_SetFramePop really have to permit nested VMops? src/share/vm/prims/jvmtiEnv.cpp No comments. src/share/vm/prims/jvmtiEnvThreadState.cpp No comments. src/share/vm/prims/jvmtiEventController.cpp No comments. src/share/vm/prims/jvmtiThreadState.cpp No comments. Dan >>> >>> >>> Summary: >>> >>> It is another attempt to fix the JTREG com/sun/jdi tests regression >>> discovered in the first round change. >>> The fix is to avoid lock synchronization at >>> safepoints(jvmtiEventController.cpp). >>> Thanks to Dan for catching the problem in the 2-nd round of review! >> >> The basic approach here seems sound. > > Thank you for reviewing the fix! > >> >> I find the checking for cur->is_VMThread() somewhat overly >> conservative - if we are at a safepoint, and executing this code, >> then we must be the VMThread. But ok. > > Agreed and simplified. Thanks!
> >> >> You could also use MutexLockerEx to avoid the need for locked and >> unlocked paths to a common call, but that's just stylistic. Though if >> you are grabbing the current thread anyway you can also use the >> MutexLocker calls that take the thread arg - to avoid a second >> look-up of the current thread. > > Thank you for reminding. I keep forgetting about it. > Will check what is better here, just do not want to rerun the whole > testing. > But I'm in favor to make it simpler. :) > > Thanks, > Serguei > >> >> David >> ----- >> >>> Testing: >>> All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg >>> com/sun/jdi >>> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote: >>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote: >>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Dan, >>>>>> >>>>>> Thank you a lot for reviewing this! >>>>>> >>>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote: >>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Please, review the fix for: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>> >>>>>>>> >>>>>>>> Open webrev: >>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> src/share/vm/runtime/vm_operations.hpp >>>>>>> No comments. >>>>>>> >>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp >>>>>>> No comments. >>>>>>> >>>>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>>>> No comments. >>>>>>> >>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>>>>>> No comments. >>>>>>> >>>>>>> src/share/vm/prims/jvmtiEventController.cpp >>>>>>> JvmtiEventController::set_frame_pop() is called by >>>>>>> JvmtiEnvThreadState::set_frame_pop() which is called by >>>>>>> JvmtiEnv::NotifyFramePop(). 
>>>>>>> >>>>>>> The "MutexLocker mu(JvmtiThreadState_lock)" in >>>>>>> JvmtiEventController::set_frame_pop() protected the work >>>>>>> done by JvmtiEventControllerPrivate::set_frame_pop(): >>>>>>> >>>>>>> ets->get_frame_pops()->set(fpop); >>>>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); >>>>>> >>>>>> Your check is the right thing to do, thanks! >>>>>> I had to explain this more clearly in this 2-nd review request. >>>>>> >>>>>> The approach I've taken here is that all this code paths are >>>>>> executed >>>>>> on the target thread or at a safepoint. >>>>>> >>>>>> It is true for all 3 functions: >>>>>> set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). >>>>>> >>>>>> And the updated assert guards ensure that it is the case. >>>>>> >>>>>> It could be a good idea to add a No_Safepoint_Verifier for >>>>>> PopFrame() and NotifyFramePop() >>>>>> to make sure the current/target thread does not go to safepoint >>>>>> until it is returned from >>>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly. >>>>>> A No_Safepoint_Verifier can be also needed in the >>>>>> JvmtiExport::post_method_exit(). >>>>>> >>>>>> These are all places where these functions are called: >>>>>> prims/jvmtiEnv.cpp: >>>>>> state->env_thread_state(this)->set_frame_pop(frame_number); // >>>>>> JvmtiEnv::NotifyFramePop() >>>>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // >>>>>> JvmtiExport::post_method_exit() >>>>>> prims/jvmtiThreadState.cpp: >>>>>> ets->clear_frame_pop(popframe_number); // >>>>>> JvmtiThreadState::update_for_pop_top_frame() >>>>>> >>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never >>>>>> called now. >>>>> >>>>> There is still a concern about recompute_thread_enabled(). >>>>> If it is normally always protected with the JvmtiThreadState_lock >>>>> then the approach above is not going to work. >>>>> I'm trying to check this now. 
>>>> >>>> Dan, >>>> >>>> I came to a conclusion that these 3 functions still must be protected >>>> by the JvmtiThreadState_lock when they are called out of a safepoint. >>>> It is a little bit ugly but has to be safe though. >>>> >>>> Please, let me know if you see eny problems with that. >>>> I'll send a new webrev soon. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Since multiple threads can call JVM/TI NotifyFramePop() on the >>>>>>> same target thread, what keeps the threads from messing with >>>>>>> the list of frame pops simultaneously or messing with the >>>>>>> thread enabled events bits in parallel? >>>>>>> >>>>>>> I suspect that this might also be an issue for >>>>>>> JvmtiEventController::clear_frame_pop() and >>>>>>> JvmtiEventController::clear_to_frame_pop() also. >>>>>>> >>>>>>> src/share/vm/prims/jvmtiThreadState.cpp >>>>>>> No comments. >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Summary: >>>>>>>> >>>>>>>> It is the 2-nd round of review because the JTREG com/sun/jdi >>>>>>>> tests discovered a regression >>>>>>>> in the first round change. The issue was in the >>>>>>>> JvmtiEventController::clear_frame_pop() >>>>>>>> lock synchronization that is not allowed at safepoints. >>>>>>>> >>>>>>>> As a result I've changed the JvmtiEnv::NotifyFramePop to use a >>>>>>>> VM operation for safety. >>>>>>>> Also, I've removed the lock synchronization from the 3 impacted >>>>>>>> JvmtiEventController:: >>>>>>>> functions: set_frame_pop(), clear_frame_pop() and >>>>>>>> clear_to_frame_pop(). 
>>>>>>>> >>>>>>>> Testing: >>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Please, review the fix for: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>>> >>>>>>>>> >>>>>>>>> Open webrev: >>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Summary: >>>>>>>>> >>>>>>>>> This is another Test Stabilization issue. >>>>>>>>> The fix is very similar to other JVMTI stabilization fixes. >>>>>>>>> It is to use safepoints for updating the PopFrame data instead >>>>>>>>> of relying on the >>>>>>>>> suspend equivalent condition mechanism >>>>>>>>> (JvmtiEnv::is_thread_fully_suspended()) >>>>>>>>> which is not adequate from the reliability point of view. >>>>>>>>> >>>>>>>>> Testing: >>>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> > From frederic.parain at oracle.com Fri Feb 28 13:19:42 2014 From: frederic.parain at oracle.com (frederic parain) Date: Fri, 28 Feb 2014 22:19:42 +0100 Subject: RFR(S): JDK-8035952 Remove use of JVM_Open, JVM_Read and JVM_Close functions from serviceability code Message-ID: <5310FD6E.7050305@oracle.com> Greetings, Please review this small changeset for CR JDK-8035952 "Remove use of JVM_Open, JVM_Read and JVM_Close functions from serviceability code" Bug: https://bugs.openjdk.java.net/browse/JDK-8035952 Webrev: http://cr.openjdk.java.net/~fparain/8035952/webrev.00/ Tested with jdk_management test suite + manual tests to verify consistency of values returned by new code. 
Thank you, Fred -- Frederic Parain - Oracle Grenoble Engineering Center - France Phone: +33 4 76 18 81 17 Email: Frederic.Parain at oracle.com From serguei.spitsyn at oracle.com Fri Feb 28 13:24:47 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 28 Feb 2014 13:24:47 -0800 Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <5310FBD9.5090607@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> <530FA81F.8040309@oracle.com> <530FB572.4010507@oracle.com> <5310077A.7010004@oracle.com> <53102702.4070603@oracle.com> <5310E996.9010800@oracle.com> <5310FBD9.5090607@oracle.com> Message-ID: <5310FE9F.1060606@oracle.com> On 2/28/14 1:12 PM, Daniel D. Daugherty wrote: > On 2/28/14 12:55 PM, serguei.spitsyn at oracle.com wrote: >> On 2/27/14 10:04 PM, David Holmes wrote: >>> Hi Serguei, >>> >>> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote: >>>> Please, review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>> >>>> >>>> Open webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 >>>> > > Thumbs up! (including the tweaks in the .4 version) Thanks a lot, Dan! > > src/share/vm/runtime/vm_operations.hpp > No comments. > > src/share/vm/prims/jvmtiEnvBase.hpp > line 375: bool allow_nested_vm_operations() const { return > true; } > Does VM_SetFramePop really have to permit nested VMops? Yes. It was a real surprise that it is really necessary. Thanks! Serguei > > src/share/vm/prims/jvmtiEnv.cpp > No comments. > > src/share/vm/prims/jvmtiEnvThreadState.cpp > No comments. > > src/share/vm/prims/jvmtiEventController.cpp > No comments. > > src/share/vm/prims/jvmtiThreadState.cpp > No comments.
> > Dan > > > >>>> >>>> >>>> Summary: >>>> >>>> It is another attempt to fix the JTREG com/sun/jdi tests regression >>>> discovered in the first round change. >>>> The fix is to avoid lock synchronization at >>>> safepoints(jvmtiEventController.cpp). >>>> Thanks to Dan for catching the problem in the 2-nd round of review! >>> >>> The basic approach here seems sound. >> >> Thank you for reviewing the fix! >> >>> >>> I find the checking for cur->is_VMThread() somewhat overly >>> conservative - if we are at a safepoint, and executing this code, >>> then we must be the VMThread. But ok. >> >> Agreed and simplified. Thanks! >> >>> >>> You could also use MutexLockerEx to avoid the need for locked and >>> unlocked paths to a common call, but that's just stylistic. Though >>> if you are grabbing the current thread anyway you can also use the >>> MutexLocker calls that take the thread arg - to avoid a second >>> look-up of the current thread. >> >> Thank you for reminding. I keep forgetting about it. >> Will check what is better here, just do not want to rerun the whole >> testing. >> But I'm in favor to make it simpler. :) >> >> Thanks, >> Serguei >> >>> >>> David >>> ----- >>> >>>> Testing: >>>> All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg >>>> com/sun/jdi >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote: >>>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote: >>>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> Dan, >>>>>>> >>>>>>> Thank you a lot for reviewing this! >>>>>>> >>>>>>> On 2/27/14 11:09 AM, Daniel D. 
Daugherty wrote: >>>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Please, review the fix for: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>>> >>>>>>>>> >>>>>>>>> Open webrev: >>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> src/share/vm/runtime/vm_operations.hpp >>>>>>>> No comments. >>>>>>>> >>>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp >>>>>>>> No comments. >>>>>>>> >>>>>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>>>>> No comments. >>>>>>>> >>>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>>>>>>> No comments. >>>>>>>> >>>>>>>> src/share/vm/prims/jvmtiEventController.cpp >>>>>>>> JvmtiEventController::set_frame_pop() is called by >>>>>>>> JvmtiEnvThreadState::set_frame_pop() which is called by >>>>>>>> JvmtiEnv::NotifyFramePop(). >>>>>>>> >>>>>>>> The "MutexLocker mu(JvmtiThreadState_lock)" in >>>>>>>> JvmtiEventController::set_frame_pop() protected the work >>>>>>>> done by JvmtiEventControllerPrivate::set_frame_pop(): >>>>>>>> >>>>>>>> ets->get_frame_pops()->set(fpop); >>>>>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); >>>>>>> >>>>>>> Your check is the right thing to do, thanks! >>>>>>> I had to explain this more clearly in this 2-nd review request. >>>>>>> >>>>>>> The approach I've taken here is that all this code paths are >>>>>>> executed >>>>>>> on the target thread or at a safepoint. >>>>>>> >>>>>>> It is true for all 3 functions: >>>>>>> set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). >>>>>>> >>>>>>> And the updated assert guards ensure that it is the case. >>>>>>> >>>>>>> It could be a good idea to add a No_Safepoint_Verifier for >>>>>>> PopFrame() and NotifyFramePop() >>>>>>> to make sure the current/target thread does not go to safepoint >>>>>>> until it is returned from >>>>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly. 
>>>>>>> A No_Safepoint_Verifier can be also needed in the >>>>>>> JvmtiExport::post_method_exit(). >>>>>>> >>>>>>> These are all places where these functions are called: >>>>>>> prims/jvmtiEnv.cpp: >>>>>>> state->env_thread_state(this)->set_frame_pop(frame_number); // >>>>>>> JvmtiEnv::NotifyFramePop() >>>>>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // >>>>>>> JvmtiExport::post_method_exit() >>>>>>> prims/jvmtiThreadState.cpp: >>>>>>> ets->clear_frame_pop(popframe_number); // >>>>>>> JvmtiThreadState::update_for_pop_top_frame() >>>>>>> >>>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never >>>>>>> called now. >>>>>> >>>>>> There is still a concern about recompute_thread_enabled(). >>>>>> If it is normally always protected with the JvmtiThreadState_lock >>>>>> then the approach above is not going to work. >>>>>> I'm trying to check this now. >>>>> >>>>> Dan, >>>>> >>>>> I came to a conclusion that these 3 functions still must be protected >>>>> by the JvmtiThreadState_lock when they are called out of a safepoint. >>>>> It is a little bit ugly but has to be safe though. >>>>> >>>>> Please, let me know if you see eny problems with that. >>>>> I'll send a new webrev soon. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Since multiple threads can call JVM/TI NotifyFramePop() on the >>>>>>>> same target thread, what keeps the threads from messing with >>>>>>>> the list of frame pops simultaneously or messing with the >>>>>>>> thread enabled events bits in parallel? >>>>>>>> >>>>>>>> I suspect that this might also be an issue for >>>>>>>> JvmtiEventController::clear_frame_pop() and >>>>>>>> JvmtiEventController::clear_to_frame_pop() also. >>>>>>>> >>>>>>>> src/share/vm/prims/jvmtiThreadState.cpp >>>>>>>> No comments. 
>>>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Summary: >>>>>>>>> >>>>>>>>> It is the 2-nd round of review because the JTREG com/sun/jdi >>>>>>>>> tests discovered a regression >>>>>>>>> in the first round change. The issue was in the >>>>>>>>> JvmtiEventController::clear_frame_pop() >>>>>>>>> lock synchronization that is not allowed at safepoints. >>>>>>>>> >>>>>>>>> As a result I've changed the JvmtiEnv::NotifyFramePop to use a >>>>>>>>> VM operation for safety. >>>>>>>>> Also, I've removed the lock synchronization from the 3 impacted >>>>>>>>> JvmtiEventController:: >>>>>>>>> functions: set_frame_pop(), clear_frame_pop() and >>>>>>>>> clear_to_frame_pop(). >>>>>>>>> >>>>>>>>> Testing: >>>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Please, review the fix for: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Open webrev: >>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Summary: >>>>>>>>>> >>>>>>>>>> This is another Test Stabilization issue. >>>>>>>>>> The fix is very similar to other JVMTI stabilization fixes. >>>>>>>>>> It is to use safepoints for updating the PopFrame data instead >>>>>>>>>> of relying on the >>>>>>>>>> suspend equivalent condition mechanism >>>>>>>>>> (JvmtiEnv::is_thread_fully_suspended()) >>>>>>>>>> which is not adequate from the reliability point of view. >>>>>>>>>> >>>>>>>>>> Testing: >>>>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> > From daniel.daugherty at oracle.com Fri Feb 28 13:26:43 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 28 Feb 2014 14:26:43 -0700 Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <5310FE9F.1060606@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> <530FA81F.8040309@oracle.com> <530FB572.4010507@oracle.com> <5310077A.7010004@oracle.com> <53102702.4070603@oracle.com> <5310E996.9010800@oracle.com> <5310FBD9.5090607@oracle.com> <5310FE9F.1060606@oracle.com> Message-ID: <5310FF13.8050204@oracle.com> On 2/28/14 2:24 PM, serguei.spitsyn at oracle.com wrote: > On 2/28/14 1:12 PM, Daniel D. Daugherty wrote: >> On 2/28/14 12:55 PM, serguei.spitsyn at oracle.com wrote: >>> On 2/27/14 10:04 PM, David Holmes wrote: >>>> Hi Serguei, >>>> >>>> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote: >>>>> Please, review the fix for: >>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>> >>>>> >>>>> Open webrev: >>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 >>>>> >> >> Thumbs up! (including the tweaks in the .4 version) > > Thanks a lot, Dan! > >> >> src/share/vm/runtime/vm_operations.hpp >> No comments. >> >> src/share/vm/prims/jvmtiEnvBase.hpp >> line 375: bool allow_nested_vm_operations() const { return >> true; } >> Does VM_SetFramePop really gave to permit nested VMops? > > Yes. > It was a real surprise that it is really necessary. Maybe add a one line comment saying something like: // needs to permit nested because this VMop can be invoked from XXX VMop... or something like that. Dan > > Thanks! > Serguei > >> >> src/share/vm/prims/jvmtiEnv.cpp >> No comments. >> >> src/share/vm/prims/jvmtiEnvThreadState.cpp >> No comments. >> >> src/share/vm/prims/jvmtiEventController.cpp >> No comments. >> >> src/share/vm/prims/jvmtiThreadState.cpp >> No comments. 
>> >> Dan >> >> >> >>>>> >>>>> >>>>> Summary: >>>>> >>>>> It is another attempt to fix the JTREG com/sun/jdi tests >>>>> regression >>>>> discovered in the first round change. >>>>> The fix is to avoid lock synchronization at >>>>> safepoints(jvmtiEventController.cpp). >>>>> Thanks to Dan for catching the problem in the 2-nd round of >>>>> review! >>>> >>>> The basic approach here seems sound. >>> >>> Thank you for reviewing the fix! >>> >>>> >>>> I find the checking for cur->is_VMThread() somewhat overly >>>> conservative - if we are at a safepoint, and executing this code, >>>> then we must be the VMThread. But ok. >>> >>> Agreed and simplified. Thanks! >>> >>>> >>>> You could also use MutexLockerEx to avoid the need for locked and >>>> unlocked paths to a common call, but that's just stylistic. Though >>>> if you are grabbing the current thread anyway you can also use the >>>> MutexLocker calls that take the thread arg - to avoid a second >>>> look-up of the current thread. >>> >>> Thank you for reminding. I keep forgetting about it. >>> Will check what is better here, just do not want to rerun the whole >>> testing. >>> But I'm in favor to make it simpler. :) >>> >>> Thanks, >>> Serguei >>> >>>> >>>> David >>>> ----- >>>> >>>>> Testing: >>>>> All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg >>>>> com/sun/jdi >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote: >>>>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Dan, >>>>>>>> >>>>>>>> Thank you a lot for reviewing this! >>>>>>>> >>>>>>>> On 2/27/14 11:09 AM, Daniel D. 
Daugherty wrote: >>>>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Please, review the fix for: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Open webrev: >>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> src/share/vm/runtime/vm_operations.hpp >>>>>>>>> No comments. >>>>>>>>> >>>>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp >>>>>>>>> No comments. >>>>>>>>> >>>>>>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>>>>>> No comments. >>>>>>>>> >>>>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>>>>>>>> No comments. >>>>>>>>> >>>>>>>>> src/share/vm/prims/jvmtiEventController.cpp >>>>>>>>> JvmtiEventController::set_frame_pop() is called by >>>>>>>>> JvmtiEnvThreadState::set_frame_pop() which is called by >>>>>>>>> JvmtiEnv::NotifyFramePop(). >>>>>>>>> >>>>>>>>> The "MutexLocker mu(JvmtiThreadState_lock)" in >>>>>>>>> JvmtiEventController::set_frame_pop() protected the work >>>>>>>>> done by JvmtiEventControllerPrivate::set_frame_pop(): >>>>>>>>> >>>>>>>>> ets->get_frame_pops()->set(fpop); >>>>>>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); >>>>>>>>> >>>>>>>> >>>>>>>> Your check is the right thing to do, thanks! >>>>>>>> I had to explain this more clearly in this 2-nd review request. >>>>>>>> >>>>>>>> The approach I've taken here is that all this code paths are >>>>>>>> executed >>>>>>>> on the target thread or at a safepoint. >>>>>>>> >>>>>>>> It is true for all 3 functions: >>>>>>>> set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). >>>>>>>> >>>>>>>> And the updated assert guards ensure that it is the case. 
>>>>>>>> >>>>>>>> It could be a good idea to add a No_Safepoint_Verifier for >>>>>>>> PopFrame() and NotifyFramePop() >>>>>>>> to make sure the current/target thread does not go to safepoint >>>>>>>> until it is returned from >>>>>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly. >>>>>>>> A No_Safepoint_Verifier can be also needed in the >>>>>>>> JvmtiExport::post_method_exit(). >>>>>>>> >>>>>>>> These are all places where these functions are called: >>>>>>>> prims/jvmtiEnv.cpp: >>>>>>>> state->env_thread_state(this)->set_frame_pop(frame_number); // >>>>>>>> JvmtiEnv::NotifyFramePop() >>>>>>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // >>>>>>>> JvmtiExport::post_method_exit() >>>>>>>> prims/jvmtiThreadState.cpp: >>>>>>>> ets->clear_frame_pop(popframe_number); // >>>>>>>> JvmtiThreadState::update_for_pop_top_frame() >>>>>>>> >>>>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never >>>>>>>> called now. >>>>>>> >>>>>>> There is still a concern about recompute_thread_enabled(). >>>>>>> If it is normally always protected with the JvmtiThreadState_lock >>>>>>> then the approach above is not going to work. >>>>>>> I'm trying to check this now. >>>>>> >>>>>> Dan, >>>>>> >>>>>> I came to a conclusion that these 3 functions still must be >>>>>> protected >>>>>> by the JvmtiThreadState_lock when they are called out of a >>>>>> safepoint. >>>>>> It is a little bit ugly but has to be safe though. >>>>>> >>>>>> Please, let me know if you see eny problems with that. >>>>>> I'll send a new webrev soon. 
>>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Since multiple threads can call JVM/TI NotifyFramePop() on >>>>>>>>> the >>>>>>>>> same target thread, what keeps the threads from messing with >>>>>>>>> the list of frame pops simultaneously or messing with the >>>>>>>>> thread enabled events bits in parallel? >>>>>>>>> >>>>>>>>> I suspect that this might also be an issue for >>>>>>>>> JvmtiEventController::clear_frame_pop() and >>>>>>>>> JvmtiEventController::clear_to_frame_pop() also. >>>>>>>>> >>>>>>>>> src/share/vm/prims/jvmtiThreadState.cpp >>>>>>>>> No comments. >>>>>>>>> >>>>>>>>> Dan >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Summary: >>>>>>>>>> >>>>>>>>>> It is the 2-nd round of review because the JTREG com/sun/jdi >>>>>>>>>> tests discovered a regression >>>>>>>>>> in the first round change. The issue was in the >>>>>>>>>> JvmtiEventController::clear_frame_pop() >>>>>>>>>> lock synchronization that is not allowed at safepoints. >>>>>>>>>> >>>>>>>>>> As a result I've changed the JvmtiEnv::NotifyFramePop to use a >>>>>>>>>> VM operation for safety. >>>>>>>>>> Also, I've removed the lock synchronization from the 3 >>>>>>>>>> impacted >>>>>>>>>> JvmtiEventController:: >>>>>>>>>> functions: set_frame_pop(), clear_frame_pop() and >>>>>>>>>> clear_to_frame_pop(). 
>>>>>>>>>> >>>>>>>>>> Testing: >>>>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Please, review the fix for: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Open webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Summary: >>>>>>>>>>> >>>>>>>>>>> This is another Test Stabilization issue. >>>>>>>>>>> The fix is very similar to other JVMTI stabilization fixes. >>>>>>>>>>> It is to use safepoints for updating the PopFrame data >>>>>>>>>>> instead >>>>>>>>>>> of relying on the >>>>>>>>>>> suspend equivalent condition mechanism >>>>>>>>>>> (JvmtiEnv::is_thread_fully_suspended()) >>>>>>>>>>> which is not adequate from the reliability point of view. >>>>>>>>>>> >>>>>>>>>>> Testing: >>>>>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>> >> > From serguei.spitsyn at oracle.com Fri Feb 28 13:33:46 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 28 Feb 2014 13:33:46 -0800 Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth == count_frames(), "cur_stack_depth out of sync") In-Reply-To: <5310FF13.8050204@oracle.com> References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com> <530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com> <530FA81F.8040309@oracle.com> <530FB572.4010507@oracle.com> <5310077A.7010004@oracle.com> <53102702.4070603@oracle.com> <5310E996.9010800@oracle.com> <5310FBD9.5090607@oracle.com> <5310FE9F.1060606@oracle.com> <5310FF13.8050204@oracle.com> Message-ID: <531100BA.1020303@oracle.com> On 2/28/14 1:26 PM, Daniel D. 
Daugherty wrote: > On 2/28/14 2:24 PM, serguei.spitsyn at oracle.com wrote: >> On 2/28/14 1:12 PM, Daniel D. Daugherty wrote: >>> On 2/28/14 12:55 PM, serguei.spitsyn at oracle.com wrote: >>>> On 2/27/14 10:04 PM, David Holmes wrote: >>>>> Hi Serguei, >>>>> >>>>> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Please, review the fix for: >>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>> >>>>>> >>>>>> Open webrev: >>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 >>>>>> >>> >>> Thumbs up! (including the tweaks in the .4 version) >> >> Thanks a lot, Dan! >> >>> >>> src/share/vm/runtime/vm_operations.hpp >>> No comments. >>> >>> src/share/vm/prims/jvmtiEnvBase.hpp >>> line 375: bool allow_nested_vm_operations() const { >>> return true; } >>> Does VM_SetFramePop really gave to permit nested VMops? >> >> Yes. >> It was a real surprise that it is really necessary. > > Maybe add a one line comment saying something like: > > // needs to permit nested because this VMop can be invoked from XXX > VMop... > > or something like that. Ok, I'll add a comment before the push. The latest public webrev with the simplification fixes suggested by David (no above comment yet): http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.4/ Thanks, Serguei > > Dan > > >> >> Thanks! >> Serguei >> >>> >>> src/share/vm/prims/jvmtiEnv.cpp >>> No comments. >>> >>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>> No comments. >>> >>> src/share/vm/prims/jvmtiEventController.cpp >>> No comments. >>> >>> src/share/vm/prims/jvmtiThreadState.cpp >>> No comments. >>> >>> Dan >>> >>> >>> >>>>>> >>>>>> >>>>>> Summary: >>>>>> >>>>>> It is another attempt to fix the JTREG com/sun/jdi tests >>>>>> regression >>>>>> discovered in the first round change. >>>>>> The fix is to avoid lock synchronization at >>>>>> safepoints(jvmtiEventController.cpp). 
>>>>>> Thanks to Dan for catching the problem in the 2-nd round of >>>>>> review! >>>>> >>>>> The basic approach here seems sound. >>>> >>>> Thank you for reviewing the fix! >>>> >>>>> >>>>> I find the checking for cur->is_VMThread() somewhat overly >>>>> conservative - if we are at a safepoint, and executing this code, >>>>> then we must be the VMThread. But ok. >>>> >>>> Agreed and simplified. Thanks! >>>> >>>>> >>>>> You could also use MutexLockerEx to avoid the need for locked and >>>>> unlocked paths to a common call, but that's just stylistic. Though >>>>> if you are grabbing the current thread anyway you can also use the >>>>> MutexLocker calls that take the thread arg - to avoid a second >>>>> look-up of the current thread. >>>> >>>> Thank you for reminding. I keep forgetting about it. >>>> Will check what is better here, just do not want to rerun the whole >>>> testing. >>>> But I'm in favor to make it simpler. :) >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> David >>>>> ----- >>>>> >>>>>> Testing: >>>>>> All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg >>>>>> com/sun/jdi >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Dan, >>>>>>>>> >>>>>>>>> Thank you a lot for reviewing this! >>>>>>>>> >>>>>>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote: >>>>>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Please, review the fix for: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Open webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> src/share/vm/runtime/vm_operations.hpp >>>>>>>>>> No comments. 
>>>>>>>>>> >>>>>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp >>>>>>>>>> No comments. >>>>>>>>>> >>>>>>>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>>>>>>> No comments. >>>>>>>>>> >>>>>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp >>>>>>>>>> No comments. >>>>>>>>>> >>>>>>>>>> src/share/vm/prims/jvmtiEventController.cpp >>>>>>>>>> JvmtiEventController::set_frame_pop() is called by >>>>>>>>>> JvmtiEnvThreadState::set_frame_pop() which is called by >>>>>>>>>> JvmtiEnv::NotifyFramePop(). >>>>>>>>>> >>>>>>>>>> The "MutexLocker mu(JvmtiThreadState_lock)" in >>>>>>>>>> JvmtiEventController::set_frame_pop() protected the work >>>>>>>>>> done by JvmtiEventControllerPrivate::set_frame_pop(): >>>>>>>>>> >>>>>>>>>> ets->get_frame_pops()->set(fpop); >>>>>>>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); >>>>>>>>>> >>>>>>>>> >>>>>>>>> Your check is the right thing to do, thanks! >>>>>>>>> I had to explain this more clearly in this 2-nd review request. >>>>>>>>> >>>>>>>>> The approach I've taken here is that all this code paths are >>>>>>>>> executed >>>>>>>>> on the target thread or at a safepoint. >>>>>>>>> >>>>>>>>> It is true for all 3 functions: >>>>>>>>> set_frame_pop(), clear_frame_pop() and clear_to_frame_pop(). >>>>>>>>> >>>>>>>>> And the updated assert guards ensure that it is the case. >>>>>>>>> >>>>>>>>> It could be a good idea to add a No_Safepoint_Verifier for >>>>>>>>> PopFrame() and NotifyFramePop() >>>>>>>>> to make sure the current/target thread does not go to safepoint >>>>>>>>> until it is returned from >>>>>>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly. >>>>>>>>> A No_Safepoint_Verifier can be also needed in the >>>>>>>>> JvmtiExport::post_method_exit(). 
>>>>>>>>> >>>>>>>>> These are all places where these functions are called: >>>>>>>>> prims/jvmtiEnv.cpp: >>>>>>>>> state->env_thread_state(this)->set_frame_pop(frame_number); // >>>>>>>>> JvmtiEnv::NotifyFramePop() >>>>>>>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // >>>>>>>>> JvmtiExport::post_method_exit() >>>>>>>>> prims/jvmtiThreadState.cpp: >>>>>>>>> ets->clear_frame_pop(popframe_number); // >>>>>>>>> JvmtiThreadState::update_for_pop_top_frame() >>>>>>>>> >>>>>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never >>>>>>>>> called now. >>>>>>>> >>>>>>>> There is still a concern about recompute_thread_enabled(). >>>>>>>> If it is normally always protected with the JvmtiThreadState_lock >>>>>>>> then the approach above is not going to work. >>>>>>>> I'm trying to check this now. >>>>>>> >>>>>>> Dan, >>>>>>> >>>>>>> I came to a conclusion that these 3 functions still must be >>>>>>> protected >>>>>>> by the JvmtiThreadState_lock when they are called out of a >>>>>>> safepoint. >>>>>>> It is a little bit ugly but has to be safe though. >>>>>>> >>>>>>> Please, let me know if you see eny problems with that. >>>>>>> I'll send a new webrev soon. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Since multiple threads can call JVM/TI NotifyFramePop() >>>>>>>>>> on the >>>>>>>>>> same target thread, what keeps the threads from messing with >>>>>>>>>> the list of frame pops simultaneously or messing with the >>>>>>>>>> thread enabled events bits in parallel? >>>>>>>>>> >>>>>>>>>> I suspect that this might also be an issue for >>>>>>>>>> JvmtiEventController::clear_frame_pop() and >>>>>>>>>> JvmtiEventController::clear_to_frame_pop() also. >>>>>>>>>> >>>>>>>>>> src/share/vm/prims/jvmtiThreadState.cpp >>>>>>>>>> No comments. 
>>>>>>>>>> >>>>>>>>>> Dan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Summary: >>>>>>>>>>> >>>>>>>>>>> It is the 2-nd round of review because the JTREG com/sun/jdi >>>>>>>>>>> tests discovered a regression >>>>>>>>>>> in the first round change. The issue was in the >>>>>>>>>>> JvmtiEventController::clear_frame_pop() >>>>>>>>>>> lock synchronization that is not allowed at safepoints. >>>>>>>>>>> >>>>>>>>>>> As a result I've changed the JvmtiEnv::NotifyFramePop to >>>>>>>>>>> use a >>>>>>>>>>> VM operation for safety. >>>>>>>>>>> Also, I've removed the lock synchronization from the 3 >>>>>>>>>>> impacted >>>>>>>>>>> JvmtiEventController:: >>>>>>>>>>> functions: set_frame_pop(), clear_frame_pop() and >>>>>>>>>>> clear_to_frame_pop(). >>>>>>>>>>> >>>>>>>>>>> Testing: >>>>>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Please, review the fix for: >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Open webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Summary: >>>>>>>>>>>> >>>>>>>>>>>> This is another Test Stabilization issue. >>>>>>>>>>>> The fix is very similar to other JVMTI stabilization fixes. >>>>>>>>>>>> It is to use safepoints for updating the PopFrame data >>>>>>>>>>>> instead >>>>>>>>>>>> of relying on the >>>>>>>>>>>> suspend equivalent condition mechanism >>>>>>>>>>>> (JvmtiEnv::is_thread_fully_suspended()) >>>>>>>>>>>> which is not adequate from the reliability point of view. 
>>>>>>>>>>>>
>>>>>>>>>>>> Testing:
>>>>>>>>>>>> In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
>

From daniel.daugherty at oracle.com Fri Feb 28 13:53:43 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 28 Feb 2014 14:53:43 -0700
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <53081C1F.2070008@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
Message-ID: <53110567.6020600@oracle.com>

Greetings,

This is a code review request for the JDK7u-hs-dev backport of the
following ObjectMonitor-JVM/TI hang fix:

    8028073 race condition in ObjectMonitor implementation causing deadlocks
    https://bugs.openjdk.java.net/browse/JDK-8028073

Here is the JDK7u-hs-dev webrev URL:

    http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk7u-hs-dev/

This is a straightforward backport of the JDK8u-hs-dev version of the
fix modulo line numbers and context diff anchors. The diff of the two
'hg diff --nodates' files is below.

Same testing as for the JDK9 and JDK8u versions with the same results.

I need one sanity check review from someone.

Dan


$ diff 8028073-diffs-jdk[78]u-hs.txt.00
1c1
< diff -r bdb3798edd11 src/share/vm/prims/jvm.cpp
---
> diff -r 231051bff068 src/share/vm/prims/jvm.cpp
11c11
< @@ -527,6 +527,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
---
> @@ -518,6 +518,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
22c22
< ObjectSynchronizer::wait(obj, ms, THREAD);
---
> ObjectSynchronizer::wait(obj, ms, CHECK);
24c24
< diff -r bdb3798edd11 src/share/vm/runtime/objectMonitor.cpp
---
> diff -r 231051bff068 src/share/vm/runtime/objectMonitor.cpp
34c34
< @@ -385,6 +385,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
---
> @@ -382,6 +382,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
47c47
< @@ -442,6 +448,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
---
> @@ -439,6 +445,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
60c60
< @@ -1459,6 +1471,14 @@ void ObjectMonitor::wait(jlong millis, b
---
> @@ -1456,6 +1468,14 @@ void ObjectMonitor::wait(jlong millis, b
75c75
< @@ -1501,21 +1521,6 @@ void ObjectMonitor::wait(jlong millis, b
---
> @@ -1498,21 +1518,6 @@ void ObjectMonitor::wait(jlong millis, b
97c97
< @@ -1598,6 +1603,33 @@ void ObjectMonitor::wait(jlong millis, b
---
> @@ -1595,6 +1600,33 @@ void ObjectMonitor::wait(jlong millis, b


On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> This is a code review request for the JDK8u-hs-dev backport of the
> following ObjectMonitor-JVM/TI hang fix:
>
>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>     https://bugs.openjdk.java.net/browse/JDK-8028073
>
> Here is the JDK8u-hs-dev webrev URL:
>
>     http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>
> This is _almost_ a straightforward backport of the JDK9 fix.
> The only difference to the fix was discussed at the end of the JDK9
> review and was determined to only be needed in versions of HotSpot
> without the fix for 8028280:
>
>     http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html
>
> 8028280 has not yet been backported to JDK8u-hs-dev.
>
> The easiest way to review the backport is to download the two patch
> files from the webrevs and compare them with something like:
>
>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>
> The same testing has been performed on the JDK8u-hs-dev version as
> with the JDK9-hs-runtime version.
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
> > Greetings,
> >
> > I have a fix ready for the following bug:
> >
> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
> >     https://bugs.openjdk.java.net/browse/JDK-8028073
> >
> > On the surface, this is a very simple fix that relocates a few lines of
> > code, relocates and rewrites the comments associated with that code and
> > adds several new comments.
> >
> > Of course, in reality, the issue is much more complicated, but I'm
> > hoping to make it easy for anyone not acquainted with this issue to
> > understand what's going on.
> >
> > Here are the JDK9 webrev URLs:
> >
> > OpenJDK:
> >     http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
> >
> > Oracle internal:
> >     http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
> >
> > The simple summary:
> >
> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
> >   it is possible for a JVM/TI monitor event handler to accidentally
> >   consume a ParkEvent.unpark() call meant for Java Monitor layer
> > - the original code fix was made on 2005.07.04 using this bug ID:
> >   https://bugs.openjdk.java.net/browse/JDK-5030359
> > - it's the right fix, but it's in the wrong place
> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
> >   event handler is called because it is that event handler
> >   that can cause the hang
> >
> >
> > Testing
> > -------
> >
> > - a new StessMonitorWait test has been created that reliably
> >   reproduces the hang in JDK[6789]; see the bug's gory details
> >   for the specific versions where the hang has been reproduced
> >   - the test reliably reproduces the hang in 5 seconds on my
> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
> >     the hang reliably on other machines
> > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
> >   Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
> >   bits did not reproduce the hang
> > - JPRT test job
> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
> >   and Windows-X64:
> >   - vm.quick
> >   - Kitchensink (bigapps)
> >   - Weblogic+medrec (bigapps)
> >   - runThese (bigapps)
> >
> >
> > The Gory Details Start Here
> > ---------------------------
> >
> > This is the old location of the block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1499    exit (true, Self) ; // exit the monitor
> > <snip>
> > 1513    if (node._notified != 0 && _succ == Self) {
> > 1514       node._event->unpark();
> > 1515    }
> >
> >
> > This is the new location of the block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
> > <snip>
> > 1604        if (node._notified != 0 && _succ == Self) {
> > <snip>
> > 1620          node._event->unpark();
> > 1621        }
> >
> >
> > The Risks
> > ---------
> >
> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
> >   is enabled:
> >   - previously it was always executed
> >   - while the old code was not effective for the hang that is being
> >     fixed with this bug, it is possible that the old code prevented
> >     a different bug in the successor protocol from manifesting
> >   - thorough analysis of the successor protocol did not reveal a
> >     case where the old code was needed in the old location
> > - Thorough analysis indicates that the other JVM/TI monitor events
> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
> >   - the successor protocol is complicated and the analysis could
> >     be wrong when certain options are used
> >   - comments were added to each location where a JVM/TI monitor
> >     event handler is called documenting why a fix like this one
> >     is not needed there
> >   - if the analysis is wrong, the new comments show where a new
> >     code change would be needed
> >
> >
> > The Scenario
> > ------------
> >
> > I've created a scenario that reproduces this hang:
> >
> > T1 - enters monitor and calls monitor.wait()
> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
> > T3 - enters and exits the monitor
> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
> >
> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
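
The scenario above hinges on one mechanism from the simple summary: the handler's raw-monitor wait and the Java monitor code share one ParkEvent, so the handler can consume an unpark() that was meant for T1. A minimal single-threaded C++ sketch of that mechanism follows. It is a toy model under stated assumptions, not HotSpot code: ToyParkEvent, ToyWaitNode, toy_monitor_waited_handler() and t1_has_wakeup_after_handler() are hypothetical names, and the event is reduced to a single boolean permit. Only the shape of the check (`node._notified != 0 && _succ == Self`, then unpark()) is taken from the message.

```cpp
#include <cassert>

// Toy stand-in for a per-thread ParkEvent (assumption: one boolean permit).
struct ToyParkEvent {
    bool permit = false;

    // Post a wakeup. Permits do not accumulate: two unparks still leave one.
    void unpark() { permit = true; }

    // Consume the permit if there is one. A false result means the caller
    // would have blocked (or, for a timed wait, timed out).
    bool park() {
        bool had_permit = permit;
        permit = false;
        return had_permit;
    }
};

// Toy wait node: only the flag that the relocated check looks at.
struct ToyWaitNode {
    bool notified = false;
};

// Hypothetical MONITOR_WAITED handler: "enter raw monitor, wait 1ms, exit".
// The timed wait parks on the same event, so it swallows a pending permit.
void toy_monitor_waited_handler(ToyParkEvent& shared_event) {
    shared_event.park();
}

// Models T1 after wait() returns. Returns true if T1 still has a wakeup
// pending when it finally parks to re-enter the monitor.
bool t1_has_wakeup_after_handler(bool reissue_unpark_after_handler) {
    ToyParkEvent event;
    ToyWaitNode node;
    const int self = 1;  // T1's identity in this toy model
    int succ = 0;        // 0 means "no successor"

    node.notified = true;  // T2 called notify()
    succ = self;           // T3 made T1 the successor while exiting ...
    event.unpark();        // ... and unparked it

    toy_monitor_waited_handler(event);  // the handler eats T3's unpark
    assert(!event.permit);

    // The relocated fix: after the handler has run, re-issue the unpark
    // if T1 was notified and is still the successor.
    if (reissue_unpark_after_handler && node.notified && succ == self) {
        event.unpark();
    }
    return event.park();
}
```

In this model, t1_has_wakeup_after_handler(false) returns false (the wakeup is lost, so T1 would park with nobody left to unpark it), while t1_has_wakeup_after_handler(true) returns true. Issuing the same unpark before the handler runs would not help, because the handler would consume that permit as well.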
> > > > Here are the six events necessary to make this hang happen: > > > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ > field, but > > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the > successor again. > > > > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > > > // KEY-EVENT-3a: T3 made T1 the successor > > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be > stuck. > > > > > > This bug is intertwined with: > > > > - The ObjectMonitor successor protocol > > - the sharing of a ParkEvent between Java Monitors and JVM/TI > RawMonitors > > > > There is a very long successor.notes attachment to JDK-8028073 that > > attempts to describe the ObjectMonitor successor protocol. It's good > > for putting pretty much anyone to sleep. > > > > Since this hang reproduces back to JDK6, this bug is taking the easily > > backported solution of moving the original fix to the right location. 
> > The following new bug has been filed for possible future work in this > > area by the Serviceability Team: > > > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > > > > The Symptoms > > ------------ > > > > With intermittent hangs like this, it is useful to know what to look > > for in order to determine if you are running into this issue: > > > > - if you aren't using a debugger or a profiler or some other > > JVM/TI agent, then this hang is not the same as yours > > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > > event handler, then this hang is not the same as yours > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > > JVM/TI RawMonitors, then this hang is not the same as yours > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > > back into Java code, then you might just be insane and this > > hang might be similar to yours. However, using a Java callback > > in an event handler is an even bigger problem/risk so fix that > > first. 
> > - if you have one or more threads blocked like this and making no > > progress, then this hang might be the same as yours: > > > > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at java.lang.Object.wait(Native Method) > > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > at java.lang.Object.wait(Object.java:502) > > at SMW_WorkerThread.run(StressMonitorWait.java:103) > > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at SMW_WorkerThread.run(StressMonitorWait.java:120) > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at SMW_WorkerThread.run(StressMonitorWait.java:139) > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > Key symptoms in thread T1: > > > > - had the object locked: > > > > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > - did an Object.wait(): > > > > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > - is blocked on reentry: > > > > waiting for monitor entry
[0xfffffd7fc0231000] > > > > Key symptoms in thread T2: > > > > - is blocked waiting to lock the object: > > > > waiting for monitor entry [0xfffffd7fc0130000] > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > Key symptoms in thread T3: > > > > - is blocked waiting to lock the object: > > > > waiting for monitor entry [0xfffffd7fc002f000] > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > From daniel.daugherty at oracle.com Fri Feb 28 13:55:08 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 28 Feb 2014 14:55:08 -0700 Subject: RFR round 0 JDK7u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <53081C1F.2070008@oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> Message-ID: <531105BC.4050509@oracle.com> Resend with the corrected subject line... sigh... Greetings, This is a code review request for the JDK7u-hs-dev backport of the following ObjectMonitor-JVM/TI hang fix: 8028073 race condition in ObjectMonitor implementation causing deadlocks https://bugs.openjdk.java.net/browse/JDK-8028073 Here is the JDK7u-hs-dev webrev URL: http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk7u-hs-dev/ This is a straightforward backport of the JDK8u-hs-dev version of the fix modulo line numbers and context diff anchors. The diff of the two 'hg diff --nodates' files is below. Same testing as for the JDK9 and JDK8u versions with the same results. I need one sanity check review from someone.
Dan $ diff 8028073-diffs-jdk[78]u-hs.txt.00 1c1 < diff -r bdb3798edd11 src/share/vm/prims/jvm.cpp --- > diff -r 231051bff068 src/share/vm/prims/jvm.cpp 11c11 < @@ -527,6 +527,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv* --- > @@ -518,6 +518,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv* 22c22 < ObjectSynchronizer::wait(obj, ms, THREAD); --- > ObjectSynchronizer::wait(obj, ms, CHECK); 24c24 < diff -r bdb3798edd11 src/share/vm/runtime/objectMonitor.cpp --- > diff -r 231051bff068 src/share/vm/runtime/objectMonitor.cpp 34c34 < @@ -385,6 +385,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { --- > @@ -382,6 +382,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { 47c47 < @@ -442,6 +448,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { --- > @@ -439,6 +445,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { 60c60 < @@ -1459,6 +1471,14 @@ void ObjectMonitor::wait(jlong millis, b --- > @@ -1456,6 +1468,14 @@ void ObjectMonitor::wait(jlong millis, b 75c75 < @@ -1501,21 +1521,6 @@ void ObjectMonitor::wait(jlong millis, b --- > @@ -1498,21 +1518,6 @@ void ObjectMonitor::wait(jlong millis, b 97c97 < @@ -1598,6 +1603,33 @@ void ObjectMonitor::wait(jlong millis, b --- > @@ -1595,6 +1600,33 @@ void ObjectMonitor::wait(jlong millis, b On 2/21/14 8:40 PM, Daniel D. Daugherty wrote: > Greetings, > > This is a code review request for the JDK8u-hs-dev backport of the > following ObjectMonitor-JVM/TI hang fix: > > 8028073 race condition in ObjectMonitor implementation causing > deadlocks > https://bugs.openjdk.java.net/browse/JDK-8028073 > > Here is the JDK8u-hs-dev webrev URL: > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ > > This is _almost_ a straight forward backport of the JDK9 fix. 
The only > difference to the fix was discussed at the end of the JDK9 review and > was determined to only be needed in versions of HotSpot without the > fix for 8028280: > > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html > > > 8028280 has not yet been backported to JDK8u-hs-dev. > > The easiest way to review the backport is to download the two patch > files from the webrevs and compare them with something like: > > jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch > > The same testing has been performed on the JDK8u-hs-dev version as > with the JDK9-hs-runtime version. > > Thanks, in advance, for any comments, questions or suggestions. > > Dan > > > On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: > > Greetings, > > > > I have a fix ready for the following bug: > > > > 8028073 race condition in ObjectMonitor implementation causing > deadlocks > > https://bugs.openjdk.java.net/browse/JDK-8028073 > > > > On the surface, this is a very simple fix that relocates a few lines of > > code, relocates and rewrites the comments associated with that code and > > adds several new comments. > > > > Of course, in reality, the issue is much more complicated, but I'm > > hoping to make it easy for anyone not acquainted with this issue to > > understand what's going on. 
> > > > Here are the JDK9 webrev URLs: > > > > OpenJDK: > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ > > > > Oracle internal: > > > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ > > > > The simple summary: > > > > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, > > it is possible for a JVM/TI monitor event handler to accidentally > > consume a ParkEvent.unpark() call meant for the Java Monitor layer > > - the original code fix was made on 2005.07.04 using this bug ID: > > https://bugs.openjdk.java.net/browse/JDK-5030359 > > - it's the right fix, but it's in the wrong place > > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED > > event handler is called because it is that event handler > > that can cause the hang > > > > > > Testing > > ------- > > > > - a new StressMonitorWait test has been created that reliably > > reproduces the hang in JDK[6789]; see the bug's gory details > > for the specific versions where the hang has been reproduced > > - the test reliably reproduces the hang in 5 seconds on my > > T7600 running Solaris 10u11 X86; 1 minute runs reproduce > > the hang reliably on other machines > > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, > > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT > > bits did not reproduce the hang > > - JPRT test job > > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, > > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, > > and Windows-X64: > > - vm.quick > > - Kitchensink (bigapps) > > - Weblogic+medrec (bigapps) > > - runThese (bigapps) > > > > > > The Gory Details Start Here > > --------------------------- > > > > This is the old location of block of code that's being moved: > > > > src/share/vm/runtime/objectMonitor.cpp: > > > > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { > > > > 1499 exit (true, Self) ; // exit the monitor > > > > 1513 if (node._notified
!= 0 && _succ == Self) { > > 1514 node._event->unpark(); > > 1515 } > > > > > > This is the new location of block of code that's being moved: > > > > src/share/vm/runtime/objectMonitor.cpp: > > > > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, > TRAPS) { > > > > 1601 if (JvmtiExport::should_post_monitor_waited()) { > > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == > OS_TIMEOUT); > > > > 1604 if (node._notified != 0 && _succ == Self) { > > > > 1620 node._event->unpark(); > > 1621 } > > > > > > The Risks > > --------- > > > > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event > > is enabled: > > - previously it was always executed > > - while the old code was not effective for the hang that is being > > fixed with this bug, it is possible that the old code prevented > > a different bug in the successor protocol from manifesting > > - thorough analysis of the successor protocol did not reveal a > > case where the old code was needed in the old location > > - Thorough analysis indicates that the other JVM/TI monitor events > > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: > > - the successor protocol is complicated and the analysis could > > be wrong when certain options are used > > - comments were added to each location where a JVM/TI monitor > > event handler is called documenting why a fix like this one > > is not needed there > > - if the analysis is wrong, the new comments show where a new > > code change would be needed > > > > > > The Scenario > > ------------ > > > > I've created a scenario that reproduces this hang: > > > > T1 - enters monitor and calls monitor.wait() > > T2 - enters the monitor, calls monitor.notify() and exits the monitor > > T3 - enters and exits the monitor > > T4 - enters the monitor, delays for 5 seconds, exits the monitor > > > > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a > > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor. 
> > > > Here are the six events necessary to make this hang happen: > > > > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ > field, but > > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the > successor again. > > > > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor > > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. > > > > // KEY-EVENT-3a: T3 made T1 the successor > > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it. > > > > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so > > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. > > > > // KEY-EVENT-5a: T4 sees that T1 is still the successor so > > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) > > > > // KEY-EVENT-6a: T1 is about to park and it is the successor, but > > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler > > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be > stuck. > > > > > > This bug is intertwined with: > > > > - The ObjectMonitor successor protocol > > - the sharing of a ParkEvent between Java Monitors and JVM/TI > RawMonitors > > > > There is a very long successor.notes attachment to JDK-8028073 that > > attempts to describe the ObjectMonitor successor protocol. It's good > > for putting pretty much anyone to sleep. > > > > Since this hang reproduces back to JDK6, this bug is taking the easily > > backported solution of moving the original fix to the right location. 
> > The following new bug has been filed for possible future work in this > > area by the Serviceability Team: > > > > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use > > https://bugs.openjdk.java.net/browse/JDK-8033399 > > > > > > The Symptoms > > ------------ > > > > With intermittent hangs like this, it is useful to know what to look > > for in order to determine if you are running into this issue: > > > > - if you aren't using a debugger or a profiler or some other > > JVM/TI agent, then this hang is not the same as yours > > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED > > event handler, then this hang is not the same as yours > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using > > JVM/TI RawMonitors, then this hang is not the same as yours > > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling > > back into Java code, then you might just be insane and this > > hang might be similar to yours. However, using a Java callback > > in an event handler is an even bigger problem/risk so fix that > > first. 
> > - if you have one or more threads blocked like this and making no > > progress, then this hang might be the same as yours: > > > > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at java.lang.Object.wait(Native Method) > > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > at java.lang.Object.wait(Object.java:502) > > at SMW_WorkerThread.run(StressMonitorWait.java:103) > > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at SMW_WorkerThread.run(StressMonitorWait.java:120) > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > JavaThread state: _thread_blocked > > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > > JavaThread state: _thread_blocked > > at SMW_WorkerThread.run(StressMonitorWait.java:139) > > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > Key symptoms in thread T1: > > > > - had the object locked: > > > > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > - did an Object.wait(): > > > > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > - is blocked on reentry: > > > > waiting for monitor entry
[0xfffffd7fc0231000] > > > > Key symptoms in thread T2: > > > > - is blocked waiting to lock the object: > > > > waiting for monitor entry [0xfffffd7fc0130000] > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > > Key symptoms in thread T3: > > > > - is blocked waiting to lock the object: > > > > waiting for monitor entry [0xfffffd7fc002f000] > > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) > > > From serguei.spitsyn at oracle.com Fri Feb 28 14:09:45 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 28 Feb 2014 14:09:45 -0800 Subject: RFR round 0 JDK7u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <531105BC.4050509@oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <531105BC.4050509@oracle.com> Message-ID: <53110929.6020507@oracle.com> On 2/28/14 1:55 PM, Daniel D. Daugherty wrote: > Resend with the corrected subject line... sigh... > > Greetings, > > This is a code review request for the JDK7u-hs-dev backport of the > following ObjectMonitor-JVM/TI hang fix: > > 8028073 race condition in ObjectMonitor implementation causing > deadlocks > https://bugs.openjdk.java.net/browse/JDK-8028073 > > Here is the JDK7u-hs-dev webrev URL: > > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk7u-hs-dev/ > > This is a straight forward backport of the JDK8u-hs-dev version of the > fix modulo lines numbers and context diff anchors. The diff of the two > 'hg diff --nodates' files is below. > > Same testing as for the JDK9 and JDK8u versions with the same results. > > I need one sanity check review from someone. Looks good. 
Thanks, Serguei > > Dan > > $ diff 8028073-diffs-jdk[78]u-hs.txt.00 > 1c1 > < diff -r bdb3798edd11 src/share/vm/prims/jvm.cpp > --- >> diff -r 231051bff068 src/share/vm/prims/jvm.cpp > 11c11 > < @@ -527,6 +527,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv* > --- >> @@ -518,6 +518,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv* > 22c22 > < ObjectSynchronizer::wait(obj, ms, THREAD); > --- >> ObjectSynchronizer::wait(obj, ms, CHECK); > 24c24 > < diff -r bdb3798edd11 src/share/vm/runtime/objectMonitor.cpp > --- >> diff -r 231051bff068 src/share/vm/runtime/objectMonitor.cpp > 34c34 > < @@ -385,6 +385,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { > --- >> @@ -382,6 +382,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { > 47c47 > < @@ -442,6 +448,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { > --- >> @@ -439,6 +445,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { > 60c60 > < @@ -1459,6 +1471,14 @@ void ObjectMonitor::wait(jlong millis, b > --- >> @@ -1456,6 +1468,14 @@ void ObjectMonitor::wait(jlong millis, b > 75c75 > < @@ -1501,21 +1521,6 @@ void ObjectMonitor::wait(jlong millis, b > --- >> @@ -1498,21 +1518,6 @@ void ObjectMonitor::wait(jlong millis, b > 97c97 > < @@ -1598,6 +1603,33 @@ void ObjectMonitor::wait(jlong millis, b > --- >> @@ -1595,6 +1600,33 @@ void ObjectMonitor::wait(jlong millis, b > > > On 2/21/14 8:40 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> This is a code review request for the JDK8u-hs-dev backport of the >> following ObjectMonitor-JVM/TI hang fix: >> >> 8028073 race condition in ObjectMonitor implementation causing >> deadlocks >> https://bugs.openjdk.java.net/browse/JDK-8028073 >> >> Here is the JDK8u-hs-dev webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ >> >> This is _almost_ a straight forward backport of the JDK9 fix. 
The only >> difference to the fix was discussed at the end of the JDK9 review and >> was determined to only be needed in versions of HotSpot without the >> fix for 8028280: >> >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html >> >> >> >> 8028280 has not yet been backported to JDK8u-hs-dev. >> >> The easiest way to review the backport is to download the two patch >> files from the webrevs and compare them with something like: >> >> jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch >> >> The same testing has been performed on the JDK8u-hs-dev version as >> with the JDK9-hs-runtime version. >> >> Thanks, in advance, for any comments, questions or suggestions. >> >> Dan >> >> >> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: >> > Greetings, >> > >> > I have a fix ready for the following bug: >> > >> > 8028073 race condition in ObjectMonitor implementation causing >> deadlocks >> > https://bugs.openjdk.java.net/browse/JDK-8028073 >> > >> > On the surface, this is a very simple fix that relocates a few >> lines of >> > code, relocates and rewrites the comments associated with that code >> and >> > adds several new comments. >> > >> > Of course, in reality, the issue is much more complicated, but I'm >> > hoping to make it easy for anyone not acquainted with this issue to >> > understand what's going on. 
>> > >> > Here are the JDK9 webrev URLs: >> > >> > OpenJDK: >> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >> > >> > Oracle internal: >> > >> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >> > >> > The simple summary: >> > >> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >> > it is possible for a JVM/TI monitor event handler to accidentally >> > consume a ParkEvent.unpark() call meant for the Java Monitor layer >> > - the original code fix was made on 2005.07.04 using this bug ID: >> > https://bugs.openjdk.java.net/browse/JDK-5030359 >> > - it's the right fix, but it's in the wrong place >> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >> > event handler is called because it is that event handler >> > that can cause the hang >> > >> > >> > Testing >> > ------- >> > >> > - a new StressMonitorWait test has been created that reliably >> > reproduces the hang in JDK[6789]; see the bug's gory details >> > for the specific versions where the hang has been reproduced >> > - the test reliably reproduces the hang in 5 seconds on my >> > T7600 running Solaris 10u11 X86; 1 minute runs reproduce >> > the hang reliably on other machines >> > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >> > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >> > bits did not reproduce the hang >> > - JPRT test job >> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >> > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >> > and Windows-X64: >> > - vm.quick >> > - Kitchensink (bigapps) >> > - Weblogic+medrec (bigapps) >> > - runThese (bigapps) >> > >> > >> > The Gory Details Start Here >> > --------------------------- >> > >> > This is the old location of block of code that's being moved: >> > >> > src/share/vm/runtime/objectMonitor.cpp: >> > >> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) { >> > >> > 1499 exit (true,
Self) ; // exit the monitor >> > >> > 1513 if (node._notified != 0 && _succ == Self) { >> > 1514 node._event->unpark(); >> > 1515 } >> > >> > >> > This is the new location of block of code that's being moved: >> > >> > src/share/vm/runtime/objectMonitor.cpp: >> > >> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, >> TRAPS) { >> > >> > 1601 if (JvmtiExport::should_post_monitor_waited()) { >> > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >> OS_TIMEOUT); >> > >> > 1604 if (node._notified != 0 && _succ == Self) { >> > >> > 1620 node._event->unpark(); >> > 1621 } >> > >> > >> > The Risks >> > --------- >> > >> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event >> > is enabled: >> > - previously it was always executed >> > - while the old code was not effective for the hang that is being >> > fixed with this bug, it is possible that the old code prevented >> > a different bug in the successor protocol from manifesting >> > - thorough analysis of the successor protocol did not reveal a >> > case where the old code was needed in the old location >> > - Thorough analysis indicates that the other JVM/TI monitor events >> > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >> > - the successor protocol is complicated and the analysis could >> > be wrong when certain options are used >> > - comments were added to each location where a JVM/TI monitor >> > event handler is called documenting why a fix like this one >> > is not needed there >> > - if the analysis is wrong, the new comments show where a new >> > code change would be needed >> > >> > >> > The Scenario >> > ------------ >> > >> > I've created a scenario that reproduces this hang: >> > >> > T1 - enters monitor and calls monitor.wait() >> > T2 - enters the monitor, calls monitor.notify() and exits the monitor >> > T3 - enters and exits the monitor >> > T4 - enters the monitor, delays for 5 seconds, exits the monitor >> > >> > A JVM/TI agent that enables 
JVMTI_EVENT_MONITOR_WAITED and has a >> > handler that: enters a raw monitor, waits for 1ms, exits a raw >> monitor. >> > >> > Here are the six events necessary to make this hang happen: >> > >> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >> field, but >> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the >> successor again. >> > >> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor >> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >> > >> > // KEY-EVENT-3a: T3 made T1 the successor >> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed >> it. >> > >> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >> > >> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so >> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >> > >> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but >> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event >> handler >> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >> stuck. >> > >> > >> > This bug is intertwined with: >> > >> > - The ObjectMonitor successor protocol >> > - the sharing of a ParkEvent between Java Monitors and JVM/TI >> RawMonitors >> > >> > There is a very long successor.notes attachment to JDK-8028073 that >> > attempts to describe the ObjectMonitor successor protocol. It's good >> > for putting pretty much anyone to sleep. >> > >> > Since this hang reproduces back to JDK6, this bug is taking the easily >> > backported solution of moving the original fix to the right location. 
>> > The following new bug has been filed for possible future work in this >> > area by the Serviceability Team: >> > >> > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >> > https://bugs.openjdk.java.net/browse/JDK-8033399 >> > >> > >> > The Symptoms >> > ------------ >> > >> > With intermittent hangs like this, it is useful to know what to look >> > for in order to determine if you are running into this issue: >> > >> > - if you aren't using a debugger or a profiler or some other >> > JVM/TI agent, then this hang is not the same as yours >> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >> > event handler, then this hang is not the same as yours >> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >> > JVM/TI RawMonitors, then this hang is not the same as yours >> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >> > back into Java code, then you might just be insane and this >> > hang might be similar to yours. However, using a Java callback >> > in an event handler is an even bigger problem/risk so fix that >> > first. 
>> > - if you have one or more threads blocked like this and making no >> > progress, then this hang might be the same as yours: >> > >> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] >> > java.lang.Thread.State: BLOCKED (on object monitor) >> > JavaThread state: _thread_blocked >> > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >> > JavaThread state: _thread_blocked >> > at java.lang.Object.wait(Native Method) >> > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > at java.lang.Object.wait(Object.java:502) >> > at SMW_WorkerThread.run(StressMonitorWait.java:103) >> > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] >> > java.lang.Thread.State: BLOCKED (on object monitor) >> > JavaThread state: _thread_blocked >> > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >> > JavaThread state: _thread_blocked >> > at SMW_WorkerThread.run(StressMonitorWait.java:120) >> > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] >> > java.lang.Thread.State: BLOCKED (on object monitor) >> > JavaThread state: _thread_blocked >> > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >> > JavaThread state: _thread_blocked >> > at SMW_WorkerThread.run(StressMonitorWait.java:139) >> > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > Key symptoms in thread T1: >> > >> > - had the object locked: >> > >> > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > - did an Object.wait(): >> > >> > waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > - is blocked on
reentry: >> > >> > waiting for monitor entry [0xfffffd7fc0231000] >> > >> > Key symptoms in thread T2: >> > >> > - is blocked waiting to lock the object: >> > >> > waiting for monitor entry [0xfffffd7fc0130000] >> > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> > >> > Key symptoms in thread T3: >> > >> > - is blocked waiting to lock the object: >> > >> > waiting for monitor entry [0xfffffd7fc002f000] >> > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >> >> >> > From daniel.daugherty at oracle.com Fri Feb 28 14:15:55 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 28 Feb 2014 15:15:55 -0700 Subject: RFR round 0 JDK7u backport of ObjectMonitor-JVM/TI hang fix (8028073) In-Reply-To: <53110929.6020507@oracle.com> References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com> <531105BC.4050509@oracle.com> <53110929.6020507@oracle.com> Message-ID: <53110A9B.1030508@oracle.com> > Looks good. Thanks! Last backport (by me) for this one... Dan On 2/28/14 3:09 PM, serguei.spitsyn at oracle.com wrote: > On 2/28/14 1:55 PM, Daniel D. Daugherty wrote: >> Resend with the corrected subject line... sigh... >> >> Greetings, >> >> This is a code review request for the JDK7u-hs-dev backport of the >> following ObjectMonitor-JVM/TI hang fix: >> >> 8028073 race condition in ObjectMonitor implementation causing >> deadlocks >> https://bugs.openjdk.java.net/browse/JDK-8028073 >> >> Here is the JDK7u-hs-dev webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk7u-hs-dev/ >> >> This is a straight forward backport of the JDK8u-hs-dev version of the >> fix modulo lines numbers and context diff anchors. The diff of the two >> 'hg diff --nodates' files is below. >> >> Same testing as for the JDK9 and JDK8u versions with the same results. >> >> I need one sanity check review from someone. > > Looks good. 
> > Thanks, > Serguei > > >> >> Dan >> >> $ diff 8028073-diffs-jdk[78]u-hs.txt.00 >> 1c1 >> < diff -r bdb3798edd11 src/share/vm/prims/jvm.cpp >> --- >>> diff -r 231051bff068 src/share/vm/prims/jvm.cpp >> 11c11 >> < @@ -527,6 +527,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv* >> --- >>> @@ -518,6 +518,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv* >> 22c22 >> < ObjectSynchronizer::wait(obj, ms, THREAD); >> --- >>> ObjectSynchronizer::wait(obj, ms, CHECK); >> 24c24 >> < diff -r bdb3798edd11 src/share/vm/runtime/objectMonitor.cpp >> --- >>> diff -r 231051bff068 src/share/vm/runtime/objectMonitor.cpp >> 34c34 >> < @@ -385,6 +385,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { >> --- >>> @@ -382,6 +382,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { >> 47c47 >> < @@ -442,6 +448,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { >> --- >>> @@ -439,6 +445,12 @@ void ATTR ObjectMonitor::enter(TRAPS) { >> 60c60 >> < @@ -1459,6 +1471,14 @@ void ObjectMonitor::wait(jlong millis, b >> --- >>> @@ -1456,6 +1468,14 @@ void ObjectMonitor::wait(jlong millis, b >> 75c75 >> < @@ -1501,21 +1521,6 @@ void ObjectMonitor::wait(jlong millis, b >> --- >>> @@ -1498,21 +1518,6 @@ void ObjectMonitor::wait(jlong millis, b >> 97c97 >> < @@ -1598,6 +1603,33 @@ void ObjectMonitor::wait(jlong millis, b >> --- >>> @@ -1595,6 +1600,33 @@ void ObjectMonitor::wait(jlong millis, b >> >> >> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> This is a code review request for the JDK8u-hs-dev backport of the >>> following ObjectMonitor-JVM/TI hang fix: >>> >>> 8028073 race condition in ObjectMonitor implementation causing >>> deadlocks >>> https://bugs.openjdk.java.net/browse/JDK-8028073 >>> >>> Here is the JDK8u-hs-dev webrev URL: >>> >>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/ >>> >>> This is _almost_ a straight forward backport of the JDK9 fix. 
The only >>> difference to the fix was discussed at the end of the JDK9 review and >>> was determined to only be needed in versions of HotSpot without the >>> fix for 8028280: >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html >>> >>> >>> >>> 8028280 has not yet been backported to JDK8u-hs-dev. >>> >>> The easiest way to review the backport is to download the two patch >>> files from the webrevs and compare them with something like: >>> >>> jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch >>> >>> The same testing has been performed on the JDK8u-hs-dev version as >>> with the JDK9-hs-runtime version. >>> >>> Thanks, in advance, for any comments, questions or suggestions. >>> >>> Dan >>> >>> >>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote: >>> > Greetings, >>> > >>> > I have a fix ready for the following bug: >>> > >>> > 8028073 race condition in ObjectMonitor implementation causing >>> deadlocks >>> > https://bugs.openjdk.java.net/browse/JDK-8028073 >>> > >>> > On the surface, this is a very simple fix that relocates a few >>> lines of >>> > code, relocates and rewrites the comments associated with that >>> code and >>> > adds several new comments. >>> > >>> > Of course, in reality, the issue is much more complicated, but I'm >>> > hoping to make it easy for anyone not acquainted with this issue to >>> > understand what's going on. 
>>> > >>> > Here are the JDK9 webrev URLs: >>> > >>> > OpenJDK: >>> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/ >>> > >>> > Oracle internal: >>> > >>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ >>> >>> > >>> > The simple summary: >>> > >>> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent, >>> > it is possible for a JVM/TI monitor event handler to accidentally >>> > consume a ParkEvent.unpark() call meant for the Java Monitor layer >>> > - the original code fix was made on 2005.07.04 using this bug ID: >>> > https://bugs.openjdk.java.net/browse/JDK-5030359 >>> > - it's the right fix, but it's in the wrong place >>> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED >>> > event handler is called because it is that event handler >>> > that can cause the hang >>> > >>> > >>> > Testing >>> > ------- >>> > >>> > - a new StressMonitorWait test has been created that reliably >>> > reproduces the hang in JDK[6789]; see the bug's gory details >>> > for the specific versions where the hang has been reproduced >>> > - the test reliably reproduces the hang in 5 seconds on my >>> > T7600 running Solaris 10u11 X86; 1 minute runs reproduce >>> > the hang reliably on other machines >>> > - 12 hour stress run of the new test on Linux-X64, MacOS X-X64, >>> > Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT >>> > bits did not reproduce the hang >>> > - JPRT test job >>> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86, >>> > Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86, >>> > and Windows-X64: >>> > - vm.quick >>> > - Kitchensink (bigapps) >>> > - Weblogic+medrec (bigapps) >>> > - runThese (bigapps) >>> > >>> > >>> > The Gory Details Start Here >>> > --------------------------- >>> > >>> > This is the old location of block of code that's being moved: >>> > >>> > src/share/vm/runtime/objectMonitor.cpp: >>> > >>> > 1440 void ObjectMonitor::wait(jlong millis, bool
interruptible, >>> TRAPS) { >>> > >>> > 1499 exit (true, Self) ; // exit the monitor >>> > >>> > 1513 if (node._notified != 0 && _succ == Self) { >>> > 1514 node._event->unpark(); >>> > 1515 } >>> > >>> > >>> > This is the new location of block of code that's being moved: >>> > >>> > src/share/vm/runtime/objectMonitor.cpp: >>> > >>> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, >>> TRAPS) { >>> > >>> > 1601 if (JvmtiExport::should_post_monitor_waited()) { >>> > 1602 JvmtiExport::post_monitor_waited(jt, this, ret == >>> OS_TIMEOUT); >>> > >>> > 1604 if (node._notified != 0 && _succ == Self) { >>> > >>> > 1620 node._event->unpark(); >>> > 1621 } >>> > >>> > >>> > The Risks >>> > --------- >>> > >>> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED >>> event >>> > is enabled: >>> > - previously it was always executed >>> > - while the old code was not effective for the hang that is being >>> > fixed with this bug, it is possible that the old code prevented >>> > a different bug in the successor protocol from manifesting >>> > - thorough analysis of the successor protocol did not reveal a >>> > case where the old code was needed in the old location >>> > - Thorough analysis indicates that the other JVM/TI monitor events >>> > do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED: >>> > - the successor protocol is complicated and the analysis could >>> > be wrong when certain options are used >>> > - comments were added to each location where a JVM/TI monitor >>> > event handler is called documenting why a fix like this one >>> > is not needed there >>> > - if the analysis is wrong, the new comments show where a new >>> > code change would be needed >>> > >>> > >>> > The Scenario >>> > ------------ >>> > >>> > I've created a scenario that reproduces this hang: >>> > >>> > T1 - enters monitor and calls monitor.wait() >>> > T2 - enters the monitor, calls monitor.notify() and exits the monitor >>> > T3 - enters and exits the 
monitor >>> > T4 - enters the monitor, delays for 5 seconds, exits the monitor >>> > >>> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a >>> > handler that: enters a raw monitor, waits for 1ms, exits a raw >>> monitor. >>> > >>> > Here are the six events necessary to make this hang happen: >>> > >>> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ >>> field, but >>> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the >>> successor again. >>> > >>> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the >>> successor >>> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler. >>> > >>> > // KEY-EVENT-3a: T3 made T1 the successor >>> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 >>> grabbed it. >>> > >>> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so >>> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ. >>> > >>> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so >>> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog) >>> > >>> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but >>> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event >>> handler >>> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be >>> stuck. >>> > >>> > >>> > This bug is intertwined with: >>> > >>> > - The ObjectMonitor successor protocol >>> > - the sharing of a ParkEvent between Java Monitors and JVM/TI >>> RawMonitors >>> > >>> > There is a very long successor.notes attachment to JDK-8028073 that >>> > attempts to describe the ObjectMonitor successor protocol. It's good >>> > for putting pretty much anyone to sleep. >>> > >>> > Since this hang reproduces back to JDK6, this bug is taking the >>> easily >>> > backported solution of moving the original fix to the right location. 
>>> > The following new bug has been filed for possible future work in this >>> > area by the Serviceability Team: >>> > >>> > 8033399 add a separate ParkEvent for JVM/TI RawMonitor use >>> > https://bugs.openjdk.java.net/browse/JDK-8033399 >>> > >>> > >>> > The Symptoms >>> > ------------ >>> > >>> > With intermittent hangs like this, it is useful to know what to look >>> > for in order to determine if you are running into this issue: >>> > >>> > - if you aren't using a debugger or a profiler or some other >>> > JVM/TI agent, then this hang is not the same as yours >>> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED >>> > event handler, then this hang is not the same as yours >>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using >>> > JVM/TI RawMonitors, then this hang is not the same as yours >>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling >>> > back into Java code, then you might just be insane and this >>> > hang might be similar to yours. However, using a Java callback >>> > in an event handler is an even bigger problem/risk so fix that >>> > first. 
>>> > - if you have one or more threads blocked like this and making no >>> > progress, then this hang might be the same as yours: >>> > >>> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000] >>> > java.lang.Thread.State: BLOCKED (on object monitor) >>> > JavaThread state: _thread_blocked >>> > Thread: 0x00000000009ca800 [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>> > JavaThread state: _thread_blocked >>> > at java.lang.Object.wait(Native Method) >>> > - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > at java.lang.Object.wait(Object.java:502) >>> > at SMW_WorkerThread.run(StressMonitorWait.java:103) >>> > - locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000] >>> > java.lang.Thread.State: BLOCKED (on object monitor) >>> > JavaThread state: _thread_blocked >>> > Thread: 0x00000000009cc000 [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>> > JavaThread state: _thread_blocked >>> > at SMW_WorkerThread.run(StressMonitorWait.java:120) >>> > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000] >>> > java.lang.Thread.State: BLOCKED (on object monitor) >>> > JavaThread state: _thread_blocked >>> > Thread: 0x00000000009ce000 [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 >>> > JavaThread state: _thread_blocked >>> > at SMW_WorkerThread.run(StressMonitorWait.java:139) >>> > - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > Key symptoms in thread T1: >>> > >>> > - had the object locked: >>> > >>> > locked <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > - did an Object.wait(): >>> > >>> > waiting on <0xfffffd7e6a2b6ff0>
(a java.lang.String) >>> > >>> > - is blocked on reentry: >>> > >>> > waiting for monitor entry [0xfffffd7fc0231000] >>> > >>> > Key symptoms in thread T2: >>> > >>> > - is blocked waiting to lock the object: >>> > >>> > waiting for monitor entry [0xfffffd7fc0130000] >>> > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> > >>> > Key symptoms in thread T3: >>> > >>> > - is blocked waiting to lock the object: >>> > >>> > waiting for monitor entry [0xfffffd7fc002f000] >>> > waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String) >>> >>> >>> >> > From chris.hegarty at oracle.com Fri Feb 28 14:27:08 2014 From: chris.hegarty at oracle.com (Chris Hegarty) Date: Fri, 28 Feb 2014 22:27:08 +0000 Subject: RFR(S): JDK-8035952 Remove use of JVM_Open, JVM_Read and JVM_Close functions from serviceability code In-Reply-To: <5310FD6E.7050305@oracle.com> References: <5310FD6E.7050305@oracle.com> Message-ID: <711DB132-BE41-436F-91BD-3AB2CCB9524A@oracle.com> Looks good to me Frederic. JVM_Read is restartable, but that may not be an issue here. -Chris. > On 28 Feb 2014, at 21:19, frederic parain wrote: > > Greetings, > > Please review this small changeset for CR JDK-8035952 > "Remove use of JVM_Open, JVM_Read and JVM_Close functions from serviceability code" > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8035952 > > Webrev: > http://cr.openjdk.java.net/~fparain/8035952/webrev.00/ > > Tested with jdk_management test suite + manual tests > to verify consistency of values returned by new code. 
> > Thank you, > > Fred > > -- > Frederic Parain - Oracle > Grenoble Engineering Center - France > Phone: +33 4 76 18 81 17 > Email: Frederic.Parain at oracle.com From martinrb at google.com Fri Feb 28 14:31:29 2014 From: martinrb at google.com (Martin Buchholz) Date: Fri, 28 Feb 2014 14:31:29 -0800 Subject: RFR(S): JDK-8035952 Remove use of JVM_Open, JVM_Read and JVM_Close functions from serviceability code In-Reply-To: <711DB132-BE41-436F-91BD-3AB2CCB9524A@oracle.com> References: <5310FD6E.7050305@oracle.com> <711DB132-BE41-436F-91BD-3AB2CCB9524A@oracle.com> Message-ID: I'm still bummed out that no one ever explained what JVM_Read's purpose in life was. https://bugs.openjdk.java.net/browse/JDK-4904617 JVM_Read needs better documentation On Fri, Feb 28, 2014 at 2:27 PM, Chris Hegarty wrote: > Looks good to me Frederic. > > JVM_Read is restartable, but that may not be an issue here. > > -Chris. > > > On 28 Feb 2014, at 21:19, frederic parain > wrote: > > > > Greetings, > > > > Please review this small changeset for CR JDK-8035952 > > "Remove use of JVM_Open, JVM_Read and JVM_Close functions from > serviceability code" > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8035952 > > > > Webrev: > > http://cr.openjdk.java.net/~fparain/8035952/webrev.00/ > > > > Tested with jdk_management test suite + manual tests > > to verify consistency of values returned by new code. > > > > Thank you, > > > > Fred > > > > -- > > Frederic Parain - Oracle > > Grenoble Engineering Center - France > > Phone: +33 4 76 18 81 17 > > Email: Frederic.Parain at oracle.com
From mandy.chung at oracle.com Fri Feb 28 15:10:08 2014 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 28 Feb 2014 15:10:08 -0800 Subject: RFR(S): JDK-8035952 Remove use of JVM_Open, JVM_Read and JVM_Close functions from serviceability code In-Reply-To: <5310FD6E.7050305@oracle.com> References: <5310FD6E.7050305@oracle.com> Message-ID: <53111750.3070106@oracle.com> On 2/28/14 1:19 PM, frederic parain wrote: > Greetings, > > Please review this small changeset for CR JDK-8035952 > "Remove use of JVM_Open, JVM_Read and JVM_Close functions from > serviceability code" > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8035952 > > Webrev: > http://cr.openjdk.java.net/~fparain/8035952/webrev.00/ Looks good. It should have used open/read/close functions from the beginning. thanks Mandy