From dmitry.samersoff at oracle.com  Sat Feb  1 01:02:29 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Sat, 01 Feb 2014 13:02:29 +0400
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EC62CD.8000300@oracle.com>
References: <52EC62CD.8000300@oracle.com>
Message-ID: <52ECB825.70600@oracle.com>

Looks good to me!

On 2014-02-01 06:58, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>   https://bugs.openjdk.java.net/browse/JDK-6471769
> 
> 
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/
> 
> 
> Summary:
> 
>   There is a general issue in the suspend equivalent condition mechanism:
>   two consecutive calls to JvmtiEnv::is_thread_fully_suspended() may
>   return different results:
>     - 1st: true
>     - 2nd: false
> 
>   This more generic suspend equivalent issue is covered by another bug:
>     https://bugs.openjdk.java.net/browse/JDK-6280037
> 
>   The bug to fix in this review is a specific manifestation of 6280037
>   in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
>   It is on the Test Stabilization radar (as is 6280037).
>   Many tests fail intermittently because of this.
> 
>   The webrev for review is a one-line workaround for 6280037 in
>   GetFrameCount().
> 
>   The JVMTI GetFrameCount() spec says:
>     "If this function is called for a thread actively executing
>      bytecodes (for example, not the current thread and not suspended),
>      the information returned is transient."
> 
>   So it is OK to call GetFrameCount() for non-suspended target threads.
>   To achieve safety, the frame count for non-suspended threads is
>   calculated at a safepoint.
>   It should be OK, and safer, to do the same for suspended threads as well.
>   There is no big performance impact because it is already on a slow path.
>   It is still important to avoid safepointing when the target thread is
>   the current thread.
> 
>   Bug 6280037 should come off the Test Stabilization radar (remove the
>   svc-nightly label), as most of the impacted tests are covered by 6471769.
> 
> 
> Testing:
>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
> 
> 
> Thanks,
> Serguei
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the source code.

From serguei.spitsyn at oracle.com  Sat Feb  1 04:24:43 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sat, 01 Feb 2014 04:24:43 -0800
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52ECB825.70600@oracle.com>
References: <52EC62CD.8000300@oracle.com> <52ECB825.70600@oracle.com>
Message-ID: <52ECE78B.10809@oracle.com>

Thanks, Dmitry!
Serguei



From daniel.daugherty at oracle.com  Sat Feb  1 10:38:32 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Sat, 01 Feb 2014 11:38:32 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
Message-ID: <52ED3F28.5070806@oracle.com>

Greetings,

I have a fix ready for the following bug:

     8028073 race condition in ObjectMonitor implementation causing deadlocks
     https://bugs.openjdk.java.net/browse/JDK-8028073

On the surface, this is a very simple fix that relocates a few lines of
code, relocates and rewrites the comments associated with that code and
adds several new comments.

Of course, in reality, the issue is much more complicated, but I'm
hoping to make it easy for anyone not acquainted with this issue to
understand what's going on.

Here are the JDK9 webrev URLs:

OpenJDK:
http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/

Oracle internal:
http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/

The simple summary:

- since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
   it is possible for a JVM/TI monitor event handler to accidentally
   consume a ParkEvent.unpark() call meant for the Java Monitor layer
- the original code fix was made on 2005.07.04 using this bug ID:
   https://bugs.openjdk.java.net/browse/JDK-5030359
- it's the right fix, but it's in the wrong place
- the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
   event handler is called because it is that event handler
   that can cause the hang


Testing
-------

- a new StressMonitorWait test has been created that reliably
   reproduces the hang in JDK[6789]; see the bug's gory details
   for the specific versions where the hang has been reproduced
   - the test reliably reproduces the hang in 5 seconds on my
     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
     the hang reliably on other machines
   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
     bits did not reproduce the hang
- JPRT test job
- VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
   and Windows-X64:
   - vm.quick
   - Kitchensink (bigapps)
   - Weblogic+medrec (bigapps)
   - runThese (bigapps)


The Gory Details Start Here
---------------------------

This is the old location of the block of code that's being moved:

src/share/vm/runtime/objectMonitor.cpp:

1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
<snip>
1499    exit (true, Self) ;                    // exit the monitor
<snip>
1513    if (node._notified != 0 && _succ == Self) {
1514       node._event->unpark();
1515    }


This is the new location of the block of code:

src/share/vm/runtime/objectMonitor.cpp:

1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
<snip>
1601      if (JvmtiExport::should_post_monitor_waited()) {
1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
<snip>
1604        if (node._notified != 0 && _succ == Self) {
<snip>
1620          node._event->unpark();
1621        }


The Risks
---------

- The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
   is enabled:
   - previously it was always executed
   - while the old code was not effective for the hang that is being
     fixed with this bug, it is possible that the old code prevented
     a different bug in the successor protocol from manifesting
   - thorough analysis of the successor protocol did not reveal a
     case where the old code was needed in the old location
- Thorough analysis indicates that the other JVM/TI monitor events
   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
   - the successor protocol is complicated and the analysis could
     be wrong when certain options are used
   - comments were added to each location where a JVM/TI monitor
     event handler is called documenting why a fix like this one
     is not needed there
   - if the analysis is wrong, the new comments show where a new
     code change would be needed


The Scenario
------------

I've created a scenario that reproduces this hang:

T1 - enters monitor and calls monitor.wait()
T2 - enters the monitor, calls monitor.notify() and exits the monitor
T3 - enters and exits the monitor
T4 - enters the monitor, delays for 5 seconds, exits the monitor

The scenario also needs a JVM/TI agent that enables
JVMTI_EVENT_MONITOR_WAITED and has a handler that enters a raw monitor,
waits for 1ms, and exits the raw monitor.
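
The Java side of the four threads can be sketched roughly as follows. This
is not the actual StressMonitorWait test; the class, the Shared holder and
runOnce() are hypothetical names, and the JVM/TI agent is not part of the
sketch, so on its own this code is expected to finish rather than hang.

```java
public class MonitorWaitScenarioSketch {
    /** Shared monitor object plus the flags the threads coordinate on. */
    static final class Shared {
        boolean t1Waiting;  // set by T1 just before it calls wait()
        boolean notified;   // set by T2 when it calls notify()
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs T1..T4 once; returns true if all of them finish within the join timeout. */
    public static boolean runOnce(long t4DelayMillis, long joinTimeoutMillis) {
        final Shared monitor = new Shared();

        // T1 - enters the monitor and calls monitor.wait()
        Thread t1 = new Thread(() -> {
            synchronized (monitor) {
                monitor.t1Waiting = true;
                while (!monitor.notified) {
                    try {
                        monitor.wait();  // releases the monitor, reenters on wakeup
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "T1");

        // T2 - enters the monitor, calls monitor.notify() and exits the monitor
        Thread t2 = new Thread(() -> {
            boolean done = false;
            while (!done && !Thread.currentThread().isInterrupted()) {
                synchronized (monitor) {
                    if (monitor.t1Waiting) {  // T1 has released the monitor in wait()
                        monitor.notified = true;
                        monitor.notify();
                        done = true;
                    }
                }
                if (!done) {
                    sleepQuietly(1);
                }
            }
        }, "T2");

        // T3 - enters and exits the monitor
        Thread t3 = new Thread(() -> {
            synchronized (monitor) {
                // nothing to do while holding the monitor
            }
        }, "T3");

        // T4 - enters the monitor, delays, exits the monitor
        Thread t4 = new Thread(() -> {
            synchronized (monitor) {
                sleepQuietly(t4DelayMillis);
            }
        }, "T4");

        Thread[] threads = {t1, t2, t3, t4};
        for (Thread t : threads) {
            t.start();
        }
        boolean allFinished = true;
        for (Thread t : threads) {
            try {
                t.join(joinTimeoutMillis);
            } catch (InterruptedException e) {
                return false;
            }
            allFinished &= !t.isAlive();
        }
        return allFinished;
    }

    public static void main(String[] args) {
        // 5 second delay in T4, as in the scenario description.
        System.out.println(runOnce(5000, 30000));
    }
}
```

Without the agent the interleaving below cannot lose an unpark, so the
sketch only shows the roles of the threads, not the hang itself.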

Here are the six events necessary to make this hang happen:

// KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
// KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.

// KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
// KEY-EVENT-2b: is consumed by the JVM/TI event handler.

// KEY-EVENT-3a: T3 made T1 the successor
// KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.

// KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
// KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.

// KEY-EVENT-5a: T4 sees that T1 is still the successor so
// KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)

// KEY-EVENT-6a: T1 is about to park and it is the successor, but
// KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
// KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.


This bug is intertwined with:

- The ObjectMonitor successor protocol
- the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors

There is a very long successor.notes attachment to JDK-8028073 that
attempts to describe the ObjectMonitor successor protocol. It's good
for putting pretty much anyone to sleep.

Since this hang reproduces back to JDK6, this bug is taking the easily
backported solution of moving the original fix to the right location.
The following new bug has been filed for possible future work in this
area by the Serviceability Team:

     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
     https://bugs.openjdk.java.net/browse/JDK-8033399


The Symptoms
------------

With intermittent hangs like this, it is useful to know what to look
for in order to determine if you are running into this issue:

- if you aren't using a debugger or a profiler or some other
   JVM/TI agent, then this hang is not the same as yours
- if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
   event handler, then this hang is not the same as yours
- if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
   JVM/TI RawMonitors, then this hang is not the same as yours
- if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
   back into Java code, then you might just be insane and this
   hang might be similar to yours. However, using a Java callback
   in an event handler is an even bigger problem/risk so fix that
   first.
- if you have one or more threads blocked like this and making no
   progress, then this hang might be the same as yours:

"T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
    java.lang.Thread.State: BLOCKED (on object monitor)
    JavaThread state: _thread_blocked
Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
    JavaThread state: _thread_blocked
         at java.lang.Object.wait(Native Method)
         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
         at java.lang.Object.wait(Object.java:502)
         at SMW_WorkerThread.run(StressMonitorWait.java:103)
         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)

"T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
    java.lang.Thread.State: BLOCKED (on object monitor)
    JavaThread state: _thread_blocked
Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
    JavaThread state: _thread_blocked
         at SMW_WorkerThread.run(StressMonitorWait.java:120)
         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)

"T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
    java.lang.Thread.State: BLOCKED (on object monitor)
    JavaThread state: _thread_blocked
Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
    JavaThread state: _thread_blocked
         at SMW_WorkerThread.run(StressMonitorWait.java:139)
         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)

Key symptoms in thread T1:

- had the object locked:

   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)

- did an Object.wait():

   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)

- is blocked on reentry:

   waiting for monitor entry [0xfffffd7fc0231000]

Key symptoms in thread T2:

- is blocked waiting to lock the object:

   waiting for monitor entry [0xfffffd7fc0130000]
   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)

Key symptoms in thread T3:

- is blocked waiting to lock the object:

   waiting for monitor entry [0xfffffd7fc002f000]
   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)


From staffan.larsen at oracle.com  Mon Feb  3 02:39:12 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 3 Feb 2014 11:39:12 +0100
Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not
	available in client VM
In-Reply-To: <52EBFC89.3030809@oracle.com>
References: <52EBFC89.3030809@oracle.com>
Message-ID: <A10C11D1-EABD-4D6A-B3BF-137E4F3EDB4E@oracle.com>

It looks like this new version will leave some things uninitialized if an item higher on the list throws an exception. I don't think that is what you want.
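
To make the concern concrete, here is a small Java sketch (sa.js itself is
JavaScript, and this is not its code). The lookup() helper, the set of
types a VM defines, and the order of the list are all assumptions made for
illustration; only the type names come from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

public class TryScopeSketch {
    /** Hypothetical lookup: throws if the target VM does not define the type. */
    static String lookup(Set<String> vmTypes, String name) {
        if (!vmTypes.contains(name)) {
            throw new NoSuchElementException(name);
        }
        return name;
    }

    /** One try block around the whole list: the first miss skips everything after it. */
    public static List<String> initWithSingleTry(Set<String> vmTypes, List<String> wanted) {
        List<String> initialized = new ArrayList<>();
        try {
            for (String name : wanted) {
                initialized.add(lookup(vmTypes, name));
            }
        } catch (NoSuchElementException ignored) {
            // swallowed: the remaining items are silently left uninitialized
        }
        return initialized;
    }

    /** One try block per item: a miss affects only that item. */
    public static List<String> initPerItem(Set<String> vmTypes, List<String> wanted) {
        List<String> initialized = new ArrayList<>();
        for (String name : wanted) {
            try {
                initialized.add(lookup(vmTypes, name));
            } catch (NoSuchElementException ignored) {
                // skip only the missing type
            }
        }
        return initialized;
    }

    public static void main(String[] args) {
        // Assumed client VM: defines JavaThread but not the C2-only ExceptionBlob.
        Set<String> clientVm = new HashSet<>(Arrays.asList("JavaThread"));
        List<String> wanted = Arrays.asList("ExceptionBlob", "JavaThread");
        System.out.println(initWithSingleTry(clientVm, wanted)); // prints []
        System.out.println(initPerItem(clientVm, wanted));       // prints [JavaThread]
    }
}
```

Whether this matters in practice depends on whether every item inside the
try block really is specific to one VM type.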

/Staffan


On 31 jan 2014, at 20:42, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:

> Hi Everybody,
> 
> Please review the fix:
> 
> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
> 
> sa.js tries to initialize all possible VM structures regardless of the
> type of VM we are running on
> 
> --
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the sources.


From dmitry.samersoff at oracle.com  Mon Feb  3 02:59:01 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Mon, 03 Feb 2014 14:59:01 +0400
Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not
	available in client VM
In-Reply-To: <A10C11D1-EABD-4D6A-B3BF-137E4F3EDB4E@oracle.com>
References: <52EBFC89.3030809@oracle.com>
	<A10C11D1-EABD-4D6A-B3BF-137E4F3EDB4E@oracle.com>
Message-ID: <52EF7675.3020707@oracle.com>

Staffan,

All items under the try-catch exist only in the server (C2 compiler) VM,
so if one of them is missing in the target VM, the other items are missing
as well.

I have no idea what happens later if SA tries to access one of these
uninitialized items, but I don't see any errors in my home testing.

-Dmitry

On 2014-02-03 14:39, Staffan Larsen wrote:
> It looks like this new version will leave some things uninitialized
> if an item higher on the list throws an exception. I don't think that
> is what you want.
> 
> /Staffan
> 
> 
> On 31 jan 2014, at 20:42, Dmitry Samersoff
> <dmitry.samersoff at oracle.com> wrote:
> 
>> Hi Everybody,
>> 
>> Please review the fix:
>> 
>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>> 
>> sa.js tries to initialize all possible VM structures regardless of
>> the type of VM we are running on
>> 
>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>> Russia * I would love to change the world, but they won't give me
>> the sources.
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From staffan.larsen at oracle.com  Mon Feb  3 03:57:27 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 3 Feb 2014 12:57:27 +0100
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EC62CD.8000300@oracle.com>
References: <52EC62CD.8000300@oracle.com>
Message-ID: <3E8D21DD-BD7A-4AEF-9C02-C175AEB40738@oracle.com>

Looks good!

Thanks,
/Staffan



From david.holmes at oracle.com  Mon Feb  3 04:10:50 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 03 Feb 2014 22:10:50 +1000
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52ED3F28.5070806@oracle.com>
References: <52ED3F28.5070806@oracle.com>
Message-ID: <52EF874A.8020603@oracle.com>

Hi Dan,

On 2/02/2014 4:38 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a fix ready for the following bug:
>
>      8028073 race condition in ObjectMonitor implementation causing
> deadlocks
>      https://bugs.openjdk.java.net/browse/JDK-8028073
>
> On the surface, this is a very simple fix that relocates a few lines of
> code, relocates and rewrites the comments associated with that code and
> adds several new comments.
>
> Of course, in reality, the issue is much more complicated, but I'm
> hoping to make it easy for anyone not acquainted with this issue to
> understand what's going on.

Let's spare them the trouble - Reviewed :)

I'm still troubled by the rationale for the original placement of this 
code. It makes sense to issue the unpark prior to the park() call but on 
the other hand there is no agent/event code in that path that would have 
consumed the original unpark. ??

Thanks, I know you've really enjoyed this one :)

David
-----


From staffan.larsen at oracle.com  Mon Feb  3 04:13:18 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 3 Feb 2014 13:13:18 +0100
Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not
	available in client VM
In-Reply-To: <52EF7675.3020707@oracle.com>
References: <52EBFC89.3030809@oracle.com>
	<A10C11D1-EABD-4D6A-B3BF-137E4F3EDB4E@oracle.com>
	<52EF7675.3020707@oracle.com>
Message-ID: <CFDDFEB8-4C53-4EEF-850E-D6C0E7BFE07C@oracle.com>


On 3 feb 2014, at 11:59, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:

> Staffan,
> 
> All items under the try-catch exist only in the server (C2 compiler) VM,
> so if one of them is missing in the target VM, the other items are
> missing as well.

Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?

/Staffan




From david.holmes at oracle.com  Mon Feb  3 04:16:39 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 03 Feb 2014 22:16:39 +1000
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EC62CD.8000300@oracle.com>
References: <52EC62CD.8000300@oracle.com>
Message-ID: <52EF88A7.3090506@oracle.com>

Hi Serguei,

On 1/02/2014 12:58 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>    https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/
>
>
> Summary:
>
>    There is a general issue in the suspend equivalent condition mechanism:
>    Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may
> return different results:
>      - 1-st: true
>      - 2-nd: false
>
>    This more generic suspend equivalent issue is covered by another bug:
>      https://bugs.openjdk.java.net/browse/JDK-6280037
>
>    The bug to fix in this review is a specific manifestation of the 6280037
>    in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
>    It is on the Test Stabilization radar (as well as the 6280037).
>    There are many tests intermittently failing because of this.
>
>    The webrev for review is a one-liner work around the 6280037 for the
> GetFrameCount().
>
>    The JVMTI GetFrameCount() spec tells:
>      "If this function is called for a thread actively executing
> bytecodes (for example,
>       not the current thread and not suspended), the information
> returned is transient."
>
>    So, it is Ok to call the GetFrameCount() for non-suspended target
> threads.
>    To achieve safety, the frame count for non-suspended threads is
> calculated at a safepoint.
>    It should be Ok and more safe to do the same for suspended threads as
> well.
>    There is no big performance impact because it is already on a slow path.
>    It is still important to avoid safepointing when the target thread is
> current.

This sounds completely reasonable - Reviewed.

I can't check at the moment whether VM_GetFrameCount properly checks for 
a live target thread?

David

>    The bug 6280037 should go out of the Test Stabilization radar (remove
> the svc-nightly label)
>    as the most of the impacted tests are covered by the 6471769.
>
>
> Testing:
>    In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>
>
> Thanks,
> Serguei
>

From daniel.daugherty at oracle.com  Mon Feb  3 06:56:28 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 03 Feb 2014 07:56:28 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52EF874A.8020603@oracle.com>
References: <52ED3F28.5070806@oracle.com> <52EF874A.8020603@oracle.com>
Message-ID: <52EFAE1C.3050202@oracle.com>

Adding Dave Dice to this thread...

On 2/3/14 5:10 AM, David Holmes wrote:
> Hi Dan,
>
> On 2/02/2014 4:38 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a fix ready for the following bug:
>>
>>      8028073 race condition in ObjectMonitor implementation causing
>> deadlocks
>>      https://bugs.openjdk.java.net/browse/JDK-8028073
>>
>> On the surface, this is a very simple fix that relocates a few lines of
>> code, relocates and rewrites the comments associated with that code and
>> adds several new comments.
>>
>> Of course, in reality, the issue is much more complicated, but I'm
>> hoping to make it easy for anyone not acquainted with this issue to
>> understand what's going on.
>
> Let's spare them the trouble - Reviewed :)

Thanks for the review!


> I'm still troubled by the rationale for the original placement of this 
> code. It makes sense to issue the unpark prior to the park() call but 
> on the other hand there is no agent/event code in that path that would 
> have consumed the original unpark. ??

I think I've acknowledged that concern in the "Risks" section below:

    - while the old code was not effective for the hang that is being
      fixed with this bug, it is possible that the old code prevented
      a different bug in the successor protocol from manifesting

Hopefully, Dave Dice will chime in and share the history...


> Thanks, I know you've really enjoyed this one :)

I actually did enjoy this one from a 'challenge to the mind' POV.
But I have to say that the intricacies of the successor protocol
gave me headaches... Hopefully, my successor.notes will help anyone
else that is crazy enough to wander down this path...

Dan


>
> David
> -----
>
>> Here are the JDK9 webrev URLs:
>>
>> OpenJDK:
>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>
>> Oracle internal:
>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>
>> The simple summary:
>>
>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>    it is possible for a JVM/TI monitor event handler to accidentally
>>    consume a ParkEvent.unpark() call meant for Java Monitor layer
>> - the original code fix was made on 2005.07.04 using this bug ID:
>>    https://bugs.openjdk.java.net/browse/JDK-5030359
>> - it's the right fix, but it's in the wrong place
>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>    event handler is called because it is that event handler
>>    that can cause the hang
>>
>>
>> Testing
>> -------
>>
>> - a new StressMonitorWait test has been created that reliably
>>    reproduces the hang in JDK[6789]; see the bug's gory details
>>    for the specific versions where the hang has been reproduced
>>    - the test reliably reproduces the hang in 5 seconds on my
>>      T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>      the hang reliably on other machines
>>    - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>      Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>      bits did not reproduce the hang
>> - JPRT test job
>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>    Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>    and Windows-X64:
>>    - vm.quick
>>    - Kitchensink (bigapps)
>>    - Weblogic+medrec (bigapps)
>>    - runThese (bigapps)
>>
>>
>> The Gory Details Start Here
>> ---------------------------
>>
>> This is the old location of the block of code that's being moved:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> <snip>
>> 1499    exit (true, Self) ;                    // exit the monitor
>> <snip>
>> 1513    if (node._notified != 0 && _succ == Self) {
>> 1514       node._event->unpark();
>> 1515    }
>>
>>
>> This is the new location of the block of code that's being moved:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> <snip>
>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>> <snip>
>> 1604        if (node._notified != 0 && _succ == Self) {
>> <snip>
>> 1620          node._event->unpark();
>> 1621        }
>>
>>
>> The Risks
>> ---------
>>
>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>    is enabled:
>>    - previously it was always executed
>>    - while the old code was not effective for the hang that is being
>>      fixed with this bug, it is possible that the old code prevented
>>      a different bug in the successor protocol from manifesting
>>    - thorough analysis of the successor protocol did not reveal a
>>      case where the old code was needed in the old location
>> - Thorough analysis indicates that the other JVM/TI monitor events
>>    do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>    - the successor protocol is complicated and the analysis could
>>      be wrong when certain options are used
>>    - comments were added to each location where a JVM/TI monitor
>>      event handler is called documenting why a fix like this one
>>      is not needed there
>>    - if the analysis is wrong, the new comments show where a new
>>      code change would be needed
>>
>>
>> The Scenario
>> ------------
>>
>> I've created a scenario that reproduces this hang:
>>
>> T1 - enters monitor and calls monitor.wait()
>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>> T3 - enters and exits the monitor
>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>
>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>
>> Here are the six events necessary to make this hang happen:
>>
>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>
>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>
>> // KEY-EVENT-3a: T3 made T1 the successor
>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>
>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>
>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>
>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>
>>
>> This bug is intertwined with:
>>
>> - The ObjectMonitor successor protocol
>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>
>> There is a very long successor.notes attachment to JDK-8028073 that
>> attempts to describe the ObjectMonitor successor protocol. It's good
>> for putting pretty much anyone to sleep.
>>
>> Since this hang reproduces back to JDK6, this bug is taking the easily
>> backported solution of moving the original fix to the right location.
>> The following new bug has been filed for possible future work in this
>> area by the Serviceability Team:
>>
>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>      https://bugs.openjdk.java.net/browse/JDK-8033399
>>
>>
>> The Symptoms
>> ------------
>>
>> With intermittent hangs like this, it is useful to know what to look
>> for in order to determine if you are running into this issue:
>>
>> - if you aren't using a debugger or a profiler or some other
>>    JVM/TI agent, then this hang is not the same as yours
>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>    event handler, then this hang is not the same as yours
>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>    JVM/TI RawMonitors, then this hang is not the same as yours
>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>    back into Java code, then you might just be insane and this
>>    hang might be similar to yours. However, using a Java callback
>>    in an event handler is an even bigger problem/risk so fix that
>>    first.
>> - if you have one or more threads blocked like this and making no
>>    progress, then this hang might be the same as yours:
>>
>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>     JavaThread state: _thread_blocked
>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>     JavaThread state: _thread_blocked
>>          at java.lang.Object.wait(Native Method)
>>          - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>          at java.lang.Object.wait(Object.java:502)
>>          at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>          - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>     JavaThread state: _thread_blocked
>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>     JavaThread state: _thread_blocked
>>          at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>     JavaThread state: _thread_blocked
>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>     JavaThread state: _thread_blocked
>>          at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> Key symptoms in thread T1:
>>
>> - had the object locked:
>>
>>    locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> - did an Object.wait():
>>
>>    waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> - is blocked on reentry:
>>
>>    waiting for monitor entry [0xfffffd7fc0231000]
>>
>> Key symptoms in thread T2:
>>
>> - is blocked waiting to lock the object:
>>
>>    waiting for monitor entry [0xfffffd7fc0130000]
>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> Key symptoms in thread T3:
>>
>> - is blocked waiting to lock the object:
>>
>>    waiting for monitor entry [0xfffffd7fc002f000]
>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
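
The shared-ParkEvent problem in the quoted summary can be modeled in a few lines of C++. This is only a sketch under stated assumptions: `PermitEvent`, `event_handler` and `wakeup_survives_handler` are hypothetical names, the event is reduced to a single binary permit, and the whole successor protocol is collapsed into one flag. It shows why an unpark() issued before the event handler runs can be lost, and why re-issuing it after the handler restores the wakeup.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical model of a park event: one binary permit.
class PermitEvent {
 public:
  // Make the permit available and wake a parked thread, if any.
  void unpark() {
    {
      std::lock_guard<std::mutex> guard(mu_);
      permit_ = true;
    }
    cv_.notify_one();
  }

  // Wait up to `timeout` for the permit. Consumes it and returns true if
  // it was available; returns false on timeout.
  bool park(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mu_);
    bool got = cv_.wait_for(lock, timeout, [this] { return permit_; });
    if (got) permit_ = false;
    return got;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool permit_ = false;
};

// Stand-in for an agent's MONITOR_WAITED handler: its 1 ms raw-monitor
// wait parks on the same event that the Java monitor layer uses.
void event_handler(PermitEvent& ev) {
  ev.park(std::chrono::milliseconds(1));
}

// Returns true if the monitor layer still finds its wakeup after the
// handler has run.
bool wakeup_survives_handler(bool reissue_after_handler) {
  PermitEvent ev;
  const bool notified_and_successor = true;  // models node._notified != 0 && _succ == Self
  ev.unpark();        // an exiting thread wakes the successor
  event_handler(ev);  // the handler's timed wait consumes that permit
  if (reissue_after_handler && notified_and_successor) {
    ev.unpark();      // the relocated fix: unpark again after the handler
  }
  return ev.park(std::chrono::milliseconds(10));
}
```

In this model `wakeup_survives_handler(false)` times out and `wakeup_survives_handler(true)` does not. A separate event per layer, as proposed in 8033399, would remove the sharing altogether.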


From daniel.daugherty at oracle.com  Mon Feb  3 07:51:18 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 03 Feb 2014 08:51:18 -0700
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EC62CD.8000300@oracle.com>
References: <52EC62CD.8000300@oracle.com>
Message-ID: <52EFBAF6.8000404@oracle.com>

On 1/31/14 7:58 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>   https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ 
>

src/share/vm/prims/jvmtiEnv.cpp
     Thumbs up on the code change.

     I've never been a fan of the:

         if (is_thread_fully_suspended(...))
             do-something-direct
         else
             do-something-via-a-safepoint

     model. We don't have any mechanism in the VM to keep the
     is_thread_fully_suspended() constraint true over the
     execution time of the "do-something-direct" operation so
     we've always been exposed to races. Also, I think this
     model is used for several other JVM/TI calls so you might
     want to visit those also with an eye on safety.

     For this specific change, I think the comment would be
     better as:

     // It is only safe to perform the direct operation on the current
     // thread. All other usage needs to use a vm-op for safety.

     In particular, I don't agree with this part of the comment:

     // Performance-wise the only important case is when current thread is
     // the target thread.

     I'm not sure how you can state that performance on the current thread
     is the only important case, but maybe I don't understand what you're
     really trying to say here.
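
The check-then-act concern with is_thread_fully_suspended() can be modeled as follows. This is a sketch with hypothetical names (`ModelThread`, `choose_old`, `choose_new`, `direct_walk_races`), not the jvmtiEnv.cpp code; the "other thread resumes the target" step is simulated inline so the example stays deterministic.

```cpp
#include <cassert>

// Hypothetical stand-in for a JavaThread; only the suspended flag matters.
struct ModelThread {
  bool suspended = false;
};

enum class Path { Direct, VmOp };

// Old shape: take the direct path whenever the suspension check passes.
Path choose_old(const ModelThread& current, const ModelThread& target) {
  if (&current == &target || target.suspended) return Path::Direct;
  return Path::VmOp;
}

// Shape suggested in the review comment: direct only for the current
// thread, a vm-op (safepoint) for every other target.
Path choose_new(const ModelThread& current, const ModelThread& target) {
  return (&current == &target) ? Path::Direct : Path::VmOp;
}

// Simulates the window between the check and the operation: the target is
// resumed right after the path has been chosen. Returns true if a direct
// operation would then run against a target that is no longer suspended.
bool direct_walk_races(Path (*choose)(const ModelThread&, const ModelThread&),
                       ModelThread& current, ModelThread& target) {
  Path path = choose(current, target);
  target.suspended = false;  // another thread resumes the target here
  return path == Path::Direct && &current != &target && !target.suspended;
}
```

With a suspended target, `choose_old` exposes the race and `choose_new` does not, because nothing in the model (or, per the comment above, in the VM) keeps the suspension check true while the direct operation runs.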

Now for the more general question with respect to this bug: 6471769
I'm having trouble connecting this change (which I like) to this
particular bug report.

The crash in the bug's description looks like this:

# Internal Error (/net/prt-solamd64-q1-2/PrtBuildDir/workspace/src/share/vm/prims/jvmtiThreadState.cpp, 273), pid=9561, tid=2
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20060914135846.dcubed.service_hs_b02_merge.2-debug compiled mode)
#
# Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth out of sync")

V [libjvm.so+0x1217206];; void VMError::report_and_die()+0x606
V [libjvm.so+0x667841];; void report_assertion_failure(const char*,int,const char*)+0x61
V [libjvm.so+0xca9557];; int JvmtiThreadState::cur_stack_depth()+0x6e7
V [libjvm.so+0xc1caaa];; void JvmtiExport::post_method_exit(JavaThread*,methodOop,frame)+0x21ba
V [libjvm.so+0x8385de];; void InterpreterRuntime::post_method_exit(JavaThread*)+0x21e
j java.io.ObjectInputStream$BlockDataInputStream.getBlockDataMode()Z+4

In the above crash, the current thread is trying to post a
JVMTI_EVENT_METHOD_EXIT and during the event posting code, it
calls cur_stack_depth() which fails an assert because the
cached stack depth doesn't match the current queried value:

    int JvmtiThreadState::cur_stack_depth() {
      uint32_t debug_bits = 0;
      guarantee(JavaThread::current() == get_thread() ||
        JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits),
        "must be current thread or suspended");

      if (!is_interp_only_mode() || _cur_stack_depth == UNKNOWN_STACK_DEPTH) {
        _cur_stack_depth = count_frames();
      } else {
        // heavy weight assert
        assert(_cur_stack_depth == count_frames(),
               "cur_stack_depth out of sync");
      }
      return _cur_stack_depth;
    }


This is not an unsafe query from one thread to another target thread
so I don't see how the proposed change will address this failure mode.
In particular, the assertion is troubling because it tells me that
the cached stack depth for the current thread is either wrong or has
been corrupted.
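
To make the "cached depth versus counted depth" point concrete, here is a small model. It is a sketch with hypothetical names (`ModelThreadState`, `kUnknownDepth`, `pop_frame_unrecorded`); how the real cache gets out of sync is exactly what is not known here, so the unrecorded pop is only an assumed example of missed bookkeeping.

```cpp
#include <cassert>
#include <vector>

const int kUnknownDepth = -1;  // assumed sentinel, stands in for UNKNOWN_STACK_DEPTH

// Hypothetical model of a per-thread state that caches its stack depth.
class ModelThreadState {
 public:
  // Frame push/pop that keeps the cache up to date once it is known.
  void push_frame() {
    frames_.push_back(0);
    if (cached_depth_ != kUnknownDepth) ++cached_depth_;
  }
  void pop_frame() {
    frames_.pop_back();
    if (cached_depth_ != kUnknownDepth) --cached_depth_;
  }

  // Models a code path that removes a frame without the bookkeeping.
  void pop_frame_unrecorded() { frames_.pop_back(); }

  // The expensive ground truth: walk the stack and count.
  int count_frames() const { return static_cast<int>(frames_.size()); }

  // The condition the heavy-weight assert checks.
  bool cache_in_sync() const {
    return cached_depth_ == kUnknownDepth || cached_depth_ == count_frames();
  }

  // Fill the cache on first use, then return the cached value.
  int cur_stack_depth() {
    if (cached_depth_ == kUnknownDepth) cached_depth_ = count_frames();
    return cached_depth_;
  }

 private:
  std::vector<int> frames_;
  int cached_depth_ = kUnknownDepth;
};
```

In this model the check only fails when some path changes the stack without updating the cache, and it fails on the thread that owns the state. That is why a fix aimed at cross-thread queries does not obviously explain the assertion.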

To reiterate:

- I like the code change!
- I would like to see the comment tweaked a bit

You should consider going forward with this change using a different
bug ID, perhaps something like:

    JDK-NNNNNNN improve safety of JVM/TI GetFrameCount()

Dan


>
> Summary:
>
>   There is a general issue in the suspend equivalent condition mechanism:
>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
> may return different results:
>     - 1-st: true
>     - 2-nd: false
>
>   This more generic suspend equivalent issue is covered by another bug:
>     https://bugs.openjdk.java.net/browse/JDK-6280037
>
>   The bug to fix in this review is a specific manifestation of the 
> 6280037
>   in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
>   It is on the Test Stabilization radar (as well as the 6280037).
>   There are many tests intermittently failing because of this.
>
>   The webrev for review is a one-liner work around the 6280037 for the 
> GetFrameCount().
>
>   The JVMTI GetFrameCount() spec tells:
>     "If this function is called for a thread actively executing 
> bytecodes (for example,
>      not the current thread and not suspended), the information 
> returned is transient."
>
>   So, it is Ok to call the GetFrameCount() for non-suspended target 
> threads.
>   To achieve safety, the frame count for non-suspended threads is 
> calculated at a safepoint.
>   It should be Ok and more safe to do the same for suspended threads 
> as well.
>   There is no big performance impact because it is already on a slow 
> path.
>   It is still important to avoid safepointing when the target thread 
> is current.
>
>   The bug 6280037 should go out of the Test Stabilization radar 
> (remove the svc-nightly label)
>   as the most of the impacted tests are covered by the 6471769.
>
>
> Testing:
>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>
>
> Thanks,
> Serguei
>


From serguei.spitsyn at oracle.com  Mon Feb  3 10:30:51 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 10:30:51 -0800
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <3E8D21DD-BD7A-4AEF-9C02-C175AEB40738@oracle.com>
References: <52EC62CD.8000300@oracle.com>
	<3E8D21DD-BD7A-4AEF-9C02-C175AEB40738@oracle.com>
Message-ID: <52EFE05B.1070504@oracle.com>

Thanks, Staffan!
Serguei

On 2/3/14 3:57 AM, Staffan Larsen wrote:
> Looks good!
>
> Thanks,
> /Staffan
>
> On 1 feb 2014, at 03:58, serguei.spitsyn at oracle.com wrote:
>
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/
>>
>> Summary:
>>
>>   There is a general issue in the suspend equivalent condition mechanism:
>>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may return different results:
>>     - 1-st: true
>>     - 2-nd: false
>>
>>   This more generic suspend equivalent issue is covered by another bug:
>>     https://bugs.openjdk.java.net/browse/JDK-6280037
>>
>>   The bug to fix in this review is a specific manifestation of the 6280037
>>   in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
>>   It is on the Test Stabilization radar (as well as the 6280037).
>>   There are many tests intermittently failing because of this.
>>
>>   The webrev for review is a one-liner work around the 6280037 for the GetFrameCount().
>>
>>   The JVMTI GetFrameCount() spec tells:
>>     "If this function is called for a thread actively executing bytecodes (for example,
>>      not the current thread and not suspended), the information returned is transient."
>>
>>   So, it is Ok to call the GetFrameCount() for non-suspended target threads.
>>   To achieve safety, the frame count for non-suspended threads is calculated at a safepoint.
>>   It should be Ok and more safe to do the same for suspended threads as well.
>>   There is no big performance impact because it is already on a slow path.
>>   It is still important to avoid safepointing when the target thread is current.
>>
>>   The bug 6280037 should go out of the Test Stabilization radar (remove the svc-nightly label)
>>   as the most of the impacted tests are covered by the 6471769.
>>
>>
>> Testing:
>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>
>>
>> Thanks,
>> Serguei
>>


From serguei.spitsyn at oracle.com  Mon Feb  3 10:36:50 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 10:36:50 -0800
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52ED3F28.5070806@oracle.com>
References: <52ED3F28.5070806@oracle.com>
Message-ID: <52EFE1C2.2070209@oracle.com>

Hi Dan,

It looks good to me.
Your work on this issue is outstanding.
I know what it took to resolve this one.
Great job!

Thanks,
Serguei

On 2/1/14 10:38 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a fix ready for the following bug:
>
>     8028073 race condition in ObjectMonitor implementation causing 
> deadlocks
>     https://bugs.openjdk.java.net/browse/JDK-8028073
>
> On the surface, this is a very simple fix that relocates a few lines of
> code, relocates and rewrites the comments associated with that code and
> adds several new comments.
>
> Of course, in reality, the issue is much more complicated, but I'm
> hoping to make it easy for anyone not acquainted with this issue to
> understand what's going on.
>
> Here are the JDK9 webrev URLs:
>
> OpenJDK:
> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>
> Oracle internal:
> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>
> The simple summary:
>
> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>   it is possible for a JVM/TI monitor event handler to accidentally
>   consume a ParkEvent.unpark() call meant for Java Monitor layer
> - the original code fix was made on 2005.07.04 using this bug ID:
>   https://bugs.openjdk.java.net/browse/JDK-5030359
> - it's the right fix, but it's in the wrong place
> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>   event handler is called because it is that event handler
>   that can cause the hang
>
>
> Testing
> -------
>
> - a new StressMonitorWait test has been created that reliably
>   reproduces the hang in JDK[6789]; see the bug's gory details
>   for the specific versions where the hang has been reproduced
>   - the test reliably reproduces the hang in 5 seconds on my
>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>     the hang reliably on other machines
>   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>     bits did not reproduce the hang
> - JPRT test job
> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>   and Windows-X64:
>   - vm.quick
>   - Kitchensink (bigapps)
>   - Weblogic+medrec (bigapps)
>   - runThese (bigapps)
>
>
> The Gory Details Start Here
> ---------------------------
>
> This is the old location of the block of code that's being moved:
>
> src/share/vm/runtime/objectMonitor.cpp:
>
> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> <snip>
> 1499    exit (true, Self) ;                    // exit the monitor
> <snip>
> 1513    if (node._notified != 0 && _succ == Self) {
> 1514       node._event->unpark();
> 1515    }
>
>
> This is the new location of the block of code that's being moved:
>
> src/share/vm/runtime/objectMonitor.cpp:
>
> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> <snip>
> 1601      if (JvmtiExport::should_post_monitor_waited()) {
> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
> <snip>
> 1604        if (node._notified != 0 && _succ == Self) {
> <snip>
> 1620          node._event->unpark();
> 1621        }
>
>
> The Risks
> ---------
>
> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>   is enabled:
>   - previously it was always executed
>   - while the old code was not effective for the hang that is being
>     fixed with this bug, it is possible that the old code prevented
>     a different bug in the successor protocol from manifesting
>   - thorough analysis of the successor protocol did not reveal a
>     case where the old code was needed in the old location
> - Thorough analysis indicates that the other JVM/TI monitor events
>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>   - the successor protocol is complicated and the analysis could
>     be wrong when certain options are used
>   - comments were added to each location where a JVM/TI monitor
>     event handler is called documenting why a fix like this one
>     is not needed there
>   - if the analysis is wrong, the new comments show where a new
>     code change would be needed
>
>
> The Scenario
> ------------
>
> I've created a scenario that reproduces this hang:
>
> T1 - enters monitor and calls monitor.wait()
> T2 - enters the monitor, calls monitor.notify() and exits the monitor
> T3 - enters and exits the monitor
> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>
> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>
> Here are the six events necessary to make this hang happen:
>
> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>
> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>
> // KEY-EVENT-3a: T3 made T1 the successor
> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>
> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>
> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>
> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>
>
> This bug is intertwined with:
>
> - The ObjectMonitor successor protocol
> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>
> There is a very long successor.notes attachment to JDK-8028073 that
> attempts to describe the ObjectMonitor successor protocol. It's good
> for putting pretty much anyone to sleep.
>
> Since this hang reproduces back to JDK6, this bug is taking the easily
> backported solution of moving the original fix to the right location.
> The following new bug has been filed for possible future work in this
> area by the Serviceability Team:
>
>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>     https://bugs.openjdk.java.net/browse/JDK-8033399
>
>
> The Symptoms
> ------------
>
> With intermittent hangs like this, it is useful to know what to look
> for in order to determine if you are running into this issue:
>
> - if you aren't using a debugger or a profiler or some other
>   JVM/TI agent, then this hang is not the same as yours
> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>   event handler, then this hang is not the same as yours
> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>   JVM/TI RawMonitors, then this hang is not the same as yours
> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>   back into Java code, then you might just be insane and this
>   hang might be similar to yours. However, using a Java callback
>   in an event handler is an even bigger problem/risk so fix that
>   first.
> - if you have one or more threads blocked like this and making no
>   progress, then this hang might be the same as yours:
>
> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>    JavaThread state: _thread_blocked
> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>    JavaThread state: _thread_blocked
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>         at java.lang.Object.wait(Object.java:502)
>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>    JavaThread state: _thread_blocked
> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>    JavaThread state: _thread_blocked
>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>    JavaThread state: _thread_blocked
> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>    JavaThread state: _thread_blocked
>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> Key symptoms in thread T1:
>
> - had the object locked:
>
>   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> - did an Object.wait():
>
>   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> - is blocked on reentry:
>
>   waiting for monitor entry [0xfffffd7fc0231000]
>
> Key symptoms in thread T2:
>
> - is blocked waiting to lock the object:
>
>   waiting for monitor entry [0xfffffd7fc0130000]
>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
> Key symptoms in thread T3:
>
> - is blocked waiting to lock the object:
>
>   waiting for monitor entry [0xfffffd7fc002f000]
>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>


From daniel.daugherty at oracle.com  Mon Feb  3 10:50:48 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 03 Feb 2014 11:50:48 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52EFE1C2.2070209@oracle.com>
References: <52ED3F28.5070806@oracle.com> <52EFE1C2.2070209@oracle.com>
Message-ID: <52EFE508.2010808@oracle.com>

Thanks for the review Serguei!

Dan


On 2/3/14 11:36 AM, serguei.spitsyn at oracle.com wrote:
> Hi Dan,
>
> It looks good to me.
> Your work on this issue is outstanding.
> I know what it took to resolve this one.
> Great job!
>
> Thanks,
> Serguei
>
> On 2/1/14 10:38 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a fix ready for the following bug:
>>
>>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>
>> On the surface, this is a very simple fix that relocates a few lines of
>> code, relocates and rewrites the comments associated with that code and
>> adds several new comments.
>>
>> Of course, in reality, the issue is much more complicated, but I'm
>> hoping to make it easy for anyone not acquainted with this issue to
>> understand what's going on.
>>
>> Here are the JDK9 webrev URLs:
>>
>> OpenJDK:
>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>
>> Oracle internal:
>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>
>> The simple summary:
>>
>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>   it is possible for a JVM/TI monitor event handler to accidentally
>>   consume a ParkEvent.unpark() call meant for Java Monitor layer
>> - the original code fix was made on 2005.07.04 using this bug ID:
>>   https://bugs.openjdk.java.net/browse/JDK-5030359
>> - it's the right fix, but it's in the wrong place
>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>   event handler is called because it is that event handler
>>   that can cause the hang
>>
>>
>> Testing
>> -------
>>
>> - a new StressMonitorWait test has been created that reliably
>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>   for the specific versions where the hang has been reproduced
>>   - the test reliably reproduces the hang in 5 seconds on my
>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>     the hang reliably on other machines
>>   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>     bits did not reproduce the hang
>> - JPRT test job
>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>   and Windows-X64:
>>   - vm.quick
>>   - Kitchensink (bigapps)
>>   - Weblogic+medrec (bigapps)
>>   - runThese (bigapps)
>>
>>
>> The Gory Details Start Here
>> ---------------------------
>>
>> This is the old location of the block of code that's being moved:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> <snip>
>> 1499    exit (true, Self) ;                    // exit the monitor
>> <snip>
>> 1513    if (node._notified != 0 && _succ == Self) {
>> 1514       node._event->unpark();
>> 1515    }
>>
>>
>> This is the new location of the block of code that's being moved:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> <snip>
>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>> <snip>
>> 1604        if (node._notified != 0 && _succ == Self) {
>> <snip>
>> 1620          node._event->unpark();
>> 1621        }
>>
>>
>> The Risks
>> ---------
>>
>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>   is enabled:
>>   - previously it was always executed
>>   - while the old code was not effective for the hang that is being
>>     fixed with this bug, it is possible that the old code prevented
>>     a different bug in the successor protocol from manifesting
>>   - thorough analysis of the successor protocol did not reveal a
>>     case where the old code was needed in the old location
>> - Thorough analysis indicates that the other JVM/TI monitor events
>>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>   - the successor protocol is complicated and the analysis could
>>     be wrong when certain options are used
>>   - comments were added to each location where a JVM/TI monitor
>>     event handler is called documenting why a fix like this one
>>     is not needed there
>>   - if the analysis is wrong, the new comments show where a new
>>     code change would be needed
>>
>>
>> The Scenario
>> ------------
>>
>> I've created a scenario that reproduces this hang:
>>
>> T1 - enters monitor and calls monitor.wait()
>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>> T3 - enters and exits the monitor
>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>
>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>
>> Here are the six events necessary to make this hang happen:
>>
>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>
>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>
>> // KEY-EVENT-3a: T3 made T1 the successor
>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>
>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>
>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>
>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>
>>
>> This bug is intertwined with:
>>
>> - The ObjectMonitor successor protocol
>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>
>> There is a very long successor.notes attachment to JDK-8028073 that
>> attempts to describe the ObjectMonitor successor protocol. It's good
>> for putting pretty much anyone to sleep.
>>
>> Since this hang reproduces back to JDK6, this bug is taking the easily
>> backported solution of moving the original fix to the right location.
>> The following new bug has been filed for possible future work in this
>> area by the Serviceability Team:
>>
>>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>     https://bugs.openjdk.java.net/browse/JDK-8033399
>>
>>
>> The Symptoms
>> ------------
>>
>> With intermittent hangs like this, it is useful to know what to look
>> for in order to determine if you are running into this issue:
>>
>> - if you aren't using a debugger or a profiler or some other
>>   JVM/TI agent, then this hang is not the same as yours
>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>   event handler, then this hang is not the same as yours
>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>   JVM/TI RawMonitors, then this hang is not the same as yours
>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>   back into Java code, then you might just be insane and this
>>   hang might be similar to yours. However, using a Java callback
>>   in an event handler is an even bigger problem/risk so fix that
>>   first.
>> - if you have one or more threads blocked like this and making no
>>   progress, then this hang might be the same as yours:
>>
>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at java.lang.Object.wait(Native Method)
>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>         at java.lang.Object.wait(Object.java:502)
>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> Key symptoms in thread T1:
>>
>> - had the object locked:
>>
>>   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> - did an Object.wait():
>>
>>   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> - is blocked on reentry:
>>
>>   waiting for monitor entry [0xfffffd7fc0231000]
>>
>> Key symptoms in thread T2:
>>
>> - is blocked waiting to lock the object:
>>
>>   waiting for monitor entry [0xfffffd7fc0130000]
>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> Key symptoms in thread T3:
>>
>> - is blocked waiting to lock the object:
>>
>>   waiting for monitor entry [0xfffffd7fc002f000]
>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>
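
The lost wakeup described in the KEY-EVENT sequence quoted above can be
sketched in plain Java. This is only a toy model, not HotSpot code: it
assumes java.util.concurrent.locks.LockSupport as a stand-in for the shared
ParkEvent, and the names SharedPermitSketch, eventHandler and
reenterDelayMillis are hypothetical. The final park is timed so the sketch
terminates; in the real hang the park is untimed. A spurious return from
park is allowed by the LockSupport spec, so the timings are indicative only.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

/** Toy model of one park permit shared by two layers; not HotSpot code. */
public class SharedPermitSketch {

    /** Stands in for the agent handler: a 1ms raw monitor wait on the same permit. */
    public static void eventHandler() {
        LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(1));
    }

    /**
     * Returns how long (in ms) the "monitor layer" park blocked.
     * reissueAfterHandler models re-issuing the unpark after the event handler.
     */
    public static long reenterDelayMillis(boolean reissueAfterHandler, long timeoutMillis) {
        Thread self = Thread.currentThread();
        // Stands in for: node._notified != 0 && _succ == Self
        boolean notifiedAndSuccessor = true;

        LockSupport.unpark(self);   // wakeup meant for the monitor layer
        eventHandler();             // the handler's timed wait consumes the permit

        if (reissueAfterHandler && notifiedAndSuccessor) {
            LockSupport.unpark(self);   // restore the permit the handler may have eaten
        }

        long start = System.nanoTime();
        LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(timeoutMillis));
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("unpark only before handler, blocked ms: "
                + reenterDelayMillis(false, 300));
        System.out.println("unpark re-issued after handler, blocked ms: "
                + reenterDelayMillis(true, 300));
    }
}
```

With the unpark issued only before the handler, the second park has nothing
left to consume and waits for the full timeout; re-issuing it after the
handler lets the park return at once.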


From serguei.spitsyn at oracle.com  Mon Feb  3 11:12:37 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 11:12:37 -0800
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EF88A7.3090506@oracle.com>
References: <52EC62CD.8000300@oracle.com> <52EF88A7.3090506@oracle.com>
Message-ID: <52EFEA25.8080905@oracle.com>

On 2/3/14 4:16 AM, David Holmes wrote:
> Hi Serguei,
>
> On 1/02/2014 12:58 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>    https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ 
>>
>>
>>
>> Summary:
>>
>>    There is a general issue in the suspend equivalent condition 
>> mechanism:
>>    Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may
>> return different results:
>>      - 1-st: true
>>      - 2-nd: false
>>
>>    This more generic suspend equivalent issue is covered by another bug:
>>      https://bugs.openjdk.java.net/browse/JDK-6280037
>>
>>    The bug to fix in this review is a specific manifestation of the 
>> 6280037
>>    in the JVMTI GetFrameCount() that has a big impact on the SQE 
>> nightly.
>>    It is on the Test Stabilization radar (as well as the 6280037).
>>    There are many tests intermittently failing because of this.
>>
>>    The webrev for review is a one-liner work around the 6280037 for the
>> GetFrameCount().
>>
>>    The JVMTI GetFrameCount() spec tells:
>>      "If this function is called for a thread actively executing
>> bytecodes (for example,
>>       not the current thread and not suspended), the information
>> returned is transient."
>>
>>    So, it is Ok to call the GetFrameCount() for non-suspended target
>> threads.
>>    To achieve safety, the frame count for non-suspended threads is
>> calculated at a safepoint.
>>    It should be Ok and more safe to do the same for suspended threads as
>> well.
>>    There is no big performance impact because it is already on a slow 
>> path.
>>    It is still important to avoid safepointing when the target thread is
>> current.
>
> This sounds completely reasonable - Reviewed.
>
> I can't check at the moment whether VM_GetFrameCount properly checks 
> for a live target thread?

David, thank you for the review!
It looks like the check is missing.
Please, give me some time to make sure what is needed here.

Thanks!
Serguei

>
> David
>
>>    The bug 6280037 should go out of the Test Stabilization radar (remove
>> the svc-nightly label)
>>    as the most of the impacted tests are covered by the 6471769.
>>
>>
>> Testing:
>>    In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>
>>
>> Thanks,
>> Serguei
>>


From staffan.larsen at oracle.com  Mon Feb  3 11:18:46 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 3 Feb 2014 20:18:46 +0100
Subject: RFR(S): JDK-8029808 com/sun/jdi/ProcessAttachTest.sh times out 
In-Reply-To: <52EA872E.7070209@oracle.com>
References: <79C6D4D7-F8A0-4A33-AF7A-D1007F227A86@oracle.com>
	<52EA7F46.5070405@oracle.com>
	<16D6BA50-82FE-41F6-8423-FE218D397C58@oracle.com>
	<52EA84DC.5090008@oracle.com> <52EA872E.7070209@oracle.com>
Message-ID: <D1A333D9-1B29-443E-BFD7-E140F49B53F6@oracle.com>

OK, my last try at this was very broken - I'm glad it didn't pass the review.

Here is a new try: http://cr.openjdk.java.net/~sla/8029808/webrev.01/

As Alan says, we can't use jps on the (second) debuggee because it is started with suspend=y. In that mode it is waiting for a debugger to attach before it has started executing any Java code and before jps is able to identify it.

I did find a race in the test, though. When the first debuggee is started (suspend=n) there is a race where the debugger process can attach and finish before the debuggee has written out the portfile. In that case ShutdownDebuggee will fail. I have seen this failure mode happen and I can force it by inserting a Thread.sleep into the debuggee. I solved this by waiting for the port file to appear in the suspend=n case as well.
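
A minimal sketch of the "wait for the port file to appear" idea, written in Java for illustration only (the test itself is a shell script, so this is not the webrev code). The class and method names are hypothetical, and treating "exists and is non-empty" as "ready" is an assumption; a writer that flushes the file in pieces would need a stronger check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PortFileWaitSketch {

    /**
     * Polls until the file exists and is non-empty, or the timeout expires.
     * Returns true if the file became ready in time, false otherwise.
     */
    public static boolean waitForPortFile(Path portFile, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (Files.exists(portFile) && Files.size(portFile) > 0) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;   // give up instead of sleeping unconditionally
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        Path portFile = Files.createTempFile("debuggee", ".port");
        Files.write(portFile, "12345".getBytes());
        System.out.println("port file ready: " + waitForPortFile(portFile, 1000, 50));
        Files.delete(portFile);
    }
}
```

Bounding the wait with a deadline keeps a missing port file from turning into a test timeout, which is the failure mode this thread started with.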

There is also a potential problem where both debuggees use the same output file and we may think the second debuggee has started before it actually has (because the output file isn't empty). I solved that by creating different output files for the two test cases.

I've also redirected all error streams so any errors aren't lost.

Thanks,
/Staffan

On 30 jan 2014, at 18:09, Alan Bateman <Alan.Bateman at oracle.com> wrote:

> On 30/01/2014 16:59, Dmitry Samersoff wrote:
>> Staffan,
>> 
>> No, it's not about your code. Sorry for not being clear enough.
>> 
>> You cleaned up unix code, but windows code remains bad. Particularly, it
>> has bad unconditional sleep 2
>> 
>> I would propose replacing the *windows manipulation with CYGWIN/MKS pids* with
>> a call to JPS that returns the windows pid
>> 
> I have a vague memory that we had to use ps to get the pid because of the suspend=y test which causes the debuggee to suspend during startup. I might be wrong on this of course, it was a long time ago.
> 
> -Alan


From Alan.Bateman at oracle.com  Mon Feb  3 12:07:28 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 03 Feb 2014 20:07:28 +0000
Subject: RFR(S): JDK-8029808 com/sun/jdi/ProcessAttachTest.sh times out
In-Reply-To: <D1A333D9-1B29-443E-BFD7-E140F49B53F6@oracle.com>
References: <79C6D4D7-F8A0-4A33-AF7A-D1007F227A86@oracle.com>
	<52EA7F46.5070405@oracle.com>
	<16D6BA50-82FE-41F6-8423-FE218D397C58@oracle.com>
	<52EA84DC.5090008@oracle.com> <52EA872E.7070209@oracle.com>
	<D1A333D9-1B29-443E-BFD7-E140F49B53F6@oracle.com>
Message-ID: <52EFF700.2070803@oracle.com>

On 03/02/2014 19:18, Staffan Larsen wrote:
> OK, my last try at this was very broken - I'm glad it didn't pass the review.
>
> Here is a new try: http://cr.openjdk.java.net/~sla/8029808/webrev.01/
>
> As Alan says, we can't use jps on the (second) debuggee because it is started with suspend=y. In that mode it is waiting for a debugger to attach before it has started executing any Java code and before jps is able to identify it.
>
> I did find a race in the test, though. When the first debuggee is started (suspend=n) there is a race where the debugger process can attach and finish before the debuggee has written out the portfile. In that case ShutdownDebuggee will fail. I have seen this failure mode happen and I can force it by inserting a Thread.sleep into the debuggee. I solved this by waiting for the port file to appear in the suspend=n case as well.
>
> There is also a potential problem where both debuggees use the same output file and we may think the second debuggee has started before it actually has (because the output file isn't empty). I solved that by creating different output files for the two test cases.
>
> I've also redirected all error streams so any errors aren't lost.
>
This update looks okay to me (and good to have found the race).

-Alan.

From serguei.spitsyn at oracle.com  Mon Feb  3 12:13:02 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 12:13:02 -0800
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EFBAF6.8000404@oracle.com>
References: <52EC62CD.8000300@oracle.com> <52EFBAF6.8000404@oracle.com>
Message-ID: <52EFF84E.1050800@oracle.com>

Dan,

Thank you for reviewing!
Some comments are below.

On 2/3/14 7:51 AM, Daniel D. Daugherty wrote:
> On 1/31/14 7:58 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ 
>>
>
> src/share/vm/prims/jvmtiEnv.cpp
>     Thumbs up on the code change.
>
>     I've never been a fan of the:
>
>         if (is_thread_fully_suspended(...))
>             do-something-direct
>         else
>             do-something-via-a-safepoint
>
>     model. We don't have any mechanism in the VM to keep the
>     is_thread_fully_suspended() constraint true over the
>     execution time of the "do-something-direct" operation so
>     we've always been exposed to races.

Thank you for this confirmation.
So we are in sync here.


> Also, I think this
>     model is used for several other JVM/TI calls so you might
>     want to visit those also with an eye on safety.

Yes, I'm looking to isolate and double check other places like this in 
the code.
Another piece of work is to isolate other bugs on this topic.


>
>     For this specific change, I think the comment would be
>     better as:
>
>     // It is only safe to perform the direct operation on the current
>     // thread. All other usage needs to use a vm-op for safety.

It is taken, thanks!
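
The rule in that comment can be sketched as a toy model in Java. This is
not the jvmtiEnv.cpp change: FrameCountSketch is a hypothetical class, and
a ReentrantLock stands in for the safepoint / vm-op mechanism. The point is
that there is no "is the target suspended?" check at all: the caller is
either the owner of the stack, or it goes through the synchronized path.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model: a frame stack that only its owner thread mutates. */
public class FrameCountSketch {
    private final Thread owner;
    private final Deque<String> frames = new ArrayDeque<>();
    // Stands in for the safepoint: the owner holds it while changing frames.
    private final ReentrantLock safepoint = new ReentrantLock();

    public FrameCountSketch(Thread owner) {
        this.owner = owner;
    }

    /** Called by the owner thread only. */
    public void push(String frame) {
        safepoint.lock();
        try { frames.push(frame); } finally { safepoint.unlock(); }
    }

    /** Called by the owner thread only. */
    public void pop() {
        safepoint.lock();
        try { frames.pop(); } finally { safepoint.unlock(); }
    }

    public int getFrameCount() {
        if (Thread.currentThread() == owner) {
            // Direct path: safe because nobody else mutates this stack.
            return frames.size();
        }
        // Every other caller takes the "vm-op" path, suspended target or not.
        safepoint.lock();
        try { return frames.size(); } finally { safepoint.unlock(); }
    }

    public static void main(String[] args) throws InterruptedException {
        FrameCountSketch stack = new FrameCountSketch(Thread.currentThread());
        stack.push("main");
        stack.push("run");
        System.out.println("owner sees " + stack.getFrameCount());
        Thread other = new Thread(
                () -> System.out.println("other sees " + stack.getFrameCount()));
        other.start();
        other.join();
    }
}
```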

> In particular, I don't agree with this part of the comment:
>
>     // Performance-wise the only important case is when current thread is
>     // the target thread.
>
>     I'm not sure how you can state that performance on the current thread
>     is the only important case, but maybe I don't understand what you're
>     really trying to say here.

I agree, the comment is not clear.
Probably, the comment should have said: "the only case where we can keep
performance without losing safety is when the target thread is current ...".
In fact, collecting data at a safepoint is the only way to make it safe.
If the target thread is not current then:
  - a) thread has been suspended: already on the slow path, so performance
       is not that critical
  - b) thread has not been suspended: the information we provide is transient,
       a safepoint is the only way to get it safely (is it correct?)

>
> Now for the more general question with respect to this bug: 6471769
> I'm having trouble connecting this change (which I like) to this
> particular bug report.
>
> The crash in the bug's description looks like this:
>
> # Internal Error (/net/prt-solamd64-q1-2/PrtBuildDir/workspace/src/share/vm/prims/jvmtiThreadState.cpp, 273), pid=9561, tid=2
> #
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20060914135846.dcubed.service_hs_b02_merge.2-debug compiled mode)
> #
> # Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth out of sync")
>
> V [libjvm.so+0x1217206];; void VMError::report_and_die()+0x606
> V [libjvm.so+0x667841];; void report_assertion_failure(const char*,int,const char*)+0x61
> V [libjvm.so+0xca9557];; int JvmtiThreadState::cur_stack_depth()+0x6e7
> V [libjvm.so+0xc1caaa];; void JvmtiExport::post_method_exit(JavaThread*,methodOop,frame)+0x21ba
> V [libjvm.so+0x8385de];; void InterpreterRuntime::post_method_exit(JavaThread*)+0x21e
> j java.io.ObjectInputStream$BlockDataInputStream.getBlockDataMode()Z+4
>
> In the above crash, the current thread is trying to post a
> JVMTI_EVENT_METHOD_EXIT and during the event posting code, it
> calls cur_stack_depth() which fails an assert because the
> cached stack depth doesn't match the current queried value:
>
>    282  int JvmtiThreadState::cur_stack_depth() {
>    283    uint32_t debug_bits = 0;
>    284    guarantee(JavaThread::current() == get_thread() ||
>    285      JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits),
>    286      "must be current thread or suspended");
>    287
>    288    if (!is_interp_only_mode() || _cur_stack_depth == UNKNOWN_STACK_DEPTH) {
>    289      _cur_stack_depth = count_frames();
>    290    } else {
>    291      // heavy weight assert
>    292      assert(_cur_stack_depth == count_frames(),
>    293             "cur_stack_depth out of sync");
>    294    }
>    295    return _cur_stack_depth;
>    296  }
>
>
> This is not an unsafe query from one thread to another target thread
> so I don't see how the proposed change will address this failure mode.
> In particular, the assertion is troubling because it tells me that
> the cached stack depth for the current thread is either wrong or has
> been corrupted.

The above are consequences of unsafe operations we are doing.
The current thread should not calculate the frame count of the target thread
(if different) other than at a safepoint.
We assumed the thread was suspended but, in fact, it was not.
So we lost safety here, and all the manifestations we observe are the
consequences.

This bug covers the frames miscount issue.
In the fastdebug mode we may get the assert as in this bug report:
   https://bugs.openjdk.java.net/browse/JDK-8032223

There is another potential race that I forgot to check and fix but David 
pointed out.
I will try to cover it in the next version of webrev.

It is still possible there can be other issues here (as you say).
But it is hard to make sure it is the case because it is extremely hard 
to reproduce (you know it!).
I'd suggest to resolve it step-by-step:
   - fix the known safety issues
   - watch the nightly if anything else is left, file and fix new bugs

>
> To reiterate:
>
> - I like the code change!
> - I would like to see the comment tweaked a bit
>
> You should consider going forward with this change using a different
> bug ID, perhaps something like:
>
>    JDK-NNNNNNN improve safety of JVM/TI GetFrameCount()

There is already a bug on this topic (and there are even more dups):
   https://bugs.openjdk.java.net/browse/JDK-8032223

But I still believe, the issue is the same ...


Thanks!
Serguei

>
> Dan
>
>
>
>>
>> Summary:
>>
>>   There is a general issue in the suspend equivalent condition 
>> mechanism:
>>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
>> may return different results:
>>     - 1-st: true
>>     - 2-nd: false
>>
>>   This more generic suspend equivalent issue is covered by another bug:
>>     https://bugs.openjdk.java.net/browse/JDK-6280037
>>
>>   The bug to fix in this review is a specific manifestation of the 
>> 6280037
>>   in the JVMTI GetFrameCount() that has a big impact on the SQE nightly.
>>   It is on the Test Stabilization radar (as well as the 6280037).
>>   There are many tests intermittently failing because of this.
>>
>>   The webrev for review is a one-liner work around the 6280037 for 
>> the GetFrameCount().
>>
>>   The JVMTI GetFrameCount() spec tells:
>>     "If this function is called for a thread actively executing 
>> bytecodes (for example,
>>      not the current thread and not suspended), the information 
>> returned is transient."
>>
>>   So, it is Ok to call the GetFrameCount() for non-suspended target 
>> threads.
>>   To achieve safety, the frame count for non-suspended threads is 
>> calculated at a safepoint.
>>   It should be Ok and more safe to do the same for suspended threads 
>> as well.
>>   There is no big performance impact because it is already on a slow 
>> path.
>>   It is still important to avoid safepointing when the target thread 
>> is current.
>>
>>   The bug 6280037 should go out of the Test Stabilization radar 
>> (remove the svc-nightly label)
>>   as the most of the impacted tests are covered by the 6471769.
>>
>>
>> Testing:
>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>
>>
>> Thanks,
>> Serguei
>>
>


From dmitry.samersoff at oracle.com  Mon Feb  3 12:31:49 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Tue, 04 Feb 2014 00:31:49 +0400
Subject: RFR(S): JDK-8029808 com/sun/jdi/ProcessAttachTest.sh times out
In-Reply-To: <D1A333D9-1B29-443E-BFD7-E140F49B53F6@oracle.com>
References: <79C6D4D7-F8A0-4A33-AF7A-D1007F227A86@oracle.com>
	<52EA7F46.5070405@oracle.com>
	<16D6BA50-82FE-41F6-8423-FE218D397C58@oracle.com>
	<52EA84DC.5090008@oracle.com> <52EA872E.7070209@oracle.com>
	<D1A333D9-1B29-443E-BFD7-E140F49B53F6@oracle.com>
Message-ID: <52EFFCB5.2030004@oracle.com>

Staffan,

Looks good to me!

-Dmitry

On 2014-02-03 23:18, Staffan Larsen wrote:
> OK, my last try at this was very broken - I'm glad it didn't pass the review.
> 
> Here is a new try: http://cr.openjdk.java.net/~sla/8029808/webrev.01/
> 
> As Alan says, we can't use jps on the (second) debuggee because it is started with suspend=y. In that mode it is waiting for a debugger to attach before it has started executing any Java code and before jps is able to identify it.
> 
> I did find a race in the test, though. When the first debuggee is started (suspend=n) there is a race where the debugger process can attach and finish before the debuggee has written out the portfile. In that case ShutdownDebuggee will fail. I have seen this failure mode happen and I can force it by inserting a Thread.sleep into the debuggee. I solved this by waiting for the port file to appear in the suspend=n case as well.
> 
> There is also a potential problem where both debuggees use the same output file and we may think the second debuggee has started before it actually has (because the output file isn't empty). I solved that by creating different output files for the two test cases.
> 
> I've also redirected all error streams so any errors aren't lost.
> 
> Thanks,
> /Staffan
> 
> On 30 jan 2014, at 18:09, Alan Bateman <Alan.Bateman at oracle.com> wrote:
> 
>> On 30/01/2014 16:59, Dmitry Samersoff wrote:
>>> Staffan,
>>>
>>> No, it's not about your code. Sorry for not being clear enough.
>>>
>>> You cleaned up unix code, but windows code remains bad. Particularly, it
>>> has bad unconditional sleep 2
>>>
>>> I would propose replacing the *windows manipulation with CYGWIN/MKS pids* with
>>> a call to JPS that returns the windows pid
>>>
>> I have a vague memory that we had to use ps to get the pid because of the suspend=y test which causes the debuggee to suspend during startup. I might be wrong on this of course, it was a long time ago.
>>
>> -Alan
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From serguei.spitsyn at oracle.com  Mon Feb  3 14:17:13 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 14:17:13 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
Message-ID: <52F01569.3070201@oracle.com>

Please, review the fix for:
   https://bugs.openjdk.java.net/browse/JDK-8025841


Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1

Summary:

   The fix was contributed by Oleg Mazurov to improve profiling data quality.
   It moves the "vtable stub" dynamic code notification to the right place.
   I've already reviewed the fix, and it looks good to me.

   Bug report description:

    "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled when
     a new chunk of memory for subsequent vtable and itable stubs is allocated.
     That chunk is uninitialized (contains zeros or garbage) although due to
     the fact that the actual event delivery is deferred, at least one vtable
     comes out right.

     This event should describe an individual vtable/itable stub (base address
     and size) and only after it's been created (memory is actually populated
     with code). Where VM diagnostic messages about vtable/itable stubs are
     issued upon -XX:+PrintAdapterHandlers appears exactly the right place for
     JVMTI events as well.

     Getting vtables/itables right is important in the context of performance
     analysis as that dynamically generated code may accumulate quite
     noticeable CPU time (especially itables), sometimes larger than the
     actual Java methods called."


Testing:
   Oleg tested it in the Oracle Studio Performance Analyzer environment.
   nsk.jvmti, nsk.jdi, nsk.jdwp,
   In progress: Jtreg com/sun/jdi, java/lang/instrument


Thanks,
Serguei


From jaroslav.bachorik at oracle.com  Mon Feb  3 14:33:15 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 03 Feb 2014 23:33:15 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
Message-ID: <52F0192B.9060104@oracle.com>

Please, review the following test fix:

Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00

The fix prevents the situation where the first debuggee has not finished
its initialization when the second one is started up, which leaves the
port available for the second debuggee and fails the test.

The patch is using the library methods to configure and launch the 
debuggee and the test waits for the well known string to appear in the 
first debuggee output before attempting to launch the second debuggee.
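
The "wait for a well-known string" step could look roughly like the sketch
below. It is not the code from the webrev: OutputMarkerSketch and
waitForMarker are hypothetical names, and the marker used in main
("Listening for transport", the line a JDWP agent prints when it starts
listening) is assumed here as an example rather than taken from the patch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class OutputMarkerSketch {

    /**
     * Reads the debuggee output line by line until a line contains the marker.
     * Returns false if the stream ends before the marker is seen.
     */
    public static boolean waitForMarker(Reader output, String marker) throws IOException {
        BufferedReader reader = new BufferedReader(output);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.contains(marker)) {
                return true;    // first debuggee is up; safe to start the second
            }
        }
        return false;           // debuggee exited or closed its output first
    }

    public static void main(String[] args) throws IOException {
        // Canned output stands in for the first debuggee's stdout.
        Reader fake = new StringReader(
                "starting\nListening for transport dt_socket at address: 5005\n");
        System.out.println("ready: " + waitForMarker(fake, "Listening for transport"));
    }
}
```

In a real test the Reader would wrap the first debuggee's Process output
stream, and the second debuggee would be launched only after this returns
true.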

Thanks,

-JB-

From daniel.daugherty at oracle.com  Mon Feb  3 14:38:27 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 03 Feb 2014 15:38:27 -0700
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EFEA25.8080905@oracle.com>
References: <52EC62CD.8000300@oracle.com> <52EF88A7.3090506@oracle.com>
	<52EFEA25.8080905@oracle.com>
Message-ID: <52F01A63.7080206@oracle.com>


On 2/3/14 12:12 PM, serguei.spitsyn at oracle.com wrote:
> On 2/3/14 4:16 AM, David Holmes wrote:
>> Hi Serguei,
>>
>> On 1/02/2014 12:58 PM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for:
>>>    https://bugs.openjdk.java.net/browse/JDK-6471769
>>>
>>>
>>> Open webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ 
>>>
>>>
>>>
>>> Summary:
>>>
>>>    There is a general issue in the suspend equivalent condition 
>>> mechanism:
>>>    Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
>>> may
>>> return different results:
>>>      - 1-st: true
>>>      - 2-nd: false
>>>
>>>    This more generic suspend equivalent issue is covered by another 
>>> bug:
>>>      https://bugs.openjdk.java.net/browse/JDK-6280037
>>>
>>>    The bug to fix in this review is a specific manifestation of the 
>>> 6280037
>>>    in the JVMTI GetFrameCount() that has a big impact on the SQE 
>>> nightly.
>>>    It is on the Test Stabilization radar (as well as the 6280037).
>>>    There are many tests intermittently failing because of this.
>>>
>>>    The webrev for review is a one-liner work around the 6280037 for the
>>> GetFrameCount().
>>>
>>>    The JVMTI GetFrameCount() spec tells:
>>>      "If this function is called for a thread actively executing
>>> bytecodes (for example,
>>>       not the current thread and not suspended), the information
>>> returned is transient."
>>>
>>>    So, it is Ok to call the GetFrameCount() for non-suspended target
>>> threads.
>>>    To achieve safety, the frame count for non-suspended threads is
>>> calculated at a safepoint.
>>>    It should be Ok and more safe to do the same for suspended 
>>> threads as
>>> well.
>>>    There is no big performance impact because it is already on a 
>>> slow path.
>>>    It is still important to avoid safepointing when the target 
>>> thread is
>>> current.
>>
>> This sounds completely reasonable - Reviewed.
>>
>> I can't check at the moment whether VM_GetFrameCount properly checks 
>> for a live target thread?
>
> David, thank you for the review!
> It looks like the check is missed.
> Please, give me some time to make sure what is needed here.

Check out the work done via this changeset:

http://hg.openjdk.java.net/hsx/hotspot-rt/hotspot/rev/c1d7040a1183

Sergey did the push via:

     JDK-8022836 JVM crashes in JVMTIENVBASE::GET_CURRENT_CONTENDED_MONITOR
                 and GET_OWNED_MONITOR
     https://bugs.openjdk.java.net/browse/JDK-8022836

but the real commentary about the bug is in this one:

     JDK-7154963 crash in JvmtiEnvBase::get_current_contended_monitor()
     https://bugs.openjdk.java.net/browse/JDK-7154963

I think the above code is what you want...

Dan


>
> Thanks!
> Serguei
>
>>
>> David
>>
>>>    The bug 6280037 should go out of the Test Stabilization radar 
>>> (remove
>>> the svc-nightly label)
>>>    as the most of the impacted tests are covered by the 6471769.
>>>
>>>
>>> Testing:
>>>    In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>


From daniel.daugherty at oracle.com  Mon Feb  3 14:54:06 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 03 Feb 2014 15:54:06 -0700
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52EFF84E.1050800@oracle.com>
References: <52EC62CD.8000300@oracle.com> <52EFBAF6.8000404@oracle.com>
	<52EFF84E.1050800@oracle.com>
Message-ID: <52F01E0D.8010304@oracle.com>

On 2/3/14 1:13 PM, serguei.spitsyn at oracle.com wrote:
> Dan,
>
> Thank you for reviewing!
> Some comments are below.
>
> On 2/3/14 7:51 AM, Daniel D. Daugherty wrote:
>> On 1/31/14 7:58 PM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for:
>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>
>>>
>>> Open webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ 
>>>
>>
>> src/share/vm/prims/jvmtiEnv.cpp
>>     Thumbs up on the code change.
>>
>>     I've never been a fan of the:
>>
>>         if (is_thread_fully_suspended(...))
>>             do-something-direct
>>         else
>>             do-something-via-a-safepoint
>>
>>     model. We don't have any mechanism in the VM to keep the
>>     is_thread_fully_suspended() constraint true over the
>>     execution time of the "do-something-direct" operation so
>>     we've always been exposed to races.
>
> Thank you for this confirmation.
> So that, we are in sync here.
>
>
>> Also, I think this
>>     model is used for several other JVM/TI calls so you might
>>     want to visit those also with an eye on safety.
>
> Yes, I'm looking to isolate and double check other places like this in 
> the code.
> Another piece of work is to isolate other bugs on this topic.
>
>
>>
>>     For this specific change, I think the comment would be
>>     better as:
>>
>>     // It is only safe to perform the direct operation on the current
>>     // thread. All other usage needs to use a vm-op for safety.
>
> It is taken, thanks!
>
>> In particular, I don't agree with this part of the comment:
>>
>>     // Performance-wise the only important case is when current 
>> thread is
>>     // the target thread.
>>
>>     I'm not sure how you can state that performance on the current 
>> thread
>>     is the only important case, but maybe I don't understand what you're
>>     really trying to say here.
>
> I agree, the comment is not clear.
> Probably, the comment should have said: "the only case where we can keep
> performance without losing safety is when the target thread is current ...".
> In fact, collecting data at a safepoint is the only way to make it safe.
> If the target thread is not current then:
>  - a) thread has been suspended: already on a slow path, so
>       performance is not that critical
>  - b) thread has not been suspended: the information we provide is
>       transient, and a safepoint is the only way to get it safely
>       (is this correct?)

Yes, the only safe way to get info for another thread is via
a safepoint operation.
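
The shape of the resulting code ("direct only for the current thread, 
everything else through a vm-op") can be sketched with a toy model. 
This is not HotSpot code: ToyThread, frame_count_via_vm_op and 
get_frame_count are made-up names, and a plain mutex stands in for the 
safepoint.

```cpp
#include <mutex>
#include <vector>

// Toy stand-in for a Java thread. In this model only the owning thread
// pushes frames, and it does so under `stack_lock`; the lock plays the
// role that a safepoint plays in the VM.
struct ToyThread {
    std::mutex stack_lock;
    std::vector<int> frames;

    void push_frame(int id) {
        std::lock_guard<std::mutex> guard(stack_lock);
        frames.push_back(id);
    }
};

// Hypothetical analogue of a VM operation: the count is taken while the
// target cannot change its stack.
int frame_count_via_vm_op(ToyThread& target) {
    std::lock_guard<std::mutex> guard(target.stack_lock);
    return static_cast<int>(target.frames.size());
}

// Direct access is used only when a thread asks about itself. There is
// no "is the target suspended?" check, because nothing keeps such a
// check true while the direct read is in progress.
int get_frame_count(ToyThread& current, ToyThread& target) {
    if (&current == &target) {
        return static_cast<int>(target.frames.size());
    }
    return frame_count_via_vm_op(target);
}
```

The point of the sketch is what is missing: no branch depends on a 
suspended-state query whose answer can change between the check and 
the use.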


>
>>
>> Now for the more general question with respect to this bug: 6471769
>> I'm having trouble connecting this change (which I like) to this
>> particular bug report.
>>
>> The crash in the bug's description looks like this:
>>
>> # Internal Error (/net/prt-solamd64-q1-2/PrtBuildDir/workspace/src/share/vm/prims/jvmtiThreadState.cpp, 273), pid=9561, tid=2
>> #
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20060914135846.dcubed.service_hs_b02_merge.2-debug compiled mode)
>> #
>> # Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth out of sync")
>>
>> V [libjvm.so+0x1217206];; void VMError::report_and_die()+0x606
>> V [libjvm.so+0x667841];; void report_assertion_failure(const char*,int,const char*)+0x61
>> V [libjvm.so+0xca9557];; int JvmtiThreadState::cur_stack_depth()+0x6e7
>> V [libjvm.so+0xc1caaa];; void JvmtiExport::post_method_exit(JavaThread*,methodOop,frame)+0x21ba
>> V [libjvm.so+0x8385de];; void InterpreterRuntime::post_method_exit(JavaThread*)+0x21e
>> j java.io.ObjectInputStream$BlockDataInputStream.getBlockDataMode()Z+4
>>
>> In the above crash, the current thread is trying to post a
>> JVMTI_EVENT_METHOD_EXIT and during the event posting code, it
>> calls cur_stack_depth() which fails an assert because the
>> cached stack depth doesn't match the current queried value:
>>
>>    282  int JvmtiThreadState::cur_stack_depth() {
>>    283    uint32_t debug_bits = 0;
>>    284    guarantee(JavaThread::current() == get_thread() ||
>>    285      JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits),
>>    286      "must be current thread or suspended");
>>    287
>>    288    if (!is_interp_only_mode() || _cur_stack_depth == UNKNOWN_STACK_DEPTH) {
>>    289      _cur_stack_depth = count_frames();
>>    290    } else {
>>    291      // heavy weight assert
>>    292      assert(_cur_stack_depth == count_frames(),
>>    293             "cur_stack_depth out of sync");
>>    294    }
>>    295    return _cur_stack_depth;
>>    296  }
>>
>>
>> This is not an unsafe query from one thread to another target thread
>> so I don't see how the proposed change will address this failure mode.
>> In particular, the assertion is troubling because it tells me that
>> the cached stack depth for the current thread is either wrong or has
>> been corrupted.
>
> The above are consequences of unsafe operations we are doing.
> Current thread should not calculate frame count of target thread (if 
> different) other than at a safepoint.
> We assume the thread is suspended but, in fact, it was not.
> So that we lost safety here and all manifestation we observe are the 
> consequences.

For at least the stack trace shown in the description of 6471769,
it is not an unsafe case because the thread is operating on itself.
All I'm saying here is that 6471769 is not the right bug to use for
this fix...


> This bug covers the frames miscount issue.
> In the fastdebug mode we may get the assert as in this bug report:
>   https://bugs.openjdk.java.net/browse/JDK-8032223

This bug, JDK-8032223, seems to be a perfect match for the code that you
are proposing to change in your current webrev. You have a thread
("JDWP Transport Listener: dt_socket") that is trying to query the
frame count on another thread... That target JavaThread met the
is_thread_fully_suspended() criteria when execution went down the
direct path, but later the target stopped meeting that criteria...

And boom...

> There is another potential race that I forgot to check and fix but 
> David pointed out.
> I will try to cover it in the next version of webrev.

See the separate reply that I sent on that part...


>
> It is still possible there can be other issues here (as you say).
> But it is hard to make sure it is the case because it is extremely 
> hard to reproduce (you know it!).
> I'd suggest to resolve it step-by-step:
>   - fix the known safety issues
>   - watch the nightly if anything else is left, file and fix new bugs

This sounds like a good plan.


>
>>
>> To reiterate:
>>
>> - I like the code change!
>> - I would like to see the comment tweaked a bit
>>
>> You should consider going forward with this change using a different
>> bug ID, perhaps something like:
>>
>>    JDK-NNNNNNN improve safety of JVM/TI GetFrameCount()
>
> There is already bug on this topic (and there are even more dups):
>   https://bugs.openjdk.java.net/browse/JDK-8032223
>
> But I still believe, the issue is the same ...

I don't think the issues in JDK-6471769 and JDK-8032223 are the same;
well not the same for the crash in the description of JDK-6471769. I
didn't look closely at the other crashes...


Dan


>
>
> Thanks!
> Serguei
>
>>
>> Dan
>>
>>
>>
>>>
>>> Summary:
>>>
>>>   There is a general issue in the suspend equivalent condition 
>>> mechanism:
>>>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
>>> may return different results:
>>>     - 1-st: true
>>>     - 2-nd: false
>>>
>>>   This more generic suspend equivalent issue is covered by another bug:
>>>     https://bugs.openjdk.java.net/browse/JDK-6280037
>>>
>>>   The bug to fix in this review is a specific manifestation of the 
>>> 6280037
>>>   in the JVMTI GetFrameCount() that has a big impact on the SQE 
>>> nightly.
>>>   It is on the Test Stabilization radar (as well as the 6280037).
>>>   There are many tests intermittently failing because of this.
>>>
>>>   The webrev for review is a one-liner work around the 6280037 for 
>>> the GetFrameCount().
>>>
>>>   The JVMTI GetFrameCount() spec tells:
>>>     "If this function is called for a thread actively executing 
>>> bytecodes (for example,
>>>      not the current thread and not suspended), the information 
>>> returned is transient."
>>>
>>>   So, it is Ok to call the GetFrameCount() for non-suspended target 
>>> threads.
>>>   To achieve safety, the frame count for non-suspended threads is 
>>> calculated at a safepoint.
>>>   It should be Ok and more safe to do the same for suspended threads 
>>> as well.
>>>   There is no big performance impact because it is already on a slow 
>>> path.
>>>   It is still important to avoid safepointing when the target thread 
>>> is current.
>>>
>>>   The bug 6280037 should go out of the Test Stabilization radar 
>>> (remove the svc-nightly label)
>>>   as the most of the impacted tests are covered by the 6471769.
>>>
>>>
>>> Testing:
>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>
>


From serguei.spitsyn at oracle.com  Mon Feb  3 14:54:14 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 14:54:14 -0800
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52F01A63.7080206@oracle.com>
References: <52EC62CD.8000300@oracle.com> <52EF88A7.3090506@oracle.com>
	<52EFEA25.8080905@oracle.com> <52F01A63.7080206@oracle.com>
Message-ID: <52F01E16.7050200@oracle.com>

On 2/3/14 2:38 PM, Daniel D. Daugherty wrote:
>
> On 2/3/14 12:12 PM, serguei.spitsyn at oracle.com wrote:
>> On 2/3/14 4:16 AM, David Holmes wrote:
>>> Hi Serguei,
>>>
>>> On 1/02/2014 12:58 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for:
>>>>    https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>
>>>>
>>>> Open webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ 
>>>>
>>>>
>>>>
>>>> Summary:
>>>>
>>>>    There is a general issue in the suspend equivalent condition 
>>>> mechanism:
>>>>    Two subsequent calls to the 
>>>> JvmtiEnv::is_thread_fully_suspended() may
>>>> return different results:
>>>>      - 1-st: true
>>>>      - 2-nd: false
>>>>
>>>>    This more generic suspend equivalent issue is covered by another 
>>>> bug:
>>>>      https://bugs.openjdk.java.net/browse/JDK-6280037
>>>>
>>>>    The bug to fix in this review is a specific manifestation of the 
>>>> 6280037
>>>>    in the JVMTI GetFrameCount() that has a big impact on the SQE 
>>>> nightly.
>>>>    It is on the Test Stabilization radar (as well as the 6280037).
>>>>    There are many tests intermittently failing because of this.
>>>>
>>>>    The webrev for review is a one-liner work around the 6280037 for 
>>>> the
>>>> GetFrameCount().
>>>>
>>>>    The JVMTI GetFrameCount() spec tells:
>>>>      "If this function is called for a thread actively executing
>>>> bytecodes (for example,
>>>>       not the current thread and not suspended), the information
>>>> returned is transient."
>>>>
>>>>    So, it is Ok to call the GetFrameCount() for non-suspended target
>>>> threads.
>>>>    To achieve safety, the frame count for non-suspended threads is
>>>> calculated at a safepoint.
>>>>    It should be Ok and more safe to do the same for suspended 
>>>> threads as
>>>> well.
>>>>    There is no big performance impact because it is already on a 
>>>> slow path.
>>>>    It is still important to avoid safepointing when the target 
>>>> thread is
>>>> current.
>>>
>>> This sounds completely reasonable - Reviewed.
>>>
>>> I can't check at the moment whether VM_GetFrameCount properly checks 
>>> for a live target thread?
>>
>> David, thank you for the review!
>> It looks like the check is missed.
>> Please, give me some time to make sure what is needed here.
>
> Check out the work done via this changeset:
>
> http://hg.openjdk.java.net/hsx/hotspot-rt/hotspot/rev/c1d7040a1183
>
> Sergey did the push via:
>
>     JDK-8022836 JVM crashes in 
> JVMTIENVBASE::GET_CURRENT_CONTENDED_MONITOR
>                 and GET_OWNED_MONITOR
>     https://bugs.openjdk.java.net/browse/JDK-8022836
>
> but the real commentary about the bug is in this one:
>
>     JDK-7154963 crash in JvmtiEnvBase::get_current_contended_monitor()
>     https://bugs.openjdk.java.net/browse/JDK-7154963
>
> I think above code is what you want...


I know what to check; I just need to make sure the new check is in the 
right place. :)
One of the examples is:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2013/hotspot/8030027-JVMTI-HS101.1/

Thanks,
Serguei


>
> Dan
>
>
>>
>> Thanks!
>> Serguei
>>
>>>
>>> David
>>>
>>>>    The bug 6280037 should go out of the Test Stabilization radar 
>>>> (remove
>>>> the svc-nightly label)
>>>>    as the most of the impacted tests are covered by the 6471769.
>>>>
>>>>
>>>> Testing:
>>>>    In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>
>


From serguei.spitsyn at oracle.com  Mon Feb  3 16:04:10 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Feb 2014 16:04:10 -0800
Subject: RFR (XS) 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <52F01E0D.8010304@oracle.com>
References: <52EC62CD.8000300@oracle.com> <52EFBAF6.8000404@oracle.com>
	<52EFF84E.1050800@oracle.com> <52F01E0D.8010304@oracle.com>
Message-ID: <52F02E7A.9090603@oracle.com>

On 2/3/14 2:54 PM, Daniel D. Daugherty wrote:
> On 2/3/14 1:13 PM, serguei.spitsyn at oracle.com wrote:
>> Dan,
>>
>> Thank you for reviewing!
>> Some comments are below.
>>
>> On 2/3/14 7:51 AM, Daniel D. Daugherty wrote:
>>> On 1/31/14 7:58 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for:
>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>
>>>>
>>>> Open webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-FRAME/ 
>>>>
>>>
>>> src/share/vm/prims/jvmtiEnv.cpp
>>>     Thumbs up on the code change.
>>>
>>>     I've never been a fan of the:
>>>
>>>         if (is_thread_fully_suspended(...))
>>>             do-something-direct
>>>         else
>>>             do-something-via-a-safepoint
>>>
>>>     model. We don't have any mechanism in the VM to keep the
>>>     is_thread_fully_suspended() constraint true over the
>>>     execution time of the "do-something-direct" operation so
>>>     we've always been exposed to races.
>>
>> Thank you for this confirmation.
>> So that, we are in sync here.
>>
>>
>>> Also, I think this
>>>     model is used for several other JVM/TI calls so you might
>>>     want to visit those also with an eye on safety.
>>
>> Yes, I'm looking to isolate and double check other places like this 
>> in the code.
>> Another piece of work is to isolate other bugs on this topic.
>>
>>
>>>
>>>     For this specific change, I think the comment would be
>>>     better as:
>>>
>>>     // It is only safe to perform the direct operation on the current
>>>     // thread. All other usage needs to use a vm-op for safety.
>>
>> It is taken, thanks!
>>
>>> In particular, I don't agree with this part of the comment:
>>>
>>>     // Performance-wise the only important case is when current 
>>> thread is
>>>     // the target thread.
>>>
>>>     I'm not sure how you can state that performance on the current 
>>> thread
>>>     is the only important case, but maybe I don't understand what 
>>> you're
>>>     really trying to say here.
>>
>> I agree, the comment is not clear.
>> Probably, the comment should have said: "the only case where we can keep
>> performance without losing safety is when the target thread is current ...".
>> In fact, collecting data at a safepoint is the only way to make it safe.
>> If the target thread is not current then:
>>  - a) thread has been suspended: already on a slow path, so
>>       performance is not that critical
>>  - b) thread has not been suspended: the information we provide is
>>       transient, and a safepoint is the only way to get it safely
>>       (is this correct?)
>
> Yes, the only safe way to get info for another thread is via
> a safepoint operation.

Ok, thanks.

>>>
>>> Now for the more general question with respect to this bug: 6471769
>>> I'm having trouble connecting this change (which I like) to this
>>> particular bug report.
>>>
>>> The crash in the bug's description looks like this:
>>>
>>> # Internal Error (/net/prt-solamd64-q1-2/PrtBuildDir/workspace/src/share/vm/prims/jvmtiThreadState.cpp, 273), pid=9561, tid=2
>>> #
>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20060914135846.dcubed.service_hs_b02_merge.2-debug compiled mode)
>>> #
>>> # Error: assert(_cur_stack_depth == count_frames(),"cur_stack_depth out of sync")
>>>
>>> V [libjvm.so+0x1217206];; void VMError::report_and_die()+0x606
>>> V [libjvm.so+0x667841];; void report_assertion_failure(const char*,int,const char*)+0x61
>>> V [libjvm.so+0xca9557];; int JvmtiThreadState::cur_stack_depth()+0x6e7
>>> V [libjvm.so+0xc1caaa];; void JvmtiExport::post_method_exit(JavaThread*,methodOop,frame)+0x21ba
>>> V [libjvm.so+0x8385de];; void InterpreterRuntime::post_method_exit(JavaThread*)+0x21e
>>> j java.io.ObjectInputStream$BlockDataInputStream.getBlockDataMode()Z+4
>>>
>>> In the above crash, the current thread is trying to post a
>>> JVMTI_EVENT_METHOD_EXIT and during the event posting code, it
>>> calls cur_stack_depth() which fails an assert because the
>>> cached stack depth doesn't match the current queried value:
>>>
>>>    282  int JvmtiThreadState::cur_stack_depth() {
>>>    283    uint32_t debug_bits = 0;
>>>    284    guarantee(JavaThread::current() == get_thread() ||
>>>    285      JvmtiEnv::is_thread_fully_suspended(get_thread(), false, &debug_bits),
>>>    286      "must be current thread or suspended");
>>>    287
>>>    288    if (!is_interp_only_mode() || _cur_stack_depth == UNKNOWN_STACK_DEPTH) {
>>>    289      _cur_stack_depth = count_frames();
>>>    290    } else {
>>>    291      // heavy weight assert
>>>    292      assert(_cur_stack_depth == count_frames(),
>>>    293             "cur_stack_depth out of sync");
>>>    294    }
>>>    295    return _cur_stack_depth;
>>>    296  }
>>>
>>>
>>> This is not an unsafe query from one thread to another target thread
>>> so I don't see how the proposed change will address this failure mode.
>>> In particular, the assertion is troubling because it tells me that
>>> the cached stack depth for the current thread is either wrong or has
>>> been corrupted.
>>
>> The above are consequences of unsafe operations we are doing.
>> Current thread should not calculate frame count of target thread (if 
>> different) other than at a safepoint.
>> We assume the thread is suspended but, in fact, it was not.
>> So that we lost safety here and all manifestation we observe are the 
>> consequences.
>
> For at least the stack trace shown in the description of 6471769,
> it is not an unsafe case because the thread is operating on itself.
> All I'm saying here is that 6471769 is not the right bug to use for
> this fix...

Thank you for sharing the concern.
I was not able to reproduce the issue, so I could not prove my theory 
(which may be wrong).
But you have a pretty valid point here.
I still think the stack trace with the assert is a consequence of a 
previous unsafe calculation on another thread.
But that calculation is probably different from the one I'm fixing in 
this webrev.


>> This bug covers the frames miscount issue.
>> In the fastdebug mode we may get the assert as in this bug report:
>>   https://bugs.openjdk.java.net/browse/JDK-8032223
>
> This bug: JDK-8032223 seems to be a perfect match for code that you
> are proposing the change in your current webrev. You have a thread
> ("JDWP Transport Listener: dt_socket") that is trying to query the
> frame count on another thread... That target JavaThread met the
> is_thread_fully_suspended() criteria when execution went down the
> direct path, but later the target stopped meeting that criteria...
>
> And boom...
>
>> There is another potential race that I forgot to check and fix but 
>> David pointed out.
>> I will try to cover it in the next version of webrev.
>
> See the separate reply that I sent on that part...
>
>
>>
>> It is still possible there can be other issues here (as you say).
>> But it is hard to make sure it is the case because it is extremely 
>> hard to reproduce (you know it!).
>> I'd suggest to resolve it step-by-step:
>>   - fix the known safety issues
>>   - watch the nightly if anything else is left, file and fix new bugs
>
> This sounds like a good plan.
>
>
>>
>>>
>>> To reiterate:
>>>
>>> - I like the code change!
>>> - I would like to see the comment tweaked a bit
>>>
>>> You should consider going forward with this change using a different
>>> bug ID, perhaps something like:
>>>
>>>    JDK-NNNNNNN improve safety of JVM/TI GetFrameCount()
>>
>> There is already bug on this topic (and there are even more dups):
>>   https://bugs.openjdk.java.net/browse/JDK-8032223
>>
>> But I still believe, the issue is the same ...
>
> I don't think the issues in JDK-6471769 and JDK-8032223 are the same;
> well not the same for the crash in the description of JDK-6471769. I
> didn't look closely at the other crashes...

Ok.
I will reopen JDK-8032223 and use it for this fix.

Thank you for helping to sort this out!
Serguei

>
>
> Dan
>
>
>>
>>
>> Thanks!
>> Serguei
>>
>>>
>>> Dan
>>>
>>>
>>>
>>>>
>>>> Summary:
>>>>
>>>>   There is a general issue in the suspend equivalent condition 
>>>> mechanism:
>>>>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
>>>> may return different results:
>>>>     - 1-st: true
>>>>     - 2-nd: false
>>>>
>>>>   This more generic suspend equivalent issue is covered by another 
>>>> bug:
>>>>     https://bugs.openjdk.java.net/browse/JDK-6280037
>>>>
>>>>   The bug to fix in this review is a specific manifestation of the 
>>>> 6280037
>>>>   in the JVMTI GetFrameCount() that has a big impact on the SQE 
>>>> nightly.
>>>>   It is on the Test Stabilization radar (as well as the 6280037).
>>>>   There are many tests intermittently failing because of this.
>>>>
>>>>   The webrev for review is a one-liner work around the 6280037 for 
>>>> the GetFrameCount().
>>>>
>>>>   The JVMTI GetFrameCount() spec tells:
>>>>     "If this function is called for a thread actively executing 
>>>> bytecodes (for example,
>>>>      not the current thread and not suspended), the information 
>>>> returned is transient."
>>>>
>>>>   So, it is Ok to call the GetFrameCount() for non-suspended target 
>>>> threads.
>>>>   To achieve safety, the frame count for non-suspended threads is 
>>>> calculated at a safepoint.
>>>>   It should be Ok and more safe to do the same for suspended 
>>>> threads as well.
>>>>   There is no big performance impact because it is already on a 
>>>> slow path.
>>>>   It is still important to avoid safepointing when the target 
>>>> thread is current.
>>>>
>>>>   The bug 6280037 should go out of the Test Stabilization radar 
>>>> (remove the svc-nightly label)
>>>>   as the most of the impacted tests are covered by the 6471769.
>>>>
>>>>
>>>> Testing:
>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, impacted JTreg tests
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>
>>
>


From david.holmes at oracle.com  Mon Feb  3 22:39:48 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 04 Feb 2014 16:39:48 +1000
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52EFAE1C.3050202@oracle.com>
References: <52ED3F28.5070806@oracle.com> <52EF874A.8020603@oracle.com>
	<52EFAE1C.3050202@oracle.com>
Message-ID: <52F08B34.1010300@oracle.com>

On 4/02/2014 12:56 AM, Daniel D. Daugherty wrote:
> Adding Dave Dice to this thread...
>
> On 2/3/14 5:10 AM, David Holmes wrote:
>> Hi Dan,
>>
>> On 2/02/2014 4:38 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a fix ready for the following bug:
>>>
>>>      8028073 race condition in ObjectMonitor implementation causing
>>> deadlocks
>>>      https://bugs.openjdk.java.net/browse/JDK-8028073
>>>
>>> On the surface, this is a very simple fix that relocates a few lines of
>>> code, relocates and rewrites the comments associated with that code and
>>> adds several new comments.
>>>
>>> Of course, in reality, the issue is much more complicated, but I'm
>>> hoping to make it easy for anyone not acquainted with this issue to
>>> understand what's going on.
>>
>> Let's spare them the trouble - Reviewed :)
>
> Thanks for the review!
>
>
>> I'm still troubled by the rationale for the original placement of this
>> code. It makes sense to issue the unpark prior to the park() call but
>> on the other hand there is no agent/event code in that path that would
>> have consumed the original unpark. ??
>
> I think I've acknowledged that concern in the "Risks" section below:
>
>     - while the old code was not effective for the hang that is being
>       fixed with this bug, it is possible that the old code prevented
>       a different bug in the successor protocol from manifesting
>
> Hopefully, Dave Dice will chime in and share the history...

For the record, Dave did chime in off-list: this code block dates back 
to the JVMPI implementation, which called 
jvmpi::post_monitor_contended_exit_event at the end of exit(), which is 
called just before the code block in question. That was back in JDK 5. 
By JDK 6, JVMPI had been stripped out but the code block remained.

>> Thanks, I know you've really enjoyed this one :)
>
> I actually did enjoy this one from a 'challenge to the mind' POV.
> But I have to say that the intricacies of the successor protocol
> gave me headaches... Hopefully, my successor.notes will help anyone
> else that is crazy enough to wander down this path...

Indeed!

Thanks,
David


> Dan
>
>
>>
>> David
>> -----
>>
>>> Here are the JDK9 webrev URLs:
>>>
>>> OpenJDK:
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> Oracle internal:
>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> The simple summary:
>>>
>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>    it is possible for a JVM/TI monitor event handler to accidentally
>>>    consume a ParkEvent.unpark() call meant for Java Monitor layer
>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>    https://bugs.openjdk.java.net/browse/JDK-5030359
>>> - it's the right fix, but it's in the wrong place
>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>    event handler is called because it is that event handler
>>>    that can cause the hang
>>>
>>>
>>> Testing
>>> -------
>>>
>>> - a new StessMonitorWait test has been created that reliably
>>>    reproduces the hang in JDK[6789]; see the bug's gory details
>>>    for the specific versions where the hang has been reproduced
>>>    - the test reliably reproduces the hang in 5 seconds on my
>>>      T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>      the hang reliably on other machines
>>>    - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>      Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>      bits did not reproduce the hang
>>> - JPRT test job
>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>    Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>    and Windows-X64:
>>>    - vm.quick
>>>    - Kitchensink (bigapps)
>>>    - Weblogic+medrec (bigapps)
>>>    - runThese (bigapps)
>>>
>>>
>>> The Gory Details Start Here
>>> ---------------------------
>>>
>>> This is the old location of block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> <snip>
>>> 1499    exit (true, Self) ;                    // exit the monitor
>>> <snip>
>>> 1513    if (node._notified != 0 && _succ == Self) {
>>> 1514       node._event->unpark();
>>> 1515    }
>>>
>>>
>>> This is the new location of block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> <snip>
>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret ==
>>> OS_TIMEOUT);
>>> <snip>
>>> 1604        if (node._notified != 0 && _succ == Self) {
>>> <snip>
>>> 1620          node._event->unpark();
>>> 1621        }
>>>
>>>
>>> The Risks
>>> ---------
>>>
>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>    is enabled:
>>>    - previously it was always executed
>>>    - while the old code was not effective for the hang that is being
>>>      fixed with this bug, it is possible that the old code prevented
>>>      a different bug in the successor protocol from manifesting
>>>    - thorough analysis of the successor protocol did not reveal a
>>>      case where the old code was needed in the old location
>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>    do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>    - the successor protocol is complicated and the analysis could
>>>      be wrong when certain options are used
>>>    - comments were added to each location where a JVM/TI monitor
>>>      event handler is called documenting why a fix like this one
>>>      is not needed there
>>>    - if the analysis is wrong, the new comments show where a new
>>>      code change would be needed
>>>
>>>
>>> The Scenario
>>> ------------
>>>
>>> I've created a scenario that reproduces this hang:
>>>
>>> T1 - enters monitor and calls monitor.wait()
>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>> T3 - enters and exits the monitor
>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>
>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>>
>>> Here are the six events necessary to make this hang happen:
>>>
>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>>
>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>
>>> // KEY-EVENT-3a: T3 made T1 the successor
>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>
>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>
>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>
>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>>
>>>
>>> This bug is intertwined with:
>>>
>>> - The ObjectMonitor successor protocol
>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>>
>>> There is a very long successor.notes attachment to JDK-8028073 that
>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>> for putting pretty much anyone to sleep.
>>>
>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>> backported solution of moving the original fix to the right location.
>>> The following new bug has been filed for possible future work in this
>>> area by the Serviceability Team:
>>>
>>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>      https://bugs.openjdk.java.net/browse/JDK-8033399
>>>
>>>
>>> The Symptoms
>>> ------------
>>>
>>> With intermittent hangs like this, it is useful to know what to look
>>> for in order to determine if you are running into this issue:
>>>
>>> - if you aren't using a debugger or a profiler or some other
>>>    JVM/TI agent, then this hang is not the same as yours
>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>    event handler, then this hang is not the same as yours
>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>    JVM/TI RawMonitors, then this hang is not the same as yours
>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>    back into Java code, then you might just be insane and this
>>>    hang might be similar to yours. However, using a Java callback
>>>    in an event handler is an even bigger problem/risk so fix that
>>>    first.
>>> - if you have one or more threads blocked like this and making no
>>>    progress, then this hang might be the same as yours:
>>>
>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>     JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>     JavaThread state: _thread_blocked
>>>          at java.lang.Object.wait(Native Method)
>>>          - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>          at java.lang.Object.wait(Object.java:502)
>>>          at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>          - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>     JavaThread state: _thread_blocked
>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>     JavaThread state: _thread_blocked
>>>          at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>     JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>     JavaThread state: _thread_blocked
>>>          at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T1:
>>>
>>> - had the object locked:
>>>
>>>    locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - did an Object.wait():
>>>
>>>    waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - is blocked on reentry:
>>>
>>>    waiting for monitor entry [0xfffffd7fc0231000]
>>>
>>> Key symptoms in thread T2:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>    waiting for monitor entry [0xfffffd7fc0130000]
>>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T3:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>    waiting for monitor entry [0xfffffd7fc002f000]
>>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>
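
The four-thread scenario quoted above (T1 to T4) can be sketched on the
Java side roughly as follows. This is only an illustration under stated
assumptions: the class name, method name, and latch-based ordering are
hypothetical and are not taken from the actual reproducer, and the sketch
does not include the JVM/TI agent with the JVMTI_EVENT_MONITOR_WAITED
handler. Without that agent it should simply run to completion; it cannot
force the six key events to interleave as described.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class MonitorScenarioSketch {

    /** Blocks until the latch opens; restores the interrupt flag if interrupted. */
    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Runs T1..T4 once. Returns true if all four threads finish within
     * timeoutMillis, false if any is still alive (a possible hang).
     */
    static boolean runOnce(long holdMillis, long timeoutMillis) {
        final Object monitor = new Object();
        final boolean[] notified = {false};               // guarded by monitor
        final CountDownLatch t1InMonitor = new CountDownLatch(1);

        // T1: enters the monitor and calls monitor.wait()
        Thread t1 = new Thread(() -> {
            synchronized (monitor) {
                // T1 still owns the monitor here, so the other threads can
                // only enter once wait() has released it.
                t1InMonitor.countDown();
                while (!notified[0]) {
                    try {
                        monitor.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "T1");

        // T2: enters the monitor, calls monitor.notify(), exits the monitor
        Thread t2 = new Thread(() -> {
            awaitQuietly(t1InMonitor);
            synchronized (monitor) {
                notified[0] = true;
                monitor.notify();
            }
        }, "T2");

        // T3: enters and exits the monitor
        Thread t3 = new Thread(() -> {
            awaitQuietly(t1InMonitor);
            synchronized (monitor) {
                // nothing to do; entering and exiting is the point
            }
        }, "T3");

        // T4: enters the monitor, delays, exits the monitor
        Thread t4 = new Thread(() -> {
            awaitQuietly(t1InMonitor);
            synchronized (monitor) {
                try {
                    Thread.sleep(holdMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "T4");

        Thread[] threads = {t1, t2, t3, t4};
        for (Thread t : threads) {
            t.setDaemon(true);   // do not keep the JVM alive if a thread hangs
            t.start();
        }
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            for (Thread t : threads) {
                long leftMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (leftMillis > 0) {
                    t.join(leftMillis);   // join(0) would wait forever, hence the guard
                }
                if (t.isAlive()) {
                    return false;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // 5 seconds matches the T4 delay in the quoted scenario.
        boolean finished = runOnce(5000, 30000);
        System.out.println(finished ? "all threads finished" : "possible hang");
    }
}
```

If a run returns false, a thread dump can be compared against the
symptoms listed in the quoted message.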

From shanliang.jiang at oracle.com  Tue Feb  4 00:54:20 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 04 Feb 2014 09:54:20 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F0192B.9060104@oracle.com>
References: <52F0192B.9060104@oracle.com>
Message-ID: <52F0AABC.7000904@oracle.com>

Jaroslav,

Your fix should work in most cases, but wouldn't it be better and more 
reliable to wait for a VM event, as suggested in the bug? Even though your 
timeout is adapted to the test timeout factor, the solution still depends 
on a fixed timeout and a fixed line of output.

Shanliang

Jaroslav Bachorik wrote:
> Please, review the following test fix:
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>
> The fix prevents the situation when the first debuggee has not managed 
> to finish its initialization while the second one is started up thus 
> making the port available for the second debuggee and failing the test.
>
> The patch is using the library methods to configure and launch the 
> debuggee and the test waits for the well known string to appear in the 
> first debuggee output before attempting to launch the second debuggee.
>
> Thanks,
>
> -JB-
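
The "wait for the well known string" step in the quoted patch summary
could look roughly like the sketch below. It is not the code from the
webrev: the class and method names are hypothetical, and the marker text
used in main() is an assumption about what the debuggee prints. The
sketch has no timeout of its own; it reads until the marker appears or
the stream ends, so a real test would still need to bound the wait.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class WaitForOutputSketch {

    /**
     * Reads lines from the given output until one contains the marker.
     * Returns true if the marker was seen, false if the stream ended first.
     */
    static boolean waitForLine(Reader output, String marker) {
        try {
            BufferedReader reader = new BufferedReader(output);
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(marker)) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulated debuggee output; with a real process this would be
        // new InputStreamReader(process.getInputStream()).
        String simulated = "debuggee starting\n"
                + "Listening for transport dt_socket at address: 5005\n";
        boolean ready = waitForLine(new StringReader(simulated),
                "Listening for transport");
        System.out.println(ready ? "first debuggee ready" : "first debuggee exited early");
    }
}
```

Only after waitForLine() returns true for the first debuggee would the
test go on to launch the second one.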


From jaroslav.bachorik at oracle.com  Tue Feb  4 00:58:42 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 04 Feb 2014 09:58:42 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F0AABC.7000904@oracle.com>
References: <52F0192B.9060104@oracle.com> <52F0AABC.7000904@oracle.com>
Message-ID: <52F0ABC2.4040700@oracle.com>

On 4.2.2014 09:54, shanliang wrote:
> Jaroslav,
>
> Your fix should work in most case, but is it better and more reliable to
> wait a VM event as suggested in the bug? even your timeout is adapted to
> the test time factory, but the solution still depends to a fixed timeout
> and a  fixed line out.

Well, if I get the test logic correctly it is supposed to test that the 
agent blocks the port even when no client has connected yet. Connecting 
to the agent and waiting for the event would change the thing the test 
checks, actually.

-JB-

>
> Shanliang
>
> Jaroslav Bachorik wrote:
>> Please, review the following test fix:
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
>> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>>
>> The fix prevents the situation when the first debuggee has not managed
>> to finish its initialization while the second one is started up thus
>> making the port available for the second debuggee and failing the test.
>>
>> The patch is using the library methods to configure and launch the
>> debuggee and the test waits for the well known string to appear in the
>> first debuggee output before attempting to launch the second debuggee.
>>
>> Thanks,
>>
>> -JB-
>


From jesper.wilhelmsson at oracle.com  Tue Feb  4 01:17:16 2014
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Tue, 04 Feb 2014 10:17:16 +0100
Subject: RFR (backport of): 8028391 - Make the Min/MaxHeapFreeRatio flags
	manageable
Message-ID: <52F0B01C.2040600@oracle.com>

Hi,

The patch from jdk9/hs-gc to make MinHeapFreeRatio and MaxHeapFreeRatio 
manageable flags and supported in ParallelGC applied cleanly to jdk8u/hs-dev.

hsx/jdk7u60 required some manual labor to patch since the argument parsing code 
had moved around, but the relevant logic was untouched so I didn't have to 
change anything in the new code. For 7 I omitted some of the minor cleanups that 
were done in the original patch.

Webrev for 7: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.7u60/
Webrev for 8: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.8u20/

(The 8u20 webrev is identical with the final version for 9.)

So kindly approve this change for backport to jdk8u20 and jdk7u60.

The bugs:
jdk9: https://bugs.openjdk.java.net/browse/JDK-8028391
jdk8: https://bugs.openjdk.java.net/browse/JDK-8033209
jdk7: https://bugs.openjdk.java.net/browse/JDK-8028720

Reviews in this thread:
http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-January/009187.html

The jdk9 changeset:
http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/44315152d434

Thanks,
/Jesper

From olivier.lagneau at oracle.com  Tue Feb  4 02:15:29 2014
From: olivier.lagneau at oracle.com (Olivier Lagneau)
Date: Tue, 04 Feb 2014 11:15:29 +0100
Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval for
	CR 8019389: SA-JDI JSR292: sun.jvm.hotspot.jdi.StackFrame.thisObject()
	throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check
In-Reply-To: <2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com>
References: <52DD37E3.1020402@oracle.com> <52DD3D1B.9070700@oracle.com>
	<52DD53C2.6070904@oracle.com> <52DE32CE.3080703@oracle.com>
	<2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com>
Message-ID: <52F0BDC1.7050405@oracle.com>

I need a second 7u reviewer for this 7u60 critical request fix (very 
simple fix).

Could someone review it please ?

Thanks,
Olivier.

Christian Thalinger said  on date 1/23/2014 10:09 PM:
> Looks good.
>
> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau 
> <olivier.lagneau at oracle.com> wrote:
>
>> Please find the new webrev with copyright date fixed (changed to 2014).
>>
>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/
>>
>> Olivier.
>>
>> Olivier Lagneau said  on date 1/20/2014 5:50 PM:
>>> Oops, right !
>>>
>>> Will fix that.
>>>
>>> Olivier.
>>>
>>> shanliang said  on date 1/20/2014 4:13 PM:
>>>> Olivier,
>>>>
>>>> Now it is 2014 :)
>>>>
>>>>
>>>> Olivier Lagneau wrote:
>>>>> Please review the following simple fix.
>>>>>
>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389
>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/
>>>>>
>>>>> The issue is due to the fact that _invokeHandle bytecode is 
>>>>> generated by hotspot,
>>>>> but is not declared in agent code. Just declaring the new bytecode 
>>>>> solves the assertion failure.
>>>>>
>>>>> However the tests reported in 8019389 
>>>>> (bootstrapOtherStratumInStackTrace, targetOtherStratumInStackTrace)
>>>>> suffer the problem from JDK-7016268 
>>>>> <https://bugs.openjdk.java.net/browse/JDK-7016268> : Can't get 
>>>>> strata information through SA-JDI
>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still be 
>>>>> present after fix.
>>>>> This second problem has to be fixed through JDK-7016268.
>>>>>
>>>>> Thanks,
>>>>> Olivier.
>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Tue Feb  4 03:03:27 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Feb 2014 03:03:27 -0800
Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval
	for CR 8019389: SA-JDI JSR292:
	sun.jvm.hotspot.jdi.StackFrame.thisObject()
	throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check
In-Reply-To: <52F0BDC1.7050405@oracle.com>
References: <52DD37E3.1020402@oracle.com>
	<52DD3D1B.9070700@oracle.com>	<52DD53C2.6070904@oracle.com>
	<52DE32CE.3080703@oracle.com>	<2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com>
	<52F0BDC1.7050405@oracle.com>
Message-ID: <52F0C8FF.70100@oracle.com>

Hi Olivier,

The fix looks good.
Was this already fixed in the JDK 9 and 8?

Thanks,
Serguei

On 2/4/14 2:15 AM, Olivier Lagneau wrote:
> I need a second 7u reviewer for this 7u60 critical request fix (very 
> simple fix).
>
> Could someone review it please ?
>
> Thanks,
> Olivier.
>
> Christian Thalinger said  on date 1/23/2014 10:09 PM:
>> Looks good.
>>
>> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau 
>> <olivier.lagneau at oracle.com <mailto:olivier.lagneau at oracle.com>> wrote:
>>
>>> Please find the new webrev with copyright date fixed (changed to 2014).
>>>
>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ 
>>> <http://cr.openjdk.java.net/%7Eolagneau/8019389/webrev.01/>
>>>
>>> Olivier.
>>>
>>> Olivier Lagneau said  on date 1/20/2014 5:50 PM:
>>>> Oops, right !
>>>>
>>>> Will fix that.
>>>>
>>>> Olivier.
>>>>
>>>> shanliang said  on date 1/20/2014 4:13 PM:
>>>>> Olivier,
>>>>>
>>>>> Now it is 2014 :)
>>>>>
>>>>>
>>>>> Olivier Lagneau wrote:
>>>>>> Please review the following simple fix.
>>>>>>
>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389
>>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ 
>>>>>> <http://cr.openjdk.java.net/%7Eolagneau/8019389/webrev.00/>
>>>>>>
>>>>>> The issue is due to the fact that _invokeHandle bytecode is 
>>>>>> generated by hotspot,
>>>>>> but is not declared in agent code. Just declaring the new 
>>>>>> bytecode solves the assertion failure.
>>>>>>
>>>>>> However the tests reported in 8019389 
>>>>>> (bootstrapOtherStratumInStackTrace, targetOtherStratumInStackTrace)
>>>>>> suffer the problem from JDK-7016268 
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-7016268> : Can't get 
>>>>>> strata information through SA-JDI
>>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still be 
>>>>>> present after fix.
>>>>>> This second problem has to be fixed through JDK-7016268.
>>>>>>
>>>>>> Thanks,
>>>>>> Olivier.
>>>>>
>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Tue Feb  4 03:13:20 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Feb 2014 03:13:20 -0800
Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets
	assert(SafepointSynchronize::is_at_safepoint()
	|| JvmtiEnv::is_thread_fully_suspended(get_thread(), false,
	&debug_bits))
Message-ID: <52F0CB50.3040703@oracle.com>

Please, review the fix for:
   https://bugs.openjdk.java.net/browse/JDK-8032223


Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/

Summary:

   This is the second round of review for this issue.
   It was decided that JDK-8032223 must be used to cover it instead of
   JDK-6471769.
   8032223 was initially closed as a dup of 6471769, but it has now been
   reopened.

   There is a general issue in the suspend equivalent condition mechanism:
   Two subsequent calls to JvmtiEnv::is_thread_fully_suspended() may
   return different results:
     - 1-st: true
     - 2-nd: false

   This suspend equivalent issue is covered by another bug:
     https://bugs.openjdk.java.net/browse/JDK-6280037

   The bug to fix in this review is a specific manifestation of 6280037
   in the JVMTI GetFrameCount() that has a major impact on the SQE nightly.
   It is on the Test Stabilization radar, as is 6280037.
   Many tests fail intermittently because of this.
   I've also decided to fix the same issue in the JVMTI GetFrameLocation().

   The JVMTI GetFrameCount() spec says:
     "If this function is called for a thread actively executing bytecodes
      (for example, not the current thread and not suspended), the
      information returned is transient."

   So, it is OK to call GetFrameCount() for a non-suspended target thread.
   To achieve safety, the frame count for non-suspended threads is
   calculated at a safepoint.
   It should be OK, and safer, to do the same for suspended threads as well.
   There should be no big performance impact because it is already on a
   slow path.
   It is still important to avoid safepointing when the target thread is
   the current thread.

   The bug 6280037 should come off the Test Stabilization radar (remove
   the svc-nightly label), as most of the impacted tests should be
   covered by 8032223.


Testing:
   In progress:
     - nsk.jvmti, nsk.jdi, nsk.jdwp
     - JTreg com/sun/jdi


Thanks,
Serguei


From olivier.lagneau at oracle.com  Tue Feb  4 03:10:38 2014
From: olivier.lagneau at oracle.com (Olivier Lagneau)
Date: Tue, 04 Feb 2014 12:10:38 +0100
Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval
	for CR 8019389: SA-JDI JSR292:
	sun.jvm.hotspot.jdi.StackFrame.thisObject()
	throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check
In-Reply-To: <52F0C8FF.70100@oracle.com>
References: <52DD37E3.1020402@oracle.com>
	<52DD3D1B.9070700@oracle.com>	<52DD53C2.6070904@oracle.com>
	<52DE32CE.3080703@oracle.com>	<2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com>
	<52F0BDC1.7050405@oracle.com> <52F0C8FF.70100@oracle.com>
Message-ID: <52F0CAAE.3010106@oracle.com>

Thanks Serguei !

This is a 7u only fix. This code is already present in 8 and 9.

Olivier.

serguei.spitsyn at oracle.com said  on date 2/4/2014 12:03 PM:
> Hi Oliver,
>
> The fix looks good.
> Was this already fixed in the JDK 9 and 8?
>
> Thanks,
> Serguei
>
> On 2/4/14 2:15 AM, Olivier Lagneau wrote:
>> I need a second 7u reviewer for this 7u60 critical request fix (very 
>> simple fix).
>>
>> Could someone review it please ?
>>
>> Thanks,
>> Olivier.
>>
>> Christian Thalinger said  on date 1/23/2014 10:09 PM:
>>> Looks good.
>>>
>>> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau 
>>> <olivier.lagneau at oracle.com <mailto:olivier.lagneau at oracle.com>> wrote:
>>>
>>>> Please find the new webrev with copyright date fixed (changed to 2014).
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ 
>>>> <http://cr.openjdk.java.net/%7Eolagneau/8019389/webrev.01/>
>>>>
>>>> Olivier.
>>>>
>>>> Olivier Lagneau said  on date 1/20/2014 5:50 PM:
>>>>> Oops, right !
>>>>>
>>>>> Will fix that.
>>>>>
>>>>> Olivier.
>>>>>
>>>>> shanliang said  on date 1/20/2014 4:13 PM:
>>>>>> Olivier,
>>>>>>
>>>>>> Now it is 2014 :)
>>>>>>
>>>>>>
>>>>>> Olivier Lagneau wrote:
>>>>>>> Please review the following simple fix.
>>>>>>>
>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389
>>>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ 
>>>>>>> <http://cr.openjdk.java.net/%7Eolagneau/8019389/webrev.00/>
>>>>>>>
>>>>>>> The issue is due to the fact that _invokeHandle bytecode is 
>>>>>>> generated by hotspot,
>>>>>>> but is not declared in agent code. Just declaring the new 
>>>>>>> bytecode solves the assertion failure.
>>>>>>>
>>>>>>> However the tests reported in 8019389 
>>>>>>> (bootstrapOtherStratumInStackTrace, targetOtherStratumInStackTrace)
>>>>>>> suffer the problem from JDK-7016268 
>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-7016268> : Can't get 
>>>>>>> strata information through SA-JDI
>>>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still 
>>>>>>> be present after fix.
>>>>>>> This second problem has to be fixed through JDK-7016268.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Olivier.
>>>>>>
>>>>>
>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Tue Feb  4 03:32:05 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Feb 2014 03:32:05 -0800
Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval
	for CR 8019389: SA-JDI JSR292:
	sun.jvm.hotspot.jdi.StackFrame.thisObject()
	throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check
In-Reply-To: <52F0CAAE.3010106@oracle.com>
References: <52DD37E3.1020402@oracle.com>
	<52DD3D1B.9070700@oracle.com>	<52DD53C2.6070904@oracle.com>
	<52DE32CE.3080703@oracle.com>	<2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com>
	<52F0BDC1.7050405@oracle.com> <52F0C8FF.70100@oracle.com>
	<52F0CAAE.3010106@oracle.com>
Message-ID: <52F0CFB5.3040107@oracle.com>

On 2/4/14 3:10 AM, Olivier Lagneau wrote:
> Thanks Serguei !
>
> This is a 7u only fix. This code is already present in 8 and 9.
Ok, thanks.
Just in case, I've attached one of the latest e-mail notifications from 
Alejandro.
Ask questions if there is any confusion.

Thanks,
Serguei

>
> Olivier.
>
> serguei.spitsyn at oracle.com said  on date 2/4/2014 12:03 PM:
>> Hi Oliver,
>>
>> The fix looks good.
>> Was this already fixed in the JDK 9 and 8?
>>
>> Thanks,
>> Serguei
>>
>> On 2/4/14 2:15 AM, Olivier Lagneau wrote:
>>> I need a second 7u reviewer for this 7u60 critical request fix (very 
>>> simple fix).
>>>
>>> Could someone review it please ?
>>>
>>> Thanks,
>>> Olivier.
>>>
>>> Christian Thalinger said  on date 1/23/2014 10:09 PM:
>>>> Looks good.
>>>>
>>>> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau 
>>>> <olivier.lagneau at oracle.com <mailto:olivier.lagneau at oracle.com>> wrote:
>>>>
>>>>> Please find the new webrev with copyright date fixed (changed to 
>>>>> 2014).
>>>>>
>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ 
>>>>> <http://cr.openjdk.java.net/%7Eolagneau/8019389/webrev.01/>
>>>>>
>>>>> Olivier.
>>>>>
>>>>> Olivier Lagneau said  on date 1/20/2014 5:50 PM:
>>>>>> Oops, right !
>>>>>>
>>>>>> Will fix that.
>>>>>>
>>>>>> Olivier.
>>>>>>
>>>>>> shanliang said  on date 1/20/2014 4:13 PM:
>>>>>>> Olivier,
>>>>>>>
>>>>>>> Now it is 2014 :)
>>>>>>>
>>>>>>>
>>>>>>> Olivier Lagneau wrote:
>>>>>>>> Please review the following simple fix.
>>>>>>>>
>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389
>>>>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ 
>>>>>>>> <http://cr.openjdk.java.net/%7Eolagneau/8019389/webrev.00/>
>>>>>>>>
>>>>>>>> The issue is due to the fact that _invokeHandle bytecode is 
>>>>>>>> generated by hotspot,
>>>>>>>> but is not declared in agent code. Just declaring the new 
>>>>>>>> bytecode solves the assertion failure.
>>>>>>>>
>>>>>>>> However the tests reported in 8019389 
>>>>>>>> (bootstrapOtherStratumInStackTrace, targetOtherStratumInStackTrace)
>>>>>>>> suffer the problem from JDK-7016268 
>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-7016268> : Can't get 
>>>>>>>> strata information through SA-JDI
>>>>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still 
>>>>>>>> be present after fix.
>>>>>>>> This second problem has to be fixed through JDK-7016268.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Olivier.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An embedded message was scrubbed...
From: Alejandro E Murillo <alejandro.murillo at oracle.com>
Subject: [Rectification] Re: New hotspot repo for 7u60 changes
Date: Mon, 03 Feb 2014 16:14:29 -0700
Size: 4082
Url: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140204/75360197/AttachedMessage-0001.nws 

From david.holmes at oracle.com  Tue Feb  4 03:46:20 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 04 Feb 2014 21:46:20 +1000
Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets
	assert(SafepointSynchronize::is_at_safepoint()
	|| JvmtiEnv::is_thread_fully_suspended(get_thread(), false,
	&debug_bits))
In-Reply-To: <52F0CB50.3040703@oracle.com>
References: <52F0CB50.3040703@oracle.com>
Message-ID: <52F0D30C.20401@oracle.com>

Hi Serguei,

Looks okay to me.

Minor nit: "use a vm-op for safety" is actually "use a vm-safepoint-op 
for safety". Not all VM ops need involve a safepoint.

David

On 4/02/2014 9:13 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>    https://bugs.openjdk.java.net/browse/JDK-8032223
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/
>
>
> Summary:
>
>    This is the second round of review for this issue.
>    But it was decided that the JDK-8032223 must be used to cover it
> instead of the JDK-6471769.
>    The 8032223 was initially closed as a dup of 6471769 but it has been
> re-open now.
>
>    There is a general issue in the suspend equivalent condition mechanism:
>    Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may
> return different results:
>      - 1-st: true
>      - 2-nd: false
>
>    This suspend equivalent issue is covered by another bug:
>      https://bugs.openjdk.java.net/browse/JDK-6280037
>
>    The bug to fix in this review is a specific manifestation of the 6280037
>    in the JVMTI GetFrameCount() that has a major impact on the SQE nightly.
>    It is on the Test Stabilization radar as well as the 6280037.
>    There are many tests intermittently failing because of this.
>    I've also decided to fix the same issue in the JVMTI
> GetFrameLocation() as well.
>
>    The JVMTI GetFrameCount() spec tells:
>      "If this function is called for a thread actively executing
> bytecodes (for example,
>       not the current thread and not suspended), the information
> returned is transient."
>
>    So, it is Ok to call the GetFrameCount() for the non-suspended target
> thread.
>    To achieve safety, the frame count for non-suspended threads is
> calculated at a safepoint.
>    It should be Ok and more safe to do the same for suspended threads as
> well.
>    There should be no big performance impact because it is already on a
> slow path.
>    It is still important to avoid safepointing when the target thread is
> current.
>
>    The bug 6280037 should go out of the Test Stabilization radar (remove
> the svc-nightly label)
>    as the most of the impacted tests must be covered by the 8032223.
>
>
> Testing:
>    In progress:
>      - nsk.jvmti, nsk.jdi, nsk.jdwp
>      - JTreg com/sun/jdi
>
>
> Thanks,
> Serguei
>

From olivier.lagneau at oracle.com  Tue Feb  4 03:33:34 2014
From: olivier.lagneau at oracle.com (Olivier Lagneau)
Date: Tue, 04 Feb 2014 12:33:34 +0100
Subject: Need second 7u reviewer for [7u60] Request for phase 2 approval
	for CR 8019389: SA-JDI JSR292:
	sun.jvm.hotspot.jdi.StackFrame.thisObject()
	throws sun.jvm.hotspot.utilities.AssertionFailure: sanity check
In-Reply-To: <52F0CFB5.3040107@oracle.com>
References: <52DD37E3.1020402@oracle.com>
	<52DD3D1B.9070700@oracle.com>	<52DD53C2.6070904@oracle.com>
	<52DE32CE.3080703@oracle.com>	<2F0BD850-0C45-4377-959A-41045C75FB88@oracle.com>
	<52F0BDC1.7050405@oracle.com> <52F0C8FF.70100@oracle.com>
	<52F0CAAE.3010106@oracle.com> <52F0CFB5.3040107@oracle.com>
Message-ID: <52F0D00E.2040700@oracle.com>

serguei.spitsyn at oracle.com said  on date 2/4/2014 12:32 PM:
>
> Just in case, I've attached one of the latest e-mail notifications 
> from Alejandro.
> Ask questions if there is any confusion.
Thanks, Serguei, for the reminder.
I have been in touch with Alejandro about where to push the fix.

Olivier.
>
> Thanks,
> Serguei
>
>>
>> Olivier.
>>
>> serguei.spitsyn at oracle.com said  on date 2/4/2014 12:03 PM:
>>> Hi Oliver,
>>>
>>> The fix looks good.
>>> Was this already fixed in the JDK 9 and 8?
>>>
>>> Thanks,
>>> Serguei
>>>
>>> On 2/4/14 2:15 AM, Olivier Lagneau wrote:
>>>> I need a second 7u reviewer for this 7u60 critical request fix 
>>>> (very simple fix).
>>>>
>>>> Could someone review it please ?
>>>>
>>>> Thanks,
>>>> Olivier.
>>>>
>>>> Christian Thalinger said  on date 1/23/2014 10:09 PM:
>>>>> Looks good.
>>>>>
>>>>> On Jan 21, 2014, at 12:41 AM, Olivier Lagneau 
>>>>> <olivier.lagneau at oracle.com <mailto:olivier.lagneau at oracle.com>> 
>>>>> wrote:
>>>>>
>>>>>> Please find the new webrev with copyright date fixed (changed to 
>>>>>> 2014).
>>>>>>
>>>>>> Webrev: http://cr.openjdk.java.net/~olagneau/8019389/webrev.01/ 
>>>>>> <http://cr.openjdk.java.net/%7Eolagneau/8019389/webrev.01/>
>>>>>>
>>>>>> Olivier.
>>>>>>
>>>>>> Olivier Lagneau said  on date 1/20/2014 5:50 PM:
>>>>>>> Oops, right !
>>>>>>>
>>>>>>> Will fix that.
>>>>>>>
>>>>>>> Olivier.
>>>>>>>
>>>>>>> shanliang said  on date 1/20/2014 4:13 PM:
>>>>>>>> Olivier,
>>>>>>>>
>>>>>>>> Now it is 2014 :)
>>>>>>>>
>>>>>>>>
>>>>>>>> Olivier Lagneau wrote:
>>>>>>>>> Please review the following simple fix.
>>>>>>>>>
>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8019389
>>>>>>>>> Webrev: 
>>>>>>>>> http://cr.openjdk.java.net/~olagneau/8019389/webrev.00/ 
>>>>>>>>> <http://cr.openjdk.java.net/%7Eolagneau/8019389/webrev.00/>
>>>>>>>>>
>>>>>>>>> The issue is due to the fact that _invokeHandle bytecode is 
>>>>>>>>> generated by hotspot,
>>>>>>>>> but is not declared in agent code. Just declaring the new 
>>>>>>>>> bytecode solves the assertion failure.
>>>>>>>>>
>>>>>>>>> However the tests reported in 8019389 
>>>>>>>>> (bootstrapOtherStratumInStackTrace, 
>>>>>>>>> targetOtherStratumInStackTrace)
>>>>>>>>> suffer the problem from JDK-7016268 
>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-7016268> : Can't get 
>>>>>>>>> strata information through SA-JDI
>>>>>>>>> Thus, the "stratum mismatch" related to JDK-7016268 will still 
>>>>>>>>> be present after fix.
>>>>>>>>> This second problem has to be fixed through JDK-7016268.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Olivier.
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From dmitry.samersoff at oracle.com  Tue Feb  4 05:46:49 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Tue, 04 Feb 2014 17:46:49 +0400
Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not
	available in client VM
In-Reply-To: <CFDDFEB8-4C53-4EEF-850E-D6C0E7BFE07C@oracle.com>
References: <52EBFC89.3030809@oracle.com>
	<A10C11D1-EABD-4D6A-B3BF-137E4F3EDB4E@oracle.com>
	<52EF7675.3020707@oracle.com>
	<CFDDFEB8-4C53-4EEF-850E-D6C0E7BFE07C@oracle.com>
Message-ID: <52F0EF49.4050003@oracle.com>

Staffan,

You were right, only two classes are C2-specific.

Here is the updated webrev.

http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.02/
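
A minimal sketch of the concern raised in the quoted review below: when many
initializations share one try block, the first failure skips everything after
it, so each optional item gets its own guard here. This is Java for
illustration only (sa.js itself is JavaScript), and the class, method, and
step names are hypothetical, not taken from the webrev.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class GuardedInit {
    /**
     * Runs each optional initialization step in its own try/catch and
     * returns the names of the steps that completed, in iteration order.
     */
    static List<String> initOptional(Map<String, Runnable> optionalSteps) {
        List<String> initialized = new ArrayList<>();
        for (Map.Entry<String, Runnable> step : optionalSteps.entrySet()) {
            try {
                step.getValue().run();
                initialized.add(step.getKey());
            } catch (RuntimeException missing) {
                // Assumed meaning: the type is absent in the target VM.
                // Skip this one item and keep initializing the rest.
            }
        }
        return initialized;
    }
}
```

Items that must exist in every VM would stay outside such guards, so that a
genuinely missing structure still surfaces as an error.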

-Dmitry

On 2014-02-03 16:13, Staffan Larsen wrote:
> 
> On 3 feb 2014, at 11:59, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:
> 
>> Staffan,
>>
>> All items under the try-catch exist only in the server (C2 compiler) VM, so if
>> one of them is missing in the target VM, the other items are missing as well.
> 
> Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?
> 
> /Staffan
> 
> 
>>
>> I have no idea what happens later if SA tries to access one of these
>> non-initialized items, but I don't see any errors in my home testing.
>>
>> -Dmitry
>>
>> On 2014-02-03 14:39, Staffan Larsen wrote:
>>> It looks like this new version will leave some things uninitialized
>>> if an item higher on the list throws an exception. I don't think that
>>> is what you want.
>>>
>>> /Staffan
>>>
>>>
>>> On 31 jan 2014, at 20:42, Dmitry Samersoff
>>> <dmitry.samersoff at oracle.com> wrote:
>>>
>>>> Hi Everybody,
>>>>
>>>> Please review the fix:
>>>>
>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>>>>
>>>> sa.js tries to initialize all possible VM structures regardless of
>>>> the type of VM we are running on
>>>>
>>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>>>> Russia * I would love to change the world, but they won't give me
>>>> the sources.
>>>
>>
>>
>> -- 
>> Dmitry Samersoff
>> Oracle Java development team, Saint Petersburg, Russia
>> * I would love to change the world, but they won't give me the sources.
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From staffan.larsen at oracle.com  Tue Feb  4 05:47:33 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 4 Feb 2014 14:47:33 +0100
Subject: RR(S): JDK-8023667 SA: ExceptionBlob and other C2 classes not
	available in client VM
In-Reply-To: <52F0EF49.4050003@oracle.com>
References: <52EBFC89.3030809@oracle.com>
	<A10C11D1-EABD-4D6A-B3BF-137E4F3EDB4E@oracle.com>
	<52EF7675.3020707@oracle.com>
	<CFDDFEB8-4C53-4EEF-850E-D6C0E7BFE07C@oracle.com>
	<52F0EF49.4050003@oracle.com>
Message-ID: <A46D0DF0-AC02-4014-A062-CD45ACF4E207@oracle.com>

Looks good now!

/Staffan

On 4 feb 2014, at 14:46, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:

> Staffan,
> 
> You were right, only two classes are C2 specific.
> 
> Here is updated webrev.
> 
> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.02/
> 
> -Dmitry
> 
> On 2014-02-03 16:13, Staffan Larsen wrote:
>> 
>> On 3 feb 2014, at 11:59, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:
>> 
>>> Staffan,
>>> 
>>> All items under the try-catch exist only in the server (C2 compiler) VM, so if
>>> one of them is missing in the target VM, the other items are missing as well.
>> 
>> Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?
>> 
>> /Staffan
>> 
>> 
>>> 
>>> I have no idea what happens later if SA tries to access one of these
>>> non-initialized items, but I don't see any errors in my home testing.
>>> 
>>> -Dmitry
>>> 
>>> On 2014-02-03 14:39, Staffan Larsen wrote:
>>>> It looks like this new version will leave some things uninitialized
>>>> if an item higher on the list throws an exception. I don't think that
>>>> is what you want.
>>>> 
>>>> /Staffan
>>>> 
>>>> 
>>>> On 31 jan 2014, at 20:42, Dmitry Samersoff
>>>> <dmitry.samersoff at oracle.com> wrote:
>>>> 
>>>>> Hi Everybody,
>>>>> 
>>>>> Please review the fix:
>>>>> 
>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>>>>> 
>>>>> sa.js tries to initialize all possible VM structures regardless of
>>>>> the type of VM we are running on
>>>>> 
>>>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>>>>> Russia * I would love to change the world, but they won't give me
>>>>> the sources.
>>>> 
>>> 
>>> 
>>> -- 
>>> Dmitry Samersoff
>>> Oracle Java development team, Saint Petersburg, Russia
>>> * I would love to change the world, but they won't give me the sources.
>> 
> 
> 
> -- 
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the sources.


From mikael.gerdin at oracle.com  Tue Feb  4 06:44:55 2014
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Tue, 04 Feb 2014 15:44:55 +0100
Subject: RFR (backport of): 8028391 - Make the Min/MaxHeapFreeRatio flags
	manageable
In-Reply-To: <52F0B01C.2040600@oracle.com>
References: <52F0B01C.2040600@oracle.com>
Message-ID: <16972825.PJJixsIhPJ@mgerdin03>

Hi Jesper,

On Tuesday 04 February 2014 10.17.16 Jesper Wilhelmsson wrote:
> Hi,
> 
> The patch from jdk9/hs-gc to make MinHeapFreeRatio and MaxHeapFreeRatio
> manageable flags and supported in ParallelGC applied cleanly to
> jdk8u/hs-dev.
> 
> hsx/jdk7u60 required some manual labor to patch since the argument parsing
> code had moved around, but the relevant logic was untouched so I didn't
> have to change anything in the new code. For 7 I omitted some of the minor
> cleanups that were done in the original patch.
> 
> Webrev for 7: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.7u60/

The 7u60 backport looks good to me.

> Webrev for 8: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.8u20/
> 
> (The 8u20 webrev is identical with the final version for 9.)

In that case, you don't need to ask for reviews for the 8u20 backport, you can 
just go ahead and push it.

/Mikael

> 
> So kindly approve this change for backport to jdk8u20 and jdk7u60.
> 
> The bugs:
> jdk9: https://bugs.openjdk.java.net/browse/JDK-8028391
> jdk8: https://bugs.openjdk.java.net/browse/JDK-8033209
> jdk7: https://bugs.openjdk.java.net/browse/JDK-8028720
> 
> Reviews in this thread:
> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-January/009187.html
> 
> The jdk9 changeset:
> http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/44315152d434
> 
> Thanks,
> /Jesper


From shanliang.jiang at oracle.com  Tue Feb  4 07:07:13 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 04 Feb 2014 16:07:13 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F0ABC2.4040700@oracle.com>
References: <52F0192B.9060104@oracle.com> <52F0AABC.7000904@oracle.com>
	<52F0ABC2.4040700@oracle.com>
Message-ID: <52F10221.6010303@oracle.com>

Jaroslav Bachorik wrote:
> On 4.2.2014 09:54, shanliang wrote:
>> Jaroslav,
>>
>> Your fix should work in most cases, but would it be better and more reliable to
>> wait for a VM event as suggested in the bug? Even though your timeout is adapted to
>> the test timeout factor, the solution still depends on a fixed timeout
>> and a fixed output line.
>
> Well, if I get the test logic correctly it is supposed to test that 
> the agent blocks the port even when no client has connected yet. 
> Connecting to the agent and waiting for the event would change the 
> thing the test checks, actually.
You are right that the test should not attach a VM before launching the 
second debuggee. Let's hope that 5000 * Utils.TIMEOUT_FACTOR works for 
all testing machines.
Looks OK.
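
A minimal sketch of the approach described in the quoted fix summary below:
read the first debuggee's output until a known line appears, and give up after
a timeout. The class and method names are hypothetical, and the marker text
used in the usage note is an assumption; the actual patch uses the test
library's own helpers, which are not shown in this thread.

```java
import java.io.BufferedReader;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class OutputGate {
    /**
     * Returns true if a line containing the marker is read before the
     * timeout expires; false on timeout, end of stream, or read failure.
     */
    static boolean waitForLine(BufferedReader output, String marker, long timeoutMillis) {
        // Daemon reader thread, so a blocked read cannot keep the JVM alive.
        ExecutorService reader = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "output-reader");
            t.setDaemon(true);
            return t;
        });
        try {
            Future<Boolean> seen = reader.submit(() -> {
                String line;
                while ((line = output.readLine()) != null) {
                    if (line.contains(marker)) {
                        return true;
                    }
                }
                return false; // stream closed before the marker appeared
            });
            return seen.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            reader.shutdownNow();
        }
    }
}
```

A caller would wrap the first debuggee's stdout in a BufferedReader, wait for
a marker such as "Listening for transport" (assumed here), pass a timeout
scaled the way 5000 * Utils.TIMEOUT_FACTOR is above, and only then launch the
second debuggee.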

Thanks,
Shanliang
>
> -JB-
>
>>
>> Shanliang
>>
>> Jaroslav Bachorik wrote:
>>> Please, review the following test fix:
>>>
>>> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
>>> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>>>
>>> The fix prevents the situation when the first debuggee has not managed
>>> to finish its initialization while the second one is started up thus
>>> making the port available for the second debuggee and failing the test.
>>>
>>> The patch is using the library methods to configure and launch the
>>> debuggee and the test waits for the well known string to appear in the
>>> first debuggee output before attempting to launch the second debuggee.
>>>
>>> Thanks,
>>>
>>> -JB-
>>
>


From bengt.rutisson at oracle.com  Tue Feb  4 07:18:34 2014
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Tue, 04 Feb 2014 16:18:34 +0100
Subject: RFR (backport of): 8028391 - Make the Min/MaxHeapFreeRatio flags
	manageable
In-Reply-To: <52F0B01C.2040600@oracle.com>
References: <52F0B01C.2040600@oracle.com>
Message-ID: <52F104CA.3030702@oracle.com>


Hi Jesper,

I browsed the 7u60 back port and it looks fine as far as I can tell.

I did not look at the 8u20 back port since I don't think you need a 
review if it applied cleanly.


One small detail that you probably should not change for these back 
ports but that I should have noticed in the original patch is the error 
message in Arguments::verify_MinHeapFreeRatio(). It says 
"MinHeapFreeRatio must have a value between 0 and 100" but in fact we 
only allow 0-99.

I guess this is probably not worth fixing. Not even for 9. But I just 
wanted to mention it since I should have caught it in the original review.
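
A minimal sketch of the mismatch, assuming the accepted range really is 0-99
as noted above. It is written in Java for illustration; the real check is the
C++ function Arguments::verify_MinHeapFreeRatio(), whose body is not shown in
this thread, and the class below is hypothetical. Building the message from
the same constants that the check uses keeps the two from drifting apart.

```java
final class HeapFreeRatioCheck {
    // Assumed bounds, taken from the remark that only 0-99 is allowed.
    static final int MIN = 0;
    static final int MAX = 99;

    /** Returns null when the value is accepted, otherwise an error message. */
    static String verifyMinHeapFreeRatio(int value) {
        if (value < MIN || value > MAX) {
            // The message is derived from the bounds actually enforced.
            return "MinHeapFreeRatio must have a value between " + MIN + " and " + MAX;
        }
        return null;
    }
}
```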

Thanks,
Bengt

On 2/4/14 10:17 AM, Jesper Wilhelmsson wrote:
> Hi,
>
> The patch from jdk9/hs-gc to make MinHeapFreeRatio and 
> MaxHeapFreeRatio manageable flags and supported in ParallelGC applied 
> cleanly to jdk8u/hs-dev.
>
> hsx/jdk7u60 required some manual labor to patch since the argument 
> parsing code had moved around, but the relevant logic was untouched so 
> I didn't have to change anything in the new code. For 7 I omitted some 
> of the minor cleanups that were done in the original patch.
>
> Webrev for 7: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.7u60/
> Webrev for 8: http://cr.openjdk.java.net/~jwilhelm/8028391/webrev.8u20/
>
> (The 8u20 webrev is identical with the final version for 9.)
>
> So kindly approve this change for backport to jdk8u20 and jdk7u60.
>
> The bugs:
> jdk9: https://bugs.openjdk.java.net/browse/JDK-8028391
> jdk8: https://bugs.openjdk.java.net/browse/JDK-8033209
> jdk7: https://bugs.openjdk.java.net/browse/JDK-8028720
>
> Reviews in this thread:
> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-January/009187.html 
>
>
> The jdk9 changeset:
> http://hg.openjdk.java.net/jdk9/hs-gc/hotspot/rev/44315152d434
>
> Thanks,
> /Jesper


From daniel.daugherty at oracle.com  Tue Feb  4 07:48:10 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 04 Feb 2014 08:48:10 -0700
Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets
	assert(SafepointSynchronize::is_at_safepoint()
	|| JvmtiEnv::is_thread_fully_suspended(get_thread(), false,
	&debug_bits))
In-Reply-To: <52F0CB50.3040703@oracle.com>
References: <52F0CB50.3040703@oracle.com>
Message-ID: <52F10BBA.8060504@oracle.com>

On 2/4/14 4:13 AM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>   https://bugs.openjdk.java.net/browse/JDK-8032223
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ 
>

src/share/vm/prims/jvmtiEnvBase.hpp
     No comments.

src/share/vm/prims/jvmtiEnv.cpp
     No comments beyond David's tweak to the comment.


For future work...

Looks like these VM ops also need the liveness check on the
target JavaThread:

     VM_GetOwnedMonitorInfo
     VM_GetStackTrace

JvmtiEnv::GetStackTrace() looks like it has the same
is_thread_fully_suspended() flaw...

Dan


>
> Summary:
>
>   This is the second round of review for this issue.
>   But it was decided that the JDK-8032223 must be used to cover it 
> instead of the JDK-6471769.
>   The 8032223 was initially closed as a dup of 6471769 but it has been 
> reopened now.
>
>   There is a general issue in the suspend equivalent condition mechanism:
>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
> may return different results:
>     - 1-st: true
>     - 2-nd: false
>
>   This suspend equivalent issue is covered by another bug:
>     https://bugs.openjdk.java.net/browse/JDK-6280037
>
>   The bug to fix in this review is a specific manifestation of the 
> 6280037
>   in the JVMTI GetFrameCount() that has a major impact on the SQE 
> nightly.
>   It is on the Test Stabilization radar as well as the 6280037.
>   There are many tests intermittently failing because of this.
>   I've also decided to fix the same issue in the JVMTI 
> GetFrameLocation() as well.
>
>   The JVMTI GetFrameCount() spec says:
>     "If this function is called for a thread actively executing 
> bytecodes (for example,
>      not the current thread and not suspended), the information 
> returned is transient."
>
>   So, it is Ok to call the GetFrameCount() for the non-suspended 
> target thread.
>   To achieve safety, the frame count for non-suspended threads is 
> calculated at a safepoint.
>   It should be Ok and safer to do the same for suspended threads
> as well.
>   There should be no big performance impact because it is already on a 
> slow path.
>   It is still important to avoid safepointing when the target thread 
> is current.
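
A minimal sketch of the path selection described in the summary above, as one
reading of it: before the change, a target that looked suspended was handled
directly and only non-suspended targets went through a safepoint operation;
after the change, only the current thread is handled directly. This is a Java
model for illustration, not the HotSpot C++ code, and all names here are
hypothetical.

```java
final class FrameCountPath {
    enum Path { DIRECT, SAFEPOINT_OP }

    /** Old choice as summarized: trust the "looks suspended" answer. */
    static Path before(boolean looksSuspended) {
        // The weakness: a second suspended check may later answer false.
        return looksSuspended ? Path.DIRECT : Path.SAFEPOINT_OP;
    }

    /** New choice as summarized: only the current thread skips the safepoint. */
    static Path after(boolean targetIsCurrent) {
        return targetIsCurrent ? Path.DIRECT : Path.SAFEPOINT_OP;
    }
}
```

With the second form, the result no longer depends on a suspended check that
can change between two calls.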
>
>   The bug 6280037 should go out of the Test Stabilization radar 
> (remove the svc-nightly label)
>   as most of the impacted tests must be covered by the 8032223.
>
>
> Testing:
>   In progress:
>     - nsk.jvmti, nsk.jdi, nsk.jdwp
>     - JTreg com/sun/jdi
>
>
> Thanks,
> Serguei
>


From daniel.daugherty at oracle.com  Tue Feb  4 10:52:32 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 04 Feb 2014 11:52:32 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52F08B34.1010300@oracle.com>
References: <52ED3F28.5070806@oracle.com> <52EF874A.8020603@oracle.com>
	<52EFAE1C.3050202@oracle.com> <52F08B34.1010300@oracle.com>
Message-ID: <52F136F0.5050408@oracle.com>

On 2/3/14 11:39 PM, David Holmes wrote:
> On 4/02/2014 12:56 AM, Daniel D. Daugherty wrote:
>> Adding Dave Dice to this thread...
>>
>> On 2/3/14 5:10 AM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> On 2/02/2014 4:38 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a fix ready for the following bug:
>>>>
>>>>      8028073 race condition in ObjectMonitor implementation causing
>>>> deadlocks
>>>>      https://bugs.openjdk.java.net/browse/JDK-8028073
>>>>
>>>> On the surface, this is a very simple fix that relocates a few lines of
>>>> code, relocates and rewrites the comments associated with that code, and
>>>> adds several new comments.
>>>>
>>>> Of course, in reality, the issue is much more complicated, but I'm
>>>> hoping to make it easy for anyone not acquainted with this issue to
>>>> understand what's going on.
>>>
>>> Let's spare them the trouble - Reviewed :)
>>
>> Thanks for the review!
>>
>>
>>> I'm still troubled by the rationale for the original placement of this
>>> code. It makes sense to issue the unpark prior to the park() call but
>>> on the other hand there is no agent/event code in that path that would
>>> have consumed the original unpark. ??
>>
>> I think I've acknowledged that concern in the "Risks" section below:
>>
>>     - while the old code was not effective for the hang that is being
>>       fixed with this bug, it is possible that the old code prevented
>>       a different bug in the successor protocol from manifesting
>>
>> Hopefully, Dave Dice will chime in and share the history...
>
> For the record Dave did chime in off-list and this code block dates 
> back to the JVMPI implementation which did 
> jvmpi::post_monitor_contended_exit_event at the end of exit(), which 
> is called just before the code block in question. That was back in JDK 
> 5. By JDK 6 JVMPI had been stripped out but the code block remained.

David, thanks for closing the loop on this one.

Dan


>
>>> Thanks, I know you've really enjoyed this one :)
>>
>> I actually did enjoy this one from a 'challenge to the mind' POV.
>> But I have to say that the intricacies of the successor protocol
>> gave me headaches... Hopefully, my successor.notes will help anyone
>> else that is crazy enough to wander down this path...
>
> Indeed!
>
> Thanks,
> David
>
>
>> Dan
>>
>>
>>>
>>> David
>>> -----
>>>
>>>> Here are the JDK9 webrev URLs:
>>>>
>>>> OpenJDK:
>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>>
>>>> Oracle internal:
>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ 
>>>>
>>>>
>>>> The simple summary:
>>>>
>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>>    it is possible for a JVM/TI monitor event handler to accidentally
>>>>    consume a ParkEvent.unpark() call meant for Java Monitor layer
>>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>>    https://bugs.openjdk.java.net/browse/JDK-5030359
>>>> - it's the right fix, but it's in the wrong place
>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>>    event handler is called because it is that event handler
>>>>    that can cause the hang
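
A minimal sketch of the lost-wakeup idea in the summary above, using
java.util.concurrent.locks.LockSupport as an analogy. LockSupport keeps a
single permit per thread, much like the shared ParkEvent described here, so an
unrelated short timed wait can consume a wakeup that was meant for a later
park. This is not the HotSpot ParkEvent code; the class name and the
reissueAfterHandler flag are hypothetical, and the real fix re-issues the
unpark only under the node._notified / _succ condition shown later in this
message.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

final class SharedPermitDemo {
    /** Returns roughly how long the final park blocked, in milliseconds. */
    static long finalParkMillis(boolean reissueAfterHandler) {
        Thread self = Thread.currentThread();

        // Stands in for the unpark() aimed at the Java monitor layer.
        LockSupport.unpark(self);

        // Stands in for the event handler's 1 ms raw monitor wait; it
        // returns at once because it consumes the permit set above.
        LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(1));

        if (reissueAfterHandler) {
            // Analogue of the relocated fix: restore the wakeup afterwards.
            LockSupport.unpark(self);
        }

        long start = System.nanoTime();
        // Stands in for the monitor re-entry park. It is bounded here so
        // that the demo always terminates; the real park has no timeout.
        LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(200));
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

With reissueAfterHandler set, the final park normally returns promptly;
without it, the final park normally waits out its timeout, which corresponds
to the hang when no timeout exists.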
>>>>
>>>>
>>>> Testing
>>>> -------
>>>>
>>>> - a new StressMonitorWait test has been created that reliably
>>>>    reproduces the hang in JDK[6789]; see the bug's gory details
>>>>    for the specific versions where the hang has been reproduced
>>>>    - the test reliably reproduces the hang in 5 seconds on my
>>>>      T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>>      the hang reliably on other machines
>>>>    - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>>      Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>>      bits did not reproduce the hang
>>>> - JPRT test job
>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>>    Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>>    and Windows-X64:
>>>>    - vm.quick
>>>>    - Kitchensink (bigapps)
>>>>    - Weblogic+medrec (bigapps)
>>>>    - runThese (bigapps)
>>>>
>>>>
>>>> The Gory Details Start Here
>>>> ---------------------------
>>>>
>>>> This is the old location of the block of code that's being moved:
>>>>
>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>
>>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>> <snip>
>>>> 1499    exit (true, Self) ;                    // exit the monitor
>>>> <snip>
>>>> 1513    if (node._notified != 0 && _succ == Self) {
>>>> 1514       node._event->unpark();
>>>> 1515    }
>>>>
>>>>
>>>> This is the new location of the block of code that's being moved:
>>>>
>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>
>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>> <snip>
>>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>>> <snip>
>>>> 1604        if (node._notified != 0 && _succ == Self) {
>>>> <snip>
>>>> 1620          node._event->unpark();
>>>> 1621        }
>>>>
>>>>
>>>> The Risks
>>>> ---------
>>>>
>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>>    is enabled:
>>>>    - previously it was always executed
>>>>    - while the old code was not effective for the hang that is being
>>>>      fixed with this bug, it is possible that the old code prevented
>>>>      a different bug in the successor protocol from manifesting
>>>>    - thorough analysis of the successor protocol did not reveal a
>>>>      case where the old code was needed in the old location
>>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>>    do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>>    - the successor protocol is complicated and the analysis could
>>>>      be wrong when certain options are used
>>>>    - comments were added to each location where a JVM/TI monitor
>>>>      event handler is called documenting why a fix like this one
>>>>      is not needed there
>>>>    - if the analysis is wrong, the new comments show where a new
>>>>      code change would be needed
>>>>
>>>>
>>>> The Scenario
>>>> ------------
>>>>
>>>> I've created a scenario that reproduces this hang:
>>>>
>>>> T1 - enters monitor and calls monitor.wait()
>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>>> T3 - enters and exits the monitor
>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>>
>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw 
>>>> monitor.
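
A minimal sketch of the four Java-side thread roles listed above. On its own
it does not hang: the hang also needs the JVM/TI agent with the
JVMTI_EVENT_MONITOR_WAITED handler, and the precise interleaving is not forced
here. The class and method names are hypothetical, the hold time is a
parameter instead of the 5 seconds given above, and this is not the source of
the StressMonitorWait test, which is not shown in this thread.

```java
final class MonitorScenarioSketch {
    /** Runs the four roles once; returns true if every thread finished. */
    static boolean runOnce(long holdMillis) {
        final Object monitor = new Object();
        final boolean[] notified = {false};

        // T1: enters the monitor and waits (guarded against early notify).
        Thread t1 = new Thread(() -> {
            synchronized (monitor) {
                while (!notified[0]) {
                    try {
                        monitor.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "T1");
        // T2: enters the monitor, notifies, and exits.
        Thread t2 = new Thread(() -> {
            synchronized (monitor) {
                notified[0] = true;
                monitor.notify();
            }
        }, "T2");
        // T3: enters and exits the monitor.
        Thread t3 = new Thread(() -> {
            synchronized (monitor) {
                // Nothing to do while holding the monitor.
            }
        }, "T3");
        // T4: enters the monitor, delays, and exits.
        Thread t4 = new Thread(() -> {
            synchronized (monitor) {
                try {
                    Thread.sleep(holdMillis);
                } catch (InterruptedException e) {
                    // Exit the monitor early if interrupted.
                }
            }
        }, "T4");

        Thread[] all = {t1, t2, t3, t4};
        for (Thread t : all) {
            t.start();
        }
        try {
            for (Thread t : all) {
                t.join(holdMillis + 5000);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        for (Thread t : all) {
            if (t.isAlive()) {
                return false; // a thread is stuck, as in the reported hang
            }
        }
        return true;
    }
}
```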
>>>>
>>>> Here are the six events necessary to make this hang happen:
>>>>
>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>>>
>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>>
>>>> // KEY-EVENT-3a: T3 made T1 the successor
>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>>
>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>>
>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>>
>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>>>
>>>>
>>>> This bug is intertwined with:
>>>>
>>>> - The ObjectMonitor successor protocol
>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI
>>>> RawMonitors
>>>>
>>>> There is a very long successor.notes attachment to JDK-8028073 that
>>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>>> for putting pretty much anyone to sleep.
>>>>
>>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>>> backported solution of moving the original fix to the right location.
>>>> The following new bug has been filed for possible future work in this
>>>> area by the Serviceability Team:
>>>>
>>>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>>      https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>
>>>>
>>>> The Symptoms
>>>> ------------
>>>>
>>>> With intermittent hangs like this, it is useful to know what to look
>>>> for in order to determine if you are running into this issue:
>>>>
>>>> - if you aren't using a debugger or a profiler or some other
>>>>    JVM/TI agent, then this hang is not the same as yours
>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>>    event handler, then this hang is not the same as yours
>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>>    JVM/TI RawMonitors, then this hang is not the same as yours
>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>>    back into Java code, then you might just be insane and this
>>>>    hang might be similar to yours. However, using a Java callback
>>>>    in an event handler is an even bigger problem/risk so fix that
>>>>    first.
>>>> - if you have one or more threads blocked like this and making no
>>>>    progress, then this hang might be the same as yours:
>>>>
>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>>     JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>     JavaThread state: _thread_blocked
>>>>          at java.lang.Object.wait(Native Method)
>>>>          - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>          at java.lang.Object.wait(Object.java:502)
>>>>          at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>>          - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>>     JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>     JavaThread state: _thread_blocked
>>>>          at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>>     JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>     JavaThread state: _thread_blocked
>>>>          at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> Key symptoms in thread T1:
>>>>
>>>> - had the object locked:
>>>>
>>>>    locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> - did an Object.wait():
>>>>
>>>>    waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> - is blocked on reentry:
>>>>
>>>>    waiting for monitor entry [0xfffffd7fc0231000]
>>>>
>>>> Key symptoms in thread T2:
>>>>
>>>> - is blocked waiting to lock the object:
>>>>
>>>>    waiting for monitor entry [0xfffffd7fc0130000]
>>>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> Key symptoms in thread T3:
>>>>
>>>> - is blocked waiting to lock the object:
>>>>
>>>>    waiting for monitor entry [0xfffffd7fc002f000]
>>>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>
>


From serguei.spitsyn at oracle.com  Tue Feb  4 11:25:17 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Feb 2014 11:25:17 -0800
Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets
	assert(SafepointSynchronize::is_at_safepoint()
	|| JvmtiEnv::is_thread_fully_suspended(get_thread(), false,
	&debug_bits))
In-Reply-To: <52F0D30C.20401@oracle.com>
References: <52F0CB50.3040703@oracle.com> <52F0D30C.20401@oracle.com>
Message-ID: <52F13E9D.2040105@oracle.com>

Thanks, David!
Serguei

On 2/4/14 3:46 AM, David Holmes wrote:
> Hi Serguei,
>
> Looks okay to me.
>
> Minor nit: "use a vm-op for safety" is actually "use a vm-safepoint-op 
> for safety". Not all VM ops need involve a safepoint.
>
> David
>
> On 4/02/2014 9:13 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>    https://bugs.openjdk.java.net/browse/JDK-8032223
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ 
>>
>>
>>
>> Summary:
>>
>>    This is the second round of review for this issue.
>>    But it was decided that the JDK-8032223 must be used to cover it
>> instead of the JDK-6471769.
>>    The 8032223 was initially closed as a dup of 6471769 but it has been
>> reopened now.
>>
>>    There is a general issue in the suspend equivalent condition 
>> mechanism:
>>    Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may
>> return different results:
>>      - 1-st: true
>>      - 2-nd: false
>>
>>    This suspend equivalent issue is covered by another bug:
>>      https://bugs.openjdk.java.net/browse/JDK-6280037
>>
>>    The bug to fix in this review is a specific manifestation of the 
>> 6280037
>>    in the JVMTI GetFrameCount() that has a major impact on the SQE 
>> nightly.
>>    It is on the Test Stabilization radar as well as the 6280037.
>>    There are many tests intermittently failing because of this.
>>    I've also decided to fix the same issue in the JVMTI
>> GetFrameLocation() as well.
>>
>>    The JVMTI GetFrameCount() spec says:
>>      "If this function is called for a thread actively executing
>> bytecodes (for example,
>>       not the current thread and not suspended), the information
>> returned is transient."
>>
>>    So, it is Ok to call the GetFrameCount() for the non-suspended target
>> thread.
>>    To achieve safety, the frame count for non-suspended threads is
>> calculated at a safepoint.
>>    It should be Ok and safer to do the same for suspended threads as
>> well.
>>    There should be no big performance impact because it is already on a
>> slow path.
>>    It is still important to avoid safepointing when the target thread is
>> current.
>>
>>    The bug 6280037 should go out of the Test Stabilization radar (remove
>> the svc-nightly label)
>>    as most of the impacted tests must be covered by the 8032223.
>>
>>
>> Testing:
>>    In progress:
>>      - nsk.jvmti, nsk.jdi, nsk.jdwp
>>      - JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>>


From serguei.spitsyn at oracle.com  Tue Feb  4 12:45:00 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Feb 2014 12:45:00 -0800
Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets
	assert(SafepointSynchronize::is_at_safepoint()
	|| JvmtiEnv::is_thread_fully_suspended(get_thread(), false,
	&debug_bits))
In-Reply-To: <52F10BBA.8060504@oracle.com>
References: <52F0CB50.3040703@oracle.com> <52F10BBA.8060504@oracle.com>
Message-ID: <52F1514C.1050405@oracle.com>

Thanks, Dan!
Some comments below.

On 2/4/14 7:48 AM, Daniel D. Daugherty wrote:
> On 2/4/14 4:13 AM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-8032223
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ 
>>
>
> src/share/vm/prims/jvmtiEnvBase.hpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnv.cpp
>     No comments beyond David's tweak to the comment.

Will fix it before push.

> For future work...
>
> Looks like these VM ops also need the liveness check on the
> target JavaThread:
>
>     VM_GetOwnedMonitorInfo
>     VM_GetStackTrace
>
> JvmtiEnv::GetStackTrace() looks like it has the same
> is_thread_fully_suspended() flaw...

I know about this.
There are even more issues:
   GetCurrentContendedMonitor
   GetOwnedMonitorStackDepthInfo
   NotifyFramePop

I'll check if we can use an existing bug to add this extra request.
Otherwise, will file new one.

Thanks,
Serguei

>
> Dan
>
>
>
>>
>> Summary:
>>
>>   This is the second round of review for this issue.
>>   But it was decided that the JDK-8032223 must be used to cover it 
>> instead of the JDK-6471769.
>>   The 8032223 was initially closed as a dup of 6471769 but it has 
>> been reopened now.
>>
>>   There is a general issue in the suspend equivalent condition 
>> mechanism:
>>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
>> may return different results:
>>     - 1-st: true
>>     - 2-nd: false
>>
>>   This suspend equivalent issue is covered by another bug:
>>     https://bugs.openjdk.java.net/browse/JDK-6280037
>>
>>   The bug to fix in this review is a specific manifestation of the 
>> 6280037
>>   in the JVMTI GetFrameCount() that has a major impact on the SQE 
>> nightly.
>>   It is on the Test Stabilization radar as well as the 6280037.
>>   There are many tests intermittently failing because of this.
>>   I've also decided to fix the same issue in the JVMTI 
>> GetFrameLocation() as well.
>>
>>   The JVMTI GetFrameCount() spec says:
>>     "If this function is called for a thread actively executing 
>> bytecodes (for example,
>>      not the current thread and not suspended), the information 
>> returned is transient."
>>
>>   So, it is Ok to call the GetFrameCount() for the non-suspended 
>> target thread.
>>   To achieve safety, the frame count for non-suspended threads is 
>> calculated at a safepoint.
>>   It should be Ok and safer to do the same for suspended threads
>> as well.
>>   There should be no big performance impact because it is already on 
>> a slow path.
>>   It is still important to avoid safepointing when the target thread 
>> is current.
>>
>>   The bug 6280037 should go out of the Test Stabilization radar 
>> (remove the svc-nightly label)
>>   as most of the impacted tests must be covered by the 8032223.
>>
>>
>> Testing:
>>   In progress:
>>     - nsk.jvmti, nsk.jdi, nsk.jdwp
>>     - JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>>
>


From vladimir.kozlov at oracle.com  Tue Feb  4 15:45:29 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 04 Feb 2014 15:45:29 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <52F01569.3070201@oracle.com>
References: <52F01569.3070201@oracle.com>
Message-ID: <52F17B99.6080405@oracle.com>

Changes look reasonable.

Thanks,
Vladimir

On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>    https://bugs.openjdk.java.net/browse/JDK-8025841
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1
>
>
> Summary:
>
>    The fix contributed by Oleg Mazurov to improve profiling data quality.
>    It moves the "vtable stub" dynamic code notification to the right place.
>    I've already reviewed the fix, and it looks good to me.
>
>    Bug report description:
>
>     "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled
> when
>      a new chunk of memory for subsequent vtable and itable stubs is
> allocated.
>      That chunk is uninitialized (contains zeros or garbage) although
> due to the fact
>      that the actual event delivery is deferred, at least one vtable
> comes out right.
>
>      This event should describe an individual vtable/itable stub (base
> address and size)
>      and only after it's been created (memory is actually populated with
> code).
>      Where VM diagnostic messages about vtable/itable stubs are issued upon
>      -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI
> events as well.
>
>      Getting vtables/itables right is important in the context of
> performance analysis as
>      that dynamically generated code may accumulate quite noticeable CPU
> time
>      (especially itables), sometimes larger than the actual Java methods
> called."
>
>
> Testing:
>    Oleg tested it in the Oracle Studio Performance Analyzer environment.
>    nsk.jvmti, nsk.jdi, nsk.jdwp,
>    In progress: Jtreg com/sun/jdi, java/lang/instrument
>
>
> Thanks,
> Serguei
>

From serguei.spitsyn at oracle.com  Tue Feb  4 16:14:48 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Feb 2014 16:14:48 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <52F17B99.6080405@oracle.com>
References: <52F01569.3070201@oracle.com> <52F17B99.6080405@oracle.com>
Message-ID: <52F18278.309@oracle.com>

Thanks a lot, Vladimir!
Serguei

On 2/4/14 3:45 PM, Vladimir Kozlov wrote:
> Changes look reasonable.
>
> Thanks,
> Vladimir
>
> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>    https://bugs.openjdk.java.net/browse/JDK-8025841
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 
>>
>>
>>
>> Summary:
>>
>>    The fix contributed by Oleg Mazurov to improve profiling data 
>> quality.
>>    It moves the "vtable stub" dynamic code notification to the right 
>> place.
>>    I've already reviewed the fix, and it looks good to me.
>>
>>    Bug report description:
>>
>>     "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled
>> when
>>      a new chunk of memory for subsequent vtable and itable stubs is
>> allocated.
>>      That chunk is uninitialized (contains zeros or garbage) although
>> due to the fact
>>      that the actual event delivery is deferred, at least one vtable
>> comes out right.
>>
>>      This event should describe an individual vtable/itable stub (base
>> address and size)
>>      and only after it's been created (memory is actually populated with
>> code).
>>      Where VM diagnostic messages about vtable/itable stubs are 
>> issued upon
>>      -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI
>> events as well.
>>
>>      Getting vtables/itables right is important in the context of
>> performance analysis as
>>      that dynamically generated code may accumulate quite noticeable CPU
>> time
>>      (especially itables), sometimes larger than the actual Java methods
>> called."
>>
>>
>> Testing:
>>    Oleg tested it in the Oracle Studio Performance Analyzer environment.
>>    nsk.jvmti, nsk.jdi, nsk.jdwp,
>>    In progress: Jtreg com/sun/jdi, java/lang/instrument
>>
>>
>> Thanks,
>> Serguei
>>


From staffan.larsen at oracle.com  Wed Feb  5 00:38:27 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Wed, 5 Feb 2014 09:38:27 +0100
Subject: RFR(S): JDK-8029808 com/sun/jdi/ProcessAttachTest.sh times out 
In-Reply-To: <52EFFCB5.2030004@oracle.com>
References: <79C6D4D7-F8A0-4A33-AF7A-D1007F227A86@oracle.com>
	<52EA7F46.5070405@oracle.com>
	<16D6BA50-82FE-41F6-8423-FE218D397C58@oracle.com>
	<52EA84DC.5090008@oracle.com> <52EA872E.7070209@oracle.com>
	<D1A333D9-1B29-443E-BFD7-E140F49B53F6@oracle.com>
	<52EFFCB5.2030004@oracle.com>
Message-ID: <F583900C-0F38-4E98-B1BB-E3D0102222C8@oracle.com>

Dmitry, Alan: Thanks!

On 3 feb 2014, at 21:31, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:

> Staffan,
> 
> Looks good for me!
> 
> -Dmitry
> 
> On 2014-02-03 23:18, Staffan Larsen wrote:
>> OK, my last try at this was very broken - I'm glad it didn't pass the review.
>> 
>> Here is a new try: http://cr.openjdk.java.net/~sla/8029808/webrev.01/
>> 
>> As Alan says, we can't use jps on the (second) debuggee because it is started with suspend=y. In that mode it is waiting for a debugger to attach before it has started executing any Java code and before jps is able to identify it.
>> 
>> I did find a race in the test, though. When the first debuggee is started (suspend=n) there is a race where the debugger process can attach and finish before the debuggee has written out the portfile. In that case ShutdownDebuggee will fail. I have seen this failure mode happen and I can force it by inserting a Thread.sleep into the debuggee. I solved this by waiting for the port file to appear in the suspend=n case as well.
>> 
>> There is also a potential problem where both debuggees use the same output file and we may think the second debuggee has started before it actually has (because the output file isn't empty). I solved that by creating different output files for the two test cases.
>> 
>> I've also redirected all error streams so any errors aren't lost.
>> 
>> Thanks,
>> /Staffan
>> 
>> On 30 jan 2014, at 18:09, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>> 
>>> On 30/01/2014 16:59, Dmitry Samersoff wrote:
>>>> Staffan,
>>>> 
>>>> No, it's not about your code. Sorry for not being clear enough.
>>>> 
>>>> You cleaned up unix code, but windows code remains bad. Particularly, it
>>>> has bad unconditional sleep 2
>>>> 
>>>> I would propose replacing *windows manipulation with CYGWIN/MKS pids* with
>>>> a call to JPS that returns the windows pid
>>>> 
>>> I have a vague memory that we had to use ps to get the pid because of the suspend=y test which causes the debuggee to suspend during startup. I might be wrong on this of course, it was a long time ago.
>>> 
>>> -Alan
>> 
> 
> 
> -- 
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the sources.


From erik.helin at oracle.com  Wed Feb  5 02:40:30 2014
From: erik.helin at oracle.com (Erik Helin)
Date: Wed, 05 Feb 2014 11:40:30 +0100
Subject: JDK-7090324: gclog rotation via external tool
In-Reply-To: <52EA33B3.6080409@lab.ntt.co.jp>
References: <52483BDB.8040206@ysfactory.dip.jp>	<52496A21.8080608@ysfactory.dip.jp>	<527CDD56.7080106@ysfactory.dip.jp>	<52A09642.4030609@ysfactory.dip.jp>	<52E26FA2.40909@ysfactory.dip.jp>	<0E26045D-F7F9-49BC-AB36-A42C1DC6E64E@oracle.com>
	<52E91000.9010600@ysfactory.dip.jp> <52E91AAA.3060008@oracle.com>
	<52E9248D.2090108@ysfactory.dip.jp>
	<52E9FA5B.6010306@lab.ntt.co.jp>
	<5B189F0A-7408-47C7-9719-DC2990355209@oracle.com>
	<52EA33B3.6080409@lab.ntt.co.jp>
Message-ID: <52F2151E.4040805@oracle.com>

Hi Yasumasa,

I've looked through the latest patch, it is much better! I just have two 
comments:

- ostream.hpp:
   Why did you add GCLogFileSize != 0 in should_rotate? The old check
   just checked that _bytes_written > GCLogFileSize.

- TestGCLogRotationViaJcmd.java:
   Could you use the helper class JDKToolLauncher to start jcmd? The
   code would then be slightly easier to read:

for (int times = 1; times < NUM_LOGS; times++) {
     // Run jcmd <pid> GC.rotate_log
     JDKToolLauncher jcmd = JDKToolLauncher.create("jcmd")
                                           .addToolArg(pid)
                                           .addToolArg("GC.rotate_log");
     ProcessBuilder pb = new ProcessBuilder(jcmd.getCommand());

     // Make sure we didn't crash
     OutputAnalyzer output = new OutputAnalyzer(pb.start());
     output.shouldHaveExitValue(0);
}

Thanks,
Erik

On 01/30/2014 12:12 PM, Yasumasa Suenaga wrote:
> Hi Staffan,
>
> I've uploaded new webrev:
> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.03/
>
> On 2014/01/30 17:23, Staffan Larsen wrote:
>> Would it be possible for the Diagnostic Command to output the location
>> of the rotated log? When invoking the command it would be good to get
>> some kind of feedback.
>
> I changed rotate_log() to redirect messages to jcmd.
> If GC.rotate_log is executed, we can get messages on jcmd console as below:
> ------------
> $ jcmd 18976 GC.rotate_log
> 18976:
> 2014-01-30 19:59:39 GC log rotation request has been received. Saved as
> test.log.0
> 2014-01-30 19:59:39 GC log file created test.log.1
> ------------
>
>
>> test/gc/7090324/Test7090324.java:
>> - I think this needs to have the Oracle copyright notice as well.
>> - Tests should now use descriptive names, not bug numbers:
>> https://wiki.openjdk.java.net/display/HotSpot/Naming+HotSpot+JTReg+Tests
>> - nits: lots of missing spaces before '{', and after 'for', 'if'
>> - line 47: you don't need to clean up old files, jtreg will give you a
>> fresh scratch directory to run in
>
> I've fixed.
> Could you review again?
>
>
> Thanks,
>
> Yasumasa
>
>> /Staffan
>>
>>
>>
>> On 30 jan 2014, at 08:08, Yasumasa
>> Suenaga<suenaga.yasumasa at lab.ntt.co.jp>  wrote:
>>
>>> Hi Erik, Staffan,
>>>
>>> I've uploaded new webrev. Could you review this ?
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.02/
>>>
>>> This patch includes fixes from comments of Staffan and Erik.
>>>
>>> And I created new test of this patch as Test7090324 .
>>> This test works fine with jtreg.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>> On 2014/01/30 0:55, Yasumasa Suenaga wrote:
>>>> Hi Erik,
>>>>
>>>> On 2014/01/30 0:13, Erik Helin wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> (have to use HTML email to get a width of more than 78 chars, sorry)
>>>>>
>>>>> why did you change the code in arguments.cpp in the method
>>>>> check_gc_log_consistency?
>>>>
>>>> In current implementation, check_gclog_consistency() checks three
>>>> parameters:
>>>>
>>>> - GC log filename
>>>> - NumberOfGCLogFiles
>>>> - GCLogFileSize
>>>>
>>>> My customer uses external trigger "ONLY" for rotating logs.
>>>> If they want to do that, GCLogFileSize is not needed.
>>>>
>>>>
>>>>> Next, the gcLogFileStream::rotate_log method now does a lot of things.
>>>>> Could you separate out the first block into a new method,
>>>>> gcLogFileStream::should_rotate(bool force)?
>>>>>
>>>>> This way, the code would read:
>>>>>
>>>>>> bool gcLogFileStream::should_rotate(bool force) {
>>>>>> return force || _bytes_written >= GCLogFileSize;
>>>>>> }
>>>>>>
>>>>>> void gcLogFileStream::rotate_log(bool force) {
>>>>>> char time_msg[FILENAMEBUFLEN];
>>>>>> char time_str[EXTRACHARLEN];
>>>>>> char current_file_name[FILENAMEBUFLEN];
>>>>>> char renamed_file_name[FILENAMEBUFLEN];
>>>>>>
>>>>>> if (!should_rotate(force)) {
>>>>>> return;
>>>>>> }
>>>>>>
>>>>>> ...
>>>>>> }
>>>>>
>>>>> Could you please update your patch?
>>>>
>>>> I will do that.
>>>>
>>>>
>>>>> There is a new empty line in the rotate_log method:
>>>>>
>>>>>> }
>>>>>> +
>>>>>> #ifdef ASSERT
>>>>>
>>>>> could you please remove it?
>>>>
>>>> I will do that.
>>>>
>>>>
>>>>> The logging change in rotate_log uses a different kind of if/else
>>>>> syntax
>>>>> than the rest of the file:
>>>>>
>>>>>> if (force) {
>>>>>> ...
>>>>>> }
>>>>>> else {
>>>>>> ...
>>>>>> }
>>>>>
>>>>> The other if/else statements in the file uses:
>>>>>
>>>>>> if (cond) {
>>>>>> ...
>>>>>> } else {
>>>>>> ...
>>>>>> }
>>>>>
>>>>> Could you please update your change to use the same if/else syntax?
>>>>
>>>> I will do that.
>>>>
>>>>
>>>>> This part of the change duplicates the code:
>>>>>
>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log rotation
>>>>> request has been received. Saved as %s\n",
>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>> + renamed_file_name);
>>>>> + }
>>>>> + else {
>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log file has
>>>>> reached the"
>>>>> " maximum size. Saved as %s\n",
>>>>> - os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>> renamed_file_name);
>>>>>
>>>>> Could you instead just change the message, as in:
>>>>>
>>>>>> const char* msg = force ?
>>>>>> "%s GC log rotation request has been received. Saved as %s\n" :
>>>>>> "%s GC log file has reached the maximum size. Saved as %s\n";
>>>>>> jio_snprintf(time_msg, sizeof(time_msg), msg,
>>>>>> os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>> renamed_file_name);
>>>>
>>>> I will do that.
>>>>
>>>>
>>>>> The declaration of rotate_log in ostream.hpp still uses the old
>>>>> variable name is_force, it should use force,
>>>>> just as the definition.
>>>>
>>>> Sorry, I will fix it.
>>>>
>>>>
>>>>> Finally, could you add a test that tests your change? Have a look
>>>>> at the other tests
>>>>> in hotspot/test/gc to see how you can do it
>>>>> (you might want to use some functionality from
>>>>> hotspot/test/testlibrary).
>>>>
>>>> I found three tests as following:
>>>>
>>>> [ysuenaga at xelvis test]$ find . -iname "*jcmd*"
>>>> ./runtime/NMT/JcmdWithNMTDisabled.java
>>>> ./runtime/NMT/JcmdScale.java
>>>> ./gc/TestG1ZeroPGCTJcmdThreadPrint.java
>>>>
>>>> I understand that these tests checks output (stdout/stderr) with
>>>> OutputAnalyzer.
>>>> However, my patch affects target VM. So I guess current test cannot
>>>> check
>>>> that GC log rotation is succeeded.
>>>>
>>>> Should I make test which checks exit value of jcmd ?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>> Thanks,
>>>>> Erik
>>>>>
>>>>> On 2014-01-29 15:28, Yasumasa Suenaga wrote:
>>>>>> Hi Staffan,
>>>>>>
>>>>>> Thank you for reviewing!
>>>>>> I've uploaded new webrev.
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.01/
>>>>>>
>>>>>> On 2014/01/29 20:56, Staffan Larsen wrote:
>>>>>>> Yasumasa,
>>>>>>>
>>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>>> no comments
>>>>>>>
>>>>>>> src/share/vm/runtime/safepoint.cpp
>>>>>>> I was surprised that gc log size was checked after each safe
>>>>>>> point. That seems an unnecessary burden to place on a safe point.
>>>>>>> Instead we should switch to a periodic task that checks the gc
>>>>>>> log size. However, this is unrelated to your patch, so please
>>>>>>> ignore for now.
>>>>>>
>>>>>> Agree.
>>>>>> However, I think that PeriodicTask also is not appropriate for this.
>>>>>>
>>>>>> Size of GC log file is increased when GC is occurred.
>>>>>> So I think rotate function should be called at entry of each GC
>>>>>> events
>>>>>> e.g. VM_GC_Operation::doit_prologue() etc...
>>>>>>
>>>>>>
>>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>> line 402: nit: missing space before {
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>>
>>>>>>> line 405: I think 'force' is a better name than 'is_force'
>>>>>>
>>>>>> I removed "force" option from DCmd.
>>>>>> So I removed this from VMOperation.
>>>>>>
>>>>>>
>>>>>>> src/share/vm/services/diagnosticCommand.cpp
>>>>>>> line 666: What does this do without the -force option? It looks
>>>>>>> to me that the non-force case will happen after each safe point
>>>>>>> (see above) and that there is no need to ever do this from a
>>>>>>> diagnostic command. Can we remove the option?
>>>>>>
>>>>>> Indeed.
>>>>>> I removed "force" option.
>>>>>>
>>>>>>
>>>>>>> line 677: "Target VM does not support GC log file rotation."
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>>
>>>>>>> nits: some missing spaces before '{' and after 'if'
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>>
>>>>>>> src/share/vm/services/diagnosticCommand.hpp
>>>>>>> I think RotateGCLogDCmd should require the "control" permission
>>>>>>> when executed via JMX, so please add:
>>>>>>> static const JavaPermission permission() {
>>>>>>> JavaPermission p = {"java.lang.management.ManagementPermission",
>>>>>>> "control", NULL};
>>>>>>> return p;
>>>>>>> }
>>>>>>
>>>>>> Added.
>>>>>>
>>>>>>
>>>>>>> line 394: Maybe "Force the GC log file to be rotated." is a
>>>>>>> better description?
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>>
>>>>>>> src/share/vm/utilities/ostream.cpp
>>>>>>> line 662: I think 'force' is a better name than 'is_force'
>>>>>>> line 668: The comment says exactly the same thing as the code so
>>>>>>> I think it can be skipped
>>>>>>> line 671: "GC log file rotation occurs by external trigger ONLY."
>>>>>>> line 675: "not need" ->  "no need"
>>>>>>> line 718: "GC log rotation request has been received"
>>>>>>
>>>>>> Fixed them.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> src/share/vm/utilities/ostream.hpp
>>>>>>> no comments
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 24 jan 2014, at 14:50, Yasumasa
>>>>>>> Suenaga<yasu at ysfactory.dip.jp>  wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I've created webrev:
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.00/
>>>>>>>>
>>>>>>>> This patch works fine on current jdk9/hs-rt .
>>>>>>>> Could you review this?
>>>>>>>>
>>>>>>>>
>>>>>>>> I am just an Author. So I need a sponsor.
>>>>>>>> Could you help me?
>>>>>>>>
>>>>>>>>
>>>>>>>> Please cooperate.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2013/12/06 0:05, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Did someone read my email?
>>>>>>>>> I really hope to merge "JDK-7090324: gclog rotation via
>>>>>>>>> external tool" .
>>>>>>>>>
>>>>>>>>> I hear that someone need this RFE. So I want to discuss about
>>>>>>>>> this.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>> On 2013/11/08 21:47, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Did someone read my mail?
>>>>>>>>>>
>>>>>>>>>> I think that this RFE helps us to watch Java heap on
>>>>>>>>>> production system.
>>>>>>>>>> Also I think this RFE is able to be part of the JEP 158
>>>>>>>>>> (Unified JVM Logging) .
>>>>>>>>>>
>>>>>>>>>> I want to update this RFE in JDK Bug System, but I don't have
>>>>>>>>>> account.
>>>>>>>>>> So I've posted email at first.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2013/09/30 21:10, Yasumasa Suenaga wrote:
>>>>>>>>>>> In previous email, I've attached new patch for this RFE.
>>>>>>>>>>> It works fine with current hsx.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>> On 2013/09/29 23:40, Yasu wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> We are using "logrotate" tool on RHEL for various log rotation.
>>>>>>>>>>>> Current HotSpot has gclog rotation function for log size base,
>>>>>>>>>>>> however I need to rotate gc log synchronizing with logrotate
>>>>>>>>>>>> tool.
>>>>>>>>>>>>
>>>>>>>>>>>> So I've created RFE as "JDK-7090324: gclog rotation via
>>>>>>>>>>>> external tool" .
>>>>>>>>>>>> And Sr. Engineering Manager in Oracle said he use the
>>>>>>>>>>>> essence of my patch in one
>>>>>>>>>>>> of the jcmd subcommands.
>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-September/003274.html
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2 years ago, I posted a patch for this RFE.
>>>>>>>>>>>> But this patch is too old to apply for current HotSpot.
>>>>>>>>>>>>
>>>>>>>>>>>> In last month, a similar discussion was appeared in ML.
>>>>>>>>>>>> So I think it's time to discuss this RFE.
>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-August/008029.html
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Please cooperate.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>
>


From yasu at ysfactory.dip.jp  Wed Feb  5 04:09:04 2014
From: yasu at ysfactory.dip.jp (Yasumasa Suenaga)
Date: Wed, 05 Feb 2014 21:09:04 +0900
Subject: JDK-7090324: gclog rotation via external tool
In-Reply-To: <52F2151E.4040805@oracle.com>
References: <52483BDB.8040206@ysfactory.dip.jp>	<52496A21.8080608@ysfactory.dip.jp>	<527CDD56.7080106@ysfactory.dip.jp>	<52A09642.4030609@ysfactory.dip.jp>	<52E26FA2.40909@ysfactory.dip.jp>	<0E26045D-F7F9-49BC-AB36-A42C1DC6E64E@oracle.com>
	<52E91000.9010600@ysfactory.dip.jp> <52E91AAA.3060008@oracle.com>
	<52E9248D.2090108@ysfactory.dip.jp>
	<52E9FA5B.6010306@lab.ntt.co.jp>
	<5B189F0A-7408-47C7-9719-DC2990355209@oracle.com>
	<52EA33B3.6080409@lab.ntt.co.jp> <52F2151E.4040805@oracle.com>
Message-ID: <52F229E0.3010400@ysfactory.dip.jp>

Hi Erik,

Thank you for reviewing again!
I've uploaded a new webrev.

On 02/05/2014 07:40 PM, Erik Helin wrote:
> Hi Yasumasa,
>
> I've looked through the latest patch, it is much better! I just have 
> two comments:
>
> - ostream.hpp:
>   Why did you add GCLogFileSize != 0 in should_rotate? The old check
>   just checked that _bytes_written > GCLogFileSize.

The default value of GCLogFileSize is 0 in globals.hpp.
So if this check is missing, should_rotate() always returns true.
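
To make the reasoning concrete, here is a minimal sketch of the guarded
check. It is not the code from the webrev: GcLogStream, gc_log_file_size
and bytes_written are hypothetical stand-ins for gcLogFileStream,
GCLogFileSize and _bytes_written, and the >= comparison follows the
suggestion made earlier in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the GC log stream state (not the HotSpot class).
struct GcLogStream {
    std::size_t gc_log_file_size;  // stands in for GCLogFileSize; 0 = no size limit set
    std::size_t bytes_written;     // stands in for _bytes_written

    // Rotate on an explicit request, or when a non-zero size limit is reached.
    // Without the "!= 0" guard, bytes_written >= 0 is always true for an
    // unsigned counter, so a default limit of 0 would rotate on every call.
    bool should_rotate(bool force) const {
        return force ||
               (gc_log_file_size != 0 && bytes_written >= gc_log_file_size);
    }
};
```

With the limit left at 0, only an explicit request (force == true) asks
for a rotation; with a non-zero limit, the size comparison works as before.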


> - TestGCLogRotationViaJcmd.java:
>   Could you use the helper class JDKToolLauncher to start jcmd? The
>   code would then be slightly easier to read:
>
> for (int times = 1; times < NUM_LOGS; times++) {
>     // Run jcmd <pid> GC.rotate_log
>     JDKToolLauncher jcmd = JDKToolLauncher.create("jcmd")
>                                           .addToolArg(pid)
>                                           .addToolArg("GC.rotate_log");
>     ProcessBuilder pb = new ProcessBuilder(jcmd.getCommand());
>
>     // Make sure we didn't crash
>     OutputAnalyzer output = new OutputAnalyzer(pb.start());
>     output.shouldHaveExitValue(0);
> }

I've fixed it. Could you check the patch?


Thanks,

Yasumasa


> Thanks,
> Erik
>
> On 01/30/2014 12:12 PM, Yasumasa Suenaga wrote:
>> Hi Staffan,
>>
>> I've uploaded new webrev:
>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.03/
>>
>> On 2014/01/30 17:23, Staffan Larsen wrote:
>>> Would it be possible for the Diagnostic Command to output the location
>>> of the rotated log? When invoking the command it would be good to get
>>> some kind of feedback.
>>
>> I changed rotate_log() to redirect messages to jcmd.
>> If GC.rotate_log is executed, we can get messages on jcmd console as 
>> below:
>> ------------
>> $ jcmd 18976 GC.rotate_log
>> 18976:
>> 2014-01-30 19:59:39 GC log rotation request has been received. Saved as
>> test.log.0
>> 2014-01-30 19:59:39 GC log file created test.log.1
>> ------------
>>
>>
>>> test/gc/7090324/Test7090324.java:
>>> - I think this needs to have the Oracle copyright notice as well.
>>> - Tests should now use descriptive names, not bug numbers:
>>> https://wiki.openjdk.java.net/display/HotSpot/Naming+HotSpot+JTReg+Tests 
>>>
>>> - nits: lots of missing spaces before '{', and after 'for', 'if'
>>> - line 47: you don't need to clean up old files, jtreg will give you a
>>> fresh scratch directory to run in
>>
>> I've fixed.
>> Could you review again?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>> /Staffan
>>>
>>>
>>>
>>> On 30 jan 2014, at 08:08, Yasumasa
>>> Suenaga<suenaga.yasumasa at lab.ntt.co.jp>  wrote:
>>>
>>>> Hi Erik, Staffan,
>>>>
>>>> I've uploaded new webrev. Could you review this ?
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.02/
>>>>
>>>> This patch includes fixes from comments of Staffan and Erik.
>>>>
>>>> And I created new test of this patch as Test7090324 .
>>>> This test works fine with jtreg.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>> On 2014/01/30 0:55, Yasumasa Suenaga wrote:
>>>>> Hi Erik,
>>>>>
>>>>> On 2014/01/30 0:13, Erik Helin wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> (have to use HTML email to get a width of more than 78 chars, sorry)
>>>>>>
>>>>>> why did you change the code in arguments.cpp in the method
>>>>>> check_gc_log_consistency?
>>>>>
>>>>> In current implementation, check_gclog_consistency() checks three
>>>>> parameters:
>>>>>
>>>>> - GC log filename
>>>>> - NumberOfGCLogFiles
>>>>> - GCLogFileSize
>>>>>
>>>>> My customer uses external trigger "ONLY" for rotating logs.
>>>>> If they want to do that, GCLogFileSize is not needed.
>>>>>
>>>>>
>>>>>> Next, the gcLogFileStream::rotate_log method now does a lot of 
>>>>>> things.
>>>>>> Could you separate out the first block into a new method,
>>>>>> gcLogFileStream::should_rotate(bool force)?
>>>>>>
>>>>>> This way, the code would read:
>>>>>>
>>>>>>> bool gcLogFileStream::should_rotate(bool force) {
>>>>>>> return force || _bytes_written >= GCLogFileSize;
>>>>>>> }
>>>>>>>
>>>>>>> void gcLogFileStream::rotate_log(bool force) {
>>>>>>> char time_msg[FILENAMEBUFLEN];
>>>>>>> char time_str[EXTRACHARLEN];
>>>>>>> char current_file_name[FILENAMEBUFLEN];
>>>>>>> char renamed_file_name[FILENAMEBUFLEN];
>>>>>>>
>>>>>>> if (!should_rotate(force)) {
>>>>>>> return;
>>>>>>> }
>>>>>>>
>>>>>>> ...
>>>>>>> }
>>>>>>
>>>>>> Could you please update your patch?
>>>>>
>>>>> I will do that.
>>>>>
>>>>>
>>>>>> There is a new empty line in the rotate_log method:
>>>>>>
>>>>>>> }
>>>>>>> +
>>>>>>> #ifdef ASSERT
>>>>>>
>>>>>> could you please remove it?
>>>>>
>>>>> I will do that.
>>>>>
>>>>>
>>>>>> The logging change in rotate_log uses a different kind of if/else
>>>>>> syntax
>>>>>> than the rest of the file:
>>>>>>
>>>>>>> if (force) {
>>>>>>> ...
>>>>>>> }
>>>>>>> else {
>>>>>>> ...
>>>>>>> }
>>>>>>
>>>>>> The other if/else statements in the file uses:
>>>>>>
>>>>>>> if (cond) {
>>>>>>> ...
>>>>>>> } else {
>>>>>>> ...
>>>>>>> }
>>>>>>
>>>>>> Could you please update your change to use the same if/else syntax?
>>>>>
>>>>> I will do that.
>>>>>
>>>>>
>>>>>> This part of the change duplicates the code:
>>>>>>
>>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log rotation
>>>>>> request has been received. Saved as %s\n",
>>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>> + renamed_file_name);
>>>>>> + }
>>>>>> + else {
>>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log file has
>>>>>> reached the"
>>>>>> " maximum size. Saved as %s\n",
>>>>>> - os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>> renamed_file_name);
>>>>>>
>>>>>> Could you instead just change the message, as in:
>>>>>>
>>>>>>> const char* msg = force ?
>>>>>>> "%s GC log rotation request has been received. Saved as %s\n" :
>>>>>>> "%s GC log file has reached the maximum size. Saved as %s\n";
>>>>>>> jio_snprintf(time_msg, sizeof(time_msg), msg,
>>>>>>> os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>> renamed_file_name);
>>>>>
>>>>> I will do that.
>>>>>
>>>>>
>>>>>> The declaration of rotate_log in ostream.hpp still uses the old
>>>>>> variable name is_force, it should use force,
>>>>>> just as the definition.
>>>>>
>>>>> Sorry, I will fix it.
>>>>>
>>>>>
>>>>>> Finally, could you add a test that tests your change? Have a look
>>>>>> at the other tests
>>>>>> in hotspot/test/gc to see how you can do it
>>>>>> (you might want to use some functionality from
>>>>>> hotspot/test/testlibrary).
>>>>>
>>>>> I found three tests as following:
>>>>>
>>>>> [ysuenaga at xelvis test]$ find . -iname "*jcmd*"
>>>>> ./runtime/NMT/JcmdWithNMTDisabled.java
>>>>> ./runtime/NMT/JcmdScale.java
>>>>> ./gc/TestG1ZeroPGCTJcmdThreadPrint.java
>>>>>
>>>>> I understand that these tests checks output (stdout/stderr) with
>>>>> OutputAnalyzer.
>>>>> However, my patch affects target VM. So I guess current test cannot
>>>>> check
>>>>> that GC log rotation is succeeded.
>>>>>
>>>>> Should I make test which checks exit value of jcmd ?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>> Thanks,
>>>>>> Erik
>>>>>>
>>>>>> On 2014-01-29 15:28, Yasumasa Suenaga wrote:
>>>>>>> Hi Staffan,
>>>>>>>
>>>>>>> Thank you for reviewing!
>>>>>>> I've uploaded new webrev.
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.01/
>>>>>>>
>>>>>>> On 2014/01/29 20:56, Staffan Larsen wrote:
>>>>>>>> Yasumasa,
>>>>>>>>
>>>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>>>> no comments
>>>>>>>>
>>>>>>>> src/share/vm/runtime/safepoint.cpp
>>>>>>>> I was surprised that gc log size was checked after each safe
>>>>>>>> point. That seems an unnecessary burden to place on a safe point.
>>>>>>>> Instead we should switch to a periodic task that checks the gc
>>>>>>>> log size. However, this is unrelated to your patch, so please
>>>>>>>> ignore for now.
>>>>>>>
>>>>>>> Agree.
>>>>>>> However, I think that PeriodicTask also is not appropriate for 
>>>>>>> this.
>>>>>>>
>>>>>>> Size of GC log file is increased when GC is occurred.
>>>>>>> So I think rotate function should be called at entry of each GC
>>>>>>> events
>>>>>>> e.g. VM_GC_Operation::doit_prologue() etc...
>>>>>>>
>>>>>>>
>>>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>>> line 402: nit: missing space before {
>>>>>>>
>>>>>>> Fixed.
>>>>>>>
>>>>>>>
>>>>>>>> line 405: I think 'force' is a better name than 'is_force'
>>>>>>>
>>>>>>> I removed "force" option from DCmd.
>>>>>>> So I removed this from VMOperation.
>>>>>>>
>>>>>>>
>>>>>>>> src/share/vm/services/diagnosticCommand.cpp
>>>>>>>> line 666: What does this do without the -force option? It looks
>>>>>>>> to me that the non-force case will happen after each safe point
>>>>>>>> (see above) and that there is no need to ever do this from a
>>>>>>>> diagnostic command. Can we remove the option?
>>>>>>>
>>>>>>> Indeed.
>>>>>>> I removed "force" option.
>>>>>>>
>>>>>>>
>>>>>>>> line 677: "Target VM does not support GC log file rotation."
>>>>>>>
>>>>>>> Fixed.
>>>>>>>
>>>>>>>
>>>>>>>> nits: some missing spaces before '{' and after 'if'
>>>>>>>
>>>>>>> Fixed.
>>>>>>>
>>>>>>>
>>>>>>>> src/share/vm/services/diagnosticCommand.hpp
>>>>>>>> I think RotateGCLogDCmd should require the "control" permission
>>>>>>>> when executed via JMX, so please add:
>>>>>>>> static const JavaPermission permission() {
>>>>>>>> JavaPermission p = {"java.lang.management.ManagementPermission",
>>>>>>>> "control", NULL};
>>>>>>>> return p;
>>>>>>>> }
>>>>>>>
>>>>>>> Added.
>>>>>>>
>>>>>>>
>>>>>>>> line 394: Maybe "Force the GC log file to be rotated." is a
>>>>>>>> better description?
>>>>>>>
>>>>>>> Fixed.
>>>>>>>
>>>>>>>
>>>>>>>> src/share/vm/utilities/ostream.cpp
>>>>>>>> line 662: I think 'force' is a better name than 'is_force'
>>>>>>>> line 668: The comment says exactly the same thing as the code so
>>>>>>>> I think it can be skipped
>>>>>>>> line 671: "GC log file rotation occurs by external trigger ONLY."
>>>>>>>> line 675: "not need" ->  "no need"
>>>>>>>> line 718: "GC log rotation request has been received"
>>>>>>>
>>>>>>> Fixed them.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> src/share/vm/utilities/ostream.hpp
>>>>>>>> no comments
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> /Staffan
>>>>>>>>
>>>>>>>> On 24 jan 2014, at 14:50, Yasumasa
>>>>>>>> Suenaga<yasu at ysfactory.dip.jp>  wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I've created webrev:
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.00/
>>>>>>>>>
>>>>>>>>> This patch works fine on current jdk9/hs-rt .
>>>>>>>>> Could you review this?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am just an Author. So I need a sponsor.
>>>>>>>>> Could you help me?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Please cooperate.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2013/12/06 0:05, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Did someone read my email?
>>>>>>>>>> I really hope to merge "JDK-7090324: gclog rotation via
>>>>>>>>>> external tool" .
>>>>>>>>>>
>>>>>>>>>> I hear that someone needs this RFE. So I want to discuss
>>>>>>>>>> this.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>> On 2013/11/08 21:47, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Did someone read my mail?
>>>>>>>>>>>
>>>>>>>>>>> I think that this RFE helps us to watch Java heap on
>>>>>>>>>>> production system.
>>>>>>>>>>> Also I think this RFE is able to be part of the JEP 158
>>>>>>>>>>> (Unified JVM Logging) .
>>>>>>>>>>>
>>>>>>>>>>> I want to update this RFE in JDK Bug System, but I don't have
>>>>>>>>>>> account.
>>>>>>>>>>> So I've posted email at first.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2013/09/30 21:10, Yasumasa Suenaga wrote:
>>>>>>>>>>>> In previous email, I've attached new patch for this RFE.
>>>>>>>>>>>> It works fine with current hsx.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>> On 2013/09/29 23:40, Yasu wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> We are using "logrotate" tool on RHEL for various log 
>>>>>>>>>>>>> rotation.
>>>>>>>>>>>>> HotSpot currently has a size-based gclog rotation
>>>>>>>>>>>>> function,
>>>>>>>>>>>>> however I need to rotate the gc log in sync with the logrotate
>>>>>>>>>>>>> tool.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I've created RFE as "JDK-7090324: gclog rotation via
>>>>>>>>>>>>> external tool" .
>>>>>>>>>>>>> And Sr. Engineering Manager in Oracle said he use the
>>>>>>>>>>>>> essence of my patch in one
>>>>>>>>>>>>> of the jcmd subcommands.
>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-September/003274.html 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2 years ago, I posted a patch for this RFE.
>>>>>>>>>>>>> But this patch is too old to apply to current HotSpot.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Last month, a similar discussion appeared on the ML.
>>>>>>>>>>>>> So I think it's time to discuss this RFE.
>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-August/008029.html 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please cooperate.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>
>>
>


From yasu at ysfactory.dip.jp  Wed Feb  5 04:13:20 2014
From: yasu at ysfactory.dip.jp (Yasumasa Suenaga)
Date: Wed, 05 Feb 2014 21:13:20 +0900
Subject: JDK-7090324: gclog rotation via external tool
In-Reply-To: <52F229E0.3010400@ysfactory.dip.jp>
References: <52483BDB.8040206@ysfactory.dip.jp>	<52496A21.8080608@ysfactory.dip.jp>	<527CDD56.7080106@ysfactory.dip.jp>	<52A09642.4030609@ysfactory.dip.jp>	<52E26FA2.40909@ysfactory.dip.jp>	<0E26045D-F7F9-49BC-AB36-A42C1DC6E64E@oracle.com>
	<52E91000.9010600@ysfactory.dip.jp> <52E91AAA.3060008@oracle.com>
	<52E9248D.2090108@ysfactory.dip.jp>
	<52E9FA5B.6010306@lab.ntt.co.jp>
	<5B189F0A-7408-47C7-9719-DC2990355209@oracle.com>
	<52EA33B3.6080409@lab.ntt.co.jp> <52F2151E.4040805@oracle.com>
	<52F229E0.3010400@ysfactory.dip.jp>
Message-ID: <52F22AE0.1000008@ysfactory.dip.jp>

Sorry, I forgot to paste URL of new webrev :-P
http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.04/


Yasumasa

On 02/05/2014 09:09 PM, Yasumasa Suenaga wrote:
> Hi Erik,
>
> Thank you for reviewing again!
> I've updated new webrev.
>
> On 02/05/2014 07:40 PM, Erik Helin wrote:
>> Hi Yasumasa,
>>
>> I've looked through the latest patch, it is much better! I just have 
>> two comments:
>>
>> - ostream.hpp:
>>   Why did you add GCLogFileSize != 0 in should_rotate? The old check
>>   just checked that _bytes_written > GCLogFileSize.
>
> Default value of GCLogFileSize is "0" in globals.hpp.
> So if this check is missing, should_rotate() returns true at any time.
>
>
>> - TestGCLogRotationViaJcmd.java:
>>   Could you use the helper class JDKToolLauncher to start jcmd? The
>>   code would then be slightly easier to read:
>>
>> for (int times = 1; times < NUM_LOGS; times++) {
>>     // Run jcmd <pid> GC.rotate_log
>>     JDKToolLauncher jcmd = JDKToolLauncher.create("jcmd")
>>                                           .addToolArg(pid)
>>                                           .addToolArg("GC.rotate_log");
>>     ProcessBuilder pb = new ProcessBuilder(jcmd.getCommand());
>>
>>     // Make sure we didn't crash
>>     OutputAnalyzer output = new OutputAnalyzer(pb.start());
>>     output.shouldHaveExitValue(0);
>> }
>
> I've fixed. Could you check the patch?
>
>
> Thanks,
>
> Yasumasa
>
>
>> Thanks,
>> Erik
>>
>> On 01/30/2014 12:12 PM, Yasumasa Suenaga wrote:
>>> Hi Staffan,
>>>
>>> I've uploaded new webrev:
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.03/
>>>
>>> On 2014/01/30 17:23, Staffan Larsen wrote:
>>>> Would it be possible for the Diagnostic Command to output the location
>>>> of the rotated log? When invoking the command it would be good to get
>>>> some kind of feedback.
>>>
>>> I changed rotate_log() to redirect messages to jcmd.
>>> If GC.rotate_log is executed, we can get messages on jcmd console as 
>>> below:
>>> ------------
>>> $ jcmd 18976 GC.rotate_log
>>> 18976:
>>> 2014-01-30 19:59:39 GC log rotation request has been received. Saved as
>>> test.log.0
>>> 2014-01-30 19:59:39 GC log file created test.log.1
>>> ------------
>>>
>>>
>>>> test/gc/7090324/Test7090324.java:
>>>> - I think this needs to have the Oracle copyright notice as well.
>>>> - Tests should now use descriptive names, not bug numbers:
>>>> https://wiki.openjdk.java.net/display/HotSpot/Naming+HotSpot+JTReg+Tests 
>>>>
>>>> - nits: lots of missing spaces before '{', and after 'for', 'if'
>>>> - line 47: you don't need to clean up old files, jtreg will give you a
>>>> fresh scratch directory to run in
>>>
>>> I've fixed.
>>> Could you review again?
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>> /Staffan
>>>>
>>>>
>>>>
>>>> On 30 jan 2014, at 08:08, Yasumasa
>>>> Suenaga<suenaga.yasumasa at lab.ntt.co.jp>  wrote:
>>>>
>>>>> Hi Erik, Staffan,
>>>>>
>>>>> I've uploaded new webrev. Could you review this ?
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.02/
>>>>>
>>>>> This patch includes fixes from comments of Staffan and Erik.
>>>>>
>>>>> And I created new test of this patch as Test7090324 .
>>>>> This test works fine with jtreg.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>> On 2014/01/30 0:55, Yasumasa Suenaga wrote:
>>>>>> Hi Erik,
>>>>>>
>>>>>> On 2014/01/30 0:13, Erik Helin wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> (have to use HTML email to get a width of more than 78 chars, 
>>>>>>> sorry)
>>>>>>>
>>>>>>> why did you change the code in arguments.cpp in the method
>>>>>>> check_gc_log_consistency?
>>>>>>
>>>>>> In current implementation, check_gclog_consistency() checks three
>>>>>> parameters:
>>>>>>
>>>>>> - GC log filename
>>>>>> - NumberOfGCLogFiles
>>>>>> - GCLogFileSize
>>>>>>
>>>>>> My customer uses external trigger "ONLY" for rotating logs.
>>>>>> If they want to do that, GCLogFileSize is not needed.
>>>>>>
>>>>>>
>>>>>>> Next, the gcLogFileStream::rotate_log method now does a lot of 
>>>>>>> things.
>>>>>>> Could you separate out the first block into a new method,
>>>>>>> gcLogFileStream::should_rotate(bool force)?
>>>>>>>
>>>>>>> This way, the code would read:
>>>>>>>
>>>>>>>> bool gcLogFileStream::should_rotate(bool force) {
>>>>>>>> return force || _bytes_written >= GCLogFileSize;
>>>>>>>> }
>>>>>>>>
>>>>>>>> void gcLogFileStream::rotate_log(bool force) {
>>>>>>>> char time_msg[FILENAMEBUFLEN];
>>>>>>>> char time_str[EXTRACHARLEN];
>>>>>>>> char current_file_name[FILENAMEBUFLEN];
>>>>>>>> char renamed_file_name[FILENAMEBUFLEN];
>>>>>>>>
>>>>>>>> if (!should_rotate(force)) {
>>>>>>>> return;
>>>>>>>> }
>>>>>>>>
>>>>>>>> ...
>>>>>>>> }
>>>>>>>
>>>>>>> Could you please update your patch?
>>>>>>
>>>>>> I will do that.
>>>>>>
>>>>>>
>>>>>>> There is a new empty line in the rotate_log method:
>>>>>>>
>>>>>>>> }
>>>>>>>> +
>>>>>>>> #ifdef ASSERT
>>>>>>>
>>>>>>> could you please remove it?
>>>>>>
>>>>>> I will do that.
>>>>>>
>>>>>>
>>>>>>> The logging change in rotate_log uses a different kind of if/else
>>>>>>> syntax
>>>>>>> than the rest of the file:
>>>>>>>
>>>>>>>> if (force) {
>>>>>>>> ...
>>>>>>>> }
>>>>>>>> else {
>>>>>>>> ...
>>>>>>>> }
>>>>>>>
>>>>>>> The other if/else statements in the file use:
>>>>>>>
>>>>>>>> if (cond) {
>>>>>>>> ...
>>>>>>>> } else {
>>>>>>>> ...
>>>>>>>> }
>>>>>>>
>>>>>>> Could you please update your change to use the same if/else syntax?
>>>>>>
>>>>>> I will do that.
>>>>>>
>>>>>>
>>>>>>> This part of the change duplicates the code:
>>>>>>>
>>>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log rotation
>>>>>>> request has been received. Saved as %s\n",
>>>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>> + renamed_file_name);
>>>>>>> + }
>>>>>>> + else {
>>>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log file has
>>>>>>> reached the"
>>>>>>> " maximum size. Saved as %s\n",
>>>>>>> - os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>> renamed_file_name);
>>>>>>>
>>>>>>> Could you instead just change the message, as in:
>>>>>>>
>>>>>>>> const char* msg = force ? "%s GC log rotation request has been
>>>>>>>> received. Saved as %s\n" :
>>>>>>>> "%s GC log file has reached the maximum size. Saved as %s\n";
>>>>>>>> jio_snprintf(time_msg, sizeof(time_msg), msg,
>>>>>>>> os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>>> renamed_file_name);
>>>>>>
>>>>>> I will do that.
>>>>>>
>>>>>>
>>>>>>> The declaration of rotate_log in ostream.hpp still uses the old
>>>>>>> variable name is_force, it should use force,
>>>>>>> just as the definition.
>>>>>>
>>>>>> Sorry, I will fix it.
>>>>>>
>>>>>>
>>>>>>> Finally, could you add a test that tests your change? Have a look
>>>>>>> at the other tests
>>>>>>> in hotspot/test/gc to see how you can do it
>>>>>>> (you might want to use some functionality from
>>>>>>> hotspot/test/testlibrary).
>>>>>>
>>>>>> I found three tests as following:
>>>>>>
>>>>>> [ysuenaga at xelvis test]$ find . -iname "*jcmd*"
>>>>>> ./runtime/NMT/JcmdWithNMTDisabled.java
>>>>>> ./runtime/NMT/JcmdScale.java
>>>>>> ./gc/TestG1ZeroPGCTJcmdThreadPrint.java
>>>>>>
>>>>>> I understand that these tests check output (stdout/stderr) with
>>>>>> OutputAnalyzer.
>>>>>> However, my patch affects the target VM. So I guess the current tests
>>>>>> cannot check
>>>>>> that GC log rotation has succeeded.
>>>>>>
>>>>>> Should I make a test which checks the exit value of jcmd?
>>>>>>
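
Besides the exit value of jcmd, a test could also look at the rotated files on disk. The following is only a minimal sketch and not part of any webrev: the class RotatedLogCounter and its method are hypothetical, and the "<name>.<number>" naming is assumed from the jcmd output quoted earlier in this thread (test.log.0, test.log.1).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class RotatedLogCounter {
    /**
     * Counts files in dir named "<baseName>.<digits>", e.g. test.log.0.
     * The naming scheme is an assumption, not taken from the HotSpot sources.
     */
    static int countRotatedLogs(Path dir, String baseName) throws IOException {
        int count = 0;
        // The glob keeps only entries that start with "<baseName>."
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, baseName + ".*")) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                String suffix = name.substring(baseName.length() + 1);
                if (suffix.matches("\\d+")) {
                    count++; // purely numeric suffix: treat it as a rotated log
                }
            }
        }
        return count;
    }
}
```

A test could call countRotatedLogs on its scratch directory after each GC.rotate_log request and compare the result with the number of files it expects.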
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>> Thanks,
>>>>>>> Erik
>>>>>>>
>>>>>>> On 2014-01-29 15:28, Yasumasa Suenaga wrote:
>>>>>>>> Hi Staffan,
>>>>>>>>
>>>>>>>> Thank you for reviewing!
>>>>>>>> I've uploaded new webrev.
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.01/
>>>>>>>>
>>>>>>>> On 2014/01/29 20:56, Staffan Larsen wrote:
>>>>>>>>> Yasumasa,
>>>>>>>>>
>>>>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>>>>> no comments
>>>>>>>>>
>>>>>>>>> src/share/vm/runtime/safepoint.cpp
>>>>>>>>> I was surprised that gc log size was checked after each safe
>>>>>>>>> point. That seems an unnecessary burden to place on a safe point.
>>>>>>>>> Instead we should switch to a periodic task that checks the gc
>>>>>>>>> log size. However, this is unrelated to your patch, so please
>>>>>>>>> ignore for now.
>>>>>>>>
>>>>>>>> Agree.
>>>>>>>> However, I think that PeriodicTask also is not appropriate for 
>>>>>>>> this.
>>>>>>>>
>>>>>>>> The size of the GC log file increases when a GC occurs.
>>>>>>>> So I think the rotate function should be called at the entry of each GC
>>>>>>>> event,
>>>>>>>> e.g. VM_GC_Operation::doit_prologue() etc...
>>>>>>>>
>>>>>>>>
>>>>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>>>> line 402: nit: missing space before {
>>>>>>>>
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>
>>>>>>>>> line 405: I think "force" is a better name than "is_force"
>>>>>>>>
>>>>>>>> I removed "force" option from DCmd.
>>>>>>>> So I removed this from VMOperation.
>>>>>>>>
>>>>>>>>
>>>>>>>>> src/share/vm/services/diagnosticCommand.cpp
>>>>>>>>> line 666: What does this do without the -force option? It looks
>>>>>>>>> to me that the non-force case will happen after each safe point
>>>>>>>>> (see above) and that there is no need to ever do this from a
>>>>>>>>> diagnostic command. Can we remove the option?
>>>>>>>>
>>>>>>>> Indeed.
>>>>>>>> I removed "force" option.
>>>>>>>>
>>>>>>>>
>>>>>>>>> line 677: "Target VM does not support GC log file rotation."
>>>>>>>>
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>
>>>>>>>>> nits: some missing spaces before '{' and after 'if'
>>>>>>>>
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>
>>>>>>>>> src/share/vm/services/diagnosticCommand.hpp
>>>>>>>>> I think RotateGCLogDCmd should require the "control" permission
>>>>>>>>> when executed via JMX, so please add:
>>>>>>>>> static const JavaPermission permission() {
>>>>>>>>> JavaPermission p = {"java.lang.management.ManagementPermission",
>>>>>>>>> "control", NULL};
>>>>>>>>> return p;
>>>>>>>>> }
>>>>>>>>
>>>>>>>> Added.
>>>>>>>>
>>>>>>>>
>>>>>>>>> line 394: Maybe "Force the GC log file to be rotated." is a
>>>>>>>>> better description?
>>>>>>>>
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>
>>>>>>>>> src/share/vm/utilities/ostream.cpp
>>>>>>>>> line 662: I think "force" is a better name than "is_force"
>>>>>>>>> line 668: The comment says exactly the same thing as the code so
>>>>>>>>> I think it can be skipped
>>>>>>>>> line 671: "GC log file rotation occurs by external trigger ONLY."
>>>>>>>>> line 675: "not need" ->  "no need"
>>>>>>>>> line 718: "GC log rotation request has been received"
>>>>>>>>
>>>>>>>> Fixed them.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>>> src/share/vm/utilities/ostream.hpp
>>>>>>>>> no comments
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> /Staffan
>>>>>>>>>
>>>>>>>>> On 24 jan 2014, at 14:50, Yasumasa
>>>>>>>>> Suenaga<yasu at ysfactory.dip.jp>  wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I've created webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.00/
>>>>>>>>>>
>>>>>>>>>> This patch works fine on current jdk9/hs-rt .
>>>>>>>>>> Could you review this?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am just an Author. So I need a sponsor.
>>>>>>>>>> Could you help me?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please cooperate.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2013/12/06 0:05, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Did someone read my email?
>>>>>>>>>>> I really hope to merge "JDK-7090324: gclog rotation via
>>>>>>>>>>> external tool" .
>>>>>>>>>>>
>>>>>>>>>>> I hear that someone needs this RFE. So I want to discuss
>>>>>>>>>>> this.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>> On 2013/11/08 21:47, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Did someone read my mail?
>>>>>>>>>>>>
>>>>>>>>>>>> I think that this RFE helps us to watch Java heap on
>>>>>>>>>>>> production system.
>>>>>>>>>>>> Also I think this RFE is able to be part of the JEP 158
>>>>>>>>>>>> (Unified JVM Logging) .
>>>>>>>>>>>>
>>>>>>>>>>>> I want to update this RFE in JDK Bug System, but I don't have
>>>>>>>>>>>> account.
>>>>>>>>>>>> So I've posted email at first.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2013/09/30 21:10, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> In previous email, I've attached new patch for this RFE.
>>>>>>>>>>>>> It works fine with current hsx.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2013/09/29 23:40, Yasu wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We are using "logrotate" tool on RHEL for various log 
>>>>>>>>>>>>>> rotation.
>>>>>>>>>>>>>> HotSpot currently has a size-based gclog rotation
>>>>>>>>>>>>>> function,
>>>>>>>>>>>>>> however I need to rotate the gc log in sync with the logrotate
>>>>>>>>>>>>>> tool.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So I've created RFE as "JDK-7090324: gclog rotation via
>>>>>>>>>>>>>> external tool" .
>>>>>>>>>>>>>> And Sr. Engineering Manager in Oracle said he use the
>>>>>>>>>>>>>> essence of my patch in one
>>>>>>>>>>>>>> of the jcmd subcommands.
>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-September/003274.html 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2 years ago, I posted a patch for this RFE.
>>>>>>>>>>>>>> But this patch is too old to apply to current HotSpot.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last month, a similar discussion appeared on the ML.
>>>>>>>>>>>>>> So I think it's time to discuss this RFE.
>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-August/008029.html 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please cooperate.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>>
>>
>


From taras.ledkov at oracle.com  Wed Feb  5 04:42:59 2014
From: taras.ledkov at oracle.com (taras ledkov)
Date: Wed, 05 Feb 2014 16:42:59 +0400
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <52DE41BB.40309@oracle.com>
References: <529EF58F.5000701@oracle.com>
	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>
	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>
	<52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com>
	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>
	<52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com>
Message-ID: <52F231D3.1050208@oracle.com>

Hi,

So please take a look at the review against JDK9.
The reviewed patch had not been integrated into JDK8.

Port to JDK9 is identical. The difference: the ProcessTools.java has 
been already patched by Jaroslav.

Webrev for jdk part:
http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/

Webrev for hs part:
http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/


On 21.01.2014 13:45, Jaroslav Bachorik wrote:
> Hi Taras,
>
> On 21.1.2014 10:30, taras ledkov wrote:
>> Hi Jaroslav,
>>
>> Could you please review the last changes?
>> Are you OK?
>
> Yes, the change looks ok. But I think we will need to get back to this
> problem eventually and implement a central port dispatcher if we want to
> be 100% sure the port conflicts wouldn't occur. But your changes reduce
> the chance significantly.
>
> Thanks for taking care of this.
>
> -JB-
>
>>
>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>
>>> Thanks,
>>> /Staffan
>>>
>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com> wrote:
>>>
>>>> Hi Staffan,
>>>>
>>>> I fixed the tests according with your comments.
>>>> Are you OK?
>>>>
>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>> Hi,
>>>>>
>>>>> Please take a look at the new review.
>>>>>
>>>>> Webrev for jdk part:
>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>
>>>>> Webrev for hs part:
>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>
>>>>> My answers are inline:
>>>>>
>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>> Hi Taras,
>>>>>>
>>>>>> Thanks for doing this clean up and conversion of tests into Java.
>>>>>> Here's a couple of comments:
>>>>>>
>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>> This test could be simplified by not specifying an address at all.
>>>>>> Since the test never connects to the JVM started with -Xrunjdwp,
>>>>>> there
>>>>>> is no reason to specify an address. If address is unspecified (and
>>>>>> server=y), the connector will pick an address and print it to the
>>>>>> command line. Thus the only change that needs to be done is to remove
>>>>>> ",address=8888" from the @run command.
>>>>> fixed
>>>>>
>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>> These tests do not compile cleanly with an empty JTwork directory. It
>>>>>> seems that having one @build for each class does not work well - when
>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving all
>>>>>> classes to one @build statement solved this problem for me.
>>>>> fixed
>>>>>
>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>   187         Future<Void> stdoutTask = stdout.process();
>>>>>>   188         Future<Void> stderrTask = stderr.process();
>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>> fixed
>>>>>
>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>> At first I thought something was wrong with this file - the diff is
>>>>>> very weird. Then I realized you renamed an old file and created a new
>>>>>> file using the old name.
>>>>> You are right. I did it to keep the test name.
>>>>>
>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>
>>>>>>
>>>>>> - Is resetPasswordFilePermission() really necessary? It looks like
>>>>>> you
>>>>>> delete the files at the beginning of the test in any case.
>>>>> I think yes. In the first place, this functionality was in the old
>>>>> code.
>>>>> In the second place, a file without write permission may be a problem
>>>>> for a further cleanup (not by the test, for example for the tests
>>>>> launcher scripts etc.)
>>>>>
>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>> sub-classes, the other has a getter method.
>>>>> fixed
>>>>>
>>>>>> - Lines 57-58: Don't swallow exceptions, add an ex.printStackTrace().
>>>>>> (Same thing for all other places where you call Integer.parseInt())
>>>>> fixed
>>>>>
>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>> This file is never used as far as I can see.
>>>>> It is used by PasswordFilePermissionTest & SSLConfigFilePermissionTest
>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>> AbstractFilePermissionTest : 162).
>>>>>
>>>>>> Thanks,
>>>>>> /Staffan
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 26 dec 2013, at 14:09, taras ledkov <taras.ledkov at oracle.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please take a look at the review with fixed issues about trying to
>>>>>>> launch test that needs free port several times.
>>>>>>>
>>>>>>> Webrev for jdk part:
>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>
>>>>>>> Webrev for hs part:
>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>
>>>>>>> Please pay attention to the new method ProcessTools.startProcess(String,
>>>>>>> ProcessBuilder, Consumer<String>) that is used to analyze all output
>>>>>>> of a sub-process.  It shares a common part with
>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, Predicate<String>,
>>>>>>> long, TimeUnit) that is used to determine the warm-up moment.
>>>>>>>
>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>> Predicate<String>, long, TimeUnit) may be changed by adding LinePump
>>>>>>> to stderr if there is no serious reason for restricting the warm-up
>>>>>>> analysis to the stdout stream.
>>>>>>>
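
The idea of handing every output line of a sub-process to a Consumer<String> can be sketched roughly as below. This is a simplified, single-threaded stand-in and not the ProcessTools implementation; the class name LinePumpSketch and the pump method are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

final class LinePumpSketch {
    /** Reads the stream line by line and passes each line to the consumer. */
    static void pump(InputStream in, Consumer<String> consumer) throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                consumer.accept(line);
            }
        }
    }
}
```

With a real sub-process the stream would come from Process.getInputStream() or Process.getErrorStream(), and each stream would normally be pumped on its own thread so that neither pipe blocks the other.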
>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I've consulted with Serviceability engineers (added them to the CC
>>>>>>>> list) and
>>>>>>>> they would like the tests to solve this problem as follows for now:
>>>>>>>>
>>>>>>>> 2. Implement loops in every test.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Katja
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>> Guys.
>>>>>>>>>
>>>>>>>>> Let me try to sum up what was said before and maybe suggest a
>>>>>>>>> compromise.
>>>>>>>>>
>>>>>>>>> 1. There is a desire to have support for port allocation on the
>>>>>>>>> level of
>>>>>>>>> a JTReg suite execution. Taras created a bug for that
>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it
>>>>>>>>> is a
>>>>>>>>> test harness API or a library API does not really matter from
>>>>>>>>> usage
>>>>>>>>> point of view.
>>>>>>>>>
>>>>>>>>> 2. There is no way to make the tests absolutely stable, whatever
>>>>>>>>> port
>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>> perform
>>>>>>>>> the test logic with different ports until the test succeeds.
>>>>>>>>>
>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of course,
>>>>>>>>> but
>>>>>>>>> better be used in conjunction with a meaningful port selection
>>>>>>>>> algorithm.
>>>>>>>>>
>>>>>>>>> At the same time, copying loop-until-success logic from one
>>>>>>>>> test to
>>>>>>>>> another may not be the best solution. A library could help with
>>>>>>>>> that, I
>>>>>>>>> believe. There only needs to be an API method which takes
>>>>>>>>> behavior as a
>>>>>>>>> parameter and runs it until it succeeds. Something like:
>>>>>>>>> public <T> T runOnAFreePort(Function<Integer, T>)
>>>>>>>>> or similar. There could be arguments of how/whether to implement
>>>>>>>>> it,
>>>>>>>>> the solution would not work for shell tests, etc, but still ...
>>>>>>>>>
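
A library method of the kind suggested above might look roughly like this. It is a sketch under assumptions, not an existing test library API: FreePortRunner, PortAction and the maxAttempts parameter are invented here, and the action is assumed to signal a port conflict by throwing java.net.BindException.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

final class FreePortRunner {
    /** Behavior that needs a port; hypothetical interface for this sketch. */
    @FunctionalInterface
    interface PortAction<T> {
        T run(int port) throws IOException;
    }

    /** Runs the action on freshly probed ports until it stops hitting a BindException. */
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts) throws IOException {
        BindException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port;
            // Port 0 lets the OS pick a port; it may be taken again once the probe closes.
            try (ServerSocket probe = new ServerSocket(0)) {
                port = probe.getLocalPort();
            }
            try {
                return action.run(port);
            } catch (BindException e) {
                last = e; // somebody else grabbed the port in the meantime: retry
            }
        }
        throw new IOException("no usable port after " + maxAttempts + " attempts", last);
    }
}
```

Only BindException triggers a retry here; any other IOException from the action is passed on to the caller, so genuine test failures are not hidden by the loop.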
>>>>>>>>>
>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>
>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>> agreement in
>>>>>>>>> the library, etc.
>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>> 3. Wait for the library to be ready and only then integrate the
>>>>>>>>> changes.
>>>>>>>>>
>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>
>>>>>>>>> I personally prefer #1 for the reason that the changes are already
>>>>>>>>> supposed to make the tests more stable and also there are many
>>>>>>>>> more
>>>>>>>>> tests which use ports, so the scope of the problem is bigger
>>>>>>>>> than these.
>>>>>>>>>
>>>>>>>>> Shura
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Taras,
>>>>>>>>>>
>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort()
>>>>>>>>>> does not
>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>> process.
>>>>>>>>>> Unfortunately I don't think the library can do more. However,
>>>>>>>>>> there is a
>>>>>>>>>> solution.
>>>>>>>>>>
>>>>>>>>>> Please, look at the *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>> start a
>>>>>>>>>> process with a free port and then check if
>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has been
>>>>>>>>>> thrown.
>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>
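
The retry approach described above can be sketched as follows. This is not the JstatdTest code: PortRetry and its launcher parameter are hypothetical, and the launcher is assumed to start the process on the given port and return its captured output as a string.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.function.IntFunction;

final class PortRetry {
    static final String PORT_IN_USE = "Port already in use";

    /** Asks the OS for a currently free port; it can be taken again right after. */
    static int getFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    /** Starts the process on new ports until its output no longer reports a conflict. */
    static int startWithRetry(IntFunction<String> launcher, int maxAttempts) throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int port = getFreePort();
            String output = launcher.apply(port);
            if (!output.contains(PORT_IN_USE)) {
                return port; // the process came up on this port
            }
        }
        throw new IOException("port still in use after " + maxAttempts + " attempts");
    }
}
```

Bounding the number of attempts keeps a misconfigured machine from looping forever; the limit itself is an arbitrary choice for this sketch.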
>>>>>>>>>> Thanks,
>>>>>>>>>> Katja
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>
>>>>>>>>>>> Whatever logic is chosen to select a free port, it is the
>>>>>>>>>>> library's responsibility to implement it, wouldn't you agree?
>>>>>>>>>>>
>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is.
>>>>>>>>>>>
>>>>>>>>>>> Should we decide to replace logic of the port selection, we
>>>>>>>>>>> could do
>>>>>>>>>>> it later in the library.
>>>>>>>>>>>
>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>
>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the
>>>>>>>>>>>>> port is
>>>>>>>>>>>>> free.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Moreover, the port returned by getFreePort()[1] remains
>>>>>>>>>>>>> inaccessible
>>>>>>>>>>>>> for
>>>>>>>>>>>>> some time - it depends on system setup, take a look at the
>>>>>>>>>>>>> discussions
>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and SO_LINGER
>>>>>>>>>>>>> for
>>>>>>>>>>>>> BSD.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So from stability point of view it's better to just return
>>>>>>>>>>>>> random
>>>>>>>>>>>>> number
>>>>>>>>>>>>> between 49152 and 65535.
>>>>>>>>>>>>
>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When
>>>>>>>>>>>> there are
>>>>>>>>>>>> more
>>>>>>>>>>>> tests run in parallel, all of them requiring a free port,
>>>>>>>>>>>> nothing
>>>>>>>>>>>> prevents the random function to return the same port to all of
>>>>>>>>>>>> them.
>>>>>>>>>>>> Also, two subsequent requests can return the same port and
>>>>>>>>>>>> cause
>>>>>>>>>>>> problems with timing when a port used by a previous test is not
>>>>>>>>>>>> fully
>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry
>>>>>>>>>>>> pointed out
>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it
>>>>>>>>>>>> later
>>>>>>>>>>>> there
>>>>>>>>>>>> is no guarantee that a port which was tested unallocated will
>>>>>>>>>>>> remain
>>>>>>>>>>>> unallocated also for the next few milliseconds.
>>>>>>>>>>>>
>>>>>>>>>>>> The only fail proof solution would be a port allocating service
>>>>>>>>>>>> provided
>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease the
>>>>>>>>>>>> chance
>>>>>>>>>>>> of intermittent failures due to a port being in use.
>>>>>>>>>>>>
>>>>>>>>>>>> -JB-
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>
>>>>>>>>>>>>> public static int getFreePort() throws
>>>>>>>>>>>>> InterruptedException,
>>>>>>>>>>>>> IOException {
>>>>>>>>>>>>>     int port = -1;
>>>>>>>>>>>>>
>>>>>>>>>>>>>     while (port <= 0) {
>>>>>>>>>>>>>         Thread.sleep(100);
>>>>>>>>>>>>>
>>>>>>>>>>>>>         ServerSocket serverSocket = null;
>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>             serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>             port = serverSocket.getLocalPort();
>>>>>>>>>>>>>         } finally {
>>>>>>>>>>>>>             serverSocket.close();
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>     return port;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> fyi,  The jdk.testlibrary.Utils.getFreePort() method will
>>>>>>>>>>>>>> open a
>>>>>>>>>>>>>> free
>>>>>>>>>>>>>> socket, close it and return
>>>>>>>>>>>>>> the port number.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have the
>>>>>>>>>>>>>> system
>>>>>>>>>>>>>> assign
>>>>>>>>>>>>>> the port #.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *The only* correct way to take a really free port is:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Choose a random number between 49152 and 65535
>>>>>>>>>>>>>>> 2. Open socket
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> if socket fails - repeat step 1
>>>>>>>>>>>>>>> if socket OK - return *socket*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass the
>>>>>>>>>>>>>>> port number as a property value) you shouldn't do any pre-check,
>>>>>>>>>>>>>>> as it has no value: as soon as you close the socket someone
>>>>>>>>>>>>>>> else can take the port.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So just choose a random number within the range above and let
>>>>>>>>>>>>>>> the networking code that opens the socket handle the port
>>>>>>>>>>>>>>> conflict.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & Jaroslav
>>>>>>>>>>>>>>>> Bachorik
>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>> looked
>>>>>>>>>>>>>>>> very similar, so a common parent class was created for
>>>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've renamed to
>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace old
>>>>>>>>>>>>>>>> shell
>>>>>>>>>>>>>>>> script
>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called RmiRegistrySslTest.java,
>>>>>>>>>>>>>>>> hence the
>>>>>>>>>>>>>>>> huge
>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines similar
>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided
>>>>>>>>>>>>>>>> to not
>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please
>>>>>>>>>>>>>>>> let me
>>>>>>>>>>>>>>>> know if
>>>>>>>>>>>>>>>> this is somehow not acceptable
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is added to
>>>>>>>>>>>>>>>> hotspot
>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - These tests will need additional changes when test
>>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> With best regards,
>>>>>>> Taras Ledkov
>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>> skype: taras_ledkov
>>>>>>> Phone: 7(812)3346-157
>>>>>>
>>>>>
>>>>
>>>> --
>>>> With best regards,
>>>> Taras Ledkov
>>>> Mail-To: taras.ledkov at oracle.com
>>>> skype: taras_ledkov
>>>> Phone: 7(812)3346-157
>>>
>>
>

-- 
With best regards,
Taras Ledkov
Mail-To: taras.ledkov at oracle.com
skype: taras_ledkov
Phone: 7(812)3346-157
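
The bind-and-hold approach Dmitry describes in the quoted thread above (pick a random port in 49152-65535, try to open a socket, retry on failure, hand back the open socket rather than the number) might look roughly like this in Java. This is a minimal sketch, not code from the webrev; the class name EphemeralPortBinder, the method bindRandomPort and the maxAttempts parameter are hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of jdk.testlibrary.
final class EphemeralPortBinder {
    static final int MIN_PORT = 49152; // start of the dynamic/private port range
    static final int MAX_PORT = 65535;

    /**
     * Tries random ports until one can be bound and returns the OPEN socket.
     * The caller owns the socket and must close it; the port is only
     * guaranteed to be reserved while the socket stays open.
     */
    static ServerSocket bindRandomPort(int maxAttempts) throws IOException {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e; // most likely a BindException: pick another port
            }
        }
        throw new IOException("no free port after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket socket = bindRandomPort(50)) {
            System.out.println("bound to port " + socket.getLocalPort());
        }
    }
}
```

Returning the socket instead of the port number is the point of the design: as soon as the socket is closed, any other process may take the port, which is exactly the race discussed in the thread.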

From jaroslav.bachorik at oracle.com  Wed Feb  5 05:20:23 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 05 Feb 2014 14:20:23 +0100
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <52F231D3.1050208@oracle.com>
References: <529EF58F.5000701@oracle.com>
	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>
	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>
	<52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com>
	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>
	<52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com>
	<52F231D3.1050208@oracle.com>
Message-ID: <52F23A97.7020807@oracle.com>

Hi Taras,

thanks for taking care of this.

The changes look fine to me.

One minor nit is unused imports of the library classes in 
"test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java". 
It does not use any of those classes as its base class 
"AbstractFilePermissionTest" does all the heavy lifting.

Cheers,

-JB-

On 5.2.2014 13:42, taras ledkov wrote:
> Hi,
>
> So please take a look at the review against JDK9.
> The reviewed patch had not been integrated into JDK8.
>
> Port to JDK9 is identical. The difference: the ProcessTools.java has
> been already patched by Jaroslav.
>
> Webrev for jdk part:
> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/
>
> Webrev for hs part:
> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>
>
> On 21.01.2014 13:45, Jaroslav Bachorik wrote:
>> Hi Taras,
>>
>> On 21.1.2014 10:30, taras ledkov wrote:
>>> Hi Jaroslav,
>>>
>>> Could you please review the last changes?
>>> Are you OK?
>>
>> Yes, the change looks ok. But I think we will need to get back to this
>> problem eventually and implement a central port dispatcher if we want to
>> be 100% sure the port conflicts wouldn't occur. But your changes reduce
>> the chance significantly.
>>
>> Thanks for taking care of this.
>>
>> -JB-
>>
>>>
>>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>
>>>> Thanks,
>>>> /Staffan
>>>>
>>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com> wrote:
>>>>
>>>>> Hi Staffan,
>>>>>
>>>>> I fixed the tests according to your comments.
>>>>> Are you OK?
>>>>>
>>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Please take a look at the new review.
>>>>>>
>>>>>> Webrev for jdk part:
>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>>
>>>>>> Webrev for hs part:
>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>>
>>>>>> My answers are inline:
>>>>>>
>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>>> Hi Taras,
>>>>>>>
>>>>>>> Thanks for doing this clean up and conversion of tests into Java.
>>>>>>> Here's a couple of comments:
>>>>>>>
>>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>>> This test could be simplified by not specifying an address at all.
>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp,
>>>>>>> there
>>>>>>> is no reason to specify an address. If address is unspecified (and
>>>>>>> server=y), the connector will pick an address and print it to the
>>>>>>> command line. Thus the only change that needs to be done is to
>>>>>>> remove
>>>>>>> ",address=8888" from the @run command.
>>>>>> fixed
>>>>>>
>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>>> These tests do not compile cleanly with an empty JTwork
>>>>>>> directory. It
>>>>>>> seems that having one @build for each class does not work well -
>>>>>>> when
>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving
>>>>>>> all
>>>>>>> classes to one @build statement solved this problem for me.
>>>>>> fixed
>>>>>>
>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>>   187         Future<Void> stdoutTask = stdout.process();
>>>>>>>   188         Future<Void> stderrTask = stderr.process();
>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>> fixed
>>>>>>
>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>>> At first I thought something was wrong with this file - the diff is
>>>>>>> very weird. Then I realized you renamed an old file and created a
>>>>>>> new
>>>>>>> file using the old name.
>>>>>> You are right. I did it to keep the test name.
>>>>>>
>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks like
>>>>>>> you
>>>>>>> delete the files at the beginning of the test in any case.
>>>>>> I think yes. In the first place, this functionality was in the old
>>>>>> code.
>>>>>> In the second place, a file without write permission may be a problem
>>>>>> for a further cleanup (not by the test, for example for the tests
>>>>>> launcher scripts etc.)
>>>>>>
>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>>> sub-classes, the other has a getter method.
>>>>>> fixed
>>>>>>
>>>>>>> - Lines 57-58: Don't swallow exceptions, add an
>>>>>>> ex.printStackTrace().
>>>>>>> (Same thing for all other places where you call Integer.parseInt())
>>>>>> fixed
>>>>>>
>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>>> This file is never used as far as I can see.
>>>>>> It is used by PasswordFilePermissionTest &
>>>>>> SSLConfigFilePermissionTest
>>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>>> AbstractFilePermissionTest : 162).
>>>>>>
>>>>>>> Thanks,
>>>>>>> /Staffan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 26 dec 2013, at 14:09, taras ledkov <taras.ledkov at oracle.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please take a look at the review with fixed issues about trying to
>>>>>>>> launch test that needs free port several times.
>>>>>>>>
>>>>>>>> Webrev for jdk part:
>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>
>>>>>>>> Webrev for hs part:
>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>
>>>>>>>> Please pay attention to the new method ProcessTools.startProcess(String,
>>>>>>>> ProcessBuilder, Consumer<String>) that is used to analyze all
>>>>>>>> output of a sub-process. It shares a common part with
>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>> Predicate<String>, long, TimeUnit) that is used to determine the
>>>>>>>> warm-up moment.
>>>>>>>>
>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>> Predicate<String>, long, TimeUnit) may be changed by adding a
>>>>>>>> LinePump to stderr if there is no serious reason for restricting
>>>>>>>> the warm-up analysis to the stdout stream.
>>>>>>>>
>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I've consulted with Serviceability engineers (added them to the CC
>>>>>>>>> list) and they would like the tests to solve this problem as
>>>>>>>>> follows for now:
>>>>>>>>>
>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Katja
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>> Guys.
>>>>>>>>>>
>>>>>>>>>> Let me try to sum up what was said before and may be suggest a
>>>>>>>>>> compromise.
>>>>>>>>>>
>>>>>>>>>> 1. There is a desire to have port allocation supported on the
>>>>>>>>>> level of a JTReg suite execution. Taras created a bug for that
>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it
>>>>>>>>>> is a
>>>>>>>>>> test harness API or a library API does not really matter from
>>>>>>>>>> usage
>>>>>>>>>> point of view.
>>>>>>>>>>
>>>>>>>>>> 2. There is no way to make the tests absolutely stable, whatever
>>>>>>>>>> port
>>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>>> perform
>>>>>>>>>> the test logic with different ports until the test succeeds.
>>>>>>>>>>
>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of course,
>>>>>>>>>> but
>>>>>>>>>> better be used in conjunction with a meaningful port selection
>>>>>>>>>> algorithm.
>>>>>>>>>>
>>>>>>>>>> At the same time, copying a loop-until-success logic from one
>>>>>>>>>> test to another may not be the best solution. The library could
>>>>>>>>>> help with that, I believe. There only needs to be an API method
>>>>>>>>>> which takes behavior as a parameter and runs it until it
>>>>>>>>>> succeeds. Something like:
>>>>>>>>>> public <T> runOnAFreePort(Function<T, Integer>)
>>>>>>>>>> or similar. There could be arguments of how/whether to implement
>>>>>>>>>> it,
>>>>>>>>>> the solution would not work for shell tests, etc, but still ...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>>
>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>>> agreement in
>>>>>>>>>> the library, etc.
>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>> 3. Wait for the library to be ready and only then integrate the
>>>>>>>>>> changes.
>>>>>>>>>>
>>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>>
>>>>>>>>>> I personally prefer #1 for the reason that the changes are
>>>>>>>>>> already supposed to make the tests more stable and also there are
>>>>>>>>>> many more tests which use ports, so the scope of the problem is
>>>>>>>>>> bigger than these.
>>>>>>>>>>
>>>>>>>>>> Shura
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Taras,
>>>>>>>>>>>
>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort()
>>>>>>>>>>> does not
>>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>>> process.
>>>>>>>>>>> Unfortunately I don't think the library can do more. However,
>>>>>>>>>>> there is a
>>>>>>>>>>> solution.
>>>>>>>>>>>
>>>>>>>>>>> Please, look at the *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>>> start a
>>>>>>>>>>> process with a free port and then check if
>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has been
>>>>>>>>>>> thrown.
>>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Katja
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is the
>>>>>>>>>>>> library's responsibility to implement it, wouldn't you agree?
>>>>>>>>>>>>
>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we decide to replace logic of the port selection, we
>>>>>>>>>>>> could do
>>>>>>>>>>>> it later in the library.
>>>>>>>>>>>>
>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the
>>>>>>>>>>>>>> port is
>>>>>>>>>>>>>> free.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Moreover, the port returned by getFreePort()[1] remains
>>>>>>>>>>>>>> inaccessible for some time - it depends on the system setup; take
>>>>>>>>>>>>>> a look at the discussions around SO_REUSEPORT for Linux or
>>>>>>>>>>>>>> SO_REUSEADDR and SO_LINGER for BSD.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So from a stability point of view it's better to just return a
>>>>>>>>>>>>>> random number between 49152 and 65535.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When more
>>>>>>>>>>>>> tests are run in parallel, all of them requiring a free port,
>>>>>>>>>>>>> nothing prevents the random function from returning the same
>>>>>>>>>>>>> port to all of them.
>>>>>>>>>>>>> Also, two subsequent requests can return the same port and
>>>>>>>>>>>>> cause
>>>>>>>>>>>>> problems with timing when a port used by a previous test is
>>>>>>>>>>>>> not
>>>>>>>>>>>>> fully
>>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry
>>>>>>>>>>>>> pointed out
>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it
>>>>>>>>>>>>> later
>>>>>>>>>>>>> there
>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated will
>>>>>>>>>>>>> remain
>>>>>>>>>>>>> unallocated also for the next few milliseconds.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The only fail proof solution would be a port allocating
>>>>>>>>>>>>> service
>>>>>>>>>>>>> provided
>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease
>>>>>>>>>>>>> the
>>>>>>>>>>>>> chance
>>>>>>>>>>>>> of intermittent failures due to a port being in use.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> public static int getFreePort() throws InterruptedException, IOException {
>>>>>>>>>>>>>>     int port = -1;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     while (port <= 0) {
>>>>>>>>>>>>>>         Thread.sleep(100);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         ServerSocket serverSocket = null;
>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>             serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>>             port = serverSocket.getLocalPort();
>>>>>>>>>>>>>>         } finally {
>>>>>>>>>>>>>>             serverSocket.close();
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     return port;
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> fyi, the jdk.testlibrary.Utils.getFreePort() method will
>>>>>>>>>>>>>>> open a free socket, close it and return
>>>>>>>>>>>>>>> the port number.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have the
>>>>>>>>>>>>>>> system
>>>>>>>>>>>>>>> assign
>>>>>>>>>>>>>>> the port #.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *The only* correct way to take a really free port is:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Choose a random number between 49152 and 65535
>>>>>>>>>>>>>>>> 2. Open a socket on that port
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> if the socket fails - repeat step 1
>>>>>>>>>>>>>>>> if the socket is OK - return the *socket*
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass the
>>>>>>>>>>>>>>>> port number as a property value) you shouldn't do any pre-check,
>>>>>>>>>>>>>>>> as it has no value: as soon as you close the socket someone
>>>>>>>>>>>>>>>> else can take the port.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So just choose a random number within the range above and let
>>>>>>>>>>>>>>>> the networking code that opens the socket handle the port
>>>>>>>>>>>>>>>> conflict.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & Jaroslav
>>>>>>>>>>>>>>>>> Bachorik
>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest & SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>> looked
>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for
>>>>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've renamed to
>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace old
>>>>>>>>>>>>>>>>> shell
>>>>>>>>>>>>>>>>> script
>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called RmiRegistrySslTest.java,
>>>>>>>>>>>>>>>>> hence the
>>>>>>>>>>>>>>>>> huge
>>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines similar
>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided
>>>>>>>>>>>>>>>>> to not
>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please
>>>>>>>>>>>>>>>>> let me
>>>>>>>>>>>>>>>>> know if
>>>>>>>>>>>>>>>>> this is somehow not acceptable
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is added to
>>>>>>>>>>>>>>>>> hotspot
>>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - These tests will need additional changes when test
>>>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> With best regards,
>>>>>>>> Taras Ledkov
>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>> skype: taras_ledkov
>>>>>>>> Phone: 7(812)3346-157
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> With best regards,
>>>>> Taras Ledkov
>>>>> Mail-To: taras.ledkov at oracle.com
>>>>> skype: taras_ledkov
>>>>> Phone: 7(812)3346-157
>>>>
>>>
>>
>
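
The loop-until-success helper Shura suggests in the quoted thread above (an API method that takes the test behavior as a parameter and reruns it on a new port until it succeeds) could be sketched as below. Assumptions: FreePortRunner, PortAction, probeFreePort and the maxAttempts limit are hypothetical names, not part of jdk.testlibrary, and a java.net.BindException is taken as the "port in use" signal. A test that starts a sub-process would instead have to scan the child's output, as in the JstatdTest example Katja mentions.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical helper illustrating the retry idea; not an existing library API.
final class FreePortRunner {
    /** Test logic that needs a port number and may fail if the port is taken. */
    interface PortAction<T> {
        T run(int port) throws Exception;
    }

    /** Asks the OS for a currently free port. The port may be taken again
     *  as soon as the probe socket is closed, hence the retry loop below. */
    static int probeFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    /** Runs the action on a fresh port, retrying when the port turns out to be in use. */
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int port = probeFreePort();
            try {
                return action.run(port);
            } catch (BindException e) {
                last = e; // somebody grabbed the port in the meantime: try another one
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        Integer used = runOnAFreePort(port -> {
            try (ServerSocket server = new ServerSocket(port)) {
                return server.getLocalPort();
            }
        }, 5);
        System.out.println("test logic ran on port " + used);
    }
}
```

This does not remove the race between probing and using the port; it only makes a lost race cost one more attempt instead of a test failure.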


From dmitry.samersoff at oracle.com  Wed Feb  5 05:45:15 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Wed, 05 Feb 2014 17:45:15 +0400
Subject: Need Second reviewer!  Re: RR(S): JDK-8023667 SA: ExceptionBlob and
	other C2 classes not available in client VM
In-Reply-To: <52F0EF49.4050003@oracle.com>
References: <52EBFC89.3030809@oracle.com>	<A10C11D1-EABD-4D6A-B3BF-137E4F3EDB4E@oracle.com>	<52EF7675.3020707@oracle.com>	<CFDDFEB8-4C53-4EEF-850E-D6C0E7BFE07C@oracle.com>
	<52F0EF49.4050003@oracle.com>
Message-ID: <52F2406B.70405@oracle.com>

On 2014-02-04 17:46, Dmitry Samersoff wrote:
> Staffan,
> 
> You were right, only two classes are C2 specific.
> 
> Here is updated webrev.
> 
> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.02/
> 
> -Dmitry
> 
> On 2014-02-03 16:13, Staffan Larsen wrote:
>>
>> On 3 feb 2014, at 11:59, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:
>>
>>> Staffan,
>>>
>>> All items under try-catch exist only in the server (C2 compiler) VM, so if
>>> one of them is missing in the target VM the other items are missing as well.
>>
>> Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?
>>
>> /Staffan
>>
>>
>>>
>>> I have no idea what happens later if SA tries to access one of these
>>> non-initialized items, but I don't see any errors in my home testing.
>>>
>>> -Dmitry
>>>
>>> On 2014-02-03 14:39, Staffan Larsen wrote:
>>>> It looks like this new version will leave some things uninitialized
>>>> if an item higher on the list throws an exception. I don't think that
>>>> is what you want.
>>>>
>>>> /Staffan
>>>>
>>>>
>>>> On 31 jan 2014, at 20:42, Dmitry Samersoff
>>>> <dmitry.samersoff at oracle.com> wrote:
>>>>
>>>>> Hi Everybody,
>>>>>
>>>>> Please review the fix:
>>>>>
>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>>>>>
>>>>> sa.js tries to initialize all possible VM structures regardless of
>>>>> the type of VM we are running on
>>>>>
>>>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>>>>> Russia * I would love to change the world, but they won't give me
>>>>> the sources.
>>>>
>>>
>>>
>>> -- 
>>> Dmitry Samersoff
>>> Oracle Java development team, Saint Petersburg, Russia
>>> * I would love to change the world, but they won't give me the sources.
>>
> 
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the source code.

From volker.simonis at gmail.com  Wed Feb  5 10:09:25 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 5 Feb 2014 19:09:25 +0100
Subject: Why do we need both - export maps AND
	-fvisibility=hidden/__attribute__((visibility("default")))
In-Reply-To: <52F1E551.2070304@oracle.com>
References: <CA+3eh11sqgKrUzqqi4u7p-uuXrQYjwGCuZ-jLVVeLAY5cXhveQ@mail.gmail.com>
	<52F18B03.1000107@oracle.com>
	<CAPYFHW1mMGjA8NqPX7e=WBY=NFu5NOc69ncLQcYXO=eNWbjGwQ@mail.gmail.com>
	<52F1E551.2070304@oracle.com>
Message-ID: <CA+3eh11N+-W7r54qZFA-gEOxtdV_u4s4r6_QKT1s08sTS8D88w@mail.gmail.com>

On Wed, Feb 5, 2014 at 8:16 AM, David Holmes <david.holmes at oracle.com> wrote:
> On 5/02/2014 5:09 PM, Jeremy Manson wrote:
>>
>> Also, don't you statically link libstdc++ into Hotspot?
>
>
> Normally yes.
>

That's bad - security-wise as well as from a usability perspective:

Security-wise because the JDK will need to be rebuilt for every
security fix in libgcc and libstdc++. I'm not sure if there's at least
somebody actively and regularly looking at libgcc/libstdc++ security
issues. Considering the fact that the JDK is usually built on a "not
so new" Linux release when it is released for the first time, this
even reduces the likelihood that libgcc/libstdc++ remain actively
supported on that release for the full JDK lifetime.

From a user perspective, because users can only safely use/link C++
JNI libraries with the JDK which have been built against exactly the
same libgcc/libstdc++ version.

Others, like RedHat (OpenJDK/IcedTea) and SAP (SAP JVM) don't do this
any more - i.e. they link libgcc/libstdc++ dynamically, without any
problems. While I'm aware that dynamic linking imposes some
compatibility risks as well if the same binary must support a bunch of
different releases, I still think that is easier to manage.

I think static linking of libgcc/libstdc++ is pre-gcc-2.9.2 paranoia,
but that's my personal opinion. Also that's a totally different
discussion not related to my initial question.

>
>> It would break
>> many, many things if the symbols from that were exported from Hotspot
>> (pretty much anything with native code written in C++).
>
>
> Does static linking automatically re-export all the visible symbols of the
> linked in library? That would seem undesirable under most circumstances.
>

Yes, Jeremy is totally right. I did some tests and realized, that even
if we use -fvisibility=hidden in the compile step of a file, this will
not help if we make a static library (i.e. a '.a' archive) from it.
Linking such an archive into a shared library, will still let the
shared library export all "non-static" objects from the archive.

So, considering that the OpenJDK still statically links
libgcc/libstdc++ by default, this is one justification for the usage
of map files.

I did some more experiments and looked at the current build system and
also realized the following: the build dynamically builds a list of
all vtable symbols of all classes and puts them into the map file
which is used to generate the libjvm.so. If I remember right, they are
(or were) needed for the serviceability agent.

Now that would be another justification for the usage of a map file as
I don't know how we could otherwise export the vtable symbols of all
classes. But the funny thing is that I've just realized that this
technique isn't working any more with gcc 4.3 and above (which I think
is the build compiler of jdk8). Here are the details:

The vtable symbols are defined as weak symbols in the object files like so:

0000000000000000 V _ZTV10ArrayKlass

If such an object file will be linked with gcc 4.1.2 without map file
into the libjvm.so, the symbol would turn into a local data object
like so:

0000000000e37160 d _ZTV10ArrayKlass

However, if we use a map file which specifies that the symbol
'_ZTV10ArrayKlass' should be exported, the libjvm.so will contain the
following, global symbol:

0000000001423240 D _ZTV10arrayKlass

Now I think this is the expected result of the whole dynamic map-file
generation process. However, with gcc 4.3 and later, there's no
difference if the corresponding vtable symbol is mentioned in the
export map or not. The resulting libjvm.so always only contains a
local data object (just use 'nm --defined-only --extern-only
libjvm.so' to check).

Now I'm not sure if the vtable symbols aren't used by the
serviceability agent any more or if just nobody has realized until now
that they are not exported from the libjvm.so any more. In any
case, the whole magic of constructing a link export file at build time
seems unnecessary by now. I've CC-ed serviceability group to comment
on this issue.

I plan to write down all my findings in a more readable way within
the next days.

If there are any more/other comments on this topic I'll be highly interested.

Regards,
Volker


> David
>
>> Jeremy
>>
>>
>> On Tue, Feb 4, 2014 at 4:51 PM, David Holmes <david.holmes at oracle.com> wrote:
>>
>>     Hi Volker,
>>
>>
>>     On 5/02/2014 2:18 AM, Volker Simonis wrote:
>>
>>         I know this is an old topic and actually I was involved myself
>>         in some of these discussions. Nevertheless, the topic regularly
>>         pops up again and every time this happens and I take a deeper
>>         look at the topic I have the feeling I still haven't completely
>>         understood it.
>>
>>         Current state:
>>
>>         On Linux, we are currently compiling the HotSpot with the
>>         '-fvisibility=hidden' flag which effectively hides all symbols
>>         except the ones which are explicitly declared to have default
>>         visibility (with the macro JNIEXPORT which expands to
>>         __attribute__((visibility("default")))).
>>
>>
>>         This was introduced by "6588413: Use -fvisibility=hidden for gcc
>>         compiles" [1] in HS 21 and it is potentially a good change
>>         because it hides all symbols by default and it enables certain
>>         compiler optimizations on some platforms (see "How To Write
>>         Shared Libraries" [2]).
>>
>>         However, independently of this change, we still use export maps
>>         (i.e. --version-script=mapfile_reorder) in addition to setting
>>         the explicit symbol visibility. While [2] recommends this as good
>>         practice, I don't see the real benefit of having both -
>>         -fvisibility=hidden/__attribute__((visibility("default"))) and
>>         export maps except if we were maintaining a versioned ABI. But as
>>         far as I can tell, that's not the case for HotSpot.
>>
>>         So finally my question: is there any real reason (except the fact
>>         that "we always did it that way") why we would still need to use
>>         export maps on Linux, where symbol hiding is done with
>>         -fvisibility=hidden/__attribute__((visibility("default"))) anyway?
>>         Dropping the maps would make the maintenance of exported symbols
>>         easier as we wouldn't have to keep the export map files and the
>>         JNIEXPORT functions in sync.
>>
>>
>>      From past discussions isn't this because we need the map files to
>>     expose dynamically generated symbols? (And so we must then also add
>>     the other visible symbols.) Does that make the visibility attributes
>>     redundant? I'm not sure, but they serve as a clear statement of
>>     intent anyway.
>>
>>     David
>>
>>
>>         Thank you and best regards,
>>         Volker
>>
>>         [1] https://bugs.openjdk.java.net/browse/JDK-6588413
>>         [2] http://www.akkadia.org/drepper/dsohowto.pdf
>>
>>
>
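
The symbol-hiding scheme described in the quoted message above can be sketched in C as follows. This is only a minimal illustration, not HotSpot code: the macro DEMO_EXPORT and both function names are hypothetical stand-ins (the real macro discussed in the thread is JNIEXPORT), and the build line in the comment is an assumed invocation.

```c
/* Assumed build line for a shared library:
 *   gcc -shared -fPIC -fvisibility=hidden demo.c -o libdemo.so
 * With -fvisibility=hidden, only symbols explicitly marked with
 * default visibility are exported from the library. */

#if defined(__GNUC__)
#define DEMO_EXPORT __attribute__((visibility("default")))  /* hypothetical macro */
#else
#define DEMO_EXPORT
#endif

/* Not annotated: hidden from the dynamic symbol table when the
 * library is built with -fvisibility=hidden. */
int demo_internal_helper(int x)
{
    return x * 2;
}

/* Annotated: stays visible to users of the library. */
DEMO_EXPORT int demo_public_entry(int x)
{
    return demo_internal_helper(x) + 1;
}
```

An export map passed to the linker with --version-script would list demo_public_entry a second time, which is the duplication the question above is about.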

From daniel.daugherty at oracle.com  Wed Feb  5 15:20:52 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 05 Feb 2014 16:20:52 -0700
Subject: code review round 0 for minor FDS makefile fix (8033714)
Message-ID: <52F2C754.40809@oracle.com>

This code review request is going to three different aliases.
Don't use Thunderbird's "reply to list" option since it will
pick just _one_ of the _three_ lists.


Greetings,

Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
makefile fix our way. Here are the bug and webrev URLs:

http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/

     8033714 hotspot 'install_jvm' bld target broken with ZIP_DEBUGINFO_FILES=0
     https://bugs.openjdk.java.net/browse/JDK-8033714

As you might guess from the bug synopsis, this fix is needed when
building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0).
Based on the Graal project fix, I've updated a few other places where
building with FDS disabled is affected.

As always, comments and suggestions are welcome.

Dan

From ron.durbin at oracle.com  Wed Feb  5 18:04:02 2014
From: ron.durbin at oracle.com (Ron Durbin)
Date: Wed, 5 Feb 2014 18:04:02 -0800 (PST)
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <52F2C754.40809@oracle.com>
References: <52F2C754.40809@oracle.com>
Message-ID: <626b3248-50fd-4f91-8aba-fed5fb3a49cc@default>


Dan

The changes look good

Ron
> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: Wednesday, February 05, 2014 4:21 PM
> To: hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; build-dev;
> Doug Simon; Tom Rodriguez
> Subject: code review round 0 for minor FDS makefile fix (8033714)
> 
> This code review request is going to three different aliases.
> Don't use Thunderbird's "reply to list" option since it will pick just _one_ of the _three_
> lists.
> 
> 
> Greetings,
> 
> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) makefile fix our way. Here
> are the bug and webrev URLs:
> 
> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
> 
>      8033714 hotspot 'install_jvm' bld target broken with
> ZIP_DEBUGINFO_FILES=0
>      https://bugs.openjdk.java.net/browse/JDK-8033714
> 
> As you might guess from the bug synopsis, this fix is needed when building without ZIP'ing
> the debuginfo files (ZIP_DEBUGINFO_FILES=0).
> Based on the Graal project fix, I've updated a few other places where building with FDS
> disabled is affected.
> 
> As always, comments and suggestions are welcome.
> 
> Dan

From daniel.daugherty at oracle.com  Wed Feb  5 20:25:49 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 05 Feb 2014 21:25:49 -0700
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <626b3248-50fd-4f91-8aba-fed5fb3a49cc@default>
References: <52F2C754.40809@oracle.com>
	<626b3248-50fd-4f91-8aba-fed5fb3a49cc@default>
Message-ID: <52F30ECD.8080303@oracle.com>

Ron,

Thanks for the review.

Dan


On 2/5/14 7:04 PM, Ron Durbin wrote:
> Dan
>
> The changes look good
>
> Ron
>> -----Original Message-----
>> From: Daniel D. Daugherty
>> Sent: Wednesday, February 05, 2014 4:21 PM
>> To: hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; build-dev;
>> Doug Simon; Tom Rodriguez
>> Subject: code review round 0 for minor FDS makefile fix (8033714)
>>
>> This code review request is going to three different aliases.
>> Don't use Thunderbird's "reply to list" option since it will pick just _one_ of the _three_
>> lists.
>>
>>
>> Greetings,
>>
>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS) makefile fix our way. Here
>> are the bug and webrev URLs:
>>
>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>>
>>       8033714 hotspot 'install_jvm' bld target broken with
>> ZIP_DEBUGINFO_FILES=0
>>       https://bugs.openjdk.java.net/browse/JDK-8033714
>>
>> As you might guess from the bug synopsis, this fix is needed when building without ZIP'ing
>> the debuginfo files (ZIP_DEBUGINFO_FILES=0).
>> Based on the Graal project fix, I've updated a few other places where building with FDS
>> disabled is affected.
>>
>> As always, comments and suggestions are welcome.
>>
>> Dan


From david.holmes at oracle.com  Wed Feb  5 20:28:44 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Feb 2014 14:28:44 +1000
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <52F2C754.40809@oracle.com>
References: <52F2C754.40809@oracle.com>
Message-ID: <52F30F7C.7010407@oracle.com>

Hi Dan,

Looks good to me. (I never run the install targets :( )

Thanks,
David

On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote:
> This code review request is going to three different aliases.
> Don't use Thunderbird's "reply to list" option since it will
> pick just _one_ of the _three_ lists.
>
>
> Greetings,
>
> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
> makefile fix our way. Here are the bug and webrev URLs:
>
> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>
>      8033714 hotspot 'install_jvm' bld target broken with
> ZIP_DEBUGINFO_FILES=0
>      https://bugs.openjdk.java.net/browse/JDK-8033714
>
> As you might guess from the bug synopsis, this fix is needed when
> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0).
> Based on the Graal project fix, I've updated a few other places where
> building with FDS disabled is affected.
>
> As always, comments and suggestions are welcome.
>
> Dan

From david.holmes at oracle.com  Wed Feb  5 20:36:40 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Feb 2014 14:36:40 +1000
Subject: Need Second reviewer!  Re: RR(S): JDK-8023667 SA: ExceptionBlob
	and other C2 classes not available in client VM
In-Reply-To: <52F2406B.70405@oracle.com>
References: <52EBFC89.3030809@oracle.com>	<A10C11D1-EABD-4D6A-B3BF-137E4F3EDB4E@oracle.com>	<52EF7675.3020707@oracle.com>	<CFDDFEB8-4C53-4EEF-850E-D6C0E7BFE07C@oracle.com>	<52F0EF49.4050003@oracle.com>
	<52F2406B.70405@oracle.com>
Message-ID: <52F31158.1050101@oracle.com>

Hi Dmitry,

This looks okay to me.

But can I suggest not starting a new email thread when you need to 
prompt for reviews as it makes this review disconnected from the 
original RFR email. (I would have replied to the original but I don't 
have a local copy to do that.)

Thanks,
David

On 5/02/2014 11:45 PM, Dmitry Samersoff wrote:
> On 2014-02-04 17:46, Dmitry Samersoff wrote:
>> Staffan,
>>
>> You were right, only two classes are C2-specific.
>>
>> Here is updated webrev.
>>
>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.02/
>>
>> -Dmitry
>>
>> On 2014-02-03 16:13, Staffan Larsen wrote:
>>>
>>> On 3 feb 2014, at 11:59, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:
>>>
>>>> Staffan,
>>>>
>>>> All items under try-catch exist only in the server (C2 compiler) VM, so if
>>>> one of them is missing in the target VM, the other items are missing as well.
>>>
>>> Are you sure? Some of them look very generic: "JavaThread", "SurrogateLockerThread", "GenCollectedHeap", etc. Perhaps the catch-ignore-c2-specifics should end at line 821?
>>>
>>> /Staffan
>>>
>>>
>>>>
>>>> I have no idea what happens later if SA tries to access one of these
>>>> non-initialized items, but I don't see any errors in my home testing.
>>>>
>>>> -Dmitry
>>>>
>>>> On 2014-02-03 14:39, Staffan Larsen wrote:
>>>>> It looks like this new version will leave some things uninitialized
>>>>> if an item higher on the list throws an exception. I don't think that
>>>>> is what you want.
>>>>>
>>>>> /Staffan
>>>>>
>>>>>
>>>>> On 31 jan 2014, at 20:42, Dmitry Samersoff
>>>>> <dmitry.samersoff at oracle.com> wrote:
>>>>>
>>>>>> Hi Everybody,
>>>>>>
>>>>>> Please review the fix:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8023667/webrev.01/
>>>>>>
>>>>>> sa.js tries to initialize all possible VM structures regardless of
>>>>>> the type of VM we are running on.
>>>>>>
>>>>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg,
>>>>>> Russia * I would love to change the world, but they won't give me
>>>>>> the sources.
>>>>>
>>>>
>>>>
>>>> --
>>>> Dmitry Samersoff
>>>> Oracle Java development team, Saint Petersburg, Russia
>>>> * I would love to change the world, but they won't give me the sources.
>>>
>>
>>
>
>

From Alan.Bateman at oracle.com  Thu Feb  6 02:04:13 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 06 Feb 2014 10:04:13 +0000
Subject: 8033758: gcc warnings compiling jdk/src/share/back
Message-ID: <52F35E1D.3040507@oracle.com>


This is a drive-by fix to the JDWP agent to fix 50+ warnings that have 
been annoying me, see:
     https://bugs.openjdk.java.net/browse/JDK-8033758

The bulk of the warnings stem from using a switch statement to switch on 
JVMTI and JDWP internal "agent" errors. The agent errors are arranged 
(in util.h) so that they have integer values beyond the range of the 
JVMTI errors.

There are several ways to deal with this, including suppressing the 
warning. I toyed with splitting the switch statement into two so that 
the JVMTI and agent errors are handled separately. It didn't seem worth
it, and the simplest fix is just to cast the jvmtiError to an int.
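
As a rough, self-contained sketch of the cast described above (not the actual agent code): demoError, the DEMO_* constants and demoErrorText are hypothetical stand-ins for jvmtiError, the agent error codes in util.h and jvmtiErrorText. The assumption is that gcc complains when a case label has a value that is not one of the enumerators of the switched-on enum type, and that switching on an int avoids the complaint.

```c
/* Hypothetical stand-in for jvmtiError. */
typedef enum {
    DEMO_ERROR_NONE = 0,
    DEMO_ERROR_INVALID_THREAD = 10
} demoError;

/* Hypothetical agent-internal code, deliberately placed beyond the
 * range of the enumerators above and cast back to the enum type. */
#define DEMO_AGENT_ERROR_BASE      200
#define DEMO_AGENT_ERROR_INTERNAL  ((demoError)(DEMO_AGENT_ERROR_BASE + 1))

const char *demoErrorText(demoError error)
{
    /* Switching on (int)error instead of error means the case labels
     * are compared as plain ints, so a label outside the enum's
     * declared values no longer triggers an enum-related warning. */
    switch ((int)error) {
        case DEMO_ERROR_NONE:
            return "DEMO_ERROR_NONE";
        case DEMO_ERROR_INVALID_THREAD:
            return "DEMO_ERROR_INVALID_THREAD";
        case DEMO_AGENT_ERROR_INTERNAL:
            return "DEMO_AGENT_ERROR_INTERNAL";
        default:
            return "unknown";
    }
}
```

The behaviour of the switch is unchanged by the cast; only the type the compiler checks the labels against is different.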

The other warning (in ArrayTypeImpl's writeNewObjectArray) is just that 
the compiler doesn't know if componentClass has been set in 
getComponentClass. The simplest thing here is just to initialize to NULL.
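
The second warning follows a common pattern with out-parameters, sketched here under assumptions: demoLookup and demoDescribe are hypothetical names, not functions from the agent. The helper only writes its out-parameter on success, so the compiler cannot prove the caller's variable is set on every path unless it is initialized at the declaration.

```c
#include <stddef.h>

/* Hypothetical helper: writes *out only on success. */
static int demoLookup(int key, const char **out)
{
    if (key != 1) {
        return -1;            /* failure: *out is left untouched */
    }
    *out = "component";
    return 0;
}

const char *demoDescribe(int key)
{
    /* Initializing to NULL gives the variable a defined value on every
     * path, which is what silences a "may be used uninitialized"
     * style warning. */
    const char *name = NULL;

    if (demoLookup(key, &name) != 0) {
        return "lookup failed";
    }
    return name;
}
```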

The proposed changes are below.

-Alan


diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
--- a/src/share/back/ArrayTypeImpl.c
+++ b/src/share/back/ArrayTypeImpl.c
@@ -112,7 +112,7 @@
      WITH_LOCAL_REFS(env, 1) {

          jarray array;
-        jclass componentClass;
+        jclass componentClass = NULL;
          jdwpError serror;

          serror = getComponentClass(env, arrayClass,
diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
--- a/src/share/back/error_messages.c
+++ b/src/share/back/error_messages.c
@@ -140,7 +140,7 @@
  const char *
  jvmtiErrorText(jvmtiError error)
  {
-    switch (error) {
+    switch ((int)error) {
          CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
diff --git a/src/share/back/util.c b/src/share/back/util.c
--- a/src/share/back/util.c
+++ b/src/share/back/util.c
@@ -2122,7 +2122,7 @@
  jdwpError
  map2jdwpError(jvmtiError error)
  {
-    switch ( error ) {
+    switch ( (int)error ) {
          case JVMTI_ERROR_NONE:
              return JDWP_ERROR(NONE);
          case AGENT_ERROR_INVALID_THREAD:

From david.holmes at oracle.com  Thu Feb  6 02:22:06 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Feb 2014 20:22:06 +1000
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F35E1D.3040507@oracle.com>
References: <52F35E1D.3040507@oracle.com>
Message-ID: <52F3624E.8010404@oracle.com>

Hi Alan,

On 6/02/2014 8:04 PM, Alan Bateman wrote:
>
> This is a drive-by fix to the JDWP agent to fix 50+ warnings that have
> been annoying me, see:
>      https://bugs.openjdk.java.net/browse/JDK-8033758
>
> The bulk of the warnings stem from using a switch statement to switch on
> JVMTI and JDWP internal "agent" errors. The agent errors are arranged
> (in util.h) so that they have integer values beyond the range of the
> JVMTI errors.
>
> There are several ways to deal with this, including suppressing the
> warning. I toyed with splitting the switch statement into two so that
> the JVMTI and agent errors are handled separately. It didn't seem worth
> it, and the simplest fix is just to cast the jvmtiError to an int.

Ok - not the nicest but this isn't worth any additional effort.

> The other warning (in ArrayTypeImpl's writeNewObjectArray) is just that
> the compiler doesn't know if componentClass has been set in
> getComponentClass. The simplest thing here is just to initialize to NULL.

Ok.

David
-----

> The proposed changes are below.
>
> -Alan
>
>
> diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
> --- a/src/share/back/ArrayTypeImpl.c
> +++ b/src/share/back/ArrayTypeImpl.c
> @@ -112,7 +112,7 @@
>       WITH_LOCAL_REFS(env, 1) {
>
>           jarray array;
> -        jclass componentClass;
> +        jclass componentClass = NULL;
>           jdwpError serror;
>
>           serror = getComponentClass(env, arrayClass,
> diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
> --- a/src/share/back/error_messages.c
> +++ b/src/share/back/error_messages.c
> @@ -140,7 +140,7 @@
>   const char *
>   jvmtiErrorText(jvmtiError error)
>   {
> -    switch (error) {
> +    switch ((int)error) {
>           CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
>           CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
>           CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
> diff --git a/src/share/back/util.c b/src/share/back/util.c
> --- a/src/share/back/util.c
> +++ b/src/share/back/util.c
> @@ -2122,7 +2122,7 @@
>   jdwpError
>   map2jdwpError(jvmtiError error)
>   {
> -    switch ( error ) {
> +    switch ( (int)error ) {
>           case JVMTI_ERROR_NONE:
>               return JDWP_ERROR(NONE);
>           case AGENT_ERROR_INVALID_THREAD:

From staffan.larsen at oracle.com  Thu Feb  6 02:34:00 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 6 Feb 2014 11:34:00 +0100
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F35E1D.3040507@oracle.com>
References: <52F35E1D.3040507@oracle.com>
Message-ID: <8E79CC47-31B2-4DAB-B8FB-AA56B6B71188@oracle.com>

Looks good!

Thanks,
/Staffan

On 6 feb 2014, at 11:04, Alan Bateman <Alan.Bateman at oracle.com> wrote:

> 
> This is a drive-by fix to the JDWP agent to fix 50+ warnings that have been annoying me, see:
>    https://bugs.openjdk.java.net/browse/JDK-8033758
> 
> The bulk of the warnings stem from using a switch statement to switch on JVMTI and JDWP internal "agent" errors. The agent errors are arranged (in util.h) so that they have integer values beyond the range of the JVMTI errors.
> 
> There are several ways to deal with this, including suppressing the warning. I toyed with splitting the switch statement into two so that the JVMTI and agent errors are handled separately. It didn't seem worth it, and the simplest fix is just to cast the jvmtiError to an int.
> 
> The other warning (in ArrayTypeImpl's writeNewObjectArray) is just that the compiler doesn't know if componentClass has been set in getComponentClass. The simplest thing here is just to initialize to NULL.
> 
> The proposed changes are below.
> 
> -Alan
> 
> 
> diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
> --- a/src/share/back/ArrayTypeImpl.c
> +++ b/src/share/back/ArrayTypeImpl.c
> @@ -112,7 +112,7 @@
>     WITH_LOCAL_REFS(env, 1) {
> 
>         jarray array;
> -        jclass componentClass;
> +        jclass componentClass = NULL;
>         jdwpError serror;
> 
>         serror = getComponentClass(env, arrayClass,
> diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
> --- a/src/share/back/error_messages.c
> +++ b/src/share/back/error_messages.c
> @@ -140,7 +140,7 @@
> const char *
> jvmtiErrorText(jvmtiError error)
> {
> -    switch (error) {
> +    switch ((int)error) {
>         CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
>         CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
>         CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
> diff --git a/src/share/back/util.c b/src/share/back/util.c
> --- a/src/share/back/util.c
> +++ b/src/share/back/util.c
> @@ -2122,7 +2122,7 @@
> jdwpError
> map2jdwpError(jvmtiError error)
> {
> -    switch ( error ) {
> +    switch ( (int)error ) {
>         case JVMTI_ERROR_NONE:
>             return JDWP_ERROR(NONE);
>         case AGENT_ERROR_INVALID_THREAD:


From serguei.spitsyn at oracle.com  Thu Feb  6 02:45:31 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 06 Feb 2014 02:45:31 -0800
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F35E1D.3040507@oracle.com>
References: <52F35E1D.3040507@oracle.com>
Message-ID: <52F367CB.6080905@oracle.com>

Looks good.

Thanks, Alan!
Serguei

On 2/6/14 2:04 AM, Alan Bateman wrote:
>
> This is a drive-by fix to the JDWP agent to fix 50+ warnings that have 
> been annoying me, see:
>     https://bugs.openjdk.java.net/browse/JDK-8033758
>
> The bulk of the warnings stem from using a switch statement to switch 
> on JVMTI and JDWP internal "agent" errors. The agent errors are 
> arranged (in util.h) so that they have integer values beyond the range 
> of the JVMTI errors.
>
> There are several ways to deal with this, including suppressing the 
> warning. I toyed with splitting the switch statement into two so that 
> the JVMTI and agent errors are handled separately. It didn't seem
> worth it, and the simplest fix is just to cast the jvmtiError to an int.
>
> The other warning (in ArrayTypeImpl's writeNewObjectArray) is just 
> that the compiler doesn't know if componentClass has been set in 
> getComponentClass. The simplest thing here is just to initialize to NULL.
>
> The proposed changes are below.
>
> -Alan
>
>
> diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
> --- a/src/share/back/ArrayTypeImpl.c
> +++ b/src/share/back/ArrayTypeImpl.c
> @@ -112,7 +112,7 @@
>      WITH_LOCAL_REFS(env, 1) {
>
>          jarray array;
> -        jclass componentClass;
> +        jclass componentClass = NULL;
>          jdwpError serror;
>
>          serror = getComponentClass(env, arrayClass,
> diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
> --- a/src/share/back/error_messages.c
> +++ b/src/share/back/error_messages.c
> @@ -140,7 +140,7 @@
>  const char *
>  jvmtiErrorText(jvmtiError error)
>  {
> -    switch (error) {
> +    switch ((int)error) {
>          CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
> diff --git a/src/share/back/util.c b/src/share/back/util.c
> --- a/src/share/back/util.c
> +++ b/src/share/back/util.c
> @@ -2122,7 +2122,7 @@
>  jdwpError
>  map2jdwpError(jvmtiError error)
>  {
> -    switch ( error ) {
> +    switch ( (int)error ) {
>          case JVMTI_ERROR_NONE:
>              return JDWP_ERROR(NONE);
>          case AGENT_ERROR_INVALID_THREAD:


From Alan.Bateman at oracle.com  Thu Feb  6 02:46:54 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 06 Feb 2014 10:46:54 +0000
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F3624E.8010404@oracle.com>
References: <52F35E1D.3040507@oracle.com> <52F3624E.8010404@oracle.com>
Message-ID: <52F3681E.6010302@oracle.com>

On 06/02/2014 10:22, David Holmes wrote:
>
> Ok - not the nicest but this isn't worth any additional effort.
Right, it's not worth it.

Thanks for the quick review (Staffan too). I've just run the JDI tests 
on all platforms (as that exercises JDWP) and the tests are passing so I 
think we are good to go.

-Alan

From daniel.daugherty at oracle.com  Thu Feb  6 06:07:59 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Feb 2014 07:07:59 -0700
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <52F30F7C.7010407@oracle.com>
References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com>
Message-ID: <52F3973F.3040509@oracle.com>

On 2/5/14 9:28 PM, David Holmes wrote:
> Hi Dan,
>
> Looks good to me.

Thanks for the review!


> (I never run the install targets :( )

Neither do I and apparently neither does JPRT... That's how this
slipped through the cracks...

Dan


>
> Thanks,
> David
>
> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote:
>> This code review request is going to three different aliases.
>> Don't use Thunderbird's "reply to list" option since it will
>> pick just _one_ of the _three_ lists.
>>
>>
>> Greetings,
>>
>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
>> makefile fix our way. Here are the bug and webrev URLs:
>>
>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>>
>>      8033714 hotspot 'install_jvm' bld target broken with
>> ZIP_DEBUGINFO_FILES=0
>>      https://bugs.openjdk.java.net/browse/JDK-8033714
>>
>> As you might guess from the bug synopsis, this fix is needed when
>> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0).
>> Based on the Graal project fix, I've updated a few other places where
>> building with FDS disabled is affected.
>>
>> As always, comments and suggestions are welcome.
>>
>> Dan


From daniel.daugherty at oracle.com  Thu Feb  6 06:57:57 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Feb 2014 07:57:57 -0700
Subject: 8033758: gcc warnings compiling jdk/src/share/back
In-Reply-To: <52F35E1D.3040507@oracle.com>
References: <52F35E1D.3040507@oracle.com>
Message-ID: <52F3A2F5.3010807@oracle.com>

Looks good to me. Nice to see you back in Serviceability code...:-)

Dan


On 2/6/14 3:04 AM, Alan Bateman wrote:
>
> This is a drive-by fix to the JDWP agent to fix 50+ warnings that have 
> been annoying me, see:
>     https://bugs.openjdk.java.net/browse/JDK-8033758
>
> The bulk of the warnings stem from using a switch statement to switch 
> on JVMTI and JDWP internal "agent" errors. The agent errors are 
> arranged (in util.h) so that they have integer values beyond the range 
> of the JVMTI errors.
>
> There are several ways to deal with this, including suppressing the 
> warning. I toyed with splitting the switch statement into two so that 
> the JVMTI and agent errors are handled separately. It didn't seem
> worth it, and the simplest fix is just to cast the jvmtiError to an int.
>
> The other warning (in ArrayTypeImpl's writeNewObjectArray) is just 
> that the compiler doesn't know if componentClass has been set in 
> getComponentClass. The simplest thing here is just to initialize to NULL.
>
> The proposed changes are below.
>
> -Alan
>
>
> diff --git a/src/share/back/ArrayTypeImpl.c b/src/share/back/ArrayTypeImpl.c
> --- a/src/share/back/ArrayTypeImpl.c
> +++ b/src/share/back/ArrayTypeImpl.c
> @@ -112,7 +112,7 @@
>      WITH_LOCAL_REFS(env, 1) {
>
>          jarray array;
> -        jclass componentClass;
> +        jclass componentClass = NULL;
>          jdwpError serror;
>
>          serror = getComponentClass(env, arrayClass,
> diff --git a/src/share/back/error_messages.c b/src/share/back/error_messages.c
> --- a/src/share/back/error_messages.c
> +++ b/src/share/back/error_messages.c
> @@ -140,7 +140,7 @@
>  const char *
>  jvmtiErrorText(jvmtiError error)
>  {
> -    switch (error) {
> +    switch ((int)error) {
>          CASE_RETURN_TEXT(JVMTI_ERROR_NONE)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD)
>          CASE_RETURN_TEXT(JVMTI_ERROR_INVALID_THREAD_GROUP)
> diff --git a/src/share/back/util.c b/src/share/back/util.c
> --- a/src/share/back/util.c
> +++ b/src/share/back/util.c
> @@ -2122,7 +2122,7 @@
>  jdwpError
>  map2jdwpError(jvmtiError error)
>  {
> -    switch ( error ) {
> +    switch ( (int)error ) {
>          case JVMTI_ERROR_NONE:
>              return JDWP_ERROR(NONE);
>          case AGENT_ERROR_INVALID_THREAD:


From tom.rodriguez at oracle.com  Thu Feb  6 08:07:18 2014
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Thu, 6 Feb 2014 08:07:18 -0800
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <52F3973F.3040509@oracle.com>
References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com>
	<52F3973F.3040509@oracle.com>
Message-ID: <1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com>

Looks good to me too.  Thanks for fixing this.

tom

On Feb 6, 2014, at 6:07 AM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:

> On 2/5/14 9:28 PM, David Holmes wrote:
>> Hi Dan,
>> 
>> Looks good to me.
> 
> Thanks for the review!
> 
> 
>> (I never run the install targets :( )
> 
> Neither do I and apparently neither does JPRT... That's how this
> slipped through the cracks...
> 
> Dan
> 
> 
>> 
>> Thanks,
>> David
>> 
>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote:
>>> This code review request is going to three different aliases.
>>> Don't use Thunderbird's "reply to list" option since it will
>>> pick just _one_ of the _three_ lists.
>>> 
>>> 
>>> Greetings,
>>> 
>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
>>> makefile fix our way. Here are the bug and webrev URLs:
>>> 
>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>>> 
>>>     8033714 hotspot 'install_jvm' bld target broken with
>>> ZIP_DEBUGINFO_FILES=0
>>>     https://bugs.openjdk.java.net/browse/JDK-8033714
>>> 
>>> As you might guess from the bug synopsis, this fix is needed when
>>> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0).
>>> Based on the Graal project fix, I've updated a few other places where
>>> building with FDS disabled is affected.
>>> 
>>> As always, comments and suggestions are welcome.
>>> 
>>> Dan
> 


From daniel.daugherty at oracle.com  Thu Feb  6 08:32:37 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Feb 2014 09:32:37 -0700
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <C85EFACD-5D9B-4F1E-B811-69DE12EE14D4@oracle.com>
References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com>
	<52F3973F.3040509@oracle.com>
	<1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com>
	<C85EFACD-5D9B-4F1E-B811-69DE12EE14D4@oracle.com>
Message-ID: <52F3B925.5030602@oracle.com>

On 2/6/14 9:29 AM, Doug Simon wrote:
> Not sure if I'm being asked for a review, but if so, looks good.

Yes, I was looking for a review. In particular because I tweaked your
original fix...

Thanks for the review!

Dan


>
> On Feb 6, 2014, at 5:07 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>
>> Looks good to me too.  Thanks for fixing this.
>>
>> tom
>>
>> On Feb 6, 2014, at 6:07 AM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>>
>>> On 2/5/14 9:28 PM, David Holmes wrote:
>>>> Hi Dan,
>>>>
>>>> Looks good to me.
>>> Thanks for the review!
>>>
>>>
>>>> (I never run the install targets :( )
>>> Neither do I and apparently neither does JPRT... That's how this
>>> slipped through the cracks...
>>>
>>> Dan
>>>
>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote:
>>>>> This code review request is going to three different aliases.
>>>>> Don't use Thunderbird's "reply to list" option since it will
>>>>> pick just _one_ of the _three_ lists.
>>>>>
>>>>>
>>>>> Greetings,
>>>>>
>>>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
>>>>> makefile fix our way. Here are the bug and webrev URLs:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>>>>>
>>>>>     8033714 hotspot 'install_jvm' bld target broken with
>>>>> ZIP_DEBUGINFO_FILES=0
>>>>>     https://bugs.openjdk.java.net/browse/JDK-8033714
>>>>>
>>>>> As you might guess from the bug synopsis, this fix is needed when
>>>>> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0).
>>>>> Based on the Graal project fix, I've updated a few other places where
>>>>> building with FDS disabled is affected.
>>>>>
>>>>> As always, comments and suggestions are welcome.
>>>>>
>>>>> Dan


From doug.simon at oracle.com  Thu Feb  6 08:29:02 2014
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 6 Feb 2014 17:29:02 +0100
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com>
References: <52F2C754.40809@oracle.com> <52F30F7C.7010407@oracle.com>
	<52F3973F.3040509@oracle.com>
	<1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com>
Message-ID: <C85EFACD-5D9B-4F1E-B811-69DE12EE14D4@oracle.com>

Not sure if I'm being asked for a review, but if so, looks good.

On Feb 6, 2014, at 5:07 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:

> Looks good to me too.  Thanks for fixing this.
> 
> tom
> 
> On Feb 6, 2014, at 6:07 AM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
> 
>> On 2/5/14 9:28 PM, David Holmes wrote:
>>> Hi Dan,
>>> 
>>> Looks good to me.
>> 
>> Thanks for the review!
>> 
>> 
>>> (I never run the install targets :( )
>> 
>> Neither do I and apparently neither does JPRT... That's how this
>> slipped through the cracks...
>> 
>> Dan
>> 
>> 
>>> 
>>> Thanks,
>>> David
>>> 
>>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote:
>>>> This code review request is going to three different aliases.
>>>> Don't use Thunderbird's "reply to list" option since it will
>>>> pick just _one_ of the _three_ lists.
>>>> 
>>>> 
>>>> Greetings,
>>>> 
>>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
>>>> makefile fix our way. Here are the bug and webrev URLs:
>>>> 
>>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>>>> 
>>>>    8033714 hotspot 'install_jvm' bld target broken with
>>>> ZIP_DEBUGINFO_FILES=0
>>>>    https://bugs.openjdk.java.net/browse/JDK-8033714
>>>> 
>>>> As you might guess from the bug synopsis, this fix is needed when
>>>> building without ZIP'ing the debuginfo files (ZIP_DEBUGINFO_FILES=0).
>>>> Based on the Graal project fix, I've updated a few other places where
>>>> building with FDS disabled is affected.
>>>> 
>>>> As always, comments and suggestions are welcome.
>>>> 
>>>> Dan
>> 
> 


From tim.bell at oracle.com  Thu Feb  6 08:53:53 2014
From: tim.bell at oracle.com (Tim Bell)
Date: Thu, 06 Feb 2014 08:53:53 -0800
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <52F3B925.5030602@oracle.com>
References: <52F2C754.40809@oracle.com>
	<52F30F7C.7010407@oracle.com>	<52F3973F.3040509@oracle.com>	<1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com>	<C85EFACD-5D9B-4F1E-B811-69DE12EE14D4@oracle.com>
	<52F3B925.5030602@oracle.com>
Message-ID: <52F3BE21.9000100@oracle.com>

On 02/ 6/14 08:32 AM, Daniel D. Daugherty wrote:

Looks good to me, Dan

Tim

>>>>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote:
>>>>>> This code review request is going to three different aliases.
>>>>>> Don't use Thunderbird's "reply to list" option since it will
>>>>>> pick just _one_ of the _three_ lists.
>>>>>>
>>>>>>
>>>>>> Greetings,
>>>>>>
>>>>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
>>>>>> makefile fix our way. Here are the bug and webrev URLs:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/
>>>>>>
>>>>>>     8033714 hotspot 'install_jvm' bld target broken with
>>>>>> ZIP_DEBUGINFO_FILES=0
>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8033714
>>>>>>
>>>>>> As you might guess from the bug synopsis, this fix is needed when
>>>>>> building without ZIP'ing the debuginfo files 
>>>>>> (ZIP_DEBUGINFO_FILES=0).
>>>>>> Based on the Graal project fix, I've updated a few other places 
>>>>>> where
>>>>>> building with FDS disabled is affected.
>>>>>>
>>>>>> As always, comments and suggestions are welcome.
>>>>>>
>>>>>> Dan
>


From daniel.daugherty at oracle.com  Thu Feb  6 09:07:00 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Feb 2014 10:07:00 -0700
Subject: code review round 0 for minor FDS makefile fix (8033714)
In-Reply-To: <52F3BE21.9000100@oracle.com>
References: <52F2C754.40809@oracle.com>
	<52F30F7C.7010407@oracle.com>	<52F3973F.3040509@oracle.com>	<1ECD0F23-FF74-442B-AEDC-C6CDD5AE9738@oracle.com>	<C85EFACD-5D9B-4F1E-B811-69DE12EE14D4@oracle.com>
	<52F3B925.5030602@oracle.com> <52F3BE21.9000100@oracle.com>
Message-ID: <52F3C134.2000004@oracle.com>

Thanks for the review!

Dan


On 2/6/14 9:53 AM, Tim Bell wrote:
> On 02/ 6/14 08:32 AM, Daniel D. Daugherty wrote:
>
> Looks good to me, Dan
>
> Tim
>
>>>>>> On 6/02/2014 9:20 AM, Daniel D. Daugherty wrote:
>>>>>>> This code review request is going to three different aliases.
>>>>>>> Don't use Thunderbird's "reply to list" option since it will
>>>>>>> pick just _one_ of the _three_ lists.
>>>>>>>
>>>>>>>
>>>>>>> Greetings,
>>>>>>>
>>>>>>> Doug Simon and Tom Rodriguez have sent a Full Debug Symbols (FDS)
>>>>>>> makefile fix our way. Here are the bug and webrev URLs:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8033714-webrev/0-jdk9-hs-runtime/ 
>>>>>>>
>>>>>>>
>>>>>>>     8033714 hotspot 'install_jvm' bld target broken with
>>>>>>> ZIP_DEBUGINFO_FILES=0
>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8033714
>>>>>>>
>>>>>>> As you might guess from the bug synopsis, this fix is needed when
>>>>>>> building without ZIP'ing the debuginfo files 
>>>>>>> (ZIP_DEBUGINFO_FILES=0).
>>>>>>> Based on the Graal project fix, I've updated a few other places 
>>>>>>> where
>>>>>>> building with FDS disabled is affected.
>>>>>>>
>>>>>>> As always, comments and suggestions are welcome.
>>>>>>>
>>>>>>> Dan
>>
>


From henry.jen at oracle.com  Thu Feb  6 10:34:09 2014
From: henry.jen at oracle.com (henry.jen at oracle.com)
Date: Thu, 06 Feb 2014 18:34:09 +0000
Subject: hg: jdk8/tl/jdk: 8033590: java.util.Comparator::thenComparing has
	unnecessary type restriction
Message-ID: <20140206183432.A898562A7F@hg.openjdk.java.net>

Changeset: 7534523b4174
Author:    henryjen
Date:      2014-02-06 10:30 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/7534523b4174

8033590: java.util.Comparator::thenComparing has unnecessary type restriction
Reviewed-by: psandoz

! src/share/classes/java/util/Comparator.java
! test/java/util/Comparator/TypeTest.java


From serguei.spitsyn at oracle.com  Thu Feb  6 15:17:53 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 06 Feb 2014 15:17:53 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <52F01569.3070201@oracle.com>
References: <52F01569.3070201@oracle.com>
Message-ID: <52F41821.10403@oracle.com>

Runtime team,

This fix was reviewed by Vladimir K. and me.
I just wanted to check whether you would like to review it as well.
If not, then I will push it.

Thanks,
Serguei

On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>   https://bugs.openjdk.java.net/browse/JDK-8025841
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 
>
>
> Summary:
>
>   The fix contributed by Oleg Mazurov to improve profiling data quality.
>   It moves the "vtable stub" dynamic code notification to the right 
> place.
>   I've already reviewed the fix, and it looks good to me.
>
>   Bug report description:
>
>    "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets 
> scheduled when
>     a new chunk of memory for subsequent vtable and itable stubs is 
> allocated.
>     That chunk is uninitialized (contains zeros or garbage) although 
> due to the fact
>     that the actual event delivery is deferred, at least one vtable 
> comes out right.
>
>     This event should describe an individual vtable/itable stub (base 
> address and size)
>     and only after it's been created (memory is actually populated 
> with code).
>     Where VM diagnostic messages about vtable/itable stubs are issued 
> upon
>     -XX:+PrintAdapterHandlers appears exactly the right place for 
> JVMTI events as well.
>
>     Getting vtables/itables right is important in the context of 
> performance analysis as
>     that dynamically generated code may accumulate quite noticeable 
> CPU time
>     (especially itables), sometimes larger than the actual Java methods 
> called."
>
>
> Testing:
>   Oleg tested it in the Oracle Studio Performance Analyzer environment.
>   nsk.jvmti, nsk.jdi, nsk.jdwp,
>   In progress: Jtreg com/sun/jdi, java/lang/instrument
>
>
> Thanks,
> Serguei
>


From coleen.phillimore at oracle.com  Thu Feb  6 15:32:13 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 06 Feb 2014 18:32:13 -0500
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <52F41821.10403@oracle.com>
References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com>
Message-ID: <52F41B7D.6090206@oracle.com>


Hi, I clicked on this a couple of times.  It seems okay, but isn't there a 
safer way to identify code blobs that are vtable stubs without looking 
at the name (which can change while creating it)?  A comment at least 
where you create "vtable chunks" would be good.  It seems that someone 
might want to rename it "vtable or itable buffers", or something like that.

thanks,
Coleen

On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote:
> Runtime team,
>
> This fix was reviewed by Vladimir K. and me.
> Just wanted to make sure if you would like to review it as well.
> If not, then I will push it.
>
> Thanks,
> Serguei
>
> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-8025841
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 
>>
>>
>> Summary:
>>
>>   The fix contributed by Oleg Mazurov to improve profiling data quality.
>>   It moves the "vtable stub" dynamic code notification to the right 
>> place.
>>   I've already reviewed the fix, and it looks good to me.
>>
>>   Bug report description:
>>
>>    "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets 
>> scheduled when
>>     a new chunk of memory for subsequent vtable and itable stubs is 
>> allocated.
>>     That chunk is uninitialized (contains zeros or garbage) although 
>> due to the fact
>>     that the actual event delivery is deferred, at least one vtable 
>> comes out right.
>>
>>     This event should describe an individual vtable/itable stub (base 
>> address and size)
>>     and only after it's been created (memory is actually populated 
>> with code).
>>     Where VM diagnostic messages about vtable/itable stubs are issued 
>> upon
>>     -XX:+PrintAdapterHandlers appears exactly the right place for 
>> JVMTI events as well.
>>
>>     Getting vtables/itables right is important in the context of 
>> performance analysis as
>>     that dynamically generated code may accumulate quite noticeable 
>> CPU time
>>     (especially itables), sometimes larger than the actual Java 
>> methods called."
>>
>>
>> Testing:
>>   Oleg tested it in the Oracle Studio Performance Analyzer environment.
>>   nsk.jvmti, nsk.jdi, nsk.jdwp,
>>   In progress: Jtreg com/sun/jdi, java/lang/instrument
>>
>>
>> Thanks,
>> Serguei
>>
>


From oleg.mazurov at oracle.com  Thu Feb  6 15:52:28 2014
From: oleg.mazurov at oracle.com (Oleg Mazurov)
Date: Thu, 6 Feb 2014 15:52:28 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <52F41B7D.6090206@oracle.com>
References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com>
	<52F41B7D.6090206@oracle.com>
Message-ID: <B6B09E1E-E531-4C0F-BCC0-FF45032A30C0@oracle.com>

My understanding was that a buffer blob was just that - a buffer. It could potentially contain code fragments of different kinds.
Thus, is_buffer_blob() was the closest type check available. I agree that a dependency on its name is not reliable, though testing
will reveal if the condition turns false for "vtable chunks" due to a name change (I had to deal with that particular test; Serguei
should be able to identify it). Adding a comment where the name is defined (vtableStubs.cpp), noting that such a dependency exists,
is a good idea.
Thanks,

    -- Oleg

On Feb 6, 2014, at 3:32 PM, Coleen Phillimore wrote:

> 
> Hi, I clicked on this a couple times.  It seems okay but isn't there a safer way to identify code blobs that are vtable stubs, without looking at the name (which can change in while creating it).  A comment at least when you create "vtable chunks" would be good.   It seems that someone might want to rename it "vtable or itable buffers", or something like that.
> 
> thanks,
> Coleen
> 
> On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote:
>> Runtime team, 
>> 
>> This fix was reviewed by Vladimir K. and me. 
>> Just wanted to make sure if you would like to review it as well. 
>> If not, then I will push it. 
>> 
>> Thanks, 
>> Serguei 
>> 
>> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote: 
>>> Please, review the fix for: 
>>>   https://bugs.openjdk.java.net/browse/JDK-8025841 
>>> 
>>> 
>>> Open webrev: 
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 
>>> 
>>> Summary: 
>>> 
>>>   The fix contributed by Oleg Mazurov to improve profiling data quality. 
>>>   It moves the "vtable stub" dynamic code notification to the right place. 
>>>   I've already reviewed the fix, and it looks good to me. 
>>> 
>>>   Bug report description: 
>>> 
>>>    "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled when 
>>>     a new chunk of memory for subsequent vtable and itable stubs is allocated. 
>>>     That chunk is uninitialized (contains zeros or garbage) although due to the fact 
>>>     that the actual event delivery is deferred, at least one vtable comes out right. 
>>> 
>>>     This event should describe an individual vtable/itable stub (base address and size) 
>>>     and only after it's been created (memory is actually populated with code). 
>>>     Where VM diagnostic messages about vtable/itable stubs are issued upon 
>>>     -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI events as well. 
>>> 
>>>     Getting vtables/itables right is important in the context of performance analysis as 
>>>     that dynamically generated code may accumulate quite noticeable CPU time 
>>>     (especially itables), sometimes larger than the actual Java methods called." 
>>> 
>>> 
>>> Testing: 
>>>   Oleg tested it in the Oracle Studio Performance Analyzer environment. 
>>>   nsk.jvmti, nsk.jdi, nsk.jdwp, 
>>>   In progress: Jtreg com/sun/jdi, java/lang/instrument 
>>> 
>>> 
>>> Thanks, 
>>> Serguei 
>>> 
>> 
> 


From coleen.phillimore at oracle.com  Thu Feb  6 16:23:05 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 06 Feb 2014 19:23:05 -0500
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <B6B09E1E-E531-4C0F-BCC0-FF45032A30C0@oracle.com>
References: <52F01569.3070201@oracle.com> <52F41821.10403@oracle.com>
	<52F41B7D.6090206@oracle.com>
	<B6B09E1E-E531-4C0F-BCC0-FF45032A30C0@oracle.com>
Message-ID: <52F42769.3040603@oracle.com>


Okay, thanks for adding a comment.
Coleen

On 2/6/14 6:52 PM, Oleg Mazurov wrote:
> My understanding was that a buffer blob was just that - a buffer. 
> Could potentially contain code fragments of different kinds.
> Thus, is_buffer_blob() was the closest type available. Agree that a 
> dependency on its name is not reliable, though testing
> will reveal if the condition turns false for "vtable chunks" due to a 
> name change (I had to deal with that particular test, Serguei
> should be able to identify it). Adding a comment to where the name is 
> defined (vtableStubs.cpp) that such a dependency exists
> is a good idea.
> Thanks,
>
>     -- Oleg
>
> On Feb 6, 2014, at 3:32 PM, Coleen Phillimore wrote:
>
>>
>> Hi, I clicked on this a couple times.  It seems okay but isn't there 
>> a safer way to identify code blobs that are vtable stubs, without 
>> looking at the name (which can change in while creating it).  A 
>> comment at least when you create "vtable chunks" would be good.   It 
>> seems that someone might want to rename it "vtable or itable 
>> buffers", or something like that.
>>
>> thanks,
>> Coleen
>>
>> On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote:
>>> Runtime team,
>>>
>>> This fix was reviewed by Vladimir K. and me.
>>> Just wanted to make sure if you would like to review it as well.
>>> If not, then I will push it.
>>>
>>> Thanks,
>>> Serguei
>>>
>>> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for:
>>>> https://bugs.openjdk.java.net/browse/JDK-8025841
>>>>
>>>>
>>>> Open webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 
>>>>
>>>>
>>>> Summary:
>>>>
>>>>   The fix contributed by Oleg Mazurov to improve profiling data 
>>>> quality.
>>>>   It moves the "vtable stub" dynamic code notification to the right 
>>>> place.
>>>>   I've already reviewed the fix, and it looks good to me.
>>>>
>>>>   Bug report description:
>>>>
>>>>    "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets 
>>>> scheduled when
>>>>     a new chunk of memory for subsequent vtable and itable stubs is 
>>>> allocated.
>>>>     That chunk is uninitialized (contains zeros or garbage) 
>>>> although due to the fact
>>>>     that the actual event delivery is deferred, at least one vtable 
>>>> comes out right.
>>>>
>>>>     This event should describe an individual vtable/itable stub 
>>>> (base address and size)
>>>>     and only after it's been created (memory is actually populated 
>>>> with code).
>>>>     Where VM diagnostic messages about vtable/itable stubs are 
>>>> issued upon
>>>>     -XX:+PrintAdapterHandlers appears exactly the right place for 
>>>> JVMTI events as well.
>>>>
>>>>     Getting vtables/itables right is important in the context of 
>>>> performance analysis as
>>>>     that dynamically generated code may accumulate quite noticeable 
>>>> CPU time
>>>>     (especially itables), sometimes larger than the actual Java 
>>>> methods called."
>>>>
>>>>
>>>> Testing:
>>>>   Oleg tested it in the Oracle Studio Performance Analyzer 
>>>> environment.
>>>>   nsk.jvmti, nsk.jdi, nsk.jdwp,
>>>>   In progress: Jtreg com/sun/jdi, java/lang/instrument
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>
>>
>


From staffan.larsen at oracle.com  Fri Feb  7 02:46:07 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 7 Feb 2014 11:46:07 +0100
Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream and
	RandomAccessFile
Message-ID: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>

A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be non-native and instead calling private native methods simplifies instrumentation.
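
To illustrate the shape of the change, here is a minimal sketch. It is not the code in the webrev: the class name SketchStream and the private method name read0 are hypothetical, and no native library is supplied. The public method becomes ordinary Java with a bytecode body that an agent can rewrite, while the native work moves to a private method.

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for a stream class; not the actual JDK source.
class SketchStream {
    // Public entry point is plain Java, so it has a bytecode body
    // that an instrumentation agent can rewrite.
    public int read() throws IOException {
        return read0();
    }

    // The native work moves to a private method (the name is an assumption).
    // No native library is provided here, so calling it would fail to link.
    private native int read0() throws IOException;

    // Reports whether the named no-argument method carries the native modifier.
    static boolean isNative(String name) {
        try {
            Method m = SketchStream.class.getDeclaredMethod(name);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read is native:  " + isNative("read"));
        System.out.println("read0 is native: " + isNative("read0"));
    }
}
```

The reflection check only inspects modifiers, so it runs without linking the native method.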

webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/
bug: https://bugs.openjdk.java.net/browse/JDK-8033911

Thanks,
/Staffan

From Alan.Bateman at oracle.com  Fri Feb  7 02:56:30 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 07 Feb 2014 10:56:30 +0000
Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream
	and RandomAccessFile
In-Reply-To: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
Message-ID: <52F4BBDE.4010005@oracle.com>

On 07/02/2014 10:46, Staffan Larsen wrote:
> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be non-native and instead calling private native methods simplifies instrumentation.
>
> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8033911
>
I assume you know this already but both JVM TI and java.lang.instrument 
do have support for hooking into the resolution of native methods. It is 
of course more complicated and not for the fainthearted.

The proposed changes look okay. The only thing is that it might not be 
complete but perhaps it's not too interesting to instrument methods such 
as skip or available.

-Alan.

From dmitry.samersoff at oracle.com  Fri Feb  7 03:07:23 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 07 Feb 2014 15:07:23 +0400
Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream
	and RandomAccessFile
In-Reply-To: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
Message-ID: <52F4BE6B.8040109@oracle.com>

Staffan,

As long as you are touching this:

Is it possible to change all native methods in these two classes to have
0 at the end of the name?

i.e. readBytes => readBytes0

It's purely cosmetic, but it would considerably simplify core dump reading
and later grep-ing.

-Dmitry

On 2014-02-07 14:46, Staffan Larsen wrote:
> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be non-native and instead calling private native methods simplifies instrumentation. 
> 
> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8033911
> 
> Thanks,
> /Staffan
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From staffan.larsen at oracle.com  Fri Feb  7 03:27:46 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 7 Feb 2014 12:27:46 +0100
Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream
	and RandomAccessFile
In-Reply-To: <52F4BBDE.4010005@oracle.com>
References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
	<52F4BBDE.4010005@oracle.com>
Message-ID: <E3D45C2F-E2EF-452B-9B02-B82F337A6CBC@oracle.com>


On 7 feb 2014, at 11:56, Alan Bateman <Alan.Bateman at oracle.com> wrote:

> On 07/02/2014 10:46, Staffan Larsen wrote:
>> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be non-native and instead calling private native methods simplifies instrumentation.
>> 
>> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8033911
>> 
> I assume you know this already but both JVM TI and java.lang.instrument do have support for hooking into the resolution of native methods. It is of course more complicated and not for the fainthearted.

Yes, but a large drawback of using SetNativeMethodPrefix is that it requires schema changes to the class file (adding a method). This is not possible when dynamically attaching an instrumentation agent.
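
As a rough sketch of why the prefix approach needs a schema change (assumptions: the class names ShippedStream and PrefixedStream and the prefix "wrapped_" are made up for illustration, and no native library is supplied), compare a class as shipped with the same class after a prefix-style transformation. The transformed class declares one more method than the original, which is the kind of change that cannot be made to an already-loaded class.

```java
// Hypothetical class as shipped: one public native method.
class ShippedStream {
    public native int read();
}

// The same class after a prefix-style transformation, using "wrapped_" as an
// assumed prefix: the native method is renamed and a Java wrapper is added.
class PrefixedStream {
    public int read() {
        // An agent would place its probes around this call.
        return wrapped_read();
    }

    private native int wrapped_read();
}

class PrefixSketch {
    // Counts the methods a class declares itself (inherited ones excluded).
    static int declaredMethodCount(Class<?> c) {
        return c.getDeclaredMethods().length;
    }

    public static void main(String[] args) {
        System.out.println("shipped:  " + declaredMethodCount(ShippedStream.class));
        System.out.println("prefixed: " + declaredMethodCount(PrefixedStream.class));
    }
}
```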

> The proposed changes look okay. The only thing is that it might not be complete but perhaps it's not too interesting to instrument methods such as skip or available.

That was my thinking, too.

Thanks,
/Staffan

From staffan.larsen at oracle.com  Fri Feb  7 03:28:12 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 7 Feb 2014 12:28:12 +0100
Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream
	and RandomAccessFile
In-Reply-To: <52F4BE6B.8040109@oracle.com>
References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
	<52F4BE6B.8040109@oracle.com>
Message-ID: <D954ED50-C80E-4C66-AD35-5E71B53BB0FE@oracle.com>

I would prefer that to be a different change.

Thanks,
/Staffan

On 7 feb 2014, at 12:07, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:

> Staffan,
> 
> As far as you touching this.
> 
> Is it possible to change all native methods in these two classes to have
> 0 at the end of name?
> 
> i.e. readBytes => readBytes0
> 
> it's pure cosmetic, but fairly simplify core dump reading and later
> grep-ing.
> 
> -Dmitry
> 
> On 2014-02-07 14:46, Staffan Larsen wrote:
>> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be non-native and instead calling private native methods simplifies instrumentation. 
>> 
>> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8033911
>> 
>> Thanks,
>> /Staffan
>> 
> 
> 
> -- 
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the sources.


From dmitry.samersoff at oracle.com  Fri Feb  7 03:30:23 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 07 Feb 2014 15:30:23 +0400
Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream
	and RandomAccessFile
In-Reply-To: <D954ED50-C80E-4C66-AD35-5E71B53BB0FE@oracle.com>
References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
	<52F4BE6B.8040109@oracle.com>
	<D954ED50-C80E-4C66-AD35-5E71B53BB0FE@oracle.com>
Message-ID: <52F4C3CF.4090505@oracle.com>

Staffan,

OK! Looks good to me.

-Dmitry

On 2014-02-07 15:28, Staffan Larsen wrote:
> I would prefer that to be a different change.
> 
> Thanks,
> /Staffan
> 
> On 7 feb 2014, at 12:07, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:
> 
>> Staffan,
>>
>> As far as you touching this.
>>
>> Is it possible to change all native methods in these two classes to have
>> 0 at the end of name?
>>
>> i.e. readBytes => readBytes0
>>
>> it's pure cosmetic, but fairly simplify core dump reading and later
>> grep-ing.
>>
>> -Dmitry
>>
>> On 2014-02-07 14:46, Staffan Larsen wrote:
>>> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be non-native and instead calling private native methods simplifies instrumentation. 
>>>
>>> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8033911
>>>
>>> Thanks,
>>> /Staffan
>>>
>>
>>
>> -- 
>> Dmitry Samersoff
>> Oracle Java development team, Saint Petersburg, Russia
>> * I would love to change the world, but they won't give me the sources.
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From jaroslav.bachorik at oracle.com  Fri Feb  7 03:38:59 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 07 Feb 2014 12:38:59 +0100
Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream
	and	RandomAccessFile
In-Reply-To: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
Message-ID: <3d89f5ba-3c6e-4571-b7bf-19cbad4e5b5c@email.android.com>

Looks good. Additional benefit is the compliance with the secure coding guide.

-JB-

On February 7, 2014 11:46:07 AM CET, Staffan Larsen <staffan.larsen at oracle.com> wrote:
>A few of the public read and write methods in FileInputStream and
>RandomAccessFile are declared native. This means that it is hard to
>instrument them using byte code instrumentation. Changing the public
>methods to be non-native and instead calling private native methods
>simplifies instrumentation. 
>
>webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/
>bug: https://bugs.openjdk.java.net/browse/JDK-8033911
>
>Thanks,
>/Staffan

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

From staffan.larsen at oracle.com  Fri Feb  7 04:07:04 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 7 Feb 2014 13:07:04 +0100
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
Message-ID: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>

Instrumentation agents that want to instrument FileInputStream/FileOutputStream to see which files are being accessed do not currently have access to the file system path of the stream. This is because the path is never stored in the stream class, only the file descriptor is. (This is also true for RandomAccessFile and FileChannel).

An agent could instrument the respective constructors to store the path. The problem is where to store it. To add a field, the instrumentation agent needs to change the schema of the class. This is not possible at runtime but can be done at class-loading time. However for a j.l.instrument agent these classes are already defined when the agent is first called. For a native JVMTI agent the problem becomes parsing and modifying byte codes in a native agent which is error prone and requires a lot of code to maintain.

If instead the stream classes were modified to store a reference to the path, it would be readily available for agents at a minimum of cost to the libraries. This is what this patch does. FileInputStream, FileOutputStream, RandomAccessFile and FileChannelImpl are modified to record the path they operate on in a private field. There are no accessors added to retrieve the path - it is purely stored for instrumentation purposes. The path is intentionally not resolved to be an absolute path since that would potentially add unwanted overhead. If a stream is created from a file descriptor, no path will be stored. 
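
The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: TrackedStream and PathReader are hypothetical classes, the field name "path" is assumed, and no file is actually opened. The stream keeps the path exactly as given in a private field with no accessor, and an agent-side reader reaches it reflectively.

```java
import java.io.FileDescriptor;
import java.lang.reflect.Field;

// Hypothetical stand-in for a stream class; it opens no file.
class TrackedStream {
    // Recorded only for instrumentation; deliberately no accessor.
    private final String path;

    // Created from a name: keep the path exactly as given (not made absolute).
    TrackedStream(String name) {
        this.path = name;
    }

    // Created from a descriptor: no path is known, so none is stored.
    TrackedStream(FileDescriptor fd) {
        this.path = null;
    }
}

// What an agent-side reader might do: read the private field reflectively.
class PathReader {
    static String pathOf(TrackedStream s) {
        try {
            Field f = TrackedStream.class.getDeclaredField("path");
            f.setAccessible(true);
            return (String) f.get(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(pathOf(new TrackedStream("logs/app.log")));
        System.out.println(pathOf(new TrackedStream(FileDescriptor.in)));
    }
}
```

A real agent would more likely inject bytecode that reads the field directly; reflection is used here only to keep the sketch self-contained.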

The overhead for this patch will be keeping the path String alive for a longer period of time. I hope this will not cause any problems.

A consumer of this feature will be Java Flight Recorder, but the implementation is usable by other agents as well.

webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/
bug: https://bugs.openjdk.java.net/browse/JDK-8033917

Thanks,
/Staffan

From Alan.Bateman at oracle.com  Fri Feb  7 04:27:53 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 07 Feb 2014 12:27:53 +0000
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
Message-ID: <52F4D149.8000900@oracle.com>

On 07/02/2014 12:07, Staffan Larsen wrote:
> Instrumentation agents that want to instrument FileInputStream/FileOutputStream to see which files are being accessed do not currently have access to the file system path of the stream. This is because the path is never stored in the stream class, only the file descriptor is. (This is also true for RandomAccessFile and FileChannel).
>
> An agent could instrument the respective constructors to store the path. The problem is where to store it. To add a field, the instrumentation agent needs to change the schema of the class. This is not possible at runtime but can be done at class-loading time. However for a j.l.instrument agent these classes are already defined when the agent is first called. For a native JVMTI agent the problem becomes parsing and modifying byte codes in a native agent which is error prone and requires a lot of code to maintain.
>
> If instead the stream classes were modified to store a reference to the path, it would be readily available for agents at a minimum of cost to the libraries. This is what this patch does. FileInputStream, FileOutputStream, RandomAccessFile and FileChannelImpl are modified to record the path they operate on in a private field. There are no accessors added to retrieve the path - it is purely stored for instrumentation purposes. The path is intentionally not resolved to be an absolute path since that would potentially add unwanted overhead. If a stream is created from a file descriptor, no path will be stored.
>
> The overhead for this path will be keeping the path String alive for a longer period of time. I hope this will not cause any problems.
>
> A consumer of this feature will be Java Flight Recorder, but the implementation is usable by other agents as well.
>
> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8033917
>
I have reservations about doing this, as it hints at code making use of 
private fields, which isn't good.

For the comments in FileInputStream and the other classes, it might be best to 
keep the line lengths consistent with the existing code if you can (it 
makes future side-by-side reviews a bit easier too).

In WindowsChannelFactory you've re-ordered and expanded the imports. The 
ordering of the import groups in this area has been Java SE, 
JDK-specific and then finally the JDK-internal. It's not a big deal of 
course. Personally I prefer the original static imports but I know some 
people don't.

-Alan.


From staffan.larsen at oracle.com  Fri Feb  7 04:49:55 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 7 Feb 2014 13:49:55 +0100
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <52F4D149.8000900@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
	<52F4D149.8000900@oracle.com>
Message-ID: <BF9892F1-106D-4C99-A3C3-AFF2F251F271@oracle.com>


On 7 feb 2014, at 13:27, Alan Bateman <Alan.Bateman at oracle.com> wrote:

> On 07/02/2014 12:07, Staffan Larsen wrote:
>> Instrumentation agents that want to instrument FileInputStream/FileOutputStream to see which files are being accessed do not currently have access to the file system path of the stream. This is because the path is never stored in the stream class, only the file descriptor is. (This is also true for RandomAccessFile and FileChannel).
>> 
>> An agent could instrument the respective constructors to store the path. The problem is where to store it. To add a field, the instrumentation agent needs to change the schema of the class. This is not possible at runtime but can be done at class-loading time. However for a j.l.instrument agent these classes are already defined when the agent is first called. For a native JVMTI agent the problem becomes parsing and modifying byte codes in a native agent which is error prone and requires a lot of code to maintain.
>> 
>> If instead the stream classes were modified to store a reference to the path, it would be readily available for agents at a minimum of cost to the libraries. This is what this patch does. FileInputStream, FileOutputStream, RandomAccessFile and FileChannelImpl are modified to record the path they operate on in a private field. There are no accessors added to retrieve the path - it is purely stored for instrumentation purposes. The path is intentionally not resolved to be an absolute path since that would potentially add unwanted overhead. If a stream is created from a file descriptor, no path will be stored.
>> 
>> The overhead for this path will be keeping the path String alive for a longer period of time. I hope this will not cause any problems.
>> 
>> A consumer of this feature will be Java Flight Recorder, but the implementation is usable by other agents as well.
>> 
>> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8033917
>> 
> I have reservations about doing this, as it hints at code making use of private fields, which isn't good.
> 
> For the comments in FileInputStream and the other classes, it might be best to keep the line lengths consistent with the existing code if you can (it makes future side-by-side reviews a bit easier too).

I've updated the comments to have shorter lines and javadoc style comments.

> In WindowsChannelFactory you've re-ordered and expanded the imports. The ordering of the import groups in this area has been Java SE, JDK-specific and then finally JDK-internal. It's not a big deal of course. Personally I prefer the original static imports but I know some people don't.

I've reverted to the original order, and only expanded the non-static imports. (The danger of IDEs).

Updated webrev here:  http://cr.openjdk.java.net/~sla/8033917/webrev.01/

Thanks,
/Staffan


From Alan.Bateman at oracle.com  Fri Feb  7 05:22:15 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 07 Feb 2014 13:22:15 +0000
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <BF9892F1-106D-4C99-A3C3-AFF2F251F271@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
	<52F4D149.8000900@oracle.com>
	<BF9892F1-106D-4C99-A3C3-AFF2F251F271@oracle.com>
Message-ID: <52F4DE07.7090003@oracle.com>

On 07/02/2014 12:49, Staffan Larsen wrote:
>
> Updated webrev here:
> http://cr.openjdk.java.net/~sla/8033917/webrev.01/
>
Thanks for the updates, looks good.

-Alan.

From staffan.larsen at oracle.com  Fri Feb  7 05:25:05 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 7 Feb 2014 14:25:05 +0100
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <52F4DE07.7090003@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
	<52F4D149.8000900@oracle.com>
	<BF9892F1-106D-4C99-A3C3-AFF2F251F271@oracle.com>
	<52F4DE07.7090003@oracle.com>
Message-ID: <C56BC0A3-3AB7-4C18-B954-A6E239D4DCC6@oracle.com>

Thanks Alan!

On 7 feb 2014, at 14:22, Alan Bateman <Alan.Bateman at oracle.com> wrote:

> On 07/02/2014 12:49, Staffan Larsen wrote:
>> 
>> Updated webrev here: http://cr.openjdk.java.net/~sla/8033917/webrev.01/
>> 
> Thanks for the updates, looks good.
> 
> -Alan.


From staffan.larsen at oracle.com  Fri Feb  7 05:25:43 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 7 Feb 2014 14:25:43 +0100
Subject: RFR(S): JDK-8033911 Simplify instrumentation of FileInputStream
	and	RandomAccessFile
In-Reply-To: <3d89f5ba-3c6e-4571-b7bf-19cbad4e5b5c@email.android.com>
References: <2BA69765-5405-413A-BAC6-EDE77C176FFA@oracle.com>
	<3d89f5ba-3c6e-4571-b7bf-19cbad4e5b5c@email.android.com>
Message-ID: <C72FDA1A-EC57-490A-8989-25BEDE64C654@oracle.com>

Alan, Jaroslav, Dmitry: Thanks!

On 7 feb 2014, at 12:38, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:

> Looks good. Additional benefit is the compliance with the secure coding guide.
> 
> -JB-
> 
> On February 7, 2014 11:46:07 AM CET, Staffan Larsen <staffan.larsen at oracle.com> wrote:
> A few of the public read and write methods in FileInputStream and RandomAccessFile are declared native. This means that it is hard to instrument them using byte code instrumentation. Changing the public methods to be non-native and instead calling private native methods simplifies instrumentation.
> 
> webrev: http://cr.openjdk.java.net/~sla/8033911/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8033911
> 
> Thanks,
> /Staffan
> 
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
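
The change described in the quoted request can be pictured roughly as
follows. This is a minimal sketch and not the JDK source: the class name
InstrumentableReader, the method read0 and the readCalls counter are all
hypothetical, and read0 is an ordinary Java method standing in for the
private native method so that the example runs without a native library.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a stream class; not the actual FileInputStream code.
class InstrumentableReader {
    private final byte[] data;
    private int pos = 0;

    // Counter of the kind a bytecode-instrumentation agent might inject.
    static final AtomicInteger readCalls = new AtomicInteger();

    InstrumentableReader(byte[] data) {
        this.data = data.clone();
    }

    // The public method is plain Java, so it has a body an agent can rewrite.
    public int read() {
        readCalls.incrementAndGet();   // illustrative "injected" code
        return read0();
    }

    // In the real change this would be declared `private native`;
    // here it is plain Java so the sketch is runnable (assumption).
    private int read0() {
        return pos < data.length ? (data[pos++] & 0xff) : -1;
    }
}

class InstrumentableReaderDemo {
    public static void main(String[] args) {
        InstrumentableReader r = new InstrumentableReader(new byte[] {1, 2});
        System.out.println(r.read());   // first byte
        System.out.println(r.read());   // second byte
        System.out.println(r.read());   // -1 at end of data
        System.out.println(InstrumentableReader.readCalls.get());
    }
}
```

The point of the split is only that a native method has no bytecode body to
modify, while a thin Java wrapper does.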


From dmitry.samersoff at oracle.com  Fri Feb  7 07:24:00 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 07 Feb 2014 19:24:00 +0400
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
Message-ID: <52F4FA90.2080908@oracle.com>

Staffan,

FileInputStream.java

55:  It's better to initialize path with null.
134: It's better to assign the name in one of the first lines; in that case
we will be able to retrieve the file name even if open fails for some reason.
171: It's not necessary

(the same is applicable to other files)

I'm a bit scared of changing the signatures of public methods of
FileChannelImpl, but if Alan says it's OK, let's go with it.

-Dmitry


On 2014-02-07 16:07, Staffan Larsen wrote:
> Instrumentation agents that want to instrument
> FileInputStream/FileOutputStream to see which files are being
> accessed do not currently have access to the file system path of the
> stream. This is because the path is never stored in the stream class,
> only the file descriptor is. (This is also true for RandomAccessFile
> and FileChannel).
> 
> An agent could instrument the respective constructors to store the
> path. The problem is where to store it. To add a field, the
> instrumentation agent needs to change the schema of the class. This
> is not possible at runtime but can be done at class-loading time.
> However for a j.l.instrument agent these classes are already defined
> when the agent is first called. For a native JVMTI agent the problem
> becomes parsing and modifying byte codes in a native agent which is
> error prone and requires a lot of code to maintain.
> 
> If instead the stream classes were modified to store a reference to
> the path, it would be readily available for agents at a minimum of
> cost to the libraries. This is what this patch does. FileInputStream,
> FileOutputStream, RandomAccessFile and FileChannelImpl are modified
> to record the path they operate on in a private field. There are no
> accessors added to retrieve the path - it is purely stored for
> instrumentation purposes. The path is intentionally not resolved to
> be an absolute path since that would potentially add unwanted
> overhead. If a stream is created from a file descriptor, no path will
> be stored.
> 
> The overhead for this path will be keeping the path String alive for
> a longer period of time. I hope this will not cause any problems.
> 
> A consumer of this feature will be Java Flight Recorder, but the
> implementation is usable by other agents as well.
> 
> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug:
> https://bugs.openjdk.java.net/browse/JDK-8033917
> 
> Thanks, /Staffan
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From Alan.Bateman at oracle.com  Fri Feb  7 07:27:36 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 07 Feb 2014 15:27:36 +0000
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <52F4FA90.2080908@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
	<52F4FA90.2080908@oracle.com>
Message-ID: <52F4FB68.7040805@oracle.com>

On 07/02/2014 15:24, Dmitry Samersoff wrote:
> :
>
> I'm a bit scared of changing the signatures of public methods of
> FileChannelImpl, but if Alan says it's OK, let's go with it.
>
This is sun.nio.ch and so JDK-internal, nothing outside of the JDK 
should be using it directly.

-Alan

From chris.hegarty at oracle.com  Fri Feb  7 07:32:03 2014
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Fri, 07 Feb 2014 15:32:03 +0000
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <52F4FA90.2080908@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
	<52F4FA90.2080908@oracle.com>
Message-ID: <52F4FC73.5080803@oracle.com>

On 07/02/14 15:24, Dmitry Samersoff wrote:
> Staffan,
>
> FileInputStream.java
>
> 55:  It's better to initialize path with null.

I'm afraid I disagree with this. The default value is already null, why 
set it to null again? I see this pattern all over the code, but it seems 
completely redundant to me.

-Chris.
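
For readers following the disagreement, a small sketch of the language rule
in question: an instance field with no initializer already holds null. The
class and field names below are hypothetical and only illustrate the point.

```java
// Hypothetical class used only to illustrate default field values.
class DefaultFieldDemo {
    private String implicitPath;          // no initializer: defaults to null
    private String explicitPath = null;   // same value, stated explicitly

    String implicitPath() { return implicitPath; }
    String explicitPath() { return explicitPath; }

    public static void main(String[] args) {
        DefaultFieldDemo d = new DefaultFieldDemo();
        System.out.println(d.implicitPath() == null);   // prints true
        System.out.println(d.explicitPath() == null);   // prints true
    }
}
```

Both forms behave the same at run time, so the choice between them is a
matter of style and tooling preference.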

> 134: It's better to assign the name in one of the first lines; in that case
> we will be able to retrieve the file name even if open fails for some reason.
> 171: It's not necessary
>
> (the same is applicable to other files)
>
> I'm a bit scared of changing the signatures of public methods of
> FileChannelImpl, but if Alan says it's OK, let's go with it.
>
> -Dmitry
>
>
> On 2014-02-07 16:07, Staffan Larsen wrote:
>> Instrumentation agents that want to instrument
>> FileInputStream/FileOutputStream to see which files are being
>> accessed do not currently have access to the file system path of the
>> stream. This is because the path is never stored in the stream class,
>> only the file descriptor is. (This is also true for RandomAccessFile
>> and FileChannel).
>>
>> An agent could instrument the respective constructors to store the
>> path. The problem is where to store it. To add a field, the
>> instrumentation agent needs to change the schema of the class. This
>> is not possible at runtime but can be done at class-loading time.
>> However for a j.l.instrument agent these classes are already defined
>> when the agent is first called. For a native JVMTI agent the problem
>> becomes parsing and modifying byte codes in a native agent which is
>> error prone and requires a lot of code to maintain.
>>
>> If instead the stream classes were modified to store a reference to
>> the path, it would be readily available for agents at a minimum of
>> cost to the libraries. This is what this patch does. FileInputStream,
>> FileOutputStream, RandomAccessFile and FileChannelImpl are modified
>> to record the path they operate on in a private field. There are no
>> accessors added to retrieve the path - it is purely stored for
>> instrumentation purposes. The path is intentionally not resolved to
>> be an absolute path since that would potentially add unwanted
>> overhead. If a stream is created from a file descriptor, no path will
>> be stored.
>>
>> The overhead for this path will be keeping the path String alive for
>> a longer period of time. I hope this will not cause any problems.
>>
>> A consumer of this feature will be Java Flight Recorder, but the
>> implementation is usable by other agents as well.
>>
>> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug:
>> https://bugs.openjdk.java.net/browse/JDK-8033917
>>
>> Thanks, /Staffan
>>
>
>

From staffan.larsen at oracle.com  Fri Feb  7 07:49:25 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 7 Feb 2014 16:49:25 +0100
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <52F4FA90.2080908@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
	<52F4FA90.2080908@oracle.com>
Message-ID: <3F1E5072-1745-4EA4-B783-93CAD2643B91@oracle.com>


On 7 feb 2014, at 16:24, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:

> Staffan,
> 
> FileInputStream.java
> 
> 55:  It's better to initialize path with null.

I agree with Chris here. The value should be explicitly initialized by all constructors.

> 134: It's better to assign the name in one of the first lines; in that case
> we will be able to retrieve the file name even if open fails for some reason.

This is the constructor. If anything fails it will throw an exception, and there won't be an object to look at.

> 171: It's not necessary

All constructors must initialize the value. 

Thanks,
/Staffan
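
A minimal sketch of the two points above, assuming the field is declared
final (the thread does not show the declaration). PathRecordingStream and
recordedPath are hypothetical names, and the accessor exists only for the
demo; the patch as described adds no accessors.

```java
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileNotFoundException;

// Hypothetical class; the names are illustrative, not the JDK source.
class PathRecordingStream {
    // A blank final field: the compiler rejects any constructor
    // that can complete normally without assigning it.
    private final String path;

    PathRecordingStream(File file) throws FileNotFoundException {
        if (!file.exists()) {
            // Throwing here means the caller never receives a reference.
            throw new FileNotFoundException(file.getPath());
        }
        this.path = file.getPath();   // stored as given, not made absolute
    }

    PathRecordingStream(FileDescriptor fd) {
        this.path = null;             // no path is known for a descriptor
    }

    String recordedPath() { return path; }   // accessor for the demo only
}

class PathRecordingStreamDemo {
    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("demo", ".txt");
        tmp.deleteOnExit();
        System.out.println(new PathRecordingStream(tmp).recordedPath());
        System.out.println(new PathRecordingStream(FileDescriptor.in).recordedPath());

        File missing = File.createTempFile("gone", ".txt");
        missing.delete();
        PathRecordingStream s = null;
        try {
            s = new PathRecordingStream(missing);
        } catch (FileNotFoundException e) {
            System.out.println(s == null);   // no object to inspect
        }
    }
}
```

With a blank final field, dropping the assignment from either constructor
is a compile-time error, which is one reading of "all constructors must
initialize the value".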

> 
> (the same is applicable to other files)
> 
> I'm a bit scared of changing the signatures of public methods of
> FileChannelImpl, but if Alan says it's OK, let's go with it.
> 
> -Dmitry
> 
> 
> On 2014-02-07 16:07, Staffan Larsen wrote:
>> Instrumentation agents that want to instrument
>> FileInputStream/FileOutputStream to see which files are being
>> accessed do not currently have access to the file system path of the
>> stream. This is because the path is never stored in the stream class,
>> only the file descriptor is. (This is also true for RandomAccessFile
>> and FileChannel).
>> 
>> An agent could instrument the respective constructors to store the
>> path. The problem is where to store it. To add a field, the
>> instrumentation agent needs to change the schema of the class. This
>> is not possible at runtime but can be done at class-loading time.
>> However for a j.l.instrument agent these classes are already defined
>> when the agent is first called. For a native JVMTI agent the problem
>> becomes parsing and modifying byte codes in a native agent which is
>> error prone and requires a lot of code to maintain.
>> 
>> If instead the stream classes were modified to store a reference to
>> the path, it would be readily available for agents at a minimum of
>> cost to the libraries. This is what this patch does. FileInputStream,
>> FileOutputStream, RandomAccessFile and FileChannelImpl are modified
>> to record the path they operate on in a private field. There are no
>> accessors added to retrieve the path - it is purely stored for
>> instrumentation purposes. The path is intentionally not resolved to
>> be an absolute path since that would potentially add unwanted
>> overhead. If a stream is created from a file descriptor, no path will
>> be stored.
>> 
>> The overhead for this path will be keeping the path String alive for
>> a longer period of time. I hope this will not cause any problems.
>> 
>> A consumer of this feature will be Java Flight Recorder, but the
>> implementation is usable by other agents as well.
>> 
>> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug:
>> https://bugs.openjdk.java.net/browse/JDK-8033917
>> 
>> Thanks, /Staffan
>> 
> 
> 
> -- 
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the sources.


From dmitry.samersoff at oracle.com  Fri Feb  7 08:18:07 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 07 Feb 2014 20:18:07 +0400
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <52F4FC73.5080803@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
	<52F4FA90.2080908@oracle.com> <52F4FC73.5080803@oracle.com>
Message-ID: <52F5073F.3090904@oracle.com>

On 2014-02-07 19:32, Chris Hegarty wrote:
> On 07/02/14 15:24, Dmitry Samersoff wrote:
>> Staffan,
>>
>> FileInputStream.java
>>
>> 55:  It's better to initialize path with null.
> 
> I'm afraid I disagree with this. The default value is already null, why
> set it to null again? I see this pattern all over the code, but it seems
> completely redundant to me.

Yes, it's a no-op, but it makes readers and a variety of "security" tools happy.

I will not press for it, but since the rest of the code

(e.g. private FileChannel channel = null; )

uses this pattern and initializes variables explicitly, I think it's good
to initialize this variable as well.

-Dmitry


> 
> -Chris.
> 
>> 134: It's better to assign the name in one of the first lines; in that case
>> we will be able to retrieve the file name even if open fails for some reason.
>> 171: It's not necessary
>>
>> (the same is applicable to other files)
>>
>> I'm a bit scared of changing the signatures of public methods of
>> FileChannelImpl, but if Alan says it's OK, let's go with it.
>>
>> -Dmitry
>>
>>
>> On 2014-02-07 16:07, Staffan Larsen wrote:
>>> Instrumentation agents that want to instrument
>>> FileInputStream/FileOutputStream to see which files are being
>>> accessed do not currently have access to the file system path of the
>>> stream. This is because the path is never stored in the stream class,
>>> only the file descriptor is. (This is also true for RandomAccessFile
>>> and FileChannel).
>>>
>>> An agent could instrument the respective constructors to store the
>>> path. The problem is where to store it. To add a field, the
>>> instrumentation agent needs to change the schema of the class. This
>>> is not possible at runtime but can be done at class-loading time.
>>> However for a j.l.instrument agent these classes are already defined
>>> when the agent is first called. For a native JVMTI agent the problem
>>> becomes parsing and modifying byte codes in a native agent which is
>>> error prone and requires a lot of code to maintain.
>>>
>>> If instead the stream classes were modified to store a reference to
>>> the path, it would be readily available for agents at a minimum of
>>> cost to the libraries. This is what this patch does. FileInputStream,
>>> FileOutputStream, RandomAccessFile and FileChannelImpl are modified
>>> to record the path they operate on in a private field. There are no
>>> accessors added to retrieve the path - it is purely stored for
>>> instrumentation purposes. The path is intentionally not resolved to
>>> be an absolute path since that would potentially add unwanted
>>> overhead. If a stream is created from a file descriptor, no path will
>>> be stored.
>>>
>>> The overhead for this path will be keeping the path String alive for
>>> a longer period of time. I hope this will not cause any problems.
>>>
>>> A consumer of this feature will be Java Flight Recorder, but the
>>> implementation is usable by other agents as well.
>>>
>>> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8033917
>>>
>>> Thanks, /Staffan
>>>
>>
>>


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From dmitry.samersoff at oracle.com  Fri Feb  7 08:19:11 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 07 Feb 2014 20:19:11 +0400
Subject: RFR: JDK-8033917 Keep track of file paths in file streams and
	channels for instrumentation purposes
In-Reply-To: <3F1E5072-1745-4EA4-B783-93CAD2643B91@oracle.com>
References: <25552F6F-0FFC-42EA-9B0B-49EC135A5EE6@oracle.com>
	<52F4FA90.2080908@oracle.com>
	<3F1E5072-1745-4EA4-B783-93CAD2643B91@oracle.com>
Message-ID: <52F5077F.6000500@oracle.com>

Staffan,

OK!

-Dmitry

On 2014-02-07 19:49, Staffan Larsen wrote:
> 
> On 7 feb 2014, at 16:24, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:
> 
>> Staffan,
>>
>> FileInputStream.java
>>
>> 55:  It's better to initialize path with null.
> 
> I agree with Chris here. The value should be explicitly initialized by all constructors.
> 
>> 134: It's better to assign the name in one of the first lines; in that case
>> we will be able to retrieve the file name even if open fails for some reason.
> 
> This is the constructor. If anything fails it will throw an exception, and there won't be an object to look at.
> 
>> 171: It's not necessary
> 
> All constructors must initialize the value. 
> 
> Thanks,
> /Staffan
> 
>>
>> (the same is applicable to other files)
>>
>> I'm a bit scared of changing the signatures of public methods of
>> FileChannelImpl, but if Alan says it's OK, let's go with it.
>>
>> -Dmitry
>>
>>
>> On 2014-02-07 16:07, Staffan Larsen wrote:
>>> Instrumentation agents that want to instrument
>>> FileInputStream/FileOutputStream to see which files are being
>>> accessed do not currently have access to the file system path of the
>>> stream. This is because the path is never stored in the stream class,
>>> only the file descriptor is. (This is also true for RandomAccessFile
>>> and FileChannel).
>>>
>>> An agent could instrument the respective constructors to store the
>>> path. The problem is where to store it. To add a field, the
>>> instrumentation agent needs to change the schema of the class. This
>>> is not possible at runtime but can be done at class-loading time.
>>> However for a j.l.instrument agent these classes are already defined
>>> when the agent is first called. For a native JVMTI agent the problem
>>> becomes parsing and modifying byte codes in a native agent which is
>>> error prone and requires a lot of code to maintain.
>>>
>>> If instead the stream classes were modified to store a reference to
>>> the path, it would be readily available for agents at a minimum of
>>> cost to the libraries. This is what this patch does. FileInputStream,
>>> FileOutputStream, RandomAccessFile and FileChannelImpl are modified
>>> to record the path they operate on in a private field. There are no
>>> accessors added to retrieve the path - it is purely stored for
>>> instrumentation purposes. The path is intentionally not resolved to
>>> be an absolute path since that would potentially add unwanted
>>> overhead. If a stream is created from a file descriptor, no path will
>>> be stored.
>>>
>>> The overhead for this path will be keeping the path String alive for
>>> a longer period of time. I hope this will not cause any problems.
>>>
>>> A consumer of this feature will be Java Flight Recorder, but the
>>> implementation is usable by other agents as well.
>>>
>>> webrev: http://cr.openjdk.java.net/~sla/8033917/webrev.00/ bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8033917
>>>
>>> Thanks, /Staffan
>>>
>>
>>
>> -- 
>> Dmitry Samersoff
>> Oracle Java development team, Saint Petersburg, Russia
>> * I would love to change the world, but they won't give me the sources.
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From dean.long at oracle.com  Fri Feb  7 13:06:14 2014
From: dean.long at oracle.com (Dean Long)
Date: Fri, 07 Feb 2014 13:06:14 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <B6B09E1E-E531-4C0F-BCC0-FF45032A30C0@oracle.com>
References: <52F01569.3070201@oracle.com>
	<52F41821.10403@oracle.com>	<52F41B7D.6090206@oracle.com>
	<B6B09E1E-E531-4C0F-BCC0-FF45032A30C0@oracle.com>
Message-ID: <52F54AC6.8020804@oracle.com>

What's the cost for adding a new BufferBlob subtype?  We already have 
AdapterBlob and MethodHandlesAdapterBlob.

dl
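
To make the trade-off concrete, here is a rough Java analogue of checking a
blob's kind by name versus by subtype. It is a sketch only: Blob,
VtableChunkBlob and looksLikeVtableByName are hypothetical names and do not
correspond to HotSpot's C++ classes; the two strings are the current name
and the possible rename mentioned in the quoted discussion.

```java
// Hypothetical Java analogue of a blob hierarchy, for illustration only.
class Blob {
    private final String name;
    Blob(String name) { this.name = name; }
    String name() { return name; }
    boolean isVtableBlob() { return false; }
}

// A dedicated subtype answers the type query itself.
class VtableChunkBlob extends Blob {
    VtableChunkBlob(String name) { super(name); }
    @Override boolean isVtableBlob() { return true; }
}

class BlobKindDemo {
    // Name-based test: depends on the exact string used at creation.
    static boolean looksLikeVtableByName(Blob b) {
        return "vtable chunks".equals(b.name());
    }

    public static void main(String[] args) {
        // Someone renames the blob, as suggested might happen.
        Blob renamed = new VtableChunkBlob("vtable or itable buffers");
        System.out.println(looksLikeVtableByName(renamed));   // prints false
        System.out.println(renamed.isVtableBlob());           // prints true
    }
}
```

The name check silently stops matching after a rename, while the subtype
check keeps working; the cost is one more class in the hierarchy.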

On 2/6/2014 3:52 PM, Oleg Mazurov wrote:
> My understanding was that a buffer blob was just that - a buffer. Could potentially contain code fragments of different kinds.
> Thus, is_buffer_blob() was the closest type available. Agree that a dependency on its name is not reliable, though testing
> will reveal if the condition turns false for "vtable chunks" due to a name change (I had to deal with that particular test, Serguei
> should be able to identify it). Adding a comment to where the name is defined (vtableStubs.cpp) that such a dependency exists
> is a good idea.
> Thanks,
>
>      -- Oleg
>
> On Feb 6, 2014, at 3:32 PM, Coleen Phillimore wrote:
>
>> Hi, I clicked on this a couple times.  It seems okay, but isn't there a safer way to identify code blobs that are vtable stubs, without looking at the name (which can change while creating it)?  A comment at least when you create "vtable chunks" would be good.   It seems that someone might want to rename it "vtable or itable buffers", or something like that.
>>
>> thanks,
>> Coleen
>>
>> On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote:
>>> Runtime team,
>>>
>>> This fix was reviewed by Vladimir K. and me.
>>> Just wanted to make sure if you would like to review it as well.
>>> If not, then I will push it.
>>>
>>> Thanks,
>>> Serguei
>>>
>>> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for:
>>>>    https://bugs.openjdk.java.net/browse/JDK-8025841
>>>>
>>>>
>>>> Open webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1
>>>>
>>>> Summary:
>>>>
>>>>    The fix contributed by Oleg Mazurov to improve profiling data quality.
>>>>    It moves the "vtable stub" dynamic code notification to the right place.
>>>>    I've already reviewed the fix, and it looks good to me.
>>>>
>>>>    Bug report description:
>>>>
>>>>     "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets scheduled when
>>>>      a new chunk of memory for subsequent vtable and itable stubs is allocated.
>>>>      That chunk is uninitialized (contains zeros or garbage) although due to the fact
>>>>      that the actual event delivery is deferred, at least one vtable comes out right.
>>>>
>>>>      This event should describe an individual vtable/itable stub (base address and size)
>>>>      and only after it's been created (memory is actually populated with code).
>>>>      Where VM diagnostic messages about vtable/itable stubs are issued upon
>>>>      -XX:+PrintAdapterHandlers appears exactly the right place for JVMTI events as well.
>>>>
>>>>      Getting vtables/itables right is important in the context of performance analysis as
>>>>      that dynamically generated code may accumulate quite noticeable CPU time
>>>>      (especially itables), sometimes larger than the actual Java methods called."
>>>>
>>>>
>>>> Testing:
>>>>    Oleg tested it in the Oracle Studio Performance Analyzer environment.
>>>>    nsk.jvmti, nsk.jdi, nsk.jdwp,
>>>>    In progress: Jtreg com/sun/jdi, java/lang/instrument
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>


From karen.kinnear at oracle.com  Fri Feb  7 13:56:33 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Fri, 7 Feb 2014 16:56:33 -0500
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52ED3F28.5070806@oracle.com>
References: <52ED3F28.5070806@oracle.com>
Message-ID: <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>

Dan,

Greatly impressed. Very subtle. Thanks to you and David and Dice and Serguei
for a lot of blood, sweat and tears.

Apologies that I haven't been following all the details, so a couple of questions.

1. I like the new fix. I totally appreciate the massive documentation in the bug, it really helps.
And the comments in the code.

2. In removing the earlier fix, does that work because
   - there is no posting of monitor_contended_exit (or any posting) during
    the exit logic? Or did I just miss finding it?

3. Did I read the code correctly that the Thread::SpinAcquire can make a timed park
call on the same thread's _ParkEvent? And that this is used to get on and off the wait queue,
i.e. to acquire the WaitSetLock?
   Is there the same risk that a notify might be eaten here also?

If so, I wonder if we want this added unpark to not just be called if JVMTI_EVENT_MONITOR_WAITED
is enabled?

thanks,
Karen
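
The "eaten unpark" in the quoted summary below can be modeled in a few
lines. This is a single-threaded sketch under stated assumptions, not
HotSpot code: PermitModel is a hypothetical one-permit event (unpark sets a
permit, park consumes it), and waiterWakes replays the order of operations
the summary describes, with and without the unpark being re-issued after
the event handler.

```java
// Hypothetical single-threaded model of a one-permit park/unpark event.
// It is not HotSpot's ParkEvent; it only mimics "unpark sets a permit,
// park consumes it".
class PermitModel {
    private boolean permit;

    void unpark() { permit = true; }

    // Returns true if a permit was available (park would return at once),
    // false if the caller would block.
    boolean parkWouldReturn() {
        boolean had = permit;
        permit = false;
        return had;
    }
}

class SharedEventDemo {
    // Replays the waiter's path: the exiting thread's unpark(), then the
    // event handler's raw-monitor wait on the SAME event, then the waiter's
    // own park. `reissue` models re-doing the unpark after the handler.
    static boolean waiterWakes(boolean reissue) {
        PermitModel event = new PermitModel();
        boolean notified = true;        // waiter was notified
        boolean isSuccessor = true;     // waiter was named successor

        event.unpark();                 // wakeup meant for the Java monitor layer
        event.parkWouldReturn();        // handler's timed wait eats the permit

        if (reissue && notified && isSuccessor) {
            event.unpark();             // the relocated fix, modeled
        }
        return event.parkWouldReturn(); // waiter's park in the monitor layer
    }

    public static void main(String[] args) {
        System.out.println(waiterWakes(false));   // prints false: waiter would block
        System.out.println(waiterWakes(true));    // prints true: waiter proceeds
    }
}
```

The model ignores timing and the rest of the successor protocol; it only
shows why the re-issued unpark has to come after the handler runs.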

On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote:

> Greetings,
> 
> I have a fix ready for the following bug:
> 
>    8028073 race condition in ObjectMonitor implementation causing deadlocks
>    https://bugs.openjdk.java.net/browse/JDK-8028073
> 
> On the surface, this is a very simple fix that relocates a few lines of
> code, relocates and rewrites the comments associated with that code and
> adds several new comments.
> 
> Of course, in reality, the issue is much more complicated, but I'm
> hoping to make it easy for anyone not acquainted with this issue to
> understand what's going on.
> 
> Here are the JDK9 webrev URLs:
> 
> OpenJDK:
> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
> 
> Oracle internal:
> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
> 
> The simple summary:
> 
> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>  it is possible for a JVM/TI monitor event handler to accidentally
>  consume a ParkEvent.unpark() call meant for Java Monitor layer
> - the original code fix was made on 2005.07.04 using this bug ID:
>  https://bugs.openjdk.java.net/browse/JDK-5030359
> - it's the right fix, but it's in the wrong place
> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>  event handler is called because it is that event handler
>  that can cause the hang
> 
> 
> Testing
> -------
> 
> - a new StessMonitorWait test has been created that reliably
>  reproduces the hang in JDK[6789]; see the bug's gory details
>  for the specific versions where the hang has been reproduced
>  - the test reliably reproduces the hang in 5 seconds on my
>    T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>    the hang reliably on other machines
>  - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>    Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>    bits did not reproduce the hang
> - JPRT test job
> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>  Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>  and Windows-X64:
>  - vm.quick
>  - Kitchensink (bigapps)
>  - Weblogic+medrec (bigapps)
>  - runThese (bigapps)
> 
> 
> The Gory Details Start Here
> ---------------------------
> 
> This is the old location of block of code that's being moved:
> 
> src/share/vm/runtime/objectMonitor.cpp:
> 
> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> <snip>
> 1499    exit (true, Self) ;                    // exit the monitor
> <snip>
> 1513    if (node._notified != 0 && _succ == Self) {
> 1514       node._event->unpark();
> 1515    }
> 
> 
> This is the new location of block of code that's being moved:
> 
> src/share/vm/runtime/objectMonitor.cpp:
> 
> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> <snip>
> 1601      if (JvmtiExport::should_post_monitor_waited()) {
> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
> <snip>
> 1604        if (node._notified != 0 && _succ == Self) {
> <snip>
> 1620          node._event->unpark();
> 1621        }
> 
> 
> The Risks
> ---------
> 
> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>  is enabled:
>  - previously it was always executed
>  - while the old code was not effective for the hang that is being
>    fixed with this bug, it is possible that the old code prevented
>    a different bug in the successor protocol from manifesting
>  - thorough analysis of the successor protocol did not reveal a
>    case where the old code was needed in the old location
> - Thorough analysis indicates that the other JVM/TI monitor events
>  do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>  - the successor protocol is complicated and the analysis could
>    be wrong when certain options are used
>  - comments were added to each location where a JVM/TI monitor
>    event handler is called documenting why a fix like this one
>    is not needed there
>  - if the analysis is wrong, the new comments show where a new
>    code change would be needed
> 
> 
> The Scenario
> ------------
> 
> I've created a scenario that reproduces this hang:
> 
> T1 - enters monitor and calls monitor.wait()
> T2 - enters the monitor, calls monitor.notify() and exits the monitor
> T3 - enters and exits the monitor
> T4 - enters the monitor, delays for 5 seconds, exits the monitor
> 
> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
> 
> Here are the six events necessary to make this hang happen:
> 
> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
> 
> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
> 
> // KEY-EVENT-3a: T3 made T1 the successor
> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
> 
> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
> 
> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
> 
> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
> 
> 
> This bug is intertwined with:
> 
> - The ObjectMonitor successor protocol
> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
> 
> There is a very long successor.notes attachment to JDK-8028073 that
> attempts to describe the ObjectMonitor successor protocol. It's good
> for putting pretty much anyone to sleep.
> 
> Since this hang reproduces back to JDK6, this bug is taking the easily
> backported solution of moving the original fix to the right location.
> The following new bug has been filed for possible future work in this
> area by the Serviceability Team:
> 
>    8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>    https://bugs.openjdk.java.net/browse/JDK-8033399
> 
> 
> The Symptoms
> ------------
> 
> With intermittent hangs like this, it is useful to know what to look
> for in order to determine if you are running into this issue:
> 
> - if you aren't using a debugger or a profiler or some other
>  JVM/TI agent, then this hang is not the same as yours
> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>  event handler, then this hang is not the same as yours
> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>  JVM/TI RawMonitors, then this hang is not the same as yours
> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>  back into Java code, then you might just be insane and this
>  hang might be similar to yours. However, using a Java callback
>  in an event handler is an even bigger problem/risk so fix that
>  first.
> - if you have one or more threads blocked like this and making no
>  progress, then this hang might be the same as yours:
> 
> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>   java.lang.Thread.State: BLOCKED (on object monitor)
>   JavaThread state: _thread_blocked
> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>   JavaThread state: _thread_blocked
>        at java.lang.Object.wait(Native Method)
>        - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>        at java.lang.Object.wait(Object.java:502)
>        at SMW_WorkerThread.run(StressMonitorWait.java:103)
>        - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> 
> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>   java.lang.Thread.State: BLOCKED (on object monitor)
>   JavaThread state: _thread_blocked
> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>   JavaThread state: _thread_blocked
>        at SMW_WorkerThread.run(StressMonitorWait.java:120)
>        - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> 
> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>   java.lang.Thread.State: BLOCKED (on object monitor)
>   JavaThread state: _thread_blocked
> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>   JavaThread state: _thread_blocked
>        at SMW_WorkerThread.run(StressMonitorWait.java:139)
>        - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> 
> Key symptoms in thread T1:
> 
> - had the object locked:
> 
>  locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> 
> - did an Object.wait():
> 
>  waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> 
> - is blocked on reentry:
> 
>  waiting for monitor entry [0xfffffd7fc0231000]
> 
> Key symptoms in thread T2:
> 
> - is blocked waiting to lock the object:
> 
>  waiting for monitor entry [0xfffffd7fc0130000]
>  waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> 
> Key symptoms in thread T3:
> 
> - is blocked waiting to lock the object:
> 
>  waiting for monitor entry [0xfffffd7fc002f000]
>  waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> 


From erik.gahlin at oracle.com  Fri Feb  7 14:29:39 2014
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Fri, 07 Feb 2014 23:29:39 +0100
Subject: RFR(S): 	JDK-8032250 : Add trace event for VM flag changes
In-Reply-To: <4213C88C-3617-466C-909B-F8DAA37C8015@oracle.com>
References: <4213C88C-3617-466C-909B-F8DAA37C8015@oracle.com>
Message-ID: <52F55E53.3010203@oracle.com>

> 128 <value type="LONG" field="old_value" label="Old Value" />
> 129 <value type="LONG" field="new_value" label="New Value" />

Field identifiers should use the Java naming convention: oldValue and newValue.

Erik

Staffan Larsen skrev 2014-01-21 14:47:
> This is a patch to capture a trace event whenever a manageable VM flag is changed in runtime.
>
> webrev: http://cr.openjdk.java.net/~sla/8032250/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8032250
>
> Thanks,
> /Staffan


From daniel.daugherty at oracle.com  Fri Feb  7 15:45:48 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 07 Feb 2014 16:45:48 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>
References: <52ED3F28.5070806@oracle.com>
	<6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>
Message-ID: <52F5702C.50508@oracle.com>

On 2/7/14 2:56 PM, Karen Kinnear wrote:
> Dan,
>
> Greatly impressed.

Thanks! And thanks for the review!


> Very subtle.

Since you've worked with this code, that shouldn't be a
surprise... :-) We're down to the subtle bugs... :-(


> Thanks  to you and David and Dice and Serguei
> for a lot of blood, sweat and tears.

You're very welcome.


> Apologies that I haven't been following all the details, so a couple of questions.

No apologies needed. You have other fish to fry...


> 1. I like the new fix. I totally appreciate the massive documentation in the bug, it really helps.
> And the comments in the code.

Glad you like the fix and the docs.


> 2. In removing the earlier fix, does that work because
>     - there is no posting of monitor_contended_exit (or any posting) during
>      the exit logic? Or did I just miss finding it?

Dave Dice tracked the earlier fix down to the JVM/PI MONITOR_CONTENDED_EXIT
event. There's no equivalent event in JVM/TI so the old fix is no longer
needed.


> 3. Did I read the code correctly that the Thread::SpinAcquire can make a timed park
> call on the same thread's _ParkEvent? And that this is used to get on and off the wait queue,
> i.e. to acquire the WaitSetLock?
>     Is there the same risk that a notify might be eaten here also?

As far as I can see, Thread::SpinAcquire() does not use a ParkEvent
at all. However, Thread::muxAcquire() does use a ParkEvent, but it
is a different ParkEvent. From src/share/vm/runtime/thread.hpp:

   ParkEvent * _ParkEvent ;               // for synchronized()
   ParkEvent * _SleepEvent ;              // for Thread.sleep
   ParkEvent * _MutexEvent ;              // for native internal Mutex/Monitor
   ParkEvent * _MuxEvent ;                // for low-level muxAcquire-muxRelease

So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire()
uses the _MuxEvent. There are some comments in thread.cpp about
how _MuxEvent could be eliminated and _ParkEvent shared, but I don't
think we ever want to go there.

I also filed this RFE:

     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
     https://bugs.openjdk.java.net/browse/JDK-8033399

just in case the Serviceability team wants to migrate JVM/TI RawMonitors
to a separate ParkEvent.
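
To make the motivation for that RFE concrete, here is a minimal,
single-threaded sketch of the permit semantics involved. ModelParkEvent,
timed_park() and wakeup_survives_handler() are hypothetical names made up
for this illustration; this is not HotSpot's ParkEvent and it ignores real
blocking and timing. It only shows that a timed wait on a shared one-permit
event consumes a wakeup meant for the Java monitor layer, while a separate
event leaves that wakeup intact.

```cpp
#include <cassert>

// Hypothetical, single-threaded model of a one-permit park event.
// This is NOT HotSpot's ParkEvent; it only captures the permit semantics.
class ModelParkEvent {
 public:
  // Store a wakeup; at most one permit is ever held.
  void unpark() { permit_ = true; }

  // Models a timed park: returns true if a pending permit was consumed,
  // false if the call would have waited and then timed out.
  bool timed_park() {
    bool had_permit = permit_;
    permit_ = false;
    return had_permit;
  }

  bool has_permit() const { return permit_; }

 private:
  bool permit_ = false;
};

// Returns true if the wakeup posted for the Java monitor layer is still
// pending after an event handler performs a timed raw monitor wait.
bool wakeup_survives_handler(bool share_event) {
  ModelParkEvent monitor_event;      // used by the Java monitor layer
  ModelParkEvent raw_monitor_event;  // only used when events are separate

  // An exiting thread wakes the successor.
  monitor_event.unpark();
  assert(monitor_event.has_permit());

  // The event handler does a timed wait on a raw monitor before the
  // successor has parked on the Java monitor.
  ModelParkEvent& handler_event =
      share_event ? monitor_event : raw_monitor_event;
  handler_event.timed_park();

  return monitor_event.has_permit();
}
```

With share_event set to true the model loses the wakeup, which corresponds
to the "unpark has been eaten" step in the scenario; with a separate event
the wakeup survives the handler.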

Please let me know if you concur that I've resolved issue #3.


> If so, I wonder if we want this added unpark to not just be called if JVMTI_EVENT_MONITOR_WAITED
> is enabled?

I don't think we need it, but I've noted its removal as a risk.

Again, thanks for the review!

Dan


>
> thanks,
> Karen
>
> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote:
>
>> Greetings,
>>
>> I have a fix ready for the following bug:
>>
>>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>
>> On the surface, this is a very simple fix that relocates a few lines of
>> code, relocates and rewrites the comments associated with that code and
>> adds several new comments.
>>
>> Of course, in reality, the issue is much more complicated, but I'm
>> hoping to make it easy for anyone not acquainted with this issue to
>> understand what's going on.
>>
>> Here are the JDK9 webrev URLs:
>>
>> OpenJDK:
>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>
>> Oracle internal:
>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>
>> The simple summary:
>>
>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>   it is possible for a JVM/TI monitor event handler to accidentally
>>   consume a ParkEvent.unpark() call meant for Java Monitor layer
>> - the original code fix was made on 2005.07.04 using this bug ID:
>>   https://bugs.openjdk.java.net/browse/JDK-5030359
>> - it's the right fix, but it's in the wrong place
>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>   event handler is called because it is that event handler
>>   that can cause the hang
>>
>>
>> Testing
>> -------
>>
>> - a new StressMonitorWait test has been created that reliably
>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>   for the specific versions where the hang has been reproduced
>>   - the test reliably reproduces the hang in 5 seconds on my
>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>     the hang reliably on other machines
>>   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>     bits did not reproduce the hang
>> - JPRT test job
>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>   and Windows-X64:
>>   - vm.quick
>>   - Kitchensink (bigapps)
>>   - Weblogic+medrec (bigapps)
>>   - runThese (bigapps)
>>
>>
>> The Gory Details Start Here
>> ---------------------------
>>
>> This is the old location of block of code that's being moved:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> <snip>
>> 1499    exit (true, Self) ;                    // exit the monitor
>> <snip>
>> 1513    if (node._notified != 0 && _succ == Self) {
>> 1514       node._event->unpark();
>> 1515    }
>>
>>
>> This is the new location of block of code that's being moved:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> <snip>
>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>> <snip>
>> 1604        if (node._notified != 0 && _succ == Self) {
>> <snip>
>> 1620          node._event->unpark();
>> 1621        }
>>
>>
>> The Risks
>> ---------
>>
>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>   is enabled:
>>   - previously it was always executed
>>   - while the old code was not effective for the hang that is being
>>     fixed with this bug, it is possible that the old code prevented
>>     a different bug in the successor protocol from manifesting
>>   - thorough analysis of the successor protocol did not reveal a
>>     case where the old code was needed in the old location
>> - Thorough analysis indicates that the other JVM/TI monitor events
>>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>   - the successor protocol is complicated and the analysis could
>>     be wrong when certain options are used
>>   - comments were added to each location where a JVM/TI monitor
>>     event handler is called documenting why a fix like this one
>>     is not needed there
>>   - if the analysis is wrong, the new comments show where a new
>>     code change would be needed
>>
>>
>> The Scenario
>> ------------
>>
>> I've created a scenario that reproduces this hang:
>>
>> T1 - enters monitor and calls monitor.wait()
>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>> T3 - enters and exits the monitor
>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>
>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>
>> Here are the six events necessary to make this hang happen:
>>
>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>
>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>
>> // KEY-EVENT-3a: T3 made T1 the successor
>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>
>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>
>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>
>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>
>>
>> This bug is intertwined with:
>>
>> - The ObjectMonitor successor protocol
>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>
>> There is a very long successor.notes attachment to JDK-8028073 that
>> attempts to describe the ObjectMonitor successor protocol. It's good
>> for putting pretty much anyone to sleep.
>>
>> Since this hang reproduces back to JDK6, this bug is taking the easily
>> backported solution of moving the original fix to the right location.
>> The following new bug has been filed for possible future work in this
>> area by the Serviceability Team:
>>
>>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>     https://bugs.openjdk.java.net/browse/JDK-8033399
>>
>>
>> The Symptoms
>> ------------
>>
>> With intermittent hangs like this, it is useful to know what to look
>> for in order to determine if you are running into this issue:
>>
>> - if you aren't using a debugger or a profiler or some other
>>   JVM/TI agent, then this hang is not the same as yours
>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>   event handler, then this hang is not the same as yours
>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>   JVM/TI RawMonitors, then this hang is not the same as yours
>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>   back into Java code, then you might just be insane and this
>>   hang might be similar to yours. However, using a Java callback
>>   in an event handler is an even bigger problem/risk so fix that
>>   first.
>> - if you have one or more threads blocked like this and making no
>>   progress, then this hang might be the same as yours:
>>
>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at java.lang.Object.wait(Native Method)
>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>         at java.lang.Object.wait(Object.java:502)
>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> Key symptoms in thread T1:
>>
>> - had the object locked:
>>
>>   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> - did an Object.wait():
>>
>>   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> - is blocked on reentry:
>>
>>   waiting for monitor entry [0xfffffd7fc0231000]
>>
>> Key symptoms in thread T2:
>>
>> - is blocked waiting to lock the object:
>>
>>   waiting for monitor entry [0xfffffd7fc0130000]
>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>> Key symptoms in thread T3:
>>
>> - is blocked waiting to lock the object:
>>
>>   waiting for monitor entry [0xfffffd7fc002f000]
>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>


From oleg.mazurov at oracle.com  Fri Feb  7 15:55:53 2014
From: oleg.mazurov at oracle.com (Oleg Mazurov)
Date: Fri, 07 Feb 2014 15:55:53 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <52F54AC6.8020804@oracle.com>
References: <52F01569.3070201@oracle.com>
	<52F41821.10403@oracle.com>	<52F41B7D.6090206@oracle.com>
	<B6B09E1E-E531-4C0F-BCC0-FF45032A30C0@oracle.com>
	<52F54AC6.8020804@oracle.com>
Message-ID: <52F57289.3080201@oracle.com>

The cost might be minimal but that would be a move in the wrong
direction in my opinion.
A larger problem is that BufferBlobs, being just placeholders for
dynamic code, should not be reported via JVMTI at all: when they are
created they usually contain no executable code, and actual objects
placed into such blobs are reported separately (that wasn't true for
vtable/itable stubs before this fix). That the same address, that of a
BufferBlob and its first object, could be reported twice is revealed by
a comment on the loop that follows the problematic comparison:

src/share/vm/prims/jvmtiCodeBlobEvents.cpp:

  124   // exclude VtableStubs, which are processed separately
  125   if (cb->is_buffer_blob() && strcmp(cb->name(), "vtable chunks") == 0) {
  126     return;
  127   }
  128
  129   // check if this starting address has been seen already - the
  130   // assumption is that stubs are inserted into the list before the
  131   // enclosing BufferBlobs.
  132   address addr = cb->code_begin();
  133   for (int i=0; i<_global_code_blobs->length(); i++) {
  134     JvmtiCodeBlobDesc* scb = _global_code_blobs->at(i);
  135     if (addr == scb->code_begin()) {
  136       return;
  137     }
  138   }

I believe that now that the vtable stub problem is fixed the need for
that check is gone, and both the strcmp() call and the following loop
could be removed altogether, thus stopping further processing for *any*
BufferBlob and avoiding the O(n^2) overhead they were causing.
The scope of that change is much larger than the original problem
entailed and would require not just additional ad hoc testing on the
JVMTI consumer side but also thorough static analysis of all BufferBlob
uses in the VM code.
In fact, I was going to file a linked JIRA issue on that further
improvement, and if my idea for it holds true there would be no need
for a new BufferBlob subtype.
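
To illustrate where the quadratic cost comes from, here is a small
standalone sketch. The names address, collect_linear() and
collect_hashed() are hypothetical and exist only for this example; none of
this is the code in jvmtiCodeBlobEvents.cpp. The first function mirrors the
shape of the quoted loop (every new start address is compared against all
addresses collected so far). The second is one possible alternative, a hash
set lookup, shown only for comparison; it is not what is being proposed for
the VM, where the idea is to drop the check entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for HotSpot's address type.
using address = std::uintptr_t;

// Linear duplicate check: each candidate is compared against every start
// address collected so far, so n distinct candidates cost n*(n-1)/2
// comparisons. Returns the number of comparisons performed.
std::size_t collect_linear(const std::vector<address>& starts,
                           std::vector<address>& out) {
  std::size_t comparisons = 0;
  for (address candidate : starts) {
    bool seen = false;
    for (address existing : out) {
      ++comparisons;
      if (candidate == existing) {
        seen = true;  // first occurrence wins, later ones are dropped
        break;
      }
    }
    if (!seen) {
      out.push_back(candidate);
    }
  }
  return comparisons;
}

// Alternative for comparison only: remember seen start addresses in a
// hash set, giving expected constant time per lookup.
std::vector<address> collect_hashed(const std::vector<address>& starts) {
  std::unordered_set<address> seen;
  std::vector<address> out;
  for (address candidate : starts) {
    if (seen.insert(candidate).second) {
      out.push_back(candidate);
    }
  }
  assert(out.size() == seen.size());
  return out;
}
```

Both functions keep the first occurrence of each start address and preserve
insertion order, so they produce the same list; only the amount of work
differs.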

     -- Oleg

On 2/7/2014 1:06 PM, Dean Long wrote:
> What's the cost for adding a new BufferBlob subtype?  We already have 
> AdapterBlob and MethodHandlesAdapterBlob.
>
> dl
>
> On 2/6/2014 3:52 PM, Oleg Mazurov wrote:
>> My understanding was that a buffer blob was just that - a buffer. 
>> Could potentially contain code fragments of different kinds.
>> Thus, is_buffer_blob() was the closest type available. Agree that a 
>> dependency on its name is not reliable, though testing
>> will reveal if the condition turns false for "vtable chunks" due to a 
>> name change (I had to deal with that particular test, Serguei
>> should be able to identify it). Adding a comment to where the name is 
>> defined (vtableStubs.cpp) that such a dependency exists
>> is a good idea.
>> Thanks,
>>
>>      -- Oleg
>>
>> On Feb 6, 2014, at 3:32 PM, Coleen Phillimore wrote:
>>
>>> Hi, I clicked on this a couple times. It seems okay but isn't there 
>>> a safer way to identify code blobs that are vtable stubs, without 
>>> looking at the name (which can change in while creating it).  A 
>>> comment at least when you create "vtable chunks" would be good.   It 
>>> seems that someone might want to rename it "vtable or itable 
>>> buffers", or something like that.
>>>
>>> thanks,
>>> Coleen
>>>
>>> On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote:
>>>> Runtime team,
>>>>
>>>> This fix was reviewed by Vladimir K. and me.
>>>> Just wanted to make sure if you would like to review it as well.
>>>> If not, then I will push it.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Please, review the fix for:
>>>>>    https://bugs.openjdk.java.net/browse/JDK-8025841
>>>>>
>>>>>
>>>>> Open webrev:
>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 
>>>>>
>>>>>
>>>>> Summary:
>>>>>
>>>>>    The fix contributed by Oleg Mazurov to improve profiling data 
>>>>> quality.
>>>>>    It moves the "vtable stub" dynamic code notification to the 
>>>>> right place.
>>>>>    I've already reviewed the fix, and it looks good to me.
>>>>>
>>>>>    Bug report description:
>>>>>
>>>>>     "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets 
>>>>> scheduled when
>>>>>      a new chunk of memory for subsequent vtable and itable stubs 
>>>>> is allocated.
>>>>>      That chunk is uninitialized (contains zeros or garbage) 
>>>>> although due to the fact
>>>>>      that the actual event delivery is deferred, at least one 
>>>>> vtable comes out right.
>>>>>
>>>>>      This event should describe an individual vtable/itable stub 
>>>>> (base address and size)
>>>>>      and only after it's been created (memory is actually 
>>>>> populated with code).
>>>>>      Where VM diagnostic messages about vtable/itable stubs are 
>>>>> issued upon
>>>>>      -XX:+PrintAdapterHandlers appears exactly the right place for 
>>>>> JVMTI events as well.
>>>>>
>>>>>      Getting vtables/itables right is important in the context of 
>>>>> performance analysis as
>>>>>      that dynamically generated code may accumulate quite 
>>>>> noticeable CPU time
>>>>>      (especially itables), sometimes larger than the actual Java 
>>>>> methods called."
>>>>>
>>>>>
>>>>> Testing:
>>>>>    Oleg tested it in the Oracle Studio Performance Analyzer 
>>>>> environment.
>>>>>    nsk.jvmti, nsk.jdi, nsk.jdwp,
>>>>>    In progress: Jtreg com/sun/jdi, java/lang/instrument
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>


From dean.long at oracle.com  Fri Feb  7 16:08:36 2014
From: dean.long at oracle.com (Dean Long)
Date: Fri, 07 Feb 2014 16:08:36 -0800
Subject: Review Request (S) 8025841: JVMTI: "vtable stub" dynamic code
	notification is misplaced
In-Reply-To: <52F57289.3080201@oracle.com>
References: <52F01569.3070201@oracle.com>
	<52F41821.10403@oracle.com>	<52F41B7D.6090206@oracle.com>
	<B6B09E1E-E531-4C0F-BCC0-FF45032A30C0@oracle.com>
	<52F54AC6.8020804@oracle.com> <52F57289.3080201@oracle.com>
Message-ID: <52F57584.70007@oracle.com>

OK.  Your further improvement idea sounds promising.

dl

On 2/7/2014 3:55 PM, Oleg Mazurov wrote:
> The cost might be minimal but that would be a move in the wrong 
> direction in my opinion.
> A larger problem is that BufferBlobs being just placeholders for 
> dynamic code should not be
> reported via JVMTI at all: when they are created they usually contain 
> no executable code and
> actual objects placed into such blobs are reported separately (that 
> wasn't true for vtable/itable
> stubs before this fix). That the same address, that of a BufferBlob 
> and its first object, could be
> reported twice is revealed by a comment to a loop that follows the 
> problematic comparison:
>
> src/share/vm/prims/jvmtiCodeBlobEvents.cpp:
>
>  124   // exclude VtableStubs, which are processed separately
>  125   if (cb->is_buffer_blob() && strcmp(cb->name(), "vtable chunks") == 0) {
>  126     return;
>  127   }
>  128
>  129   // check if this starting address has been seen already - the
>  130   // assumption is that stubs are inserted into the list before the
>  131   // enclosing BufferBlobs.
>  132   address addr = cb->code_begin();
>  133   for (int i=0; i<_global_code_blobs->length(); i++) {
>  134     JvmtiCodeBlobDesc* scb = _global_code_blobs->at(i);
>  135     if (addr == scb->code_begin()) {
>  136       return;
>  137     }
>  138   }
>
> I believe that now that the vtable stub problem is fixed the need for 
> that check is gone
> and both the strcmp() call and the following loop could be removed 
> altogether, thus stopping
> further processing for *any* BufferBlob and avoiding an O(n^2) overhead 
> they were causing.
> The scope of that change is much larger than the original problem 
> entailed and would require
> not just additional ad hoc testing on the JVMTI consumer side but also 
> thorough static analysis
> for all BufferBlob uses in the VM code.
> In fact, I was going to file a linked JIRA issue on that further 
> improvement and if my idea for
> it holds true there would be no need for a new BufferBlob subtype.
>
>     -- Oleg
>
> On 2/7/2014 1:06 PM, Dean Long wrote:
>> What's the cost for adding a new BufferBlob subtype?  We already have 
>> AdapterBlob and MethodHandlesAdapterBlob.
>>
>> dl
>>
>> On 2/6/2014 3:52 PM, Oleg Mazurov wrote:
>>> My understanding was that a buffer blob was just that - a buffer. 
>>> Could potentially contain code fragments of different kinds.
>>> Thus, is_buffer_blob() was the closest type available. Agree that a 
>>> dependency on its name is not reliable, though testing
>>> will reveal if the condition turns false for "vtable chunks" due to 
>>> a name change (I had to deal with that particular test, Serguei
>>> should be able to identify it). Adding a comment to where the name 
>>> is defined (vtableStubs.cpp) that such a dependency exists
>>> is a good idea.
>>> Thanks,
>>>
>>>      -- Oleg
>>>
>>> On Feb 6, 2014, at 3:32 PM, Coleen Phillimore wrote:
>>>
>>>> Hi, I clicked on this a couple times. It seems okay but isn't there 
>>>> a safer way to identify code blobs that are vtable stubs, without 
>>>> looking at the name (which can change in while creating it).  A 
>>>> comment at least when you create "vtable chunks" would be good.   
>>>> It seems that someone might want to rename it "vtable or itable 
>>>> buffers", or something like that.
>>>>
>>>> thanks,
>>>> Coleen
>>>>
>>>> On 2/6/14 6:17 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Runtime team,
>>>>>
>>>>> This fix was reviewed by Vladimir K. and me.
>>>>> Just wanted to make sure if you would like to review it as well.
>>>>> If not, then I will push it.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>> On 2/3/14 2:17 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Please, review the fix for:
>>>>>>    https://bugs.openjdk.java.net/browse/JDK-8025841
>>>>>>
>>>>>>
>>>>>> Open webrev:
>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/omazurov/8025841-JVMTI-vtbl.1 
>>>>>>
>>>>>>
>>>>>> Summary:
>>>>>>
>>>>>>    The fix contributed by Oleg Mazurov to improve profiling data 
>>>>>> quality.
>>>>>>    It moves the "vtable stub" dynamic code notification to the 
>>>>>> right place.
>>>>>>    I've already reviewed the fix, and it looks good to me.
>>>>>>
>>>>>>    Bug report description:
>>>>>>
>>>>>>     "JVMTI_EVENT_DYNAMIC_CODE_GENERATED for "vtable stub" gets 
>>>>>> scheduled when
>>>>>>      a new chunk of memory for subsequent vtable and itable stubs 
>>>>>> is allocated.
>>>>>>      That chunk is uninitialized (contains zeros or garbage) 
>>>>>> although due to the fact
>>>>>>      that the actual event delivery is deferred, at least one 
>>>>>> vtable comes out right.
>>>>>>
>>>>>>      This event should describe an individual vtable/itable stub 
>>>>>> (base address and size)
>>>>>>      and only after it's been created (memory is actually 
>>>>>> populated with code).
>>>>>>      Where VM diagnostic messages about vtable/itable stubs are 
>>>>>> issued upon
>>>>>>      -XX:+PrintAdapterHandlers appears exactly the right place 
>>>>>> for JVMTI events as well.
>>>>>>
>>>>>>      Getting vtables/itables right is important in the context of 
>>>>>> performance analysis as
>>>>>>      that dynamically generated code may accumulate quite 
>>>>>> noticeable CPU time
>>>>>>      (especially itables), sometimes larger than the actual Java 
>>>>>> methods called."
>>>>>>
>>>>>>
>>>>>> Testing:
>>>>>>    Oleg tested it in the Oracle Studio Performance Analyzer 
>>>>>> environment.
>>>>>>    nsk.jvmti, nsk.jdi, nsk.jdwp,
>>>>>>    In progress: Jtreg com/sun/jdi, java/lang/instrument
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>
>


From david.holmes at oracle.com  Sun Feb  9 19:37:11 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 10 Feb 2014 13:37:11 +1000
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52F5702C.50508@oracle.com>
References: <52ED3F28.5070806@oracle.com>	<6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>
	<52F5702C.50508@oracle.com>
Message-ID: <52F84967.8040807@oracle.com>

trimming content ...

On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote:
> On 2/7/14 2:56 PM, Karen Kinnear wrote:
>> 3. Did I read the code correctly that the Thread::SpinAcquire can make
>> a timed park
>> call on the same thread's _ParkEvent? And that this is used to get on
>> and off the wait queue,
>> i.e. to acquire the WaitSetLock?
>>     Is there the same risk that a notify might be eaten here also?
>
> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent

It sure does:

void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
   if (Atomic::cmpxchg (1, adr, 0) == 0) {
      return ;   // normal fast-path return
   }

   // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
   TEVENT (SpinAcquire - ctx) ;
   int ctr = 0 ;
   int Yields = 0 ;
   for (;;) {
      while (*adr != 0) {
         ++ctr ;
         if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
            if (Yields > 5) {
              // Consider using a simple NakedSleep() instead.
              // Then SpinAcquire could be called by non-JVM threads
              Thread::current()->_ParkEvent->park(1) ;

So considering Karen's question ... I can't tell for certain. :(

I do not think the SpinAcquire on grabbing the wait-set lock to add to 
the wait-set can be an issue because we will only park in response to 
the actual wait, and hence only get unparked due to a notify/notifyAll, 
but at this point we still own the monitor so no notify/notifyAll is 
possible.

However, for the removal from the wait-set a more complex analysis is 
needed. To do the SpinAcquire we must still be flagged as TS_WAIT - 
which means we have not been notified, but must be returning due to a 
timeout (or spurious wakeup?). In such circumstances could we be _succ? 
I don't think so but I'll leave it to Dan to confirm that part :)

David
-----

> at all. However, Thread::muxAcquire() does use a ParkEvent, but it
> is a different ParkEvent. From src/share/vm/runtime/thread.hpp:
>
>    ParkEvent * _ParkEvent ;               // for synchronized()
>    ParkEvent * _SleepEvent ;              // for Thread.sleep
>    ParkEvent * _MutexEvent ;              // for native internal
> Mutex/Monitor
>    ParkEvent * _MuxEvent ;                // for low-level
> muxAcquire-muxRelease
>
> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire()
> uses the _MuxEvent. There are some comments in thread.cpp about
> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't
> think we ever want to go there.
>
> I also filed this RFE:
>
>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>      https://bugs.openjdk.java.net/browse/JDK-8033399
>
> just in case the Serviceability team wants to migrate JVM/TI RawMonitors
> to a separate ParkEvent.
>
> Please let me know if you concur that I've resolved issue #3.
>
>
>> If so, I wonder if we want this added unpark to not just be called if
>> JVMTI_EVENT_MONITOR_WAITED
>> is enabled?
>
> I don't think we need it, but I've noted its removal as a risk.
>
> Again, thanks for the review!
>
> Dan
>
>
>>
>> thanks,
>> Karen
>>
>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote:
>>
>>> Greetings,
>>>
>>> I have a fix ready for the following bug:
>>>
>>>     8028073 race condition in ObjectMonitor implementation causing
>>> deadlocks
>>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>>
>>> On the surface, this is a very simple fix that relocates a few lines of
>>> code, relocates and rewrites the comments associated with that code and
>>> adds several new comments.
>>>
>>> Of course, in reality, the issue is much more complicated, but I'm
>>> hoping to make it easy for anyone not acquainted with this issue to
>>> understand what's going on.
>>>
>>> Here are the JDK9 webrev URLs:
>>>
>>> OpenJDK:
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> Oracle internal:
>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> The simple summary:
>>>
>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>   it is possible for a JVM/TI monitor event handler to accidentally
>>>   consume a ParkEvent.unpark() call meant for the Java Monitor layer
>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>   https://bugs.openjdk.java.net/browse/JDK-5030359
>>> - it's the right fix, but it's in the wrong place
>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>   event handler is called because it is that event handler
>>>   that can cause the hang
>>>
>>>
>>> Testing
>>> -------
>>>
>>> - a new StressMonitorWait test has been created that reliably
>>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>>   for the specific versions where the hang has been reproduced
>>>   - the test reliably reproduces the hang in 5 seconds on my
>>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>     the hang reliably on other machines
>>>   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>     bits did not reproduce the hang
>>> - JPRT test job
>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>   and Windows-X64:
>>>   - vm.quick
>>>   - Kitchensink (bigapps)
>>>   - Weblogic+medrec (bigapps)
>>>   - runThese (bigapps)
>>>
>>>
>>> The Gory Details Start Here
>>> ---------------------------
>>>
>>> This is the old location of the block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> <snip>
>>> 1499    exit (true, Self) ;                    // exit the monitor
>>> <snip>
>>> 1513    if (node._notified != 0 && _succ == Self) {
>>> 1514       node._event->unpark();
>>> 1515    }
>>>
>>>
>>> This is the new location of the block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> <snip>
>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret ==
>>> OS_TIMEOUT);
>>> <snip>
>>> 1604        if (node._notified != 0 && _succ == Self) {
>>> <snip>
>>> 1620          node._event->unpark();
>>> 1621        }
>>>
>>>
>>> The Risks
>>> ---------
>>>
>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>   is enabled:
>>>   - previously it was always executed
>>>   - while the old code was not effective for the hang that is being
>>>     fixed with this bug, it is possible that the old code prevented
>>>     a different bug in the successor protocol from manifesting
>>>   - thorough analysis of the successor protocol did not reveal a
>>>     case where the old code was needed in the old location
>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>   - the successor protocol is complicated and the analysis could
>>>     be wrong when certain options are used
>>>   - comments were added to each location where a JVM/TI monitor
>>>     event handler is called documenting why a fix like this one
>>>     is not needed there
>>>   - if the analysis is wrong, the new comments show where a new
>>>     code change would be needed
>>>
>>>
>>> The Scenario
>>> ------------
>>>
>>> I've created a scenario that reproduces this hang:
>>>
>>> T1 - enters monitor and calls monitor.wait()
>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>> T3 - enters and exits the monitor
>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>
>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>>
>>> Here are the six events necessary to make this hang happen:
>>>
>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ
>>> field, but
>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor
>>> again.
>>>
>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>
>>> // KEY-EVENT-3a: T3 made T1 the successor
>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>
>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>
>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>
>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be
>>> stuck.
>>>
>>>
>>> This bug is intertwined with:
>>>
>>> - The ObjectMonitor successor protocol
>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI
>>> RawMonitors
>>>
>>> There is a very long successor.notes attachment to JDK-8028073 that
>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>> for putting pretty much anyone to sleep.
>>>
>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>> backported solution of moving the original fix to the right location.
>>> The following new bug has been filed for possible future work in this
>>> area by the Serviceability Team:
>>>
>>>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>     https://bugs.openjdk.java.net/browse/JDK-8033399
>>>
>>>
>>> The Symptoms
>>> ------------
>>>
>>> With intermittent hangs like this, it is useful to know what to look
>>> for in order to determine if you are running into this issue:
>>>
>>> - if you aren't using a debugger or a profiler or some other
>>>   JVM/TI agent, then this hang is not the same as yours
>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>   event handler, then this hang is not the same as yours
>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>   JVM/TI RawMonitors, then this hang is not the same as yours
>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>   back into Java code, then you might just be insane and this
>>>   hang might be similar to yours. However, using a Java callback
>>>   in an event handler is an even bigger problem/risk so fix that
>>>   first.
>>> - if you have one or more threads blocked like this and making no
>>>   progress, then this hang might be the same as yours:
>>>
>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>    JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at java.lang.Object.wait(Native Method)
>>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>         at java.lang.Object.wait(Object.java:502)
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>    JavaThread state: _thread_blocked
>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>    JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>    JavaThread state: _thread_blocked
>>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T1:
>>>
>>> - had the object locked:
>>>
>>>   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - did an Object.wait():
>>>
>>>   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - is blocked on reentry:
>>>
>>>   waiting for monitor entry [0xfffffd7fc0231000]
>>>
>>> Key symptoms in thread T2:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>   waiting for monitor entry [0xfffffd7fc0130000]
>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T3:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>   waiting for monitor entry [0xfffffd7fc002f000]
>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>

From jaroslav.bachorik at oracle.com  Mon Feb 10 04:00:56 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 10 Feb 2014 13:00:56 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F10221.6010303@oracle.com>
References: <52F0192B.9060104@oracle.com> <52F0AABC.7000904@oracle.com>
	<52F0ABC2.4040700@oracle.com> <52F10221.6010303@oracle.com>
Message-ID: <52F8BF78.60803@oracle.com>

On 4.2.2014 16:07, shanliang wrote:
> Jaroslav Bachorik wrote:
>> On 4.2.2014 09:54, shanliang wrote:
>>> Jaroslav,
>>>
>>> Your fix should work in most cases, but would it be better and more
>>> reliable to wait for a VM event, as suggested in the bug? Even though
>>> your timeout is adapted to the test timeout factor, the solution still
>>> depends on a fixed timeout and a fixed output line.
>>
>> Well, if I get the test logic correctly it is supposed to test that
>> the agent blocks the port even when no client has connected yet.
>> Connecting to the agent and waiting for the event would change the
>> thing the test checks, actually.
> You are right that the test should not attach a VM before launching the
> second debuggee. Let's hope that 5000 * Utils.TIMEOUT_FACTOR works for
> all testing machines.

Hopefully it should. 5 seconds to start the debuggee under normal 
circumstances sounds more than enough. For the exceptional circumstances 
the TIMEOUT_FACTOR should be properly tuned. We will see.


> Looks OK.

Thanks!

May I get an official reviewer to take a look at this, please?

-JB-

>
> Thanks,
> Shanliang
>>
>> -JB-
>>
>>>
>>> Shanliang
>>>
>>> Jaroslav Bachorik wrote:
>>>> Please, review the following test fix:
>>>>
>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>>>>
>>>> The fix prevents the situation when the first debuggee has not managed
>>>> to finish its initialization while the second one is started up thus
>>>> making the port available for the second debuggee and failing the test.
>>>>
>>>> The patch is using the library methods to configure and launch the
>>>> debuggee and the test waits for the well known string to appear in the
>>>> first debuggee output before attempting to launch the second debuggee.
>>>>
>>>> Thanks,
>>>>
>>>> -JB-
>>>
>>
>

From staffan.larsen at oracle.com  Mon Feb 10 04:22:41 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 10 Feb 2014 13:22:41 +0100
Subject: RFR 6791551: ExclusiveBind.java has a race condition
In-Reply-To: <52F8BF78.60803@oracle.com>
References: <52F0192B.9060104@oracle.com> <52F0AABC.7000904@oracle.com>
	<52F0ABC2.4040700@oracle.com> <52F10221.6010303@oracle.com>
	<52F8BF78.60803@oracle.com>
Message-ID: <16C6B74D-4E42-4464-B6CE-7338D90E68AE@oracle.com>

Looks good!

Thanks,
/Staffan

On 10 feb 2014, at 13:00, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:

> On 4.2.2014 16:07, shanliang wrote:
>> Jaroslav Bachorik wrote:
>>> On 4.2.2014 09:54, shanliang wrote:
>>>> Jaroslav,
>>>> 
>>>> Your fix should work in most cases, but would it be better and more
>>>> reliable to wait for a VM event, as suggested in the bug? Even though
>>>> your timeout is adapted to the test timeout factor, the solution still
>>>> depends on a fixed timeout and a fixed output line.
>>> 
>>> Well, if I get the test logic correctly it is supposed to test that
>>> the agent blocks the port even when no client has connected yet.
>>> Connecting to the agent and waiting for the event would change the
>>> thing the test checks, actually.
>> You are right that the test should not attach a VM before launching the
>> second debuggee. Let's hope that 5000 * Utils.TIMEOUT_FACTOR works for
>> all testing machines.
> 
> Hopefully it should. 5 seconds to start the debuggee under normal circumstances sounds more than enough. For the exceptional circumstances the TIMEOUT_FACTOR should be properly tuned. We will see.
> 
> 
>> Looks OK.
> 
> Thanks!
> 
> May I get an official reviewer to take a look at this, please?
> 
> -JB-
> 
>> 
>> Thanks,
>> Shanliang
>>> 
>>> -JB-
>>> 
>>>> 
>>>> Shanliang
>>>> 
>>>> Jaroslav Bachorik wrote:
>>>>> Please, review the following test fix:
>>>>> 
>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-6791551
>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/6791551/webrev.00
>>>>> 
>>>>> The fix prevents the situation when the first debuggee has not managed
>>>>> to finish its initialization while the second one is started up thus
>>>>> making the port available for the second debuggee and failing the test.
>>>>> 
>>>>> The patch is using the library methods to configure and launch the
>>>>> debuggee and the test waits for the well known string to appear in the
>>>>> first debuggee output before attempting to launch the second debuggee.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> -JB-


From staffan.larsen at oracle.com  Mon Feb 10 05:38:48 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 10 Feb 2014 14:38:48 +0100
Subject: RFR: 6545422 NativeErrors.java uses wrong path name in exec
In-Reply-To: <7a061a24-dd85-4834-b6fd-fb917e0f9a47@default>
References: <7a061a24-dd85-4834-b6fd-fb917e0f9a47@default>
Message-ID: <6FB1E959-DFDD-481E-9020-9EE3230D6C14@oracle.com>

And a looong time later: Looks good!

Thanks,
/Staffan

On 4 dec 2013, at 15:38, Mattias Tobiasson <mattias.tobiasson at oracle.com> wrote:

> Hi,
> Could someone please review this patch?
> 
> webrev: http://cr.openjdk.java.net/~ykantser/6545422/webrev.00
> bug: https://bugs.openjdk.java.net/browse/JDK-6545422
> 
> Changes:
> Uses common testlibrary function to find native2ascii binary.
> Check null values instead of throwing NullPointerException at different locations. The old version checked null with "assert", which is not always enabled.
> 
> 
> Mattias


From rob.mckenna at oracle.com  Mon Feb 10 06:50:15 2014
From: rob.mckenna at oracle.com (rob.mckenna at oracle.com)
Date: Mon, 10 Feb 2014 14:50:15 +0000
Subject: hg: jdk8/tl/jdk: 7152892: some jtreg tests fail with permission denied
Message-ID: <20140210145247.DD74462B24@hg.openjdk.java.net>

Changeset: da4b0962ad11
Author:    robm
Date:      2014-02-10 14:35 +0000
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/da4b0962ad11

7152892: some jtreg tests fail with permission denied
Reviewed-by: coffeys

! test/java/lang/ClassLoader/Assert.sh
! test/java/rmi/registry/readTest/readTest.sh
! test/java/util/zip/ZipFile/ReadZip.java
! test/sun/net/www/protocol/jar/jarbug/run.sh


From daniel.daugherty at oracle.com  Mon Feb 10 10:55:41 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 10 Feb 2014 11:55:41 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52F84967.8040807@oracle.com>
References: <52ED3F28.5070806@oracle.com>	<6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>
	<52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com>
Message-ID: <52F920AD.6050705@oracle.com>

On 2/9/14 8:37 PM, David Holmes wrote:
> trimming content ...
>
> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote:
>> On 2/7/14 2:56 PM, Karen Kinnear wrote:
>>> 3. Did I read the code correctly that the Thread::SpinAcquire can make
>>> a timed park
>>> call on the same thread's _ParkEvent? And that this is used to get on
>>> and off the wait queue,
>>> i.e. to acquire the WaitSetLock?
>>>     Is there the same risk that a notify might be eaten here also?
>>
>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent
>
> It sure does:
>
> void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>   if (Atomic::cmpxchg (1, adr, 0) == 0) {
>      return ;   // normal fast-path return
>   }
>
>   // Slow-path : We've encountered contention -- Spin/Yield/Block 
> strategy.
>   TEVENT (SpinAcquire - ctx) ;
>   int ctr = 0 ;
>   int Yields = 0 ;
>   for (;;) {
>      while (*adr != 0) {
>         ++ctr ;
>         if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>            if (Yields > 5) {
>              // Consider using a simple NakedSleep() instead.
>              // Then SpinAcquire could be called by non-JVM threads
>              Thread::current()->_ParkEvent->park(1) ;

Ummmm... that's not the code I'm seeing...

src/share/vm/runtime/thread.cpp:

   4417  void Thread::SpinAcquire (volatile int * adr, const char * 
LockName) {
   4418    if (Atomic::cmpxchg (1, adr, 0) == 0) {
   4419       return ;   // normal fast-path return
   4420    }
   4421
   4422    // Slow-path : We've encountered contention -- 
Spin/Yield/Block strategy.
   4423    TEVENT (SpinAcquire - ctx) ;
   4424    int ctr = 0 ;
   4425    int Yields = 0 ;
   4426    for (;;) {
   4427       while (*adr != 0) {
   4428          ++ctr ;
   4429          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
   4430             if (Yields > 5) {
   4431               os::naked_short_sleep(1);
   4432             } else {
   4433               os::NakedYield() ;
   4434               ++Yields ;
   4435             }
   4436          } else {
   4437             SpinPause() ;
   4438          }
   4439       }
   4440       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
   4441    }
   4442  }

Mr Simms recently changed the above code via:

changeset:   5832:5944dba4badc
user:        dsimms
date:        Fri Jan 24 09:28:47 2014 +0100
summary:     8028280: ParkEvent leak when running modified runThese 
which only loads classes

os::naked_short_sleep() is new:

- BSD/MacOS X, Linux - uses nanosleep()
- Solaris - uses usleep()
- Windows - uses Sleep()

The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs
on 2014.01.29. I don't see any signs that Mr Simms' fix will be backported
to JDK8u/HSX-25u (yet) so this part of the review thread might impact the
backport of my fix to earlier releases.
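
To make the difference concrete, here is a minimal standalone sketch of the
Spin/Yield/Block strategy in which the blocking step is a plain sleep rather
than a timed park on a shared event. It is not the HotSpot code: the names
spin_acquire and spin_release, and the use of std::atomic and
std::this_thread, are stand-ins assumed for illustration, and the
os::is_MP() check is left out. Only the 0xFFF mask and the yield threshold
of 5 are taken from the listing above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for Thread::SpinAcquire(): 0 means free, 1 means held.
inline void spin_acquire(std::atomic<int>& lock_word) {
    int expected = 0;
    if (lock_word.compare_exchange_strong(expected, 1)) {
        return;  // fast path: the lock was free
    }

    // Slow path: spin, then yield, then block.
    int ctr = 0;
    int yields = 0;
    for (;;) {
        while (lock_word.load() != 0) {
            ++ctr;
            if ((ctr & 0xFFF) == 0) {
                if (yields > 5) {
                    // Block with a plain sleep. No per-thread park permit is
                    // touched here, so a pending unpark() cannot be consumed.
                    std::this_thread::sleep_for(std::chrono::milliseconds(1));
                } else {
                    std::this_thread::yield();
                    ++yields;
                }
            }
            // Otherwise just busy-spin (HotSpot calls SpinPause() here).
        }
        expected = 0;
        if (lock_word.compare_exchange_strong(expected, 1)) {
            return;
        }
    }
}

// Hypothetical stand-in for Thread::SpinRelease().
inline void spin_release(std::atomic<int>& lock_word) {
    assert(lock_word.load() == 1 && "releasing a lock that is not held");
    lock_word.store(0);
}
```

The point of the sketch is only that the sleep in the slow path has no side
effect on any wakeup state owned by the calling thread, which is what the
timed park in the older code could not guarantee.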


> So considering Karen's question ... I can't tell for certain. :(
>
> I do not think the SpinAcquire on grabbing the wait-set lock to add to 
> the wait-set can be an issue because we will only park in response to 
> the actual wait, and hence only get unparked due to a 
> notify/notifyAll, but at this point we still own the monitor so no 
> notify/notifyAll is possible.
>
> However, for the removal from the wait-set a more complex analysis is 
> needed. To do the SpinAcquire we must still be flagged as TS_WAIT - 
> which means we have not been notified, but must be returning due to a 
> timeout (or spurious wakeup?). In such circumstances could we be 
> _succ? I don't think so but I'll leave it to Dan to confirm that part :)

So for HSX-25 and probably older...

There are four Thread::SpinAcquire() calls in the objectMonitor code:

     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ;
     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ;
     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ;

We can easily rule out the "notify" and "notifyAll" uses since the
current thread owns the Java-level monitor and there are no events
to post in this part of the notify() or notifyAll() protocols.

For the "WaitSet - add" use, the current thread owns the Java-level
monitor and the thread has not been added as a waiter yet so another
thread cannot do the notify-exit-make-successor part of the protocol
yet.

For the "WaitSet - unlink" use:

src/share/vm/runtime/objectMonitor.cpp:

   1569       if (node.TState == ObjectWaiter::TS_WAIT) {
   1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
   1571           if (node.TState == ObjectWaiter::TS_WAIT) {
   1572              DequeueSpecificWaiter (&node) ;       // unlink 
from WaitSet
   1573              assert(node._notified == 0, "invariant");
   1574              node.TState = ObjectWaiter::TS_RUN ;
   1575           }
   1576           Thread::SpinRelease (&_WaitSetLock) ;
   1577       }

It is the call on line 1570 above that gets us into this code:

src/share/vm/runtime/thread.cpp:

   4435  void Thread::SpinAcquire (volatile int * adr, const char * 
LockName) {
   4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
   4437       return ;   // normal fast-path return
   4438    }
   4439
   4440    // Slow-path : We've encountered contention -- 
Spin/Yield/Block strategy.
   4441    TEVENT (SpinAcquire - ctx) ;
   4442    int ctr = 0 ;
   4443    int Yields = 0 ;
   4444    for (;;) {
   4445       while (*adr != 0) {
   4446          ++ctr ;
   4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
   4448             if (Yields > 5) {
   4449               // Consider using a simple NakedSleep() instead.
   4450               // Then SpinAcquire could be called by non-JVM threads
   4451               Thread::current()->_ParkEvent->park(1) ;
   4452             } else {
   4453               os::NakedYield() ;
   4454               ++Yields ;
   4455             }
   4456          } else {
   4457             SpinPause() ;
   4458          }
   4459       }
   4460       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
   4461    }
   4462  }

And the above code can consume the unpark() on line 4451.

So how the heck do we get to line 1570???

Well, the target thread would have to be both notified and unparked
to be executing this code path. When the notify() code runs, the
target of the notify() is changed from ObjectWaiter::TS_WAIT to
ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default
for Knob_MoveNotifyee is 2, so we're in non-default mode here...

Here are the Knob_MoveNotifyee policy values:

    1717      if (Policy == 0) {       // prepend to EntryList
    1728      if (Policy == 1) {      // append to EntryList
    1744      if (Policy == 2) {      // prepend to cxq
    1760      if (Policy == 3) {      // append to cxq

For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism
where we just unpark the target thread and let it run. Part of
that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN.

The code works the same for notifyAll() for the thread picked
to be notified. For the Knob_MoveNotifyee == 4 (or higher) case,
we just unpark all the waiters and we have a free-for-all.
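
As a compact restatement of the policy values, the following sketch maps a
Knob_MoveNotifyee value to what happens to the notified waiter. The types
NotifyOutcome and WaiterState and the function describe_notify are
hypothetical names made up for this illustration. The states simply follow
the description in this message and may be coarser than what the real
notify() code does.

```cpp
#include <cassert>
#include <string>

// Simplified waiter states, named after the ObjectWaiter states mentioned above.
enum class WaiterState { TS_WAIT, TS_ENTER, TS_RUN };

struct NotifyOutcome {
    std::string placement;    // where the notified waiter is moved
    WaiterState new_state;    // state after notify(), per the description above
    bool unparked_by_notify;  // true only for the old "just unpark it" mechanism
};

// Hypothetical helper: describes the effect of notify() for a given policy.
inline NotifyOutcome describe_notify(int policy) {
    assert(policy >= 0 && "policy values are assumed to be non-negative");
    switch (policy) {
        case 0:  return {"prepend to EntryList", WaiterState::TS_ENTER, false};
        case 1:  return {"append to EntryList",  WaiterState::TS_ENTER, false};
        case 2:  return {"prepend to cxq",       WaiterState::TS_ENTER, false};
        case 3:  return {"append to cxq",        WaiterState::TS_ENTER, false};
        default: return {"unpark directly",      WaiterState::TS_RUN,   true};
    }
}
```

In every case the waiter leaves TS_WAIT as part of the notify, which is why
a notified thread does not take the TS_WAIT branch at line 1569.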

So it looks like the code block from lines 1569-1577 is never
used... or is it? Well... you have to remember two things:

1) spurious unpark()
2) timed wait()

The caller might have called wait(0), but that doesn't mean that
the underlying park() mechanism won't have a spurious unpark().
Or better, the caller might have called wait(1) and be running
again after a millisecond.
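
A small model of that situation, assuming nothing about the real
ObjectMonitor internals: a waiter that wakes up without having been notified
is still on the wait set and has to unlink itself. WaitSet, Waiter and
timed_wait are hypothetical names, and a std::mutex stands in for
_WaitSetLock.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

struct Waiter {
    bool notified = false;       // set by notify_one() while holding the lock
    std::condition_variable cv;  // per-waiter wakeup channel
};

class WaitSet {
public:
    // Returns true if the waiter was notified, false if it woke up for any
    // other reason (timeout or spurious wakeup) and had to unlink itself.
    bool timed_wait(Waiter& w, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> guard(lock_);
        waiters_.push_back(&w);
        w.cv.wait_for(guard, timeout);  // may also return spuriously
        if (!w.notified) {
            // Still in the "TS_WAIT" situation: nobody dequeued us, so we
            // dequeue ourselves, as the "WaitSet - unlink" block does.
            auto it = std::find(waiters_.begin(), waiters_.end(), &w);
            assert(it != waiters_.end());
            waiters_.erase(it);
            return false;
        }
        return true;
    }

    // Dequeues the oldest waiter, marks it notified and wakes it.
    // Returns false if nobody was waiting.
    bool notify_one() {
        std::lock_guard<std::mutex> guard(lock_);
        if (waiters_.empty()) {
            return false;
        }
        Waiter* w = waiters_.front();
        waiters_.erase(waiters_.begin());
        w->notified = true;
        w->cv.notify_one();
        return true;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> guard(lock_);
        return waiters_.size();
    }

private:
    std::mutex lock_;
    std::vector<Waiter*> waiters_;
};
```

Calling timed_wait() with a 1 ms timeout and no notifier always takes the
self-unlink path, which corresponds to the wait(1) case described above.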

So in the HSX-25 and older systems (i.e., without Mr Simms' fix for
8028280), it is possible for this call:

   1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;

to consume the unpark(). The gauntlet that has to be traversed
to get to this call:

   4451               Thread::current()->_ParkEvent->park(1) ;

is impressive:

- fast-path acquisition of the _WaitSetLock has to fail:

   4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
   4437       return ;   // normal fast-path return
   4438    }

- if the machine is a uniprocessor, then 6 os::NakedYield()
   call-loop-recheck attempts have to fail:

   4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
   4448             if (Yields > 5) {
   4449               // Consider using a simple NakedSleep() instead.
   4450               // Then SpinAcquire could be called by non-JVM threads
   4451               Thread::current()->_ParkEvent->park(1) ;
   4452             } else {
   4453               os::NakedYield() ;
   4454               ++Yields ;
   4455             }

- if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause()
   attempts, 1 os::NakedYield() attempt}  have to fail:

   4446          ++ctr ;
   4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
   4448             if (Yields > 5) {
   4449               // Consider using a simple NakedSleep() instead.
   4450               // Then SpinAcquire could be called by non-JVM threads
   4451               Thread::current()->_ParkEvent->park(1) ;
   4452             } else {
   4453               os::NakedYield() ;
   4454               ++Yields ;
   4455             }
   4456          } else {
   4457             SpinPause() ;
   4458          }

But it is possible. It is one of those once-in-a-blue-moon type
windows where everything has to line up just so.
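
To put numbers on the gauntlet, the sketch below replays the slow-path loop
under the assumption that the lock word never becomes free and counts what
happens before the first park(1) would be reached. The function
steps_before_first_park and the struct SpinCounts are hypothetical helpers.
The control flow is copied from lines 4446-4458 above, with the actual
yield, spin and park calls replaced by counters.

```cpp
#include <cassert>

struct SpinCounts {
    long spin_pauses;  // SpinPause() calls before the first park
    int yields;        // os::NakedYield() calls before the first park
};

// Replays the loop body for a lock that stays held. is_mp plays the role of
// os::is_MP().
inline SpinCounts steps_before_first_park(bool is_mp) {
    SpinCounts counts{0, 0};
    int ctr = 0;
    int yields = 0;
    for (;;) {
        ++ctr;
        if ((ctr & 0xFFF) == 0 || !is_mp) {
            if (yields > 5) {
                return counts;     // this is where park(1) would be called
            }
            ++counts.yields;       // stands for os::NakedYield()
            ++yields;
            assert(yields <= 6);   // the park branch is taken once yields is 6
        } else {
            ++counts.spin_pauses;  // stands for SpinPause()
        }
    }
}
```

Under these assumptions a multi-processor reaches the park after 6 yields
and 7 * 4095 = 28665 spin pauses, because one more batch of spins runs
before Yields > 5 is tested again. A uniprocessor reaches it after 6 yields
and no spin pauses.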

So how do we address this issue in HSX-25 and possibly older?

If Mr Simms' fix for 8028280 is also backported, then there is no
issue. If it is not backported, then applying the fix for this
bug like so:

src/share/vm/runtime/objectMonitor.cpp:

   1596       if (JvmtiExport::should_post_monitor_waited()) {
   1597         JvmtiExport::post_monitor_waited(jt, this, ret == 
OS_TIMEOUT);
   1598       }

   1604       if (node._notified != 0 && _succ == Self) {
   1605         // In this part of the monitor wait-notify-reenter 
protocol it
   1606         // is possible (and normal) for another thread to do a 
fastpath
   1607         // monitor enter-exit while this thread is still trying 
to get
   1608         // to the reenter portion of the protocol.
   1609         //
   1610         // The ObjectMonitor was notified and the current thread is
   1611         // the successor which also means that an unpark() has 
already
   1612         // been done. The JVMTI_EVENT_MONITOR_WAITED event 
handler can
   1613         // consume the unpark() that was done when the successor was
   1614         // set because the same ParkEvent is shared between Java
   1615         // monitors and JVM/TI RawMonitors (for now).
   1616         //
   1617         // We redo the unpark() to ensure forward progress, i.e., we
   1618         // don't want all pending threads hanging (parked) with none
   1619         // entering the unlocked monitor.
   1620         node._event->unpark();
   1621       }

Of course the line numbers for the "fix" would be different and the comment
would need to be updated to reflect that the:

   1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;

call above could also consume an unpark(), but it should work.
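
The reasoning behind redoing the unpark() can be checked with a toy model.
PermitEvent below is a hypothetical single-permit event, loosely modelled on
how a ParkEvent is described in this thread. It is not the HotSpot class,
and try_park() is a non-blocking stand-in for a timed park. The model
ignores the node._notified part of the condition and looks only at the
successor case.

```cpp
#include <cassert>

// Hypothetical single-permit event: unpark() leaves at most one permit, and
// any park on the same event consumes it, whoever it was meant for.
class PermitEvent {
public:
    void unpark() { permit_ = true; }

    // Reports whether a permit was available and consumes it.
    bool try_park() {
        bool had_permit = permit_;
        permit_ = false;
        return had_permit;
    }

    bool has_permit() const { return permit_; }

private:
    bool permit_ = false;
};

// Replays the tail of wait(): the exiting thread has made us the successor
// and unparked us, then something else parks on the same event (the
// MONITOR_WAITED handler's raw monitor wait, or the timed park in the older
// SpinAcquire). Returns true if a permit is left for the reenter park.
inline bool permit_left_for_reenter(bool something_parks_on_shared_event,
                                    bool redo_unpark_afterwards) {
    PermitEvent event;
    const bool is_successor = true;  // set by the exiting thread
    event.unpark();                  // the unpark() that goes with it
    assert(event.has_permit());

    if (something_parks_on_shared_event) {
        event.try_park();            // the permit is eaten here
    }
    if (redo_unpark_afterwards && is_successor) {
        event.unpark();              // the relocated fix
    }
    return event.has_permit();
}
```

Without the redo, a park on the shared event leaves no permit and the
reenter park would block with nobody left to wake it. With the redo, a
permit is always present when the thread gets there, and an extra unpark()
is harmless because the event holds at most one permit.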

If you've read this far, then I'm impressed. If you've read this far
and only fallen asleep a couple of times, then I'm still impressed.

Summary: I don't think we have an issue in JDK9, but we'll have to do
          the fix in JDK8/HSX25 and older a little differently.

Dan


>
> David
> -----
>
>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it
>> is a different ParkEvent. From src/share/vm/runtime/thread.hpp:
>>
>>    ParkEvent * _ParkEvent ;               // for synchronized()
>>    ParkEvent * _SleepEvent ;              // for Thread.sleep
>>    ParkEvent * _MutexEvent ;              // for native internal
>> Mutex/Monitor
>>    ParkEvent * _MuxEvent ;                // for low-level
>> muxAcquire-muxRelease
>>
>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire()
>> uses the _MuxEvent. There are some comments in thread.cpp about
>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't
>> think we ever want to go there.
>>
>> I also filed this RFE:
>>
>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>
>> just in case the Serviceability team wants to migrate JVM/TI RawMonitors
>> to a separate ParkEvent.
>>
>> Please let me know if you concur that I've resolved issue #3.
>>
>>
>>> If so, I wonder if we want this added unpark to not just be called if
>>> JVMTI_EVENT_MONITOR_WAITED
>>> is enabled?
>>
>> I don't think we need it, but I've noted its removal as a risk.
>>
>> Again, thanks for the review!
>>
>> Dan
>>
>>
>>>
>>> thanks,
>>> Karen
>>>
>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote:
>>>
>>>> Greetings,
>>>>
>>>> I have a fix ready for the following bug:
>>>>
>>>>     8028073 race condition in ObjectMonitor implementation causing
>>>> deadlocks
>>>> https://bugs.openjdk.java.net/browse/JDK-8028073
>>>>
>>>> On the surface, this is a very simple fix that relocates a few lines of
>>>> code, relocates and rewrites the comments associated with that code and
>>>> adds several new comments.
>>>>
>>>> Of course, in reality, the issue is much more complicated, but I'm
>>>> hoping to make it easy for anyone not acquainted with this issue to
>>>> understand what's going on.
>>>>
>>>> Here are the JDK9 webrev URLs:
>>>>
>>>> OpenJDK:
>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>>
>>>> Oracle internal:
>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ 
>>>>
>>>>
>>>> The simple summary:
>>>>
>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>>   it is possible for a JVM/TI monitor event handler to accidentally
>>>>   consume a ParkEvent.unpark() call meant for Java Monitor layer
>>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>> https://bugs.openjdk.java.net/browse/JDK-5030359
>>>> - it's the right fix, but it's in the wrong place
>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>>   event handler is called because it is that event handler
>>>>   that can cause the hang
>>>>
>>>>
>>>> Testing
>>>> -------
>>>>
>>>> - a new StressMonitorWait test has been created that reliably
>>>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>>>   for the specific versions where the hang has been reproduced
>>>>   - the test reliably reproduces the hang in 5 seconds on my
>>>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>>     the hang reliably on other machines
>>>>   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>>     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>>     bits did not reproduce the hang
>>>> - JPRT test job
>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>>   and Windows-X64:
>>>>   - vm.quick
>>>>   - Kitchensink (bigapps)
>>>>   - Weblogic+medrec (bigapps)
>>>>   - runThese (bigapps)
>>>>
>>>>
>>>> The Gory Details Start Here
>>>> ---------------------------
>>>>
>>>> This is the old location of the block of code that's being moved:
>>>>
>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>
>>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>> <snip>
>>>> 1499    exit (true, Self) ;                    // exit the monitor
>>>> <snip>
>>>> 1513    if (node._notified != 0 && _succ == Self) {
>>>> 1514       node._event->unpark();
>>>> 1515    }
>>>>
>>>>
>>>> This is the new location of the block of code that's being moved:
>>>>
>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>
>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>> <snip>
>>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>>> <snip>
>>>> 1604        if (node._notified != 0 && _succ == Self) {
>>>> <snip>
>>>> 1620          node._event->unpark();
>>>> 1621        }
>>>>
>>>>
>>>> The Risks
>>>> ---------
>>>>
>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>>   is enabled:
>>>>   - previously it was always executed
>>>>   - while the old code was not effective for the hang that is being
>>>>     fixed with this bug, it is possible that the old code prevented
>>>>     a different bug in the successor protocol from manifesting
>>>>   - thorough analysis of the successor protocol did not reveal a
>>>>     case where the old code was needed in the old location
>>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>>   - the successor protocol is complicated and the analysis could
>>>>     be wrong when certain options are used
>>>>   - comments were added to each location where a JVM/TI monitor
>>>>     event handler is called documenting why a fix like this one
>>>>     is not needed there
>>>>   - if the analysis is wrong, the new comments show where a new
>>>>     code change would be needed
>>>>
>>>>
>>>> The Scenario
>>>> ------------
>>>>
>>>> I've created a scenario that reproduces this hang:
>>>>
>>>> T1 - enters monitor and calls monitor.wait()
>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>>> T3 - enters and exits the monitor
>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>>
>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>>> handler that: enters a raw monitor, waits for 1ms, exits the raw monitor.
>>>>
>>>> Here are the six events necessary to make this hang happen:
>>>>
>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>>>
>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>>
>>>> // KEY-EVENT-3a: T3 made T1 the successor
>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>>
>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>>
>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>>
>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>>>
>>>>
>>>> This bug is intertwined with:
>>>>
>>>> - The ObjectMonitor successor protocol
>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>>>
>>>> There is a very long successor.notes attachment to JDK-8028073 that
>>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>>> for putting pretty much anyone to sleep.
>>>>
>>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>>> backported solution of moving the original fix to the right location.
>>>> The following new bug has been filed for possible future work in this
>>>> area by the Serviceability Team:
>>>>
>>>>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>
>>>>
>>>> The Symptoms
>>>> ------------
>>>>
>>>> With intermittent hangs like this, it is useful to know what to look
>>>> for in order to determine if you are running into this issue:
>>>>
>>>> - if you aren't using a debugger or a profiler or some other
>>>>   JVM/TI agent, then this hang is not the same as yours
>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>>   event handler, then this hang is not the same as yours
>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>>   JVM/TI RawMonitors, then this hang is not the same as yours
>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>>   back into Java code, then you might just be insane and this
>>>>   hang might be similar to yours. However, using a Java callback
>>>>   in an event handler is an even bigger problem/risk so fix that
>>>>   first.
>>>> - if you have one or more threads blocked like this and making no
>>>>   progress, then this hang might be the same as yours:
>>>>
>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>    JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>    JavaThread state: _thread_blocked
>>>>         at java.lang.Object.wait(Native Method)
>>>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>         at java.lang.Object.wait(Object.java:502)
>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>    JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>    JavaThread state: _thread_blocked
>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>    JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>    JavaThread state: _thread_blocked
>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> Key symptoms in thread T1:
>>>>
>>>> - had the object locked:
>>>>
>>>>   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> - did an Object.wait():
>>>>
>>>>   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> - is blocked on reentry:
>>>>
>>>>   waiting for monitor entry [0xfffffd7fc0231000]
>>>>
>>>> Key symptoms in thread T2:
>>>>
>>>> - is blocked waiting to lock the object:
>>>>
>>>>   waiting for monitor entry [0xfffffd7fc0130000]
>>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> Key symptoms in thread T3:
>>>>
>>>> - is blocked waiting to lock the object:
>>>>
>>>>   waiting for monitor entry [0xfffffd7fc002f000]
>>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>


From karen.kinnear at oracle.com  Mon Feb 10 12:20:57 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Mon, 10 Feb 2014 15:20:57 -0500
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52F920AD.6050705@oracle.com>
References: <52ED3F28.5070806@oracle.com>	<6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>
	<52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com>
	<52F920AD.6050705@oracle.com>
Message-ID: <5B22D063-1125-4E1D-8D96-C5A5B3E78BA0@oracle.com>

Dan,

Thank you so much. My bad - I was looking at a jdk8 repo, not a jdk9 one.

So I agree that the JDK9 fix as is works. Code change reviewed.

For JDK8:
I don't believe we were planning to backport this to 8 given risks of changes in this area.

I did reach the same conclusion you did: the WaitSetLock acquirers who already own
the lock don't have this issue, but those that don't already own the lock do have
the problem, and the timed wait could trigger this.
I also agree that a JDK8 fix would either take the change out of the jvmti conditional
or need the 8028280 fix, which I also believe we do not plan to backport.
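
For reference, the shape of the 8028280 change that removes the hazard
can be sketched as below. This is a simplified model and not the HotSpot
source: toy_spin_acquire(), toy_spin_release() and contended_count() are
hypothetical names, std::this_thread::sleep_for() stands in for
os::naked_short_sleep(1), std::this_thread::yield() stands in for
os::NakedYield(), and SpinPause() and the os::is_MP() check are left out.
The point is that the blocking step is a plain sleep, so no park/unpark
permit is touched while acquiring the lock word.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical model of the Spin/Yield/Block strategy: 0 = free, 1 = held.
void toy_spin_acquire(std::atomic<int>& lock_word) {
  int expected = 0;
  if (lock_word.compare_exchange_strong(expected, 1)) {
    return;  // normal fast-path return
  }
  // Slow path: contention, so spin, then yield, then sleep.
  int ctr = 0;
  int yields = 0;
  for (;;) {
    while (lock_word.load() != 0) {
      ++ctr;
      if ((ctr & 0xFFF) == 0) {
        if (yields > 5) {
          // Plain sleep instead of a timed park on a shared event.
          std::this_thread::sleep_for(std::chrono::milliseconds(1));
        } else {
          std::this_thread::yield();
          ++yields;
        }
      }
    }
    expected = 0;
    if (lock_word.compare_exchange_strong(expected, 1)) return;
  }
}

void toy_spin_release(std::atomic<int>& lock_word) {
  lock_word.store(0);
}

// Two threads bump a plain counter under the lock; returns the final count.
int contended_count(int per_thread) {
  std::atomic<int> lock_word{0};
  int counter = 0;
  auto work = [&] {
    for (int i = 0; i < per_thread; ++i) {
      toy_spin_acquire(lock_word);
      assert(lock_word.load() == 1);  // we hold the lock here
      ++counter;
      toy_spin_release(lock_word);
    }
  };
  std::thread a(work);
  std::thread b(work);
  a.join();
  b.join();
  return counter;
}
```

With the park(1) variant that HSX-25 still carries, the sleep step would
instead be a timed park on the thread's own event, which is where a
pending unpark() could be consumed.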

thank you for the detailed walk-through,
Karen

On Feb 10, 2014, at 1:55 PM, Daniel D. Daugherty wrote:

> On 2/9/14 8:37 PM, David Holmes wrote:
>> trimming content ...
>> 
>> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote:
>>> On 2/7/14 2:56 PM, Karen Kinnear wrote:
>>>> 3. Did I read the code correctly that the Thread::SpinAcquire can make
>>>> a timed park
>>>> call on the same thread's _ParkEvent? And that this is used to get on
>>>> and off the wait queue,
>>>> i.e. to acquire the WaitSetLock?
>>>>    Is there the same risk that a notify might be eaten here also?
>>> 
>>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent
>> 
>> It sure does:
>> 
>> void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>>  if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>     return ;   // normal fast-path return
>>  }
>> 
>>  // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>>  TEVENT (SpinAcquire - ctx) ;
>>  int ctr = 0 ;
>>  int Yields = 0 ;
>>  for (;;) {
>>     while (*adr != 0) {
>>        ++ctr ;
>>        if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>           if (Yields > 5) {
>>             // Consider using a simple NakedSleep() instead.
>>             // Then SpinAcquire could be called by non-JVM threads
>>             Thread::current()->_ParkEvent->park(1) ;
> 
> Ummmm... that's not the code I'm seeing...
> 
> src/share/vm/runtime/thread.cpp:
> 
>  4417  void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>  4418    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>  4419       return ;   // normal fast-path return
>  4420    }
>  4421
>  4422    // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>  4423    TEVENT (SpinAcquire - ctx) ;
>  4424    int ctr = 0 ;
>  4425    int Yields = 0 ;
>  4426    for (;;) {
>  4427       while (*adr != 0) {
>  4428          ++ctr ;
>  4429          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>  4430             if (Yields > 5) {
>  4431               os::naked_short_sleep(1);
>  4432             } else {
>  4433               os::NakedYield() ;
>  4434               ++Yields ;
>  4435             }
>  4436          } else {
>  4437             SpinPause() ;
>  4438          }
>  4439       }
>  4440       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
>  4441    }
>  4442  }
> 
> Mr Simms recently changed the above code via:
> 
> changeset:   5832:5944dba4badc
> user:        dsimms
> date:        Fri Jan 24 09:28:47 2014 +0100
> summary:     8028280: ParkEvent leak when running modified runThese which only loads classes
> 
> os::naked_short_sleep() is new:
> 
> - BSD/MacOS X, Linux - uses nanosleep()
> - Solaris - uses usleep()
> - Windows - uses Sleep()
> 
> The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs
> on 2014.01.29. I don't see any signs that Mr Simms' fix will be backported
> to JDK8u/HSX-25u (yet) so this part of the review thread might impact the
> backport of my fix to earlier releases.
> 
> 
>> So considering Karen's question ... I can't tell for certain. :(
>> 
>> I do not think the SpinAcquire on grabbing the wait-set lock to add to the wait-set can be an issue because we will only park in response to the actual wait, and hence only get unparked due to a notify/notifyAll, but at this point we still own the monitor so no notify/notifyAll is possible.
>> 
>> However, for the removal from the wait-set a more complex analysis is needed. To do the SpinAcquire we must still be flagged as TS_WAIT - which means we have not been notified, but must be returning due to a timeout (or spurious wakeup?). In such circumstances could we be _succ? I don't think so but I'll leave it to Dan to confirm that part :)
> 
> So for HSX-25 and probably older...
> 
> There are four Thread::SpinAcquire() calls in the objectMonitor code:
> 
>    Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ;
>    Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>    Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ;
>    Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ;
> 
> We can easily rule out the "notify" and "notifyAll" uses since the
> current thread owns the Java-level monitor and there are no events
> to post in this part of the notify() or notifyAll() protocols.
> 
> For the "WaitSet - add" use, the current thread owns the Java-level
> monitor and the thread has not been added as a waiter yet so another
> thread cannot do the notify-exit-make-successor part of the protocol
> yet.
> 
> For the "WaitSet - unlink" use:
> 
> src/share/vm/runtime/objectMonitor.cpp:
> 
>  1569       if (node.TState == ObjectWaiter::TS_WAIT) {
>  1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>  1571           if (node.TState == ObjectWaiter::TS_WAIT) {
>  1572              DequeueSpecificWaiter (&node) ;       // unlink from WaitSet
>  1573              assert(node._notified == 0, "invariant");
>  1574              node.TState = ObjectWaiter::TS_RUN ;
>  1575           }
>  1576           Thread::SpinRelease (&_WaitSetLock) ;
>  1577       }
> 
> It is the call on line 1570 above that gets us into this code:
> 
> src/share/vm/runtime/thread.cpp:
> 
>  4435  void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>  4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>  4437       return ;   // normal fast-path return
>  4438    }
>  4439
>  4440    // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>  4441    TEVENT (SpinAcquire - ctx) ;
>  4442    int ctr = 0 ;
>  4443    int Yields = 0 ;
>  4444    for (;;) {
>  4445       while (*adr != 0) {
>  4446          ++ctr ;
>  4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>  4448             if (Yields > 5) {
>  4449               // Consider using a simple NakedSleep() instead.
>  4450               // Then SpinAcquire could be called by non-JVM threads
>  4451               Thread::current()->_ParkEvent->park(1) ;
>  4452             } else {
>  4453               os::NakedYield() ;
>  4454               ++Yields ;
>  4455             }
>  4456          } else {
>  4457             SpinPause() ;
>  4458          }
>  4459       }
>  4460       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
>  4461    }
>  4462  }
> 
> And the above code can consume the unpark() on line 4451.
> 
> So how the heck do we get to line 1570???
> 
> Well, the target thread would have to be both notified and unparked
> to be executing this code path. When the notify() code runs, the
> target of the notify() is changed from ObjectWaiter::TS_WAIT to
> ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default
> for Knob_MoveNotifyee is 2, so we're in non-default mode here...
> 
> Here are the Knob_MoveNotifyee policy values:
> 
>   1717      if (Policy == 0) {       // prepend to EntryList
>   1728      if (Policy == 1) {      // append to EntryList
>   1744      if (Policy == 2) {      // prepend to cxq
>   1760      if (Policy == 3) {      // append to cxq
> 
> For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism
> where we just unpark the target thread and let it run. Part of
> that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN.
> 
> The code works the same for notifyAll() for the thread picked
> to be notified. For the Knob_MoveNotifyee == 4 (or higher) case,
> we just unpark all the waiters and we have a free-for-all.
> 
> So it looks like the code block from lines 1569-1577 is never
> used... or is it? Well... you have to remember two things:
> 
> 1) spurious unpark()
> 2) timed wait()
> 
> The caller might have called wait(0), but that doesn't mean that
> the underlying park() mechanism won't have a spurious unpark().
> Or better, the caller might have called wait(1) and be running
> again after a millisecond.
> 
> So in the HSX25 and older system (i.e., without Mr Simms' fix for
> 8028280), it is possible for this call:
> 
>  1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
> 
> to consume the unpark(). The gauntlet that has to be traversed
> to get to this call:
> 
>  4451               Thread::current()->_ParkEvent->park(1) ;
> 
> is impressive:
> 
> - fast-path acquisition of the _WaitSetLock has to fail:
> 
>  4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>  4437       return ;   // normal fast-path return
>  4438    }
> 
> - if the machine is a uniprocessor, then 6 os::NakedYield()
>  call-loop-recheck attempts have to fail:
> 
>  4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>  4448             if (Yields > 5) {
>  4449               // Consider using a simple NakedSleep() instead.
>  4450               // Then SpinAcquire could be called by non-JVM threads
>  4451               Thread::current()->_ParkEvent->park(1) ;
>  4452             } else {
>  4453               os::NakedYield() ;
>  4454               ++Yields ;
>  4455             }
> 
> - if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause()
>  attempts, 1 os::NakedYield() attempt}  have to fail:
> 
>  4446          ++ctr ;
>  4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>  4448             if (Yields > 5) {
>  4449               // Consider using a simple NakedSleep() instead.
>  4450               // Then SpinAcquire could be called by non-JVM threads
>  4451               Thread::current()->_ParkEvent->park(1) ;
>  4452             } else {
>  4453               os::NakedYield() ;
>  4454               ++Yields ;
>  4455             }
>  4456          } else {
>  4457             SpinPause() ;
>  4458          }
> 
> But it is possible. It is one of those once-in-a-blue-moon type
> windows where everything has to line up just so.
> 
> So how do we address this issue in HSX-25 and possibly older?
> 
> If Mr Simms' fix for 8028280 is also backported, then there is no
> issue. If it is not backported, then applying the fix for this
> bug like so:
> 
> src/share/vm/runtime/objectMonitor.cpp:
> 
>  1596       if (JvmtiExport::should_post_monitor_waited()) {
>  1597         JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>  1598       }
> 
>  1604       if (node._notified != 0 && _succ == Self) {
>  1605         // In this part of the monitor wait-notify-reenter protocol it
>  1606         // is possible (and normal) for another thread to do a fastpath
>  1607         // monitor enter-exit while this thread is still trying to get
>  1608         // to the reenter portion of the protocol.
>  1609         //
>  1610         // The ObjectMonitor was notified and the current thread is
>  1611         // the successor which also means that an unpark() has already
>  1612         // been done. The JVMTI_EVENT_MONITOR_WAITED event handler can
>  1613         // consume the unpark() that was done when the successor was
>  1614         // set because the same ParkEvent is shared between Java
>  1615         // monitors and JVM/TI RawMonitors (for now).
>  1616         //
>  1617         // We redo the unpark() to ensure forward progress, i.e., we
>  1618         // don't want all pending threads hanging (parked) with none
>  1619         // entering the unlocked monitor.
>  1620         node._event->unpark();
>  1621       }
> 
> Of course the line numbers for the "fix" would be different and the comment
> would need to be updated to reflect that the:
> 
>  1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
> 
> call above could also consume an unpark(), but it should work.
> 
> If you've read this far, then I'm impressed. If you've read this far
> and only fallen asleep a couple of times, then I'm still impressed.
> 
> Summary: I don't think we have an issue in JDK9, but we'll have to do
>         the fix in JDK8/HSX25 and older a little differently.
> 
> Dan
> 
> 
>> 
>> David
>> -----
>> 
>>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it
>>> is a different ParkEvent. From src/share/vm/runtime/thread.hpp:
>>> 
>>>   ParkEvent * _ParkEvent ;               // for synchronized()
>>>   ParkEvent * _SleepEvent ;              // for Thread.sleep
>>>   ParkEvent * _MutexEvent ;              // for native internal Mutex/Monitor
>>>   ParkEvent * _MuxEvent ;                // for low-level muxAcquire-muxRelease
>>> 
>>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire()
>>> uses the _MuxEvent. There are some comments in thread.cpp about
>>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't
>>> think we ever want to go there.
>>> 
>>> I also filed this RFE:
>>> 
>>>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>> 
>>> just in case the Serviceability team wants to migrate JVM/TI RawMonitors
>>> to a separate ParkEvent.
>>> 
>>> Please let me know if you concur that I've resolved issue #3.
>>> 
>>> 
>>>> If so, I wonder if we want this added unpark to not just be called if
>>>> JVMTI_EVENT_MONITOR_WAITED
>>>> is enabled?
>>> 
>>> I don't think we need it, but I've noted its removal as a risk.
>>> 
>>> Again, thanks for the review!
>>> 
>>> Dan
>>> 
>>> 
>>>> 
>>>> thanks,
>>>> Karen
>>>> 
>>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote:
>>>> 
>>>>> Greetings,
>>>>> 
>>>>> I have a fix ready for the following bug:
>>>>> 
>>>>>    8028073 race condition in ObjectMonitor implementation causing
>>>>> deadlocks
>>>>> https://bugs.openjdk.java.net/browse/JDK-8028073
>>>>> 
>>>>> On the surface, this is a very simple fix that relocates a few lines of
>>>>> code, relocates and rewrites the comments associated with that code and
>>>>> adds several new comments.
>>>>> 
>>>>> Of course, in reality, the issue is much more complicated, but I'm
>>>>> hoping to make it easy for anyone not acquainted with this issue to
>>>>> understand what's going on.
>>>>> 
>>>>> Here are the JDK9 webrev URLs:
>>>>> 
>>>>> OpenJDK:
>>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>>> 
>>>>> Oracle internal:
>>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ 
>>>>> 
>>>>> The simple summary:
>>>>> 
>>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>>>  it is possible for a JVM/TI monitor event handler to accidentally
>>>>>  consume a ParkEvent.unpark() call meant for Java Monitor layer
>>>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>>> https://bugs.openjdk.java.net/browse/JDK-5030359
>>>>> - it's the right fix, but it's in the wrong place
>>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>>>  event handler is called because it is that event handler
>>>>>  that can cause the hang
>>>>> 
>>>>> 
>>>>> Testing
>>>>> -------
>>>>> 
>>>>> - a new StressMonitorWait test has been created that reliably
>>>>>  reproduces the hang in JDK[6789]; see the bug's gory details
>>>>>  for the specific versions where the hang has been reproduced
>>>>>  - the test reliably reproduces the hang in 5 seconds on my
>>>>>    T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>>>    the hang reliably on other machines
>>>>>  - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>>>    Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>>>    bits did not reproduce the hang
>>>>> - JPRT test job
>>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>>>  Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>>>  and Windows-X64:
>>>>>  - vm.quick
>>>>>  - Kitchensink (bigapps)
>>>>>  - Weblogic+medrec (bigapps)
>>>>>  - runThese (bigapps)
>>>>> 
>>>>> 
>>>>> The Gory Details Start Here
>>>>> ---------------------------
>>>>> 
>>>>> This is the old location of the block of code that's being moved:
>>>>> 
>>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>> 
>>>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>>> <snip>
>>>>> 1499    exit (true, Self) ;                    // exit the monitor
>>>>> <snip>
>>>>> 1513    if (node._notified != 0 && _succ == Self) {
>>>>> 1514       node._event->unpark();
>>>>> 1515    }
>>>>> 
>>>>> 
>>>>> This is the new location of the block of code that's being moved:
>>>>> 
>>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>> 
>>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>>> <snip>
>>>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>>>> <snip>
>>>>> 1604        if (node._notified != 0 && _succ == Self) {
>>>>> <snip>
>>>>> 1620          node._event->unpark();
>>>>> 1621        }
>>>>> 
>>>>> 
>>>>> The Risks
>>>>> ---------
>>>>> 
>>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>>>  is enabled:
>>>>>  - previously it was always executed
>>>>>  - while the old code was not effective for the hang that is being
>>>>>    fixed with this bug, it is possible that the old code prevented
>>>>>    a different bug in the successor protocol from manifesting
>>>>>  - thorough analysis of the successor protocol did not reveal a
>>>>>    case where the old code was needed in the old location
>>>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>>>  do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>>>  - the successor protocol is complicated and the analysis could
>>>>>    be wrong when certain options are used
>>>>>  - comments were added to each location where a JVM/TI monitor
>>>>>    event handler is called documenting why a fix like this one
>>>>>    is not needed there
>>>>>  - if the analysis is wrong, the new comments show where a new
>>>>>    code change would be needed
>>>>> 
>>>>> 
>>>>> The Scenario
>>>>> ------------
>>>>> 
>>>>> I've created a scenario that reproduces this hang:
>>>>> 
>>>>> T1 - enters monitor and calls monitor.wait()
>>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>>>> T3 - enters and exits the monitor
>>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>>> 
>>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>>>> 
>>>>> Here are the six events necessary to make this hang happen:
>>>>> 
>>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>>>> 
>>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>>> 
>>>>> // KEY-EVENT-3a: T3 made T1 the successor
>>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>>> 
>>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>>> 
>>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>>> 
>>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>>>> 
>>>>> 
>>>>> This bug is intertwined with:
>>>>> 
>>>>> - The ObjectMonitor successor protocol
>>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>>>> 
>>>>> There is a very long successor.notes attachment to JDK-8028073 that
>>>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>>>> for putting pretty much anyone to sleep.
>>>>> 
>>>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>>>> backported solution of moving the original fix to the right location.
>>>>> The following new bug has been filed for possible future work in this
>>>>> area by the Serviceability Team:
>>>>> 
>>>>>    8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>> 
>>>>> 
>>>>> The Symptoms
>>>>> ------------
>>>>> 
>>>>> With intermittent hangs like this, it is useful to know what to look
>>>>> for in order to determine if you are running into this issue:
>>>>> 
>>>>> - if you aren't using a debugger or a profiler or some other
>>>>>  JVM/TI agent, then this hang is not the same as yours
>>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>>>  event handler, then this hang is not the same as yours
>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>>>  JVM/TI RawMonitors, then this hang is not the same as yours
>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>>>  back into Java code, then you might just be insane and this
>>>>>  hang might be similar to yours. However, using a Java callback
>>>>>  in an event handler is an even bigger problem/risk so fix that
>>>>>  first.
>>>>> - if you have one or more threads blocked like this and making no
>>>>>  progress, then this hang might be the same as yours:
>>>>> 
>>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>>>   java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>   JavaThread state: _thread_blocked
>>>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>   JavaThread state: _thread_blocked
>>>>>        at java.lang.Object.wait(Native Method)
>>>>>        - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>        at java.lang.Object.wait(Object.java:502)
>>>>>        at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>>>        - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>> 
>>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>>>   java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>   JavaThread state: _thread_blocked
>>>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>   JavaThread state: _thread_blocked
>>>>>        at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>>>        - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>> 
>>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>>>   java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>   JavaThread state: _thread_blocked
>>>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>   JavaThread state: _thread_blocked
>>>>>        at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>>>        - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>> 
>>>>> Key symptoms in thread T1:
>>>>> 
>>>>> - had the object locked:
>>>>> 
>>>>>  locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>> 
>>>>> - did an Object.wait():
>>>>> 
>>>>>  waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>> 
>>>>> - is blocked on reentry:
>>>>> 
>>>>>  waiting for monitor entry [0xfffffd7fc0231000]
>>>>> 
>>>>> Key symptoms in thread T2:
>>>>> 
>>>>> - is blocked waiting to lock the object:
>>>>> 
>>>>>  waiting for monitor entry [0xfffffd7fc0130000]
>>>>>  waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>> 
>>>>> Key symptoms in thread T3:
>>>>> 
>>>>> - is blocked waiting to lock the object:
>>>>> 
>>>>>  waiting for monitor entry [0xfffffd7fc002f000]
>>>>>  waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>> 
>>> 
> 


From daniel.daugherty at oracle.com  Mon Feb 10 12:31:20 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 10 Feb 2014 13:31:20 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <5B22D063-1125-4E1D-8D96-C5A5B3E78BA0@oracle.com>
References: <52ED3F28.5070806@oracle.com>	<6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>
	<52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com>
	<52F920AD.6050705@oracle.com>
	<5B22D063-1125-4E1D-8D96-C5A5B3E78BA0@oracle.com>
Message-ID: <52F93718.5010007@oracle.com>

On 2/10/14 1:20 PM, Karen Kinnear wrote:
> Dan,
>
> Thank you so much. My bad - I was looking at a jdk8 repo, not a jdk9 one.

No problem... I had the advantage of wanting Mr Simms' changes so
that I could (more easily) develop the debug code flow hooks that
I'm planning to add to the "debug tips and tricks" wiki...


> So I agree that the JDK9 fix as is works. Code change reviewed.

Thanks for confirmation!


> For JDK8:
> I don't believe we were planning to backport this to 8 given risks of changes in this area.

Ummm.... Not JDK8-GA, but definitely a JDK8-Update... As usual,
I plan to do the backport engineering and I'll let someone else
worry about the politics... :-)


> I did reach the same conclusion you did, that the WaitSetLock acquirers who already own
> the lock don't have this issue, but those that don't already own the lock do have
> the problem, and the timed wait could trigger this.
> And that a JDK8 fix would take the change out of the jvmti conditional, or need the 8028280
> fix, which I also believe we do not plan to backport.

Yeah, I'll chat with Mr Simms about backporting 8028280... That
os::naked_short_sleep() function is so very useful...


> thank you for the detailed walk-through,

No problem. Thank you for slogging through the details here.

Dan


> Karen
>
> On Feb 10, 2014, at 1:55 PM, Daniel D. Daugherty wrote:
>
>> On 2/9/14 8:37 PM, David Holmes wrote:
>>> trimming content ...
>>>
>>> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote:
>>>> On 2/7/14 2:56 PM, Karen Kinnear wrote:
>>>>> 3. Did I read the code correctly that the Thread::SpinAcquire can make
>>>>> a timed park
>>>>> call on the same thread's _ParkEvent? And that this is used to get on
>>>>> and off the wait queue,
>>>>> i.e. to acquire the WaitSetLock?
>>>>>     Is there the same risk that a notify might be eaten here also?
>>>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent
>>> It sure does:
>>>
>>> void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>>>   if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>>      return ;   // normal fast-path return
>>>   }
>>>
>>>   // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>>>   TEVENT (SpinAcquire - ctx) ;
>>>   int ctr = 0 ;
>>>   int Yields = 0 ;
>>>   for (;;) {
>>>      while (*adr != 0) {
>>>         ++ctr ;
>>>         if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>>            if (Yields > 5) {
>>>              // Consider using a simple NakedSleep() instead.
>>>              // Then SpinAcquire could be called by non-JVM threads
>>>              Thread::current()->_ParkEvent->park(1) ;
>> Ummmm... that's not the code I'm seeing...
>>
>> src/share/vm/runtime/thread.cpp:
>>
>>   4417  void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>>   4418    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>   4419       return ;   // normal fast-path return
>>   4420    }
>>   4421
>>   4422    // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>>   4423    TEVENT (SpinAcquire - ctx) ;
>>   4424    int ctr = 0 ;
>>   4425    int Yields = 0 ;
>>   4426    for (;;) {
>>   4427       while (*adr != 0) {
>>   4428          ++ctr ;
>>   4429          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>   4430             if (Yields > 5) {
>>   4431               os::naked_short_sleep(1);
>>   4432             } else {
>>   4433               os::NakedYield() ;
>>   4434               ++Yields ;
>>   4435             }
>>   4436          } else {
>>   4437             SpinPause() ;
>>   4438          }
>>   4439       }
>>   4440       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
>>   4441    }
>>   4442  }
>>
>> Mr Simms recently changed the above code via:
>>
>> changeset:   5832:5944dba4badc
>> user:        dsimms
>> date:        Fri Jan 24 09:28:47 2014 +0100
>> summary:     8028280: ParkEvent leak when running modified runThese which only loads classes
>>
>> os::naked_short_sleep() is new:
>>
>> - BSD/MacOS X, Linux - uses nanosleep()
>> - Solaris - uses usleep()
>> - Windows - uses Sleep()
>>
>> The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs
>> on 2014.01.29. I don't see any signs that Mr Simms' fix will be backported
>> to JDK8u/HSX-25u (yet) so this part of the review thread might impact the
>> backport of my fix to earlier releases.
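
The Spin/Yield/Sleep strategy in the newer listing can be sketched outside the VM as below. This is only an illustration in portable C++: spin_acquire(), spin_release() and contended_count() are made-up names, std::this_thread::yield() and sleep_for() stand in for os::NakedYield() and os::naked_short_sleep(1), and the os::is_MP() check is left out. The point it tries to show is that the slow path now blocks in a plain sleep and so never touches a per-thread park permit.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical stand-in for Thread::SpinAcquire(); not HotSpot code.
inline void spin_acquire(std::atomic<int>& lock_word) {
  int expected = 0;
  if (lock_word.compare_exchange_strong(expected, 1)) {
    return;  // fast path: the lock was free
  }
  unsigned ctr = 0;
  int yields = 0;
  for (;;) {
    while (lock_word.load() != 0) {
      ++ctr;
      if ((ctr & 0xFFF) == 0) {
        if (yields > 5) {
          // Plain 1 ms sleep: no park/unpark permit is involved.
          std::this_thread::sleep_for(std::chrono::milliseconds(1));
        } else {
          std::this_thread::yield();
          ++yields;
        }
      }
      // Otherwise keep busy-spinning (SpinPause() in the listing above).
    }
    expected = 0;
    if (lock_word.compare_exchange_strong(expected, 1)) return;
  }
}

// Hypothetical stand-in for Thread::SpinRelease().
inline void spin_release(std::atomic<int>& lock_word) {
  lock_word.store(0);
}

// Runs `threads` workers that each do `iters` increments under the lock
// and returns the final counter value.
inline long contended_count(int threads, int iters) {
  std::atomic<int> lock_word{0};
  long counter = 0;
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&lock_word, &counter, iters] {
      for (int i = 0; i < iters; ++i) {
        spin_acquire(lock_word);
        ++counter;  // protected by lock_word
        spin_release(lock_word);
      }
    });
  }
  for (auto& w : workers) w.join();
  return counter;
}
```

Calling contended_count(4, 10000) exercises both the fast path and the contended path; if the lock is doing its job, no increment is lost.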
>>
>>
>>> So considering Karen's question ... I can't tell for certain. :(
>>>
>>> I do not think the SpinAcquire on grabbing the wait-set lock to add to the wait-set can be an issue because we will only park in response to the actual wait, and hence only get unparked due to a notify/notifyAll, but at this point we still own the monitor so no notify/notifyAll is possible.
>>>
>>> However, for the removal from the wait-set a more complex analysis is needed. To do the SpinAcquire we must still be flagged as TS_WAIT - which means we have not been notified, but must be returning due to a timeout (or spurious wakeup?). In such circumstances could we be _succ? I don't think so but I'll leave it to Dan to confirm that part :)
>> So for HSX-25 and probably older...
>>
>> There are four Thread::SpinAcquire() calls in the objectMonitor code:
>>
>>     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ;
>>     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>>     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ;
>>     Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ;
>>
>> We can easily rule out the "notify" and "notifyAll" uses since the
>> current thread owns the Java-level monitor and there are no events
>> to post in this part of the notify() or notifyAll() protocols.
>>
>> For the "WaitSet - add" use, the current thread owns the Java-level
>> monitor and the thread has not been added as a waiter yet so another
>> thread cannot do the notify-exit-make-successor part of the protocol
>> yet.
>>
>> For the "WaitSet - unlink" use:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>>   1569       if (node.TState == ObjectWaiter::TS_WAIT) {
>>   1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>>   1571           if (node.TState == ObjectWaiter::TS_WAIT) {
>>   1572              DequeueSpecificWaiter (&node) ;       // unlink from WaitSet
>>   1573              assert(node._notified == 0, "invariant");
>>   1574              node.TState = ObjectWaiter::TS_RUN ;
>>   1575           }
>>   1576           Thread::SpinRelease (&_WaitSetLock) ;
>>   1577       }
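
The check / lock / re-check shape of lines 1569-1577 can be sketched as below. This is a simplified model, not the ObjectMonitor code: the WaitSet and Waiter types, notify_one() and unlink_if_still_waiting() are names invented for the example, a std::mutex stands in for _WaitSetLock, and notify_one() only models the "move the notifyee off the wait set" policies.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <mutex>
#include <vector>

enum class TState { WAIT, ENTER, RUN };

struct Waiter {
  std::atomic<TState> state{TState::WAIT};
};

struct WaitSet {
  std::mutex lock;               // stands in for _WaitSetLock
  std::vector<Waiter*> waiters;  // stands in for the wait set

  void add(Waiter* w) {
    std::lock_guard<std::mutex> guard(lock);
    waiters.push_back(w);
  }

  // Notifier side: dequeue the first waiter and mark it as entering.
  Waiter* notify_one() {
    std::lock_guard<std::mutex> guard(lock);
    if (waiters.empty()) return nullptr;
    Waiter* w = waiters.front();
    waiters.erase(waiters.begin());
    w->state.store(TState::ENTER);
    return w;
  }

  // Waiter side, after its park returns. Returns true only if the waiter
  // was still in the set, i.e. it woke up without having been notified
  // (timeout or spurious wakeup).
  bool unlink_if_still_waiting(Waiter* w) {
    if (w->state.load() != TState::WAIT) return false;  // cheap unlocked check
    std::lock_guard<std::mutex> guard(lock);
    if (w->state.load() != TState::WAIT) return false;  // re-check under lock
    waiters.erase(std::remove(waiters.begin(), waiters.end(), w),
                  waiters.end());
    w->state.store(TState::RUN);
    return true;
  }
};
```

The second check matters because a notifier may dequeue the waiter between the unlocked check and the lock acquisition.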
>>
>> It is the call on line 1570 above that gets us into this code:
>>
>> src/share/vm/runtime/thread.cpp:
>>
>>   4435  void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>>   4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>   4437       return ;   // normal fast-path return
>>   4438    }
>>   4439
>>   4440    // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>>   4441    TEVENT (SpinAcquire - ctx) ;
>>   4442    int ctr = 0 ;
>>   4443    int Yields = 0 ;
>>   4444    for (;;) {
>>   4445       while (*adr != 0) {
>>   4446          ++ctr ;
>>   4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>   4448             if (Yields > 5) {
>>   4449               // Consider using a simple NakedSleep() instead.
>>   4450               // Then SpinAcquire could be called by non-JVM threads
>>   4451               Thread::current()->_ParkEvent->park(1) ;
>>   4452             } else {
>>   4453               os::NakedYield() ;
>>   4454               ++Yields ;
>>   4455             }
>>   4456          } else {
>>   4457             SpinPause() ;
>>   4458          }
>>   4459       }
>>   4460       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
>>   4461    }
>>   4462  }
>>
>> And the above code can consume the unpark() on line 4451.
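
What "consume the unpark()" means can be shown with a small single-permit event. PermitEvent below is a hypothetical model built on a mutex and a condition variable; it is only loosely patterned after a ParkEvent and is not the HotSpot class. Two unrelated users of the same event share one permit, so a timed park by one of them can swallow the wakeup meant for the other.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical single-permit event; an assumption for illustration only.
class PermitEvent {
 public:
  // Post the permit (at most one is ever outstanding) and wake a parker.
  void unpark() {
    {
      std::lock_guard<std::mutex> guard(mu_);
      permit_ = true;
    }
    cv_.notify_one();
  }

  // Timed park: returns true if a permit was consumed, false on timeout.
  bool park(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> guard(mu_);
    bool got = cv_.wait_for(guard, timeout, [this] { return permit_; });
    if (got) permit_ = false;  // the permit is used up by this call
    return got;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool permit_ = false;
};
```

If unpark() is followed by a park(1) from an unrelated code path, that call returns immediately with the permit, and the park that the unpark() was intended for then finds nothing and has to wait for a timeout or another unpark().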
>>
>> So how the heck do we get to line 1570???
>>
>> Well, the target thread would have to be both notified and unparked
>> to be executing this code path. When the notify() code runs, the
>> target of the notify() is changed from ObjectWaiter::TS_WAIT to
>> ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default
>> for Knob_MoveNotifyee is 2, so we're in non-default mode here...
>>
>> Here are the Knob_MoveNotifyee policy values:
>>
>>    1717      if (Policy == 0) {       // prepend to EntryList
>>    1728      if (Policy == 1) {      // append to EntryList
>>    1744      if (Policy == 2) {      // prepend to cxq
>>    1760      if (Policy == 3) {      // append to cxq
>>
>> For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism
>> where we just unpark the target thread and let it run. Part of
>> that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN.
>>
>> The code works the same for notifyAll() for the thread picked
>> to be notified. For the Knob_MoveNotifyee == 4 (or higher) case,
>> we just unpark all the waiters and we have a free-for-all.
>>
>> So it looks like the code block from lines 1569-1577 is never
>> used... or is it? Well... you have to remember two things:
>>
>> 1) spurious unpark()
>> 2) timed wait()
>>
>> The caller might have called wait(0), but that doesn't mean that
>> the underlying park() mechanism won't have a spurious unpark().
>> Or better, the caller might have called wait(1) and be running
>> again after a millisecond.
>>
>> So in HSX25 and older systems (i.e., without Mr Simms' fix for
>> 8028280), it is possible for this call:
>>
>>   1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>>
>> to consume the unpark(). The gauntlet that has to be traversed
>> to get to this call:
>>
>>   4451               Thread::current()->_ParkEvent->park(1) ;
>>
>> is impressive:
>>
>> - fast-path acquisition of the _WaitSetLock has to fail:
>>
>>   4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>   4437       return ;   // normal fast-path return
>>   4438    }
>>
>> - if the machine is a uniprocessor, then 6 os::NakedYield()
>>   call-loop-recheck attempts have to fail:
>>
>>   4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>   4448             if (Yields > 5) {
>>   4449               // Consider using a simple NakedSleep() instead.
>>   4450               // Then SpinAcquire could be called by non-JVM threads
>>   4451               Thread::current()->_ParkEvent->park(1) ;
>>   4452             } else {
>>   4453               os::NakedYield() ;
>>   4454               ++Yields ;
>>   4455             }
>>
>> - if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause()
>>   attempts, 1 os::NakedYield() attempt}  have to fail:
>>
>>   4446          ++ctr ;
>>   4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>   4448             if (Yields > 5) {
>>   4449               // Consider using a simple NakedSleep() instead.
>>   4450               // Then SpinAcquire could be called by non-JVM threads
>>   4451               Thread::current()->_ParkEvent->park(1) ;
>>   4452             } else {
>>   4453               os::NakedYield() ;
>>   4454               ++Yields ;
>>   4455             }
>>   4456          } else {
>>   4457             SpinPause() ;
>>   4458          }
>>
>> But it is possible. It is one of those once-in-a-blue-moon type
>> windows where everything has to line up just so.
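
The length of the gauntlet can be counted by replaying the loop's control flow with the lock word held the whole time. The helper below is a sketch with made-up names (Gauntlet, steps_before_first_park); it copies only the branch structure of lines 4445-4458 and counts SpinPause() and os::NakedYield() steps up to the point where park(1) would first be called.

```cpp
#include <cassert>

struct Gauntlet {
  long spins;  // SpinPause() calls before the first park(1)
  int yields;  // os::NakedYield() calls before the first park(1)
};

// Replays the inner while loop assuming *adr stays non-zero throughout.
inline Gauntlet steps_before_first_park(bool is_mp) {
  Gauntlet g{0, 0};
  unsigned ctr = 0;
  for (;;) {
    ++ctr;
    if ((ctr & 0xFFF) == 0 || !is_mp) {
      if (g.yields > 5) {
        return g;  // this is where park(1) would be reached
      }
      ++g.yields;
    } else {
      ++g.spins;
    }
  }
}
```

By this count a uniprocessor reaches the park after 6 failed yields. On a multiprocessor the six { 4095 spins, 1 yield } rounds are followed by one more run of 4095 spins before the park, since the next yield-or-park decision is only taken when ctr next hits a multiple of 4096.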
>>
>> So how do we address this issue in HSX-25 and possibly older?
>>
>> If Mr Simms' fix for 8028280 is also backported, then there is no
>> issue. If it is not backported, then applying the fix for this
>> bug like so:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>>   1596       if (JvmtiExport::should_post_monitor_waited()) {
>>   1597         JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>   1598       }
>>
>>   1604       if (node._notified != 0 && _succ == Self) {
>>   1605         // In this part of the monitor wait-notify-reenter protocol it
>>   1606         // is possible (and normal) for another thread to do a fastpath
>>   1607         // monitor enter-exit while this thread is still trying to get
>>   1608         // to the reenter portion of the protocol.
>>   1609         //
>>   1610         // The ObjectMonitor was notified and the current thread is
>>   1611         // the successor which also means that an unpark() has already
>>   1612         // been done. The JVMTI_EVENT_MONITOR_WAITED event handler can
>>   1613         // consume the unpark() that was done when the successor was
>>   1614         // set because the same ParkEvent is shared between Java
>>   1615         // monitors and JVM/TI RawMonitors (for now).
>>   1616         //
>>   1617         // We redo the unpark() to ensure forward progress, i.e., we
>>   1618         // don't want all pending threads hanging (parked) with none
>>   1619         // entering the unlocked monitor.
>>   1620         node._event->unpark();
>>   1621       }
>>
>> Of course the line numbers for the "fix" would be different and the comment
>> would need to be updated to reflect that the:
>>
>>   1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>>
>> call above could also consume an unpark(), but it should work.
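
The effect of redoing the unpark() can be modelled in a few lines. Everything below is a single-threaded toy under stated assumptions: Permit, WaitNode, Monitor, run_waited_handler() and post_waited_then_redo_unpark() are invented names, the handler is assumed to eat the pending permit (as a raw monitor timed wait on the shared event could), and there is no real blocking.

```cpp
#include <cassert>

// Single-permit model; not thread-safe and not the HotSpot ParkEvent.
struct Permit {
  bool available = false;
  void unpark() { available = true; }
  // Non-blocking park: reports whether a permit was there and clears it.
  bool try_park() {
    bool had = available;
    available = false;
    return had;
  }
};

struct WaitNode {
  bool notified = false;
  Permit event;  // shared between the monitor layer and the handler
};

struct Monitor {
  const void* succ = nullptr;  // models the _succ field
};

// Models an event handler whose timed wait eats any pending permit.
inline void run_waited_handler(WaitNode& node) {
  node.event.try_park();
}

// Models the relocated block: post the event first, then redo the
// unpark() if this thread was notified and is still the successor.
inline void post_waited_then_redo_unpark(Monitor& m, WaitNode& node,
                                         const void* self,
                                         bool events_enabled) {
  if (events_enabled) {
    run_waited_handler(node);
    if (node.notified && m.succ == self) {
      node.event.unpark();
    }
  }
}
```

With the thread as successor, the permit is available again after the handler has run, so its next park does not hang. When the thread is not the successor, nothing is re-posted, which matches the condition on line 1604.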
>>
>> If you've read this far, then I'm impressed. If you've read this far
>> and only fallen asleep a couple of times, then I'm still impressed.
>>
>> Summary: I don't think we have an issue in JDK9, but we'll have to do
>>          the fix in JDK8/HSX25 and older a little differently.
>>
>> Dan
>>
>>
>>> David
>>> -----
>>>
>>>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it
>>>> is a different ParkEvent. From src/share/vm/runtime/thread.hpp:
>>>>
>>>>    ParkEvent * _ParkEvent ;               // for synchronized()
>>>>    ParkEvent * _SleepEvent ;              // for Thread.sleep
>>>>    ParkEvent * _MutexEvent ;              // for native internal Mutex/Monitor
>>>>    ParkEvent * _MuxEvent ;                // for low-level muxAcquire-muxRelease
>>>>
>>>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire()
>>>> uses the _MuxEvent. There are some comments in thread.cpp about
>>>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't
>>>> think we ever want to go there.
>>>>
>>>> I also filed this RFE:
>>>>
>>>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>
>>>> just in case the Serviceability team wants to migrate JVM/TI RawMonitors
>>>> to a separate ParkEvent.
>>>>
>>>> Please let me know if you concur that I've resolved issue #3.
>>>>
>>>>
>>>>> If so, I wonder if we want this added unpark to not just be called if
>>>>> JVMTI_EVENT_MONITOR_WAITED
>>>>> is enabled?
>>>> I don't think we need it, but I've noted its removal as a risk.
>>>>
>>>> Again, thanks for the review!
>>>>
>>>> Dan
>>>>
>>>>
>>>>> thanks,
>>>>> Karen
>>>>>
>>>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote:
>>>>>
>>>>>> Greetings,
>>>>>>
>>>>>> I have a fix ready for the following bug:
>>>>>>
>>>>>>     8028073 race condition in ObjectMonitor implementation causing
>>>>>> deadlocks
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8028073
>>>>>>
>>>>>> On the surface, this is a very simple fix that relocates a few lines of
>>>>>> code, relocates and rewrites the comments associated with that code and
>>>>>> adds several new comments.
>>>>>>
>>>>>> Of course, in reality, the issue is much more complicated, but I'm
>>>>>> hoping to make it easy for anyone not acquainted with this issue to
>>>>>> understand what's going on.
>>>>>>
>>>>>> Here are the JDK9 webrev URLs:
>>>>>>
>>>>>> OpenJDK:
>>>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>>>>
>>>>>> Oracle internal:
>>>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>>>>>
>>>>>> The simple summary:
>>>>>>
>>>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>>>>   it is possible for a JVM/TI monitor event handler to accidentally
>>>>>>   consume a ParkEvent.unpark() call meant for Java Monitor layer
>>>>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-5030359
>>>>>> - it's the right fix, but it's in the wrong place
>>>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>>>>   event handler is called because it is that event handler
>>>>>>   that can cause the hang
>>>>>>
>>>>>>
>>>>>> Testing
>>>>>> -------
>>>>>>
>>>>>> - a new StressMonitorWait test has been created that reliably
>>>>>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>>>>>   for the specific versions where the hang has been reproduced
>>>>>>   - the test reliably reproduces the hang in 5 seconds on my
>>>>>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>>>>     the hang reliably on other machines
>>>>>>   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>>>>     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>>>>     bits did not reproduce the hang
>>>>>> - JPRT test job
>>>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>>>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>>>>   and Windows-X64:
>>>>>>   - vm.quick
>>>>>>   - Kitchensink (bigapps)
>>>>>>   - Weblogic+medrec (bigapps)
>>>>>>   - runThese (bigapps)
>>>>>>
>>>>>>
>>>>>> The Gory Details Start Here
>>>>>> ---------------------------
>>>>>>
>>>>>> This is the old location of the block of code that's being moved:
>>>>>>
>>>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>>>
>>>>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>>>> <snip>
>>>>>> 1499    exit (true, Self) ;                    // exit the monitor
>>>>>> <snip>
>>>>>> 1513    if (node._notified != 0 && _succ == Self) {
>>>>>> 1514       node._event->unpark();
>>>>>> 1515    }
>>>>>>
>>>>>>
>>>>>> This is the new location of the block of code that's being moved:
>>>>>>
>>>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>>>
>>>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>>>> <snip>
>>>>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>>>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>>>>> <snip>
>>>>>> 1604        if (node._notified != 0 && _succ == Self) {
>>>>>> <snip>
>>>>>> 1620          node._event->unpark();
>>>>>> 1621        }
>>>>>>
>>>>>>
>>>>>> The Risks
>>>>>> ---------
>>>>>>
>>>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>>>>   is enabled:
>>>>>>   - previously it was always executed
>>>>>>   - while the old code was not effective for the hang that is being
>>>>>>     fixed with this bug, it is possible that the old code prevented
>>>>>>     a different bug in the successor protocol from manifesting
>>>>>>   - thorough analysis of the successor protocol did not reveal a
>>>>>>     case where the old code was needed in the old location
>>>>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>>>>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>>>>   - the successor protocol is complicated and the analysis could
>>>>>>     be wrong when certain options are used
>>>>>>   - comments were added to each location where a JVM/TI monitor
>>>>>>     event handler is called documenting why a fix like this one
>>>>>>     is not needed there
>>>>>>   - if the analysis is wrong, the new comments show where a new
>>>>>>     code change would be needed
>>>>>>
>>>>>>
>>>>>> The Scenario
>>>>>> ------------
>>>>>>
>>>>>> I've created a scenario that reproduces this hang:
>>>>>>
>>>>>> T1 - enters monitor and calls monitor.wait()
>>>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>>>>> T3 - enters and exits the monitor
>>>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>>>>
>>>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>>>>>
>>>>>> Here are the six events necessary to make this hang happen:
>>>>>>
>>>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ
>>>>>> field, but
>>>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor
>>>>>> again.
>>>>>>
>>>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>>>>
>>>>>> // KEY-EVENT-3a: T3 made T1 the successor
>>>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>>>>
>>>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>>>>
>>>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>>>>
>>>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be
>>>>>> stuck.
>>>>>>
>>>>>>
>>>>>> This bug is intertwined with:
>>>>>>
>>>>>> - The ObjectMonitor successor protocol
>>>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI
>>>>>> RawMonitors
>>>>>>
>>>>>> There is a very long successor.notes attachment to JDK-8028073 that
>>>>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>>>>> for putting pretty much anyone to sleep.
>>>>>>
>>>>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>>>>> backported solution of moving the original fix to the right location.
>>>>>> The following new bug has been filed for possible future work in this
>>>>>> area by the Serviceability Team:
>>>>>>
>>>>>>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>>>
>>>>>>
>>>>>> The Symptoms
>>>>>> ------------
>>>>>>
>>>>>> With intermittent hangs like this, it is useful to know what to look
>>>>>> for in order to determine if you are running into this issue:
>>>>>>
>>>>>> - if you aren't using a debugger or a profiler or some other
>>>>>>   JVM/TI agent, then this hang is not the same as yours
>>>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>>>>   event handler, then this hang is not the same as yours
>>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>>>>   JVM/TI RawMonitors, then this hang is not the same as yours
>>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>>>>   back into Java code, then you might just be insane and this
>>>>>>   hang might be similar to yours. However, using a Java callback
>>>>>>   in an event handler is an even bigger problem/risk so fix that
>>>>>>   first.
>>>>>> - if you have one or more threads blocked like this and making no
>>>>>>   progress, then this hang might be the same as yours:
>>>>>>
>>>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>>    JavaThread state: _thread_blocked
>>>>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>>    JavaThread state: _thread_blocked
>>>>>>         at java.lang.Object.wait(Native Method)
>>>>>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>         at java.lang.Object.wait(Object.java:502)
>>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>>>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>>    JavaThread state: _thread_blocked
>>>>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>>    JavaThread state: _thread_blocked
>>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>>    JavaThread state: _thread_blocked
>>>>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>>    JavaThread state: _thread_blocked
>>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> Key symptoms in thread T1:
>>>>>>
>>>>>> - had the object locked:
>>>>>>
>>>>>>   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> - did an Object.wait():
>>>>>>
>>>>>>   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> - is blocked on reentry:
>>>>>>
>>>>>>   waiting for monitor entry [0xfffffd7fc0231000]
>>>>>>
>>>>>> Key symptoms in thread T2:
>>>>>>
>>>>>> - is blocked waiting to lock the object:
>>>>>>
>>>>>>   waiting for monitor entry [0xfffffd7fc0130000]
>>>>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> Key symptoms in thread T3:
>>>>>>
>>>>>> - is blocked waiting to lock the object:
>>>>>>
>>>>>>   waiting for monitor entry [0xfffffd7fc002f000]
>>>>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>


From david.holmes at oracle.com  Mon Feb 10 22:37:23 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Feb 2014 16:37:23 +1000
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52F920AD.6050705@oracle.com>
References: <52ED3F28.5070806@oracle.com>	<6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>
	<52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com>
	<52F920AD.6050705@oracle.com>
Message-ID: <52F9C523.3010403@oracle.com>

Dan - thanks for an awesome analysis once again!

Yes, mea culpa: the code I was looking at was sans Mr Simms' most
fortuitous change. (I'm suffering from repo overload these days.)

As per my previous email I could see that a timed-wait, or spurious 
wakeup, could take us to the potentially problematic SpinAcquire, 
however I could not see how/if that thread could become _succ such that 
an unpark directed to _succ could be consumed in the SpinAcquire park().

Thanks again,
David

On 11/02/2014 4:55 AM, Daniel D. Daugherty wrote:
> On 2/9/14 8:37 PM, David Holmes wrote:
>> trimming content ...
>>
>> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote:
>>> On 2/7/14 2:56 PM, Karen Kinnear wrote:
>>>> 3. Did I read the code correctly that the Thread::SpinAcquire can make
>>>> a timed park
>>>> call on the same thread's _ParkEvent? And that this is used to get on
>>>> and off the wait queue,
>>>> i.e. to acquire the WaitSetLock?
>>>>     Is there the same risk that a notify might be eaten here also?
>>>
>>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent
>>
>> It sure does:
>>
>> void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>>   if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>      return ;   // normal fast-path return
>>   }
>>
>>   // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>>   TEVENT (SpinAcquire - ctx) ;
>>   int ctr = 0 ;
>>   int Yields = 0 ;
>>   for (;;) {
>>      while (*adr != 0) {
>>         ++ctr ;
>>         if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>            if (Yields > 5) {
>>              // Consider using a simple NakedSleep() instead.
>>              // Then SpinAcquire could be called by non-JVM threads
>>              Thread::current()->_ParkEvent->park(1) ;
>
> Ummmm... that's not the code I'm seeing...
>
> src/share/vm/runtime/thread.cpp:
>
>    4417  void Thread::SpinAcquire (volatile int * adr, const char *
> LockName) {
>    4418    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>    4419       return ;   // normal fast-path return
>    4420    }
>    4421
>    4422    // Slow-path : We've encountered contention --
> Spin/Yield/Block strategy.
>    4423    TEVENT (SpinAcquire - ctx) ;
>    4424    int ctr = 0 ;
>    4425    int Yields = 0 ;
>    4426    for (;;) {
>    4427       while (*adr != 0) {
>    4428          ++ctr ;
>    4429          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>    4430             if (Yields > 5) {
>    4431               os::naked_short_sleep(1);
>    4432             } else {
>    4433               os::NakedYield() ;
>    4434               ++Yields ;
>    4435             }
>    4436          } else {
>    4437             SpinPause() ;
>    4438          }
>    4439       }
>    4440       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
>    4441    }
>    4442  }
>
> Mr Simms recently changed the above code via:
>
> changeset:   5832:5944dba4badc
> user:        dsimms
> date:        Fri Jan 24 09:28:47 2014 +0100
> summary:     8028280: ParkEvent leak when running modified runThese
> which only loads classes
>
> os::naked_short_sleep() is new:
>
> - BSD/MacOS X, Linux - uses nanosleep()
> - Solaris - uses usleep()
> - Windows - uses Sleep()
>
> The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs
> on 2014.01.29. I don't see any signs that Mr Simms' fix will be backported
> to JDK8u/HSX-25u (yet) so this part of the review thread might impact the
> backport of my fix to earlier releases.
>
>
>> So considering Karen's question ... I can't tell for certain. :(
>>
>> I do not think the SpinAcquire on grabbing the wait-set lock to add to
>> the wait-set can be an issue because we will only park in response to
>> the actual wait, and hence only get unparked due to a
>> notify/notifyAll, but at this point we still own the monitor so no
>> notify/notifyAll is possible.
>>
>> However, for the removal from the wait-set a more complex analysis is
>> needed. To do the SpinAcquire we must still be flagged as TS_WAIT -
>> which means we have not been notified, but must be returning due to a
>> timeout (or spurious wakeup?). In such circumstances could we be
>> _succ? I don't think so but I'll leave it to Dan to confirm that part :)
>
> So for HSX-25 and probably older...
>
> There are four Thread::SpinAcquire() calls in the objectMonitor code:
>
>      Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ;
>      Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>      Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ;
>      Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ;
>
> We can easily rule out the "notify" and "notifyAll" uses since the
> current thread owns the Java-level monitor and there are no events
> to post in this part of the notify() or notifyAll() protocols.
>
> For the "WaitSet - add" use, the current thread owns the Java-level
> monitor and the thread has not been added as a waiter yet so another
> thread cannot do the notify-exit-make-successor part of the protocol
> yet.
>
> For the "WaitSet - unlink" use:
>
> src/share/vm/runtime/objectMonitor.cpp:
>
>    1569       if (node.TState == ObjectWaiter::TS_WAIT) {
>    1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>    1571           if (node.TState == ObjectWaiter::TS_WAIT) {
>    1572              DequeueSpecificWaiter (&node) ;       // unlink
> from WaitSet
>    1573              assert(node._notified == 0, "invariant");
>    1574              node.TState = ObjectWaiter::TS_RUN ;
>    1575           }
>    1576           Thread::SpinRelease (&_WaitSetLock) ;
>    1577       }
>
> It is the call on line 1570 above that gets us into this code:
>
> src/share/vm/runtime/thread.cpp:
>
>    4435  void Thread::SpinAcquire (volatile int * adr, const char *
> LockName) {
>    4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>    4437       return ;   // normal fast-path return
>    4438    }
>    4439
>    4440    // Slow-path : We've encountered contention --
> Spin/Yield/Block strategy.
>    4441    TEVENT (SpinAcquire - ctx) ;
>    4442    int ctr = 0 ;
>    4443    int Yields = 0 ;
>    4444    for (;;) {
>    4445       while (*adr != 0) {
>    4446          ++ctr ;
>    4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>    4448             if (Yields > 5) {
>    4449               // Consider using a simple NakedSleep() instead.
>    4450               // Then SpinAcquire could be called by non-JVM
> threads
>    4451               Thread::current()->_ParkEvent->park(1) ;
>    4452             } else {
>    4453               os::NakedYield() ;
>    4454               ++Yields ;
>    4455             }
>    4456          } else {
>    4457             SpinPause() ;
>    4458          }
>    4459       }
>    4460       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
>    4461    }
>    4462  }
>
> And the above code can consume the unpark() on line 4451.
>
> So how the heck do we get to line 1570???
>
> Well, the target thread would have to be both notified and unparked
> to be executing this code path. When the notify() code runs, the
> target of the notify() is changed from ObjectWaiter::TS_WAIT to
> ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default
> for Knob_MoveNotifyee is 2, so we're in non-default mode here...
>
> Here are the Knob_MoveNotifyee policy values:
>
>     1717      if (Policy == 0) {       // prepend to EntryList
>     1728      if (Policy == 1) {      // append to EntryList
>     1744      if (Policy == 2) {      // prepend to cxq
>     1760      if (Policy == 3) {      // append to cxq
>
> For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism
> where we just unpark the target thread and let it run. Part of
> that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN.
>
> The code works the same for notifyAll() for the thread picked
> to be notified. For the Knob_MoveNotifyee == 4 (or higher) case,
> we just unpark all the waiters and we have a free-for-all.
>
> So it looks like the code block from lines 1569-1577 is never
> used... or is it? Well... you have to remember two things:
>
> 1) spurious unpark()
> 2) timed wait()
>
> The caller might have called wait(0), but that doesn't mean that
> the underlying park() mechanism won't have a spurious unpark().
> Or better, the caller might have called wait(1) and be running
> again after a millisecond.
>
> So in the HSX25 and older system (i.e., without Mr Simms' fix for
> 8028280), it is possible for this call:
>
>    1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>
> to consume the unpark(). The gauntlet that has to be traversed
> to get to this call:
>
>    4451               Thread::current()->_ParkEvent->park(1) ;
>
> is impressive:
>
> - fast-path acquisition of the _WaitSetLock has to fail:
>
>    4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>    4437       return ;   // normal fast-path return
>    4438    }
>
> - if the machine is a uniprocessor, then 6 os::NakedYield()
>    call-loop-recheck attempts have to fail:
>
>    4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>    4448             if (Yields > 5) {
>    4449               // Consider using a simple NakedSleep() instead.
>    4450               // Then SpinAcquire could be called by non-JVM
> threads
>    4451               Thread::current()->_ParkEvent->park(1) ;
>    4452             } else {
>    4453               os::NakedYield() ;
>    4454               ++Yields ;
>    4455             }
>
> - if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause()
>    attempts, 1 os::NakedYield() attempt}  have to fail:
>
>    4446          ++ctr ;
>    4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>    4448             if (Yields > 5) {
>    4449               // Consider using a simple NakedSleep() instead.
>    4450               // Then SpinAcquire could be called by non-JVM
> threads
>    4451               Thread::current()->_ParkEvent->park(1) ;
>    4452             } else {
>    4453               os::NakedYield() ;
>    4454               ++Yields ;
>    4455             }
>    4456          } else {
>    4457             SpinPause() ;
>    4458          }
>
> But it is possible. It is one of those once-in-a-blue-moon type
> windows where everything has to line up just so.
>
> So how do we address this issue in HSX-25 and possibly older?
>
> If Mr Simms' fix for 8028280 is also backported, then there is no
> issue. If it is not backported, then applying the fix for this
> bug like so:
>
> src/share/vm/runtime/objectMonitor.cpp:
>
>    1596       if (JvmtiExport::should_post_monitor_waited()) {
>    1597         JvmtiExport::post_monitor_waited(jt, this, ret ==
> OS_TIMEOUT);
>    1598       }
>
>    1604       if (node._notified != 0 && _succ == Self) {
>    1605         // In this part of the monitor wait-notify-reenter
> protocol it
>    1606         // is possible (and normal) for another thread to do a
> fastpath
>    1607         // monitor enter-exit while this thread is still trying
> to get
>    1608         // to the reenter portion of the protocol.
>    1609         //
>    1610         // The ObjectMonitor was notified and the current thread is
>    1611         // the successor which also means that an unpark() has
> already
>    1612         // been done. The JVMTI_EVENT_MONITOR_WAITED event
> handler can
>    1613         // consume the unpark() that was done when the successor
> was
>    1614         // set because the same ParkEvent is shared between Java
>    1615         // monitors and JVM/TI RawMonitors (for now).
>    1616         //
>    1617         // We redo the unpark() to ensure forward progress,
> i.e., we
>    1618         // don't want all pending threads hanging (parked) with
> none
>    1619         // entering the unlocked monitor.
>    1620         node._event->unpark();
>    1621       }
>
> Of course the line numbers for the "fix" would be different and the comment
> would need to be updated to reflect that the:
>
>    1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>
> call above could also consume an unpark(), but it should work.
>
> If you've read this far, then I'm impressed. If you've read this far
> and only fallen asleep a couple of times, then I'm still impressed.
>
> Summary: I don't think we have an issue in JDK9, but we'll have to do
>           the fix in JDK8/HSX25 and older a little differently.
>
> Dan
>
>
>>
>> David
>> -----
>>
>>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it
>>> is a different ParkEvent. From src/share/vm/runtime/thread.hpp:
>>>
>>>    ParkEvent * _ParkEvent ;               // for synchronized()
>>>    ParkEvent * _SleepEvent ;              // for Thread.sleep
>>>    ParkEvent * _MutexEvent ;              // for native internal
>>> Mutex/Monitor
>>>    ParkEvent * _MuxEvent ;                // for low-level
>>> muxAcquire-muxRelease
>>>
>>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire()
>>> uses the _MuxEvent. There are some comments in thread.cpp about
>>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't
>>> think we ever want to go there.
>>>
>>> I also filed this RFE:
>>>
>>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>
>>> just in case the Serviceability team wants to migrate JVM/TI RawMonitors
>>> to a separate ParkEvent.
>>>
>>> Please let me know if you concur that I've resolved issue #3.
>>>
>>>
>>>> If so, I wonder if we want this added unpark to not just be called if
>>>> JVMTI_EVENT_MONITOR_WAITED
>>>> is enabled?
>>>
>>> I don't think we need it, but I've noted its removal as a risk.
>>>
>>> Again, thanks for the review!
>>>
>>> Dan
>>>
>>>
>>>>
>>>> thanks,
>>>> Karen
>>>>
>>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote:
>>>>
>>>>> Greetings,
>>>>>
>>>>> I have a fix ready for the following bug:
>>>>>
>>>>>     8028073 race condition in ObjectMonitor implementation causing
>>>>> deadlocks
>>>>> https://bugs.openjdk.java.net/browse/JDK-8028073
>>>>>
>>>>> On the surface, this is a very simple fix that relocates a few
>>>>> lines of
>>>>> code, relocates and rewrites the comments associated with that code
>>>>> and
>>>>> adds several new comments.
>>>>>
>>>>> Of course, in reality, the issue is much more complicated, but I'm
>>>>> hoping to make it easy for anyone not acquainted with this issue to
>>>>> understand what's going on.
>>>>>
>>>>> Here are the JDK9 webrev URLs:
>>>>>
>>>>> OpenJDK:
>>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>>>
>>>>> Oracle internal:
>>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>>>>
>>>>>
>>>>> The simple summary:
>>>>>
>>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>>>   it is possible for a JVM/TI monitor event handler to accidentally
>>>>>   consume a ParkEvent.unpark() call meant for Java Monitor layer
>>>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>>> https://bugs.openjdk.java.net/browse/JDK-5030359
>>>>> - it's the right fix, but it's in the wrong place
>>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>>>   event handler is called because it is that event handler
>>>>>   that can cause the hang
>>>>>
>>>>>
>>>>> Testing
>>>>> -------
>>>>>
>>>>> - a new StressMonitorWait test has been created that reliably
>>>>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>>>>   for the specific versions where the hang has been reproduced
>>>>>   - the test reliably reproduces the hang in 5 seconds on my
>>>>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>>>     the hang reliably on other machines
>>>>>   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>>>     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>>>     bits did not reproduce the hang
>>>>> - JPRT test job
>>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>>>   and Windows-X64:
>>>>>   - vm.quick
>>>>>   - Kitchensink (bigapps)
>>>>>   - Weblogic+medrec (bigapps)
>>>>>   - runThese (bigapps)
>>>>>
>>>>>
>>>>> The Gory Details Start Here
>>>>> ---------------------------
>>>>>
>>>>> This is the old location of the block of code that's being moved:
>>>>>
>>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>>
>>>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible,
>>>>> TRAPS) {
>>>>> <snip>
>>>>> 1499    exit (true, Self) ;                    // exit the monitor
>>>>> <snip>
>>>>> 1513    if (node._notified != 0 && _succ == Self) {
>>>>> 1514       node._event->unpark();
>>>>> 1515    }
>>>>>
>>>>>
>>>>> This is the new location of the block of code that's being moved:
>>>>>
>>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>>
>>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible,
>>>>> TRAPS) {
>>>>> <snip>
>>>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret ==
>>>>> OS_TIMEOUT);
>>>>> <snip>
>>>>> 1604        if (node._notified != 0 && _succ == Self) {
>>>>> <snip>
>>>>> 1620          node._event->unpark();
>>>>> 1621        }
>>>>>
>>>>>
>>>>> The Risks
>>>>> ---------
>>>>>
>>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>>>   is enabled:
>>>>>   - previously it was always executed
>>>>>   - while the old code was not effective for the hang that is being
>>>>>     fixed with this bug, it is possible that the old code prevented
>>>>>     a different bug in the successor protocol from manifesting
>>>>>   - thorough analysis of the successor protocol did not reveal a
>>>>>     case where the old code was needed in the old location
>>>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>>>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>>>   - the successor protocol is complicated and the analysis could
>>>>>     be wrong when certain options are used
>>>>>   - comments were added to each location where a JVM/TI monitor
>>>>>     event handler is called documenting why a fix like this one
>>>>>     is not needed there
>>>>>   - if the analysis is wrong, the new comments show where a new
>>>>>     code change would be needed
>>>>>
>>>>>
>>>>> The Scenario
>>>>> ------------
>>>>>
>>>>> I've created a scenario that reproduces this hang:
>>>>>
>>>>> T1 - enters monitor and calls monitor.wait()
>>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>>>> T3 - enters and exits the monitor
>>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>>>
>>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw
>>>>> monitor.
>>>>>
>>>>> Here are the six events necessary to make this hang happen:
>>>>>
>>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ
>>>>> field, but
>>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor
>>>>> again.
>>>>>
>>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>>>
>>>>> // KEY-EVENT-3a: T3 made T1 the successor
>>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed
>>>>> it.
>>>>>
>>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>>>
>>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>>>
>>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event
>>>>> handler
>>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be
>>>>> stuck.
>>>>>
>>>>>
>>>>> This bug is intertwined with:
>>>>>
>>>>> - The ObjectMonitor successor protocol
>>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI
>>>>> RawMonitors
>>>>>
>>>>> There is a very long successor.notes attachment to JDK-8028073 that
>>>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>>>> for putting pretty much anyone to sleep.
>>>>>
>>>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>>>> backported solution of moving the original fix to the right location.
>>>>> The following new bug has been filed for possible future work in this
>>>>> area by the Serviceability Team:
>>>>>
>>>>>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>>
>>>>>
>>>>> The Symptoms
>>>>> ------------
>>>>>
>>>>> With intermittent hangs like this, it is useful to know what to look
>>>>> for in order to determine if you are running into this issue:
>>>>>
>>>>> - if you aren't using a debugger or a profiler or some other
>>>>>   JVM/TI agent, then this hang is not the same as yours
>>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>>>   event handler, then this hang is not the same as yours
>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>>>   JVM/TI RawMonitors, then this hang is not the same as yours
>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>>>   back into Java code, then you might just be insane and this
>>>>>   hang might be similar to yours. However, using a Java callback
>>>>>   in an event handler is an even bigger problem/risk so fix that
>>>>>   first.
>>>>> - if you have one or more threads blocked like this and making no
>>>>>   progress, then this hang might be the same as yours:
>>>>>
>>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>    JavaThread state: _thread_blocked
>>>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>    JavaThread state: _thread_blocked
>>>>>         at java.lang.Object.wait(Native Method)
>>>>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>         at java.lang.Object.wait(Object.java:502)
>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>
>>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>    JavaThread state: _thread_blocked
>>>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>    JavaThread state: _thread_blocked
>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>
>>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>    JavaThread state: _thread_blocked
>>>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>    JavaThread state: _thread_blocked
>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>
>>>>> Key symptoms in thread T1:
>>>>>
>>>>> - had the object locked:
>>>>>
>>>>>   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>
>>>>> - did an Object.wait():
>>>>>
>>>>>   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>
>>>>> - is blocked on reentry:
>>>>>
>>>>>   waiting for monitor entry [0xfffffd7fc0231000]
>>>>>
>>>>> Key symptoms in thread T2:
>>>>>
>>>>> - is blocked waiting to lock the object:
>>>>>
>>>>>   waiting for monitor entry [0xfffffd7fc0130000]
>>>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>
>>>>> Key symptoms in thread T3:
>>>>>
>>>>> - is blocked waiting to lock the object:
>>>>>
>>>>>   waiting for monitor entry [0xfffffd7fc002f000]
>>>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>
>>>
>
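
The shared-permit hazard at the heart of this thread can be modelled in a
few lines of Java. This is a sketch under stated assumptions, not HotSpot
code: Permit is a hypothetical stand-in for a binary ParkEvent, the first
park(1) plays the role of the event handler's timed raw-monitor wait, and
the final park(50) plays the role of the reenter path. In the real hang
that last park is untimed; the 50 ms timeout is there only so the model
terminates.

```java
// Hypothetical model, not HotSpot code: one binary permit shared by two
// independent waiters, as with the ParkEvent shared by Java monitors
// and JVM/TI RawMonitors.
final class SharedPermitDemo {
    static final class Permit {
        private boolean set;

        synchronized void unpark() {
            set = true;                   // binary: repeated unparks do not stack
            notifyAll();
        }

        // Waits up to `millis` for the permit; returns true if it was consumed.
        synchronized boolean park(long millis) {
            long deadline = System.nanoTime() + millis * 1_000_000L;
            while (!set) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false;         // timed out without seeing a permit
                }
                try {
                    wait(remaining / 1_000_000L, (int) (remaining % 1_000_000L));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            set = false;
            return true;
        }
    }

    // Returns {handlerConsumedPermit, reenterWasWoken}.
    static boolean[] run(boolean redoUnpark) {
        Permit event = new Permit();
        event.unpark();                         // exiting thread wakes the successor
        boolean handlerAte = event.park(1);     // handler's timed wait eats the permit
        if (redoUnpark) {
            event.unpark();                     // models redoing the unpark after the handler
        }
        boolean reenterWoken = event.park(50);  // reenter path parks on the same permit
        return new boolean[] { handlerAte, reenterWoken };
    }

    public static void main(String[] args) {
        boolean[] without = run(false);
        boolean[] with = run(true);
        System.out.println("no redo: handler ate=" + without[0]
                + ", reenter woken=" + without[1]);
        System.out.println("redo:    handler ate=" + with[0]
                + ", reenter woken=" + with[1]);
    }
}
```

Without the second unpark the reenter park only returns by timing out;
with it, the permit is available again regardless of what the handler did.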

From staffan at larsen.se  Tue Feb 11 02:18:09 2014
From: staffan at larsen.se (staffan at larsen.se)
Date: Tue, 11 Feb 2014 11:18:09 +0100
Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr
Message-ID: <CDD60B6C-F8E3-4693-AB76-43A4CC0C97B8@larsen.se>

Please review this small fix for libjfr/mapfile-vers.

Thanks,
/Staffan

diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers
--- a/make/mapfiles/libjfr/mapfile-vers
+++ b/make/mapfiles/libjfr/mapfile-vers
@@ -34,6 +34,7 @@
       Java_oracle_jrockit_jfr_VMJFR_getPeriod;
       Java_oracle_jrockit_jfr_VMJFR_descriptors;
       Java_oracle_jrockit_jfr_VMJFR_redefineClass0;
+      Java_oracle_jrockit_jfr_VMJFR_retransformClasses0;
       JNI_OnLoad;
   local:
       *;

From staffan.larsen at oracle.com  Tue Feb 11 03:00:47 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 11 Feb 2014 12:00:47 +0100
Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr
In-Reply-To: <CDD60B6C-F8E3-4693-AB76-43A4CC0C97B8@larsen.se>
References: <CDD60B6C-F8E3-4693-AB76-43A4CC0C97B8@larsen.se>
Message-ID: <E3D82E6F-8579-4F49-9A92-581E2415D169@oracle.com>

Resending (email to build-dev bounced since I used the wrong sender address).

On 11 feb 2014, at 11:18, staffan at larsen.se wrote:

> Please review this small fix for libjfr/mapfile-vers.
> 
> Thanks,
> /Staffan
> 
> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers
> --- a/make/mapfiles/libjfr/mapfile-vers
> +++ b/make/mapfiles/libjfr/mapfile-vers
> @@ -34,6 +34,7 @@
>       Java_oracle_jrockit_jfr_VMJFR_getPeriod;
>       Java_oracle_jrockit_jfr_VMJFR_descriptors;
>       Java_oracle_jrockit_jfr_VMJFR_redefineClass0;
> +      Java_oracle_jrockit_jfr_VMJFR_retransformClasses0;
>       JNI_OnLoad;
>   local:
>       *;


From erik.joelsson at oracle.com  Tue Feb 11 03:14:01 2014
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Tue, 11 Feb 2014 12:14:01 +0100
Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr
In-Reply-To: <E3D82E6F-8579-4F49-9A92-581E2415D169@oracle.com>
References: <CDD60B6C-F8E3-4693-AB76-43A4CC0C97B8@larsen.se>
	<E3D82E6F-8579-4F49-9A92-581E2415D169@oracle.com>
Message-ID: <52FA05F9.20206@oracle.com>

Looks good, but I can't help but wonder why the mapfile for libjfr is in 
the open.

/Erik

On 2014-02-11 12:00, Staffan Larsen wrote:
> Resending (email to build-dev bounced since I used the wrong sender address).
>
> On 11 feb 2014, at 11:18, staffan at larsen.se wrote:
>
>> Please review this small fix for libjfr/mapfile-vers.
>>
>> Thanks,
>> /Staffan
>>
>> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers
>> --- a/make/mapfiles/libjfr/mapfile-vers
>> +++ b/make/mapfiles/libjfr/mapfile-vers
>> @@ -34,6 +34,7 @@
>>        Java_oracle_jrockit_jfr_VMJFR_getPeriod;
>>        Java_oracle_jrockit_jfr_VMJFR_descriptors;
>>        Java_oracle_jrockit_jfr_VMJFR_redefineClass0;
>> +      Java_oracle_jrockit_jfr_VMJFR_retransformClasses0;
>>        JNI_OnLoad;
>>    local:
>>        *;


From staffan.larsen at oracle.com  Tue Feb 11 04:01:37 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 11 Feb 2014 13:01:37 +0100
Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr
In-Reply-To: <52FA05F9.20206@oracle.com>
References: <CDD60B6C-F8E3-4693-AB76-43A4CC0C97B8@larsen.se>
	<E3D82E6F-8579-4F49-9A92-581E2415D169@oracle.com>
	<52FA05F9.20206@oracle.com>
Message-ID: <7482D938-F9B1-4890-B48D-AF892B6FA3C3@oracle.com>


On 11 feb 2014, at 12:14, Erik Joelsson <erik.joelsson at oracle.com> wrote:

> Looks good, but I can't help but wonder why the mapfile for libjfr is in the open.

Yes, that is unfortunate. It used to be in closed source in jdk7, but moved into open with the new build system.

/Staffan

> 
> /Erik
> 
> On 2014-02-11 12:00, Staffan Larsen wrote:
>> Resending (email to build-dev bounced since I used the wrong sender address).
>> 
>> On 11 feb 2014, at 11:18, staffan at larsen.se wrote:
>> 
>>> Please review this small fix for libjfr/mapfile-vers.
>>> 
>>> Thanks,
>>> /Staffan
>>> 
>>> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers
>>> --- a/make/mapfiles/libjfr/mapfile-vers
>>> +++ b/make/mapfiles/libjfr/mapfile-vers
>>> @@ -34,6 +34,7 @@
>>>       Java_oracle_jrockit_jfr_VMJFR_getPeriod;
>>>       Java_oracle_jrockit_jfr_VMJFR_descriptors;
>>>       Java_oracle_jrockit_jfr_VMJFR_redefineClass0;
>>> +      Java_oracle_jrockit_jfr_VMJFR_retransformClasses0;
>>>       JNI_OnLoad;
>>>   local:
>>>       *;
> 


From fweimer at redhat.com  Tue Feb 11 04:52:03 2014
From: fweimer at redhat.com (Florian Weimer)
Date: Tue, 11 Feb 2014 13:52:03 +0100
Subject: Project to improve hs_err files
In-Reply-To: <cf2b0521-76b0-4cd8-88bc-b2986418f9be@default>
References: <cf2b0521-76b0-4cd8-88bc-b2986418f9be@default>
Message-ID: <52FA1CF3.2070902@redhat.com>

On 09/06/2013 01:32 PM, Mattis Castegren wrote:

> Please let me know if you have ideas of what information would make
> hs_err files more useful, and I will add them to my list.

Structured logging to the systemd journal would be an interesting 
addition.  This might be a bit tricky from a signal handler, though:

http://lists.freedesktop.org/archives/systemd-devel/2012-November/007341.html
http://lists.freedesktop.org/archives/systemd-devel/2014-January/016465.html

-- 
Florian Weimer / Red Hat Product Security Team

From staffan.larsen at oracle.com  Tue Feb 11 06:12:59 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 11 Feb 2014 15:12:59 +0100
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
Message-ID: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>

Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster.

bug: https://bugs.openjdk.java.net/browse/JDK-6952105
webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/

Thanks,
/Staffan

From shanliang.jiang at oracle.com  Tue Feb 11 06:13:46 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 11 Feb 2014 15:13:46 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails
	with com.sun.jdi.VMDisconnectedException: Connection closed
Message-ID: <52FA301A.2040301@oracle.com>

Hi,

The problem could be that FieldMonitor did not have enough time to call
"addFieldWatch" before the VM being monitored (TestPostFieldModification)
had already exited.

So we should make sure that TestPostFieldModification exits only after
FieldMonitor has done what is necessary. The solution proposed here is
that FieldMonitor creates a file after adding the field watch, and
TestPostFieldModification quits only after finding the file.
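
A marker-file handshake of this kind could look roughly like the sketch
here. It is not the code from the webrev: MarkerFileHandshake,
signalReady(), awaitReady() and the "watch-added" file name are all
hypothetical, and the 10 ms polling interval is an arbitrary choice.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the proposed handshake, not the webrev's code.
final class MarkerFileHandshake {
    // Debugger side: create the marker once the field watch is in place.
    static void signalReady(Path marker) {
        try {
            Files.write(marker, new byte[0]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Debuggee side: poll until the marker exists or the timeout expires.
    static boolean awaitReady(Path marker, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!Files.exists(marker)) {
            if (System.nanoTime() - deadline >= 0) {
                return false;             // gave up waiting
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    // Exercises both sides in one process, using a temporary directory.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("handshake");
            Path marker = dir.resolve("watch-added");
            boolean before = awaitReady(marker, 50);   // marker not there yet
            signalReady(marker);
            boolean after = awaitReady(marker, 1000);  // marker now present
            Files.delete(marker);
            Files.delete(dir);
            return !before && after;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The timeout matters in a test: if the debugger side fails before creating
the file, the debuggee should still exit rather than wait forever.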

web:
http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/

bug:
https://bugs.openjdk.java.net/browse/JDK-8007710

Thanks,
Shanliang

From staffan.larsen at oracle.com  Tue Feb 11 07:04:46 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 11 Feb 2014 16:04:46 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA301A.2040301@oracle.com>
References: <52FA301A.2040301@oracle.com>
Message-ID: <F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>

Hi Shanliang,

I can't quite see how the test can fail in this way. When the ClassPrepareEvent happens, the debuggee will be suspended. So when addFieldWatch() is called, the debuggee should not have moved.

One problem I do see with the test is that it does not wait for a VMStartEvent before setting up requests. I'm not sure if that could cause the failure in the bug report, though.

/Staffan

On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:

> Hi ,
> 
> The problem could be that FieldMonitor did not have enough time to "addFieldWatch" but the vm to monitor (TestPostFieldModification) was already ended.
> 
> So we should make sure that TestPostFieldModification exits after FieldMonitor has done necessary. The solution proposed here is that FieldMonitor creates a file after adding field watching, and TestPostFieldModification quits only after finding the file.
> 
> web:
> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
> 
> bug:
> https://bugs.openjdk.java.net/browse/JDK-8007710
> 
> Thanks,
> Shanliang


From shanliang.jiang at oracle.com  Tue Feb 11 07:21:30 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 11 Feb 2014 16:21:30 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA301A.2040301@oracle.com>
References: <52FA301A.2040301@oracle.com>
Message-ID: <52FA3FFA.70502@oracle.com>

public web:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/00/

Shanliang

shanliang wrote:
> Hi ,
>
> The problem could be that FieldMonitor did not have enough time to 
> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was 
> already ended.
>
> So we should make sure that TestPostFieldModification exits after 
> FieldMonitor has done necessary. The solution proposed here is that 
> FieldMonitor creates a file after adding field watching, and 
> TestPostFieldModification quits only after finding the file.
>
> web:
> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>
> bug:
> https://bugs.openjdk.java.net/browse/JDK-8007710
>
> Thanks,
> Shanliang


From shanliang.jiang at oracle.com  Tue Feb 11 07:31:26 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 11 Feb 2014 16:31:26 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>
References: <52FA301A.2040301@oracle.com>
	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>
Message-ID: <52FA424E.2050202@oracle.com>

Staffan Larsen wrote:
> Hi Shanliang,
>
> I can't quite see how the test can fail in this way. When the ClassPrepareEvent happens, the debuggee will be suspended. So when addFieldWatch() is called, the debuggee should not have moved.
>   
I am not a JDI expert, so I may be missing something here. I checked the 
failure trace and saw that the reported exception happened when FieldMonitor 
received the ClassPrepareEvent and was calling addFieldWatch. FieldMonitor 
did call "vm.resume()" before handling the events.

I reproduced the bug by adding sleep(1000) after vm.resume() but before 
calling eventQueue.remove();

Thanks,
Shanliang


From jaroslav.bachorik at oracle.com  Tue Feb 11 07:43:17 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 11 Feb 2014 16:43:17 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA424E.2050202@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>
	<52FA424E.2050202@oracle.com>
Message-ID: <52FA4515.3050904@oracle.com>

On 11.2.2014 16:31, shanliang wrote:
> Staffan Larsen wrote:
>> Hi Shanliang,
>>
>> I can't quite see how the test can fail in this way. When the
>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>> addFieldWatch() is called, the debuggee should not have moved.
> I am not expert of jdi so I may miss something here. I checked the
> failure trace and saw the report exception happen when FieldMonitor
> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
> call "vm.resume()" before treating events.

AFAICS, calling vm.resume() results in an almost immediate debuggee 
death. The gc() invoking thread "d" is flagged as a daemon and as such 
doesn't prevent the process from exiting. The other thread is not a 
daemon but will finish in only a few cycles.
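
A minimal sketch of the daemon-thread behaviour described above. The class
and method names are hypothetical and this is not the actual
TestPostFieldModification source; it only illustrates why a looping daemon
thread does not keep the debuggee process alive:

```java
// Hypothetical illustration; not the actual test source.
final class DaemonExitSketch {

    // Builds a named thread; when 'daemon' is true the JVM will not wait
    // for it once the last non-daemon thread has finished.
    static Thread newBackgroundThread(Runnable body, boolean daemon) {
        Thread t = new Thread(body, "d");
        t.setDaemon(daemon); // must be set before start()
        return t;
    }

    public static void main(String[] args) {
        Thread d = newBackgroundThread(() -> {
            while (true) {
                System.gc();
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, true);
        d.start();
        System.out.println("main done; daemon=" + d.isDaemon());
        // main returns here; the JVM exits although "d" is still looping.
    }
}
```

With the flag set to false, the same program would never exit on its own,
which is the difference that matters for the debuggee's lifetime here.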

>
> I reproduced the bug by add sleep(1000) after vm.resume() but before
> calling eventQueue.remove();

It looks like some kind of synchronization between the debugger and the 
debuggee is necessary. But I wonder whether it would be better to use 
process.getOutputStream() to write and flush a message for the debuggee 
indicating that it can exit. And in the debuggee you would just do 
System.in.read() as the last statement in the main() method. Seems more 
robust than involving files.
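
The handshake suggested above could be sketched roughly as follows. This is
only an illustration under stated assumptions: the class and method names
are hypothetical, and a pipe stands in for the debuggee's stdin. In the
real test the debugger side would presumably write to something like
vm.process().getOutputStream(), and the debuggee side would read System.in:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical sketch of the handshake; names are illustrative and do not
// come from the webrev.
final class ExitHandshakeSketch {

    // Debuggee side: the last statement of main() would block here until
    // the debugger sends a byte or closes the stream.
    static int awaitExitSignal(InputStream in) throws IOException {
        return in.read(); // returns -1 if the stream is closed first
    }

    // Debugger side: write one byte to the stream connected to the
    // debuggee's stdin and flush so it is delivered immediately.
    static void signalExit(OutputStream toDebuggee) throws IOException {
        toDebuggee.write('q');
        toDebuggee.flush();
    }

    public static void main(String[] args) throws Exception {
        // A pipe stands in for the debuggee's stdin in this demo.
        PipedOutputStream debuggerSide = new PipedOutputStream();
        PipedInputStream debuggeeStdin = new PipedInputStream(debuggerSide);

        Thread debuggee = new Thread(() -> {
            try {
                awaitExitSignal(debuggeeStdin);
                System.out.println("debuggee: exit signal received");
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        debuggee.start();

        // ... the debugger would add its field watches here ...
        signalExit(debuggerSide);
        debuggee.join();
    }
}
```

The point of the design is that the debuggee cannot finish before the
debugger has explicitly released it, and nothing is left on disk to clean
up afterwards.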

Cheers,

-JB-



From shanliang.jiang at oracle.com  Tue Feb 11 08:17:55 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 11 Feb 2014 17:17:55 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA4515.3050904@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>
	<52FA4515.3050904@oracle.com>
Message-ID: <52FA4D33.4030402@oracle.com>

Jaroslav Bachorik wrote:
> On 11.2.2014 16:31, shanliang wrote:
>> Staffan Larsen wrote:
>>> Hi Shanliang,
>>>
>>> I can't quite see how the test can fail in this way. When the
>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>> addFieldWatch() is called, the debuggee should not have moved.
>> I am not expert of jdi so I may miss something here. I checked the
>> failure trace and saw the report exception happen when FieldMonitor
>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>> call "vm.resume()" before treating events.
>
> AFAICS, calling vm.resume() results in an almost immediate debuggee 
> death. The gc() invoking thread "d" is flagged as a daemon and as such 
> doesn't prevent the process from exiting. The other thread is not a 
> daemon but will finish in only few cycles.
I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of 
the method "resume":
    /**
     * Continues the execution of the application running in this
     * virtual machine. All threads are resumed as documented in
     * {@link ThreadReference#resume}.
     *
     * @throws VMCannotBeModifiedException if the VirtualMachine is
     * read-only - see {@link VirtualMachine#canBeModified()}.
     *
     * @see #suspend
     */
    void resume();
My understanding is that the debuggee resumes running after this call, 
rather than dying?
>
>>
>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>> calling eventQueue.remove();
>
> It looks like some kind of synchronization between the debugger and 
> the debuggee is necessary. But I wonder if you should better use the 
> process.getOutputStream() to write and flush a message for the 
> debuggee indicating that it can exit. And in the debuggee you would just 
> do System.in.read() as the last statement in the main() method. Seems 
> more robust than involving files.
It could work. Creating a file in the testing directory should cause no 
issues either, though it may perform worse.

Thanks,
Shanliang


From shanliang.jiang at oracle.com  Tue Feb 11 08:37:18 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 11 Feb 2014 17:37:18 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA4D33.4030402@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>
	<52FA4D33.4030402@oracle.com>
Message-ID: <52FA51BE.4070701@oracle.com>

shanliang wrote:
> Jaroslav Bachorik wrote:
>> On 11.2.2014 16:31, shanliang wrote:
>>> Staffan Larsen wrote:
>>>> Hi Shanliang,
>>>>
>>>> I can't quite see how the test can fail in this way. When the
>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>> addFieldWatch() is called, the debuggee should not have moved.
>>> I am not expert of jdi so I may miss something here. I checked the
>>> failure trace and saw the report exception happen when FieldMonitor
>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>> call "vm.resume()" before treating events.
>>
>> AFAICS, calling vm.resume() results in an almost immediate debuggee 
>> death. The gc() invoking thread "d" is flagged as a daemon and as 
>> such doesn't prevent the process from exiting. The other thread is 
>> not a daemon but will finish in only few cycles.
> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc 
> of the method "resume":
>    /**
>     * Continues the execution of the application running in this
>     * virtual machine. All threads are resumed as documented in
>     * {@link ThreadReference#resume}.
>     *
>     * @throws VMCannotBeModifiedException if the VirtualMachine is 
> read-only - see {@link VirtualMachine#canBeModified()}.
>     *
>     * @see #suspend
>     */
>    void resume();
> My understanding is that the debuggee resumes to work after this call, 
> instead to die?
In fact, this is exactly the problem: the VM (TestPostFieldModification) 
should not die before FieldMonitor finishes addFieldWatch.

Shanliang


From jaroslav.bachorik at oracle.com  Tue Feb 11 08:41:01 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 11 Feb 2014 17:41:01 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA4D33.4030402@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>
	<52FA4515.3050904@oracle.com> <52FA4D33.4030402@oracle.com>
Message-ID: <52FA529D.3010306@oracle.com>

On 11.2.2014 17:17, shanliang wrote:
> Jaroslav Bachorik wrote:
>> On 11.2.2014 16:31, shanliang wrote:
>>> Staffan Larsen wrote:
>>>> Hi Shanliang,
>>>>
>>>> I can't quite see how the test can fail in this way. When the
>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>> addFieldWatch() is called, the debuggee should not have moved.
>>> I am not expert of jdi so I may miss something here. I checked the
>>> failure trace and saw the report exception happen when FieldMonitor
>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>> call "vm.resume()" before treating events.
>>
>> AFAICS, calling vm.resume() results in an almost immediate debuggee
>> death. The gc() invoking thread "d" is flagged as a daemon and as such
>> doesn't prevent the process from exiting. The other thread is not a
>> daemon but will finish in only few cycles.
> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of
> the method "resume":
>     /**
>      * Continues the execution of the application running in this
>      * virtual machine. All threads are resumed as documented in
>      * {@link ThreadReference#resume}.
>      *
>      * @throws VMCannotBeModifiedException if the VirtualMachine is
> read-only - see {@link VirtualMachine#canBeModified()}.
>      *
>      * @see #suspend
>      */
>     void resume();
> My understanding is that the debuggee resumes to work after this call,
> instead to die?
Yes. It resumes. But only for a few microseconds at best. Those 10 
string concatenations will not take long. So the VM will exit almost 
immediately after receiving the resume command.

>>
>>>
>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>> calling eventQueue.remove();
>>
>> It looks like some kind of synchronization between the debugger and
>> the debuggee is necessary. But I wonder if you should better use the
>> process.getOutputStream() to write and flush a message for the
>> debuggee indicating that it can exit. And in the debuggee you would just
>> do System.in.read() as the last statement in the main() method. Seems
>> more robust than involving files.
> It could work, but creating a file in the testing directory should have
> no issue, but yes maybe less performance.
... and possible file system issues, the need to clean up the file, 
unsuitability for parallel runs, etc. BTW, I recall that test-specific 
files should be created in a path specified by some other system 
property, not "user.dir". I can try to find the property name.

-JB-



From shanliang.jiang at oracle.com  Tue Feb 11 09:30:57 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 11 Feb 2014 18:30:57 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA51BE.4070701@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>
	<52FA51BE.4070701@oracle.com>
Message-ID: <52FA5E51.3040704@oracle.com>

Here is the new fix, in which FieldMonitor writes to 
TestPostFieldModification to tell it to quit, as suggested by 
Jaroslav:
    http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/

Thanks,
Shanliang



From mark.reinhold at oracle.com  Tue Feb 11 09:42:40 2014
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Tue, 11 Feb 2014 09:42:40 -0800
Subject: Changeset rolled back: jdk8/tl/jdk: 7152892: some jtreg tests fail
	with permission denied
Message-ID: <20140211094240.954088@eggemoggin.niobe.net>

This changeset was erroneously pushed to jdk8/tl/jdk:

  Changeset: da4b0962ad11
  Author:    robm
  Date:      2014-02-10 14:35 +0000
  URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/da4b0962ad11

  7152892: some jtreg tests fail with permission denied
  Reviewed-by: coffeys

  ! test/java/lang/ClassLoader/Assert.sh
  ! test/java/rmi/registry/readTest/readTest.sh
  ! test/java/util/zip/ZipFile/ReadZip.java
  ! test/sun/net/www/protocol/jar/jarbug/run.sh

It has been rolled back on the server and blacklisted in jcheck.

If you have a clone of this repository then either strip this
changeset or destroy the repository and re-clone it.

- Mark

From staffan.larsen at oracle.com  Tue Feb 11 11:15:43 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 11 Feb 2014 20:15:43 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA5E51.3040704@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>
	<52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com>
Message-ID: <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>

I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:

 Note that a target VM launched by a launching connector is not
 guaranteed to be stable until after the VMStartEvent has been
 received.

I think adding code that looks something like this will make the test stable:

    VirtualMachine vm = launchTarget(CLASS_NAME);
    EventQueue eventQueue = vm.eventQueue();

    boolean started = false;
    while(!started) {
      EventSet eventSet = eventQueue.remove();
      for (Event event : eventSet) {
        if (event instanceof VMStartEvent) {
          started = true;
        }
        if (event instanceof VMDeathEvent
            || event instanceof VMDisconnectEvent) {
          throw new Error("VM died before it started...:"+event);
        }
      }
    }

    System.out.println("Vm launched");

There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.

While you are in there you can also remove the unused StringBuffer near the top of main().
 
Thanks,
/Staffan



From serguei.spitsyn at oracle.com  Tue Feb 11 12:23:27 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 11 Feb 2014 12:23:27 -0800
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
Message-ID: <52FA86BF.4040002@oracle.com>

Staffan,

It is a nice fix to have.
I do not see any issues with it.

Thanks,
Serguei

On 2/11/14 6:12 AM, Staffan Larsen wrote:
> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>
> Thanks,
> /Staffan


From dmitry.samersoff at oracle.com  Tue Feb 11 12:35:52 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Wed, 12 Feb 2014 00:35:52 +0400
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
Message-ID: <52FA89A8.7060105@oracle.com>

Staffan,

Nice catch!

Looks good to me.

-Dmitry

On 2014-02-11 18:12, Staffan Larsen wrote:
> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster.
> 
> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
> 
> Thanks,
> /Staffan
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From serguei.spitsyn at oracle.com  Tue Feb 11 12:55:08 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 11 Feb 2014 12:55:08 -0800
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA5E51.3040704@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>
	<52FA5E51.3040704@oracle.com>
Message-ID: <52FA8E2C.3050605@oracle.com>

The fix looks good.
But could you change "impossible" at line 45 to something more adequate, 
e.g. "caught exception"?

   41         System.out.println("---TestPostFieldModification-run waiting to exit ...");
   42         try {
   43             System.in.read();
   44         } catch (Exception e) {
   45             System.out.println("---TestPostFieldModification-run impossible? "+e);
   46             e.printStackTrace();
   47         }


Thanks,
Serguei


On 2/11/14 9:30 AM, shanliang wrote:
> Here is the new fix in which FieldMonitor will write to 
> TestPostFieldModification, to inform the latter to quit, as suggested 
> bu Jaroslav
>    http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>
> Thanks,
> Shanliang
>
> shanliang wrote:
>> shanliang wrote:
>>> Jaroslav Bachorik wrote:
>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>> Staffan Larsen wrote:
>>>>>> Hi Shanliang,
>>>>>>
>>>>>> I can?t quite see how the test can fail in this way. When the
>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>> received ClassPrepareEvent and was doing addFieldWatch. 
>>>>> FieldMonitor did
>>>>> call "vm.resume()" before treating events.
>>>>
>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee 
>>>> death. The gc() invoking thread "d" is flagged as a deamon and as 
>>>> such doesn't prevent the process from exiting. The other thread is 
>>>> not a daemon but will finish in only few cycles.
>>> I looked at the class com.sun.jdi.VirtualMachine, here is the 
>>> Javadoc of the method "resume":
>>>    /**
>>>     * Continues the execution of the application running in this
>>>     * virtual machine. All threads are resumed as documented in
>>>     * {@link ThreadReference#resume}.
>>>     *
>>>     * @throws VMCannotBeModifiedException if the VirtualMachine is 
>>> read-only - see {@link VirtualMachine#canBeModified()}.
>>>     *
>>>     * @see #suspend
>>>     */
>>>    void resume();
>>> My understanding is that the debuggee resumes to work after this 
>>> call, instead to die?
>> In fact the problem is here, the vm (TestPostFieldModification) 
>> should not die before FieldMonitor finishes addFieldWatch.
>>
>> Shanliang
>>>>
>>>>>
>>>>> I reproduced the bug by adding sleep(1000) after vm.resume() but before
>>>>> calling eventQueue.remove();
>>>>
>>>> It looks like some kind of synchronization between the debugger and
>>>> the debuggee is necessary. But I wonder whether it would be better to
>>>> use process.getOutputStream() to write and flush a message for the
>>>> debuggee indicating that it can exit. And in the debuggee you would
>>>> just do System.in.read() as the last statement in the main()
>>>> method. Seems more robust than involving files.
>>> It could work. Creating a file in the testing directory should
>>> cause no problems either, although it may be slower.
>>>
>>> Thanks,
>>> Shanliang
>>>>
>>>> Cheers,
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> Thanks,
>>>>> Shanliang
>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>> cause the failure in the bug report, though.
>>>>>>
>>>>>> /Staffan
>>>>>>
>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi ,
>>>>>>>
>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>> "addFieldWatch" before the vm being monitored
>>>>>>> (TestPostFieldModification) had already ended.
>>>>>>>
>>>>>>> So we should make sure that TestPostFieldModification exits only after
>>>>>>> FieldMonitor has done what is necessary. The solution proposed here is
>>>>>>> that FieldMonitor creates a file after adding the field watch, and
>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>
>>>>>>> web:
>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>
>>>>>>> bug:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Shanliang
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
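
A minimal Java sketch of the stdin handshake Jaroslav suggests in the quoted thread, for readers comparing it with the file-based approach. All names here (ExitHandshake, debuggeeMain, releaseDebuggee) are hypothetical, and a piped stream stands in for the real process pipe so the example runs in one JVM. In the actual test the debuggee would read System.in and the debugger would write to process.getOutputStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical names throughout; a pipe stands in for the debuggee's stdin.
class ExitHandshake {

    // Debuggee side: do the work, then block until the debugger sends a byte.
    // In a real debuggee the stream would be System.in.
    static void debuggeeMain(InputStream stdin, AtomicBoolean exited) throws IOException {
        // ... field modifications the debugger wants to observe ...
        stdin.read();              // returns once a byte arrives or the stream closes
        exited.set(true);
    }

    // Debugger side: call once all event requests are in place.
    // In a real debugger the stream would be process.getOutputStream().
    static void releaseDebuggee(OutputStream toDebuggee) throws IOException {
        toDebuggee.write('q');
        toDebuggee.flush();        // without the flush the byte may sit in a buffer
    }

    // Runs both sides in one JVM; returns {exitedBeforeRelease, exitedAfterRelease}.
    static boolean[] demo() {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);
            AtomicBoolean exited = new AtomicBoolean(false);
            Thread debuggee = new Thread(() -> {
                try {
                    debuggeeMain(in, exited);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            debuggee.start();
            Thread.sleep(200);                  // stands in for a slow addFieldWatch() set-up
            boolean before = exited.get();      // the debuggee should still be blocked in read()
            releaseDebuggee(out);
            debuggee.join(5000);
            return new boolean[] { before, exited.get() };
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The point of the design is that the debuggee cannot exit on its own schedule: it stays alive until the debugger explicitly releases it, however long the set-up takes.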


From lana.steuck at oracle.com  Tue Feb 11 13:40:10 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:10 +0000
Subject: hg: jdk8/tl/corba: 7 new changesets
Message-ID: <20140211214024.5EA1E62B9A@hg.openjdk.java.net>

Changeset: 18c4d03cf516
Author:    katleman
Date:      2014-01-22 12:53 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/corba/rev/18c4d03cf516

Added tag jdk8-b125 for changeset 7b45151c7a05

! .hgtags

Changeset: 8ceb68fd9e10
Author:    katleman
Date:      2014-01-22 14:06 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/corba/rev/8ceb68fd9e10

Merge

! .hgtags

Changeset: cfa04e69b115
Author:    katleman
Date:      2014-01-24 15:07 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/corba/rev/cfa04e69b115

Added tag jdk8-b126 for changeset 8ceb68fd9e10

! .hgtags

Changeset: b8c71dae0557
Author:    lana
Date:      2014-01-29 11:11 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/corba/rev/b8c71dae0557

Merge


Changeset: 113e7569b49b
Author:    katleman
Date:      2014-01-30 12:16 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/corba/rev/113e7569b49b

Added tag jdk8-b127 for changeset b8c71dae0557

! .hgtags

Changeset: 5c72d74c6805
Author:    katleman
Date:      2014-02-01 18:21 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/corba/rev/5c72d74c6805

Added tag jdk8-b128 for changeset 113e7569b49b

! .hgtags

Changeset: eea0d7dfcbe2
Author:    katleman
Date:      2014-02-06 17:34 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/corba/rev/eea0d7dfcbe2

Added tag jdk8-b129 for changeset 5c72d74c6805

! .hgtags


From lana.steuck at oracle.com  Tue Feb 11 13:40:17 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:17 +0000
Subject: hg: jdk8/tl/langtools: 7 new changesets
Message-ID: <20140211214140.425E862B9E@hg.openjdk.java.net>

Changeset: 9a4dbfe11ed1
Author:    katleman
Date:      2014-01-22 12:54 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/langtools/rev/9a4dbfe11ed1

Added tag jdk8-b125 for changeset 436176151e85

! .hgtags

Changeset: ba24b6304362
Author:    katleman
Date:      2014-01-22 14:09 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/langtools/rev/ba24b6304362

Merge

! .hgtags

Changeset: 305b97f4651b
Author:    katleman
Date:      2014-01-24 15:08 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/langtools/rev/305b97f4651b

Added tag jdk8-b126 for changeset ba24b6304362

! .hgtags

Changeset: bb69217ed812
Author:    lana
Date:      2014-01-29 11:12 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/langtools/rev/bb69217ed812

Merge


Changeset: 09cdd3b493c0
Author:    katleman
Date:      2014-01-30 12:17 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/langtools/rev/09cdd3b493c0

Added tag jdk8-b127 for changeset bb69217ed812

! .hgtags

Changeset: 8fe7202d3c38
Author:    katleman
Date:      2014-02-01 18:21 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/langtools/rev/8fe7202d3c38

Added tag jdk8-b128 for changeset 09cdd3b493c0

! .hgtags

Changeset: 9d81ae1c417a
Author:    katleman
Date:      2014-02-06 17:35 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/langtools/rev/9d81ae1c417a

Added tag jdk8-b129 for changeset 8fe7202d3c38

! .hgtags


From lana.steuck at oracle.com  Tue Feb 11 13:40:13 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:13 +0000
Subject: hg: jdk8/tl/jaxws: 7 new changesets
Message-ID: <20140211214119.DF23C62B9C@hg.openjdk.java.net>

Changeset: c0040f0b75e2
Author:    katleman
Date:      2014-01-22 12:53 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/c0040f0b75e2

Added tag jdk8-b125 for changeset ef71ecbcd7bc

! .hgtags

Changeset: 7193a007a159
Author:    katleman
Date:      2014-01-22 14:07 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/7193a007a159

Merge

! .hgtags

Changeset: 3f682f2ea376
Author:    katleman
Date:      2014-01-24 15:08 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/3f682f2ea376

Added tag jdk8-b126 for changeset 7193a007a159

! .hgtags

Changeset: 8e46fe36e175
Author:    lana
Date:      2014-01-29 11:11 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/8e46fe36e175

Merge


Changeset: de172acc095b
Author:    katleman
Date:      2014-01-30 12:16 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/de172acc095b

Added tag jdk8-b127 for changeset 8e46fe36e175

! .hgtags

Changeset: aabc90596123
Author:    katleman
Date:      2014-02-01 18:21 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/aabc90596123

Added tag jdk8-b128 for changeset de172acc095b

! .hgtags

Changeset: 4195c0956930
Author:    katleman
Date:      2014-02-06 17:35 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxws/rev/4195c0956930

Added tag jdk8-b129 for changeset aabc90596123

! .hgtags


From lana.steuck at oracle.com  Tue Feb 11 13:40:17 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:17 +0000
Subject: hg: jdk8/tl/jaxp: 7 new changesets
Message-ID: <20140211214123.063DA62B9D@hg.openjdk.java.net>

Changeset: 6a5af8a36aaf
Author:    katleman
Date:      2014-01-22 12:53 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/6a5af8a36aaf

Added tag jdk8-b125 for changeset 83bb924238f8

! .hgtags

Changeset: 390cc275c04c
Author:    katleman
Date:      2014-01-22 14:07 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/390cc275c04c

Merge

! .hgtags

Changeset: 573c261a2025
Author:    katleman
Date:      2014-01-24 15:08 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/573c261a2025

Added tag jdk8-b126 for changeset 390cc275c04c

! .hgtags

Changeset: b68cdb63a70b
Author:    lana
Date:      2014-01-29 11:11 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/b68cdb63a70b

Merge


Changeset: b1839922f10c
Author:    katleman
Date:      2014-01-30 12:16 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/b1839922f10c

Added tag jdk8-b127 for changeset b68cdb63a70b

! .hgtags

Changeset: b7752cea7c81
Author:    katleman
Date:      2014-02-01 18:21 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/b7752cea7c81

Added tag jdk8-b128 for changeset b1839922f10c

! .hgtags

Changeset: 0cb0cd015218
Author:    katleman
Date:      2014-02-06 17:34 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jaxp/rev/0cb0cd015218

Added tag jdk8-b129 for changeset b7752cea7c81

! .hgtags


From lana.steuck at oracle.com  Tue Feb 11 13:40:24 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:24 +0000
Subject: hg: jdk8/tl/nashorn: 8 new changesets
Message-ID: <20140211214101.632E862B9B@hg.openjdk.java.net>

Changeset: d336209a0e45
Author:    katleman
Date:      2014-01-22 12:54 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/d336209a0e45

Added tag jdk8-b125 for changeset 7346abe2ea03

! .hgtags

Changeset: 095263db862d
Author:    katleman
Date:      2014-01-22 14:00 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/095263db862d

Merge

! .hgtags

Changeset: e2522604c7c9
Author:    katleman
Date:      2014-01-24 15:08 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/e2522604c7c9

Added tag jdk8-b126 for changeset 095263db862d

! .hgtags

Changeset: fdfbb745caf0
Author:    lana
Date:      2014-01-29 11:12 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/fdfbb745caf0

Merge


Changeset: 7dfde83426d1
Author:    katleman
Date:      2014-01-30 12:17 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/7dfde83426d1

Added tag jdk8-b127 for changeset fdfbb745caf0

! .hgtags

Changeset: 73cbad0c5d28
Author:    lana
Date:      2014-01-31 13:47 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/73cbad0c5d28

Merge


Changeset: 9cc3fd32fbab
Author:    katleman
Date:      2014-02-01 18:21 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/9cc3fd32fbab

Added tag jdk8-b128 for changeset 73cbad0c5d28

! .hgtags

Changeset: f87eba70e9ee
Author:    katleman
Date:      2014-02-06 17:35 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/nashorn/rev/f87eba70e9ee

Added tag jdk8-b129 for changeset 9cc3fd32fbab

! .hgtags


From lana.steuck at oracle.com  Tue Feb 11 13:40:17 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:17 +0000
Subject: hg: jdk8/tl/hotspot: 23 new changesets
Message-ID: <20140211214227.12C4F62B9F@hg.openjdk.java.net>

Changeset: 16e0c6c84a91
Author:    amurillo
Date:      2014-01-13 16:00 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/16e0c6c84a91

8031553: new hotspot build - hs25-b67
Reviewed-by: jcoomes

! make/hotspot_version

Changeset: 12ad8db39f76
Author:    roland
Date:      2014-01-14 09:44 +0100
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/12ad8db39f76

8028764: dtrace/hotspot_jni/ALL/ALL001 crashes the vm on Solaris-amd64, SIGSEGV in MarkSweep::follow_stack()+0x8a
Summary: C1 generates code to encode compressed oop into tmp register before runtime call for patching where GC may happen
Reviewed-by: iveresov, twisti, kvn
Contributed-by: mgerdin <mikael.gerdin at oracle.com>

! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp

Changeset: 8b81451dc7f7
Author:    twisti
Date:      2014-01-16 16:18 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/8b81451dc7f7

8022395: java.util.zip.ZipException: Not in GZIP format in JT_JDK/test/java/util/zip/GZIP tests
Reviewed-by: kvn, iveresov

! src/cpu/x86/vm/c1_LIRGenerator_x86.cpp

Changeset: 3585183c191a
Author:    amurillo
Date:      2014-01-17 20:24 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/3585183c191a

Merge


Changeset: 5df2666e4573
Author:    amurillo
Date:      2014-01-17 20:24 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/5df2666e4573

Added tag hs25-b67 for changeset 3585183c191a

! .hgtags

Changeset: 55ff9170e27d
Author:    katleman
Date:      2014-01-22 12:53 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/55ff9170e27d

Added tag jdk8-b125 for changeset df333ee12bba

! .hgtags

Changeset: c8218f1072a0
Author:    katleman
Date:      2014-01-22 14:07 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/c8218f1072a0

Merge

! .hgtags

Changeset: 9a11d5e679cf
Author:    katleman
Date:      2014-01-24 15:07 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/9a11d5e679cf

Added tag jdk8-b126 for changeset c8218f1072a0

! .hgtags

Changeset: c2106608358b
Author:    amurillo
Date:      2014-01-17 20:30 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/c2106608358b

8032015: new hotspot build - hs25-b68
Reviewed-by: jcoomes

! make/hotspot_version

Changeset: 709018897c81
Author:    vlivanov
Date:      2014-01-23 01:23 +0400
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/709018897c81

8031695: CHA ignores default methods during analysis leading to incorrect code generation
Reviewed-by: jrose, acorn, hseigel, lfoltan

! src/share/vm/code/dependencies.cpp
+ test/compiler/inlining/DefaultAndConcreteMethodsCHA.java

Changeset: f970454708b8
Author:    iveresov
Date:      2014-01-17 18:09 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/f970454708b8

8032207: C2: assert(VerifyOops || MachNode::size(ra_) <= (3+1)*4) failed: bad fixed size
Summary: Fix the sizing of loadUS2L_immI16 and loadI2L_immI
Reviewed-by: kvn, azeemj

! src/cpu/sparc/vm/sparc.ad
+ test/compiler/codegen/LoadWithMask.java

Changeset: 984401824c5e
Author:    iveresov
Date:      2014-01-21 20:05 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/984401824c5e

8031743: C2: loadI2L_immI broken for negative memory values
Summary: Restrict loadI2L_imm optimizations to positive values of mask
Reviewed-by: kvn, dlong

! src/cpu/sparc/vm/sparc.ad
! src/cpu/x86/vm/x86_32.ad
! src/cpu/x86/vm/x86_64.ad
+ test/compiler/codegen/LoadWithMask2.java

Changeset: d45454002494
Author:    amurillo
Date:      2014-01-23 13:37 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/d45454002494

Merge


Changeset: 2c564e329c87
Author:    amurillo
Date:      2014-01-23 13:37 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/2c564e329c87

Added tag hs25-b68 for changeset d45454002494

! .hgtags

Changeset: 58879cd9f8df
Author:    amurillo
Date:      2014-01-28 09:51 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/58879cd9f8df

Merge

! .hgtags

Changeset: 7e412f95e310
Author:    amurillo
Date:      2014-01-23 13:53 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/7e412f95e310

8032608: new hotspot build - hs25-b69
Reviewed-by: jcoomes

! make/hotspot_version

Changeset: 2185d483f5f8
Author:    kvn
Date:      2014-01-27 10:20 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/2185d483f5f8

8032566: Crash in JIT when running Scala compiler (and compiling Scala std lib)
Summary: Switch off EliminateAutoBox flag by default in jdk8 release.
Reviewed-by: iveresov

! src/share/vm/opto/c2_globals.hpp

Changeset: 32f017489ba5
Author:    amurillo
Date:      2014-01-28 15:00 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/32f017489ba5

Merge


Changeset: 1d8728efc05f
Author:    amurillo
Date:      2014-01-28 15:00 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/1d8728efc05f

Added tag hs25-b69 for changeset 32f017489ba5

! .hgtags

Changeset: 35038da7bb9d
Author:    lana
Date:      2014-01-29 11:11 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/35038da7bb9d

Merge


Changeset: 874c0b4a946c
Author:    katleman
Date:      2014-01-30 12:16 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/874c0b4a946c

Added tag jdk8-b127 for changeset 35038da7bb9d

! .hgtags

Changeset: cb39165c4a65
Author:    katleman
Date:      2014-02-01 18:21 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/cb39165c4a65

Added tag jdk8-b128 for changeset 874c0b4a946c

! .hgtags

Changeset: 1dbaf664a611
Author:    katleman
Date:      2014-02-06 17:34 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/hotspot/rev/1dbaf664a611

Added tag jdk8-b129 for changeset cb39165c4a65

! .hgtags


From lana.steuck at oracle.com  Tue Feb 11 13:40:47 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:47 +0000
Subject: hg: jdk8/tl/jdk: 14 new changesets
Message-ID: <20140211214445.3A9FE62BA0@hg.openjdk.java.net>

Changeset: 75cf17ceb6d1
Author:    katleman
Date:      2014-01-22 12:54 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/75cf17ceb6d1

Added tag jdk8-b125 for changeset ae303640bc1c

! .hgtags

Changeset: 95410515ba5f
Author:    katleman
Date:      2014-01-22 14:08 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/95410515ba5f

Merge

! .hgtags

Changeset: 91bce40d0347
Author:    alexsch
Date:      2014-01-23 20:36 +0400
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/91bce40d0347

8032063: javax.swing.plaf.metal.MetalFileChooserUI$FilterComboBoxModel extends non-standard API
Reviewed-by: pchelko, serb

! src/macosx/classes/com/apple/laf/AquaFileChooserUI.java
! src/share/classes/com/sun/java/swing/plaf/gtk/GTKFileChooserUI.java
! src/share/classes/com/sun/java/swing/plaf/motif/MotifFileChooserUI.java
! src/share/classes/com/sun/java/swing/plaf/windows/WindowsFileChooserUI.java
! src/share/classes/javax/swing/plaf/metal/MetalFileChooserUI.java
- src/share/classes/sun/swing/AbstractFilterComboBoxModel.java
! src/share/classes/sun/swing/plaf/synth/SynthFileChooserUIImpl.java

Changeset: 6935e7a3a7c9
Author:    amurillo
Date:      2014-01-23 14:46 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/6935e7a3a7c9

Merge


Changeset: a9088d517f2f
Author:    amurillo
Date:      2014-01-23 14:47 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/a9088d517f2f

Merge


Changeset: a635c394328c
Author:    katleman
Date:      2014-01-24 15:08 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/a635c394328c

Added tag jdk8-b126 for changeset a9088d517f2f

! .hgtags

Changeset: fbf251b8ef8a
Author:    lana
Date:      2014-01-29 11:11 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/fbf251b8ef8a

Merge


Changeset: f777d83e0433
Author:    katleman
Date:      2014-01-30 12:17 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/f777d83e0433

Added tag jdk8-b127 for changeset fbf251b8ef8a

! .hgtags

Changeset: f644211c59fd
Author:    lana
Date:      2014-01-31 13:47 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/f644211c59fd

Merge


Changeset: 3c9473004f38
Author:    katleman
Date:      2014-02-01 18:21 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/3c9473004f38

Added tag jdk8-b128 for changeset f644211c59fd

! .hgtags

Changeset: ab6e7bb8ff9f
Author:    pchelko
Date:      2014-01-22 16:15 +0400
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/ab6e7bb8ff9f

7155984: Security problems in regression test java/awt/PrintJob/Security/SecurityDialogTest.java
Reviewed-by: anthony, serb

! src/macosx/classes/apple/laf/JRSUIUtils.java

Changeset: eef10feca8ca
Author:    lana
Date:      2014-02-06 13:28 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/eef10feca8ca

Merge


Changeset: 80568a19aab7
Author:    lana
Date:      2014-02-06 13:29 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/80568a19aab7

Merge


Changeset: 43386cc9a017
Author:    katleman
Date:      2014-02-06 17:35 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/43386cc9a017

Added tag jdk8-b129 for changeset 80568a19aab7

! .hgtags


From daniel.daugherty at oracle.com  Tue Feb 11 14:03:23 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 11 Feb 2014 15:03:23 -0700
Subject: code review round 0 for ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52F9C523.3010403@oracle.com>
References: <52ED3F28.5070806@oracle.com>	<6C8A4CDA-B5D5-44E4-865E-A7205D533AFD@oracle.com>
	<52F5702C.50508@oracle.com> <52F84967.8040807@oracle.com>
	<52F920AD.6050705@oracle.com> <52F9C523.3010403@oracle.com>
Message-ID: <52FA9E2B.6040907@oracle.com>

On 2/10/14 11:37 PM, David Holmes wrote:
> Dan - thanks for an awesome analysis once again!

You're welcome.


> Yes mea culpa the code I was looking at was sans Mr Simms most 
> fortuitous change. (I'm suffering from repo overload these days.)

No apology needed. It's a good thing that you and Karen were both
looking at slightly older versions of the code. If we had not gone
down this road, we may not have noticed this bug when backporting
the fix to HSX-25 or older.


> As per my previous email I could see that a timed-wait, or spurious 
> wakeup, could take us to the potentially problematic SpinAcquire, 
> however I could not see how/if that thread could become _succ such 
> that an unpark directed to _succ could be consumed in the SpinAcquire 
> park().

Yes, a big part of my previous reply was fleshing out what you said
in an order that made sense to my brain. I should have made it more
clear that I was restating things to make sure I had it right.

Sorry, I forgot to address your successor question in my previous
reply. Once a thread that has called Object.wait() has been
notified, it is eligible to be made the successor by another
thread that has acquired and exited the monitor.

The window of opportunity for losing an unpark() opens when a
thread can be seen on either the entry-queue or the cx-queue. At
that point, any thread going through ExitEpilog can make the
target thread the successor and call unpark() on it. The call
to SpinAcquire() that we're discussing happens right in the
window, so the SpinAcquire() call in HSX-25 and older code can
consume the unpark().
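
To make the lost-unpark window concrete, here is a minimal Java sketch. It is only an analogy: it uses java.util.concurrent.locks.LockSupport permits rather than HotSpot's ParkEvent, and the class and method names (LostUnpark, parkAfterStolenPermit) are made up for illustration. An earlier, unrelated timed park eats the one wake-up that was meant for a later park, and redoing the unpark() restores progress.

```java
import java.util.concurrent.locks.LockSupport;

// Analogy only: LockSupport permits stand in for HotSpot's ParkEvent.
class LostUnpark {

    // Returns how long (in ms) the "real" park blocked after an unrelated
    // timed park had already run on the same thread.
    static long parkAfterStolenPermit(boolean redoUnpark) {
        Thread self = Thread.currentThread();

        LockSupport.unpark(self);               // successor wake-up: the permit is now set
        LockSupport.parkNanos(1_000_000L);      // unrelated 1 ms timed park: returns at once
                                                // and consumes the permit
        if (redoUnpark) {
            LockSupport.unpark(self);           // the defensive fix: issue the wake-up again
        }

        long start = System.nanoTime();
        LockSupport.parkNanos(200_000_000L);    // the park the wake-up was meant for
                                                // (bounded at 200 ms so the demo ends)
        return (System.nanoTime() - start) / 1_000_000L;
    }
}
```

Without the redo the second park normally runs to its full timeout, because the permit is gone. With the redo it returns almost immediately. A park may also return spuriously, so the timings are typical rather than guaranteed.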

Hopefully, this resolves the last of this review thread.

Dan


>
> Thanks again,
> David
>
> On 11/02/2014 4:55 AM, Daniel D. Daugherty wrote:
>> On 2/9/14 8:37 PM, David Holmes wrote:
>>> trimming content ...
>>>
>>> On 8/02/2014 9:45 AM, Daniel D. Daugherty wrote:
>>>> On 2/7/14 2:56 PM, Karen Kinnear wrote:
>>>>> 3. Did I read the code correctly that the Thread::SpinAcquire can make
>>>>> a timed park call on the same thread's _ParkEvent? And that this is
>>>>> used to get on and off the wait queue,
>>>>> i.e. to acquire the WaitSetLock?
>>>>>     Is there the same risk that a notify might be eaten here also?
>>>>
>>>> As far as I can see, Thread::SpinAcquire() does not use a ParkEvent
>>>
>>> It sure does:
>>>
>>> void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>>>   if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>>      return ;   // normal fast-path return
>>>   }
>>>
>>>   // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>>>   TEVENT (SpinAcquire - ctx) ;
>>>   int ctr = 0 ;
>>>   int Yields = 0 ;
>>>   for (;;) {
>>>      while (*adr != 0) {
>>>         ++ctr ;
>>>         if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>>            if (Yields > 5) {
>>>              // Consider using a simple NakedSleep() instead.
>>>              // Then SpinAcquire could be called by non-JVM threads
>>>              Thread::current()->_ParkEvent->park(1) ;
>>
>> Ummmm... that's not the code I'm seeing...
>>
>> src/share/vm/runtime/thread.cpp:
>>
>>    4417  void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>>    4418    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>    4419       return ;   // normal fast-path return
>>    4420    }
>>    4421
>>    4422    // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>>    4423    TEVENT (SpinAcquire - ctx) ;
>>    4424    int ctr = 0 ;
>>    4425    int Yields = 0 ;
>>    4426    for (;;) {
>>    4427       while (*adr != 0) {
>>    4428          ++ctr ;
>>    4429          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>    4430             if (Yields > 5) {
>>    4431               os::naked_short_sleep(1);
>>    4432             } else {
>>    4433               os::NakedYield() ;
>>    4434               ++Yields ;
>>    4435             }
>>    4436          } else {
>>    4437             SpinPause() ;
>>    4438          }
>>    4439       }
>>    4440       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
>>    4441    }
>>    4442  }
>>
>> Mr Simms recently changed the above code via:
>>
>> changeset:   5832:5944dba4badc
>> user:        dsimms
>> date:        Fri Jan 24 09:28:47 2014 +0100
>> summary:     8028280: ParkEvent leak when running modified runThese which only loads classes
>>
>> os::naked_short_sleep() is new:
>>
>> - BSD/MacOS X, Linux - uses nanosleep()
>> - Solaris - uses usleep()
>> - Windows - uses Sleep()
>>
>> The fix for 8028280 was pushed to JDK9/hs-rt on 2014.01.24 and to JDK9/hs
>> on 2014.01.29. I don't see any signs that Mr Simms' fix will be backported
>> to JDK8u/HSX-25u (yet) so this part of the review thread might impact the
>> backport of my fix to earlier releases.
>>
>>
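
For readers who want to experiment with the spin/yield/sleep escalation outside the VM, a rough Java analogue of the post-8028280 SpinAcquire follows. It is a sketch under assumptions, not the HotSpot code: Thread.onSpinWait() (Java 9+) stands in for SpinPause(), Thread.yield() for os::NakedYield(), Thread.sleep(1) for os::naked_short_sleep(1), and availableProcessors() for os::is_MP(). The class name and the demo() helper are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Rough analogue of the spin/yield/sleep strategy; not the HotSpot implementation.
class SpinAcquireSketch {
    private final AtomicInteger word = new AtomicInteger(0);   // 0 = free, 1 = held
    private final boolean multiProcessor =
            Runtime.getRuntime().availableProcessors() > 1;    // stands in for os::is_MP()
    private int counter = 0;                                   // guarded by this lock in demo()

    void acquire() {
        if (word.compareAndSet(0, 1)) {
            return;                                 // uncontended fast path
        }
        int ctr = 0;
        int yields = 0;
        for (;;) {
            while (word.get() != 0) {
                ++ctr;
                if ((ctr & 0xFFF) == 0 || !multiProcessor) {
                    if (yields > 5) {
                        sleepOneMilli();            // plain sleep: never touches a park permit
                    } else {
                        Thread.yield();
                        ++yields;
                    }
                } else {
                    Thread.onSpinWait();            // spin hint
                }
            }
            if (word.compareAndSet(0, 1)) {
                return;
            }
        }
    }

    void release() {
        word.set(0);
    }

    private static void sleepOneMilli() {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // keep the interrupt status for the caller
        }
    }

    // Four threads add 1000 each under the lock; returns the final count.
    static int demo() {
        SpinAcquireSketch lock = new SpinAcquireSketch();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    lock.acquire();
                    try {
                        lock.counter++;
                    } finally {
                        lock.release();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return lock.counter;
    }
}
```

The relevant design point is the blocking step: a plain sleep cannot swallow a wake-up aimed at the thread's park permit, whereas a timed park on a shared event can.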
>>> So considering Karen's question ... I can't tell for certain. :(
>>>
>>> I do not think the SpinAcquire on grabbing the wait-set lock to add to
>>> the wait-set can be an issue because we will only park in response to
>>> the actual wait, and hence only get unparked due to a
>>> notify/notifyAll, but at this point we still own the monitor so no
>>> notify/notifyAll is possible.
>>>
>>> However, for the removal from the wait-set a more complex analysis is
>>> needed. To do the SpinAcquire we must still be flagged as TS_WAIT -
>>> which means we have not been notified, but must be returning due to a
>>> timeout (or spurious wakeup?). In such circumstances could we be
>>> _succ? I don't think so but I'll leave it to Dan to confirm that part :)
>>
>> So for HSX-25 and probably older...
>>
>> There are four Thread::SpinAcquire() calls in the objectMonitor code:
>>
>>      Thread::SpinAcquire (&_WaitSetLock, "WaitSet - add") ;
>>      Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>>      Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notify") ;
>>      Thread::SpinAcquire (&_WaitSetLock, "WaitSet - notifyall") ;
>>
>> We can easily rule out the "notify" and "notifyAll" uses since the
>> current thread owns the Java-level monitor and there are no events
>> to post in this part of the notify() or notifyAll() protocols.
>>
>> For the "WaitSet - add" use, the current thread owns the Java-level
>> monitor and the thread has not been added as a waiter yet so another
>> thread cannot do the notify-exit-make-successor part of the protocol
>> yet.
>>
>> For the "WaitSet - unlink" use:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>>    1569       if (node.TState == ObjectWaiter::TS_WAIT) {
>>    1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>>    1571           if (node.TState == ObjectWaiter::TS_WAIT) {
>>    1572              DequeueSpecificWaiter (&node) ; // unlink from WaitSet
>>    1573              assert(node._notified == 0, "invariant");
>>    1574              node.TState = ObjectWaiter::TS_RUN ;
>>    1575           }
>>    1576           Thread::SpinRelease (&_WaitSetLock) ;
>>    1577       }
>>
>> It is the call on line 1570 above that gets us into this code:
>>
>> src/share/vm/runtime/thread.cpp:
>>
>>    4435  void Thread::SpinAcquire (volatile int * adr, const char * LockName) {
>>    4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>    4437       return ;   // normal fast-path return
>>    4438    }
>>    4439
>>    4440    // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
>>    4441    TEVENT (SpinAcquire - ctx) ;
>>    4442    int ctr = 0 ;
>>    4443    int Yields = 0 ;
>>    4444    for (;;) {
>>    4445       while (*adr != 0) {
>>    4446          ++ctr ;
>>    4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>    4448             if (Yields > 5) {
>>    4449               // Consider using a simple NakedSleep() instead.
>>    4450               // Then SpinAcquire could be called by non-JVM threads
>>    4451               Thread::current()->_ParkEvent->park(1) ;
>>    4452             } else {
>>    4453               os::NakedYield() ;
>>    4454               ++Yields ;
>>    4455             }
>>    4456          } else {
>>    4457             SpinPause() ;
>>    4458          }
>>    4459       }
>>    4460       if (Atomic::cmpxchg (1, adr, 0) == 0) return ;
>>    4461    }
>>    4462  }
>>
>> And the above code can consume the unpark() on line 4451.
>>
>> So how the heck do we get to line 1570???
>>
>> Well, the target thread would have to be both notified and unparked
>> to be executing this code path. When the notify() code runs, the
>> target of the notify() is changed from ObjectWaiter::TS_WAIT to
>> ObjectWaiter::TS_ENTER unless Knob_MoveNotifyee == 4. The default
>> for Knob_MoveNotifyee is 2, so we're in non-default mode here...
>>
>> Here are the Knob_MoveNotifyee policy values:
>>
>>     1717      if (Policy == 0) {       // prepend to EntryList
>>     1728      if (Policy == 1) {      // append to EntryList
>>     1744      if (Policy == 2) {      // prepend to cxq
>>     1760      if (Policy == 3) {      // append to cxq
>>
>> For Knob_MoveNotifyee == 4 (or higher), we use the old mechanism
>> where we just unpark the target thread and let it run. Part of
>> that code changes from ObjectWaiter::TS_WAIT to ObjectWaiter::TS_RUN.
>>
>> The code works the same for notifyAll() for the thread picked
>> to be notified. For the Knob_MoveNotifyee == 4 (or higher) case,
>> we just unpark all the waiters and we have a free-for-all.
>>
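
The Knob_MoveNotifyee policies described above can be sketched as a small Java model. This follows only the description in this message, not the HotSpot source: the class, the Waiter type and the two deques are hypothetical stand-ins, and details such as special handling of empty lists are deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of the notify policies as described in the message above.
class NotifyPolicySketch {
    enum TState { TS_WAIT, TS_ENTER, TS_RUN }

    static final class Waiter {
        final String name;
        TState state = TState.TS_WAIT;
        boolean unparked = false;

        Waiter(String name) {
            this.name = name;
        }
    }

    final Deque<Waiter> entryList = new ArrayDeque<>();
    final Deque<Waiter> cxq = new ArrayDeque<>();

    // Moves a notified waiter according to the policy value.
    void notifyWaiter(Waiter w, int policy) {
        switch (policy) {
            case 0:                          // prepend to EntryList
                entryList.addFirst(w);
                w.state = TState.TS_ENTER;
                break;
            case 1:                          // append to EntryList
                entryList.addLast(w);
                w.state = TState.TS_ENTER;
                break;
            case 2:                          // prepend to cxq (the default policy)
                cxq.addFirst(w);
                w.state = TState.TS_ENTER;
                break;
            case 3:                          // append to cxq
                cxq.addLast(w);
                w.state = TState.TS_ENTER;
                break;
            default:                         // 4 or higher: old mechanism, just unpark
                w.state = TState.TS_RUN;
                w.unparked = true;
                break;
        }
    }
}
```

In this model only policy 4 (or higher) ever leaves a notified waiter in a state other than TS_ENTER, which is why a notified thread does not normally reach the TS_WAIT check in the unlink path.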
>> So it looks like the code block from lines 1569-1577 is never
>> used... or is it? Well... you have to remember two things:
>>
>> 1) spurious unpark()
>> 2) timed wait()
>>
>> The caller might have called wait(0), but that doesn't mean that
>> the underlying park() mechanism won't have a spurious unpark().
>> Or better, the caller might have called wait(1) and be running
>> again after a millisecond.
>>
>> So in the HSX-25 and older system (i.e., without Mr Simms' fix for
>> 8028280), it is possible for this call:
>>
>>    1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>>
>> to consume the unpark(). The gauntlet that has to be traversed
>> to get to this call:
>>
>>    4451               Thread::current()->_ParkEvent->park(1) ;
>>
>> is impressive:
>>
>> - fast-path acquisition of the _WaitSetLock has to fail:
>>
>>    4436    if (Atomic::cmpxchg (1, adr, 0) == 0) {
>>    4437       return ;   // normal fast-path return
>>    4438    }
>>
>> - if the machine is a uniprocessor, then 6 os::NakedYield()
>>    call-loop-recheck attempts have to fail:
>>
>>    4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>    4448             if (Yields > 5) {
>>    4449               // Consider using a simple NakedSleep() instead.
>>    4450               // Then SpinAcquire could be called by non-JVM threads
>>    4451               Thread::current()->_ParkEvent->park(1) ;
>>    4452             } else {
>>    4453               os::NakedYield() ;
>>    4454               ++Yields ;
>>    4455             }
>>
>> - if the machine is a multi-processor, then 6 rounds of { 4095 SpinPause()
>>    attempts, 1 os::NakedYield() attempt } have to fail:
>>
>>    4446          ++ctr ;
>>    4447          if ((ctr & 0xFFF) == 0 || !os::is_MP()) {
>>    4448             if (Yields > 5) {
>>    4449               // Consider using a simple NakedSleep() instead.
>>    4450               // Then SpinAcquire could be called by non-JVM threads
>>    4451               Thread::current()->_ParkEvent->park(1) ;
>>    4452             } else {
>>    4453               os::NakedYield() ;
>>    4454               ++Yields ;
>>    4455             }
>>    4456          } else {
>>    4457             SpinPause() ;
>>    4458          }
>>
>> But it is possible. It is one of those once-in-a-blue moon type
>> windows where everything has to line up just so.
>>
>> So how do we address this issue in HSX-25 and possibly older?
>>
>> If Mr Simms fix for 8028280 is also backported, then there is no
>> issue. If it is not backported, then applying the fix for this
>> bug like so:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>>    1596       if (JvmtiExport::should_post_monitor_waited()) {
>>    1597         JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>    1598       }
>>
>>    1604       if (node._notified != 0 && _succ == Self) {
>>    1605         // In this part of the monitor wait-notify-reenter protocol it
>>    1606         // is possible (and normal) for another thread to do a fastpath
>>    1607         // monitor enter-exit while this thread is still trying to get
>>    1608         // to the reenter portion of the protocol.
>>    1609         //
>>    1610         // The ObjectMonitor was notified and the current thread is
>>    1611         // the successor which also means that an unpark() has already
>>    1612         // been done. The JVMTI_EVENT_MONITOR_WAITED event handler can
>>    1613         // consume the unpark() that was done when the successor was
>>    1614         // set because the same ParkEvent is shared between Java
>>    1615         // monitors and JVM/TI RawMonitors (for now).
>>    1616         //
>>    1617         // We redo the unpark() to ensure forward progress, i.e., we
>>    1618         // don't want all pending threads hanging (parked) with none
>>    1619         // entering the unlocked monitor.
>>    1620         node._event->unpark();
>>    1621       }
>>
>> Of course the line numbers for the "fix" would be different and the
>> comment would need to be updated to reflect that the:
>>
>>    1570           Thread::SpinAcquire (&_WaitSetLock, "WaitSet - unlink") ;
>>
>> call above could also consume an unpark(), but it should work.
>>
>> If you've read this far, then I'm impressed. If you've read this far
>> and only fallen asleep a couple of times, then I'm still impressed.
>>
>> Summary: I don't think we have an issue in JDK9, but we'll have to do
>>           the fix in JDK8/HSX25 and older a little differently.
>>
>> Dan
>>
>>
>>>
>>> David
>>> -----
>>>
>>>> at all. However, Thread::muxAcquire() does use a ParkEvent, but it
>>>> is a different ParkEvent. From src/share/vm/runtime/thread.hpp:
>>>>
>>>>    ParkEvent * _ParkEvent ;               // for synchronized()
>>>>    ParkEvent * _SleepEvent ;              // for Thread.sleep
>>>>    ParkEvent * _MutexEvent ;              // for native internal
>>>> Mutex/Monitor
>>>>    ParkEvent * _MuxEvent ;                // for low-level
>>>> muxAcquire-muxRelease
>>>>
>>>> So ObjectMonitor uses the _ParkEvent field and Thread::muxAcquire()
>>>> uses the _MuxEvent. There are some comments in thread.cpp about
>>>> how _MuxEvent could be eliminated and _ParkEvent shared, but I don't
>>>> think we ever want to go there.
>>>>
>>>> I also filed this RFE:
>>>>
>>>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>
>>>> just in case the Serviceability team wants to migrate JVM/TI 
>>>> RawMonitors
>>>> to a separate ParkEvent.
>>>>
>>>> Please let me know if you concur that I've resolved issue #3.
>>>>
>>>>
>>>>> If so, I wonder if we want this added unpark to not just be called if
>>>>> JVMTI_EVENT_MONITOR_WAITED
>>>>> is enabled?
>>>>
>>>> I don't think we need it, but I've noted its removal as a risk.
>>>>
>>>> Again, thanks for the review!
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> thanks,
>>>>> Karen
>>>>>
>>>>> On Feb 1, 2014, at 1:38 PM, Daniel D. Daugherty wrote:
>>>>>
>>>>>> Greetings,
>>>>>>
>>>>>> I have a fix ready for the following bug:
>>>>>>
>>>>>>     8028073 race condition in ObjectMonitor implementation causing
>>>>>> deadlocks
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8028073
>>>>>>
>>>>>> On the surface, this is a very simple fix that relocates a few
>>>>>> lines of
>>>>>> code, relocates and rewrites the comments associated with that code
>>>>>> and
>>>>>> adds several new comments.
>>>>>>
>>>>>> Of course, in reality, the issue is much more complicated, but I'm
>>>>>> hoping to make it easy for anyone not acquainted with this issue to
>>>>>> understand what's going on.
>>>>>>
>>>>>> Here are the JDK9 webrev URLs:
>>>>>>
>>>>>> OpenJDK:
>>>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>>>>
>>>>>> Oracle internal:
>>>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ 
>>>>>>
>>>>>>
>>>>>>
>>>>>> The simple summary:
>>>>>>
>>>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>>>>   it is possible for a JVM/TI monitor event handler to accidentally
>>>>>>   consume a ParkEvent.unpark() call meant for the Java Monitor layer
>>>>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-5030359
>>>>>> - it's the right fix, but it's in the wrong place
>>>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>>>>   event handler is called because it is that event handler
>>>>>>   that can cause the hang
>>>>>>
>>>>>>
>>>>>> Testing
>>>>>> -------
>>>>>>
>>>>>> - a new StressMonitorWait test has been created that reliably
>>>>>>   reproduces the hang in JDK[6789]; see the bug's gory details
>>>>>>   for the specific versions where the hang has been reproduced
>>>>>>   - the test reliably reproduces the hang in 5 seconds on my
>>>>>>     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>>>>     the hang reliably on other machines
>>>>>>   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>>>>     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>>>>     bits did not reproduce the hang
>>>>>> - JPRT test job
>>>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>>>>   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>>>>   and Windows-X64:
>>>>>>   - vm.quick
>>>>>>   - Kitchensink (bigapps)
>>>>>>   - Weblogic+medrec (bigapps)
>>>>>>   - runThese (bigapps)
>>>>>>
>>>>>>
>>>>>> The Gory Details Start Here
>>>>>> ---------------------------
>>>>>>
>>>>>> This is the old location of the block of code that's being moved:
>>>>>>
>>>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>>>
>>>>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>>>> <snip>
>>>>>> 1499    exit (true, Self) ;                    // exit the monitor
>>>>>> <snip>
>>>>>> 1513    if (node._notified != 0 && _succ == Self) {
>>>>>> 1514       node._event->unpark();
>>>>>> 1515    }
>>>>>>
>>>>>>
>>>>>> This is the new location of the block of code that's being moved:
>>>>>>
>>>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>>>
>>>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>>>> <snip>
>>>>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>>>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>>>>> <snip>
>>>>>> 1604        if (node._notified != 0 && _succ == Self) {
>>>>>> <snip>
>>>>>> 1620          node._event->unpark();
>>>>>> 1621        }
>>>>>>
>>>>>>
>>>>>> The Risks
>>>>>> ---------
>>>>>>
>>>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED
>>>>>>   event is enabled:
>>>>>>   - previously it was always executed
>>>>>>   - while the old code was not effective for the hang that is being
>>>>>>     fixed with this bug, it is possible that the old code prevented
>>>>>>     a different bug in the successor protocol from manifesting
>>>>>>   - thorough analysis of the successor protocol did not reveal a
>>>>>>     case where the old code was needed in the old location
>>>>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>>>>   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>>>>   - the successor protocol is complicated and the analysis could
>>>>>>     be wrong when certain options are used
>>>>>>   - comments were added to each location where a JVM/TI monitor
>>>>>>     event handler is called documenting why a fix like this one
>>>>>>     is not needed there
>>>>>>   - if the analysis is wrong, the new comments show where a new
>>>>>>     code change would be needed
>>>>>>
>>>>>>
>>>>>> The Scenario
>>>>>> ------------
>>>>>>
>>>>>> I've created a scenario that reproduces this hang:
>>>>>>
>>>>>> T1 - enters monitor and calls monitor.wait()
>>>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>>>>> T3 - enters and exits the monitor
>>>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>>>>
>>>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw
>>>>>> monitor.
>>>>>>
>>>>>> Here are the six events necessary to make this hang happen:
>>>>>>
>>>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>>>>>
>>>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>>>>
>>>>>> // KEY-EVENT-3a: T3 made T1 the successor
>>>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>>>>
>>>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>>>>
>>>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>>>>
>>>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
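
The key events above can be sketched in a few lines if the shared
ParkEvent is treated as a single permit, like a binary semaphore. This
is a deliberately simplified, single-threaded model: ToyParkEvent and
t1_makes_progress are hypothetical names, a real park() blocks instead
of returning a flag, and the node._notified / _succ conditions that
guard the real unpark() are left out.

```cpp
#include <cassert>

// Toy stand-in for a ParkEvent: a single permit, like a binary semaphore.
// Hypothetical type; the real ParkEvent blocks the calling thread.
struct ToyParkEvent {
  bool permit = false;
  void unpark() { permit = true; }
  // Consumes the permit if present. Returns false when a real park()
  // would have had to block because no unpark() was pending.
  bool park_would_return() {
    bool had_permit = permit;
    permit = false;
    return had_permit;
  }
};

// Walks the key events in order on one shared event and reports whether
// T1's final park() finds a permit. `redo_unpark` models the relocated fix.
inline bool t1_makes_progress(bool redo_unpark) {
  ToyParkEvent shared;  // used by both the Java monitor and the raw monitor
  shared.unpark();      // KEY-EVENT-2a: T3 unparks T1 as the successor
  // KEY-EVENT-2b: the handler's raw monitor wait consumes that permit.
  bool eaten = shared.park_would_return();
  assert(eaten);
  if (redo_unpark) {
    shared.unpark();    // the fix: redo the unpark() after the handler ran
  }
  // KEY-EVENT-5: T4 takes the quick exit path, so no other unpark() arrives.
  // KEY-EVENT-6: T1 parks; without a permit it would stay parked.
  return shared.park_would_return();
}
```

In this model t1_makes_progress(false) is false, which corresponds to
the hang, and t1_makes_progress(true) is true, which is why the redo of
the unpark() has to come after the event handler rather than before it.
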
>>>>>>
>>>>>>
>>>>>> This bug is intertwined with:
>>>>>>
>>>>>> - The ObjectMonitor successor protocol
>>>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI
>>>>>> RawMonitors
>>>>>>
>>>>>> There is a very long successor.notes attachment to JDK-8028073 that
>>>>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>>>>> for putting pretty much anyone to sleep.
>>>>>>
>>>>>> Since this hang reproduces back to JDK6, this bug is taking the
>>>>>> easily backported solution of moving the original fix to the right
>>>>>> location. The following new bug has been filed for possible future
>>>>>> work in this area by the Serviceability Team:
>>>>>>
>>>>>>     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>>>
>>>>>>
>>>>>> The Symptoms
>>>>>> ------------
>>>>>>
>>>>>> With intermittent hangs like this, it is useful to know what to look
>>>>>> for in order to determine if you are running into this issue:
>>>>>>
>>>>>> - if you aren't using a debugger or a profiler or some other
>>>>>>   JVM/TI agent, then this hang is not the same as yours
>>>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>>>>   event handler, then this hang is not the same as yours
>>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>>>>   JVM/TI RawMonitors, then this hang is not the same as yours
>>>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>>>>   back into Java code, then you might just be insane and this
>>>>>>   hang might be similar to yours. However, using a Java callback
>>>>>>   in an event handler is an even bigger problem/risk so fix that
>>>>>>   first.
>>>>>> - if you have one or more threads blocked like this and making no
>>>>>>   progress, then this hang might be the same as yours:
>>>>>>
>>>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>>    JavaThread state: _thread_blocked
>>>>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>>    JavaThread state: _thread_blocked
>>>>>>         at java.lang.Object.wait(Native Method)
>>>>>>         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>         at java.lang.Object.wait(Object.java:502)
>>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>>>>         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>>    JavaThread state: _thread_blocked
>>>>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>>    JavaThread state: _thread_blocked
>>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>>    JavaThread state: _thread_blocked
>>>>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>>>    JavaThread state: _thread_blocked
>>>>>>         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>>>>         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> Key symptoms in thread T1:
>>>>>>
>>>>>> - had the object locked:
>>>>>>
>>>>>>   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> - did an Object.wait():
>>>>>>
>>>>>>   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> - is blocked on reentry:
>>>>>>
>>>>>>   waiting for monitor entry [0xfffffd7fc0231000]
>>>>>>
>>>>>> Key symptoms in thread T2:
>>>>>>
>>>>>> - is blocked waiting to lock the object:
>>>>>>
>>>>>>   waiting for monitor entry [0xfffffd7fc0130000]
>>>>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>>> Key symptoms in thread T3:
>>>>>>
>>>>>> - is blocked waiting to lock the object:
>>>>>>
>>>>>>   waiting for monitor entry [0xfffffd7fc002f000]
>>>>>>   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>>>
>>>>
>>


From lana.steuck at oracle.com  Tue Feb 11 13:40:08 2014
From: lana.steuck at oracle.com (lana.steuck at oracle.com)
Date: Tue, 11 Feb 2014 21:40:08 +0000
Subject: hg: jdk8/tl: 8 new changesets
Message-ID: <20140211214010.B63CD62B98@hg.openjdk.java.net>

Changeset: 950921234b10
Author:    katleman
Date:      2014-01-22 12:53 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/rev/950921234b10

Added tag jdk8-b125 for changeset 790bbd46b201

! .hgtags

Changeset: 1b5d578f93ef
Author:    katleman
Date:      2014-01-22 14:06 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/rev/1b5d578f93ef

Merge

! .hgtags

Changeset: 9ccce5bf1b0e
Author:    katleman
Date:      2014-01-24 04:45 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/rev/9ccce5bf1b0e

Merge


Changeset: 4f8fa4724c14
Author:    katleman
Date:      2014-01-24 15:07 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/rev/4f8fa4724c14

Added tag jdk8-b126 for changeset 9ccce5bf1b0e

! .hgtags

Changeset: 2e2ffb9e4b69
Author:    lana
Date:      2014-01-29 11:11 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/rev/2e2ffb9e4b69

Merge


Changeset: 101e42de4686
Author:    katleman
Date:      2014-01-30 12:16 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/rev/101e42de4686

Added tag jdk8-b127 for changeset 2e2ffb9e4b69

! .hgtags

Changeset: 1e5fe8654913
Author:    katleman
Date:      2014-02-01 18:21 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/rev/1e5fe8654913

Added tag jdk8-b128 for changeset 101e42de4686

! .hgtags

Changeset: 839546caab12
Author:    katleman
Date:      2014-02-06 17:34 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/rev/839546caab12

Added tag jdk8-b129 for changeset 1e5fe8654913

! .hgtags


From serguei.spitsyn at oracle.com  Tue Feb 11 18:57:06 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 11 Feb 2014 18:57:06 -0800
Subject: 2-nd round RFR (S) 8032223: nsk/regression/b4663146 gets
	assert(SafepointSynchronize::is_at_safepoint()
	|| JvmtiEnv::is_thread_fully_suspended(get_thread(), false,
	&debug_bits))
In-Reply-To: <52F1514C.1050405@oracle.com>
References: <52F0CB50.3040703@oracle.com> <52F10BBA.8060504@oracle.com>
	<52F1514C.1050405@oracle.com>
Message-ID: <52FAE302.1040305@oracle.com>

On 2/4/14 12:45 PM, serguei.spitsyn at oracle.com wrote:
> Thanks, Dan!
> Some comments below.
>
> On 2/4/14 7:48 AM, Daniel D. Daugherty wrote:
>> On 2/4/14 4:13 AM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for:
>>>   https://bugs.openjdk.java.net/browse/JDK-8032223
>>>
>>>
>>> Open webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8032223-JVMTI-FRAME.1/ 
>>>
>>
>> src/share/vm/prims/jvmtiEnvBase.hpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiEnv.cpp
>>     No comments beyond David's tweak to the comment.
>
> Will fix it before push.
>
>> For future work...
>>
>> Looks like these VM ops also need the liveness check on the
>> target JavaThread:
>>
>>     VM_GetOwnedMonitorInfo
>>     VM_GetStackTrace
>>
>> JvmtiEnv::GetStackTrace() looks like it has the same
>> is_thread_fully_suspended() flaw...
>
> I know about this.
> There are even more issues:
>   GetCurrentContendedMonitor
>   GetOwnedMonitorStackDepthInfo
>   NotifyFramePop
>
> I'll check if we can use an existing bug to add this extra request.
> Otherwise, will file new one.

Filed this one:
   https://bugs.openjdk.java.net/browse/JDK-8034249

Thanks,
Serguei

>
> Thanks,
> Serguei
>
>>
>> Dan
>>
>>
>>
>>>
>>> Summary:
>>>
>>>   This is the second round of review for this issue.
>>>   It was decided that JDK-8032223 must be used to cover it instead
>>>   of JDK-6471769.
>>>   8032223 was initially closed as a dup of 6471769, but it has now
>>>   been reopened.
>>>
>>>   There is a general issue in the suspend equivalent condition 
>>> mechanism:
>>>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
>>> may return different results:
>>>     - 1-st: true
>>>     - 2-nd: false
>>>
>>>   This suspend equivalent issue is covered by another bug:
>>>     https://bugs.openjdk.java.net/browse/JDK-6280037
>>>
>>>   The bug to fix in this review is a specific manifestation of the 
>>> 6280037
>>>   in the JVMTI GetFrameCount() that has a major impact on the SQE 
>>> nightly.
>>>   It is on the Test Stabilization radar as well as the 6280037.
>>>   There are many tests intermittently failing because of this.
>>>   I've also decided to fix the same issue in the JVMTI 
>>> GetFrameLocation() as well.
>>>
>>>   The JVMTI GetFrameCount() spec says:
>>>     "If this function is called for a thread actively executing
>>>      bytecodes (for example, not the current thread and not suspended),
>>>      the information returned is transient."
>>>
>>>   So it is OK to call GetFrameCount() for a non-suspended target
>>>   thread.
>>>   To achieve safety, the frame count for non-suspended threads is
>>>   calculated at a safepoint.
>>>   It should be OK, and safer, to do the same for suspended threads
>>>   as well.
>>>   There should be no big performance impact because this is already
>>>   on a slow path.
>>>   It is still important to avoid safepointing when the target thread
>>>   is the current thread.
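
The path selection described in the summary above can be sketched as a
small decision function. This is only an illustration:
FrameCountPath, path_before and path_after are hypothetical names, and
path_before reflects my reading of the summary (a target reported as
fully suspended was walked directly), not the actual HotSpot code.

```cpp
#include <cassert>

enum class FrameCountPath { Direct, AtSafepoint };

// Assumed behavior before the change: a target that is the current thread
// or is reported as fully suspended is walked directly.
inline FrameCountPath path_before(bool is_current, bool reported_suspended) {
  return (is_current || reported_suspended) ? FrameCountPath::Direct
                                            : FrameCountPath::AtSafepoint;
}

// Behavior described in the summary: only the current thread is walked
// directly; every other target goes through a safepoint operation.
inline FrameCountPath path_after(bool is_current, bool reported_suspended) {
  (void)reported_suspended;  // no longer consulted, so a stale answer cannot matter
  return is_current ? FrameCountPath::Direct : FrameCountPath::AtSafepoint;
}
```

The point of the change in this model is that the "suspended" answer,
which can differ between two consecutive calls, no longer influences
which path is taken.
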
>>>
>>>   Bug 6280037 should come off the Test Stabilization radar
>>>   (remove the svc-nightly label), as most of the impacted tests
>>>   should be covered by 8032223.
>>>
>>>
>>> Testing:
>>>   In progress:
>>>     - nsk.jvmti, nsk.jdi, nsk.jdwp
>>>     - JTreg com/sun/jdi
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>
>


From volker.simonis at gmail.com  Wed Feb 12 06:28:50 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 12 Feb 2014 15:28:50 +0100
Subject: Project to improve hs_err files
In-Reply-To: <52FB73D0.1070402@oracle.com>
References: <cf2b0521-76b0-4cd8-88bc-b2986418f9be@default>
	<52FB73D0.1070402@oracle.com>
Message-ID: <CA+3eh11+-N0GjOt-sttPksSCdySd+5oYF3mZw=B+Wkmi0ZOjUA@mail.gmail.com>

On Wed, Feb 12, 2014 at 2:14 PM, Nils Eliasson <nils.eliasson at oracle.com> wrote:
> On 2013-09-06 13:32, Mattis Castegren wrote:
>
> ...
>
>
>
> * Bigger changes
>
> - Re-structure hs_err file ...
>
> Yes, please move the compiler related info together. Having the current
> compiler task separated from the compilation events with the heap regions
> causes a lot of scrolling.
>
>
>
> * Additional Data
>
> ...
>
>
> Make sure we read the OS-context when hitting asserts in debug-builds. Today
> we only get registers, code and stack on actual crashes. Very valuable info
> that is missing when investigating many test failures.
>

I'm afraid that will not be possible. For a crash (i.e. because of a
SIGSEGV) the kernel saves the context for the PC which caused the
problem. 'assert' and 'guarantee' are macros which evaluate an
expression and call report_vm_error if that expression was 'false'. I
think the best we could do is to call 'getcontext()' after the
evaluation of the expression, but that's not actually the context
where the evaluation of the expression failed. But I agree that it may
be helpful as well.
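
To make the getcontext() idea concrete, here is a minimal sketch,
assuming a POSIX system that provides <ucontext.h>. TOY_ASSERT,
toy_report and ToyFailure are hypothetical names standing in for the
real assert macro and report_vm_error; the sketch records the failure
instead of aborting.

```cpp
#include <cassert>
#include <string>
#include <ucontext.h>

// Hypothetical record of the most recent failed check.
struct ToyFailure {
  std::string expr;
  int line = 0;
  bool have_context = false;
};

static ToyFailure g_last_failure;
static int g_failure_count = 0;

// Stand-in for report_vm_error(): records the failure instead of aborting.
inline void toy_report(const char* expr, int line, const ucontext_t* ctx) {
  assert(expr != nullptr);
  g_last_failure.expr = expr;
  g_last_failure.line = line;
  g_last_failure.have_context = (ctx != nullptr);
  ++g_failure_count;
}

// The context is captured inside the macro, i.e. in the frame of the
// function that contains the check, but only after `cond` was evaluated.
#define TOY_ASSERT(cond)                                        \
  do {                                                          \
    if (!(cond)) {                                              \
      ucontext_t toy_ctx;                                       \
      bool toy_ok = (getcontext(&toy_ctx) == 0);                \
      toy_report(#cond, __LINE__, toy_ok ? &toy_ctx : nullptr); \
    }                                                           \
  } while (0)
```

As noted above, the registers captured this way describe the point
just after the expression was evaluated, not the instruction that
produced the failing value, so they are a hint rather than a crash
context.
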

And you should already get a complete stack trace for 'asserts' and
'guarantees' with the current implementation. Don't you see it?

Regards,
Volker

> //Nils

From yasu at ysfactory.dip.jp  Wed Feb 12 06:32:11 2014
From: yasu at ysfactory.dip.jp (Yasumasa Suenaga)
Date: Wed, 12 Feb 2014 23:32:11 +0900
Subject: JDK-7090324: gclog rotation via external tool
In-Reply-To: <52F22AE0.1000008@ysfactory.dip.jp>
References: <52483BDB.8040206@ysfactory.dip.jp>	<52496A21.8080608@ysfactory.dip.jp>	<527CDD56.7080106@ysfactory.dip.jp>	<52A09642.4030609@ysfactory.dip.jp>	<52E26FA2.40909@ysfactory.dip.jp>	<0E26045D-F7F9-49BC-AB36-A42C1DC6E64E@oracle.com>
	<52E91000.9010600@ysfactory.dip.jp> <52E91AAA.3060008@oracle.com>
	<52E9248D.2090108@ysfactory.dip.jp>
	<52E9FA5B.6010306@lab.ntt.co.jp>
	<5B189F0A-7408-47C7-9719-DC2990355209@oracle.com>
	<52EA33B3.6080409@lab.ntt.co.jp> <52F2151E.4040805@oracle.com>
	<52F229E0.3010400@ysfactory.dip.jp>
	<52F22AE0.1000008@ysfactory.dip.jp>
Message-ID: <52FB85EB.2090005@ysfactory.dip.jp>

Hi all,

I've uploaded new webrev:
http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.05/

Erik pointed out that my patch changes the current behavior of GCLogFileSize.

In the current implementation, the default value of GCLogFileSize is "0", and
if the user sets a non-zero value below 8K, the JVM adjusts it to 8K.


Below are the scenarios:

   1. -Xloggc:test.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3
        Should result in GCLogFileSize "0" (GC log rotation will be turned off)

   2. -Xloggc:test.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=10K
        Should result in GCLogFileSize 10K

   3. -Xloggc:test.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=2K
        Should result in GCLogFileSize 8K

Scenario 3 suggests that 8K already acts as the effective default for
GCLogFileSize, so I want to change the default value to 8K in globals.hpp.

I also want to treat "0" as a special value that means "rotate only on an
external trigger". As scenario 1 shows, if GCLogFileSize is "0",
UseGCLogFileRotation is set to false, and the definition of GCLogFileSize
in globals.hpp already documents "0" as "no rotation". So I think this
change does not alter the behavior of the current implementation.
------------------------
   product(uintx, GCLogFileSize, 0,                                          \
           "GC log file size (default: 0 bytes, no rotation). "              \
           "It requires UseGCLogFileRotation")                               \
------------------------
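
To make the three scenarios easier to check, here is a minimal sketch
of the adjustment they describe. It models only what is stated in this
mail: a size of 0 turns rotation off, and a non-zero size below 8K is
raised to 8K. GcLogRotation and resolve_rotation are hypothetical
names, not HotSpot code, and the other consistency checks (log file
name, NumberOfGCLogFiles) are left out.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t K = 1024;

// Hypothetical result type: what the VM ends up using.
struct GcLogRotation {
  bool enabled;
  std::uint64_t file_size;
};

// Models only the behavior described in the three scenarios:
// a size of 0 turns rotation off, and a non-zero size below 8K
// is raised to 8K.
inline GcLogRotation resolve_rotation(bool use_rotation,
                                      std::uint64_t requested_size) {
  if (!use_rotation || requested_size == 0) {
    return {false, requested_size};
  }
  std::uint64_t size = requested_size < 8 * K ? 8 * K : requested_size;
  assert(size >= 8 * K);
  return {true, size};
}
```

With this model, resolve_rotation(true, 0) disables rotation
(scenario 1), 10 * K is kept as is (scenario 2), and 2 * K becomes
8 * K (scenario 3).
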


Could you review this ?


Thanks,

Yasumasa


On 02/05/2014 09:13 PM, Yasumasa Suenaga wrote:
> Sorry, I forgot to paste URL of new webrev :-P
> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.04/
>
>
> Yasumasa
>
> On 02/05/2014 09:09 PM, Yasumasa Suenaga wrote:
>> Hi Erik,
>>
>> Thank you for reviewing again!
>> I've updated new webrev.
>>
>> On 02/05/2014 07:40 PM, Erik Helin wrote:
>>> Hi Yasumasa,
>>>
>>> I've looked through the latest patch, it is much better! I just have 
>>> two comments:
>>>
>>> - ostream.hpp:
>>>   Why did you add GCLogFileSize != 0 in should_rotate? The old check
>>>   just checked that _bytes_written > GCLogFileSize.
>>
>> The default value of GCLogFileSize is "0" in globals.hpp, so without
>> that check should_rotate() would return true at any time.
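
The effect of the GCLogFileSize != 0 check discussed here can be
sketched as follows. Both functions are hypothetical free-standing
versions (the real check is a member of gcLogFileStream and reads the
flag directly); the limit is passed in as a parameter so the two
variants can be compared.

```cpp
#include <cassert>
#include <cstdint>

// Size check without a special case for 0, as in the old comparison.
inline bool over_limit_unguarded(std::uint64_t bytes_written,
                                 std::uint64_t limit) {
  return bytes_written > limit;
}

// Guarded variant: a limit of 0 means "no size-based rotation".
// `force` stands for an explicit request such as the jcmd trigger.
inline bool should_rotate(bool force, std::uint64_t bytes_written,
                          std::uint64_t limit) {
  return force || (limit != 0 && bytes_written > limit);
}
```

With a limit of 0, the unguarded comparison is true as soon as a single
byte has been written, while the guarded version only rotates when
forced.
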
>>
>>
>>> - TestGCLogRotationViaJcmd.java:
>>>   Could you use the helper class JDKToolLauncher to start jcmd? The
>>>   code would then be slightly easier to read:
>>>
>>> for (int times = 1; times < NUM_LOGS; times++) {
>>>     // Run jcmd <pid> GC.rotate_log
>>>     JDKToolLauncher jcmd = JDKToolLauncher.create("jcmd")
>>>                                           .addToolArg(pid)
>>>                                           .addToolArg("GC.rotate_log");
>>>     ProcessBuilder pb = new ProcessBuilder(jcmd.getCommand());
>>>
>>>     // Make sure we didn't crash
>>>     OutputAnalyzer output = new OutputAnalyzer(pb.start());
>>>     output.shouldHaveExitValue(0);
>>> }
>>
>> I've fixed. Could you check the patch?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> Thanks,
>>> Erik
>>>
>>> On 01/30/2014 12:12 PM, Yasumasa Suenaga wrote:
>>>> Hi Staffan,
>>>>
>>>> I've uploaded new webrev:
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.03/
>>>>
>>>> On 2014/01/30 17:23, Staffan Larsen wrote:
>>>>> Would it be possible for the Diagnostic Command to output the 
>>>>> location
>>>>> of the rotated log? When invoking the command it would be good to get
>>>>> some kind of feedback.
>>>>
>>>> I changed rotate_log() to redirect messages to jcmd.
>>>> If GC.rotate_log is executed, we can get messages on jcmd console 
>>>> as below:
>>>> ------------
>>>> $ jcmd 18976 GC.rotate_log
>>>> 18976:
>>>> 2014-01-30 19:59:39 GC log rotation request has been received. Saved as test.log.0
>>>> 2014-01-30 19:59:39 GC log file created test.log.1
>>>> ------------
>>>>
>>>>
>>>>> test/gc/7090324/Test7090324.java:
>>>>> - I think this needs to have the Oracle copyright notice as well.
>>>>> - Tests should now use descriptive names, not bug numbers:
>>>>> https://wiki.openjdk.java.net/display/HotSpot/Naming+HotSpot+JTReg+Tests 
>>>>>
>>>>> - nits: lots of missing spaces before '{', and after 'for', 'if'
>>>>> - line 47: you don't need to clean up old files, jtreg will give
>>>>> you a fresh scratch directory to run in
>>>>
>>>> I've fixed.
>>>> Could you review again?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>> /Staffan
>>>>>
>>>>>
>>>>>
>>>>> On 30 jan 2014, at 08:08, Yasumasa
>>>>> Suenaga<suenaga.yasumasa at lab.ntt.co.jp>  wrote:
>>>>>
>>>>>> Hi Erik, Staffan,
>>>>>>
>>>>>> I've uploaded new webrev. Could you review this ?
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.02/
>>>>>>
>>>>>> This patch includes fixes from comments of Staffan and Erik.
>>>>>>
>>>>>> And I created new test of this patch as Test7090324 .
>>>>>> This test works fine with jtreg.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>> On 2014/01/30 0:55, Yasumasa Suenaga wrote:
>>>>>>> Hi Erik,
>>>>>>>
>>>>>>> On 2014/01/30 0:13, Erik Helin wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> (have to use HTML email to get a width of more than 78 chars, 
>>>>>>>> sorry)
>>>>>>>>
>>>>>>>> why did you change the code in arguments.cpp in the method
>>>>>>>> check_gc_log_consistency?
>>>>>>>
>>>>>>> In current implementation, check_gclog_consistency() checks three
>>>>>>> parameters:
>>>>>>>
>>>>>>> - GC log filename
>>>>>>> - NumberOfGCLogFiles
>>>>>>> - GCLogFileSize
>>>>>>>
>>>>>>> My customer uses "ONLY" an external trigger for rotating logs.
>>>>>>> In that case, GCLogFileSize is not needed.
>>>>>>>
>>>>>>>
>>>>>>>> Next, the gcLogFileStream::rotate_log method now does a lot of 
>>>>>>>> things.
>>>>>>>> Could you separate out the first block into a new method,
>>>>>>>> gcLogFileStream::should_rotate(bool force)?
>>>>>>>>
>>>>>>>> This way, the code would read:
>>>>>>>>
>>>>>>>>> bool gcLogFileStream::should_rotate(bool force) {
>>>>>>>>> return force || _bytes_written >= GCLogFileSize;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> void gcLogFileStream::rotate_log(bool force) {
>>>>>>>>> char time_msg[FILENAMEBUFLEN];
>>>>>>>>> char time_str[EXTRACHARLEN];
>>>>>>>>> char current_file_name[FILENAMEBUFLEN];
>>>>>>>>> char renamed_file_name[FILENAMEBUFLEN];
>>>>>>>>>
>>>>>>>>> if (!should_rotate(force)) {
>>>>>>>>> return;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> ...
>>>>>>>>> }
>>>>>>>>
>>>>>>>> Could you please update your patch?
>>>>>>>
>>>>>>> I will do that.
>>>>>>>
>>>>>>>
>>>>>>>> There is a new empty line in the rotate_log method:
>>>>>>>>
>>>>>>>>> }
>>>>>>>>> +
>>>>>>>>> #ifdef ASSERT
>>>>>>>>
>>>>>>>> could you please remove it?
>>>>>>>
>>>>>>> I will do that.
>>>>>>>
>>>>>>>
>>>>>>>> The logging change in rotate_log uses a different kind of if/else
>>>>>>>> syntax
>>>>>>>> than the rest of the file:
>>>>>>>>
>>>>>>>>> if (force) {
>>>>>>>>> ...
>>>>>>>>> }
>>>>>>>>> else {
>>>>>>>>> ...
>>>>>>>>> }
>>>>>>>>
>>>>>>>> The other if/else statements in the file uses:
>>>>>>>>
>>>>>>>>> if (cond) {
>>>>>>>>> ...
>>>>>>>>> } else {
>>>>>>>>> ...
>>>>>>>>> }
>>>>>>>>
>>>>>>>> Could you please update your change to use the same if/else 
>>>>>>>> syntax?
>>>>>>>
>>>>>>> I will do that.
>>>>>>>
>>>>>>>
>>>>>>>> This part of the change duplicates the code:
>>>>>>>>
>>>>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log rotation
>>>>>>>> request has been received. Saved as %s\n",
>>>>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>>> + renamed_file_name);
>>>>>>>> + }
>>>>>>>> + else {
>>>>>>>> + jio_snprintf(time_msg, sizeof(time_msg), "%s GC log file has
>>>>>>>> reached the"
>>>>>>>> " maximum size. Saved as %s\n",
>>>>>>>> - os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>>> + os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>>> renamed_file_name);
>>>>>>>>
>>>>>>>> Could you instead just change the message, as in:
>>>>>>>>
>>>>>>>>> const char* msg = force ?
>>>>>>>>> "%s GC log rotation request has been received. Saved as %s\n" :
>>>>>>>>> "%s GC log file has reached the maximum size. Saved as %s\n";
>>>>>>>>> jio_snprintf(time_msg, sizeof(time_msg), msg,
>>>>>>>>> os::local_time_string((char *)time_str, sizeof(time_str)),
>>>>>>>>> renamed_file_name);
>>>>>>>
>>>>>>> I will do that.
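
The "pick the format string, then format once" suggestion above can be
tried outside the VM with a small sketch. It uses std::snprintf in
place of jio_snprintf, and rotation_message is a hypothetical helper
that takes the timestamp and the renamed file name as plain strings.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Selects the message text first and formats once, so the two cases
// share a single formatting call instead of duplicating it.
inline std::string rotation_message(bool force, const char* time_str,
                                    const char* renamed_file) {
  assert(time_str != nullptr && renamed_file != nullptr);
  const char* fmt = force
      ? "%s GC log rotation request has been received. Saved as %s\n"
      : "%s GC log file has reached the maximum size. Saved as %s\n";
  char buf[512];
  std::snprintf(buf, sizeof(buf), fmt, time_str, renamed_file);
  return std::string(buf);
}
```

For example, rotation_message(true, "2014-01-30 19:59:39",
"test.log.0") produces the "rotation request has been received" line
for test.log.0.
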
>>>>>>>
>>>>>>>
>>>>>>>> The declaration of rotate_log in ostream.hpp still uses the old
>>>>>>>> variable name is_force, it should use force,
>>>>>>>> just as the definition.
>>>>>>>
>>>>>>> Sorry, I will fix it.
>>>>>>>
>>>>>>>
>>>>>>>> Finally, could you add a test that tests your change? Have a look
>>>>>>>> at the other tests
>>>>>>>> in hotspot/test/gc to see how you can do it
>>>>>>>> (you might want to use some functionality from
>>>>>>>> hotspot/test/testlibrary).
>>>>>>>
>>>>>>> I found three tests as following:
>>>>>>>
>>>>>>> [ysuenaga at xelvis test]$ find . -iname "*jcmd*"
>>>>>>> ./runtime/NMT/JcmdWithNMTDisabled.java
>>>>>>> ./runtime/NMT/JcmdScale.java
>>>>>>> ./gc/TestG1ZeroPGCTJcmdThreadPrint.java
>>>>>>>
>>>>>>> I understand that these tests check the output (stdout/stderr) with
>>>>>>> OutputAnalyzer.
>>>>>>> However, my patch affects the target VM, so I guess the existing
>>>>>>> tests cannot check that GC log rotation succeeded.
>>>>>>>
>>>>>>> Should I write a test which checks the exit value of jcmd?
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Erik
>>>>>>>>
>>>>>>>> On 2014-01-29 15:28, Yasumasa Suenaga wrote:
>>>>>>>>> Hi Staffan,
>>>>>>>>>
>>>>>>>>> Thank you for reviewing!
>>>>>>>>> I've uploaded new webrev.
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.01/
>>>>>>>>>
>>>>>>>>> On 2014/01/29 20:56, Staffan Larsen wrote:
>>>>>>>>>> Yasumasa,
>>>>>>>>>>
>>>>>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>>>>>> no comments
>>>>>>>>>>
>>>>>>>>>> src/share/vm/runtime/safepoint.cpp
>>>>>>>>>> I was surprised that gc log size was checked after each safe
>>>>>>>>>> point. That seems an unnecessary burden to place on a safe point.
>>>>>>>>>> Instead we should switch to a periodic task that checks the gc
>>>>>>>>>> log size. However, this is unrelated to your patch, so please
>>>>>>>>>> ignore for now.
>>>>>>>>>
>>>>>>>>> Agreed.
>>>>>>>>> However, I think that a PeriodicTask is not appropriate for this
>>>>>>>>> either.
>>>>>>>>>
>>>>>>>>> The GC log file only grows when a GC occurs.
>>>>>>>>> So I think the rotate function should be called at the entry of each
>>>>>>>>> GC event,
>>>>>>>>> e.g. VM_GC_Operation::doit_prologue() etc...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>>>>> line 402: nit: missing space before {
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> line 405: I think "force" is a better name than "is_force"
>>>>>>>>>
>>>>>>>>> I removed "force" option from DCmd.
>>>>>>>>> So I removed this from VMOperation.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/share/vm/services/diagnosticCommand.cpp
>>>>>>>>>> line 666: What does this do without the -force option? It looks
>>>>>>>>>> to me that the non-force case will happen after each safe point
>>>>>>>>>> (see above) and that there is no need to ever do this from a
>>>>>>>>>> diagnostic command. Can we remove the option?
>>>>>>>>>
>>>>>>>>> Indeed.
>>>>>>>>> I removed "force" option.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> line 677: "Target VM does not support GC log file rotation."
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> nits: some missing spaces before '{' and after 'if'
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/share/vm/services/diagnosticCommand.hpp
>>>>>>>>>> I think RotateGCLogDCmd should require the "control" permission
>>>>>>>>>> when executed via JMX, so please add:
>>>>>>>>>> static const JavaPermission permission() {
>>>>>>>>>> JavaPermission p = {"java.lang.management.ManagementPermission",
>>>>>>>>>> "control", NULL};
>>>>>>>>>> return p;
>>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Added.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> line 394: Maybe "Force the GC log file to be rotated." is a
>>>>>>>>>> better description?
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/share/vm/utilities/ostream.cpp
>>>>>>>>>> line 662: I think "force" is a better name than "is_force"
>>>>>>>>>> line 668: The comment says exactly the same thing as the code so
>>>>>>>>>> I think it can be skipped
>>>>>>>>>> line 671: "GC log file rotation occurs by external trigger
>>>>>>>>>> ONLY."
>>>>>>>>>> line 675: "not need" ->  "no need"
>>>>>>>>>> line 718: "GC log rotation request has been received"
>>>>>>>>>
>>>>>>>>> Fixed them.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/share/vm/utilities/ostream.hpp
>>>>>>>>>> no comments
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> /Staffan
>>>>>>>>>>
>>>>>>>>>> On 24 jan 2014, at 14:50, Yasumasa
>>>>>>>>>> Suenaga<yasu at ysfactory.dip.jp>  wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I've created webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-7090324/webrev.00/
>>>>>>>>>>>
>>>>>>>>>>> This patch works fine on current jdk9/hs-rt .
>>>>>>>>>>> Could you review this?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I am just an Author. So I need a sponsor.
>>>>>>>>>>> Could you help me?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Please cooperate.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2013/12/06 0:05, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Did someone read my email?
>>>>>>>>>>>> I really hope to merge "JDK-7090324: gclog rotation via
>>>>>>>>>>>> external tool" .
>>>>>>>>>>>>
>>>>>>>>>>>> I hear that some people need this RFE, so I would like to discuss
>>>>>>>>>>>> it.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>> On 2013/11/08 21:47, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Did someone read my mail?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think that this RFE helps us to watch Java heap on
>>>>>>>>>>>>> production system.
>>>>>>>>>>>>> Also I think this RFE is able to be part of the JEP 158
>>>>>>>>>>>>> (Unified JVM Logging) .
>>>>>>>>>>>>>
>>>>>>>>>>>>> I want to update this RFE in JDK Bug System, but I don't have
>>>>>>>>>>>>> account.
>>>>>>>>>>>>> So I've posted email at first.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2013/09/30 21:10, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> In previous email, I've attached new patch for this RFE.
>>>>>>>>>>>>>> It works fine with current hsx.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2013/09/29 23:40, Yasu wrote:
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We use the "logrotate" tool on RHEL to rotate various logs.
>>>>>>>>>>>>>>> Current HotSpot can rotate the GC log based on log size,
>>>>>>>>>>>>>>> but I need to rotate the GC log in sync with the logrotate
>>>>>>>>>>>>>>> tool.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So I created an RFE, "JDK-7090324: gclog rotation via
>>>>>>>>>>>>>>> external tool".
>>>>>>>>>>>>>>> A Sr. Engineering Manager at Oracle said he would use the
>>>>>>>>>>>>>>> essence of my patch in one of the jcmd subcommands.
>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-September/003274.html 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Two years ago, I posted a patch for this RFE,
>>>>>>>>>>>>>>> but that patch is too old to apply to current HotSpot.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Last month, a similar discussion appeared on the mailing list,
>>>>>>>>>>>>>>> so I think it's time to discuss this RFE.
>>>>>>>>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-August/008029.html 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please cooperate.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>
>>>>
>>>
>>
>


From shanliang.jiang at oracle.com  Wed Feb 12 09:04:50 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Wed, 12 Feb 2014 18:04:50 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>
	<52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com>
	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>
Message-ID: <52FBA9B2.6090502@oracle.com>

Staffan Larsen wrote:
> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>
>  Note that a target VM launched by a launching connector is not
>  guaranteed to be stable until after the VMStartEvent has been
>  received.
>   
I may be missing something here, but I believe VMStartEvent must be the
first event: by the time the test got ClassPrepareEvent, it must already
have received VMStartEvent.
> I think adding code that looks something like this will make the test stable:
>
>     VirtualMachine vm = launchTarget(CLASS_NAME);
>     EventQueue eventQueue = vm.eventQueue();
>
>     boolean started = false;
>     while(!started) {
>       EventSet eventSet = eventQueue.remove();
>       for (Event event : eventSet) {
>         if (event instanceof VMStartEvent) {
>           started = true;
>         }
>         if (event instanceof VMDeathEvent
>             || event instanceof VMDisconnectEvent) {
>           throw new Error("VM died before it started...:"+event);
>         }
>       }
>     }
>
>     System.out.println("Vm launched");
>   
The code you proposed could improve the test: it makes sure that
TestPostFieldModification has started. But I am afraid it does not
address the issue causing the failure. I believe the issue was that
TestPostFieldModification exited before or while FieldMonitor was calling
addFieldWatch(), which is why addFieldWatch() received
VMDisconnectedException. While the test was handling ClassPrepareEvent,
any VMDeathEvent or VMDisconnectEvent would still have been waiting in
the eventQueue, because it arrived after ClassPrepareEvent.

My fix is to not allow TestPostFieldModification to exit before
addFieldWatch() is done.
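
To make the idea concrete, here is a minimal sketch of that handshake. It is
not the code from the webrev: the class and method names are made up for
illustration, and it assumes the debugger passes the debuggee's stdin (e.g.
vm.process().getOutputStream()) while the debuggee passes System.in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only; not part of the actual test.
public final class ExitHandshakeSketch {
    private ExitHandshakeSketch() {}

    /** Debugger side: tell the debuggee that it may exit now. */
    public static void signalExit(OutputStream debuggeeStdin) {
        try {
            debuggeeStdin.write('q');  // any single byte works as the signal
            debuggeeStdin.flush();     // push it out of the debugger's buffer
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Debuggee side: block until one byte arrives.
     * Returns the byte, or -1 on end of stream or I/O error.
     */
    public static int waitForExitSignal(InputStream stdin) {
        try {
            return stdin.read();
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for the pipe between debugger and debuggee.
        ByteArrayOutputStream pipe = new ByteArrayOutputStream();
        signalExit(pipe);
        int signal = waitForExitSignal(new ByteArrayInputStream(pipe.toByteArray()));
        System.out.println("debuggee released by byte: " + (char) signal);
    }
}
```

In the test itself, FieldMonitor would call signalExit() only after
addFieldWatch() has returned, and TestPostFieldModification would call
waitForExitSignal(System.in) as its last statement.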
>   
> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>   
I do not know why the test called addFieldWatch() before ClassPrepareEvent
had been received, but yes, the returned list was empty, so I agree to
remove it.
> While you are in there you can also remove the unused StringBuffer near the top of main().
>   
Yes it was already removed in version 01

Here is the new webrev:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/

Thanks,
Shanliang
>  
> Thanks,
> /Staffan
>
> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>
>   
>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>
>> Thanks,
>> Shanliang
>>
>> shanliang wrote:
>>     
>>> shanliang wrote:
>>>       
>>>> Jaroslav Bachorik wrote:
>>>>         
>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>           
>>>>>> Staffan Larsen wrote:
>>>>>>             
>>>>>>> Hi Shanliang,
>>>>>>>
>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>               
>>>>>> I am not a JDI expert, so I may be missing something here. I checked the
>>>>>> failure trace and saw that the reported exception happened when FieldMonitor
>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>> call "vm.resume()" before handling events.
>>>>>>             
>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>           
>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>   /**
>>>>    * Continues the execution of the application running in this
>>>>    * virtual machine. All threads are resumed as documented in
>>>>    * {@link ThreadReference#resume}.
>>>>    *
>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>    *
>>>>    * @see #suspend
>>>>    */
>>>>   void resume();
>>>> My understanding is that the debuggee resumes work after this call, rather than dying?
>>>>         
>>> In fact, this is the problem: the VM (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>
>>> Shanliang
>>>       
>>>>>> I reproduced the bug by adding sleep(1000) after vm.resume() but before
>>>>>> calling eventQueue.remove();
>>>>>>             
>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder whether it would be better to use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>           
>>>> It could work. Creating a file in the test directory should not cause any issue, though it may perform worse.
>>>>
>>>> Thanks,
>>>> Shanliang
>>>>         
>>>>> Cheers,
>>>>>
>>>>> -JB-
>>>>>
>>>>>           
>>>>>> Thanks,
>>>>>> Shanliang
>>>>>>             
>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>> cause the failure in the bug report, though.
>>>>>>>
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>
>>>>>>>               
>>>>>>>> Hi ,
>>>>>>>>
>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>> "addFieldWatch" before the vm to monitor (TestPostFieldModification)
>>>>>>>> had already ended.
>>>>>>>>
>>>>>>>> So we should make sure that TestPostFieldModification exits only after
>>>>>>>> FieldMonitor has done what is necessary. The solution proposed here is that
>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>
>>>>>>>> web:
>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>
>>>>>>>> bug:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>>                 
>>>>>>             
>
>   


From shanliang.jiang at oracle.com  Wed Feb 12 09:06:03 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Wed, 12 Feb 2014 18:06:03 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FA8E2C.3050605@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>
	<52FA5E51.3040704@oracle.com> <52FA8E2C.3050605@oracle.com>
Message-ID: <52FBA9FB.1040002@oracle.com>

serguei.spitsyn at oracle.com wrote:
> The fix looks good.
> But could you change "impossible" at line 45 to something more 
> adequate, i.e. "caught exception"? :
>
>   41         System.out.println("---TestPostFieldModification-run waiting to exit ...");
>   42         try {
>   43             System.in.read();
>   44         } catch (Exception e) {
>   45             System.out.println("---TestPostFieldModification-run impossible? "+e);
>   46             e.printStackTrace();
>   47         }
>
Done.
Thanks for reviewing.
Shanliang
>
> Thanks,
> Serguei
>
>
> On 2/11/14 9:30 AM, shanliang wrote:
>> Here is the new fix in which FieldMonitor will write to 
>> TestPostFieldModification, to inform the latter to quit, as suggested 
>> by Jaroslav
>>    http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>
>> Thanks,
>> Shanliang
>>
>> shanliang wrote:
>>> shanliang wrote:
>>>> Jaroslav Bachorik wrote:
>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>> Staffan Larsen wrote:
>>>>>>> Hi Shanliang,
>>>>>>>
>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>> I am not a JDI expert, so I may be missing something here. I checked the
>>>>>> failure trace and saw that the reported exception happened when FieldMonitor
>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>> call "vm.resume()" before handling events.
>>>>>
>>>>> AFAICS, calling vm.resume() results in an almost immediate 
>>>>> debuggee death. The gc() invoking thread "d" is flagged as a 
>>>>> daemon and as such doesn't prevent the process from exiting. The
>>>>> other thread is not a daemon but will finish in only a few cycles.
>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the 
>>>> Javadoc of the method "resume":
>>>>    /**
>>>>     * Continues the execution of the application running in this
>>>>     * virtual machine. All threads are resumed as documented in
>>>>     * {@link ThreadReference#resume}.
>>>>     *
>>>>     * @throws VMCannotBeModifiedException if the VirtualMachine is 
>>>> read-only - see {@link VirtualMachine#canBeModified()}.
>>>>     *
>>>>     * @see #suspend
>>>>     */
>>>>    void resume();
>>>> My understanding is that the debuggee resumes work after this
>>>> call, rather than dying?
>>> In fact, this is the problem: the VM (TestPostFieldModification)
>>> should not die before FieldMonitor finishes addFieldWatch.
>>>
>>> Shanliang
>>>>>
>>>>>>
>>>>>> I reproduced the bug by adding sleep(1000) after vm.resume() but before
>>>>>> calling eventQueue.remove();
>>>>>
>>>>> It looks like some kind of synchronization between the debugger
>>>>> and the debuggee is necessary. But I wonder whether it would be
>>>>> better to use process.getOutputStream() to write and flush a
>>>>> message for the debuggee indicating that it can exit. And in the
>>>>> debuggee you would just do System.in.read() as the last statement
>>>>> in the main() method. Seems more robust than involving files.
>>>> It could work. Creating a file in the test directory should not
>>>> cause any issue, though it may perform worse.
>>>>
>>>> Thanks,
>>>> Shanliang
>>>>>
>>>>> Cheers,
>>>>>
>>>>> -JB-
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Shanliang
>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>> cause the failure in the bug report, though.
>>>>>>>
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi ,
>>>>>>>>
>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>> "addFieldWatch" before the vm to monitor (TestPostFieldModification)
>>>>>>>> had already ended.
>>>>>>>>
>>>>>>>> So we should make sure that TestPostFieldModification exits only after
>>>>>>>> FieldMonitor has done what is necessary. The solution proposed here is
>>>>>>>> that FieldMonitor creates a file after adding field watching, and
>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>
>>>>>>>> web:
>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>
>>>>>>>> bug:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From staffan.larsen at oracle.com  Wed Feb 12 09:27:42 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Wed, 12 Feb 2014 18:27:42 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FBA9B2.6090502@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>
	<52FA51BE.4070701@oracle.com> <52FA5E51.3040704@oracle.com>
	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>
	<52FBA9B2.6090502@oracle.com>
Message-ID: <BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>

I'm still not happy with this fix, since I think the extra output stream synchronization logic is not needed: the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix. I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.

/Staffan

On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:

> Staffan Larsen wrote:
>> 
>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>> 
>>  Note that a target VM launched by a launching connector is not
>>  guaranteed to be stable until after the VMStartEvent has been
>>  received.
>>   
> I may be missing something here, but I believe VMStartEvent must be the first event: by the time the test got ClassPrepareEvent, it must already have received VMStartEvent.
>> 
>> I think adding code that looks something like this will make the test stable:
>> 
>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>     EventQueue eventQueue = vm.eventQueue();
>> 
>>     boolean started = false;
>>     while(!started) {
>>       EventSet eventSet = eventQueue.remove();
>>       for (Event event : eventSet) {
>>         if (event instanceof VMStartEvent) {
>>           started = true;
>>         }
>>         if (event instanceof VMDeathEvent
>>             || event instanceof VMDisconnectEvent) {
>>           throw new Error("VM died before it started...:"+event);
>>         }
>>       }
>>     }
>> 
>>     System.out.println("Vm launched");
>>   
> The code you proposed could improve the test: it makes sure that TestPostFieldModification has started. But I am afraid it does not address the issue causing the failure. I believe the issue was that TestPostFieldModification exited before or while FieldMonitor was calling addFieldWatch(), which is why addFieldWatch() received VMDisconnectedException. While the test was handling ClassPrepareEvent, any VMDeathEvent or VMDisconnectEvent would still have been waiting in the eventQueue, because it arrived after ClassPrepareEvent.
> 
> My fix is to not allow TestPostFieldModification to exit before addFieldWatch() is done.
>> 
>>   
>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>   
> I do not know why the test called addFieldWatch() before ClassPrepareEvent had been received, but yes, the returned list was empty, so I agree to remove it.
>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>   
> Yes it was already removed in version 01
> 
> Here is the new webrev:
> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
> 
> Thanks,
> Shanliang
>>  
>> Thanks,
>> /Staffan
>> 
>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>> 
>>   
>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>> 
>>> Thanks,
>>> Shanliang
>>> 
>>> shanliang wrote:
>>>     
>>>> shanliang wrote:
>>>>       
>>>>> Jaroslav Bachorik wrote:
>>>>>         
>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>           
>>>>>>> Staffan Larsen wrote:
>>>>>>>             
>>>>>>>> Hi Shanliang,
>>>>>>>> 
>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>               
>>>>>>> I am not a JDI expert, so I may be missing something here. I checked the
>>>>>>> failure trace and saw that the reported exception happened when FieldMonitor
>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>> call "vm.resume()" before handling events.
>>>>>>>             
>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>           
>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>   /**
>>>>>    * Continues the execution of the application running in this
>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>    * {@link ThreadReference#resume}.
>>>>>    *
>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>    *
>>>>>    * @see #suspend
>>>>>    */
>>>>>   void resume();
>>>>> My understanding is that the debuggee resumes work after this call, rather than dying?
>>>>>         
>>>> In fact, this is the problem: the VM (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>> 
>>>> Shanliang
>>>>       
>>>>>>> I reproduced the bug by adding sleep(1000) after vm.resume() but before
>>>>>>> calling eventQueue.remove();
>>>>>>>             
>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder whether it would be better to use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>           
>>>>> It could work. Creating a file in the test directory should not cause any issue, though it may perform worse.
>>>>> 
>>>>> Thanks,
>>>>> Shanliang
>>>>>         
>>>>>> Cheers,
>>>>>> 
>>>>>> -JB-
>>>>>> 
>>>>>>           
>>>>>>> Thanks,
>>>>>>> Shanliang
>>>>>>>             
>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>> cause the failure in the bug report, though.
>>>>>>>> 
>>>>>>>> /Staffan
>>>>>>>> 
>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>> 
>>>>>>>>               
>>>>>>>>> Hi ,
>>>>>>>>> 
>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>> "addFieldWatch" before the vm to monitor (TestPostFieldModification)
>>>>>>>>> had already ended.
>>>>>>>>> 
>>>>>>>>> So we should make sure that TestPostFieldModification exits only after
>>>>>>>>> FieldMonitor has done what is necessary. The solution proposed here is that
>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>> 
>>>>>>>>> web:
>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>> 
>>>>>>>>> bug:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Shanliang
>>>>>>>>>                 
>>>>>>>             
>>   
> 
> 


From nils.eliasson at oracle.com  Wed Feb 12 09:30:02 2014
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Wed, 12 Feb 2014 18:30:02 +0100
Subject: Project to improve hs_err files
In-Reply-To: <CA+3eh11+-N0GjOt-sttPksSCdySd+5oYF3mZw=B+Wkmi0ZOjUA@mail.gmail.com>
References: <cf2b0521-76b0-4cd8-88bc-b2986418f9be@default>
	<52FB73D0.1070402@oracle.com>
	<CA+3eh11+-N0GjOt-sttPksSCdySd+5oYF3mZw=B+Wkmi0ZOjUA@mail.gmail.com>
Message-ID: <52FBAF9A.1090206@oracle.com>


On 2014-02-12 15:28, Volker Simonis wrote:
> On Wed, Feb 12, 2014 at 2:14 PM, Nils Eliasson <nils.eliasson at oracle.com> wrote:
>> On 2013-09-06 13:32, Mattis Castegren wrote:
>>
>> ...
>>
>>
>> * Additional Data
>>
>> ...
>>
>>
>> Make sure we read the OS-context when hitting asserts in debug-builds. Today
>> we only get registers, code and stack on actual crashes. Very valuable info
>> that is missing when investigating many test failures.
>>
> I'm afraid that will not be possible. For a crash (i.e. because of a
> SIGSEGV) the kernel saves the context for the PC which caused the
> problem. 'assert' and 'guarantee' are macros which evaluate an
> expression and call report_vm_error if that expression was 'false'. I
> think the best we could do is to call 'getcontext()' after the
> evaluation of the expression, but that's not actually the context
> where the evaluation of the expression failed. But I agree that it may
> be helpful as well.

getcontext() works at least on some platforms. An alternative is to
force a crash and let the crash handler do the work as usual. This has
been done before and worked pretty well.


> And you should already get a complete stack trace for 'asserts' and
> 'guarantees' with the current implementation. Don't you see it?

I get a stacktrace but no raw stack from the current frame.

Regards,
Nilx


>
> Regards,
> Volker
>
>> //Nils


From volker.simonis at gmail.com  Wed Feb 12 10:03:51 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 12 Feb 2014 19:03:51 +0100
Subject: Project to improve hs_err files
In-Reply-To: <52FBAF9A.1090206@oracle.com>
References: <cf2b0521-76b0-4cd8-88bc-b2986418f9be@default>
	<52FB73D0.1070402@oracle.com>
	<CA+3eh11+-N0GjOt-sttPksSCdySd+5oYF3mZw=B+Wkmi0ZOjUA@mail.gmail.com>
	<52FBAF9A.1090206@oracle.com>
Message-ID: <CA+3eh11nQBAZcCEu935m857ege9pSUfc8vs_DUho-ME7rMiS_w@mail.gmail.com>

On Wed, Feb 12, 2014 at 6:30 PM, Nils Eliasson <nils.eliasson at oracle.com> wrote:
>
>
> On 2014-02-12 15:28, Volker Simonis wrote:
>>
>> On Wed, Feb 12, 2014 at 2:14 PM, Nils Eliasson <nils.eliasson at oracle.com>
>> wrote:
>>>
>>> On 2013-09-06 13:32, Mattis Castegren wrote:
>>>
>>> ...
>>>
>>>
>>> * Additional Data
>>>
>>> ...
>>>
>>>
>>> Make sure we read the OS-context when hitting asserts in debug-builds.
>>> Today
>>> we only get registers, code and stack on actual crashes. Very valuable
>>> info
>>> that is missing when investigating many test failures.
>>>
>> I'm afraid that will not be possible. For a crash (i.e. because of a
>> SIGSEGV) the kernel saves the context for the PC which caused the
>> problem. 'assert' and 'guarantee' are macros which evaluate an
>> expression and call report_vm_error if that expression was 'false'. I
>> think the best we could do is to call 'getcontext()' after the
>> evaluation of the expression, but that's not actually the context
>> where the evaluation of the expression failed. But I agree that it may
>> be helpful as well.
>
>
> getcontext() works at least on some platforms. An alternative is to force
> a crash and let the crash handler do the work as usual. This has been done
> before and worked pretty well.
>

This is a good idea, but it may interfere with -XX:SuppressErrorAt. We
currently only check if we should ignore an assertion in
report_vm_error(). If we crashed before that (i.e. right after the
evaluation of the expression), -XX:SuppressErrorAt would be useless.
On the other hand, if we only crash in report_vm_error(), we would get
the context of that function, which would not contain the desired
register values of the frame that failed the assertion either. So we'd
probably have to think about how this could best be implemented.

>
>
>> And you should already get a complete stack trace for 'asserts' and
>> 'guarantees' with the current implementation. Don't you see it?
>
>
> I get a stacktrace but no raw stack from the current frame.
>
> Regards,
> Nilx
>
>
>>
>> Regards,
>> Volker
>>
>>> //Nils
>
>

From jaroslav.bachorik at oracle.com  Thu Feb 13 01:17:21 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 13 Feb 2014 10:17:21 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>
	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>
	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>
Message-ID: <52FC8DA1.9090900@oracle.com>

Hi Staffan,

On 12.2.2014 18:27, Staffan Larsen wrote:
> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.

Yes, bringing the VM to a stable state before calling other JDI 
functions helps to stabilize the test even without the additional 
synchronization via stdout/stdin.

I just wonder whether this check should not be done inside 
com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it 
even make sense to hand off an unstable VM?

-JB-

>
> /Staffan
>
> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>
>> Staffan Larsen wrote:
>>>
>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>
>>>   Note that a target VM launched by a launching connector is not
>>>   guaranteed to be stable until after the VMStartEvent has been
>>>   received.
>>>
>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent.
>>>
>>> I think adding code that looks something like this will make the test stable:
>>>
>>>      VirtualMachine vm = launchTarget(CLASS_NAME);
>>>      EventQueue eventQueue = vm.eventQueue();
>>>
>>>      boolean started = false;
>>>      while(!started) {
>>>        EventSet eventSet = eventQueue.remove();
>>>        for (Event event : eventSet) {
>>>          if (event instanceof VMStartEvent) {
>>>            started = true;
>>>          }
>>>          if (event instanceof VMDeathEvent
>>>              || event instanceof VMDisconnectEvent) {
>>>            throw new Error("VM died before it started...:"+event);
>>>          }
>>>        }
>>>      }
>>>
>>>      System.out.println("Vm launched");
>>>
>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>
>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>
>>>
>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>
>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>
>> Yes it was already removed in version 01
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>
>> Thanks,
>> Shanliang
>>>
>>> Thanks,
>>> /Staffan
>>>
>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>
>>>
>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>    http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>
>>>> Thanks,
>>>> Shanliang
>>>>
>>>> shanliang wrote:
>>>>
>>>>> shanliang wrote:
>>>>>
>>>>>> Jaroslav Bachorik wrote:
>>>>>>
>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>
>>>>>>>> Staffan Larsen wrote:
>>>>>>>>
>>>>>>>>> Hi Shanliang,
>>>>>>>>>
>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>
>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>
>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>
>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>    /**
>>>>>>     * Continues the execution of the application running in this
>>>>>>     * virtual machine. All threads are resumed as documented in
>>>>>>     * {@link ThreadReference#resume}.
>>>>>>     *
>>>>>>     * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>     *
>>>>>>     * @see #suspend
>>>>>>     */
>>>>>>    void resume();
>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>
>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>
>>>>> Shanliang
>>>>>
>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>> calling eventQueue.remove();
>>>>>>>>
>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder whether it would be better to use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>
>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>
>>>>>> Thanks,
>>>>>> Shanliang
>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> -JB-
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>>
>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>
>>>>>>>>> /Staffan
>>>>>>>>>
>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi ,
>>>>>>>>>>
>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>> already ended.
>>>>>>>>>>
>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>
>>>>>>>>>> web:
>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>
>>>>>>>>>> bug:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Shanliang
>>>>>>>>>>
>>>>>>>>
>>>
>>
>>
>
>


From staffan.larsen at oracle.com  Thu Feb 13 02:04:05 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 13 Feb 2014 11:04:05 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FC8DA1.9090900@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>
	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>
	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>
	<52FC8DA1.9090900@oracle.com>
Message-ID: <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>


On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:

> Hi Staffan,
> 
> On 12.2.2014 18:27, Staffan Larsen wrote:
>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
> 
> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
> 
> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?

Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.

/S

> 
> -JB-
> 
>> 
>> /Staffan
>> 
>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>> 
>>> Staffan Larsen wrote:
>>>> 
>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>> 
>>>>  Note that a target VM launched by a launching connector is not
>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>  received.
>>>> 
>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent.
>>>> 
>>>> I think adding code that looks something like this will make the test stable:
>>>> 
>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>     EventQueue eventQueue = vm.eventQueue();
>>>> 
>>>>     boolean started = false;
>>>>     while(!started) {
>>>>       EventSet eventSet = eventQueue.remove();
>>>>       for (Event event : eventSet) {
>>>>         if (event instanceof VMStartEvent) {
>>>>           started = true;
>>>>         }
>>>>         if (event instanceof VMDeathEvent
>>>>             || event instanceof VMDisconnectEvent) {
>>>>           throw new Error("VM died before it started...:"+event);
>>>>         }
>>>>       }
>>>>     }
>>>> 
>>>>     System.out.println("Vm launched");
>>>> 
>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>> 
>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>> 
>>>> 
>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>> 
>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>> 
>>> Yes it was already removed in version 01
>>> 
>>> Here is the new webrev:
>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>> 
>>> Thanks,
>>> Shanliang
>>>> 
>>>> Thanks,
>>>> /Staffan
>>>> 
>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>> 
>>>> 
>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>> 
>>>>> Thanks,
>>>>> Shanliang
>>>>> 
>>>>> shanliang wrote:
>>>>> 
>>>>>> shanliang wrote:
>>>>>> 
>>>>>>> Jaroslav Bachorik wrote:
>>>>>>> 
>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>> 
>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Shanliang,
>>>>>>>>>> 
>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>> 
>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>> 
>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>> 
>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>   /**
>>>>>>>    * Continues the execution of the application running in this
>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>    *
>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>    *
>>>>>>>    * @see #suspend
>>>>>>>    */
>>>>>>>   void resume();
>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>> 
>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>> 
>>>>>> Shanliang
>>>>>> 
>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>> calling eventQueue.remove();
>>>>>>>>> 
>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder whether it would be better to use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>> 
>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Shanliang
>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> 
>>>>>>>> -JB-
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Shanliang
>>>>>>>>> 
>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>> 
>>>>>>>>>> /Staffan
>>>>>>>>>> 
>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> Hi ,
>>>>>>>>>>> 
>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>> already ended.
>>>>>>>>>>> 
>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>> 
>>>>>>>>>>> web:
>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>> 
>>>>>>>>>>> bug:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Shanliang
>>>>>>>>>>> 
>>>>>>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 


From shanliang.jiang at oracle.com  Thu Feb 13 02:36:04 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Thu, 13 Feb 2014 11:36:04 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>	<52FC8DA1.9090900@oracle.com>
	<1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>
Message-ID: <52FCA014.9000106@oracle.com>

Staffan Larsen wrote:
> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>
>   
>> Hi Staffan,
>>
>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>     
>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>       
>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>
>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>     
>
> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>   
The test failed when it received the ClassPrepareEvent and called 
addFieldWatch. That means the test must already have received the 
VMStartEvent, because VMStartEvent must be the first event. If that is 
true, the VM must already have been stable when the test failed.

The exception would be if the test received the ClassPrepareEvent before 
the VMStartEvent, in which case it was calling addFieldWatch on a possibly 
unstable VM. In that case we might have a serious bug in the VirtualMachine 
implementation, and the proposed fix of waiting for "started" could miss 
the ClassPrepareEvent, so the test would test nothing.

Shanliang
> /S
>
>   
>> -JB-
>>
>>     
>>> /Staffan
>>>
>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>
>>>       
>>>> Staffan Larsen wrote:
>>>>         
>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>
>>>>>  Note that a target VM launched by a launching connector is not
>>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>>  received.
>>>>>
>>>>>           
>>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent.
>>>>         
>>>>> I think adding code that looks something like this will make the test stable:
>>>>>
>>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>     EventQueue eventQueue = vm.eventQueue();
>>>>>
>>>>>     boolean started = false;
>>>>>     while(!started) {
>>>>>       EventSet eventSet = eventQueue.remove();
>>>>>       for (Event event : eventSet) {
>>>>>         if (event instanceof VMStartEvent) {
>>>>>           started = true;
>>>>>         }
>>>>>         if (event instanceof VMDeathEvent
>>>>>             || event instanceof VMDisconnectEvent) {
>>>>>           throw new Error("VM died before it started...:"+event);
>>>>>         }
>>>>>       }
>>>>>     }
>>>>>
>>>>>     System.out.println("Vm launched");
>>>>>
>>>>>           
>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>
>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>         
>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>>>
>>>>>           
>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>         
>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>
>>>>>           
>>>> Yes it was already removed in version 01
>>>>
>>>> Here is the new webrev:
>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>
>>>> Thanks,
>>>> Shanliang
>>>>         
>>>>> Thanks,
>>>>> /Staffan
>>>>>
>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>
>>>>>
>>>>>           
>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>
>>>>>> Thanks,
>>>>>> Shanliang
>>>>>>
>>>>>> shanliang wrote:
>>>>>>
>>>>>>             
>>>>>>> shanliang wrote:
>>>>>>>
>>>>>>>               
>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>
>>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>
>>>>>>>>>>>                       
>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>   /**
>>>>>>>>    * Continues the execution of the application running in this
>>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>>    *
>>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>    *
>>>>>>>>    * @see #suspend
>>>>>>>>    */
>>>>>>>>   void resume();
>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>
>>>>>>>>                 
>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>
>>>>>>> Shanliang
>>>>>>>
>>>>>>>               
>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder whether it would be better to use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> -JB-
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> Thanks,
>>>>>>>>>> Shanliang
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>
>>>>>>>>>>> /Staffan
>>>>>>>>>>>
>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                       
>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>
>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>> already ended.
>>>>>>>>>>>>
>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>
>>>>>>>>>>>> web:
>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>
>>>>>>>>>>>> bug:
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>
>>>>>>>>>>>>                         
>>>>         
>>>       
>
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140213/2ce2f159/attachment-0001.html 

From mattias.tobiasson at oracle.com  Thu Feb 13 03:57:40 2014
From: mattias.tobiasson at oracle.com (Mattias Tobiasson)
Date: Thu, 13 Feb 2014 03:57:40 -0800 (PST)
Subject: RFR: 6545422 NativeErrors.java uses wrong path name in exec
Message-ID: <850898d0-8094-4769-8ef6-b0b5d8b2dbad@default>

Hi,
Could you please sponsor and submit this patch?

Mattias

----- Original Message -----
From: staffan.larsen at oracle.com
To: mattias.tobiasson at oracle.com
Cc: serviceability-dev at openjdk.java.net
Sent: Monday, February 10, 2014 2:38:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: RFR: 6545422 NativeErrors.java uses wrong path name in exec

And a looong time later: Looks good!

Thanks,
/Staffan

On 4 dec 2013, at 15:38, Mattias Tobiasson <mattias.tobiasson at oracle.com> wrote:

> Hi,
> Could someone please review this patch?
> 
> webrev: http://cr.openjdk.java.net/~ykantser/6545422/webrev.00
> bug: https://bugs.openjdk.java.net/browse/JDK-6545422
> 
> Changes:
> Uses common testlibrary function to find native2ascii binary.
> Check null values instead of throwing NullPointerException at different locations. The old version checked null with "assert", which is not always enabled.
> 
> 
> Mattias

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 6545422.patch
Type: text/x-patch
Size: 6890 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140213/4c2b41df/6545422.patch 

From Alan.Bateman at oracle.com  Thu Feb 13 05:18:03 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 13 Feb 2014 13:18:03 +0000
Subject: 8034856/8034857: More gcc warnings
Message-ID: <52FCC60B.6040205@oracle.com>


The number of native code warnings in the build is annoying so this is 
another drive-by fix that eliminates a few of them in the serviceability 
and security areas. The webrev with the changes is here:

http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/

In the pkcs11 code the issue is the function prototypes for the throwXXX 
functions aren't included. This is fixed by including pkcs11wrapper.h 
but that exposes another issue with the header file includes that needed 
to be fixed.

In JarFacade the issue is that it uses isspace but doesn't include 
ctype.h.

For LinuxOperatingSystem.c there are 12 warnings related to fscanf 
usages where the format specifier is %lld and the code wants to read 
into a uint64_t. I've changed the format specifier to "%"SCNd64 so that 
it matches uint64_t and should be okay on both 32 and 64-bit.

Thanks,
Alan.

From staffan.larsen at oracle.com  Thu Feb 13 05:17:24 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 13 Feb 2014 14:17:24 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FCA014.9000106@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>	<52FC8DA1.9090900@oracle.com>
	<1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>
	<52FCA014.9000106@oracle.com>
Message-ID: <EF6C879F-2658-47E9-B733-0E0E027E3B4E@oracle.com>

I think I understand what happens now.

The test code, simplified, looks like this (with the Thread.sleep() added that causes the test to fail):

  launchTarget();
  addClassWatch();
  vm.resume();
  Thread.sleep(1000);
  while(connected) {
      eventSet = eventQueue.remove();
      for(event : eventSet) {
          if (event instanceof ClassPrepareEvent) {
              addFieldWatch();
          }
      }
      eventSet.resume();
  }

By default all events that happen will cause the debuggee to suspend (see EventRequest.setSuspendPolicy()). Thus when we get to addFieldWatch(), the VM should be suspended and we should be able to create the field watch without problem. But the VM isn't suspended and that is why the test fails.

Why isn't the VM suspended? When we get to the "for(event : eventSet)" the first time there are *two* events already in the queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM is suspended and everything is good. We look at the first eventSet which only contains the VMStartEvent, we ignore the event, but we resume the VM. We then loop and look at the ClassPrepareEvent, but by now the VM is already running and has also terminated. Failure.

Thus, we need to handle the VMStartEvent. I suggest a modification to my previous code:

  launchTarget();
  while(connected) {
      eventSet = eventQueue.remove();
      for(event : eventSet) {
          if (event instanceof VMStartEvent) {
              addClassWatch();
          }
          if (event instanceof ClassPrepareEvent) {
              addFieldWatch();
          }
      }
      eventSet.resume();
  }

This will cause us to have complete control over the state of the debuggee. The first event we see will be the VMStartEvent. The VM will be suspended. We can add a class watch here. Then we resume the VM. The second event we see will be the ClassPrepareEvent with the VM suspended. We can add the field watch. Then we resume the VM and wait for the field watch events.
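
To make the ordering argument concrete, here is a small toy model of the two orderings. It is only a sketch: it does not use JDI at all, and SuspendOrderModel, Debuggee and fieldWatchSucceeds are made-up names. It assumes that every posted event suspends the debuggee once and that the debuggee runs to completion as soon as nothing is holding it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: none of these types are JDI classes.
class SuspendOrderModel {
    enum Kind { VM_START, CLASS_PREPARE }

    /** Stand-in for the debuggee: every posted event suspends it once. */
    static final class Debuggee {
        final Deque<Kind> eventQueue = new ArrayDeque<>();
        int suspendCount = 0;
        boolean classWatchSet = false;
        boolean classPrepared = false;
        boolean exited = false;

        void post(Kind kind) {
            suspendCount++;
            eventQueue.add(kind);
        }

        /** Models vm.resume() / eventSet.resume(): drop one suspension, run if none are left. */
        void resume() {
            if (suspendCount > 0) {
                suspendCount--;
            }
            if (suspendCount == 0 && !exited) {
                if (classWatchSet && !classPrepared) {
                    classPrepared = true;
                    post(Kind.CLASS_PREPARE);   // class is prepared, debuggee suspends again
                } else {
                    exited = true;              // nothing left to stop it: runs to completion
                }
            }
        }
    }

    /** Returns true if addFieldWatch() would have found a live, suspended debuggee. */
    static boolean fieldWatchSucceeds(boolean resumeBeforeLoop) {
        Debuggee vm = new Debuggee();
        vm.post(Kind.VM_START);                 // a launched VM starts out suspended
        if (resumeBeforeLoop) {
            vm.classWatchSet = true;            // addClassWatch() before the loop
            vm.resume();                        // the premature vm.resume()
        }
        boolean watched = false;
        while (!vm.eventQueue.isEmpty()) {
            Kind event = vm.eventQueue.remove();
            if (event == Kind.VM_START && !resumeBeforeLoop) {
                vm.classWatchSet = true;        // addClassWatch() on the VMStartEvent
            }
            if (event == Kind.CLASS_PREPARE) {
                watched = !vm.exited && vm.suspendCount > 0;
            }
            vm.resume();                        // eventSet.resume()
        }
        return watched;
    }

    public static void main(String[] args) {
        System.out.println("resume before loop: " + fieldWatchSucceeds(true));
        System.out.println("VMStartEvent first: " + fieldWatchSucceeds(false));
    }
}
```

In this model the first ordering reports false (the resume for the VMStartEvent set lets the debuggee exit before the ClassPrepareEvent is handled) and the second reports true.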

Thanks,
/Staffan

On 13 feb 2014, at 11:36, shanliang <shanliang.jiang at oracle.com> wrote:

> Staffan Larsen wrote:
>> 
>> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>> 
>>   
>>> Hi Staffan,
>>> 
>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>     
>>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>>       
>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>> 
>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>     
>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>   
> The test failed when it received the ClassPrepareEvent and called addFieldWatch. That means the test must already have received the VMStartEvent, because the VMStartEvent must be the first event; if so, the vm must already have been stable when the test failed.
> 
> The exception would be if the test received the ClassPrepareEvent before the VMStartEvent, in which case it was calling addFieldWatch with a possibly unstable VM. In that case we might have a serious bug in the VirtualMachine implementation, and if this is true the proposed fix of checking "start" may miss the ClassPrepareEvent, and then the test would test nothing.
> 
> Shanliang
>> /S
>> 
>>   
>>> -JB-
>>> 
>>>     
>>>> /Staffan
>>>> 
>>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>> 
>>>>       
>>>>> Staffan Larsen wrote:
>>>>>         
>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>> 
>>>>>>  Note that a target VM launched by a launching connector is not
>>>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>>>  received.
>>>>>> 
>>>>>>           
>>>>> I may be missing something here, but I believe VMStartEvent must be the first event; when the test got the ClassPrepareEvent, it must already have received the VMStartEvent.
>>>>>         
>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>> 
>>>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>     EventQueue eventQueue = vm.eventQueue();
>>>>>> 
>>>>>>     boolean started = false;
>>>>>>     while(!started) {
>>>>>>       EventSet eventSet = eventQueue.remove();
>>>>>>       for (Event event : eventSet) {
>>>>>>         if (event instanceof VMStartEvent) {
>>>>>>           started = true;
>>>>>>         }
>>>>>>         if (event instanceof VMDeathEvent
>>>>>>             || event instanceof VMDisconnectEvent) {
>>>>>>           throw new Error("VM died before it started...:"+event);
>>>>>>         }
>>>>>>       }
>>>>>>     }
>>>>>> 
>>>>>>     System.out.println("Vm launched");
>>>>>> 
>>>>>>           
>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>> 
>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>         
>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>>>> 
>>>>>>           
>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>>         
>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>> 
>>>>>>           
>>>>> Yes it was already removed in version 01
>>>>> 
>>>>> Here is the new webrev:
>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>> 
>>>>> Thanks,
>>>>> Shanliang
>>>>>         
>>>>>> Thanks,
>>>>>> /Staffan
>>>>>> 
>>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>> 
>>>>>> 
>>>>>>           
>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Shanliang
>>>>>>> 
>>>>>>> shanliang wrote:
>>>>>>> 
>>>>>>>             
>>>>>>>> shanliang wrote:
>>>>>>>> 
>>>>>>>>               
>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>> 
>>>>>>>>>                 
>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>> 
>>>>>>>>>>                   
>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>> 
>>>>>>>>>>>                     
>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>> 
>>>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>> 
>>>>>>>>>>>>                       
>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>> 
>>>>>>>>>>>                     
>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>>>> 
>>>>>>>>>>                   
>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>>   /**
>>>>>>>>>    * Continues the execution of the application running in this
>>>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>>>    *
>>>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>    *
>>>>>>>>>    * @see #suspend
>>>>>>>>>    */
>>>>>>>>>   void resume();
>>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>> 
>>>>>>>>>                 
>>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>> 
>>>>>>>> Shanliang
>>>>>>>> 
>>>>>>>>               
>>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>> 
>>>>>>>>>>>                     
>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if it would be better to use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>> 
>>>>>>>>>>                   
>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Shanliang
>>>>>>>>> 
>>>>>>>>>                 
>>>>>>>>>> Cheers,
>>>>>>>>>> 
>>>>>>>>>> -JB-
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>                   
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Shanliang
>>>>>>>>>>> 
>>>>>>>>>>>                     
>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>> 
>>>>>>>>>>>> /Staffan
>>>>>>>>>>>> 
>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>                       
>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>>> already ended.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> web:
>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>> 
>>>>>>>>>>>>> bug:
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>> 
>>>>>>>>>>>>>                         
>>>>>         
>>>>       
>>   
> 
> 


From staffan.larsen at oracle.com  Thu Feb 13 05:24:39 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 13 Feb 2014 14:24:39 +0100
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <52FCC60B.6040205@oracle.com>
References: <52FCC60B.6040205@oracle.com>
Message-ID: <EE373E0A-80D6-43F8-A728-750255521512@oracle.com>

Changes look good.

/Staffan

On 13 feb 2014, at 14:18, Alan Bateman <Alan.Bateman at oracle.com> wrote:

> 
> The number of native code warnings in the build is annoying so this is another drive-by fix that eliminates a few of them in the serviceability and security areas. The webrev with the changes is here:
> 
> http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/
> 
> In the pkcs11 code the issue is the function prototypes for the throwXXX functions aren't included. This is fixed by including pkcs11wrapper.h but that exposes another issue with the header file includes that needed to be fixed.
> 
> In JarFacade the issue is that it uses isspace but doesn't include the ctype.h
> 
> For LinuxOperatingSystem.c there are 12 warnings related to fscanf usages where the format specifier is %lld and the code wants to read into a uint64_t. I've changed the format specifier to "%"SCNd64 so that it matches uint64_t and should be okay on both 32 and 64-bit.
> 
> Thanks,
> Alan.


From sean.mullan at oracle.com  Thu Feb 13 06:05:49 2014
From: sean.mullan at oracle.com (Sean Mullan)
Date: Thu, 13 Feb 2014 09:05:49 -0500
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <52FCC60B.6040205@oracle.com>
References: <52FCC60B.6040205@oracle.com>
Message-ID: <52FCD13D.9040605@oracle.com>

Looks fine to me.

--Sean

On 02/13/2014 08:18 AM, Alan Bateman wrote:
>
> The number of native code warnings in the build is annoying so this is
> another drive-by fix that eliminates a few of them in the serviceability
> and security areas. The webrev with the changes is here:
>
> http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/
>
> In the pkcs11 code the issue is the function prototypes for the throwXXX
> functions aren't included. This is fixed by including pkcs11wrapper.h
> but that exposes another issue with the header file includes that needed
> to be fixed.
>
> In JarFacade the issue is that it uses isspace but doesn't include the
> ctype.h
>
> For LinuxOperatingSystem.c there are 12 warnings related to fscanf
> usages where the format specifier is %lld and the code wants to read
> into a uint64_t. I've changed the format specifier to "%"SCNd64 so that
> it matches uint64_t and should be okay on both 32 and 64-bit.
>
> Thanks,
> Alan.


From daniel.daugherty at oracle.com  Thu Feb 13 09:07:40 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Feb 2014 10:07:40 -0700
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
Message-ID: <52FCFBDC.4020602@oracle.com>

Adding serviceability-dev at openjdk.java.net since JVM/TI belongs to
the Serviceability team.

Please make sure that replies to this review thread include both
aliases for completeness.

Dan


On 2/13/14 9:57 AM, Siebenborn, Axel wrote:
>
> Hi all,
>
> could I have a review for the following change?
>
> It adds code to iterate over the slots of the expression stack.
>
> Found object references are used as heap roots, the same way as object 
> references in locals slots.
>
> Bug:
>
> https://bugs.openjdk.java.net/browse/JDK-8034867
>
> Webrev:
>
> http://sapjvm.com/as/webrevs/8034867
>
> Thanks,
>
> Axel
>


From shanliang.jiang at oracle.com  Thu Feb 13 09:15:38 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Thu, 13 Feb 2014 18:15:38 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <EF6C879F-2658-47E9-B733-0E0E027E3B4E@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>	<52FC8DA1.9090900@oracle.com>
	<1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>
	<52FCA014.9000106@oracle.com>
	<EF6C879F-2658-47E9-B733-0E0E027E3B4E@oracle.com>
Message-ID: <52FCFDBA.5040108@oracle.com>

Staffan,

Very nice analysis!

The fix should be very simple: just remove the line
    108   vm.resume
It is an error because at this point the test has not yet handled the events in
eventSet.

The line
    136   eventSet.resume();
is the right place to resume the threads after the events have been handled.

Here is the new webrev:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/

Thanks,
Shanliang

Staffan Larsen wrote:
> I think I understand what happens now.
>
> The test code, simplified, looks like this (with the Thread.sleep() 
> added that causes the test to fail):
>
>   launchTarget();
>   addClassWatch();
>   vm.resume();
>   Thread.sleep(1000);
>   while(connected) {
>       eventSet = eventQueue.remove();
>       for(event : eventSet) {
>           if (event instanceof ClassPrepareEvent) {
>               addFieldWatch();
>           }
>       }
>       eventSet.resume();
>   }
>
> By default all events that happen will cause the debuggee to suspend
> (see EventRequest.setSuspendPolicy()). Thus when we get to
> addFieldWatch(), the vm should be suspended and we should be able to
> create the field watch without problem. But the VM isn't suspended and
> that is why the test fails.
>
> Why isn't the VM suspended? When we get to the "for(event :
> eventSet)" the first time there are *two* events already in the
> queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM
> is suspended and everything is good. We look at the first eventSet,
> which only contains the VMStartEvent; we ignore the event, but we
> resume the VM. We then loop and look at the ClassPrepareEvent, but by
> now the VM is already running and has also terminated. Failure.
>
> Thus, we need to handle the VMStartEvent. I suggest a modification to 
> my previous code:
>
>   launchTarget();
>   while(connected) {
>       eventSet = eventQueue.remove();
>       for(event : eventSet) {
>           if (event instanceof VMStartEvent) {
>               addClassWatch();
>           }
>           if (event instanceof ClassPrepareEvent) {
>               addFieldWatch();
>           }
>       }
>       eventSet.resume();
>   }
>
> This will cause us to have complete control over the state of the 
> debuggee. The first event we see will be the VMStartEvent. The VM will 
> be suspended. We can add a class watch here. Then we resume the VM. 
> The second event we see will be the ClassPrepareEvent with the VM 
> suspended. We can add the field watch. Then we resume the VM and wait 
> for the field watch events.
>
> Thanks,
> /Staffan
>
> On 13 feb 2014, at 11:36, shanliang <shanliang.jiang at oracle.com> wrote:
>
>> Staffan Larsen wrote:
>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>>>
>>>   
>>>> Hi Staffan,
>>>>
>>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>>     
>>>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>>>       
>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>>
>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>>     
>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>>   
>> The test failed when it received the ClassPrepareEvent and called
>> addFieldWatch. That means the test must already have received the
>> VMStartEvent, because the VMStartEvent must be the first event; if so,
>> the vm must already have been stable when the test failed.
>>
>> The exception would be if the test received the ClassPrepareEvent
>> before the VMStartEvent, in which case it was calling addFieldWatch
>> with a possibly unstable VM. In that case we might have a serious bug
>> in the VirtualMachine implementation, and if this is true the proposed
>> fix of checking "start" may miss the ClassPrepareEvent, and then the
>> test would test nothing.
>>
>> Shanliang
>>> /S
>>>
>>>   
>>>> -JB-
>>>>
>>>>     
>>>>> /Staffan
>>>>>
>>>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>
>>>>>       
>>>>>> Staffan Larsen wrote:
>>>>>>         
>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>>>
>>>>>>>  Note that a target VM launched by a launching connector is not
>>>>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>>>>  received.
>>>>>>>
>>>>>>>           
>>>>>> I may be missing something here, but I believe VMStartEvent must be the first event; when the test got the ClassPrepareEvent, it must already have received the VMStartEvent.
>>>>>>         
>>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>>>
>>>>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>>     EventQueue eventQueue = vm.eventQueue();
>>>>>>>
>>>>>>>     boolean started = false;
>>>>>>>     while(!started) {
>>>>>>>       EventSet eventSet = eventQueue.remove();
>>>>>>>       for (Event event : eventSet) {
>>>>>>>         if (event instanceof VMStartEvent) {
>>>>>>>           started = true;
>>>>>>>         }
>>>>>>>         if (event instanceof VMDeathEvent
>>>>>>>             || event instanceof VMDisconnectEvent) {
>>>>>>>           throw new Error("VM died before it started...:"+event);
>>>>>>>         }
>>>>>>>       }
>>>>>>>     }
>>>>>>>
>>>>>>>     System.out.println("Vm launched");
>>>>>>>
>>>>>>>           
>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>>>
>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>>         
>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>>>>>
>>>>>>>           
>>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>>>         
>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>>>
>>>>>>>           
>>>>>> Yes it was already removed in version 01
>>>>>>
>>>>>> Here is the new webrev:
>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>>>
>>>>>> Thanks,
>>>>>> Shanliang
>>>>>>         
>>>>>>> Thanks,
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>           
>>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>>
>>>>>>>> shanliang wrote:
>>>>>>>>
>>>>>>>>             
>>>>>>>>> shanliang wrote:
>>>>>>>>>
>>>>>>>>>               
>>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>>>
>>>>>>>>>>                 
>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>>>
>>>>>>>>>>>                   
>>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>                     
>>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>>>
>>>>>>>>>>>>>                       
>>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>>>
>>>>>>>>>>>>                     
>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>>>>>
>>>>>>>>>>>                   
>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>>>   /**
>>>>>>>>>>    * Continues the execution of the application running in this
>>>>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>>>>    *
>>>>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>>    *
>>>>>>>>>>    * @see #suspend
>>>>>>>>>>    */
>>>>>>>>>>   void resume();
>>>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>>>
>>>>>>>>>>                 
>>>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>>>
>>>>>>>>> Shanliang
>>>>>>>>>
>>>>>>>>>               
>>>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>>>
>>>>>>>>>>>>                     
>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if it would be better to use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>>>
>>>>>>>>>>>                   
>>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Shanliang
>>>>>>>>>>
>>>>>>>>>>                 
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> -JB-
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                   
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>
>>>>>>>>>>>>                     
>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>>>
>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>                       
>>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>>>> already ended.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> web:
>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                         
>>>>>>         
>>>>>       
>>>   
>>
>>
>


From mikael.vidstedt at oracle.com  Thu Feb 13 09:56:48 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Thu, 13 Feb 2014 09:56:48 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <52FCC60B.6040205@oracle.com>
References: <52FCC60B.6040205@oracle.com>
Message-ID: <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>


Alan,

I made the change to JarFacade.c myself last week, only to then see the comment a few lines above where you added the new include. It seems to indicate that including ctype.h on Solaris/SPARC is a bad idea. I have no idea if the comment is still relevant, but that may be worth understanding first.

Cheers,
Mikael

> On Feb 13, 2014, at 5:18, Alan Bateman <Alan.Bateman at oracle.com> wrote:
> 
> 
> The number of native code warnings in the build is annoying so this is another drive-by fix that eliminates a few of them in the serviceability and security areas. The webrev with the changes is here:
> 
> http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/
> 
> In the pkcs11 code the issue is the function prototypes for the throwXXX functions aren't included. This is fixed by including pkcs11wrapper.h but that exposes another issue with the header file includes that needed to be fixed.
> 
> In JarFacade the issue is that it uses isspace but doesn't include the ctype.h
> 
> For LinuxOperatingSystem.c there are 12 warnings related to fscanf usages where the format specifier is %lld and the code wants to read into a uint64_t. I've changed the format specifier to "%"SCNd64 so that it matches uint64_t and should be okay on both 32 and 64-bit.
> 
> Thanks,
> Alan.

From Alan.Bateman at oracle.com  Thu Feb 13 10:23:30 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 13 Feb 2014 18:23:30 +0000
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
Message-ID: <52FD0DA2.7030201@oracle.com>

On 13/02/2014 17:56, Mikael Vidstedt wrote:
> Alan,
>
> I made the change to JarFacade.c myself last week, only to then see the comment a few lines above where you added the new include. It seems to indicate that including ctype.h on Solaris/SPARC is a bad idea. I have no idea if the comment is still relevant, but that may be worth understanding first.
>
>
Do you have cycles to look into it? As the code is already using isspace,
it's not clear (unless there are different versions). Before
pushing the changes I ran the tests on all platforms (including
Solaris), and the j.l.i tests include a number of tests that exercise these
manifest attributes with non-US characters.

As an aside, the native code warnings coming from the jdk repository are 
really annoying so this is the reason for the drive-by fixes when I get 
a few minutes. I think others are doing the same.

-Alan.

From shanliang.jiang at oracle.com  Thu Feb 13 12:15:20 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Thu, 13 Feb 2014 21:15:20 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FCFDBA.5040108@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>	<52FC8DA1.9090900@oracle.com>	<1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>	<52FCA014.9000106@oracle.com>	<EF6C879F-2658-47E9-B733-0E0E027E3B4E@oracle.com>
	<52FCFDBA.5040108@oracle.com>
Message-ID: <52FD27D8.8060200@oracle.com>

Hi,

Here is Version 4:
    http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/

1) remove the line
    108   vm.resume()
2) call addClassWatch(vm) only when receiving VMStartEvent
3) make sure that the test receives ModificationWatchpointEvent
4) clean

Thanks,
Shanliang
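
A minimal sketch of why items 1) and 2) above matter, written as a toy model
rather than real JDI code. Everything in it is an assumption made for
illustration: ToyDebuggee, its step counter, and the Event enum are
hypothetical stand-ins, and the single MODIFICATION event replaces the real
sequence of field writes. The model only captures the ordering argument from
this thread: each resume lets the target run until the next requested event,
and a target with nothing left to report simply exits.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class EventLoopSketch {
    enum Event { VM_START, CLASS_PREPARE, MODIFICATION, VM_DEATH }

    /** Hypothetical stand-in for the launched target; not a JDI class. */
    static final class ToyDebuggee {
        final Deque<Event> events = new ArrayDeque<>();
        boolean classWatch, fieldWatch, exited;
        private int step = 0;

        ToyDebuggee() { events.add(Event.VM_START); } // launched suspended

        /** Runs until the next requested event suspends the target, or it exits. */
        void resume() {
            while (!exited) {
                switch (step++) {
                    case 0: // the watched class is loaded
                        if (classWatch) { events.add(Event.CLASS_PREPARE); return; }
                        break;
                    case 1: // the watched field is written
                        if (fieldWatch) { events.add(Event.MODIFICATION); return; }
                        break;
                    default: // nothing left to do
                        exited = true;
                        events.add(Event.VM_DEATH);
                }
            }
        }

        void addFieldWatch() {
            if (exited) throw new IllegalStateException("debuggee already exited");
            fieldWatch = true;
        }
    }

    /** Runs the event loop and returns the number of MODIFICATION events seen. */
    static int run(boolean resumeBeforeLoop) {
        ToyDebuggee vm = new ToyDebuggee();
        if (resumeBeforeLoop) {   // old ordering: class watch + resume up front
            vm.classWatch = true;
            vm.resume();
        }
        int modifications = 0;
        while (true) {
            Event event = vm.events.remove();
            if (event == Event.VM_DEATH) return modifications;
            if (event == Event.VM_START && !resumeBeforeLoop) vm.classWatch = true;
            if (event == Event.CLASS_PREPARE) vm.addFieldWatch();
            if (event == Event.MODIFICATION) modifications++;
            vm.resume();          // plays the role of eventSet.resume()
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed ordering: " + run(false) + " modification event(s)");
        try {
            run(true);
        } catch (IllegalStateException e) {
            System.out.println("old ordering: " + e.getMessage());
        }
    }
}
```

In the model, run(true) fails inside addFieldWatch() because the resume issued
while handling the start event lets the target run to completion, while
run(false) sees the modification event.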

shanliang wrote:
> Staffan,
>
> Very nice analysis!
>
> The fix must be very simple, just remove the line
>     108   vm.resume
> it is an error because here the test does not yet treat the events in 
> eventSet.
>
> the line
>     136   eventSet.resume();
> is the right place to resume the threads after event treatment.
>
> Here is the new webrev:
> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/
>
> Thanks,
> Shanliang
>
> Staffan Larsen wrote:
>> I think I understand what happens now.
>>
>> The test code, simplified, looks like this (with the Thread.sleep() 
>> added that causes the test to fail):
>>
>>   launchTarget();
>>   addClassWatch();
>>   vm.resume();
>>   Thread.sleep(1000);
>>   while(connected) {
>>       eventSet = eventQueue.remove()
>>       for(event : eventQueue) {
>>           if (event instanceof ClassPrepareEvent) {
>>               addFieldWatch();
>>           }
>>       }
>>       eventSet.resume();
>>   }
>>
>> By default all events that happen will cause the debuggee to suspend 
>> (see EventRequest.setSuspendPolicy()). Thus when we get to 
>> addFieldWatch(), the vm should be suspended and we should be able to 
>> create the field watch without problem. But the VM isn't suspended 
>> and that is why the test fails.
>>
>> Why isn't the VM suspended? When we get to the "for(event : 
>> eventQueue)" the first time there are *two* events already in the 
>> queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM 
>> is suspended and everything is good. We look at the first eventSet 
>> which only contains the VMStartEvent, we ignore the event, but we 
>> resume the VM. We then loop and look at the ClassPrepareEvent, but by 
>> now the VM is already running and has also terminated. Failure.
>>
>> Thus, we need to handle the VMStartEvent. I suggest a modification to 
>> my previous code:
>>
>>   launchTarget();
>>   while(connected) {
>>       eventSet = eventQueue.remove()
>>       for(event : eventQueue) {
>>           if (event instanceof VMStartEvent) {
>>               addClassWatch();
>>           }
>>           if (event instanceof ClassPrepareEvent) {
>>               addFieldWatch();
>>           }
>>       }
>>       eventSet.resume();
>>   }
>>
>> This will cause us to have complete control over the state of the 
>> debuggee. The first event we see will be the VMStartEvent. The VM 
>> will be suspended. We can add a class watch here. Then we resume the 
>> VM. The second event we see will be the ClassPrepareEvent with the VM 
>> suspended. We can add the field watch. Then we resume the VM and wait 
>> for the field watch events.
>>
>> Thanks,
>> /Staffan
>>
>> On 13 feb 2014, at 11:36, shanliang <shanliang.jiang at oracle.com 
>> <mailto:shanliang.jiang at oracle.com>> wrote:
>>
>>> Staffan Larsen wrote:
>>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>>>>
>>>>   
>>>>> Hi Staffan,
>>>>>
>>>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>>>     
>>>>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>>>>       
>>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>>>
>>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>>>     
>>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>>>   
>>> The test failed when it received ClassPrepareEvent and did 
>>> addFieldWatch, that meant the test must receive already 
>>> VMStartEvent, because VMStartEvent must be the first event, if it 
>>> was true then the vm must be already stable when failing.
>>>
>>> Except that the test received ClassPrepareEvent before VMStartEvent 
>>> then it was doing addFieldWatch with a possibly unstable VM. in this 
>>> case we might have a serious bug in VirtualMachine implementation, 
>>> and if this is true the fix proposed to check "start" may make miss 
>>> ClassPrepareEvent, then the test would test nothing.
>>>
>>> Shanliang
>>>> /S
>>>>
>>>>   
>>>>> -JB-
>>>>>
>>>>>     
>>>>>> /Staffan
>>>>>>
>>>>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>
>>>>>>       
>>>>>>> Staffan Larsen wrote:
>>>>>>>         
>>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>>>>
>>>>>>>>  Note that a target VM launched by a launching connector is not
>>>>>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>>>>>  received.
>>>>>>>>
>>>>>>>>           
>>>>>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent.
>>>>>>>         
>>>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>>>>
>>>>>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>>>     EventQueue eventQueue = vm.eventQueue();
>>>>>>>>
>>>>>>>>     boolean started = false;
>>>>>>>>     while(!started) {
>>>>>>>>       EventSet eventSet = eventQueue.remove();
>>>>>>>>       for (Event event : eventSet) {
>>>>>>>>         if (event instanceof VMStartEvent) {
>>>>>>>>           started = true;
>>>>>>>>         }
>>>>>>>>         if (event instanceof VMDeathEvent
>>>>>>>>             || event instanceof VMDisconnectEvent) {
>>>>>>>>           throw new Error("VM died before it started...:"+event);
>>>>>>>>         }
>>>>>>>>       }
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     System.out.println("Vm launched");
>>>>>>>>
>>>>>>>>           
>>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>>>>
>>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>>>         
>>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>>>>>>
>>>>>>>>           
>>>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>>>>         
>>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>>>>
>>>>>>>>           
>>>>>>> Yes it was already removed in version 01
>>>>>>>
>>>>>>> Here is the new webrev:
>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Shanliang
>>>>>>>         
>>>>>>>> Thanks,
>>>>>>>> /Staffan
>>>>>>>>
>>>>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>           
>>>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Shanliang
>>>>>>>>>
>>>>>>>>> shanliang wrote:
>>>>>>>>>
>>>>>>>>>             
>>>>>>>>>> shanliang wrote:
>>>>>>>>>>
>>>>>>>>>>               
>>>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>>>>
>>>>>>>>>>>                 
>>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>                   
>>>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>                     
>>>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                       
>>>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>>>>
>>>>>>>>>>>>>                     
>>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>>>>>>
>>>>>>>>>>>>                   
>>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>>>>   /**
>>>>>>>>>>>    * Continues the execution of the application running in this
>>>>>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>>>>>    *
>>>>>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>>>    *
>>>>>>>>>>>    * @see #suspend
>>>>>>>>>>>    */
>>>>>>>>>>>   void resume();
>>>>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>>>>
>>>>>>>>>>>                 
>>>>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>>>>
>>>>>>>>>> Shanliang
>>>>>>>>>>
>>>>>>>>>>               
>>>>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>>>>
>>>>>>>>>>>>>                     
>>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>>>>
>>>>>>>>>>>>                   
>>>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Shanliang
>>>>>>>>>>>
>>>>>>>>>>>                 
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> -JB-
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>                   
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>
>>>>>>>>>>>>>                     
>>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                       
>>>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>>>>> already ended.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> web:
>>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                         
>>>>>>>         
>>>>>>       
>>>>   
>>>
>>>
>>
>

From mikael.vidstedt at oracle.com  Thu Feb 13 13:14:45 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Thu, 13 Feb 2014 13:14:45 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <52FD0DA2.7030201@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com>
Message-ID: <52FD35C5.1000306@oracle.com>


On 2014-02-13 10:23, Alan Bateman wrote:
> On 13/02/2014 17:56, Mikael Vidstedt wrote:
>> Alan,
>>
>> I made the change to JarFacade.c myself last week, only to then see 
>> the comment a few lines above where you added the new include. It 
>> seems to indicate that including ctype.h on Solaris/SPARC is a bad 
>> idea. I have no idea if the comment is still relevant, but that may 
>> be worth understanding first.
>>
>>
> Do you have cycles to look into it? As the code is using isspace 
> already then it's not clear (unless there are different versions). 
> Before pushing the changes then I ran the tests on all platforms 
> (including Solaris) and the j.l.i tests include a number of tests 
> exercise these manifest attributes with a non-US characters.

The change in question appears to come from 
https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the 
bug gives enough additional information. My speculation (and it's really 
just a speculation) is that it's not related to isspace per-se, but to 
something else which gets defined/redefined/undefined by including 
ctype.h. I guess it would be good to know if we have tests which cover 
the thing the comment is alluding to (non-ascii in Premain-Class).

> As an aside, the native code warnings coming from the jdk repository 
> are really annoying so this is the reason for the drive-by fixes when 
> I get a few minutes. I think others are doing the same.

Absolutely support this work! As a matter of fact I have a couple of 
changes in a sandbox I should send out for review.

Cheers,
Mikael


From serguei.spitsyn at oracle.com  Thu Feb 13 16:01:07 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 13 Feb 2014 16:01:07 -0800
Subject: RFR (S) 8034249: need more workarounds for suspend equivalent
	condition issue
Message-ID: <52FD5CC3.4090502@oracle.com>

Please, review the fix for:
   https://bugs.openjdk.java.net/browse/JDK-8034249


Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1

Summary:

   This issue was identified in the review of 8032223 and is similar to it,
   but impacts different JVMTI functions:
     GetCurrentContendedMonitor, GetOwnedMonitorInfo,
     GetOwnedMonitorStackDepthInfo, GetStackTrace

   There is a general issue in the suspend equivalent condition mechanism:
   Two subsequent calls to JvmtiEnv::is_thread_fully_suspended() may
   return different results:
     - 1-st: true
     - 2-nd: false

   This suspend equivalent issue is covered by another bug:
     https://bugs.openjdk.java.net/browse/JDK-6280037

   This fix works around 6280037.
   It is safer to collect the necessary information at a safepoint
   instead of relying on the suspension of the target thread.


Testing:
   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi


Thanks,
Serguei

From staffan.larsen at oracle.com  Fri Feb 14 00:45:04 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 14 Feb 2014 09:45:04 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FD27D8.8060200@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>	<52FC8DA1.9090900@oracle.com>	<1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>	<52FCA014.9000106@oracle.com>	<EF6C879F-2658-47E9-B733-0E0E027E3B4E@oracle.com>
	<52FCFDBA.5040108@oracle.com> <52FD27D8.8060200@oracle.com>
Message-ID: <28F9E62B-3A20-4BFB-AA40-9568C8BEDBC2@oracle.com>

This version looks good! Thanks for hanging in there.

The only improvement would be to count and verify the number of ModificationWatchpointEvent (there should be 10).

Thanks,
/Staffan
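
The count check suggested above could look roughly like the sketch below. It
is an illustration only: countInstances and verifyCount are hypothetical
helper names, and the demo in main() uses plain Integer and String objects as
stand-ins so that it runs without a debuggee. In the real test the batch would
be a JDI EventSet (which is iterable), the type would be
ModificationWatchpointEvent.class, and the expected total would be 10.

```java
import java.util.Arrays;
import java.util.List;

public class EventCountSketch {
    /** Counts the elements of one batch that are instances of the given type. */
    static int countInstances(Iterable<?> batch, Class<?> type) {
        int n = 0;
        for (Object o : batch) {
            if (type.isInstance(o)) {
                n++;
            }
        }
        return n;
    }

    /** Fails loudly if the observed total differs from the expected one. */
    static void verifyCount(int expected, int actual) {
        if (actual != expected) {
            throw new RuntimeException(
                "Expected " + expected + " events but saw " + actual);
        }
    }

    public static void main(String[] args) {
        // Stand-in batches: Integer plays the role of the event type being counted.
        List<List<Object>> batches = Arrays.asList(
            Arrays.<Object>asList("start"),
            Arrays.<Object>asList(1, 2, "other"),
            Arrays.<Object>asList(3));

        int total = 0;
        for (List<Object> batch : batches) {
            total += countInstances(batch, Integer.class);
        }
        verifyCount(3, total);
        System.out.println("counted " + total + " matching events");
    }
}
```

Accumulating the total across batches and checking it once, after the loop
ends, keeps the check independent of how the events happen to be grouped.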

On 13 feb 2014, at 21:15, shanliang <shanliang.jiang at oracle.com> wrote:

> Hi,
> 
> Here is Version 4:
>     http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/
> 
> 1) remove the line
>     108   vm.resume()
> 2) call addClassWatch(vm) only when receiving VMStartEvent
> 3) make sure that the test receives ModificationWatchpointEvent
> 4) clean
> 
> Thanks,
> Shanliang
> 
> shanliang wrote:
>> 
>> Staffan,
>> 
>> Very nice analysis! 
>> 
>> The fix must be very simple, just remove the line 
>>     108   vm.resume
>> it is an error because here the test does not yet treat the events in eventSet.
>> 
>> the line
>>     136   eventSet.resume();
>> is the right place to resume the threads after event treatment.
>> 
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/
>> 
>> Thanks,
>> Shanliang
>> 
>> Staffan Larsen wrote:
>>> 
>>> I think I understand what happens now.
>>> 
>>> The test code, simplified, looks like this (with the Thread.sleep() added that causes the test to fail):
>>> 
>>>   launchTarget();
>>>   addClassWatch();
>>>   vm.resume();
>>>   Thread.sleep(1000);
>>>   while(connected) {
>>>       eventSet = eventQueue.remove()
>>>       for(event : eventQueue) {
>>>           if (event instanceof ClassPrepareEvent) {
>>>               addFieldWatch();
>>>           }
>>>       }
>>>       eventSet.resume();
>>>   }
>>> 
>>> By default all events that happen will cause the debuggee to suspend (see EventRequest.setSuspendPolicy()). Thus when we get to addFieldWatch(), the vm should be suspended and we should be able to create the field watch without problem. But the VM isn't suspended and that is why the test fails.
>>> 
>>> Why isn't the VM suspended? When we get to the "for(event : eventQueue)" the first time there are *two* events already in the queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM is suspended and everything is good. We look at the first eventSet which only contains the VMStartEvent, we ignore the event, but we resume the VM. We then loop and look at the ClassPrepareEvent, but by now the VM is already running and has also terminated. Failure.
>>> 
>>> Thus, we need to handle the VMStartEvent. I suggest a modification to my previous code:
>>> 
>>>   launchTarget();
>>>   while(connected) {
>>>       eventSet = eventQueue.remove()
>>>       for(event : eventQueue) {
>>>           if (event instanceof VMStartEvent) {
>>>               addClassWatch();
>>>           }
>>>           if (event instanceof ClassPrepareEvent) {
>>>               addFieldWatch();
>>>           }
>>>       }
>>>       eventSet.resume();
>>>   }
>>> 
>>> This will cause us to have complete control over the state of the debuggee. The first event we see will be the VMStartEvent. The VM will be suspended. We can add a class watch here. Then we resume the VM. The second event we see will be the ClassPrepareEvent with the VM suspended. We can add the field watch. Then we resume the VM and wait for the field watch events.
>>> 
>>> Thanks,
>>> /Staffan
>>> 
>>> On 13 feb 2014, at 11:36, shanliang <shanliang.jiang at oracle.com> wrote:
>>> 
>>>> Staffan Larsen wrote:
>>>>> 
>>>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>>>>> 
>>>>>   
>>>>>> Hi Staffan,
>>>>>> 
>>>>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>>>>     
>>>>>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>>>>>       
>>>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>>>> 
>>>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>>>>     
>>>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>>>>   
>>>> The test failed when it received ClassPrepareEvent and did addFieldWatch, that meant the test must receive already VMStartEvent, because VMStartEvent must be the first event, if it was true then the vm must be already stable when failing.
>>>> 
>>>> Except that the test received ClassPrepareEvent before VMStartEvent then it was doing addFieldWatch with a possibly unstable VM. in this case we might have a serious bug in VirtualMachine implementation, and if this is true the fix proposed to check "start" may make miss ClassPrepareEvent, then the test would test nothing.
>>>> 
>>>> Shanliang
>>>>> /S
>>>>> 
>>>>>   
>>>>>> -JB-
>>>>>> 
>>>>>>     
>>>>>>> /Staffan
>>>>>>> 
>>>>>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>> 
>>>>>>>       
>>>>>>>> Staffan Larsen wrote:
>>>>>>>>         
>>>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>>>>> 
>>>>>>>>>  Note that a target VM launched by a launching connector is not
>>>>>>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>>>>>>  received.
>>>>>>>>> 
>>>>>>>>>           
>>>>>>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent.
>>>>>>>>         
>>>>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>>>>> 
>>>>>>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>>>>     EventQueue eventQueue = vm.eventQueue();
>>>>>>>>> 
>>>>>>>>>     boolean started = false;
>>>>>>>>>     while(!started) {
>>>>>>>>>       EventSet eventSet = eventQueue.remove();
>>>>>>>>>       for (Event event : eventSet) {
>>>>>>>>>         if (event instanceof VMStartEvent) {
>>>>>>>>>           started = true;
>>>>>>>>>         }
>>>>>>>>>         if (event instanceof VMDeathEvent
>>>>>>>>>             || event instanceof VMDisconnectEvent) {
>>>>>>>>>           throw new Error("VM died before it started...:"+event);
>>>>>>>>>         }
>>>>>>>>>       }
>>>>>>>>>     }
>>>>>>>>> 
>>>>>>>>>     System.out.println("Vm launched");
>>>>>>>>> 
>>>>>>>>>           
>>>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>>>>> 
>>>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>>>>         
>>>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>>>>>>> 
>>>>>>>>>           
>>>>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>>>>>         
>>>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>>>>> 
>>>>>>>>>           
>>>>>>>> Yes it was already removed in version 01
>>>>>>>> 
>>>>>>>> Here is the new webrev:
>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>>         
>>>>>>>>> Thanks,
>>>>>>>>> /Staffan
>>>>>>>>> 
>>>>>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>           
>>>>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>>>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Shanliang
>>>>>>>>>> 
>>>>>>>>>> shanliang wrote:
>>>>>>>>>> 
>>>>>>>>>>             
>>>>>>>>>>> shanliang wrote:
>>>>>>>>>>> 
>>>>>>>>>>>               
>>>>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>                 
>>>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>                   
>>>>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>                       
>>>>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>                     
>>>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>                   
>>>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>>>>>   /**
>>>>>>>>>>>>    * Continues the execution of the application running in this
>>>>>>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>>>>>>    *
>>>>>>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>>>>    *
>>>>>>>>>>>>    * @see #suspend
>>>>>>>>>>>>    */
>>>>>>>>>>>>   void resume();
>>>>>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>>>>> 
>>>>>>>>>>>>                 
>>>>>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>>>>> 
>>>>>>>>>>> Shanliang
>>>>>>>>>>> 
>>>>>>>>>>>               
>>>>>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>                     
>>>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>                   
>>>>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Shanliang
>>>>>>>>>>>> 
>>>>>>>>>>>>                 
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>                   
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>                       
>>>>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>>>>>> already ended.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> web:
>>>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>                         
>>>>>>>>         
>>>>>>>       
>>>>>   
>>>> 
>>>> 
>>> 
>> 
> 

From shanliang.jiang at oracle.com  Fri Feb 14 02:00:02 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Fri, 14 Feb 2014 11:00:02 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <28F9E62B-3A20-4BFB-AA40-9568C8BEDBC2@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>	<52FC8DA1.9090900@oracle.com>	<1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>	<52FCA014.9000106@oracle.com>	<EF6C879F-2658-47E9-B733-0E0E027E3B4E@oracle.com>
	<52FCFDBA.5040108@oracle.com> <52FD27D8.8060200@oracle.com>
	<28F9E62B-3A20-4BFB-AA40-9568C8BEDBC2@oracle.com>
Message-ID: <52FDE922.3040301@oracle.com>

Staffan Larsen wrote:
> This version looks good! Thanks for hanging in there.
>
> The only improvement would be to count and verify the number of 
> ModificationWatchpointEvent (there should be 10).
Good idea, here it is:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/05/

Thanks,
Shanliang
>
> Thanks,
> /Staffan
>
> On 13 feb 2014, at 21:15, shanliang <shanliang.jiang at oracle.com 
> <mailto:shanliang.jiang at oracle.com>> wrote:
>
>> Hi,
>>
>> Here is Version 4:
>>     http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/
>>
>> 1) remove the line
>>     108   vm.resume()
>> 2) call addClassWatch(vm) only when receiving VMStartEvent
>> 3) make sure that the test receives ModificationWatchpointEvent
>> 4) clean
>>
>> Thanks,
>> Shanliang
>>
>> shanliang wrote:
>>> Staffan,
>>>
>>> Very nice analysis!
>>>
>>> The fix must be very simple, just remove the line
>>>     108   vm.resume
>>> it is an error because here the test does not yet treat the events 
>>> in eventSet.
>>>
>>> the line
>>>     136   eventSet.resume();
>>> is the right place to resume the threads after event treatment.
>>>
>>> Here is the new webrev:
>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/
>>>
>>> Thanks,
>>> Shanliang
>>>
>>> Staffan Larsen wrote:
>>>> I think I understand what happens now.
>>>>
>>>> The test code, simplified, looks like this (with the Thread.sleep() 
>>>> added that causes the test to fail):
>>>>
>>>>   launchTarget();
>>>>   addClassWatch();
>>>>   vm.resume();
>>>>   Thread.sleep(1000);
>>>>   while(connected) {
>>>>       eventSet = eventQueue.remove();
>>>>       for(event : eventSet) {
>>>>           if (event instanceof ClassPrepareEvent) {
>>>>               addFieldWatch();
>>>>           }
>>>>       }
>>>>       eventSet.resume();
>>>>   }
>>>>
>>>> By default all events that happen will cause the debuggee to 
>>>> suspend (see EventRequest.setSuspendPolicy()). Thus when we get to 
>>>> addFieldWatch(), the vm should be suspended and we should be able 
>>>> to create the field watch without problem. But the VM isn't 
>>>> suspended and that is why the test fails.
>>>>
>>>> Why isn't the VM suspended? When we get to the "for(event : 
>>>> eventSet)" the first time there are *two* events already in the 
>>>> queue: the VMStartEvent and a ClassPrepareEvent. At this point the 
>>>> VM is suspended and everything is good. We look at the first 
>>>> eventSet which only contains the VMStartEvent, we ignore the event, 
>>>> but we resume the VM. We then loop and look at the 
>>>> ClassPrepareEvent, but by now the VM is already running and has 
>>>> also terminated. Failure.
>>>>
>>>> Thus, we need to handle the VMStartEvent. I suggest a modification 
>>>> to my previous code:
>>>>
>>>>   launchTarget();
>>>>   while(connected) {
>>>>       eventSet = eventQueue.remove();
>>>>       for(event : eventSet) {
>>>>           if (event instanceof VMStartEvent) {
>>>>               addClassWatch();
>>>>           }
>>>>           if (event instanceof ClassPrepareEvent) {
>>>>               addFieldWatch();
>>>>           }
>>>>       }
>>>>       eventSet.resume();
>>>>   }
>>>>
>>>> This will cause us to have complete control over the state of the 
>>>> debuggee. The first event we see will be the VMStartEvent. The VM 
>>>> will be suspended. We can add a class watch here. Then we resume 
>>>> the VM. The second event we see will be the ClassPrepareEvent with 
>>>> the VM suspended. We can add the field watch. Then we resume the VM 
>>>> and wait for the field watch events.
>>>>
>>>> Thanks,
>>>> /Staffan
>>>>
>>>> On 13 feb 2014, at 11:36, shanliang <shanliang.jiang at oracle.com 
>>>> <mailto:shanliang.jiang at oracle.com>> wrote:
>>>>
>>>>> Staffan Larsen wrote:
>>>>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>>>>>>
>>>>>>   
>>>>>>> Hi Staffan,
>>>>>>>
>>>>>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>>>>>     
>>>>>>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>>>>>>       
>>>>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>>>>>
>>>>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>>>>>     
>>>>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>>>>>   
>>>>> The test failed when it received ClassPrepareEvent and did 
>>>>> addFieldWatch, that meant the test must receive already 
>>>>> VMStartEvent, because VMStartEvent must be the first event, if it 
>>>>> was true then the vm must be already stable when failing.
>>>>>
>>>>> Except that the test received ClassPrepareEvent before 
>>>>> VMStartEvent then it was doing addFieldWatch with a possibly 
>>>>> unstable VM. in this case we might have a serious bug in 
>>>>> VirtualMachine implementation, and if this is true the fix 
>>>>> proposed to check "start" may make miss ClassPrepareEvent, then 
>>>>> the test would test nothing.
>>>>>
>>>>> Shanliang
>>>>>> /S
>>>>>>
>>>>>>   
>>>>>>> -JB-
>>>>>>>
>>>>>>>     
>>>>>>>> /Staffan
>>>>>>>>
>>>>>>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>
>>>>>>>>       
>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>         
>>>>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>>>>>>
>>>>>>>>>>  Note that a target VM launched by a launching connector is not
>>>>>>>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>>>>>>>  received.
>>>>>>>>>>
>>>>>>>>>>           
>>>>>>>>> I may be missing something here. I believe VMStartEvent must be the first event, so when the test got ClassPrepareEvent, it must have already received VMStartEvent.
>>>>>>>>>         
>>>>>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>>>>>>
>>>>>>>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>>>>>     EventQueue eventQueue = vm.eventQueue();
>>>>>>>>>>
>>>>>>>>>>     boolean started = false;
>>>>>>>>>>     while(!started) {
>>>>>>>>>>       EventSet eventSet = eventQueue.remove();
>>>>>>>>>>       for (Event event : eventSet) {
>>>>>>>>>>         if (event instanceof VMStartEvent) {
>>>>>>>>>>           started = true;
>>>>>>>>>>         }
>>>>>>>>>>         if (event instanceof VMDeathEvent
>>>>>>>>>>             || event instanceof VMDisconnectEvent) {
>>>>>>>>>>           throw new Error("VM died before it started...:"+event);
>>>>>>>>>>         }
>>>>>>>>>>       }
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>     System.out.println("Vm launched");
>>>>>>>>>>
>>>>>>>>>>           
>>>>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>>>>>>
>>>>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>>>>>         
>>>>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>>>>>>>>
>>>>>>>>>>           
>>>>>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>>>>>>         
>>>>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>>>>>>
>>>>>>>>>>           
>>>>>>>>> Yes it was already removed in version 01
>>>>>>>>>
>>>>>>>>> Here is the new webrev:
>>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Shanliang
>>>>>>>>>         
>>>>>>>>>> Thanks,
>>>>>>>>>> /Staffan
>>>>>>>>>>
>>>>>>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>           
>>>>>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>>>>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Shanliang
>>>>>>>>>>>
>>>>>>>>>>> shanliang wrote:
>>>>>>>>>>>
>>>>>>>>>>>             
>>>>>>>>>>>> shanliang wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>               
>>>>>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>                 
>>>>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                   
>>>>>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                       
>>>>>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                   
>>>>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>>>>>>   /**
>>>>>>>>>>>>>    * Continues the execution of the application running in this
>>>>>>>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>>>>>>>    *
>>>>>>>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>>>>>    *
>>>>>>>>>>>>>    * @see #suspend
>>>>>>>>>>>>>    */
>>>>>>>>>>>>>   void resume();
>>>>>>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>>>>>>
>>>>>>>>>>>>>                 
>>>>>>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>>>>>>
>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>
>>>>>>>>>>>>               
>>>>>>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should rather use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                   
>>>>>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>
>>>>>>>>>>>>>                 
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                   
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                       
>>>>>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>>>>>>> already ended.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> web:
>>>>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                         
>>>>>>>>>         
>>>>>>>>       
>>>>>>   
>>>>>
>>>>>
>>>>
>>>
>>
>
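
The ordering problem analysed in the quoted thread above can be illustrated with a small toy model. This is only a sketch and not JDI code: ToyVm, its suspend counter, and the method names originalOrder() and eventDrivenOrder() are hypothetical stand-ins, and the model assumes the worst case from the thread (the debugger sleeps long enough for the ClassPrepareEvent to be queued before the first event set is handled).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the resume-ordering problem; nothing here is JDI API.
final class ResumeOrderModel {
    enum Kind { VM_START, CLASS_PREPARE }

    // Hypothetical stand-in for the debuggee plus its event queue.
    static final class ToyVm {
        final Deque<Kind> queue = new ArrayDeque<>();
        int suspendCount;
        boolean classWatch, classPrepared, fieldWatch, exited;

        // A launched debuggee starts suspended with a start event queued.
        void launch() { queue.add(Kind.VM_START); suspendCount = 1; }

        // Drops one suspension; at zero the debuggee runs to its next event or exits.
        void resume() {
            if (exited || suspendCount == 0) return;
            if (--suspendCount > 0) return;
            if (classWatch && !classPrepared) {
                classPrepared = true;
                queue.add(Kind.CLASS_PREPARE);
                suspendCount = 1;          // the event suspends the debuggee again
            } else {
                exited = true;             // nothing left to stop it
            }
        }

        void addFieldWatch() {
            if (exited) throw new IllegalStateException("connection closed");
            fieldWatch = true;
        }
    }

    // Original ordering: class watch and resume happen before any event is handled.
    static boolean originalOrder() {
        ToyVm vm = new ToyVm();
        vm.launch();
        vm.classWatch = true;
        vm.resume();                       // debuggee runs to CLASS_PREPARE and stops
        return drain(vm, false);
    }

    // Proposed ordering: every request is made while handling an event.
    static boolean eventDrivenOrder() {
        ToyVm vm = new ToyVm();
        vm.launch();
        return drain(vm, true);
    }

    // Handles queued events one at a time, resuming after each one.
    private static boolean drain(ToyVm vm, boolean watchOnStart) {
        try {
            while (!vm.queue.isEmpty()) {
                Kind k = vm.queue.remove();
                if (k == Kind.VM_START && watchOnStart) vm.classWatch = true;
                if (k == Kind.CLASS_PREPARE) vm.addFieldWatch();
                vm.resume();               // plays the role of eventSet.resume()
            }
        } catch (IllegalStateException e) {
            return false;                  // the debuggee was gone before the watch was set
        }
        return vm.fieldWatch;
    }

    public static void main(String[] args) {
        System.out.println("original order installs watch: " + originalOrder());
        System.out.println("event-driven order installs watch: " + eventDrivenOrder());
    }
}
```

In this model the original ordering returns false, because resuming after the ignored start event lets the debuggee exit before the class-prepare event is handled, while the event-driven ordering returns true.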


From jaroslav.bachorik at oracle.com  Fri Feb 14 03:07:09 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 14 Feb 2014 12:07:09 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and exevent008
	tests fail with InvocationTargetException
Message-ID: <52FDF8DD.9090602@oracle.com>

This is a round-0 review request.

The reflection code intercepting the exceptions thrown in the invoked 
methods does not play nicely with JVMTI (which, in this case, propagates 
to JDI).

The reflection code lacks the traditional error handler - therefore, 
upon throwing the NumberFormatException, the stack is searched for 
appropriate handlers and none are found. This leaves the 
"exception_detected" flag set to true while normally it would be reset 
to false once the exception is handled. The reflection code then goes on 
and wraps the NumberFormatException into InvocationTargetException and 
throws it. But, alas, the "exception_detected" flag is still set to true 
and no JVMTI exception event will be sent out.

The proposed solution is to call 
thread->jvmti_thread_state()->clear_exception_detected() at the 
appropriate places in the reflection code to reset the 
"exception_detected" flag and enable the InvocationTargetException to be 
properly reported over JVMTI.

Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00

Thanks!

-JB-
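
For readers unfamiliar with the wrapping described above, here is a minimal Java-level sketch of it. It only shows what the caller of Method.invoke() observes (an InvocationTargetException whose cause is the original NumberFormatException). It does not touch JVMTI or the "exception_detected" flag, and the class and method names are made up for illustration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Illustrative only: shows the Java-visible wrapping, not the HotSpot internals.
final class ReflectiveThrowDemo {
    // Target method; throws NumberFormatException for non-numeric input.
    static int parse(String s) {
        return Integer.parseInt(s);
    }

    // Invokes parse() reflectively and returns whatever exception reaches the caller.
    static Throwable invokeAndCapture(String arg) {
        try {
            Method m = ReflectiveThrowDemo.class.getDeclaredMethod("parse", String.class);
            m.invoke(null, arg);
            return null;                        // target returned normally
        } catch (InvocationTargetException e) {
            return e;                           // wrapper created by the reflection layer
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e); // lookup or access problem, not the case shown here
        }
    }

    public static void main(String[] args) {
        Throwable t = invokeAndCapture("not-a-number");
        System.out.println("thrown:  " + t.getClass().getName());
        System.out.println("cause:   " + t.getCause().getClass().getName());
    }
}
```

The second exception (the wrapper) is the one the tests in the subject line expect to see reported as an exception event.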

From staffan.larsen at oracle.com  Fri Feb 14 03:16:10 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 14 Feb 2014 12:16:10 +0100
Subject: Codereview request: 8007710 runtime/7158988/FieldMonitor.java
	fails with com.sun.jdi.VMDisconnectedException: Connection closed
In-Reply-To: <52FDE922.3040301@oracle.com>
References: <52FA301A.2040301@oracle.com>	<F34A2199-253B-49B8-BEC9-00AEA98029C2@oracle.com>	<52FA424E.2050202@oracle.com>	<52FA4515.3050904@oracle.com>	<52FA4D33.4030402@oracle.com>	<52FA51BE.4070701@oracle.com>	<52FA5E51.3040704@oracle.com>	<3F301491-8A27-4319-A694-D1E8ACB2F249@oracle.com>	<52FBA9B2.6090502@oracle.com>	<BC488F45-A698-4C92-986F-BBBEBA553CEF@oracle.com>	<52FC8DA1.9090900@oracle.com>	<1353E47D-3ACB-42FD-ABD3-845CEB6EE389@oracle.com>	<52FCA014.9000106@oracle.com>	<EF6C879F-2658-47E9-B733-0E0E027E3B4E@oracle.com>
	<52FCFDBA.5040108@oracle.com> <52FD27D8.8060200@oracle.com>
	<28F9E62B-3A20-4BFB-AA40-9568C8BEDBC2@oracle.com>
	<52FDE922.3040301@oracle.com>
Message-ID: <CF3A843D-8175-4AFA-B079-2C7C2F805A46@oracle.com>

Good.

Just a small spelling error "Espected" -> "Expected".

/Staffan

On 14 feb 2014, at 11:00, shanliang <shanliang.jiang at oracle.com> wrote:

> Staffan Larsen wrote:
>> 
>> This version looks good! Thanks for hanging in there.
>> 
>> The only improvement would be to count and verify the number of ModificationWatchpointEvent (there should be 10).
> Good idea, here it is:
> http://cr.openjdk.java.net/~sjiang/JDK-8007710/05/
> 
> Thanks,
> Shanliang
>> 
>> Thanks,
>> /Staffan
>> 
>> On 13 feb 2014, at 21:15, shanliang <shanliang.jiang at oracle.com> wrote:
>> 
>>> Hi,
>>> 
>>> Here is Version 4:
>>>     http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/
>>> 
>>> 1) remove the line
>>>     108   vm.resume()
>>> 2) call addClassWatch(vm) only when receiving VMStartEvent
>>> 3) make sure that the test receives ModificationWatchpointEvent
>>> 4) clean
>>> 
>>> Thanks,
>>> Shanliang
>>> 
>>> shanliang wrote:
>>>> 
>>>> Staffan,
>>>> 
>>>> Very nice analysis! 
>>>> 
>>>> The fix must be very simple, just remove the line 
>>>>     108   vm.resume
>>>> it is an error because here the test does not yet treat the events in eventSet.
>>>> 
>>>> the line
>>>>     136   eventSet.resume();
>>>> is the right place to resume the threads after event treatment.
>>>> 
>>>> Here is the new webrev:
>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/
>>>> 
>>>> Thanks,
>>>> Shanliang
>>>> 
>>>> Staffan Larsen wrote:
>>>>> 
>>>>> I think I understand what happens now.
>>>>> 
>>>>> The test code, simplified, looks like this (with the Thread.sleep() added that causes the test to fail):
>>>>> 
>>>>>   launchTarget();
>>>>>   addClassWatch();
>>>>>   vm.resume();
>>>>>   Thread.sleep(1000);
>>>>>   while(connected) {
>>>>>       eventSet = eventQueue.remove();
>>>>>       for(event : eventSet) {
>>>>>           if (event instanceof ClassPrepareEvent) {
>>>>>               addFieldWatch();
>>>>>           }
>>>>>       }
>>>>>       eventSet.resume();
>>>>>   }
>>>>> 
>>>>> By default all events that happen will cause the debuggee to suspend (see EventRequest.setSuspendPolicy()). Thus when we get to addFieldWatch(), the vm should be suspended and we should be able to create the field watch without problem. But the VM isn't suspended and that is why the test fails.
>>>>> 
>>>>> Why isn't the VM suspended? When we get to the "for(event : eventSet)" the first time there are *two* events already in the queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM is suspended and everything is good. We look at the first eventSet which only contains the VMStartEvent, we ignore the event, but we resume the VM. We then loop and look at the ClassPrepareEvent, but by now the VM is already running and has also terminated. Failure.
>>>>> 
>>>>> Thus, we need to handle the VMStartEvent. I suggest a modification to my previous code:
>>>>> 
>>>>>   launchTarget();
>>>>>   while(connected) {
>>>>>       eventSet = eventQueue.remove();
>>>>>       for(event : eventSet) {
>>>>>           if (event instanceof VMStartEvent) {
>>>>>               addClassWatch();
>>>>>           }
>>>>>           if (event instanceof ClassPrepareEvent) {
>>>>>               addFieldWatch();
>>>>>           }
>>>>>       }
>>>>>       eventSet.resume();
>>>>>   }
>>>>> 
>>>>> This will cause us to have complete control over the state of the debuggee. The first event we see will be the VMStartEvent. The VM will be suspended. We can add a class watch here. Then we resume the VM. The second event we see will be the ClassPrepareEvent with the VM suspended. We can add the field watch. Then we resume the VM and wait for the field watch events.
>>>>> 
>>>>> Thanks,
>>>>> /Staffan
>>>>> 
>>>>> On 13 feb 2014, at 11:36, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>> 
>>>>>> Staffan Larsen wrote:
>>>>>>> 
>>>>>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>>>>>>> 
>>>>>>>   
>>>>>>>> Hi Staffan,
>>>>>>>> 
>>>>>>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>>>>>>     
>>>>>>>>> I'm still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can't quite explain what goes wrong without the fix... I'd really like to understand that. I'll try to dig deeper and see if I can understand exactly what happens.
>>>>>>>>>       
>>>>>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>>>>>> 
>>>>>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>>>>>>     
>>>>>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>>>>>>   
>>>>>> The test failed when it received ClassPrepareEvent and did addFieldWatch, that meant the test must receive already VMStartEvent, because VMStartEvent must be the first event, if it was true then the vm must be already stable when failing.
>>>>>> 
>>>>>> Except that the test received ClassPrepareEvent before VMStartEvent then it was doing addFieldWatch with a possibly unstable VM. in this case we might have a serious bug in VirtualMachine implementation, and if this is true the fix proposed to check "start" may make miss ClassPrepareEvent, then the test would test nothing.
>>>>>> 
>>>>>> Shanliang
>>>>>>> /S
>>>>>>> 
>>>>>>>   
>>>>>>>> -JB-
>>>>>>>> 
>>>>>>>>     
>>>>>>>>> /Staffan
>>>>>>>>> 
>>>>>>>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>> 
>>>>>>>>>       
>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>         
>>>>>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>>>>>>> 
>>>>>>>>>>>  Note that a target VM launched by a launching connector is not
>>>>>>>>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>>>>>>>>  received.
>>>>>>>>>>> 
>>>>>>>>>>>           
>>>>>>>>>> I may be missing something here. I believe VMStartEvent must be the first event, so when the test got ClassPrepareEvent, it must have already received VMStartEvent.
>>>>>>>>>>         
>>>>>>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>>>>>>> 
>>>>>>>>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>>>>>>     EventQueue eventQueue = vm.eventQueue();
>>>>>>>>>>> 
>>>>>>>>>>>     boolean started = false;
>>>>>>>>>>>     while(!started) {
>>>>>>>>>>>       EventSet eventSet = eventQueue.remove();
>>>>>>>>>>>       for (Event event : eventSet) {
>>>>>>>>>>>         if (event instanceof VMStartEvent) {
>>>>>>>>>>>           started = true;
>>>>>>>>>>>         }
>>>>>>>>>>>         if (event instanceof VMDeathEvent
>>>>>>>>>>>             || event instanceof VMDisconnectEvent) {
>>>>>>>>>>>           throw new Error("VM died before it started...:"+event);
>>>>>>>>>>>         }
>>>>>>>>>>>       }
>>>>>>>>>>>     }
>>>>>>>>>>> 
>>>>>>>>>>>     System.out.println("Vm launched");
>>>>>>>>>>> 
>>>>>>>>>>>           
>>>>>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>>>>>>> 
>>>>>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>>>>>>         
>>>>>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm.classesByName() will just return an empty list anyway.
>>>>>>>>>>> 
>>>>>>>>>>>           
>>>>>>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>>>>>>>         
>>>>>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>>>>>>> 
>>>>>>>>>>>           
>>>>>>>>>> Yes it was already removed in version 01
>>>>>>>>>> 
>>>>>>>>>> Here is the new webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Shanliang
>>>>>>>>>>         
>>>>>>>>>>> Thanks,
>>>>>>>>>>> /Staffan
>>>>>>>>>>> 
>>>>>>>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>           
>>>>>>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested by Jaroslav
>>>>>>>>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Shanliang
>>>>>>>>>>>> 
>>>>>>>>>>>> shanliang wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>             
>>>>>>>>>>>>> shanliang wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>               
>>>>>>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>                 
>>>>>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>                   
>>>>>>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I can't quite see how the test can fail in this way. When the
>>>>>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>                       
>>>>>>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a daemon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only a few cycles.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>                   
>>>>>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>>>>>>>   /**
>>>>>>>>>>>>>>    * Continues the execution of the application running in this
>>>>>>>>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>>>>>>>>    *
>>>>>>>>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>>>>>>    *
>>>>>>>>>>>>>>    * @see #suspend
>>>>>>>>>>>>>>    */
>>>>>>>>>>>>>>   void resume();
>>>>>>>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>                 
>>>>>>>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>> 
>>>>>>>>>>>>>               
>>>>>>>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should rather use process.getOutputStream() to write and flush a message for the debuggee indicating that it can exit. And in the debuggee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>                   
>>>>>>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>                 
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>                   
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>                     
>>>>>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>>>>>>> VMStartEvent before setting up requests. I'm not sure if that could
>>>>>>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>                       
>>>>>>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>>>>>>>> already ended.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> web:
>>>>>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>                         
>>>>>>>>>>         
>>>>>>>>>       
>>>>>>>   
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
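
The check suggested at the top of the quoted thread (count the ModificationWatchpointEvent occurrences and verify there are 10) could look roughly like the sketch below. It is an assumption-laden illustration, not the code from the webrev: event kinds are represented as plain strings rather than JDI Event objects, and WatchpointTally and its members are hypothetical names.

```java
import java.util.List;

// Illustrative only: event kinds are plain strings here, not JDI Event objects.
final class WatchpointTally {
    // The thread states that the test should see 10 modification events.
    static final int EXPECTED_MODIFICATIONS = 10;

    // Counts how many recorded events have the given kind.
    static long countOf(List<String> seenKinds, String kind) {
        return seenKinds.stream().filter(kind::equals).count();
    }

    // Fails loudly when the number of modification events is not the expected one.
    static void verify(List<String> seenKinds) {
        long n = countOf(seenKinds, "ModificationWatchpointEvent");
        if (n != EXPECTED_MODIFICATIONS) {
            throw new AssertionError("Expected " + EXPECTED_MODIFICATIONS
                    + " ModificationWatchpointEvent, got " + n);
        }
    }
}
```

Checking the exact count, rather than only "at least one", also catches the case where the debuggee was resumed too early and some field modifications went unobserved.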


From dmitry.samersoff at oracle.com  Fri Feb 14 06:52:10 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 14 Feb 2014 18:52:10 +0400
Subject: RR(S): JDK-7189721 nightly tests failed on JDI attaching
Message-ID: <52FE2D9A.4030700@oracle.com>

Hi Everyone,

Please, review the changes.

http://cr.openjdk.java.net/~dsamersoff/JDK-7189721/webrev.01/

Summary:

If for some reason the ELF section holding the section names is not loaded
into the cache, the code attempts to read data through a NULL section pointer.


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From daniel.daugherty at oracle.com  Fri Feb 14 07:45:49 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 14 Feb 2014 08:45:49 -0700
Subject: RFR (S) 8034249: need more workarounds for suspend equivalent
	condition issue
In-Reply-To: <52FD5CC3.4090502@oracle.com>
References: <52FD5CC3.4090502@oracle.com>
Message-ID: <52FE3A2D.808@oracle.com>

On 2/13/14 5:01 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>   https://bugs.openjdk.java.net/browse/JDK-8034249
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1 
>

src/share/vm/prims/jvmtiEnvBase.hpp
     lines 360 and 446 are a bit long

src/share/vm/prims/jvmtiEnv.cpp
     No comments.

Thumbs up.

Dan


>
> Summary:
>
>   This issue was identified in the review of the 8032223 and it is 
> similar to the 8032223
>   but impacts different JVMTI functions:
>     GetCurrentContendedMonitor, GetOwnedMonitorInfo,
>     GetOwnedMonitorStackDepthInfo, GetStackTrace
>
>   There is a general issue in the suspend equivalent condition mechanism:
>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
> may return different results:
>     - 1-st: true
>     - 2-nd: false
>
>   This suspend equivalent issue is covered by another bug:
>     https://bugs.openjdk.java.net/browse/JDK-6280037
>
>   This fix is to work around the 6280037.
>   It is safer to collect the necessary information at a safepoint 
> instead of
>   relying on the suspension of the target thread.
>
>
> Testing:
>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>
>
> Thanks,
> Serguei


From daniel.daugherty at oracle.com  Fri Feb 14 08:38:58 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 14 Feb 2014 09:38:58 -0700
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <52FDF8DD.9090602@oracle.com>
References: <52FDF8DD.9090602@oracle.com>
Message-ID: <52FE46A2.20705@oracle.com>

 > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00

Nice job on such an old bug.

src/share/vm/prims/jvmtiExport.hpp
     No comments.

src/share/vm/prims/jvmtiExport.cpp
     line 2170:       state->clear_exception_detected();
         HotSpot indent is two spaces.

src/share/vm/runtime/reflection.cpp
     line 948: JvmtiExport::clear_detected_exception((JavaThread*) THREAD);
     line 1085: JvmtiExport::clear_detected_exception((JavaThread*) THREAD);
         HotSpot indent is two spaces.

So there are two existing tests for this failure mode:

     nsk/jdi/ExceptionEvent/_itself_/exevent006
     nsk/jdi/ExceptionEvent/_itself_/exevent008

so I'm presuming that you've run the above tests and they are now
happy, but you haven't really stated that. What other pre-integration
testing do you have planned?

Dan


On 2/14/14 4:07 AM, Jaroslav Bachorik wrote:
> This is a round-0 review request.
>
> The reflection code intercepting the exceptions thrown in the invoked 
> methods does not play nicely with JVMTI (which, in this case, 
> propagates to JDI).
>
> The reflection code lacks the traditional error handler - therefore, 
> upon throwing the NumberFormatException, the stack is searched for 
> appropriate handlers and none are found. This leaves the 
> "exception_detected" flag set to true while normally it would be reset 
> to false once the exception is handled. The reflection code then goes 
> on and wraps the NumberFormatException into InvocationTargetException 
> and throws it. But, alas, the "exception_detected" flag is still set 
> to true and no JVMTI exception event will be sent out.
>
> The proposed solution is to call 
> thread->jvmti_thread_state()->clear_exception_detected() at the 
> appropriate places in the reflection code to reset the 
> "exception_detected" flag and enable the InvocationTargetException to be 
> properly reported over JVMTI.
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>
> Thanks!
>
> -JB-
>
>
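
The flag behaviour described in the request above can be illustrated with
a toy model. This is a rough sketch under stated assumptions, not HotSpot
code: struct thread_state, on_throw(), on_catch() and invoke_and_wrap()
are hypothetical names, and the real JVMTI event logic is considerably
more involved.

```c
#include <stdbool.h>

/* Toy model only: all names here are hypothetical. */
struct thread_state {
    bool exception_detected;  /* set once an exception has been reported */
    int  events_posted;       /* number of exception events sent out */
};

/* Post an exception event unless one is already marked as detected. */
void on_throw(struct thread_state *ts)
{
    if (!ts->exception_detected) {
        ts->exception_detected = true;
        ts->events_posted++;
    }
}

/* A handler was found: the exception is finished, so reset the flag. */
void on_catch(struct thread_state *ts)
{
    ts->exception_detected = false;
}

/*
 * Reflection-style wrapping: the original exception is intercepted
 * without a regular handler, then a wrapper exception is thrown.
 * Returns how many events were posted during the call.
 */
int invoke_and_wrap(struct thread_state *ts, bool clear_before_rethrow)
{
    int before = ts->events_posted;
    on_throw(ts);                        /* exception in the invoked method */
    if (clear_before_rethrow) {
        ts->exception_detected = false;  /* the proposed reset */
    }
    on_throw(ts);                        /* the wrapping exception */
    return ts->events_posted - before;
}
```

In this model the wrapper is reported only when the flag is cleared
between the two throws, which is the effect the proposed change aims for.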


From serguei.spitsyn at oracle.com  Fri Feb 14 09:31:37 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 14 Feb 2014 09:31:37 -0800
Subject: RFR (S) 8034249: need more workarounds for suspend equivalent
	condition issue
In-Reply-To: <52FE3A2D.808@oracle.com>
References: <52FD5CC3.4090502@oracle.com> <52FE3A2D.808@oracle.com>
Message-ID: <52FE52F9.5070701@oracle.com>


On 2/14/14 7:45 AM, Daniel D. Daugherty wrote:
> On 2/13/14 5:01 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-8034249
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1 
>>
>
> src/share/vm/prims/jvmtiEnvBase.hpp
>     lines 360 and 446 are a bit long

I'll split the lines.


>
> src/share/vm/prims/jvmtiEnv.cpp
>     No comments.
>
> Thumbs up.

Thank you for review!
Serguei
>
> Dan
>
>
>
>>
>> Summary:
>>
>>   This issue was identified in the review of the 8032223 and it is 
>> similar to the 8032223
>>   but impacts different JVMTI functions:
>>     GetCurrentContendedMonitor, GetOwnedMonitorInfo,
>>     GetOwnedMonitorStackDepthInfo, GetStackTrace
>>
>>   There is a general issue in the suspend equivalent condition 
>> mechanism:
>>   Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() 
>> may return different results:
>>     - 1-st: true
>>     - 2-nd: false
>>
>>   This suspend equivalent issue is covered by another bug:
>>     https://bugs.openjdk.java.net/browse/JDK-6280037
>>
>>   This fix is to work around the 6280037.
>>   It is safer to collect the necessary information at a safepoint 
>> instead of
>>   relying on the suspension of the target thread.
>>
>>
>> Testing:
>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>


From serguei.spitsyn at oracle.com  Fri Feb 14 10:46:30 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 14 Feb 2014 10:46:30 -0800
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <52FDF8DD.9090602@oracle.com>
References: <52FDF8DD.9090602@oracle.com>
Message-ID: <52FE6486.8000603@oracle.com>

Jaroslav,

It looks good in general modulo indent comments from Dan.

But I doubt that acquiring the JvmtiThreadState_lock is needed, or is
the right thing to do, in JvmtiExport::clear_detected_exception().
It seems that both clear_exception_detected() and set_exception_detected()
are always called on the current thread, so it should be safe to do
without acquiring any locks.

And I'm repeating my question about pre-integration testing (Dan is 
asking about the same).

Thanks,
Serguei


On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
> This is a round-0 review request.
>
> The reflection code intercepting the exceptions thrown in the invoked 
> methods does not play nicely with JVMTI (which, in this case, 
> propagates to JDI).
>
> The reflection code lacks the traditional error handler - therefore, 
> upon throwing the NumberFormatException, the stack is searched for 
> appropriate handlers and none are found. This leaves the 
> "exception_detected" flag set to true while normally it would be reset 
> to false once the exception is handled. The reflection code then goes 
> on and wraps the NumberFormatException into InvocationTargetException 
> and throws it. But, alas, the "exception_detected" flag is still set 
> to true and no JVMTI exception event will be sent out.
>
> The proposed solution is to call 
> thread->jvmti_thread_state()->clear_exception_detected() at the 
> appropriate places in the reflection code to reset the 
> "exception_detected" flag and enable the InvocationTargetException to be 
> properly reported over JVMTI.
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>
> Thanks!
>
> -JB-


From daniel.daugherty at oracle.com  Fri Feb 14 12:33:12 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 14 Feb 2014 13:33:12 -0700
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <52FE6486.8000603@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
Message-ID: <52FE7D88.8000701@oracle.com>

On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
> Jaroslav,
>
> It looks good in general modulo indent comments from Dan.
>
> But I have a doubt that acquiring the JvmtiThreadState_lock is needed
> or right thing to do in the JvmtiExport::clear_detected_exception().
> It seems, both clear_exception_detected() and set_exception_detected() 
> are always
> called on current thread and so, it has to be safe to do without 
> acquiring any locks.

My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
can also be queried/modified by other threads so grabbing the
associated lock is a good idea.

Dan


>
> And I'm repeating my question about pre-integration testing (Dan is 
> asking about the same).
>
> Thanks,
> Serguei
>
>
> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>> This is a round-0 review request.
>>
>> The reflection code intercepting the exceptions thrown in the invoked 
>> methods does not play nicely with JVMTI (which, in this case, 
>> propagates to JDI).
>>
>> The reflection code lacks the traditional error handler - therefore, 
>> upon throwing the NumberFormatException, the stack is searched for 
>> appropriate handlers and none are found. This leaves the 
>> "exception_detected" flag set to true while normally it would be 
>> reset to false once the exception is handled. The reflection code 
>> then goes on and wraps the NumberFormatException into 
>> InvocationTargetException and throws it. But, alas, the 
>> "exception_detected" flag is still set to true and no JVMTI exception 
>> event will be sent out.
>>
>> The proposed solution is to call 
>> thread->jvmti_thread_state()->clear_exception_detected() at the 
>> appropriate places in the reflection code to reset the 
>> "exception_detected" flag and enable the InvocationTargetException to be 
>> properly reported over JVMTI.
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>
>> Thanks!
>>
>> -JB-
>


From serguei.spitsyn at oracle.com  Fri Feb 14 14:13:50 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 14 Feb 2014 14:13:50 -0800
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <52FE7D88.8000701@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
	<52FE7D88.8000701@oracle.com>
Message-ID: <52FE951E.1060002@oracle.com>

On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>> Jaroslav,
>>
>> It looks good in general modulo indent comments from Dan.
>>
>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed
>> or right thing to do in the JvmtiExport::clear_detected_exception().
>> It seems, both clear_exception_detected() and 
>> set_exception_detected() are always
>> called on current thread and so, it has to be safe to do without 
>> acquiring any locks.
>
> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
> can also be queried/modified by other threads so grabbing the
> associated lock is a good idea.

The lock synchronization is cooperative.
It does not help much if the lock is not acquired in other places.
I may be wrong, but I have not yet found any place in the code where
clear_exception_detected() and set_exception_detected() are called
under protection of the JvmtiThreadState_lock.

Thanks,
Serguei

>
> Dan
>
>
>>
>> And I'm repeating my question about pre-integration testing (Dan is 
>> asking about the same).
>>
>> Thanks,
>> Serguei
>>
>>
>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>> This is a round-0 review request.
>>>
>>> The reflection code intercepting the exceptions thrown in the 
>>> invoked methods does not play nicely with JVMTI (which, in this 
>>> case, propagates to JDI).
>>>
>>> The reflection code lacks the traditional error handler - therefore, 
>>> upon throwing the NumberFormatException, the stack is searched for 
>>> appropriate handlers and none are found. This leaves the 
>>> "exception_detected" flag set to true while normally it would be 
>>> reset to false once the exception is handled. The reflection code 
>>> then goes on and wraps the NumberFormatException into 
>>> InvocationTargetException and throws it. But, alas, the 
>>> "exception_detected" flag is still set to true and no JVMTI 
>>> exception event will be sent out.
>>>
>>> The proposed solution is to call 
>>> thread->jvmti_thread_state()->clear_exception_detected() at the 
>>> appropriate places in the reflection code to reset the 
>>> "exception_detected" flag and enable the InvocationTargetException 
>>> to be properly reported over JVMTI.
>>>
>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>
>>> Thanks!
>>>
>>> -JB-
>>
>


From Alan.Bateman at oracle.com  Sun Feb 16 09:20:47 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Sun, 16 Feb 2014 17:20:47 +0000
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <52FD35C5.1000306@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
Message-ID: <5300F36F.9010205@oracle.com>

On 13/02/2014 21:14, Mikael Vidstedt wrote:
> :
>
> The change in question appears to come from 
> https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the 
> bug gives enough additional information. My speculation (and it's 
> really just a speculation) is that it's not related to isspace per-se, 
> but to something else which gets defined/redefined/undefined by 
> including ctype.h. I guess it would be good to know if we have tests 
> which cover the thing the comment is alluding to (non-ascii in 
> Premain-Class).
Thanks for pointing this out. I looked at it again and the issue is that 
isspace is a macro and depends on the locale. Not including ctype.h 
means we get linked to the libc function instead. One approach 
is to include ctype.h and then #undef isspace, another is to define the 
function prototype ourselves. I think the latter is a little bit better 
because it would avoid accidental usage of other locale-sensitive char 
classifiers. Attached is the patch that I propose. I have deliberately 
moved it to after the other includes so we get a chance to #undef in the 
event that it gets included by something else.

As for tests, PremainClassTest.java is good enough to find this on Solaris.

-Alan


diff --git a/src/share/instrument/JarFacade.c b/src/share/instrument/JarFacade.c
--- a/src/share/instrument/JarFacade.c
+++ b/src/share/instrument/JarFacade.c
@@ -23,17 +23,20 @@
   * questions.
   */

-#ifdef _WIN32
-/*
- * Win* needs this include. However, Linux and Solaris do not.
- * Having this include on Solaris SPARC breaks having non US-ASCII
- * characters in the value of the Premain-Class attribute.
- */
-#include <ctype.h>
-#endif /* _WIN32 */
  #include <string.h>
  #include <stdlib.h>
-#include <ctype.h>
+
+/**
+ * ctype.h is required on Windows. For other platforms we use a function
+ * prototype to ensure that we use the libc isspace function rather than
+ * the isspace macro (due to isspace being locale sensitive)
+ */
+#ifdef _WIN32
+  #include <ctype.h>
+#else
+  #undef isspace
+  extern int isspace(int c);
+#endif /* _WIN32 */

  #include "jni.h"
  #include "manifest_info.h"

From martinrb at google.com  Sun Feb 16 14:20:30 2014
From: martinrb at google.com (Martin Buchholz)
Date: Sun, 16 Feb 2014 14:20:30 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <5300F36F.9010205@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
Message-ID: <CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>

Those locale-dependent APIs - more trouble than they're worth.  More often
than not, you want a locale-independent version.

So just define your own is_ASCII_space etc. like everybody else has done
and move on.
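
For illustration, such a helper might look like the sketch below. The
names is_ascii_space() and skip_ascii_space() are made up for this
example and are not taken from JarFacade.c or from the proposed patch.

```c
/*
 * Locale-independent test for the six ASCII whitespace characters that
 * isspace() accepts in the "C" locale. The result never depends on
 * setlocale().
 */
int is_ascii_space(int c)
{
    return c == ' '  || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}

/*
 * Skip leading ASCII whitespace. Bytes with the high bit set, such as
 * those in a UTF-8 encoded attribute value, are never treated as space.
 */
const char *skip_ascii_space(const char *s)
{
    while (is_ascii_space((unsigned char)*s)) {
        s++;
    }
    return s;
}
```

Because the check is a plain comparison, it needs neither ctype.h nor a
function prototype for the libc isspace.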


On Sun, Feb 16, 2014 at 9:20 AM, Alan Bateman <Alan.Bateman at oracle.com> wrote:

> On 13/02/2014 21:14, Mikael Vidstedt wrote:
>
>> :
>>
>> The change in question appears to come from
>> https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the
>> bug gives enough additional information. My speculation (and it's really
>> just a speculation) is that it's not related to isspace per-se, but to
>> something else which gets defined/redefined/undefined by including ctype.h.
>> I guess it would be good to know if we have tests which cover the thing the
>> comment is alluding to (non-ascii in Premain-Class).
>>
> Thanks for pointing this out. I looked at it again and the issue is that
> isspace is a macro and depends on the locale. Not including ctype.h means
> we get linked to the libc function instead. One approach is to include
> ctype.h and then #undef isspace, another is to define the function
> prototype ourselves. I think the latter is a little bit better because it
> would avoid accidental usage of other locale-sensitive char classifiers.
> Attached is the patch that I propose. I have deliberately moved it to after
> the other includes so we get a chance to #undef in the event that it gets
> included by something else.
>
> As for tests, PremainClassTest.java is good enough to find this on Solaris.
>
> -Alan
>
>
> diff --git a/src/share/instrument/JarFacade.c b/src/share/instrument/JarFacade.c
> --- a/src/share/instrument/JarFacade.c
> +++ b/src/share/instrument/JarFacade.c
> @@ -23,17 +23,20 @@
>   * questions.
>   */
>
> -#ifdef _WIN32
> -/*
> - * Win* needs this include. However, Linux and Solaris do not.
> - * Having this include on Solaris SPARC breaks having non US-ASCII
> - * characters in the value of the Premain-Class attribute.
> - */
> -#include <ctype.h>
> -#endif /* _WIN32 */
>  #include <string.h>
>  #include <stdlib.h>
> -#include <ctype.h>
> +
> +/**
> + * ctype.h is required on Windows. For other platforms we use a function
> + * prototype to ensure that we use the libc isspace function rather than
> + * the isspace macro (due to isspace being locale sensitive)
> + */
> +#ifdef _WIN32
> +  #include <ctype.h>
> +#else
> +  #undef isspace
> +  extern int isspace(int c);
> +#endif /* _WIN32 */
>
>  #include "jni.h"
>  #include "manifest_info.h"
>

From david.holmes at oracle.com  Sun Feb 16 17:30:02 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 17 Feb 2014 11:30:02 +1000
Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr
In-Reply-To: <7482D938-F9B1-4890-B48D-AF892B6FA3C3@oracle.com>
References: <CDD60B6C-F8E3-4693-AB76-43A4CC0C97B8@larsen.se>	<E3D82E6F-8579-4F49-9A92-581E2415D169@oracle.com>	<52FA05F9.20206@oracle.com>
	<7482D938-F9B1-4890-B48D-AF892B6FA3C3@oracle.com>
Message-ID: <5301661A.2070807@oracle.com>

On 11/02/2014 10:01 PM, Staffan Larsen wrote:
>
> On 11 feb 2014, at 12:14, Erik Joelsson <erik.joelsson at oracle.com> wrote:
>
>> Looks good, but I can't help but wonder why the mapfile for libjfr is in the open.
>
> Yes, that is unfortunate. It used to be in closed source in jdk7, but moved into open with the new build system.

That should be fixed. Looks like we need a 
make/closed/lib/Serviceability.gmk that contains the BUILD_LIBJFR 
definition.

David

> /Staffan
>
>>
>> /Erik
>>
>> On 2014-02-11 12:00, Staffan Larsen wrote:
>>> Resending (email to build-dev bounced since I used the wrong sender address).
>>>
>>> On 11 feb 2014, at 11:18, staffan at larsen.se wrote:
>>>
>>>> Please review this small fix for libjfr/mapfile-vers.
>>>>
>>>> Thanks,
>>>> /Staffan
>>>>
>>>> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers
>>>> --- a/make/mapfiles/libjfr/mapfile-vers
>>>> +++ b/make/mapfiles/libjfr/mapfile-vers
>>>> @@ -34,6 +34,7 @@
>>>>        Java_oracle_jrockit_jfr_VMJFR_getPeriod;
>>>>        Java_oracle_jrockit_jfr_VMJFR_descriptors;
>>>>        Java_oracle_jrockit_jfr_VMJFR_redefineClass0;
>>>> +      Java_oracle_jrockit_jfr_VMJFR_retransformClasses0;
>>>>        JNI_OnLoad;
>>>>    local:
>>>>        *;
>>
>

From mikael.vidstedt at oracle.com  Sun Feb 16 21:51:20 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Sun, 16 Feb 2014 21:51:20 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
Message-ID: <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>


I'm inclined to agree with this. Since the code depends on a specific behavior of isspace that does not match what the system-provided function does, I too think it would be more robust to implement our own version of it.

Cheers,
Mikael

> On Feb 16, 2014, at 14:20, Martin Buchholz <martinrb at google.com> wrote:
> 
> Those locale-dependent APIs - more trouble than they're worth.  More often than not, you want a locale-independent version.
> 
> So just define your own is_ASCII_space etc. like everybody else has done and move on.
> 
> 
>> On Sun, Feb 16, 2014 at 9:20 AM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>>> On 13/02/2014 21:14, Mikael Vidstedt wrote:
>>> :
>>> 
>>> The change in question appears to come from https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the bug gives enough additional information. My speculation (and it's really just a speculation) is that it's not related to isspace per-se, but to something else which gets defined/redefined/undefined by including ctype.h. I guess it would be good to know if we have tests which cover the thing the comment is alluding to (non-ascii in Premain-Class).
>> Thanks for pointing this out. I looked at it again and the issue is that isspace is a macro and depends on the locale. Not including ctype.h means we get linked to the libc function instead. One approach is to include ctype.h and then #undef isspace, another is to define the function prototype ourselves. I think the latter is a little bit better because it would avoid accidental usage of other locale-sensitive char classifiers. Attached is the patch that I propose. I have deliberately moved it to after the other includes so we get a chance to #undef in the event that it gets included by something else.
>> 
>> As for tests, PremainClassTest.java is good enough to find this on Solaris.
>> 
>> -Alan
>> 
>> 
>> diff --git a/src/share/instrument/JarFacade.c b/src/share/instrument/JarFacade.c
>> --- a/src/share/instrument/JarFacade.c
>> +++ b/src/share/instrument/JarFacade.c
>> @@ -23,17 +23,20 @@
>>   * questions.
>>   */
>> 
>> -#ifdef _WIN32
>> -/*
>> - * Win* needs this include. However, Linux and Solaris do not.
>> - * Having this include on Solaris SPARC breaks having non US-ASCII
>> - * characters in the value of the Premain-Class attribute.
>> - */
>> -#include <ctype.h>
>> -#endif /* _WIN32 */
>>  #include <string.h>
>>  #include <stdlib.h>
>> -#include <ctype.h>
>> +
>> +/**
>> + * ctype.h is required on Windows. For other platforms we use a function
>> + * prototype to ensure that we use the libc isspace function rather than
>> + * the isspace macro (due to isspace being locale sensitive)
>> + */
>> +#ifdef _WIN32
>> +  #include <ctype.h>
>> +#else
>> +  #undef isspace
>> +  extern int isspace(int c);
>> +#endif /* _WIN32 */
>> 
>>  #include "jni.h"
>>  #include "manifest_info.h"
> 

From staffan.larsen at oracle.com  Sun Feb 16 23:30:53 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 17 Feb 2014 08:30:53 +0100
Subject: RFR(XS): JDK-8034176 Update mapfile for libjfr
In-Reply-To: <5301661A.2070807@oracle.com>
References: <CDD60B6C-F8E3-4693-AB76-43A4CC0C97B8@larsen.se>	<E3D82E6F-8579-4F49-9A92-581E2415D169@oracle.com>	<52FA05F9.20206@oracle.com>
	<7482D938-F9B1-4890-B48D-AF892B6FA3C3@oracle.com>
	<5301661A.2070807@oracle.com>
Message-ID: <7422AD5A-1830-4902-A3B6-7ADBFE4C5721@oracle.com>


On 17 feb 2014, at 02:30, David Holmes <david.holmes at oracle.com> wrote:

> On 11/02/2014 10:01 PM, Staffan Larsen wrote:
>> 
>> On 11 feb 2014, at 12:14, Erik Joelsson <erik.joelsson at oracle.com> wrote:
>> 
>>> Looks good, but I can't help but wonder why the mapfile for libjfr is in the open.
>> 
>> Yes, that is unfortunate. It used to be in closed source in jdk7, but moved into open with the new build system.
> 
> That should be fixed. Looks like we need a make/closed/lib/Serviceability.gmk that contains the BUILD_LIBJFR definition.

Yes, I've filed a bug about it.

Thanks,
/Staffan


> 
> David
> 
>> /Staffan
>> 
>>> 
>>> /Erik
>>> 
>>> On 2014-02-11 12:00, Staffan Larsen wrote:
>>>> Resending (email to build-dev bounced since I used the wrong sender address).
>>>> 
>>>> On 11 feb 2014, at 11:18, staffan at larsen.se wrote:
>>>> 
>>>>> Please review this small fix for libjfr/mapfile-vers.
>>>>> 
>>>>> Thanks,
>>>>> /Staffan
>>>>> 
>>>>> diff --git a/make/mapfiles/libjfr/mapfile-vers b/make/mapfiles/libjfr/mapfile-vers
>>>>> --- a/make/mapfiles/libjfr/mapfile-vers
>>>>> +++ b/make/mapfiles/libjfr/mapfile-vers
>>>>> @@ -34,6 +34,7 @@
>>>>>       Java_oracle_jrockit_jfr_VMJFR_getPeriod;
>>>>>       Java_oracle_jrockit_jfr_VMJFR_descriptors;
>>>>>       Java_oracle_jrockit_jfr_VMJFR_redefineClass0;
>>>>> +      Java_oracle_jrockit_jfr_VMJFR_retransformClasses0;
>>>>>       JNI_OnLoad;
>>>>>   local:
>>>>>       *;
>>> 
>> 


From jaroslav.bachorik at oracle.com  Sun Feb 16 23:53:07 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 17 Feb 2014 08:53:07 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <52FE46A2.20705@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE46A2.20705@oracle.com>
Message-ID: <5301BFE3.7050400@oracle.com>

Hi Dan,

On 14.2.2014 17:38, Daniel D. Daugherty wrote:
>  > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>
> Nice job on such an old bug.
>
> src/share/vm/prims/jvmtiExport.hpp
>      No comments.
>
> src/share/vm/prims/jvmtiExport.cpp
>      line 2170:       state->clear_exception_detected();
>          HotSpot indent is two spaces.
>
> src/share/vm/runtime/reflection.cpp
>      line 948: JvmtiExport::clear_detected_exception((JavaThread*) THREAD);
>      line 1085: JvmtiExport::clear_detected_exception((JavaThread*)
> THREAD);
>          HotSpot indent is two spaces.

Indentation will be fixed.

>
> So there are two existing tests for this failure mode:
>
>      nsk/jdi/ExceptionEvent/_itself_/exevent006
>      nsk/jdi/ExceptionEvent/_itself_/exevent008
>
> so I'm presuming that you've run the above tests and they are now
> happy, but you haven't really stated that. What other pre-integration
> testing do you have planned?

Yes, those two tests are not failing any more. The full testing would 
include running both JTREG (testsets 'core' and 'svc') and aurora 
testbase at least for 'runtime', 'nsk/jvmti' and 'nsk/jdi'.

-JB-

>
> Dan
>
>
> On 2/14/14 4:07 AM, Jaroslav Bachorik wrote:
>> This is a round-0 review request.
>>
>> The reflection code intercepting the exceptions thrown in the invoked
>> methods does not play nicely with JVMTI (which, in this case,
>> propagates to JDI).
>>
>> The reflection code lacks the traditional error handler - therefore,
>> upon throwing the NumberFormatException, the stack is searched for
>> appropriate handlers and none are found. This leaves the
>> "exception_detected" flag set to true while normally it would be reset
>> to false once the exception is handled. The reflection code then goes
>> on and wraps the NumberFormatException into InvocationTargetException
>> and throws it. But, alas, the "exception_detected" flag is still set
>> to true and no JVMTI exception event will be sent out.
>>
>> The proposed solution is to call
>> thread->jvmti_thread_state()->clear_exception_detected() at the
>> appropriate places in the reflection code to reset the
>> "exception_detected" flag and enable the InvocationTargetException to be
>> properly reported over JVMTI.
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>
>> Thanks!
>>
>> -JB-
>>
>>
>


From jaroslav.bachorik at oracle.com  Mon Feb 17 00:04:59 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 17 Feb 2014 09:04:59 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <52FE951E.1060002@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
	<52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com>
Message-ID: <5301C2AB.2070600@oracle.com>

On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>> Jaroslav,
>>>
>>> It looks good in general modulo indent comments from Dan.
>>>
>>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed
>>> or right thing to do in the JvmtiExport::clear_detected_exception().
>>> It seems, both clear_exception_detected() and
>>> set_exception_detected() are always
>>> called on current thread and so, it has to be safe to do without
>>> acquiring any locks.
>>
>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>> can also be queried/modified by other threads so grabbing the
>> associated lock is a good idea.
>
> The lock synchronization is cooperative.
> It does not help much if the lock is not acquired in other places.
> I can be wrong, but I've not found yet any place in the code where the
> clear_exception_detected() and set_exception_detected() are called
> under protection of the JvmtiThreadState_lock.

I copied the locking over from "JvmtiExport::cleanup_thread(JavaThread* 
thread)". That method is also supposed to work only with the current 
thread but acquires the lock nonetheless. But if you are sure that the 
lock is not required, I have no objection to removing it.

-JB-

>
> Thanks,
> Serguei
>
>>
>> Dan
>>
>>
>>>
>>> And I'm repeating my question about pre-integration testing (Dan is
>>> asking about the same).
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>> This is a round-0 review request.
>>>>
>>>> The reflection code intercepting the exceptions thrown in the
>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>> case, propagates to JDI).
>>>>
>>>> The reflection code lacks the traditional error handler - therefore,
>>>> upon throwing the NumberFormatException, the stack is searched for
>>>> appropriate handlers and none are found. This leaves the
>>>> "exception_detected" flag set to true while normally it would be
>>>> reset to false once the exception is handled. The reflection code
>>>> then goes on and wraps the NumberFormatException into
>>>> InvocationTargetException and throws it. But, alas, the
>>>> "exception_detected" flag is still set to true and no JVMTI
>>>> exception event will be sent out.
>>>>
>>>> The proposed solution is to call
>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>> appropriate places in the reflection code to reset the
>>>> "exception_detected" flag and enable the InvocationTargetException
>>>> be properly reported over JVMTI.
>>>>
>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>
>>>> Thanks!
>>>>
>>>> -JB-
>>>
>>
>
>


From fredrik.arvidsson at oracle.com  Mon Feb 17 01:46:42 2014
From: fredrik.arvidsson at oracle.com (Fredrik Arvidsson)
Date: Mon, 17 Feb 2014 10:46:42 +0100
Subject: RFR(XS): JDK-4515292: ReferenceType.isStatic() returns true for arrays
Message-ID: <5301DA82.2070001@oracle.com>

Hi, please review this minor JDI fix.

This bug was found when investigating failing JDI tests. According to 
the JavaDoc, isStatic() should return false whenever the ReferenceType 
is an array type.

Webrev: http://cr.openjdk.java.net/~farvidsson/4515292/webrev.00/
Jira: https://bugs.openjdk.java.net/browse/JDK-4515292

Cheers
/Fredrik

From jaroslav.bachorik at oracle.com  Mon Feb 17 03:01:49 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 17 Feb 2014 12:01:49 +0100
Subject: RFR(XS): JDK-4515292: ReferenceType.isStatic() returns true for
	arrays
In-Reply-To: <5301DA82.2070001@oracle.com>
References: <5301DA82.2070001@oracle.com>
Message-ID: <5301EC1D.5010407@oracle.com>

Looks good! (not a reviewer)

-JB-

On 17.2.2014 10:46, Fredrik Arvidsson wrote:
> Hi please review this minor JDI fix.
>
> This bug was found when investigating failing JDI tests. According to
> the JavaDoc all calls to ReferenceType when it is an array type should
> return false for isStatic().
>
> Webrev: http://cr.openjdk.java.net/~farvidsson/4515292/webrev.00/
> Jira: https://bugs.openjdk.java.net/browse/JDK-4515292
>
> Cheers
> /Fredrik


From Alan.Bateman at oracle.com  Mon Feb 17 07:08:02 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 17 Feb 2014 15:08:02 +0000
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
Message-ID: <530225D2.4040709@oracle.com>

On 17/02/2014 05:51, Mikael Vidstedt wrote:
>
> I'm inclined to agree with this. Since the code depends on a specific 
> behavior of isspace which does not match what the system provided 
> function does I too think it would be more robust to implement our own 
> version of it.
I completely agree that changing this code to use its own isspace is the 
right thing, it just seems a bit much for a drive-by fix for gcc 
warnings. Do either of you want to take it?

-Alan

From serguei.spitsyn at oracle.com  Mon Feb 17 11:44:35 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 17 Feb 2014 11:44:35 -0800
Subject: RFR(XS): JDK-4515292: ReferenceType.isStatic() returns true for
	arrays
In-Reply-To: <5301DA82.2070001@oracle.com>
References: <5301DA82.2070001@oracle.com>
Message-ID: <530266A3.50406@oracle.com>

Looks good.

Thank you for fixing it!
Serguei

On 2/17/14 1:46 AM, Fredrik Arvidsson wrote:
> Hi please review this minor JDI fix.
>
> This bug was found when investigating failing JDI tests. According to 
> the JavaDoc all calls to ReferenceType when it is an array type should 
> return false for isStatic().
>
> Webrev: http://cr.openjdk.java.net/~farvidsson/4515292/webrev.00/
> Jira: https://bugs.openjdk.java.net/browse/JDK-4515292
>
> Cheers
> /Fredrik


From mikael.vidstedt at oracle.com  Mon Feb 17 19:59:04 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Mon, 17 Feb 2014 19:59:04 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <530225D2.4040709@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
	<530225D2.4040709@oracle.com>
Message-ID: <5302DA88.9060306@oracle.com>


On 2014-02-17 07:08, Alan Bateman wrote:
> On 17/02/2014 05:51, Mikael Vidstedt wrote:
>>
>> I'm inclined to agree with this. Since the code depends on a specific 
>> behavior of isspace which does not match what the system provided 
>> function does I too think it would be more robust to implement our 
>> own version of it.
> I completely agree that changing this code to use its own isspace is 
> the right thing, it just seems a bit much for a drive-by fixed to gcc 
> warnings. Do either of you want to take it?

How about:

http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/

Cheers,
Mikael


From david.holmes at oracle.com  Mon Feb 17 21:02:29 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 18 Feb 2014 15:02:29 +1000
Subject: RFR (S) 8034249: need more workarounds for suspend equivalent
	condition issue
In-Reply-To: <52FD5CC3.4090502@oracle.com>
References: <52FD5CC3.4090502@oracle.com>
Message-ID: <5302E965.3090505@oracle.com>

Hi Serguei,

This looks good to me.

I wonder if we will reach a point where we can delete 
is_thread_fully_suspended? ;-)

David

On 14/02/2014 10:01 AM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>    https://bugs.openjdk.java.net/browse/JDK-8034249
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1
>
>
> Summary:
>
>    This issue was identified in the review of the 8032223 and it is
> similar to the 8032223
>    but impacts different JVMTI functions:
>      GetCurrentContendedMonitor, GetOwnedMonitorInfo,
>      GetOwnedMonitorStackDepthInfo, GetStackTrace
>
>    There is a general issue in the suspend equivalent condition mechanism:
>    Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may
> return different results:
>      - 1-st: true
>      - 2-nd: false
>
>    This suspend equivalent issue is covered by another bug:
>      https://bugs.openjdk.java.net/browse/JDK-6280037
>
>    This fix is to work around the 6280037.
>    It is safer to collect the necessary information at a safepoint
> instead of
>    relying on the suspension of the target thread.
>
>
> Testing:
>    In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>
>
> Thanks,
> Serguei

From david.holmes at oracle.com  Mon Feb 17 21:28:15 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 18 Feb 2014 15:28:15 +1000
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <52FDF8DD.9090602@oracle.com>
References: <52FDF8DD.9090602@oracle.com>
Message-ID: <5302EF6F.9090204@oracle.com>

Hi Jaroslav,

It seems to me that this issue extends to other places in the VM. In 
particular class initialization in instanceKlass.cpp - anywhere that one 
exception is "caught" in the VM and then wrapped with, or replaced by, 
another exception, will only notify JVMTI of the original exception.
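
To illustrate the wrapping at the Java level, here is a minimal sketch. 
The class and method names are made up for illustration and are not 
taken from the webrev. It only shows that the exception a caller sees 
from a reflective call is the InvocationTargetException wrapper, with 
the original NumberFormatException as its cause; it says nothing about 
which JVMTI events a given VM build actually posts.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class WrappedExceptionDemo {
    /**
     * Calls Integer.parseInt(text) reflectively.
     * Returns the exception thrown by Method.invoke, or null if the call succeeded.
     */
    public static Throwable invokeParseInt(String text) throws ReflectiveOperationException {
        Method parseInt = Integer.class.getMethod("parseInt", String.class);
        try {
            parseInt.invoke(null, text);   // static method, so the receiver is null
            return null;
        } catch (InvocationTargetException wrapper) {
            // The reflection code has replaced the original exception with a wrapper.
            return wrapper;
        }
    }

    public static void main(String[] args) throws Exception {
        Throwable thrown = invokeParseInt("not a number");
        System.out.println("thrown: " + thrown.getClass().getName());
        System.out.println("cause : " + thrown.getCause().getClass().getName());
    }
}
```

A debugger that requests exception events would want to hear about the 
wrapper as well as the cause, which is the case the tests exercise.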

David

On 14/02/2014 9:07 PM, Jaroslav Bachorik wrote:
> This is a round-0 review request.
>
> The reflection code intercepting the exceptions thrown in the invoked
> methods does not play nicely with JVMTI (which, in this case, propagates
> to JDI).
>
> The reflection code lacks the traditional error handler - therefore,
> upon throwing the NumberFormatException, the stack is searched for
> appropriate handlers and none are found. This leaves the
> "exception_detected" flag set to true while normally it would be reset
> to false once the exception is handled. The reflection code then goes on
> and wraps the NumberFormatException into InvocationTargetException and
> throws it. But, alas, the "exception_detected" flag is still set to true
> and no JVMTI exception event will be sent out.
>
> The proposed solution is to call
> thread->jvmti_thread_state()->clear_exception_detected() at the
> appropriate places in the reflection code to reset the
> "exception_detected" flag and enable the InvocationTargetException be
> properly reported over JVMTI.
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>
> Thanks!
>
> -JB-

From serguei.spitsyn at oracle.com  Mon Feb 17 21:33:08 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 17 Feb 2014 21:33:08 -0800
Subject: RFR (S) 8034249: need more workarounds for suspend equivalent
	condition issue
In-Reply-To: <5302E965.3090505@oracle.com>
References: <52FD5CC3.4090502@oracle.com> <5302E965.3090505@oracle.com>
Message-ID: <5302F094.7090006@oracle.com>

Thank you a lot, David!

On 2/17/14 9:02 PM, David Holmes wrote:
> Hi Serguei,
>
> This looks good to me.
>
> I wonder if we will reach a point where we can delete 
> is_thread_fully_suspended? ;-)

I know what you mean by this. :)
There is still some room to improve safety with the safepoint mechanism.
Of course, the is_thread_fully_suspended() is still needed for external 
JVMTI/JDI purposes.


Thanks,
Serguei

>
> David
>
> On 14/02/2014 10:01 AM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>    https://bugs.openjdk.java.net/browse/JDK-8034249
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8034249-JVMTI-MON.1 
>>
>>
>>
>> Summary:
>>
>>    This issue was identified in the review of the 8032223 and it is
>> similar to the 8032223
>>    but impacts different JVMTI functions:
>>      GetCurrentContendedMonitor, GetOwnedMonitorInfo,
>>      GetOwnedMonitorStackDepthInfo, GetStackTrace
>>
>>    There is a general issue in the suspend equivalent condition 
>> mechanism:
>>    Two subsequent calls to the JvmtiEnv::is_thread_fully_suspended() may
>> return different results:
>>      - 1-st: true
>>      - 2-nd: false
>>
>>    This suspend equivalent issue is covered by another bug:
>>      https://bugs.openjdk.java.net/browse/JDK-6280037
>>
>>    This fix is to work around the 6280037.
>>    It is safer to collect the necessary information at a safepoint
>> instead of
>>    relying on the suspension of the target thread.
>>
>>
>> Testing:
>>    In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei


From Alan.Bateman at oracle.com  Tue Feb 18 00:33:40 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 18 Feb 2014 08:33:40 +0000
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <5302DA88.9060306@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
	<530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com>
Message-ID: <53031AE4.2050700@oracle.com>

On 18/02/2014 03:59, Mikael Vidstedt wrote:
>
> How about:
>
> http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/
>
> Cheers,
> Mikael
>
I checked the java.lang.instrument spec, and for the Boot-Class-Path 
attribute it doesn't say any more than "space". It might be worth 
checking the manifest parsing code (parse_manifest.c) to see how 
continuations are handled, as I suspect \r and \n can't appear in the 
attribute value (in which case the check might really only need to be 
for space and \t).

Otherwise replacing isspace is good, and your isspaceAscii is likely to 
match the libc isspace (at runtime). This code isn't performance 
sensitive, but checking for space first might be a bit better. Also, the 
library native code uses a 4-space indent rather than HotSpot's 2.

I created JDK-8035054 a few days ago to track this. Thanks for taking it 
as I am busy with a number of other things at the moment.

-Alan

From jaroslav.bachorik at oracle.com  Tue Feb 18 00:47:36 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 18 Feb 2014 09:47:36 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5302EF6F.9090204@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <5302EF6F.9090204@oracle.com>
Message-ID: <53031E28.1060007@oracle.com>

Hi David,

On 18.2.2014 06:28, David Holmes wrote:
> Hi Jaroslav,
>
> It seems to me that this issue extends to other places in the VM. In
> particular class initialization in instanceKlass.cpp - anywhere that one
> exception is "caught" in the VM and then wrapped with, or replaced by,
> another exception, will only notify JVMTI of the original exception.

Thanks for pointing this out. Turns out there is another location in 
jvm.cpp which needs the same treatment.

BTW, what is your take on the necessity to grab the 
JvmtiThreadState_lock before clearing the detected exception in the 
jvmti thread state?

-JB-

>
> David
>
> On 14/02/2014 9:07 PM, Jaroslav Bachorik wrote:
>> This is a round-0 review request.
>>
>> The reflection code intercepting the exceptions thrown in the invoked
>> methods does not play nicely with JVMTI (which, in this case, propagates
>> to JDI).
>>
>> The reflection code lacks the traditional error handler - therefore,
>> upon throwing the NumberFormatException, the stack is searched for
>> appropriate handlers and none are found. This leaves the
>> "exception_detected" flag set to true while normally it would be reset
>> to false once the exception is handled. The reflection code then goes on
>> and wraps the NumberFormatException into InvocationTargetException and
>> throws it. But, alas, the "exception_detected" flag is still set to true
>> and no JVMTI exception event will be sent out.
>>
>> The proposed solution is to call
>> thread->jvmti_thread_state()->clear_exception_detected() at the
>> appropriate places in the reflection code to reset the
>> "exception_detected" flag and enable the InvocationTargetException be
>> properly reported over JVMTI.
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>
>> Thanks!
>>
>> -JB-


From dmitry.samersoff at oracle.com  Tue Feb 18 00:58:39 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Tue, 18 Feb 2014 12:58:39 +0400
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <5302DA88.9060306@oracle.com>
References: <52FCC60B.6040205@oracle.com>	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>	<52FD0DA2.7030201@oracle.com>
	<52FD35C5.1000306@oracle.com>	<5300F36F.9010205@oracle.com>	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>	<530225D2.4040709@oracle.com>
	<5302DA88.9060306@oracle.com>
Message-ID: <530320BF.2070607@oracle.com>

Mikael,

> http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/

1. I agree that the ctype isspace usually causes more problems than it
solves, and it's good to have our own version.

2. One possible implementation is:

 #define isspaceASCII(c) (strchr(SPACE_CHARS,c) != NULL)

-Dmitry


On 2014-02-18 07:59, Mikael Vidstedt wrote:
> 
> On 2014-02-17 07:08, Alan Bateman wrote:
>> On 17/02/2014 05:51, Mikael Vidstedt wrote:
>>>
>>> I'm inclined to agree with this. Since the code depends on a specific
>>> behavior of isspace which does not match what the system provided
>>> function does I too think it would be more robust to implement our
>>> own version of it.
>> I completely agree that changing this code to use its own isspace is
>> the right thing, it just seems a bit much for a drive-by fixed to gcc
>> warnings. Do either of you want to take it?
> 
> How about:
> 
> http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/
> 
> Cheers,
> Mikael
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From david.holmes at oracle.com  Tue Feb 18 01:39:42 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 18 Feb 2014 19:39:42 +1000
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <53031E28.1060007@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <5302EF6F.9090204@oracle.com>
	<53031E28.1060007@oracle.com>
Message-ID: <53032A5E.2000705@oracle.com>

On 18/02/2014 6:47 PM, Jaroslav Bachorik wrote:
> Hi David,
>
> On 18.2.2014 06:28, David Holmes wrote:
>> Hi Jaroslav,
>>
>> It seems to me that this issue extends to other places in the VM. In
>> particular class initialization in instanceKlass.cpp - anywhere that one
>> exception is "caught" in the VM and then wrapped with, or replaced by,
>> another exception, will only notify JVMTI of the original exception.
>
> Thanks for pointing this out. Turns out there is another location in
> jvm.cpp which needs the same treatment.
>
> BTW, what is your take on the necessity to grab the
> JvmtiThreadState_lock before cleaning the detected exception in the
> jvmti thread state?

I would need to analyze all of the code that accesses it to determine 
that. My initial thought was that it seemed unnecessary and I did look 
at some of the code which seemed to indicate other threads would only 
access it at a safepoint. But there may be other access points that I'm 
not aware of.

David

> -JB-
>
>>
>> David
>>
>> On 14/02/2014 9:07 PM, Jaroslav Bachorik wrote:
>>> This is a round-0 review request.
>>>
>>> The reflection code intercepting the exceptions thrown in the invoked
>>> methods does not play nicely with JVMTI (which, in this case, propagates
>>> to JDI).
>>>
>>> The reflection code lacks the traditional error handler - therefore,
>>> upon throwing the NumberFormatException, the stack is searched for
>>> appropriate handlers and none are found. This leaves the
>>> "exception_detected" flag set to true while normally it would be reset
>>> to false once the exception is handled. The reflection code then goes on
>>> and wraps the NumberFormatException into InvocationTargetException and
>>> throws it. But, alas, the "exception_detected" flag is still set to true
>>> and no JVMTI exception event will be sent out.
>>>
>>> The proposed solution is to call
>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>> appropriate places in the reflection code to reset the
>>> "exception_detected" flag and enable the InvocationTargetException be
>>> properly reported over JVMTI.
>>>
>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>
>>> Thanks!
>>>
>>> -JB-
>

From fredrik.arvidsson at oracle.com  Tue Feb 18 01:43:36 2014
From: fredrik.arvidsson at oracle.com (Fredrik Arvidsson)
Date: Tue, 18 Feb 2014 10:43:36 +0100
Subject: RFR(XS): JDK-8034203: Change JavaDoc for
	com.sun.jdi.request.EventRequest.setEnabled(boolean val)
Message-ID: <53032B48.2060003@oracle.com>

Hi

Please help me review this small change:

This is a minor change to the JavaDoc in the JDI API. The problem was 
discovered when trying to fix a failing test.

Webrev: http://cr.openjdk.java.net/%7Efarvidsson/8034203/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8034203
CCC: http://ccc.us.oracle.com/8034203

Thanks
/Fredrik

From staffan.larsen at oracle.com  Tue Feb 18 02:08:16 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 18 Feb 2014 11:08:16 +0100
Subject: RFR(XS): JDK-4515292: ReferenceType.isStatic() returns true for
	arrays
In-Reply-To: <5301DA82.2070001@oracle.com>
References: <5301DA82.2070001@oracle.com>
Message-ID: <7A8179FC-A3DF-4C35-BE95-143305F1C253@oracle.com>

Looks good!

Thanks,
/Staffan

On 17 feb 2014, at 10:46, Fredrik Arvidsson <fredrik.arvidsson at oracle.com> wrote:

> Hi please review this minor JDI fix.
> 
> This bug was found when investigating failing JDI tests. According to the JavaDoc all calls to ReferenceType when it is an array type should return false for isStatic().
> 
> Webrev: http://cr.openjdk.java.net/~farvidsson/4515292/webrev.00/
> Jira: https://bugs.openjdk.java.net/browse/JDK-4515292
> 
> Cheers
> /Fredrik


From serguei.spitsyn at oracle.com  Tue Feb 18 02:18:32 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 18 Feb 2014 02:18:32 -0800
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5301C2AB.2070600@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
	<52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com>
	<5301C2AB.2070600@oracle.com>
Message-ID: <53033378.2060300@oracle.com>

On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>> Jaroslav,
>>>>
>>>> It looks good in general modulo indent comments from Dan.
>>>>
>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed
>>>> or right thing to do in the JvmtiExport::clear_detected_exception().
>>>> It seems, both clear_exception_detected() and
>>>> set_exception_detected() are always
>>>> called on current thread and so, it has to be safe to do without
>>>> acquiring any locks.
>>>
>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>> can also be queried/modified by other threads so grabbing the
>>> associated lock is a good idea.
>>
>> The lock synchronization is cooperative.
>> It does not help much if the lock is not acquired in other places.
>> I can be wrong, but I've not found yet any place in the code where the
>> clear_exception_detected() and set_exception_detected() are called
>> under protection of the JvmtiThreadState_lock.
>
> I copied the locking over from 
> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also 
> supposed to work only with the current thread but acquires the lock 
> nonetheless. But if you are sure that the lock is not required I have 
> no objections removing it.

I suggest removing it, as it is not used in other places in the code.
It would be confusing if it were used in one place and missing in others.

Thanks,
Serguei

>
> -JB-
>
>>
>> Thanks,
>> Serguei
>>
>>>
>>> Dan
>>>
>>>
>>>>
>>>> And I'm repeating my question about pre-integration testing (Dan is
>>>> asking about the same).
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>> This is a round-0 review request.
>>>>>
>>>>> The reflection code intercepting the exceptions thrown in the
>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>> case, propagates to JDI).
>>>>>
>>>>> The reflection code lacks the traditional error handler - therefore,
>>>>> upon throwing the NumberFormatException, the stack is searched for
>>>>> appropriate handlers and none are found. This leaves the
>>>>> "exception_detected" flag set to true while normally it would be
>>>>> reset to false once the exception is handled. The reflection code
>>>>> then goes on and wraps the NumberFormatException into
>>>>> InvocationTargetException and throws it. But, alas, the
>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>> exception event will be sent out.
>>>>>
>>>>> The proposed solution is to call
>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>> appropriate places in the reflection code to reset the
>>>>> "exception_detected" flag and enable the InvocationTargetException
>>>>> be properly reported over JVMTI.
>>>>>
>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -JB-
>>>>
>>>
>>
>>
>


From staffan.larsen at oracle.com  Tue Feb 18 02:19:24 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 18 Feb 2014 11:19:24 +0100
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
Message-ID: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>

Still looking for a Reviewer for this change.

Thanks,
/Staffan

On 11 feb 2014, at 15:12, Staffan Larsen <staffan.larsen at oracle.com> wrote:

> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster.
> 
> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
> 
> Thanks,
> /Staffan


From serguei.spitsyn at oracle.com  Tue Feb 18 03:42:20 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 18 Feb 2014 03:42:20 -0800
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>
Message-ID: <5303471C.4060405@oracle.com>

The fix looks good.

Thanks,
Serguei

On 2/18/14 2:19 AM, Staffan Larsen wrote:
> Still looking for Reviewer for this change.
>
> Thanks,
> /Staffan
>
> On 11 feb 2014, at 15:12, Staffan Larsen <staffan.larsen at oracle.com> wrote:
>
>> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>
>> Thanks,
>> /Staffan


From david.holmes at oracle.com  Tue Feb 18 04:09:49 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 18 Feb 2014 22:09:49 +1000
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>
Message-ID: <53034D8D.7050908@oracle.com>

Hi Staffan,

If you get a spurious wakeup from wait():

    try {
        synchronized (bkptSignal) {
            bkptSignal.wait(5000);
        }
    } catch (InterruptedException ee) {
    }
    if (prevBkptCount == bkptCount) {
        failure("failure: test hung");

you could report failure. But that is far less likely than the current 
problem using sleep.
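
One way to make the wait robust against spurious wakeups is to re-check 
the condition in a loop, under the same monitor, until a deadline 
passes. A minimal sketch, assuming the counter is only updated while 
holding bkptSignal; the class and method names are hypothetical and this 
is not the code in the webrev:

```java
public class GuardedWaitDemo {
    private final Object bkptSignal = new Object();
    private int bkptCount;   // guarded by bkptSignal

    /** Called by the event-handling thread for each breakpoint hit. */
    public void breakpointHit() {
        synchronized (bkptSignal) {
            bkptCount++;
            bkptSignal.notifyAll();
        }
    }

    /**
     * Waits until the counter has moved past prevBkptCount or the timeout elapses.
     * Returns true if the counter advanced, false on timeout.
     */
    public boolean awaitNextBreakpoint(int prevBkptCount, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (bkptSignal) {
            // A spurious wakeup simply goes around the loop again.
            while (bkptCount == prevBkptCount) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;   // genuinely timed out
                }
                // wait(0) would block forever, so round up to at least 1 ms.
                bkptSignal.wait(Math.max(1L, remainingNanos / 1_000_000L));
            }
            return true;
        }
    }
}
```

With this shape the "test hung" failure would only be reported when the 
method returns false, i.e. after the full timeout.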

David

On 18/02/2014 8:19 PM, Staffan Larsen wrote:
> Still looking for Reviewer for this change.
>
> Thanks,
> /Staffan
>
> On 11 feb 2014, at 15:12, Staffan Larsen <staffan.larsen at oracle.com> wrote:
>
>> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>
>> Thanks,
>> /Staffan
>

From staffan.larsen at oracle.com  Tue Feb 18 05:03:07 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 18 Feb 2014 14:03:07 +0100
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <53034D8D.7050908@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>
	<53034D8D.7050908@oracle.com>
Message-ID: <F775F855-28B6-42E6-A7F3-B467FD990B11@oracle.com>


On 18 feb 2014, at 13:09, David Holmes <david.holmes at oracle.com> wrote:

> Hi Staffan,
> 
> If you get a spurious wakeup from wait():
> 
>     try {
>         synchronized (bkptSignal) {
>             bkptSignal.wait(5000);
>         }
>     } catch (InterruptedException ee) {
>     }
>     if (prevBkptCount == bkptCount) {
>         failure("failure: test hung");
> 
> you could report failure. But that is far less likely than the current problem using sleep.

Right. Adding "continue;" inside the catch(InterruptedException) block should guard against that.

/Staffan

> 
> David
> 
> On 18/02/2014 8:19 PM, Staffan Larsen wrote:
>> Still looking for Reviewer for this change.
>> 
>> Thanks,
>> /Staffan
>> 
>> On 11 feb 2014, at 15:12, Staffan Larsen <staffan.larsen at oracle.com> wrote:
>> 
>>> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster.
>>> 
>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>> 
>>> Thanks,
>>> /Staffan
>> 


From jaroslav.bachorik at oracle.com  Tue Feb 18 08:22:36 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 18 Feb 2014 17:22:36 +0100
Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong object
Message-ID: <530388CC.9080807@oracle.com>

Please, review the following test change.

Issue : https://bugs.openjdk.java.net/browse/JDK-8034168
Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00

The test fails because it mistakes a parked thread for one that is 
actually waiting on a monitor: the Java thread state is the same 
(WAITING) in both situations. The test uses a Phaser for 
synchronization between the checked and the checking thread, to make 
sure the appropriate code section has been entered before performing the 
asserts. It then polls the checked thread's state and waits until it 
becomes WAITING. Unfortunately, when the Phaser needs to wait it parks 
the thread, which also sets the thread state to WAITING. From then on 
the test is in an unpredictable state and the result depends largely on 
timing, so it fails intermittently.

The solution is to use an additional volatile variable to prevent 
falsely indicating the park() induced WAITING state.
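
To illustrate the ambiguity described above, here is a minimal sketch. It is
not the code in the webrev, and WaitingStateDemo and all of its members are
made-up names. It shows that a thread blocked in LockSupport.park() and a
thread blocked in Object.wait() both report Thread.State.WAITING, and how a
volatile flag set just before wait() lets the checking thread ignore an
earlier park()-induced WAITING state.

```java
import java.util.concurrent.locks.LockSupport;

// Illustrative sketch; none of these names come from Locks.java.
class WaitingStateDemo {
    private final Object monitor = new Object();
    private volatile boolean inWaitSection; // set just before monitor.wait()
    private volatile boolean release;       // lets the helper thread finish

    /** Polls until t reports WAITING; gives up after roughly five seconds. */
    static boolean becomesWaiting(Thread t) {
        long start = System.nanoTime();
        while (t.getState() != Thread.State.WAITING) {
            if (System.nanoTime() - start > 5_000_000_000L) {
                return false;
            }
            Thread.yield();
        }
        return true;
    }

    /** A parked thread reports WAITING without waiting on any monitor. */
    boolean parkedThreadReportsWaiting() {
        Thread t = new Thread(() -> {
            while (!release) {          // park() may return early, so loop
                LockSupport.park();
            }
        });
        t.setDaemon(true);
        t.start();
        boolean waiting = becomesWaiting(t);
        release = true;
        LockSupport.unpark(t);
        return waiting;
    }

    /** Trusts WAITING only after the flag shows the wait section was entered. */
    boolean monitorWaitReportsWaiting() {
        Thread t = new Thread(() -> {
            synchronized (monitor) {
                inWaitSection = true;
                while (!release) {
                    try {
                        monitor.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        t.setDaemon(true);
        t.start();
        while (!inWaitSection) {        // ignore any WAITING seen before this point
            Thread.yield();
        }
        boolean waiting = becomesWaiting(t);
        synchronized (monitor) {
            release = true;
            monitor.notifyAll();
        }
        return waiting;
    }
}
```

Each scenario should be run on a fresh WaitingStateDemo instance, because the
release flag is never reset.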

Thanks,

-JB-

From martinrb at google.com  Tue Feb 18 09:06:39 2014
From: martinrb at google.com (Martin Buchholz)
Date: Tue, 18 Feb 2014 09:06:39 -0800
Subject: RFR 8034168: ThreadMXBean/Locks.java failed,
	blocked on wrong object
In-Reply-To: <530388CC.9080807@oracle.com>
References: <530388CC.9080807@oracle.com>
Message-ID: <CA+kOe09mG0g_5d=pa51Dgj-kLRfkf1J-HSmT9P9zctqA1ufr5w@mail.gmail.com>

Not checking any details, but tests that want to wait for a particular
thread state are a good reason to use

volatile boolean flag;
...
while (!flag) Thread.yield();

I prefer calling Thread.yield to sleeping in this special case, in part
because I don't want to rely on the implementation of sleep, while yield is
semantically a no-op.  (Also sleeping 100ms is a long time for a computer)


On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik <
jaroslav.bachorik at oracle.com> wrote:

> Please, review the following test change.
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168
> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00
>
> The test fails because of falsely evaluating the thread being parked as
> actually waiting on a monitor. This is because there is no difference in
> java thread state for those two situations. The test is using Phaser for
> synchronization between the checked and checking thread to make sure an
> appropriate code section is entered before performing asserts. Then it
> checks the checked thread state and waits till it becomes WAITING.
> Unfortunately, when Phaser needs to wait it parks the thread and sets the
> thread state to WAITING. From now on the test is in a completely random
> state and the result will largely depend on timing - thus failing
> intermittently.
>
> The solution is to use an additional volatile variable to prevent falsely
> indicating the park() induced WAITING state.
>
> Thanks,
>
> -JB-
>

From mikael.vidstedt at oracle.com  Tue Feb 18 11:45:33 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Tue, 18 Feb 2014 11:45:33 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <53031AE4.2050700@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
	<530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com>
	<53031AE4.2050700@oracle.com>
Message-ID: <5303B85D.10505@oracle.com>


On 2014-02-18 00:33, Alan Bateman wrote:
> On 18/02/2014 03:59, Mikael Vidstedt wrote:
>>
>> How about:
>>
>> http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/
>>
>> Cheers,
>> Mikael
>>
> I checked the java.lang.instrument spec, and for the Boot-Class-Path 
> attribute it doesn't say any more than "space". It might be worth 
> checking the manifest parsing code (parse_manifest.c) to see how 
> continuations are handled, as I suspect \r and \n can't appear in the 
> attribute value (in which case the check might really only need to be 
> for space and \t).

That makes sense, and in fact parse_manifest.c does not even appear to 
allow for \t, so I'm more and more starting to think that a reasonable 
implementation in this context would be:

static int isNormalSpace(int c) { return c == ' '; }

In which case it probably shouldn't even be a separate function to start 
with. I would like to get a second opinion on the implications of only 
checking for ' ' (0x20) though.

If we want to allow both ' ' and \t we should probably call the function 
isblankAscii.

> Otherwise replacing isspace is good and your isspaceAscii is likely to 
> match the libc isspace (at runtime). This code isn't performance 
> sensitive, but maybe checking for space first would be a bit better. Also, 
> the library native code uses a 4-space indent rather than HotSpot's 2.

Will fix indentation. I seriously doubt that the performance difference 
warrants the more complicated code.

> I created JDK-8035054 a few days ago to track this. Thanks for taking 
> it as I am busy with a number of other things at the moment.

Always for you, sir! ;)

/Mikael


From david.holmes at oracle.com  Tue Feb 18 11:49:58 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 19 Feb 2014 05:49:58 +1000
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <F775F855-28B6-42E6-A7F3-B467FD990B11@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>
	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>
	<53034D8D.7050908@oracle.com>
	<F775F855-28B6-42E6-A7F3-B467FD990B11@oracle.com>
Message-ID: <5303B966.20709@oracle.com>

On 18/02/2014 11:03 PM, Staffan Larsen wrote:
>
> On 18 feb 2014, at 13:09, David Holmes <david.holmes at oracle.com> wrote:
>
>> Hi Staffan,
>>
>> If you get a spurious wakeup from wait():
>>
>> 151             try {
>> 152                 synchronized (bkptSignal) {
>> 153                     bkptSignal.wait(5000);
>> 154                 }
>> 155             } catch (InterruptedException ee) {
>> 156             }
>> 157             if (prevBkptCount == bkptCount) {
>> 158                 failure("failure: test hung");
>>
>> you could report failure. But that is far less likely than the current problem using sleep.
>
> Right. Adding "continue;" inside the catch(InterruptedException) block should guard against that.

No, a spurious wakeup is not an interrupt - the wait() will simply return.
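
The difference can be sketched as follows. A timeout is used here as a
stand-in for a spurious wakeup, since the latter cannot be triggered on
demand; in both cases wait() returns normally and the catch block is never
entered. WaitReturnDemo is a hypothetical name, not something from the test.

```java
// Illustrative sketch; a timeout stands in for a spurious wakeup.
class WaitReturnDemo {
    /**
     * Waits on lock for up to timeoutMillis (must be > 0, since wait(0)
     * waits without a timeout) and reports how the wait ended.
     * Returns true only if wait() ended with an InterruptedException.
     */
    static boolean waitEndedByInterrupt(Object lock, long timeoutMillis) {
        synchronized (lock) {
            try {
                lock.wait(timeoutMillis);
                return false; // plain return: timeout, notify or spurious wakeup
            } catch (InterruptedException e) {
                return true;  // only reached when the thread was interrupted
            }
        }
    }
}
```

So a "continue;" placed in the catch block only covers the interrupt case;
the condition itself has to be re-checked after wait() returns.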

David
>
> /Staffan
>
>>
>> David
>>
>> On 18/02/2014 8:19 PM, Staffan Larsen wrote:
>>> Still looking for Reviewer for this change.
>>>
>>> Thanks,
>>> /Staffan
>>>
>>> On 11 feb 2014, at 15:12, Staffan Larsen <staffan.larsen at oracle.com> wrote:
>>>
>>>> Updated the test to use proper synchronization and notification between threads. Should be more stable and much faster.
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>>>
>>>> Thanks,
>>>> /Staffan
>>>
>

From sitnikov.vladimir at gmail.com  Tue Feb 18 11:57:46 2014
From: sitnikov.vladimir at gmail.com (Vladimir Sitnikov)
Date: Tue, 18 Feb 2014 23:57:46 +0400
Subject: JDK-8005604, instance size/metaspace data in HPROF
Message-ID: <CAB=Je-HFxP63UH95aMGqJey7=DqGEva25tm2sQVKTHgs6yu-kA@mail.gmail.com>

Seems like HPROF does not include information on non-heap memory.
It would be nice to have accurate class size (e.g. with @Contended), space
wastage, permgen/metaspace (e.g. constant pool size or whatever is in
C-heap), code cache blobs in the heap dump.

Alexey Shipilëv raised a similar question regarding instance size in CR 8005604:
HPROF should report the actual instance size [1], [2]; however, it seems no
progress has been made.

Creating a brand-new format does not look like an easy solution, and it is
not clear how new features would be added later. It is also not clear why a
new format would be better in terms of the effort needed to update tooling.

What do you think about keeping the HPROF format completely intact and dumping
additional information via synthetic classes and objects?

For instance, to dump the precise instance size of a Java class, we include a
"vm.native.synthetic.ClassInfo { int instanceSize; int spaceLoss; }"
class in the generated HPROF file. For each Java class we dump an "instance"
of this ClassInfo.
This will reveal the true instance size and space loss (e.g. alignment,
@Contended, etc.) to the heap analyzer.
The resulting HPROF will be fully compatible with existing analyzers, so no
significant harm is done.

To associate java.lang.Class with its vm.native metadata, a new synthetic
static reference from java.lang.Class to vm.native.synthetic.ClassInfo is
added.
Another approach is to reference java.lang.Class from vm.native; however,
it would be harder to analyze (more clicks, more complex queries).
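
As a rough illustration only, this is how an analyzer might model the
proposed synthetic record on the Java side and use it for a simple
calculation. The class is called ClassInfo here; the wastedBytes method and
the constructor are assumptions made for the sketch, and nothing below is
existing HPROF or analyzer API.

```java
// Hypothetical analyzer-side view of the proposed synthetic record. In the dump
// it would be named vm.native.synthetic.ClassInfo; that cannot be a package in
// Java source because "native" is a reserved word, so a plain class is used.
final class ClassInfo {
    final int instanceSize; // actual size of one instance, in bytes
    final int spaceLoss;    // bytes per instance lost to alignment, padding, etc.

    ClassInfo(int instanceSize, int spaceLoss) {
        this.instanceSize = instanceSize;
        this.spaceLoss = spaceLoss;
    }

    /** Bytes lost across instanceCount instances of the described class. */
    long wastedBytes(long instanceCount) {
        return (long) spaceLoss * instanceCount;
    }

    /** Heap bytes occupied by instanceCount instances of the described class. */
    long occupiedBytes(long instanceCount) {
        return (long) instanceSize * instanceCount;
    }
}
```

For example, new ClassInfo(24, 4).wastedBytes(1000) yields 4000.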

Pros:
1) We can dump the real instance size or any new information while keeping
the HPROF format intact.
2) Current tools will parse and display the dump just fine. With moderate
effort, tools can be improved to use this "metadata" for calculations (and
proper accounting of "used heap size").
3) The query languages of memory analyzers (e.g. map-reduce-javascript in
VisualVM, OQL/SQL in Eclipse MAT, etc.) can leverage this new data. The
synthetic-class approach does not require updating the analyzer for that.
4) More details (e.g. code cache blobs or whatever) can be added to the
dump in a similar way while still keeping forward and backward
compatibility in terms of file format.

Cons:
1) I am not sure if iteration over permgen/metaspace/codecache is safe in
terms of crashes (or whatever). It is not a good idea to crash when
writing an HPROF dump (especially during OutOfMemoryError); however, I believe
at least basic information (e.g. instance size) should be available in a
safe way.
2) Existing analyzers will charge "heap size" for synthetic
classes/objects. I do not think it is a big deal; however, if we dump
permgen/metaspace/codecache, then the number of synthetic objects might be
significant enough to confuse unsuspecting engineers.


1: https://bugs.openjdk.java.net/browse/JDK-8005604
2:
http://mail.openjdk.java.net/pipermail/serviceability-dev/2012-December/007852.html

-- 
Regards,
Vladimir Sitnikov

From shanliang.jiang at oracle.com  Tue Feb 18 13:17:40 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 18 Feb 2014 22:17:40 +0100
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <5303B966.20709@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>	<53034D8D.7050908@oracle.com>	<F775F855-28B6-42E6-A7F3-B467FD990B11@oracle.com>
	<5303B966.20709@oracle.com>
Message-ID: <5303CDF4.1050702@oracle.com>

I am looking at the old file:
 143         while (bkptCount < maxBkpts) {
 144             prevBkptCount = bkptCount;

Suppose the following execution sequence:
1) Thread1 executes Line 143 while bkptCount == maxBkpts - 1;

2) Thread2 executes bkptCount++;

3) Thread1 executes Line 144.

In this case the line
    152                 failure("failure: test hung");
is certain to be reached.

It is good to add:
    synchronized (bkptSignal)
in the fix, but we need to put Lines 143 and 144 inside the synchronized
block too.

To deal with a spurious wakeup, we might do something like this:
        long stopTime = System.currentTimeMillis() + 5000;
        do {
            try {
                bkptSignal.wait(100);
            } catch (InterruptedException e){}
        } while(prevBkptCount == bkptCount && System.currentTimeMillis() < stopTime);

Shanliang

David Holmes wrote:
> On 18/02/2014 11:03 PM, Staffan Larsen wrote:
>>
>> On 18 feb 2014, at 13:09, David Holmes <david.holmes at oracle.com> wrote:
>>
>>> Hi Staffan,
>>>
>>> If you get a spurious wakeup from wait():
>>>
>>> 151             try {
>>> 152                 synchronized (bkptSignal) {
>>> 153                     bkptSignal.wait(5000);
>>> 154                 }
>>> 155             } catch (InterruptedException ee) {
>>> 156             }
>>> 157             if (prevBkptCount == bkptCount) {
>>> 158                 failure("failure: test hung");
>>>
>>> you could report failure. But that is far less likely than the 
>>> current problem using sleep.
>>
>> Right. Adding "continue;" inside the catch(InterruptedException)
>> block should guard against that.
>
> No, a spurious wakeup is not an interrupt - the wait() will simply 
> return.
>
> David
>>
>> /Staffan
>>
>>>
>>> David
>>>
>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote:
>>>> Still looking for Reviewer for this change.
>>>>
>>>> Thanks,
>>>> /Staffan
>>>>
>>>> On 11 feb 2014, at 15:12, Staffan Larsen 
>>>> <staffan.larsen at oracle.com> wrote:
>>>>
>>>>> Updated the test to use proper synchronization and notification 
>>>>> between threads. Should be more stable and much faster.
>>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>>>>
>>>>> Thanks,
>>>>> /Staffan
>>>>
>>


From jaroslav.bachorik at oracle.com  Tue Feb 18 23:29:58 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 19 Feb 2014 08:29:58 +0100
Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong
	object
In-Reply-To: <CA+kOe09mG0g_5d=pa51Dgj-kLRfkf1J-HSmT9P9zctqA1ufr5w@mail.gmail.com>
References: <530388CC.9080807@oracle.com>
	<CA+kOe09mG0g_5d=pa51Dgj-kLRfkf1J-HSmT9P9zctqA1ufr5w@mail.gmail.com>
Message-ID: <53045D76.5080907@oracle.com>

On 18.2.2014 18:06, Martin Buchholz wrote:
> Not checking any details, but tests that want to wait for a particular
> thread state are a good reason to use
>
> volatile boolean flag;
> ...
> while (!flag) Thread.yield();
>
> I prefer calling Thread.yield to sleeping in this special case, in part
> because I don't want to rely on the implementation of sleep, while yield is
> semantically a no-op.  (Also sleeping 100ms is a long time for a computer)

There were discussions about Thread.yield() for a similar fix. The 
concern was that using Thread.yield() in a tight loop might very easily 
lead to starvation on single-core machines. Therefore Thread.sleep(10) 
is used to be sure the flag-setting thread actually has a chance to 
progress.
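
A minimal sketch of such sleep-based polling, assuming a hypothetical helper
named SleepPoll.await (it is not part of the webrev). It checks a condition
every 10 ms; the timeout is an addition made for the sketch so that a flag
that is never set cannot hang the caller forever.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not taken from the test under review.
class SleepPoll {
    /**
     * Sleeps in 10 ms steps until condition holds or timeoutMillis elapses.
     * Returns true if the condition was observed to hold.
     */
    static boolean await(BooleanSupplier condition, long timeoutMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            try {
                Thread.sleep(10); // gives the flag-setting thread time to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

With a field declared as "volatile boolean flag", a caller could write
SleepPoll.await(() -> flag, 5000).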

-JB-

>
>
>
> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik <
> jaroslav.bachorik at oracle.com> wrote:
>
>> Please, review the following test change.
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168
>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00
>>
>> The test fails because of falsely evaluating the thread being parked as
>> actually waiting on a monitor. This is because there is no difference in
>> java thread state for those two situations. The test is using Phaser for
>> synchronization between the checked and checking thread to make sure an
>> appropriate code section is entered before performing asserts. Then it
>> checks the checked thread state and waits till it becomes WAITING.
>> Unfortunately, when Phaser needs to wait it parks the thread and sets the
>> thread state to WAITING. From now on the test is in a completely random
>> state and the result will largely depend on timing - thus failing
>> intermittently.
>>
>> The solution is to use an additional volatile variable to prevent falsely
>> indicating the park() induced WAITING state.
>>
>> Thanks,
>>
>> -JB-
>>
>


From shanliang.jiang at oracle.com  Wed Feb 19 00:17:45 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Wed, 19 Feb 2014 09:17:45 +0100
Subject: Codereview request: 8035195 demo/jvmti/mtrace/TraceJFrame.java can't
	connect to X11
Message-ID: <530468A9.7080509@oracle.com>

Hi,

The failure was caused by a bad DISPLAY setting, which cannot have been 
intended, so we re-throw the exception with a message asking the user to 
make sure that DISPLAY is correct.

The fix only adds a catch of InternalError at the beginning; the rest of 
the code is not modified, only reformatted to remove some spaces.

webrev:
http://cr.openjdk.java.net/~sjiang/JDK-8035195/00/

bug:
https://bugs.openjdk.java.net/browse/JDK-8035195


Thanks,
Shanliang

From david.holmes at oracle.com  Tue Feb 18 22:09:34 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 19 Feb 2014 16:09:34 +1000
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <5303CDF4.1050702@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>	<53034D8D.7050908@oracle.com>	<F775F855-28B6-42E6-A7F3-B467FD990B11@oracle.com>
	<5303B966.20709@oracle.com> <5303CDF4.1050702@oracle.com>
Message-ID: <53044A9E.6050507@oracle.com>

On 19/02/2014 7:17 AM, shanliang wrote:
> I am looking at the old file:
> 143         while (bkptCount < maxBkpts) {
> 144             prevBkptCount = bkptCount;
>
> suppose the following execution sequence:
> 1)   when Line 143 was called by Thread1, we had bkptCount == maxBkpts - 1;
>
> 2) bkptCount++ was executed by thread2;
>
> 3) Line 144 was called by thread1,
>
> in this case it was sure that the line
>     152                 failure("failure: test hung");
> would be called.

Yes I was looking at that race too. The comments suggest that we should 
never reach a point where we get to maxBkpts, so this failure would be 
very rare and would likely indicate a real problem.

> It is good to add:
>     synchronized (bkptSignal)
> in the fix, but we need to put Line 143 and 144 into synchronization too.
>
> To deal with a spurious wakeup, we might do like this:
>         long stopTime = System.currentTimeMillis() + 5000;
>         do {
>             try {
>                 bkptSignal.wait(100);
>             } catch (InterruptedException e){}
>         } while(prevBkptCount == bkptCount && System.currentTimeMillis() < stopTime);

It is better to use System.nanoTime() rather than the non-monotonic 
currentTimeMillis(). And you really want a while loop rather than 
do-while so we don't always do that 100ms wait.
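
A sketch of what that could look like, under assumptions: the names
bkptSignal, bkptCount and prevBkptCount follow this thread, while the
BreakpointWaiter class and its methods are hypothetical and are not the
code in the webrev.

```java
// Illustrative sketch; BreakpointWaiter is a made-up class.
class BreakpointWaiter {
    private final Object bkptSignal = new Object();
    private int bkptCount; // guarded by bkptSignal

    /** Called by the thread that observes a breakpoint. */
    void breakpointHit() {
        synchronized (bkptSignal) {
            bkptCount++;
            bkptSignal.notifyAll();
        }
    }

    /**
     * Waits until bkptCount differs from prevBkptCount or timeoutMillis
     * elapses. Returns true if the count changed.
     */
    boolean awaitNextBreakpoint(int prevBkptCount, long timeoutMillis) {
        // nanoTime is monotonic, so wall-clock adjustments cannot skew the deadline
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (bkptSignal) {
            // while, not do-while: no wait at all if the count already changed
            while (bkptCount == prevBkptCount) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                try {
                    // round up so a sub-millisecond remainder never becomes
                    // wait(0), which would wait without a timeout
                    bkptSignal.wait((remainingNanos + 999_999L) / 1_000_000L);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }
}
```

A spurious wakeup simply goes around the loop again with the remaining time,
so it can neither report a hang early nor extend the overall timeout.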

David

> Shanliang
>
> David Holmes wrote:
>> On 18/02/2014 11:03 PM, Staffan Larsen wrote:
>>>
>>> On 18 feb 2014, at 13:09, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>>> Hi Staffan,
>>>>
>>>> If you get a spurious wakeup from wait():
>>>>
>>>> 151             try {
>>>> 152                 synchronized (bkptSignal) {
>>>> 153                     bkptSignal.wait(5000);
>>>> 154                 }
>>>> 155             } catch (InterruptedException ee) {
>>>> 156             }
>>>> 157             if (prevBkptCount == bkptCount) {
>>>> 158                 failure("failure: test hung");
>>>>
>>>> you could report failure. But that is far less likely than the
>>>> current problem using sleep.
>>>
>>> Right. Adding "continue;" inside the catch(InterruptedException)
>>> block should guard against that.
>>
>> No, a spurious wakeup is not an interrupt - the wait() will simply
>> return.
>>
>> David
>>>
>>> /Staffan
>>>
>>>>
>>>> David
>>>>
>>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote:
>>>>> Still looking for Reviewer for this change.
>>>>>
>>>>> Thanks,
>>>>> /Staffan
>>>>>
>>>>> On 11 feb 2014, at 15:12, Staffan Larsen
>>>>> <staffan.larsen at oracle.com> wrote:
>>>>>
>>>>>> Updated the test to use proper synchronization and notification
>>>>>> between threads. Should be more stable and much faster.
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>>>>>
>>>>>> Thanks,
>>>>>> /Staffan
>>>>>
>>>
>

From staffan.larsen at oracle.com  Wed Feb 19 01:01:59 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Wed, 19 Feb 2014 10:01:59 +0100
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <53044A9E.6050507@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>	<53034D8D.7050908@oracle.com>	<F775F855-28B6-42E6-A7F3-B467FD990B11@oracle.com>
	<5303B966.20709@oracle.com> <5303CDF4.1050702@oracle.com>
	<53044A9E.6050507@oracle.com>
Message-ID: <6B94A897-2385-472A-B0CA-B0FC56D1D818@oracle.com>

Thanks for the feedback!

I chose to use yet another variable to avoid the spurious wakeups. I've also increased the range of the synchronized statement to avoid the race.

http://cr.openjdk.java.net/~sla/6952105/webrev.01/

Thanks,
/Staffan

On 19 feb 2014, at 07:09, David Holmes <david.holmes at oracle.com> wrote:

> On 19/02/2014 7:17 AM, shanliang wrote:
>> I am looking at the old file:
>> 143         while (bkptCount < maxBkpts) {
>> 144             prevBkptCount = bkptCount;
>> 
>> suppose the following execution sequence:
>> 1)   when Line 143 was called by Thread1, we had bkptCount == maxBkpts - 1;
>> 
>> 2) bkptCount++ was executed by thread2;
>> 
>> 3) Line 144 was called by thread1,
>> 
>> in this case it was sure that the line
>>    152                 failure("failure: test hung");
>> would be called.
> 
> Yes I was looking at that race too. The comments suggest that we should never reach a point where we get to maxBkpts, so this failure would be very rare and would likely indicate a real problem.
> 
>> It is good to add:
>>    synchronized (bkptSignal)
>> in the fix, but we need to put Line 143 and 144 into synchronization too.
>> 
>> To deal with a spurious wakeup, we might do like this:
>>        long stopTime = System.currentTimeMillis() + 5000;
>>        do {
>>            try {
>>                bkptSignal.wait(100);
>>            } catch (InterruptedException e){}
>>        } while(prevBkptCount == bkptCount && System.currentTimeMillis() < stopTime);
> 
> It is better to use System.nanoTime() rather than the non-monotonic currentTimeMillis(). And you really want a while loop rather than do-while so we don't always do that 100ms wait.
> 
> David
> 
>> Shanliang
>> 
>> David Holmes wrote:
>>> On 18/02/2014 11:03 PM, Staffan Larsen wrote:
>>>> 
>>>> On 18 feb 2014, at 13:09, David Holmes <david.holmes at oracle.com> wrote:
>>>> 
>>>>> Hi Staffan,
>>>>> 
>>>>> If you get a spurious wakeup from wait():
>>>>> 
>>>>> 151             try {
>>>>> 152                 synchronized (bkptSignal) {
>>>>> 153                     bkptSignal.wait(5000);
>>>>> 154                 }
>>>>> 155             } catch (InterruptedException ee) {
>>>>> 156             }
>>>>> 157             if (prevBkptCount == bkptCount) {
>>>>> 158                 failure("failure: test hung");
>>>>> 
>>>>> you could report failure. But that is far less likely than the
>>>>> current problem using sleep.
>>>> 
>>>> Right. Adding "continue;" inside the catch(InterruptedException)
>>>> block should guard against that.
>>> 
>>> No, a spurious wakeup is not an interrupt - the wait() will simply
>>> return.
>>> 
>>> David
>>>> 
>>>> /Staffan
>>>> 
>>>>> 
>>>>> David
>>>>> 
>>>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote:
>>>>>> Still looking for Reviewer for this change.
>>>>>> 
>>>>>> Thanks,
>>>>>> /Staffan
>>>>>> 
>>>>>> On 11 feb 2014, at 15:12, Staffan Larsen
>>>>>> <staffan.larsen at oracle.com> wrote:
>>>>>> 
>>>>>>> Updated the test to use proper synchronization and notification
>>>>>>> between threads. Should be more stable and much faster.
>>>>>>> 
>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>>>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> /Staffan


From david.holmes at oracle.com  Wed Feb 19 01:38:51 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 19 Feb 2014 19:38:51 +1000
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <6B94A897-2385-472A-B0CA-B0FC56D1D818@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>	<53034D8D.7050908@oracle.com>	<F775F855-28B6-42E6-A7F3-B467FD990B11@oracle.com>
	<5303B966.20709@oracle.com> <5303CDF4.1050702@oracle.com>
	<53044A9E.6050507@oracle.com>
	<6B94A897-2385-472A-B0CA-B0FC56D1D818@oracle.com>
Message-ID: <53047BAB.5020904@oracle.com>

On 19/02/2014 7:01 PM, Staffan Larsen wrote:
> Thanks for the feedback!
>
> I chose to use yet another variable to avoid the spurious wakeups. I've
> also increased the range of the synchronized statement to avoid the race.
>
> http://cr.openjdk.java.net/~sla/6952105/webrev.01/

Slightly simpler to just do:

bkptSignal.wait(5000);
if (!signalSent)
   continue;

but what you have works.

Also signalSent doesn't need to be volatile as it is only accessed 
within the sync blocks.
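
For illustration, a sketch of a flag that is read and written only while
holding the monitor and therefore needs no volatile. The names bkptSignal and
signalSent follow this thread; SignalBox and its methods are hypothetical
and this is not the code in webrev.01.

```java
// Illustrative sketch; SignalBox is a made-up class.
class SignalBox {
    private final Object bkptSignal = new Object();
    // Not volatile: every access happens inside synchronized (bkptSignal),
    // and the monitor already provides the required visibility.
    private boolean signalSent;

    /** Sets the flag and wakes up any waiting thread. */
    void send() {
        synchronized (bkptSignal) {
            signalSent = true;
            bkptSignal.notifyAll();
        }
    }

    /**
     * Waits at most once for up to timeoutMillis (must be > 0), then reports
     * and clears the flag. A false result means a timeout, an interrupt or a
     * spurious wakeup.
     */
    boolean receive(long timeoutMillis) {
        synchronized (bkptSignal) {
            if (!signalSent) {
                try {
                    bkptSignal.wait(timeoutMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            boolean sent = signalSent;
            signalSent = false;
            return sent;
        }
    }
}
```

A caller that gets false from receive() can decide whether to retry, which
corresponds to the "continue" in the snippet above.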

Thanks,
David

> Thanks,
> /Staffan
>
> On 19 feb 2014, at 07:09, David Holmes <david.holmes at oracle.com> wrote:
>
>> On 19/02/2014 7:17 AM, shanliang wrote:
>>> I am looking at the old file:
>>> 143         while (bkptCount < maxBkpts) {
>>> 144             prevBkptCount = bkptCount;
>>>
>>> suppose the following execution sequence:
>>> 1)   when Line 143 was called by Thread1, we had bkptCount ==
>>> maxBkpts - 1;
>>>
>>> 2) bkptCount++ was executed by thread2;
>>>
>>> 3) Line 144 was called by thread1,
>>>
>>> in this case it was sure that the line
>>>    152                 failure("failure: test hung");
>>> would be called.
>>
>> Yes I was looking at that race too. The comments suggest that we
>> should never reach a point where we get to maxBkpts, so this failure
>> would be very rare and would likely indicate a real problem.
>>
>>> It is good to add:
>>>    synchronized (bkptSignal)
>>> in the fix, but we need to put Line 143 and 144 into synchronization too.
>>>
>>> To deal with a spurious wakeup, we might do like this:
>>>        long stopTime = System.currentTimeMillis() + 5000;
>>>        do {
>>>            try {
>>>                bkptSignal.wait(100);
>>>            } catch (InterruptedException e){}
>>>        } while(prevBkptCount == bkptCount && System.currentTimeMillis() < stopTime);
>>
>> It is better to use System.nanoTime() rather than the non-monotonic
>> currentTimeMillis(). And you really want a while loop rather than
>> do-while so we don't always do that 100ms wait.
>>
>> David
>>
>>> Shanliang
>>>
>>> David Holmes wrote:
>>>> On 18/02/2014 11:03 PM, Staffan Larsen wrote:
>>>>>
>>>>> On 18 feb 2014, at 13:09, David Holmes <david.holmes at oracle.com> wrote:
>>>>>
>>>>>> Hi Staffan,
>>>>>>
>>>>>> If you get a spurious wakeup from wait():
>>>>>>
>>>>>> 151             try {
>>>>>> 152                 synchronized (bkptSignal) {
>>>>>> 153                     bkptSignal.wait(5000);
>>>>>> 154                 }
>>>>>> 155             } catch (InterruptedException ee) {
>>>>>> 156             }
>>>>>> 157             if (prevBkptCount == bkptCount) {
>>>>>> 158                 failure("failure: test hung");
>>>>>>
>>>>>> you could report failure. But that is far less likely than the
>>>>>> current problem using sleep.
>>>>>
>>>>> Right. Adding "continue;" inside the catch(InterruptedException)
>>>>> block should guard against that.
>>>>
>>>> No, a spurious wakeup is not an interrupt - the wait() will simply
>>>> return.
>>>>
>>>> David
>>>>>
>>>>> /Staffan
>>>>>
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote:
>>>>>>> Still looking for Reviewer for this change.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 11 feb 2014, at 15:12, Staffan Larsen
>>>>>>> <staffan.larsen at oracle.com> wrote:
>>>>>>>
>>>>>>>> Updated the test to use proper synchronization and notification
>>>>>>>> between threads. Should be more stable and much faster.
>>>>>>>>
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>>>>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> /Staffan
>

From staffan.larsen at oracle.com  Wed Feb 19 01:42:51 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Wed, 19 Feb 2014 10:42:51 +0100
Subject: RFR(S): 6952105 TEST_BUG: testcase failure, not very often,
	com/sun/jdi/SuspendThreadTest.java
In-Reply-To: <53047BAB.5020904@oracle.com>
References: <E0E54A66-772E-4E94-BC30-FDC0564079B3@oracle.com>	<0C0B4222-F372-4484-B92B-647C6BAE3D19@oracle.com>	<53034D8D.7050908@oracle.com>	<F775F855-28B6-42E6-A7F3-B467FD990B11@oracle.com>
	<5303B966.20709@oracle.com> <5303CDF4.1050702@oracle.com>
	<53044A9E.6050507@oracle.com>
	<6B94A897-2385-472A-B0CA-B0FC56D1D818@oracle.com>
	<53047BAB.5020904@oracle.com>
Message-ID: <626ADA78-77A9-44C3-B2D5-C42798C4F46F@oracle.com>


On 19 feb 2014, at 10:38, David Holmes <david.holmes at oracle.com> wrote:

> On 19/02/2014 7:01 PM, Staffan Larsen wrote:
>> Thanks for the feedback!
>> 
>> I chose to use yet another variable to avoid the spurious wakeups. I've
>> also increased the range of the synchronized statement to avoid the race.
>> 
>> http://cr.openjdk.java.net/~sla/6952105/webrev.01/
> 
> Slightly simpler to just do:
> 
> bkptSignal.wait(5000);
> if (!signalSent)
>  continue;
> 
> but what you have works.
> 
> Also signalSent doesn't need to be volatile as it is only accessed within the sync blocks.

True. And true for bkptCount as well now, except for one usage in a println. I'll remove the volatile on signalSent, but keep it on bkptCount.

Thanks,
/Staffan

> 
> Thanks,
> David
> 
>> Thanks,
>> /Staffan
>> 
>> On 19 feb 2014, at 07:09, David Holmes <david.holmes at oracle.com> wrote:
>> 
>>> On 19/02/2014 7:17 AM, shanliang wrote:
>>>> I am looking at the old file:
>>>> 143         while (bkptCount < maxBkpts) {
>>>> 144             prevBkptCount = bkptCount;
>>>> 
>>>> suppose the following execution sequence:
>>>> 1)   when Line 143 was called by Thread1, we had bkptCount ==
>>>> maxBkpts - 1;
>>>> 
>>>> 2) bkptCount++ was executed by thread2;
>>>> 
>>>> 3) Line 144 was called by thread1,
>>>> 
>>>> in this case it was sure that the line
>>>>   152                 failure("failure: test hung");
>>>> would be called.
>>> 
>>> Yes I was looking at that race too. The comments suggest that we
>>> should never reach a point where we get to maxBkpts, so this failure
>>> would be very rare and would likely indicate a real problem.
>>> 
>>>> It is good to add:
>>>>   synchronized (bkptSignal)
>>>> in the fix, but we need to put Line 143 and 144 into synchronization too.
>>>> 
>>>> To deal with a spurious wakeup, we might do like this:
>>>>       long stopTime = System.currentTimeMillis() + 5000;
>>>>       do {
>>>>           try {
>>>>               bkptSignal.wait(100);
>>>>           } catch (InterruptedException e){}
>>>>       } while(prevBkptCount == bkptCount && System.currentTimeMillis() < stopTime);
>>> 
>>> It is better to use System.nanoTime() rather than the non-monotonic
>>> currentTimeMillis(). And you really want a while loop rather than
>>> do-while so we don't always do that 100ms wait.
>>> 
>>> David
>>> 
>>>> Shanliang
>>>> 
>>>> David Holmes wrote:
>>>>> On 18/02/2014 11:03 PM, Staffan Larsen wrote:
>>>>>> 
>>>>>> On 18 feb 2014, at 13:09, David Holmes <david.holmes at oracle.com> wrote:
>>>>>> 
>>>>>>> Hi Staffan,
>>>>>>> 
>>>>>>> If you get a spurious wakeup from wait():
>>>>>>> 
>>>>>>> 151             try {
>>>>>>> 152                 synchronized (bkptSignal) {
>>>>>>> 153                     bkptSignal.wait(5000);
>>>>>>> 154                 }
>>>>>>> 155             } catch (InterruptedException ee) {
>>>>>>> 156             }
>>>>>>> 157             if (prevBkptCount == bkptCount) {
>>>>>>> 158                 failure("failure: test hung");
>>>>>>> 
>>>>>>> you could report failure. But that is far less likely than the
>>>>>>> current problem using sleep.
>>>>>> 
>>>>>> Right. Adding "continue;" inside the catch(InterruptedException)
>>>>>> block should guard against that.
>>>>> 
>>>>> No, a spurious wakeup is not an interrupt - the wait() will simply
>>>>> return.
>>>>> 
>>>>> David
>>>>>> 
>>>>>> /Staffan
>>>>>> 
>>>>>>> 
>>>>>>> David
>>>>>>> 
>>>>>>> On 18/02/2014 8:19 PM, Staffan Larsen wrote:
>>>>>>>> Still looking for Reviewer for this change.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> /Staffan
>>>>>>>> 
>>>>>>>> On 11 feb 2014, at 15:12, Staffan Larsen
>>>>>>>> <staffan.larsen at oracle.com> wrote:
>>>>>>>> 
>>>>>>>>> Updated the test to use proper synchronization and notification
>>>>>>>>> between threads. Should be more stable and much faster.
>>>>>>>>> 
>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-6952105
>>>>>>>>> webrev: http://cr.openjdk.java.net/~sla/6952105/webrev.00/
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> /Staffan
>> 
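
For illustration, the waiting loop suggested in this thread could look
roughly like the sketch below. It assumes a counter guarded by a monitor:
the condition is re-checked in a while loop under the lock, so a spurious
wakeup simply loops again and no wait happens at all when the count has
already changed, and the deadline is computed with System.nanoTime(). The
class BkptWaiter and its method names are hypothetical and are not taken
from the test under review.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the test's breakpoint counter (not the real test code).
class BkptWaiter {
    private final Object bkptSignal = new Object();
    private int bkptCount = 0;

    // Called by the event-handling thread each time a breakpoint is hit.
    void breakpointHit() {
        synchronized (bkptSignal) {
            bkptCount++;
            bkptSignal.notifyAll();
        }
    }

    // Waits until bkptCount differs from prevBkptCount or the timeout elapses.
    // Returns true if the count changed, false on timeout.
    boolean awaitChange(int prevBkptCount, long timeout, TimeUnit unit) {
        // nanoTime() is monotonic, unlike currentTimeMillis().
        final long deadline = System.nanoTime() + unit.toNanos(timeout);
        synchronized (bkptSignal) {
            // while, not do-while: skip the wait if the count already changed.
            while (bkptCount == prevBkptCount) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                try {
                    // wait(0) would mean "wait forever", so wait at least 1 ms.
                    bkptSignal.wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
                } catch (InterruptedException e) {
                    // Preserve the interrupt and report the current state.
                    Thread.currentThread().interrupt();
                    return bkptCount != prevBkptCount;
                }
            }
            return true;
        }
    }
}
```

Because the read of the counter and the wait happen under the same lock
as the increment, the race between "Line 143" and "Line 144" described
above cannot occur in this arrangement.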


From taras.ledkov at oracle.com  Wed Feb 19 03:05:45 2014
From: taras.ledkov at oracle.com (taras ledkov)
Date: Wed, 19 Feb 2014 15:05:45 +0400
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <52F23A97.7020807@oracle.com>
References: <529EF58F.5000701@oracle.com>
	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>
	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>
	<52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com>
	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>
	<52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com>
	<52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com>
Message-ID: <53049009.9020204@oracle.com>

Hi,

Imports are fixed:

http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/

On 05.02.2014 17:20, Jaroslav Bachorik wrote:
> Hi Taras,
>
> thanks for taking care of this.
>
> The changes look fine to me.
>
> One minor nit is unused imports of the library classes in
> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java".
> It does not use any of those classes as its base class
> "AbstractFilePermissionTest" does all the heavy lifting.
>
> Cheers,
>
> -JB-
>
> On 5.2.2014 13:42, taras ledkov wrote:
>> Hi,
>>
>> So please take a look at the review against JDK9.
>> The reviewed patch had not been integrated into JDK8.
>>
>> Port to JDK9 is identical. The difference: the ProcessTools.java has
>> been already patched by Jaroslav.
>>
>> Webrev for jdk part:
>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/
>>
>> Webrev for hs part:
>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>
>>
>> On 21.01.2014 13:45, Jaroslav Bachorik wrote:
>>> Hi Taras,
>>>
>>> On 21.1.2014 10:30, taras ledkov wrote:
>>>> Hi Jaroslav,
>>>>
>>>> Could you please review the last changes?
>>>> Are you OK?
>>>
>>> Yes, the change looks ok. But I think we will need to get back to this
>>> problem eventually and implement a central port dispatcher if we want to
>>> be 100% sure the port conflicts wouldn't occur. But your changes reduce
>>> the chance significantly.
>>>
>>> Thanks for taking care of this.
>>>
>>> -JB-
>>>
>>>>
>>>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>>
>>>>> Thanks,
>>>>> /Staffan
>>>>>
>>>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Staffan,
>>>>>>
>>>>>> I fixed the tests in accordance with your comments.
>>>>>> Are you OK?
>>>>>>
>>>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please take a look at the new review.
>>>>>>>
>>>>>>> Webrev for jdk part:
>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>>>
>>>>>>> Webrev for hs part:
>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>>>
>>>>>>> My answers are inline:
>>>>>>>
>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>>>> Hi Taras,
>>>>>>>>
>>>>>>>> Thanks for doing this clean up and conversion of tests into Java.
>>>>>>>> Here's a couple of comments:
>>>>>>>>
>>>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>>>> This test could be simplified by not specifying an address at all.
>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp,
>>>>>>>> there
>>>>>>>> is no reason to specify an address. If address is unspecified (and
>>>>>>>> server=y), the connector will pick an address and print it to the
>>>>>>>> command line. Thus the only change that needs to be done is to
>>>>>>>> remove
>>>>>>>> ",address=8888" from the @run command.
>>>>>>> fixed
>>>>>>>
>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>>>> These tests do not compile cleanly with an empty JTwork
>>>>>>>> directory. It
>>>>>>>> seems that having one @build for each class does not work well -
>>>>>>>> when
>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving
>>>>>>>> all
>>>>>>>> classes to one @build statement solved this problem for me.
>>>>>>> fixed
>>>>>>>
>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>>>   187         Future<Void> stdoutTask = stdout.process();
>>>>>>>>   188         Future<Void> stderrTask = stderr.process();
>>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>>> fixed
>>>>>>>
>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>>>> At first I thought something was wrong with this file - the diff is
>>>>>>>> very weird. Then I realized you renamed an old file and created a
>>>>>>>> new
>>>>>>>> file using the old name.
>>>>>>> You are right. I did it to keep the test name.
>>>>>>>
>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks like
>>>>>>>> you
>>>>>>>> delete the files at the beginning of the test in any case.
>>>>>>> I think yes. In the first place, this functionality was in the old
>>>>>>> code.
>>>>>>> In the second place, a file without write permission may be a
>>>>>>> problem
>>>>>>> for a further cleanup (not by the test, for example for the tests
>>>>>>> launcher scripts etc.)
>>>>>>>
>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>>>> sub-classes, the other has a getter method.
>>>>>>> fixed
>>>>>>>
>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an
>>>>>>>> ex.printStackTrace().
>>>>>>>> (Same thing for all other places where you call Integer.parseInt())
>>>>>>> fixed
>>>>>>>
>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>>>> This file is never used as far as I can see.
>>>>>>> It is used by PasswordFilePermissionTest &
>>>>>>> SSLConfigFilePermissionTest
>>>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>>>> AbstractFilePermissionTest : 162).
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> /Staffan
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please take a look at the review with fixed issues about trying to
>>>>>>>>> launch test that needs free port several times.
>>>>>>>>>
>>>>>>>>> Webrev for jdk part:
>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>>
>>>>>>>>> Webrev for hs part:
>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>>
>>>>>>>>> Please note the new method ProcessTools.startProcess(String,
>>>>>>>>> ProcessBuilder, Consumer<String>), which is used to analyze all
>>>>>>>>> output of a sub-process. It shares a common part with
>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>> Predicate<String>, long, TimeUnit), which is used to determine
>>>>>>>>> the warm-up moment.
>>>>>>>>>
>>>>>>>>> I think ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>> Predicate<String>, long, TimeUnit) may be changed by adding a
>>>>>>>>> LinePump to stderr if there is no serious reason for restricting
>>>>>>>>> the warm-up analysis to the stdout stream.
>>>>>>>>>
>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I've consulted with the Serviceability engineers (added them to
>>>>>>>>>> the CC list) and they would like the tests to solve this problem
>>>>>>>>>> as follows for now:
>>>>>>>>>>
>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Katja
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>>> Guys.
>>>>>>>>>>>
>>>>>>>>>>> Let me try to sum up what was said before and maybe suggest a
>>>>>>>>>>> compromise.
>>>>>>>>>>>
>>>>>>>>>>> 1. There is a desire to have support for port allocation at the
>>>>>>>>>>> level of a JTReg suite execution. Taras created a bug for that
>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it
>>>>>>>>>>> is a test harness API or a library API does not really matter
>>>>>>>>>>> from a usage point of view.
>>>>>>>>>>>
>>>>>>>>>>> 2. There is no way to make the tests absolutely stable, whatever
>>>>>>>>>>> port
>>>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>>>> perform
>>>>>>>>>>> the test logic with different ports until the test succeeds.
>>>>>>>>>>>
>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of course,
>>>>>>>>>>> but
>>>>>>>>>>> better be used in conjunction with a meaningful port selection
>>>>>>>>>>> algorithm.
>>>>>>>>>>>
>>>>>>>>>>> At the same time, copying loop-until-success logic from one
>>>>>>>>>>> test to another may not be the best solution. A library could
>>>>>>>>>>> help with that, I believe. There only needs to be an API method
>>>>>>>>>>> which takes behavior as a parameter and runs it until it
>>>>>>>>>>> succeeds. Something like:
>>>>>>>>>>> public <T> runOnAFreePort(Function<T, Integer>)
>>>>>>>>>>> or similar. There could be arguments of how/whether to implement
>>>>>>>>>>> it,
>>>>>>>>>>> the solution would not work for shell tests, etc, but still ...
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>>>
>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>>>> agreement in
>>>>>>>>>>> the library, etc.
>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate the
>>>>>>>>>>> changes.
>>>>>>>>>>>
>>>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>>>
>>>>>>>>>>> I personally prefer #1 for the reason that the changes are
>>>>>>>>>>> already supposed to make the tests more stable, and also there
>>>>>>>>>>> are many more tests which use ports, so the scope of the problem
>>>>>>>>>>> is bigger than these.
>>>>>>>>>>>
>>>>>>>>>>> Shura
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Taras,
>>>>>>>>>>>>
>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort()
>>>>>>>>>>>> does not
>>>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>>>> process.
>>>>>>>>>>>> Unfortunately I don't think the library can do more. However,
>>>>>>>>>>>> there is a
>>>>>>>>>>>> solution.
>>>>>>>>>>>>
>>>>>>>>>>>> Please, look at the *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>>>> start a
>>>>>>>>>>>> process with a free port and then check if
>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has been
>>>>>>>>>>>> thrown.
>>>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Katja
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Whatever logic is chosen to select a free port, it is the
>>>>>>>>>>>>> library's responsibility to implement it, wouldn't you agree?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we
>>>>>>>>>>>>> could do
>>>>>>>>>>>>> it later in the library.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the
>>>>>>>>>>>>>>> port is
>>>>>>>>>>>>>>> free.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Moreover, the port returned by getFreePort()[1] remains
>>>>>>>>>>>>>>> inaccessible for some time - it depends on the system setup;
>>>>>>>>>>>>>>> take a look at the discussions around SO_REUSEPORT for Linux
>>>>>>>>>>>>>>> or SO_REUSEADDR and SO_LINGER for BSD.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So from a stability point of view it's better to just return a
>>>>>>>>>>>>>>> random number between 49152 and 65535.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When more
>>>>>>>>>>>>>> tests are run in parallel, all of them requiring a free port,
>>>>>>>>>>>>>> nothing prevents the random function from returning the same port
>>>>>>>>>>>>>> to all of them. Also, two subsequent requests can return the same
>>>>>>>>>>>>>> port and cause timing problems when a port used by a previous test
>>>>>>>>>>>>>> is not fully ready to be assigned to a different socket. And as
>>>>>>>>>>>>>> Dmitry pointed out, unless one can keep hold of the allocated
>>>>>>>>>>>>>> socket and use it later, there is no guarantee that a port which
>>>>>>>>>>>>>> was tested as unallocated will remain unallocated for the next few
>>>>>>>>>>>>>> milliseconds.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The only fail-proof solution would be a port allocating service
>>>>>>>>>>>>>> provided by the harness. Until then we can only (hopefully)
>>>>>>>>>>>>>> decrease the chance of intermittent failures due to a port being
>>>>>>>>>>>>>> in use.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> public static int getFreePort()
>>>>>>>>>>>>>>>         throws InterruptedException, IOException {
>>>>>>>>>>>>>>>     int port = -1;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     while (port <= 0) {
>>>>>>>>>>>>>>>         Thread.sleep(100);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         ServerSocket serverSocket = null;
>>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>>             serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>>>             port = serverSocket.getLocalPort();
>>>>>>>>>>>>>>>         } finally {
>>>>>>>>>>>>>>>             serverSocket.close();
>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     return port;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> fyi, the jdk.testlibrary.Utils.getFreePort() method will
>>>>>>>>>>>>>>>> open a free socket, close it, and return the port number.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have the
>>>>>>>>>>>>>>>> system assign the port #.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> *The only* correct way to take a really free port is:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. Choose a random number between 49152 and 65535
>>>>>>>>>>>>>>>>> 2. Open a socket
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> if the socket fails - repeat step 1
>>>>>>>>>>>>>>>>> if the socket is OK - return the *socket*
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass the
>>>>>>>>>>>>>>>>> port number as a property value) you shouldn't do any
>>>>>>>>>>>>>>>>> pre-check, as it has no value - as soon as you close the
>>>>>>>>>>>>>>>>> socket someone can take the port.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So just choose a random number within the range above and let
>>>>>>>>>>>>>>>>> the networking code that opens the socket handle the port
>>>>>>>>>>>>>>>>> conflict.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & Jaroslav
>>>>>>>>>>>>>>>>>> Bachorik
>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest &
>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>>> looked
>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for
>>>>>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've renamed to
>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace old
>>>>>>>>>>>>>>>>>> shell
>>>>>>>>>>>>>>>>>> script
>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called RmiRegistrySslTest.java,
>>>>>>>>>>>>>>>>>> hence the
>>>>>>>>>>>>>>>>>> huge
>>>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines similar
>>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided
>>>>>>>>>>>>>>>>>> to not
>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please
>>>>>>>>>>>>>>>>>> let me
>>>>>>>>>>>>>>>>>> know if
>>>>>>>>>>>>>>>>>> this is somehow not acceptable
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is added to
>>>>>>>>>>>>>>>>>> hotspot
>>>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test
>>>>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> With best regards,
>>>>>>>>> Taras Ledkov
>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>> skype: taras_ledkov
>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> With best regards,
>>>>>> Taras Ledkov
>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>> skype: taras_ledkov
>>>>>> Phone: 7(812)3346-157
>>>>>
>>>>
>>>
>>
>
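
The "run it until it succeeds" library method proposed in the quoted
discussion could be sketched roughly as follows. This is only an
illustration under assumptions: PortRetry, PortAction, probeFreePort and
the maxAttempts parameter are hypothetical names, not part of
jdk.testlibrary, and a port conflict is assumed to surface as a
java.net.BindException (a test that starts a separate process would have
to detect the conflict from that process's output instead).

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical helper; not an existing test library API.
class PortRetry {
    // Behavior that needs a port number and may fail with a port conflict.
    interface PortAction<T> {
        T run(int port) throws IOException;
    }

    // Asks the OS for a currently unused port. The port can be taken by
    // someone else as soon as the probe socket is closed.
    static int probeFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    // Runs the action on a freshly probed port, retrying on a bind conflict.
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BindException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = probeFreePort();
            try {
                return action.run(port);
            } catch (BindException e) {
                last = e; // the port was grabbed in the meantime; try another one
            }
        }
        throw new IOException("no usable port after " + maxAttempts + " attempts", last);
    }
}
```

A test would pass its whole port-dependent body as the action, so the
retry loop lives in one place instead of being copied into every test.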

-- 
With best regards,
Taras Ledkov
Mail-To: taras.ledkov at oracle.com
skype: taras_ledkov
Phone: 7(812)3346-157
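
For the case where the test itself owns the server socket, the advice
quoted in this thread (let the system assign the port by passing 0, and
return the socket rather than the port number) can be sketched as below.
HeldPort and its methods are hypothetical names used only for this
illustration; the point is that the socket is never closed between
choosing the port and using it, so there is no window in which another
process can take the port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration; not part of any test library.
class HeldPort {
    // Binds to a system-assigned port on the loopback interface and returns
    // the open socket, so the port stays reserved for the caller.
    static ServerSocket openOnAnyFreePort() throws IOException {
        return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
    }

    // Connects a client to the held socket and checks that the accepted
    // connection is the one the client opened.
    static boolean selfConnect() throws IOException {
        try (ServerSocket server = openOnAnyFreePort();
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            server.setSoTimeout(5000); // do not block forever in accept()
            try (Socket accepted = server.accept()) {
                return accepted.getPort() == client.getLocalPort();
            }
        }
    }
}
```

This does not help when the port number has to be handed to another
process as a property value; in that case only a retry on conflict
remains.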

From jaroslav.bachorik at oracle.com  Wed Feb 19 07:59:08 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 19 Feb 2014 16:59:08 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <53033378.2060300@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
	<52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com>
	<5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com>
Message-ID: <5304D4CC.1020007@oracle.com>

On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Jaroslav,
>>>>>
>>>>> It looks good in general modulo indent comments from Dan.
>>>>>
>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed
>>>>> or right thing to do in the JvmtiExport::clear_detected_exception().
>>>>> It seems, both clear_exception_detected() and
>>>>> set_exception_detected() are always
>>>>> called on current thread and so, it has to be safe to do without
>>>>> acquiring any locks.
>>>>
>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>> can also be queried/modified by other threads so grabbing the
>>>> associated lock is a good idea.
>>>
>>> The lock synchronization is cooperative.
>>> It does not help much if the lock is not acquired in other places.
>>> I can be wrong, but I've not found yet any place in the code where the
>>> clear_exception_detected() and set_exception_detected() are called
>>> under protection of the JvmtiThreadState_lock.
>>
>> I copied the locking over from
>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also
>> supposed to work only with the current thread but acquires the lock
>> nonetheless. But if you are sure that the lock is not required I have
>> no objections removing it.
>
> I'm suggesting to remove it, as it is not used in other places in the code.
> It is going to be confusing if it is used in one place and missed in
> others.

I've removed the lock and applied the same cleanup logic to other places 
where exceptions are rewrapped.

Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
JPRT run: 
http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/
Aurora Adhoc: 
http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full 
(still running at the moment; no failures so far)

>
> Thanks,
> Serguei
>
>>
>> -JB-
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> And I'm repeating my question about pre-integration testing (Dan is
>>>>> asking about the same).
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>> This is a round-0 review request.
>>>>>>
>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>> case, propagates to JDI).
>>>>>>
>>>>>> The reflection code lacks the traditional error handler - therefore,
>>>>>> upon throwing the NumberFormatException, the stack is searched for
>>>>>> appropriate handlers and none are found. This leaves the
>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>> reset to false once the exception is handled. The reflection code
>>>>>> then goes on and wraps the NumberFormatException into
>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>> exception event will be sent out.
>>>>>>
>>>>>> The proposed solution is to call
>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>> appropriate places in the reflection code to reset the
>>>>>> "exception_detected" flag and enable the InvocationTargetException
>>>>>> be properly reported over JVMTI.
>>>>>>
>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> -JB-
>>>>>
>>>>
>>>
>>>
>>
>
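
For readers unfamiliar with the rewrapping described in the original
request, the Java-level behavior can be reproduced with a small sketch.
It only shows that a NumberFormatException thrown by a reflectively
invoked method reaches the caller wrapped in an
InvocationTargetException; it does not exercise JVMTI or the
"exception_detected" flag. The class name ReflectiveWrapDemo is
hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical demo class; not taken from the nsk tests.
class ReflectiveWrapDemo {
    // Invokes Integer.parseInt reflectively with bad input and describes
    // the exception that reaches the caller together with its cause.
    static String invokeAndDescribe() {
        try {
            Method parseInt = Integer.class.getMethod("parseInt", String.class);
            parseInt.invoke(null, "not a number");
            return "no exception";
        } catch (InvocationTargetException e) {
            // The original exception is available as the cause.
            return e.getClass().getSimpleName() + " <- "
                    + e.getCause().getClass().getSimpleName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling ReflectiveWrapDemo.invokeAndDescribe() returns
"InvocationTargetException <- NumberFormatException".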


From martinrb at google.com  Wed Feb 19 08:56:03 2014
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 19 Feb 2014 08:56:03 -0800
Subject: RFR 8034168: ThreadMXBean/Locks.java failed,
	blocked on wrong object
In-Reply-To: <53045D76.5080907@oracle.com>
References: <530388CC.9080807@oracle.com>
	<CA+kOe09mG0g_5d=pa51Dgj-kLRfkf1J-HSmT9P9zctqA1ufr5w@mail.gmail.com>
	<53045D76.5080907@oracle.com>
Message-ID: <CA+kOe08BRfbu63CAzAZ4p7ZpYfmHGBBSQVvFaMVg6-2+xeuV6A@mail.gmail.com>

The jsr166 tck tests make systematic use of Thread.yield, e.g.


    /**
     * Spin-waits up to the specified number of milliseconds for the given
     * thread to enter a wait state: BLOCKED, WAITING, or TIMED_WAITING.
     */
    void waitForThreadToEnterWaitState(Thread thread, long timeoutMillis) {
        long startTime = System.nanoTime();
        for (;;) {
            Thread.State s = thread.getState();
            if (s == Thread.State.BLOCKED ||
                s == Thread.State.WAITING ||
                s == Thread.State.TIMED_WAITING)
                return;
            else if (s == Thread.State.TERMINATED)
                fail("Unexpected thread termination");
            else if (millisElapsedSince(startTime) > timeoutMillis) {
                threadAssertTrue(thread.isAlive());
                return;
            }
            Thread.yield();
        }
    }


On Tue, Feb 18, 2014 at 11:29 PM, Jaroslav Bachorik <
jaroslav.bachorik at oracle.com> wrote:

> On 18.2.2014 18:06, Martin Buchholz wrote:
>
>> Not checking any details, but tests that want to wait for a particular
>> thread state are a good reason to use
>>
>> volatile boolean flag;
>> ...
>> while (!flag) Thread.yield();
>>
>> I prefer calling Thread.yield to sleeping in this special case, in part
>> because I don't want to rely on the implementation of sleep, while yield
>> is
>> semantically a no-op.  (Also sleeping 100ms is a long time for a computer)
>>
>
> There were discussions for a similar fix regarding Thread.yield(). The
> concern was that using Thread.yield() in a tight loop might very easily
> lead to starvation on single core machines. Therefore Thread.sleep(10) is
> used to be sure the flag setting thread has actually a chance to progress.
>
> -JB-
>
>
>
>>
>>
>> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik <
>> jaroslav.bachorik at oracle.com> wrote:
>>
>>  Please, review the following test change.
>>>
>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168
>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00
>>>
>>> The test fails because of falsely evaluating the thread being parked as
>>> actually waiting on a monitor. This is because there is no difference in
>>> java thread state for those two situations. The test is using Phaser for
>>> synchronization between the checked and checking thread to make sure an
>>> appropriate code section is entered before performing asserts. Then it
>>> checks the checked thread state and waits till it becomes WAITING.
>>> Unfortunately, when Phaser needs to wait it parks the thread and sets the
>>> thread state to WAITING. From now on the test is in a completely random
>>> state and the result will largely depend on timing - thus failing
>>> intermittently.
>>>
>>> The solution is to use an additional volatile variable to prevent falsely
>>> indicating the park() induced WAITING state.
>>>
>>> Thanks,
>>>
>>> -JB-
>>>
>>>
>>
>
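
The ambiguity described in the quoted review request can be seen in a
small sketch: a thread blocked in LockSupport.park() reports
Thread.State.WAITING, the same state a thread reports while in
Object.wait(), so the state alone does not say which of the two the
thread is doing. The class ParkStateDemo and its release flag are
hypothetical names; the polling loop uses Thread.yield() with a
nanoTime() deadline, in the style of the jsr166 helper shown above.

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo class; not the Locks.java test itself.
class ParkStateDemo {
    static volatile boolean release = false;

    // Starts a thread that parks, polls until it is seen as WAITING (or a
    // 5 second deadline passes), then lets it go and returns the last state.
    static Thread.State stateOfParkedThread() {
        release = false;
        Thread parked = new Thread(() -> {
            // park() may return spuriously, so re-check the flag in a loop.
            while (!release) {
                LockSupport.park();
            }
        });
        parked.setDaemon(true);
        parked.start();

        long deadline = System.nanoTime() + 5_000_000_000L;
        Thread.State state;
        while ((state = parked.getState()) != Thread.State.WAITING
                && System.nanoTime() - deadline < 0) {
            Thread.yield();
        }

        release = true;
        LockSupport.unpark(parked);
        return state;
    }
}
```

An additional volatile flag set by the checked thread just before it
enters the monitor wait, as proposed in the fix, is one way to tell the
two situations apart.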

From daniel.daugherty at oracle.com  Wed Feb 19 09:01:39 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 19 Feb 2014 10:01:39 -0700
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5304D4CC.1020007@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
	<52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com>
	<5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com>
	<5304D4CC.1020007@oracle.com>
Message-ID: <5304E373.1050003@oracle.com>

 > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02

src/share/vm/prims/jvmtiExport.hpp
     No comments.

src/share/vm/prims/jvmtiExport.cpp
     No comments.

src/share/vm/oops/instanceKlass.cpp
     No comments.

src/share/vm/prims/jvm.cpp
     No comments.

src/share/vm/runtime/reflection.cpp
     lines 948, 1085: HotSpot indent is two spaces

Thumbs up.

Dan


On 2/19/14 8:59 AM, Jaroslav Bachorik wrote:
> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Jaroslav,
>>>>>>
>>>>>> It looks good in general modulo indent comments from Dan.
>>>>>>
>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is 
>>>>>> needed
>>>>>> or right thing to do in the JvmtiExport::clear_detected_exception().
>>>>>> It seems, both clear_exception_detected() and
>>>>>> set_exception_detected() are always
>>>>>> called on current thread and so, it has to be safe to do without
>>>>>> acquiring any locks.
>>>>>
>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>>> can also be queried/modified by other threads so grabbing the
>>>>> associated lock is a good idea.
>>>>
>>>> The lock synchronization is cooperative.
>>>> It does not help much if the lock is not acquired in other places.
>>>> I can be wrong, but I've not found yet any place in the code where the
>>>> clear_exception_detected() and set_exception_detected() are called
>>>> under protection of the JvmtiThreadState_lock.
>>>
>>> I copied the locking over from
>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also
>>> supposed to work only with the current thread but acquires the lock
>>> nonetheless. But if you are sure that the lock is not required I have
>>> no objections removing it.
>>
>> I'm suggesting to remove it, as it is not used in other places in the 
>> code.
>> It is going to be confusing if it is used in one place and missed in
>> others.
>
> I've removed the lock and applied the same cleanup logic to other 
> places where exceptions are rewrapped.
>
> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
> JPRT run: 
> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/
> Aurora Adhoc: 
> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full 
> (still running at the moment; no failures so far)
>
>>
>> Thanks,
>> Serguei
>>
>>>
>>> -JB-
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> And I'm repeating my question about pre-integration testing (Dan is
>>>>>> asking about the same).
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>>> This is a round-0 review request.
>>>>>>>
>>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>>> case, propagates to JDI).
>>>>>>>
>>>>>>> The reflection code lacks the traditional error handler - 
>>>>>>> therefore,
>>>>>>> upon throwing the NumberFormatException, the stack is searched for
>>>>>>> appropriate handlers and none are found. This leaves the
>>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>>> reset to false once the exception is handled. The reflection code
>>>>>>> then goes on and wraps the NumberFormatException into
>>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>>> exception event will be sent out.
>>>>>>>
>>>>>>> The proposed solution is to call
>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>>> appropriate places in the reflection code to reset the
>>>>>>> "exception_detected" flag and enable the InvocationTargetException
>>>>>>> be properly reported over JVMTI.
>>>>>>>
>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> -JB-
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>
>


From Alan.Bateman at oracle.com  Wed Feb 19 09:07:04 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 19 Feb 2014 17:07:04 +0000
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <5303B85D.10505@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
	<530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com>
	<53031AE4.2050700@oracle.com> <5303B85D.10505@oracle.com>
Message-ID: <5304E4B8.6040908@oracle.com>

On 18/02/2014 19:45, Mikael Vidstedt wrote:
>
> That makes sense, and in fact parse_manifest.c does not even appear to 
> allow for \t, so I'm more and more starting to think that a reasonable 
> implementation in this context would be:
>
> static int isNormalSpace(int c) { return c == ' '; }
>
> In which case it probably shouldn't even be a separate function to 
> start with. I would like to get a second opinion on the implications 
> of only checking for ' ' (0x20) though.
>
> If we want to allow both ' ' and \t we should probably call the 
> function isblankAscii.
Thanks again for taking this. As for \t, if it's not handled by the 
parsing code then isNormalSpace should be fine.
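
For readers following along, the two candidates discussed above can be put side by side. This is only a sketch: isNormalSpace is copied from the quoted mail, while the body of isblankAscii is an assumption, since the thread only proposes the name.

```c
/* From the quoted mail: matches only the ASCII space character (0x20). */
static int isNormalSpace(int c) { return c == ' '; }

/* Assumed body for the name suggested in the thread:
 * accepts ASCII space and horizontal tab, nothing else. */
static int isblankAscii(int c) { return c == ' ' || c == '\t'; }
```

Neither function consults the locale, so both give the same answer for any int argument regardless of the platform's isspace() tables.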

-Alan.

From serguei.spitsyn at oracle.com  Wed Feb 19 10:20:38 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 19 Feb 2014 10:20:38 -0800
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5304D4CC.1020007@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
	<52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com>
	<5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com>
	<5304D4CC.1020007@oracle.com>
Message-ID: <5304F5F6.5070500@oracle.com>


On 2/19/14 7:59 AM, Jaroslav Bachorik wrote:
> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Jaroslav,
>>>>>>
>>>>>> It looks good in general modulo indent comments from Dan.
>>>>>>
>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is 
>>>>>> needed
>>>>>> or right thing to do in the JvmtiExport::clear_detected_exception().
>>>>>> It seems, both clear_exception_detected() and
>>>>>> set_exception_detected() are always
>>>>>> called on current thread and so, it has to be safe to do without
>>>>>> acquiring any locks.
>>>>>
>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>>> can also be queried/modified by other threads so grabbing the
>>>>> associated lock is a good idea.
>>>>
>>>> The lock synchronization is cooperative.
>>>> It does not help much if the lock is not acquired in other places.
>>>> I can be wrong, but I've not found yet any place in the code where the
>>>> clear_exception_detected() and set_exception_detected() are called
>>>> under protection of the JvmtiThreadState_lock.
>>>
>>> I copied the locking over from
>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also
>>> supposed to work only with the current thread but acquires the lock
>>> nonetheless. But if you are sure that the lock is not required I have
>>> no objections removing it.
>>
>> I'm suggesting to remove it, as it is not used in other places in the 
>> code.
>> It is going to be confusing if it is used in one place and missed in
>> others.
>
> I've removed the lock and applied the same cleanup logic to other 
> places where exceptions are rewrapped.
>
> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
> JPRT run: 
> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/
> Aurora Adhoc: 
> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full 
> (still running at the moment; no failures so far)
>

It looks good to me.
Thank you for the changes in the fix!

Thanks,
Serguei


>>
>> Thanks,
>> Serguei
>>
>>>
>>> -JB-
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> And I'm repeating my question about pre-integration testing (Dan is
>>>>>> asking about the same).
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>>> This is a round-0 review request.
>>>>>>>
>>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>>> case, propagates to JDI).
>>>>>>>
>>>>>>> The reflection code lacks the traditional error handler - 
>>>>>>> therefore,
>>>>>>> upon throwing the NumberFormatException, the stack is searched for
>>>>>>> appropriate handlers and none are found. This leaves the
>>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>>> reset to false once the exception is handled. The reflection code
>>>>>>> then goes on and wraps the NumberFormatException into
>>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>>> exception event will be sent out.
>>>>>>>
>>>>>>> The proposed solution is to call
>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>>> appropriate places in the reflection code to reset the
>>>>>>> "exception_detected" flag and enable the InvocationTargetException
>>>>>>> be properly reported over JVMTI.
>>>>>>>
>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> -JB-
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>


From mikael.vidstedt at oracle.com  Wed Feb 19 10:22:06 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Wed, 19 Feb 2014 10:22:06 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <5304E4B8.6040908@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
	<530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com>
	<53031AE4.2050700@oracle.com> <5303B85D.10505@oracle.com>
	<5304E4B8.6040908@oracle.com>
Message-ID: <5304F64E.4000206@oracle.com>


On 2014-02-19 09:07, Alan Bateman wrote:
> On 18/02/2014 19:45, Mikael Vidstedt wrote:
>>
>> That makes sense, and in fact parse_manifest.c does not even appear 
>> to allow for \t, so I'm more and more starting to think that a 
>> reasonable implementation in this context would be:
>>
>> static int isNormalSpace(int c) { return c == ' '; }
>>
>> In which case it probably shouldn't even be a separate function to 
>> start with. I would like to get a second opinion on the implications 
>> of only checking for ' ' (0x20) though.
>>
>> If we want to allow both ' ' and \t we should probably call the 
>> function isblankAscii.
> Thanks again for taking this. As for \t, if it's not handled by the 
> parsing code then isNormalSpace should be fine.

Since I'm not exactly an expert on the code in question I would 
certainly appreciate it if somebody could double-check me on that. I'm 
looking at parse_nv_pair (lines 430-542) in:

http://hg.openjdk.java.net/jdk9/dev/jdk/file/c766ec3e4877/src/share/bin/parse_manifest.c

The documented grammar in the comment only mentions "SPACE" and the code 
below doesn't make any references to \t. As a matter of fact, it only 
checks for one single, mandatory SPACE after the colon (enforced at lines 
535-536) and doesn't bother to remove any space characters at the end of 
the value. The while loop only deals with continuations. If additional 
spaces do exist they will as far as I can tell be part of the value. Are 
they trimmed later? I'm assuming it would be nice to have both parsers 
(parse_manifest & JarFacade) behave the same way?
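
The behaviour described above can be sketched roughly as follows. This is a hypothetical helper written for illustration, not the real parse_nv_pair(); the name split_nv_pair and its return convention are assumptions, and continuation lines are ignored entirely.

```c
#include <string.h>

/*
 * Hypothetical helper, not the real parse_nv_pair(): splits "Name: value"
 * in place. Exactly one space after the colon is required and consumed;
 * any further spaces stay in the value, and trailing spaces are not trimmed.
 * Returns 0 on success, -1 if the colon or the mandatory space is missing.
 */
static int split_nv_pair(char *line, char **name, char **value) {
    char *colon = strchr(line, ':');
    if (colon == NULL || colon[1] != ' ') {
        return -1;
    }
    *colon = '\0';        /* terminate the name */
    *name = line;
    *value = colon + 2;   /* skip ':' and the single mandatory space */
    return 0;
}
```

With this sketch, "Main-Class:  Foo " yields the value " Foo " (extra leading and trailing spaces preserved), and a tab after the colon is rejected.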

Here's what it would look like to only check for space, but still eat 
any additional spaces, which doesn't match what 
parse_manifest/parse_nv_pair does:

http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/

Cheers,
Mikael


From coleen.phillimore at oracle.com  Wed Feb 19 11:28:26 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 19 Feb 2014 14:28:26 -0500
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
Message-ID: <530505DA.2070305@oracle.com>


This change looks good.  Do you need a sponsor?
thanks,
Coleen

On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>
> Hi all,
>
> could I have a review for the following change?
>
> It adds code to iterate over the slots of the expression stack.
>
> Found object references are used as heap roots, the same way as object 
> references in locals slots.
>
> Bug:
>
> https://bugs.openjdk.java.net/browse/JDK-8034867
>
> Webrev:
>
> http://sapjvm.com/as/webrevs/8034867
>
> Thanks,
>
> Axel
>


From serguei.spitsyn at oracle.com  Wed Feb 19 21:30:20 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 19 Feb 2014 21:30:20 -0800
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <530505DA.2070305@oracle.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
	<530505DA.2070305@oracle.com>
Message-ID: <530592EC.20207@oracle.com>

It looks good to me too.
But does Axel need to sign an OCA?

This is a link to follow:
http://openjdk.java.net/contribute/


Thanks,
Serguei


On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>
> This change looks good.  Do you need a sponsor?
> thanks,
> Coleen
>
> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>>
>> Hi all,
>>
>> could I have a review for the following change?
>>
>> It adds code to iterate over the slots of the expression stack.
>>
>> Found object references are used as heap roots, the same way as 
>> object references in locals slots.
>>
>> Bug:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8034867
>>
>> Webrev:
>>
>> http://sapjvm.com/as/webrevs/8034867
>>
>> Thanks,
>>
>> Axel
>>
>


From staffan.larsen at oracle.com  Thu Feb 20 00:05:21 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 20 Feb 2014 09:05:21 +0100
Subject: Codereview request: 8035195 demo/jvmti/mtrace/TraceJFrame.java
	can't connect to X11
In-Reply-To: <530468A9.7080509@oracle.com>
References: <530468A9.7080509@oracle.com>
Message-ID: <2B50B541-4A32-4D76-A514-95B34E29C73D@oracle.com>

Looks like an OK change, although it does not fix the issue with the test failures that we see. For that, someone in SQE needs to look at the environment and make sure the test is run correctly.

I suggest you leave the current bug open and create a new one for this test output change.

/Staffan

On 19 feb 2014, at 09:17, shanliang <shanliang.jiang at oracle.com> wrote:

> Hi,
> 
> The failure was caused by a bad DISPLAY setting, which must be unintended, so we re-throw the exception with a message asking the user to make sure that DISPLAY is correct.
> 
> The fix only adds a catch of InternalError at the beginning; the rest of the code is not modified, only re-formatted to remove some spaces.
> 
> webrev:
> http://cr.openjdk.java.net/~sjiang/JDK-8035195/00/
> 
> bug:
> https://bugs.openjdk.java.net/browse/JDK-8035195
> 
> 
> Thanks,
> Shanliang


From volker.simonis at gmail.com  Thu Feb 20 00:21:25 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 20 Feb 2014 09:21:25 +0100
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <530592EC.20207@oracle.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
	<530505DA.2070305@oracle.com> <530592EC.20207@oracle.com>
Message-ID: <CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>

Axel is covered by the SAP OCA.

Thanks,
Volker


On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
<serguei.spitsyn at oracle.com> wrote:
> It looks good to me too.
> But does Axel need to sign an OCA ?
>
> This is a link to follow:
>   http://openjdk.java.net/contribute/
>
>
> Thanks,
> Serguei
>
>
>
> On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>
>
> This change looks good.  Do you need a sponsor?
> thanks,
> Coleen
>
> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>
> Hi all,
>
> could I have a review for the following change?
>
>
>
> It adds code to iterate over the slots of the expression stack.
>
> Found object references are used as heap roots, the same way as object
> references in locals slots.
>
>
>
> Bug:
>
> https://bugs.openjdk.java.net/browse/JDK-8034867
>
>
>
> Webrev:
>
> http://sapjvm.com/as/webrevs/8034867
>
>
>
> Thanks,
>
> Axel
>
>
>

From serguei.spitsyn at oracle.com  Thu Feb 20 00:44:13 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 20 Feb 2014 00:44:13 -0800
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>	<530505DA.2070305@oracle.com>	<530592EC.20207@oracle.com>
	<CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
Message-ID: <5305C05D.6070708@oracle.com>

Ok

Thanks!
Serguei

On 2/20/14 12:21 AM, Volker Simonis wrote:
> Axel is covered by the SAP OCA.
>
> Thanks,
> Volker
>
>
> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
> <serguei.spitsyn at oracle.com> wrote:
>> It looks good to me too.
>> But does Axel need to sign an OCA ?
>>
>> This is a link to follow:
>>    http://openjdk.java.net/contribute/
>>
>>
>> Thanks,
>> Serguei
>>
>>
>>
>> On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>>
>>
>> This change looks good.  Do you need a sponsor?
>> thanks,
>> Coleen
>>
>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>>
>> Hi all,
>>
>> could I have a review for the following change?
>>
>>
>>
>> It adds code to iterate over the slots of the expression stack.
>>
>> Found object references are used as heap roots, the same way as object
>> references in locals slots.
>>
>>
>>
>> Bug:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8034867
>>
>>
>>
>> Webrev:
>>
>> http://sapjvm.com/as/webrevs/8034867
>>
>>
>>
>> Thanks,
>>
>> Axel
>>
>>
>>


From jaroslav.bachorik at oracle.com  Thu Feb 20 00:47:10 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 20 Feb 2014 09:47:10 +0100
Subject: Codereview request: 8035195 demo/jvmti/mtrace/TraceJFrame.java
	can't connect to X11
In-Reply-To: <2B50B541-4A32-4D76-A514-95B34E29C73D@oracle.com>
References: <530468A9.7080509@oracle.com>
	<2B50B541-4A32-4D76-A514-95B34E29C73D@oracle.com>
Message-ID: <5305C10E.4060704@oracle.com>

On 20.2.2014 09:05, Staffan Larsen wrote:
> Looks like an OK change, although it does not fix the issue with the test failures that we see. For that, someone in SQE needs to look at the environment and make sure the test is run correctly.
>
> I suggest you leave the current bug open and create a new one for this test output change.
+1

-JB-

>
> /Staffan
>
> On 19 feb 2014, at 09:17, shanliang <shanliang.jiang at oracle.com> wrote:
>
>> Hi,
>>
>> The failure was caused by a bad DISPLAY setting, which must be unintended, so we re-throw the exception with a message asking the user to make sure that DISPLAY is correct.
>>
>> The fix only adds a catch of InternalError at the beginning; the rest of the code is not modified, only re-formatted to remove some spaces.
>>
>> webrev:
>> http://cr.openjdk.java.net/~sjiang/JDK-8035195/00/
>>
>> bug:
>> https://bugs.openjdk.java.net/browse/JDK-8035195
>>
>>
>> Thanks,
>> Shanliang
>


From jaroslav.bachorik at oracle.com  Thu Feb 20 00:49:06 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 20 Feb 2014 09:49:06 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5304E373.1050003@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
	<52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com>
	<5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com>
	<5304D4CC.1020007@oracle.com> <5304E373.1050003@oracle.com>
Message-ID: <5305C182.6020802@oracle.com>

On 19.2.2014 18:01, Daniel D. Daugherty wrote:
>  > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
>
> src/share/vm/prims/jvmtiExport.hpp
>      No comments.
>
> src/share/vm/prims/jvmtiExport.cpp
>      No comments.
>
> src/share/vm/oops/instanceKlass.cpp
>      No comments.
>
> src/share/vm/prims/jvm.cpp
>      No comments.
>
> src/share/vm/runtime/reflection.cpp
>      lines 948, 1085: HotSpot indent is two spaces
>
> Thumbs up.
>

Thanks Dan!

Could I have a second HS reviewer to take a look at this, please?

-JB-

> Dan
>
>
> On 2/19/14 8:59 AM, Jaroslav Bachorik wrote:
>> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
>>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Jaroslav,
>>>>>>>
>>>>>>> It looks good in general modulo indent comments from Dan.
>>>>>>>
>>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is
>>>>>>> needed
>>>>>>> or right thing to do in the JvmtiExport::clear_detected_exception().
>>>>>>> It seems, both clear_exception_detected() and
>>>>>>> set_exception_detected() are always
>>>>>>> called on current thread and so, it has to be safe to do without
>>>>>>> acquiring any locks.
>>>>>>
>>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>>>> can also be queried/modified by other threads so grabbing the
>>>>>> associated lock is a good idea.
>>>>>
>>>>> The lock synchronization is cooperative.
>>>>> It does not help much if the lock is not acquired in other places.
>>>>> I can be wrong, but I've not found yet any place in the code where the
>>>>> clear_exception_detected() and set_exception_detected() are called
>>>>> under protection of the JvmtiThreadState_lock.
>>>>
>>>> I copied the locking over from
>>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also
>>>> supposed to work only with the current thread but acquires the lock
>>>> nonetheless. But if you are sure that the lock is not required I have
>>>> no objections removing it.
>>>
>>> I'm suggesting to remove it, as it is not used in other places in the
>>> code.
>>> It is going to be confusing if it is used in one place and missed in
>>> others.
>>
>> I've removed the lock and applied the same cleanup logic to other
>> places where exceptions are rewrapped.
>>
>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
>> JPRT run:
>> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/
>>
>> Aurora Adhoc:
>> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full
>> (still running at the moment; no failures so far)
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> And I'm repeating my question about pre-integration testing (Dan is
>>>>>>> asking about the same).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>>>> This is a round-0 review request.
>>>>>>>>
>>>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>>>> case, propagates to JDI).
>>>>>>>>
>>>>>>>> The reflection code lacks the traditional error handler -
>>>>>>>> therefore,
>>>>>>>> upon throwing the NumberFormatException, the stack is searched for
>>>>>>>> appropriate handlers and none are found. This leaves the
>>>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>>>> reset to false once the exception is handled. The reflection code
>>>>>>>> then goes on and wraps the NumberFormatException into
>>>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>>>> exception event will be sent out.
>>>>>>>>
>>>>>>>> The proposed solution is to call
>>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>>>> appropriate places in the reflection code to reset the
>>>>>>>> "exception_detected" flag and enable the InvocationTargetException
>>>>>>>> be properly reported over JVMTI.
>>>>>>>>
>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> -JB-
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>


From axel.siebenborn at sap.com  Thu Feb 20 01:05:32 2014
From: axel.siebenborn at sap.com (Siebenborn, Axel)
Date: Thu, 20 Feb 2014 09:05:32 +0000
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
	<530505DA.2070305@oracle.com>	<530592EC.20207@oracle.com>
	<CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
Message-ID: <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap>

Yes, I need a sponsor for this change.
Thanks for the reviews,
Axel


> -----Original Message-----
> From: Volker Simonis [mailto:volker.simonis at gmail.com]
> Sent: Donnerstag, 20. Februar 2014 09:21
> To: serguei.spitsyn at oracle.com
> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net;
> serviceability-dev at openjdk.java.net; Siebenborn, Axel
> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression stack
> are not visible as roots via jvmti FollowReferences and HeapDump
> 
> Axel is covered by the SAP OCA.
> 
> Thanks,
> Volker
> 
> 
> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
> <serguei.spitsyn at oracle.com> wrote:
> > It looks good to me too.
> > But does Axel need to sign an OCA ?
> >
> > This is a link to follow:
> >   http://openjdk.java.net/contribute/
> >
> >
> > Thanks,
> > Serguei
> >
> >
> >
> > On 2/19/14 11:28 AM, Coleen Phillimore wrote:
> >
> >
> > This change looks good.  Do you need a sponsor?
> > thanks,
> > Coleen
> >
> > On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
> >
> > Hi all,
> >
> > could I have a review for the following change?
> >
> >
> >
> > It adds code to iterate over the slots of the expression stack.
> >
> > Found object references are used as heap roots, the same way as object
> > references in locals slots.
> >
> >
> >
> > Bug:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8034867
> >
> >
> >
> > Webrev:
> >
> > http://sapjvm.com/as/webrevs/8034867
> >
> >
> >
> > Thanks,
> >
> > Axel
> >
> >
> >

From serguei.spitsyn at oracle.com  Thu Feb 20 01:28:36 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 20 Feb 2014 01:28:36 -0800
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5305C182.6020802@oracle.com>
References: <52FDF8DD.9090602@oracle.com> <52FE6486.8000603@oracle.com>
	<52FE7D88.8000701@oracle.com> <52FE951E.1060002@oracle.com>
	<5301C2AB.2070600@oracle.com> <53033378.2060300@oracle.com>
	<5304D4CC.1020007@oracle.com> <5304E373.1050003@oracle.com>
	<5305C182.6020802@oracle.com>
Message-ID: <5305CAC4.2050308@oracle.com>

On 2/20/14 12:49 AM, Jaroslav Bachorik wrote:
> On 19.2.2014 18:01, Daniel D. Daugherty wrote:
>>  > Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
>>
>> src/share/vm/prims/jvmtiExport.hpp
>>      No comments.
>>
>> src/share/vm/prims/jvmtiExport.cpp
>>      No comments.
>>
>> src/share/vm/oops/instanceKlass.cpp
>>      No comments.
>>
>> src/share/vm/prims/jvm.cpp
>>      No comments.
>>
>> src/share/vm/runtime/reflection.cpp
>>      lines 948, 1085: HotSpot indent is two spaces
>>
>> Thumbs up.
>>
>
> Thanks Dan!
>
> Could I have a second HS reviewer to take a look at this, please?

You need just one review from someone with Reviewer status and at least 
two reviews in total.
You already have reviews from Dan and me, which is enough.
But it still might be worth waiting for David to review this version, as 
he already reviewed the previous one.


Thanks,
Serguei

>
> -JB-
>
>> Dan
>>
>>
>> On 2/19/14 8:59 AM, Jaroslav Bachorik wrote:
>>> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
>>>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>>>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Jaroslav,
>>>>>>>>
>>>>>>>> It looks good in general modulo indent comments from Dan.
>>>>>>>>
>>>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is
>>>>>>>> needed
>>>>>>>> or right thing to do in the 
>>>>>>>> JvmtiExport::clear_detected_exception().
>>>>>>>> It seems, both clear_exception_detected() and
>>>>>>>> set_exception_detected() are always
>>>>>>>> called on current thread and so, it has to be safe to do without
>>>>>>>> acquiring any locks.
>>>>>>>
>>>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>>>>> can also be queried/modified by other threads so grabbing the
>>>>>>> associated lock is a good idea.
>>>>>>
>>>>>> The lock synchronization is cooperative.
>>>>>> It does not help much if the lock is not acquired in other places.
>>>>>> I can be wrong, but I've not found yet any place in the code 
>>>>>> where the
>>>>>> clear_exception_detected() and set_exception_detected() are called
>>>>>> under protection of the JvmtiThreadState_lock.
>>>>>
>>>>> I copied the locking over from
>>>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is 
>>>>> also
>>>>> supposed to work only with the current thread but acquires the lock
>>>>> nonetheless. But if you are sure that the lock is not required I have
>>>>> no objections removing it.
>>>>
>>>> I'm suggesting to remove it, as it is not used in other places in the
>>>> code.
>>>> It is going to be confusing if it is used in one place and missed in
>>>> others.
>>>
>>> I've removed the lock and applied the same cleanup logic to other
>>> places where exceptions are rewrapped.
>>>
>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
>>> JPRT run:
>>> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/ 
>>>
>>>
>>> Aurora Adhoc:
>>> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full 
>>>
>>> (still running at the moment; no failures so far)
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> -JB-
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> And I'm repeating my question about pre-integration testing 
>>>>>>>> (Dan is
>>>>>>>> asking about the same).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>>>>> This is a round-0 review request.
>>>>>>>>>
>>>>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>>>>> case, propagates to JDI).
>>>>>>>>>
>>>>>>>>> The reflection code lacks the traditional error handler -
>>>>>>>>> therefore,
>>>>>>>>> upon throwing the NumberFormatException, the stack is searched 
>>>>>>>>> for
>>>>>>>>> appropriate handlers and none are found. This leaves the
>>>>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>>>>> reset to false once the exception is handled. The reflection code
>>>>>>>>> then goes on and wraps the NumberFormatException into
>>>>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>>>>> exception event will be sent out.
>>>>>>>>>
>>>>>>>>> The proposed solution is to call
>>>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>>>>> appropriate places in the reflection code to reset the
>>>>>>>>> "exception_detected" flag and enable the 
>>>>>>>>> InvocationTargetException
>>>>>>>>> be properly reported over JVMTI.
>>>>>>>>>
>>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> -JB-
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>


From david.holmes at oracle.com  Thu Feb 20 02:40:48 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 20 Feb 2014 20:40:48 +1000
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5304D4CC.1020007@oracle.com>
References: <52FDF8DD.9090602@oracle.com>
	<52FE6486.8000603@oracle.com>	<52FE7D88.8000701@oracle.com>
	<52FE951E.1060002@oracle.com>	<5301C2AB.2070600@oracle.com>
	<53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com>
Message-ID: <5305DBB0.3080809@oracle.com>

Hi Jaroslav,

instanceKlass.cpp:

Comment is wrong:

913     // JVMTI internal flag reset is needed in order to report InvocationTargetException

It will be ExceptionInInitializerError

You added this:

  917       this_oop->set_initialization_state_and_notify(initialization_error, THREAD);
  918       CLEAR_PENDING_EXCEPTION;   // ignore any exception thrown, class initialization error is thrown below
+ 919       // JVMTI has already reported the pending exception
+ 920       // JVMTI internal flag reset is needed in order to report InvocationTargetException
+ 921       JvmtiExport::clear_detected_exception((JavaThread*)THREAD);

but there are a number of places where 
set_initialization_state_and_notify is called when a pending exception 
has been cleared, and then CLEAR_PENDING_EXCEPTION is called again, but 
you didn't modify those other locations. They will rethrow the original 
exception so I suppose that is okay from JVMTI's perspective. But the 
flip-side of this is that if set_initialization_state_and_notify does 
throw an exception, JVMTI will never see it.

---

jvm.cpp

Comment is wrong again - not InvocationTargetException.

---

David
------


On 20/02/2014 1:59 AM, Jaroslav Bachorik wrote:
> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Jaroslav,
>>>>>>
>>>>>> It looks good in general modulo indent comments from Dan.
>>>>>>
>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is needed
>>>>>> or right thing to do in the JvmtiExport::clear_detected_exception().
>>>>>> It seems, both clear_exception_detected() and
>>>>>> set_exception_detected() are always
>>>>>> called on current thread and so, it has to be safe to do without
>>>>>> acquiring any locks.
>>>>>
>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>>> can also be queried/modified by other threads so grabbing the
>>>>> associated lock is a good idea.
>>>>
>>>> The lock synchronization is cooperative.
>>>> It does not help much if the lock is not acquired in other places.
>>>> I can be wrong, but I've not found yet any place in the code where the
>>>> clear_exception_detected() and set_exception_detected() are called
>>>> under protection of the JvmtiThreadState_lock.
>>>
>>> I copied the locking over from
>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also
>>> supposed to work only with the current thread but acquires the lock
>>> nonetheless. But if you are sure that the lock is not required I have
>>> no objections removing it.
>>
>> I'm suggesting to remove it, as it is not used in other places in the
>> code.
>> It is going to be confusing if it is used in one place and missed in
>> others.
>
> I've removed the lock and applied the same cleanup logic to other places
> where exceptions are rewrapped.
>
> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
> JPRT run:
> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/
>
> Aurora Adhoc:
> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full
> (still running at the moment; no failures so far)
>
>>
>> Thanks,
>> Serguei
>>
>>>
>>> -JB-
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> And I'm repeating my question about pre-integration testing (Dan is
>>>>>> asking about the same).
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>>> This is a round-0 review request.
>>>>>>>
>>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>>> case, propagates to JDI).
>>>>>>>
>>>>>>> The reflection code lacks the traditional error handler - therefore,
>>>>>>> upon throwing the NumberFormatException, the stack is searched for
>>>>>>> appropriate handlers and none are found. This leaves the
>>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>>> reset to false once the exception is handled. The reflection code
>>>>>>> then goes on and wraps the NumberFormatException into
>>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>>> exception event will be sent out.
>>>>>>>
>>>>>>> The proposed solution is to call
>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>>> appropriate places in the reflection code to reset the
>>>>>>> "exception_detected" flag and enable the InvocationTargetException
>>>>>>> be properly reported over JVMTI.
>>>>>>>
>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> -JB-
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>

From staffan.larsen at oracle.com  Thu Feb 20 03:22:31 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 20 Feb 2014 12:22:31 +0100
Subject: JDK-8022203 - Intermittent test failures in demo/jvmti/hprof
In-Reply-To: <5305DBF0.5070902@oracle.com>
References: <5302016A.1020906@oracle.com>
	<6A4175A3-DAEC-42D2-99E6-42960B6122E1@oracle.com>
	<5305AC54.2010102@oracle.com>
	<27761889-3A44-4A4B-932C-91D25EEE94B5@oracle.com>
	<5305BA1F.4050700@oracle.com>
	<C5B70E06-7822-4C2A-8377-D51B2D2465AF@oracle.com>
	<5305DBF0.5070902@oracle.com>
Message-ID: <AFB7E48F-FC39-4CAC-BEF7-51418EA6422B@oracle.com>

Looks good!

Thanks,
/Staffan

On 20 feb 2014, at 11:41, Sergei Kovalev <sergei.kovalev at oracle.com> wrote:

> 
> Hello everyone,
> 
> I'm working for https://bugs.openjdk.java.net/browse/JDK-8022203.
> 
> As it was discussed, I prepared fix for the issue:http://cr.openjdk.java.net/~anazarov/8022203/webrev.02/
> 
> The failure caused by slippage in IO operation due to file system performance. To avoid this situation we will use an unique file name for profiling dump. In this case we will have zero probability that tests would overlapped by output file. By default the file name for dump output builds from library name plus .java. prefix and .txt suffix. Hprof tool supports "file" option that used for providing the output file name. Therefore we can just add unique file name to options string for each individual tests that used hprof.
> 
> Testing done:
> - single run of tests
> - Stability run of all modified tests
> - Modified stability run. I did some test modification that helped me to increase reproducibility rating. It was 500-1000 iteration instead 10,000 in original case. Test with modified options string ran 10,000 iteration.
> 
> Could you please approve the fix?
> 
> -- 
> With best regards,
> Sergei
> 


From jaroslav.bachorik at oracle.com  Thu Feb 20 05:41:37 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 20 Feb 2014 14:41:37 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5305DBB0.3080809@oracle.com>
References: <52FDF8DD.9090602@oracle.com>
	<52FE6486.8000603@oracle.com>	<52FE7D88.8000701@oracle.com>
	<52FE951E.1060002@oracle.com>	<5301C2AB.2070600@oracle.com>
	<53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com>
	<5305DBB0.3080809@oracle.com>
Message-ID: <53060611.3040209@oracle.com>

On 20.2.2014 11:40, David Holmes wrote:
> Hi Jaroslav,
>
> instanceKlass.cpp:
>
> Comment is wrong:
>
> 913     // JVMTI internal flag reset is needed in order to report
> InvocationTargetException
>
> It will be ExceptionInInitializerError

Will fix. Copypaste ...

>
> You added this:
>
>   917
> this_oop->set_initialization_state_and_notify(initialization_error,
> THREAD);
>    918       CLEAR_PENDING_EXCEPTION;   // ignore any exception thrown,
> class initialization error is thrown below
> + 919       // JVMTI has already reported the pending exception
> + 920       // JVMTI internal flag reset is needed in order to report
> InvocationTargetException
> + 921       JvmtiExport::clear_detected_exception((JavaThread*)THREAD);
>
> but there are a number of places where
> set_initialization_state_and_notify is called when a pending exception
> has been cleared, and then CLEAR_PENDING_EXCEPTION is called again, but
> you didn't modify those other locations. They will rethrow the original
> exception so I suppose that is okay from JVMTI's perspective. But the
> flip-side of this is that if set_initialization_state_and_notify does
> throw an exception, JVMTI will never see it.

I don't know if it is supposed to see it. It seems that any exception 
thrown from set_initialization_state_and_notify is thoroughly ignored 
and hidden from the outer world. Perhaps someone more experienced in 
JVMTI than me would like to chime in here? Serguei?

-JB-

>
> ---
>
> jvm.cpp
>
> Comment is wrong again - not InvocationTargetException.
>
> ---
>
> David
> ------
>
>
>
>
> On 20/02/2014 1:59 AM, Jaroslav Bachorik wrote:
>> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
>>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Jaroslav,
>>>>>>>
>>>>>>> It looks good in general modulo indent comments from Dan.
>>>>>>>
>>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is
>>>>>>> needed
>>>>>>> or right thing to do in the JvmtiExport::clear_detected_exception().
>>>>>>> It seems, both clear_exception_detected() and
>>>>>>> set_exception_detected() are always
>>>>>>> called on current thread and so, it has to be safe to do without
>>>>>>> acquiring any locks.
>>>>>>
>>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>>>> can also be queried/modified by other threads so grabbing the
>>>>>> associated lock is a good idea.
>>>>>
>>>>> The lock synchronization is cooperative.
>>>>> It does not help much if the lock is not acquired in other places.
>>>>> I can be wrong, but I've not found yet any place in the code where the
>>>>> clear_exception_detected() and set_exception_detected() are called
>>>>> under protection of the JvmtiThreadState_lock.
>>>>
>>>> I copied the locking over from
>>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also
>>>> supposed to work only with the current thread but acquires the lock
>>>> nonetheless. But if you are sure that the lock is not required I have
>>>> no objections removing it.
>>>
>>> I'm suggesting to remove it, as it is not used in other places in the
>>> code.
>>> It is going to be confusing if it is used in one place and missed in
>>> others.
>>
>> I've removed the lock and applied the same cleanup logic to other places
>> where exceptions are rewrapped.
>>
>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
>> JPRT run:
>> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/
>>
>>
>> Aurora Adhoc:
>> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full
>>
>> (still running at the moment; no failures so far)
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> And I'm repeating my question about pre-integration testing (Dan is
>>>>>>> asking about the same).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>>>> This is a round-0 review request.
>>>>>>>>
>>>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>>>> case, propagates to JDI).
>>>>>>>>
>>>>>>>> The reflection code lacks the traditional error handler -
>>>>>>>> therefore,
>>>>>>>> upon throwing the NumberFormatException, the stack is searched for
>>>>>>>> appropriate handlers and none are found. This leaves the
>>>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>>>> reset to false once the exception is handled. The reflection code
>>>>>>>> then goes on and wraps the NumberFormatException into
>>>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>>>> exception event will be sent out.
>>>>>>>>
>>>>>>>> The proposed solution is to call
>>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>>>> appropriate places in the reflection code to reset the
>>>>>>>> "exception_detected" flag and enable the InvocationTargetException
>>>>>>>> be properly reported over JVMTI.
>>>>>>>>
>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> -JB-
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>


From jaroslav.bachorik at oracle.com  Thu Feb 20 06:16:55 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 20 Feb 2014 15:16:55 +0100
Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong
	object
In-Reply-To: <CA+kOe08BRfbu63CAzAZ4p7ZpYfmHGBBSQVvFaMVg6-2+xeuV6A@mail.gmail.com>
References: <530388CC.9080807@oracle.com>	<CA+kOe09mG0g_5d=pa51Dgj-kLRfkf1J-HSmT9P9zctqA1ufr5w@mail.gmail.com>	<53045D76.5080907@oracle.com>
	<CA+kOe08BRfbu63CAzAZ4p7ZpYfmHGBBSQVvFaMVg6-2+xeuV6A@mail.gmail.com>
Message-ID: <53060E57.6010300@oracle.com>

This was discussed when reviewing 
ThreadMXBean/SynchronizationStatistics.java

Regarding a busy wait that checks the thread states and issues 
Thread.yield() now and then, David Holmes wrote:
"Not elegant and not completely reliable either. Probably adequate on a
multi-core system but single-core and with some schedulers it could 
just be a busy spin." [1]

As far as I understand, the only benefit of using Thread.yield() instead 
of Thread.sleep() would be 10-100ms shorter execution time, right?

-JB-

[1] http://mail.openjdk.java.net/pipermail/jmx-dev/2013-October/000484.html

On 19.2.2014 17:56, Martin Buchholz wrote:
> The jsr166 tck tests make systematic use of Thread.yield, e.g.
>
>
>      /**
>       * Spin-waits up to the specified number of milliseconds for the given
>       * thread to enter a wait state: BLOCKED, WAITING, or TIMED_WAITING.
>       */
>      void waitForThreadToEnterWaitState(Thread thread, long timeoutMillis) {
>          long startTime = System.nanoTime();
>          for (;;) {
>              Thread.State s = thread.getState();
>              if (s == Thread.State.BLOCKED ||
>                  s == Thread.State.WAITING ||
>                  s == Thread.State.TIMED_WAITING)
>                  return;
>              else if (s == Thread.State.TERMINATED)
>                  fail("Unexpected thread termination");
>              else if (millisElapsedSince(startTime) > timeoutMillis) {
>                  threadAssertTrue(thread.isAlive());
>                  return;
>              }
>              Thread.yield();
>          }
>      }
>
>
>
> On Tue, Feb 18, 2014 at 11:29 PM, Jaroslav Bachorik <
> jaroslav.bachorik at oracle.com> wrote:
>
>> On 18.2.2014 18:06, Martin Buchholz wrote:
>>
>>> Not checking any details, but tests that want to wait for a particular
>>> thread state are a good reason to use
>>>
>>> volatile boolean flag;
>>> ...
>>> while (!flag) Thread.yield();
>>>
>>> I prefer calling Thread.yield to sleeping in this special case, in part
>>> because I don't want to rely on the implementation of sleep, while yield
>>> is
>>> semantically a no-op.  (Also sleeping 100ms is a long time for a computer)
>>>
>>
>> There were discussions for a similar fix regarding Thread.yield(). The
>> concern was that using Thread.yield() in a tight loop might very easily
>> lead to starvation on single core machines. Therefore Thread.sleep(10) is
>> used to be sure the flag setting thread has actually a chance to progress.
>>
>> -JB-
>>
>>
>>
>>>
>>>
>>> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik <
>>> jaroslav.bachorik at oracle.com> wrote:
>>>
>>>   Please, review the following test change.
>>>>
>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168
>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00
>>>>
>>>> The test fails because of falsely evaluating the thread being parked as
>>>> actually waiting on a monitor. This is because there is no difference in
>>>> java thread state for those two situations. The test is using Phaser for
>>>> synchronization between the checked and checking thread to make sure an
>>>> appropriate code section is entered before performing asserts. Then it
>>>> checks the checked thread state and waits till it becomes WAITING.
>>>> Unfortunately, when Phaser needs to wait it parks the thread and sets the
>>>> thread state to WAITING. From now on the test is in a completely random
>>>> state and the result will largely depend on timing - thus failing
>>>> intermittently.
>>>>
>>>> The solution is to use an additional volatile variable to prevent falsely
>>>> indicating the park() induced WAITING state.
>>>>
>>>> Thanks,
>>>>
>>>> -JB-
>>>>
>>>>
>>>
>>
>


From jaroslav.bachorik at oracle.com  Thu Feb 20 06:21:29 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 20 Feb 2014 15:21:29 +0100
Subject: RFR 8035395: sun/management/jmxremote/startstop/JMXStartStopTest.java
	fails intermittently: Port already in use
Message-ID: <53060F69.7050704@oracle.com>

Please, review this test fix.

Issue : https://bugs.openjdk.java.net/browse/JDK-8035395
Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00

Currently, the test uses two fixed ports to start the JMX connector and 
the RMI registry when necessary. It cannot deal with situations where the 
ports are not available. The patch adds the ability to obtain ports 
from the ephemeral range and use them instead of the hardcoded ones. It 
also tries to minimize the chance of another process stealing the ports 
by holding the corresponding ServerSockets open until right before the 
port is actually needed.
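
The idea can be sketched roughly as follows. This is only an illustration, 
not the code from the webrev: the PortHolder class and its methods are 
hypothetical names. It relies on java.net.ServerSocket picking a free port 
when asked to bind to port 0.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper for illustration; not taken from the webrev.
class PortHolder implements AutoCloseable {
    private final ServerSocket socket;

    PortHolder() throws IOException {
        // Port 0 asks the OS to choose a currently free port.
        socket = new ServerSocket(0);
    }

    // The port number that is being reserved while the socket stays open.
    int port() {
        return socket.getLocalPort();
    }

    // Release the port; call this right before the real user binds to it.
    @Override
    public void close() throws IOException {
        socket.close();
    }

    public static void main(String[] args) throws IOException {
        try (PortHolder jmx = new PortHolder(); PortHolder rmi = new PortHolder()) {
            System.out.println("jmx port " + jmx.port() + ", rmi port " + rmi.port());
        }
    }
}
```

A short window still remains between close() and the moment the JMX 
connector binds the port, so this reduces the chance of a clash rather 
than eliminating it.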

Thanks,

-JB-

From sergei.kovalev at oracle.com  Thu Feb 20 02:41:52 2014
From: sergei.kovalev at oracle.com (Sergei Kovalev)
Date: Thu, 20 Feb 2014 14:41:52 +0400
Subject: JDK-8022203 - Intermittent test failures in demo/jvmti/hprof
In-Reply-To: <C5B70E06-7822-4C2A-8377-D51B2D2465AF@oracle.com>
References: <5302016A.1020906@oracle.com>
	<6A4175A3-DAEC-42D2-99E6-42960B6122E1@oracle.com>
	<5305AC54.2010102@oracle.com>
	<27761889-3A44-4A4B-932C-91D25EEE94B5@oracle.com>
	<5305BA1F.4050700@oracle.com>
	<C5B70E06-7822-4C2A-8377-D51B2D2465AF@oracle.com>
Message-ID: <5305DBF0.5070902@oracle.com>


Hello everyone,

I'm working on https://bugs.openjdk.java.net/browse/JDK-8022203.

As discussed, I have prepared a fix for the issue: http://cr.openjdk.java.net/~anazarov/8022203/webrev.02/

The failure is caused by slippage in IO operations due to file system performance. To avoid this, we will use a unique file name for each profiling dump, so that tests cannot collide on the output file. By default, the dump file name is built from the library name with a "java." prefix and a ".txt" suffix. The hprof tool supports a "file" option that sets the output file name, so we can simply add a unique file name to the options string of each individual test that uses hprof.
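
A minimal sketch of building such an options string, assuming the agent is loaded with -agentlib:hprof and using a made-up naming scheme (the HprofOptions class, the withUniqueFile method and the "hprof.<test>.<uuid>.txt" pattern are illustrative only and are not taken from the webrev):

```java
import java.util.UUID;

// Hypothetical helper for illustration; not taken from the webrev.
class HprofOptions {
    // Appends a per-test "file=" option so that no two runs share an output file.
    static String withUniqueFile(String baseOptions, String testName) {
        String file = "hprof." + testName + "." + UUID.randomUUID() + ".txt";
        String separator = baseOptions.isEmpty() ? "" : ",";
        return "-agentlib:hprof=" + baseOptions + separator + "file=" + file;
    }

    public static void main(String[] args) {
        System.out.println(withUniqueFile("heap=dump", "HeapDump"));
    }
}
```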

Testing done:
- single run of tests
- Stability run of all modified tests
- Modified stability run. I modified the test to make the failure easier to reproduce: it then showed up within 500-1000 iterations instead of 10,000 in the original case. The test with the modified options string ran for 10,000 iterations.

Could you please approve the fix?

-- 
With best regards,
Sergei


From martinrb at google.com  Thu Feb 20 09:04:25 2014
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 20 Feb 2014 09:04:25 -0800
Subject: RFR 8034168: ThreadMXBean/Locks.java failed,
	blocked on wrong object
In-Reply-To: <53060E57.6010300@oracle.com>
References: <530388CC.9080807@oracle.com>
	<CA+kOe09mG0g_5d=pa51Dgj-kLRfkf1J-HSmT9P9zctqA1ufr5w@mail.gmail.com>
	<53045D76.5080907@oracle.com>
	<CA+kOe08BRfbu63CAzAZ4p7ZpYfmHGBBSQVvFaMVg6-2+xeuV6A@mail.gmail.com>
	<53060E57.6010300@oracle.com>
Message-ID: <CA+kOe0-TziqPS6PUpDVodeJiVF+Ms_JrRrQ2nFvqFindLikhVw@mail.gmail.com>

I think David is too pessimistic about Thread.yield being ineffective on
Java SE implementations (OTOH David is a Java Embedded expert).  In
practice an implementation that never thread switched out of a yield() loop
would not pass the tck.  As for theory: it's true that Thread.yield has no
progress guarantees, but then neither does Thread.sleep.  A perverse
implementation can always starve any thread it feels like.

Anyways, there is nothing wrong with your sleep loop.  Except maybe you
want to time out eventually.
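
For illustration, a sleep loop with a timeout could look roughly like
this.  It is only a sketch: the Waiter class, the waitFor method and the
10ms poll interval are assumptions, not the code under review.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not the code under review.
class Waiter {
    static volatile boolean flag;

    // Polls the condition, sleeping between checks.
    // Returns false if the condition is still unmet after timeoutMillis.
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            Thread.sleep(10);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread setter = new Thread(() -> flag = true);
        setter.start();
        System.out.println(waitFor(() -> flag, 5_000) ? "flag set" : "timed out");
        setter.join();
    }
}
```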


On Thu, Feb 20, 2014 at 6:16 AM, Jaroslav Bachorik <
jaroslav.bachorik at oracle.com> wrote:

> This was discussed when reviewing ThreadMXBean/
> SynchronizationStatistics.java
>
> Regarding a busy wait checking the thread states and issuing
> Thread.yield() now and then David Holmes wrote:
> "Not elegant and not completely reliable either. Probably adequate on a
> multi-core system but single-core and with some schedulers it could just
> be a busy spin." [1]
>
> As far as I understand, the only benefit of using Thread.yield() instead
> of Thread.sleep() would be 10-100ms shorter execution time, right?
>
> -JB-
>
> [1] http://mail.openjdk.java.net/pipermail/jmx-dev/2013-
> October/000484.html
>
>
> On 19.2.2014 17:56, Martin Buchholz wrote:
>
>> The jsr166 tck tests make systematic use of Thread.yield, e.g.
>>
>>
>>      /**
>>       * Spin-waits up to the specified number of milliseconds for the
>> given
>>       * thread to enter a wait state: BLOCKED, WAITING, or TIMED_WAITING.
>>       */
>>      void waitForThreadToEnterWaitState(Thread thread, long
>> timeoutMillis) {
>>          long startTime = System.nanoTime();
>>          for (;;) {
>>              Thread.State s = thread.getState();
>>              if (s == Thread.State.BLOCKED ||
>>                  s == Thread.State.WAITING ||
>>                  s == Thread.State.TIMED_WAITING)
>>                  return;
>>              else if (s == Thread.State.TERMINATED)
>>                  fail("Unexpected thread termination");
>>              else if (millisElapsedSince(startTime) > timeoutMillis) {
>>                  threadAssertTrue(thread.isAlive());
>>                  return;
>>              }
>>              Thread.yield();
>>          }
>>      }
>>
>>
>>
>> On Tue, Feb 18, 2014 at 11:29 PM, Jaroslav Bachorik <
>> jaroslav.bachorik at oracle.com> wrote:
>>
>>  On 18.2.2014 18:06, Martin Buchholz wrote:
>>>
>>>  Not checking any details, but tests that want to wait for a particular
>>>> thread state are a good reason to use
>>>>
>>>> volatile boolean flag;
>>>> ...
>>>> while (!flag) Thread.yield();
>>>>
>>>> I prefer calling Thread.yield to sleeping in this special case, in part
>>>> because I don't want to rely on the implementation of sleep, while yield
>>>> is
>>>> semantically a no-op.  (Also sleeping 100ms is a long time for a
>>>> computer)
>>>>
>>>>
>>> There were discussions for a similar fix regarding Thread.yield(). The
>>> concern was that using Thread.yield() in a tight loop might very easily
>>> lead to starvation on single core machines. Therefore Thread.sleep(10) is
>>> used to be sure the flag setting thread has actually a chance to
>>> progress.
>>>
>>> -JB-
>>>
>>>
>>>
>>>
>>>>
>>>> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik <
>>>> jaroslav.bachorik at oracle.com> wrote:
>>>>
>>>>   Please, review the following test change.
>>>>
>>>>>
>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168
>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00
>>>>>
>>>>> The test fails because of falsely evaluating the thread being parked as
>>>>> actually waiting on a monitor. This is because there is no difference
>>>>> in
>>>>> java thread state for those two situations. The test is using Phaser
>>>>> for
>>>>> synchronization between the checked and checking thread to make sure an
>>>>> appropriate code section is entered before performing asserts. Then it
>>>>> checks the checked thread state and waits till it becomes WAITING.
>>>>> Unfortunately, when Phaser needs to wait it parks the thread and sets
>>>>> the
>>>>> thread state to WAITING. From now on the test is in a completely random
>>>>> state and the result will largely depend on timing - thus failing
>>>>> intermittently.
>>>>>
>>>>> The solution is to use an additional volatile variable to prevent
>>>>> falsely
>>>>> indicating the park() induced WAITING state.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -JB-
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

From staffan.larsen at oracle.com  Thu Feb 20 09:42:05 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 20 Feb 2014 18:42:05 +0100
Subject: jmx-dev RFR 8035395:
	sun/management/jmxremote/startstop/JMXStartStopTest.java fails
	intermittently: Port already in use
In-Reply-To: <53060F69.7050704@oracle.com>
References: <53060F69.7050704@oracle.com>
Message-ID: <C137B41D-E9B2-4446-980E-5BDB6F758C7E@oracle.com>

Looks good.

You have a commented out call to debugPortUsage(pa); which you can remove before pushing.

Thanks,
/Staffan

On 20 feb 2014, at 15:21, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:

> Please, review this test fix.
> 
> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395
> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00
> 
> Currently, the test is using two fixed ports to start JMX connector and RMI registry when necessary. It can not deal with situations when the ports are not available. The patch is adding the ability to obtain ports from the ephemeral range and use them instead of the hardcoded ones. It also tries to minimize the chance of another process stealing the ports by holding the corresponding SocketServers open till right before the port is actually needed.
> 
> Thanks,
> 
> -JB-


From shanliang.jiang at oracle.com  Thu Feb 20 10:20:08 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Thu, 20 Feb 2014 19:20:08 +0100
Subject: RFR 8035395:
	sun/management/jmxremote/startstop/JMXStartStopTest.java
	fails intermittently: Port already in use
In-Reply-To: <53060F69.7050704@oracle.com>
References: <53060F69.7050704@oracle.com>
Message-ID: <53064758.7090009@oracle.com>

Jaroslav,

The failed tests were:
    1, 7, 8, 9

but the tests using this port (port2: 50235) were
    1, 3, 4, 6, 7, 8, 9

and tests 2, 4, 6 passed.

So I think the problem might be that the port was not fully 
released when a test tried to use it, even though the port had been 
closed by the previous test.

Your solution is to create a server socket on a free port, then release 
it when a test needs it. I wonder whether we will run into the same issue 
here: will the port be fully released by the time the test uses it?

Shanliang

Jaroslav Bachorik wrote:
> Please, review this test fix.
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395
> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00
>
> Currently, the test is using two fixed ports to start JMX connector 
> and RMI registry when necessary. It can not deal with situations when 
> the ports are not available. The patch is adding the ability to obtain 
> ports from the ephemeral range and use them instead of the hardcoded 
> ones. It also tries to minimize the chance of another process stealing 
> the ports by holding the corresponding SocketServers open till right 
> before the port is actually needed.
>
> Thanks,
>
> -JB-


From serguei.spitsyn at oracle.com  Thu Feb 20 12:04:42 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 20 Feb 2014 12:04:42 -0800
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>	<530505DA.2070305@oracle.com>	<530592EC.20207@oracle.com>
	<CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
	<02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap>
Message-ID: <53065FDA.8050608@oracle.com>

Hi Axel,

We decided that I will sponsor your change.
My current plan is to test and push it over the weekend.

Thanks,
Serguei


On 2/20/14 1:05 AM, Siebenborn, Axel wrote:
> Yes, I need a sponsor for this change.
> Thanks for the reviews,
> Axel
>
>
>> -----Original Message-----
>> From: Volker Simonis [mailto:volker.simonis at gmail.com]
>> Sent: Donnerstag, 20. Februar 2014 09:21
>> To: serguei.spitsyn at oracle.com
>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net;
>> serviceability-dev at openjdk.java.net; Siebenborn, Axel
>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression stack
>> are not visible as roots via jvmti FollowReferences and HeapDump
>>
>> Axel is covered by the SAP OCA.
>>
>> Thanks,
>> Volker
>>
>>
>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
>> <serguei.spitsyn at oracle.com> wrote:
>>> It looks good to me too.
>>> But does Axel need to sign an OCA ?
>>>
>>> This is a link to follow:
>>>    http://openjdk.java.net/contribute/
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>
>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>>>
>>>
>>> This change looks good.  Do you need a sponsor?
>>> thanks,
>>> Coleen
>>>
>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>>>
>>> Hi all,
>>>
>>> could I have a review for the following change?
>>>
>>>
>>>
>>> It adds code to iterate over the slots of the expression stack.
>>>
>>> Found object references are used as heap roots, the same way as object
>>> references in locals slots.
>>>
>>>
>>>
>>> Bug:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8034867
>>>
>>>
>>>
>>> Webrev:
>>>
>>> http://sapjvm.com/as/webrevs/8034867
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Axel
>>>
>>>
>>>


From david.holmes at oracle.com  Thu Feb 20 23:24:57 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 21 Feb 2014 17:24:57 +1000
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <53060611.3040209@oracle.com>
References: <52FDF8DD.9090602@oracle.com>
	<52FE6486.8000603@oracle.com>	<52FE7D88.8000701@oracle.com>
	<52FE951E.1060002@oracle.com>	<5301C2AB.2070600@oracle.com>
	<53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com>
	<5305DBB0.3080809@oracle.com> <53060611.3040209@oracle.com>
Message-ID: <5306FF49.4010607@oracle.com>

On 20/02/2014 11:41 PM, Jaroslav Bachorik wrote:
> On 20.2.2014 11:40, David Holmes wrote:
>> Hi Jaroslav,
>>
>> instanceKlass.cpp:
>>
>> Comment is wrong:
>>
>> 913     // JVMTI internal flag reset is needed in order to report
>> InvocationTargetException
>>
>> It will be ExceptionInInitializerError
>
> Will fix. Copypaste ...
>
>>
>> You added this:
>>
>>   917
>> this_oop->set_initialization_state_and_notify(initialization_error,
>> THREAD);
>>    918       CLEAR_PENDING_EXCEPTION;   // ignore any exception thrown,
>> class initialization error is thrown below
>> + 919       // JVMTI has already reported the pending exception
>> + 920       // JVMTI internal flag reset is needed in order to report
>> InvocationTargetException
>> + 921       JvmtiExport::clear_detected_exception((JavaThread*)THREAD);
>>
>> but there are a number of places where
>> set_initialization_state_and_notify is called when a pending exception
>> has been cleared, and then CLEAR_PENDING_EXCEPTION is called again, but
>> you didn't modify those other locations. They will rethrow the original
>> exception so I suppose that is okay from JVMTI's perspective. But the
>> flip-side of this is that if set_initialization_state_and_notify does
>> throw an exception, JVMTI will never see it.
>
> I don't know if it supposed to see it. It seems that any exception
> thrown from set_initialization_state_and_notify is thoroughly ignored
> and hidden from the outer world. Perhaps someone more experienced in
> JVMTI than me would like to chime in here? Serguei?

set_initialization_state_and_notify is often called when an exception 
has already occurred during the class loading/initialization process. It 
is that original exception that we want to propagate but meanwhile we 
have to perform this action to update the state and wake up any waiters. 
So we cache the original exception, clear it, do the state update and 
then clear any pending exception (I think the only exception possible 
here is OOME!), then rethrow the original. If this action did indeed 
throw OOME then we might not be able to wake up the waiter(s) and that 
might lead to a hang. While a debug VM could use TraceExceptions to 
(hopefully) spot the OOME, in a product VM it would be invisible, even 
if a JVMTI agent was tracking exceptions. So I think it should be 
visible to JVMTI. I would like to hear other opinions though.
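
To make the sequence above concrete, here is a rough Java sketch. It is
not the HotSpot code: the class InitFailureSketch, its methods and the
simulated OutOfMemoryError are all hypothetical stand-ins. It only shows
that a failure inside the notification step is dropped silently and the
original exception is the one the caller sees.

```java
public class InitFailureSketch {
    // Counts successful notifications of waiters (hypothetical bookkeeping).
    static int notifications = 0;

    // Hypothetical stand-in for set_initialization_state_and_notify; it may fail.
    static void updateStateAndNotify(boolean simulateOome) {
        if (simulateOome) {
            throw new OutOfMemoryError("simulated failure while notifying");
        }
        notifications++;
    }

    // Returns the exception that would propagate to the caller.
    static Throwable initialize(boolean simulateOome) {
        // The original failure is cached first.
        Throwable original = new IllegalStateException("initializer failed");
        try {
            updateStateAndNotify(simulateOome);
        } catch (Throwable secondary) {
            // Rough equivalent of CLEAR_PENDING_EXCEPTION: the secondary
            // failure is discarded and nobody outside ever sees it.
        }
        return original;
    }

    public static void main(String[] args) {
        System.out.println(initialize(false).getMessage());
        System.out.println(initialize(true).getMessage());
        System.out.println("notifications = " + notifications);
    }
}
```

In the second call the waiters are never notified, yet the caller gets
exactly the same exception as in the first call, which is the
invisibility concern raised above.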

However, this goes beyond the scope of fixing these particular tests, 
so I'm fine if this is simply recorded in another bug for future cleanup.

Thanks,
David

> -JB-
>
>>
>> ---
>>
>> jvm.cpp
>>
>> Comment is wrong again - not InvocationTargetException.
>>
>> ---
>>
>> David
>> ------
>>
>>
>>
>>
>> On 20/02/2014 1:59 AM, Jaroslav Bachorik wrote:
>>> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
>>>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>>>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Jaroslav,
>>>>>>>>
>>>>>>>> It looks good in general modulo indent comments from Dan.
>>>>>>>>
>>>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is
>>>>>>>> needed
>>>>>>>> or right thing to do in the
>>>>>>>> JvmtiExport::clear_detected_exception().
>>>>>>>> It seems, both clear_exception_detected() and
>>>>>>>> set_exception_detected() are always
>>>>>>>> called on current thread and so, it has to be safe to do without
>>>>>>>> acquiring any locks.
>>>>>>>
>>>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>>>>> can also be queried/modified by other threads so grabbing the
>>>>>>> associated lock is a good idea.
>>>>>>
>>>>>> The lock synchronization is cooperative.
>>>>>> It does not help much if the lock is not acquired in other places.
>>>>>> I can be wrong, but I've not found yet any place in the code where
>>>>>> the
>>>>>> clear_exception_detected() and set_exception_detected() are called
>>>>>> under protection of the JvmtiThreadState_lock.
>>>>>
>>>>> I copied the locking over from
>>>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is also
>>>>> supposed to work only with the current thread but acquires the lock
>>>>> nonetheless. But if you are sure that the lock is not required I have
>>>>> no objections removing it.
>>>>
>>>> I'm suggesting to remove it, as it is not used in other places in the
>>>> code.
>>>> It is going to be confusing if it is used in one place and missed in
>>>> others.
>>>
>>> I've removed the lock and applied the same cleanup logic to other places
>>> where exceptions are rewrapped.
>>>
>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
>>> JPRT run:
>>> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/
>>>
>>>
>>>
>>> Aurora Adhoc:
>>> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full
>>>
>>>
>>> (still running at the moment; no failures so far)
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> -JB-
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> And I'm repeating my question about pre-integration testing (Dan is
>>>>>>>> asking about the same).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>>>>> This is a round-0 review request.
>>>>>>>>>
>>>>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>>>>> case, propagates to JDI).
>>>>>>>>>
>>>>>>>>> The reflection code lacks the traditional error handler -
>>>>>>>>> therefore,
>>>>>>>>> upon throwing the NumberFormatException, the stack is searched for
>>>>>>>>> appropriate handlers and none are found. This leaves the
>>>>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>>>>> reset to false once the exception is handled. The reflection code
>>>>>>>>> then goes on and wraps the NumberFormatException into
>>>>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>>>>> exception event will be sent out.
>>>>>>>>>
>>>>>>>>> The proposed solution is to call
>>>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>>>>> appropriate places in the reflection code to reset the
>>>>>>>>> "exception_detected" flag and enable the InvocationTargetException
>>>>>>>>> be properly reported over JVMTI.
>>>>>>>>>
>>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> -JB-
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

From jaroslav.bachorik at oracle.com  Thu Feb 20 23:35:30 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 21 Feb 2014 08:35:30 +0100
Subject: RFR 8035395:
	sun/management/jmxremote/startstop/JMXStartStopTest.java
	fails intermittently: Port already in use
In-Reply-To: <53064758.7090009@oracle.com>
References: <53060F69.7050704@oracle.com> <53064758.7090009@oracle.com>
Message-ID: <530701C2.1020003@oracle.com>

Hi Shanliang,

On 20.2.2014 19:20, shanliang wrote:
> Jaroslav,
>
> The failed tests were:
>     1, 7, 8, 9
>
> but the tests using this port (port2: 50235) were
>     1, 3, 4, 6, 7, 8, 9
>
> and tests 2,4,6 were passed.
>
> so I think that the problem might be that the port was not fully
> released when a test was trying to use it, even the port was closed by
> the previous test.

I don't think this is the case.

Firstly, test 1 fails. There are no previous tests that could be holding 
on to the required port, so the port must have been taken by some foreign 
process. The port number being from the ephemeral range doesn't help 
either; quite the contrary.

Secondly, a port cannot be partially bound: either a process binds to 
a port or it does not. Also, when a process exits, all of its bound ports 
must be released. Since we wait for the exit code of the launched test 
application before proceeding, all the ports used by that application 
must be released before the main test routine can continue.

>
> Your solution is to create a Server socket on a free port, then release
> it when a test needs it. I suspect whether we will fall into same issue
> here: the port would not be fully released when using it?

No. ServerSocket.close() is called synchronously right before the port 
is going to be used. This call unbinds the socket and returns. At the 
moment of the return the port is free. I've run the tests locally and 
via JPRT and they are all passing.

Thanks,

-JB-

>
> Shanliang
>
> Jaroslav Bachorik wrote:
>> Please, review this test fix.
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395
>> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00
>>
>> Currently, the test is using two fixed ports to start JMX connector
>> and RMI registry when necessary. It can not deal with situations when
>> the ports are not available. The patch is adding the ability to obtain
>> ports from the ephemeral range and use them instead of the hardcoded
>> ones. It also tries to minimize the chance of another process stealing
>> the ports by holding the corresponding SocketServers open till right
>> before the port is actually needed.
>>
>> Thanks,
>>
>> -JB-
>


From shanliang.jiang at oracle.com  Fri Feb 21 01:26:26 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Fri, 21 Feb 2014 10:26:26 +0100
Subject: RFR 8035395:
	sun/management/jmxremote/startstop/JMXStartStopTest.java
	fails intermittently: Port already in use
In-Reply-To: <530701C2.1020003@oracle.com>
References: <53060F69.7050704@oracle.com> <53064758.7090009@oracle.com>
	<530701C2.1020003@oracle.com>
Message-ID: <53071BC2.3060307@oracle.com>

Jaroslav Bachorik wrote:
> Hi Shanliang,
>
> On 20.2.2014 19:20, shanliang wrote:
>> Jaroslav,
>>
>> The failed tests were:
>>     1, 7, 8, 9
>>
>> but the tests using this port (port2: 50235) were
>>     1, 3, 4, 6, 7, 8, 9
>>
>> and tests 2,4,6 were passed.
>>
>> so I think that the problem might be that the port was not fully
>> released when a test was trying to use it, even the port was closed by
>> the previous test.
>
> I don't think this is the case.
>
> Firstly, test 1 fails. There are no previous tests possibly holding on 
> to the required port. So the port must have been taken by some foreign 
> process. The port number being from the ephemeral range doesn't help 
> either, quite contrary.
>
> Secondly, a port can not be partially bound - either a process binds 
> to a port or not. Also, when a process exits all the bound ports must 
> be released. Since we are waiting for the exit code of the launched 
> test application before proceeding all the ports used by that 
> application must be released before the main test routine can continue.
Why did the exception "Port already in use error: " happen for Test1, then 
the port was free for 3/4/6, and then the exception appeared again for 
7, 8, 9?

A port may be unavailable after being closed, because it can be in 
the TIME_WAIT state.

Your fix creates a server socket that no client connects to, so the 
port should be available immediately after close() without entering the 
TIME_WAIT state. If so, hopefully the fix will work.

Shanliang
>
>>
>> Your solution is to create a Server socket on a free port, then release
>> it when a test needs it. I suspect whether we will fall into same issue
>> here: the port would not be fully released when using it?
>
> No. SocketServer.close() is called synchronously right before the port 
> is going to be used. This call unbinds the socket and returns. At the 
> moment of the return the port is free. I've run the tests locally and 
> via JPRT and they are all passing.
>
> Thanks,
>
> -JB-
>
>>
>> Shanliang
>>
>> Jaroslav Bachorik wrote:
>>> Please, review this test fix.
>>>
>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395
>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00
>>>
>>> Currently, the test is using two fixed ports to start JMX connector
>>> and RMI registry when necessary. It can not deal with situations when
>>> the ports are not available. The patch is adding the ability to obtain
>>> ports from the ephemeral range and use them instead of the hardcoded
>>> ones. It also tries to minimize the chance of another process stealing
>>> the ports by holding the corresponding SocketServers open till right
>>> before the port is actually needed.
>>>
>>> Thanks,
>>>
>>> -JB-
>>
>


From jaroslav.bachorik at oracle.com  Fri Feb 21 01:50:04 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 21 Feb 2014 10:50:04 +0100
Subject: RFR 8035395:
	sun/management/jmxremote/startstop/JMXStartStopTest.java
	fails intermittently: Port already in use
In-Reply-To: <53071BC2.3060307@oracle.com>
References: <53060F69.7050704@oracle.com> <53064758.7090009@oracle.com>
	<530701C2.1020003@oracle.com> <53071BC2.3060307@oracle.com>
Message-ID: <5307214C.2080009@oracle.com>

On 21.2.2014 10:26, shanliang wrote:
> Jaroslav Bachorik wrote:
>> Hi Shanliang,
>>
>> On 20.2.2014 19:20, shanliang wrote:
>>> Jaroslav,
>>>
>>> The failed tests were:
>>>     1, 7, 8, 9
>>>
>>> but the tests using this port (port2: 50235) were
>>>     1, 3, 4, 6, 7, 8, 9
>>>
>>> and tests 2,4,6 were passed.
>>>
>>> so I think that the problem might be that the port was not fully
>>> released when a test was trying to use it, even the port was closed by
>>> the previous test.
>>
>> I don't think this is the case.
>>
>> Firstly, test 1 fails. There are no previous tests possibly holding on
>> to the required port. So the port must have been taken by some foreign
>> process. The port number being from the ephemeral range doesn't help
>> either, quite contrary.
>>
>> Secondly, a port can not be partially bound - either a process binds
>> to a port or not. Also, when a process exits all the bound ports must
>> be released. Since we are waiting for the exit code of the launched
>> test application before proceeding all the ports used by that
>> application must be released before the main test routine can continue.
> Why did exception "Port already in use error: " happen for Test1, then
> the port was free for 3/4/6, and then the exception appeared again for
> 7, 8, 9?

Test1: fails; it tries to start the JMX connector on port 50235 and checks 
the connection afterward
Test2: passes; does not use port 50235
Test3: passes; checks for not being able to connect to port 50235
Test4: fails; the same as Test1
Test5: passes; the same as Test2
Test6: fails; it tries to start RMI registry on port 50235 and fails
Test7: fails; the same as Test1
Test8: fails; the same as Test1
Test9: fails; the same as Test1
Test10: passes; the same as Test2
Test11: passes; the same as Test2
Test12: passes; the same as Test2
Test13: passes; the same as Test2

The port doesn't mysteriously become used and unused. It is still 
occupied by a different process. Some of the tests don't fail simply 
because they don't use the port.
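
As a small illustration of why only the tests that bind the port fail,
here is a hedged sketch. The class PortInUseSketch and the method
isInUse are hypothetical and not part of the test; the second listener
in the same JVM merely stands in for the foreign process.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class PortInUseSketch {
    // Returns true if binding to the port fails because it is already bound.
    static boolean isInUse(int port) throws IOException {
        try (ServerSocket probe =
                 new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
            return false; // the bind succeeded, so nobody else held the port
        } catch (BindException e) {
            return true;  // typically reported as "Address already in use"
        }
    }

    public static void main(String[] args) throws IOException {
        // The occupant plays the role of the foreign process holding the port.
        try (ServerSocket occupant =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            int port = occupant.getLocalPort();
            System.out.println("while occupied: " + isInUse(port));
        }
    }
}
```

A test that never tries to bind or connect to the port never reaches the
BindException path, so it passes regardless of who owns the port.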

>
> A port is possibly unavailable after being closed, because it can be in
> the state TIME_WAIT.
>
> Your fix created a server socket but no client would connect to it, then
> the port could be available immediately after close(), not need to enter
> TIME_WAIT state, if so hopeful the fix could work.

The ServerSocket does not accept any incoming connections, so the 
socket should not go into the TIME_WAIT state when it is closed. The JPRT 
results indicate that this is indeed the case.

-JB-

>
> Shanliang
>>
>>>
>>> Your solution is to create a Server socket on a free port, then release
>>> it when a test needs it. I suspect whether we will fall into same issue
>>> here: the port would not be fully released when using it?
>>
>> No. SocketServer.close() is called synchronously right before the port
>> is going to be used. This call unbinds the socket and returns. At the
>> moment of the return the port is free. I've run the tests locally and
>> via JPRT and they are all passing.
>>
>> Thanks,
>>
>> -JB-
>>
>>>
>>> Shanliang
>>>
>>> Jaroslav Bachorik wrote:
>>>> Please, review this test fix.
>>>>
>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395
>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00
>>>>
>>>> Currently, the test is using two fixed ports to start JMX connector
>>>> and RMI registry when necessary. It can not deal with situations when
>>>> the ports are not available. The patch is adding the ability to obtain
>>>> ports from the ephemeral range and use them instead of the hardcoded
>>>> ones. It also tries to minimize the chance of another process stealing
>>>> the ports by holding the corresponding SocketServers open till right
>>>> before the port is actually needed.
>>>>
>>>> Thanks,
>>>>
>>>> -JB-
>>>
>>
>


From jaroslav.bachorik at oracle.com  Fri Feb 21 02:13:39 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 21 Feb 2014 11:13:39 +0100
Subject: RFR 8033787: [TESTBUG] Few jdk regression management tests should
	be improved with -XX:+UsePerfData usage
Message-ID: <530726D3.6020504@oracle.com>

Please, review this very simple test fix.

Issue : https://bugs.openjdk.java.net/browse/JDK-8033787
Webrev: http://cr.openjdk.java.net/~jbachorik/8033787/webrev.00

The fix just adds the -XX:+UsePerfData parameter to the @run section of the 
mentioned test.
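
For readers who have not seen such a change, a jtreg header with the
flag looks roughly like the sketch below. The class PerfDataFlagSketch
is a made-up example and not one of the tests touched by the webrev; the
hasFlag helper is likewise hypothetical.

```java
/*
 * @test
 * @summary Hypothetical example; not one of the tests in the webrev
 * @run main/othervm -XX:+UsePerfData PerfDataFlagSketch
 */
import java.lang.management.ManagementFactory;
import java.util.List;

public class PerfDataFlagSketch {
    // Returns true if the given flag appears verbatim in the JVM arguments.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Input arguments of the running JVM, as passed on the command line.
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("-XX:+UsePerfData passed explicitly: "
                + hasFlag(jvmArgs, "-XX:+UsePerfData"));
    }
}
```

Passing the flag explicitly keeps the test independent of whatever
default the tested VM configuration happens to use.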

-JB-

From david.holmes at oracle.com  Fri Feb 21 03:16:22 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 21 Feb 2014 21:16:22 +1000
Subject: RFR 8033787: [TESTBUG] Few jdk regression management tests should
	be improved with -XX:+UsePerfData usage
In-Reply-To: <530726D3.6020504@oracle.com>
References: <530726D3.6020504@oracle.com>
Message-ID: <53073586.3050203@oracle.com>

Looks okay to me.

David

On 21/02/2014 8:13 PM, Jaroslav Bachorik wrote:
> Please, review this very simple test fix.
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-8033787
> Webrev: http://cr.openjdk.java.net/~jbachorik/8033787/webrev.00
>
> The fix just adds -XX:+UsePerfData parameter to the @run section of the
> mentioned test.
>
> -JB-

From shanliang.jiang at oracle.com  Fri Feb 21 04:42:11 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Fri, 21 Feb 2014 13:42:11 +0100
Subject: RFR 8035395:
	sun/management/jmxremote/startstop/JMXStartStopTest.java
	fails intermittently: Port already in use
In-Reply-To: <5307214C.2080009@oracle.com>
References: <53060F69.7050704@oracle.com> <53064758.7090009@oracle.com>
	<530701C2.1020003@oracle.com> <53071BC2.3060307@oracle.com>
	<5307214C.2080009@oracle.com>
Message-ID: <530749A3.7060802@oracle.com>

Jaroslav Bachorik wrote:
> On 21.2.2014 10:26, shanliang wrote:
>> Jaroslav Bachorik wrote:
>>> Hi Shanliang,
>>>
>>> On 20.2.2014 19:20, shanliang wrote:
>>>> Jaroslav,
>>>>
>>>> The failed tests were:
>>>>     1, 7, 8, 9
>>>>
>>>> but the tests using this port (port2: 50235) were
>>>>     1, 3, 4, 6, 7, 8, 9
>>>>
>>>> and tests 2,4,6 were passed.
>>>>
>>>> so I think that the problem might be that the port was not fully
>>>> released when a test was trying to use it, even the port was closed by
>>>> the previous test.
>>>
>>> I don't think this is the case.
>>>
>>> Firstly, test 1 fails. There are no previous tests possibly holding on
>>> to the required port. So the port must have been taken by some foreign
>>> process. The port number being from the ephemeral range doesn't help
>>> either, quite contrary.
>>>
>>> Secondly, a port can not be partially bound - either a process binds
>>> to a port or not. Also, when a process exits all the bound ports must
>>> be released. Since we are waiting for the exit code of the launched
>>> test application before proceeding all the ports used by that
>>> application must be released before the main test routine can continue.
>> Why did exception "Port already in use error: " happen for Test1, then
>> the port was free for 3/4/6, and then the exception appeared again for
>> 7, 8, 9?
>
> Test1: fails; it tries to start JMX connector on port 50235 and check 
> the conenction afterward
> Test2: passes; does not use port 50235
> Test3: passes; checks for not being able to connect to port 50235
> Test4: fails; the same as Test1
> Test5: passes; the same as Test2
> Test6: fails; it tries to start RMI registry on port 50235 and fails
> Test7: fails; the same as Test1
> Test8: fails; the same as Test1
> Test9: fails; the same as Test1
> Test10: passes; the same as Test2
> Test11: passes; the same as Test2
> Test12: passes; the same as Test2
> Test13: passes; the same as Test2
>
> The port doesn't mysteriously become used and unused. It is still 
> occupied by a different process. Some of the tests don't fail simply 
> because they don't use the port.
Yes, right: some tests passed because they did not expect a JMX 
connection. I am not sure it is possible to distinguish between a port 
issue and a missing JMX server.

The fix looks OK.

Thanks,
Shanliang
>
>>
>> A port is possibly unavailable after being closed, because it can be in
>> the state TIME_WAIT.
>>
>> Your fix created a server socket but no client would connect to it, then
>> the port could be available immediately after close(), not need to enter
>> TIME_WAIT state, if so hopeful the fix could work.
>
> SocketServer does not accept any incoming connection and as such the 
> socket should not go to TIME_WAIT state when it is closed. The JPRT 
> results would indicate that this is indeed the case.
>
> -JB-
>
>>
>> Shanliang
>>>
>>>>
>>>> Your solution is to create a Server socket on a free port, then 
>>>> release
>>>> it when a test needs it. I suspect whether we will fall into same 
>>>> issue
>>>> here: the port would not be fully released when using it?
>>>
>>> No. SocketServer.close() is called synchronously right before the port
>>> is going to be used. This call unbinds the socket and returns. At the
>>> moment of the return the port is free. I've run the tests locally and
>>> via JPRT and they are all passing.
>>>
>>> Thanks,
>>>
>>> -JB-
>>>
>>>>
>>>> Shanliang
>>>>
>>>> Jaroslav Bachorik wrote:
>>>>> Please, review this test fix.
>>>>>
>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8035395
>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8035395/webrev.00
>>>>>
>>>>> Currently, the test is using two fixed ports to start JMX connector
>>>>> and RMI registry when necessary. It can not deal with situations when
>>>>> the ports are not available. The patch is adding the ability to 
>>>>> obtain
>>>>> ports from the ephemeral range and use them instead of the hardcoded
>>>>> ones. It also tries to minimize the chance of another process 
>>>>> stealing
>>>>> the ports by holding the corresponding SocketServers open till right
>>>>> before the port is actually needed.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -JB-
>>>>
>>>
>>
>


From staffan.larsen at oracle.com  Fri Feb 21 06:10:48 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 21 Feb 2014 15:10:48 +0100
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
Message-ID: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>

This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path.

I have also added a test that causes a JVM crash without the fix. 

The updates to the test library are all code copied from the jdk version of the test library.

webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/
bug: https://bugs.openjdk.java.net/browse/JDK-8035150

Thanks,
/Staffan

From dmitry.samersoff at oracle.com  Fri Feb 21 08:08:43 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 21 Feb 2014 20:08:43 +0400
Subject: PING Re: RR(S): JDK-7189721 nightly tests failed on JDI attaching
In-Reply-To: <52FE2D9A.4030700@oracle.com>
References: <52FE2D9A.4030700@oracle.com>
Message-ID: <53077A0B.5010401@oracle.com>

On 2014-02-14 18:52, Dmitry Samersoff wrote:
> Hi Everyone,
> 
> Please, review the changes.
> 
> http://cr.openjdk.java.net/~dsamersoff/JDK-7189721/webrev.01/
> 
> Summary:
> 
> If for some reason the ELF section with the section names is not loaded
> into the cache, the code attempts to read data using a NULL section pointer.
> 
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From daniel.daugherty at oracle.com  Fri Feb 21 19:40:15 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 21 Feb 2014 20:40:15 -0700
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <52ED3F28.5070806@oracle.com>
References: <52ED3F28.5070806@oracle.com>
Message-ID: <53081C1F.2070008@oracle.com>

Greetings,

This is a code review request for the JDK8u-hs-dev backport of the
following ObjectMonitor-JVM/TI hang fix:

     8028073 race condition in ObjectMonitor implementation causing deadlocks
     https://bugs.openjdk.java.net/browse/JDK-8028073

Here is the JDK8u-hs-dev webrev URL:

http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/

This is _almost_ a straightforward backport of the JDK9 fix. The only
difference to the fix was discussed at the end of the JDK9 review and
was determined to only be needed in versions of HotSpot without the
fix for 8028280:

http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html

8028280 has not yet been backported to JDK8u-hs-dev.

The easiest way to review the backport is to download the two patch
files from the webrevs and compare them with something like:

     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch

The same testing has been performed on the JDK8u-hs-dev version as
with the JDK9-hs-runtime version.

Thanks, in advance, for any comments, questions or suggestions.

Dan


On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
 > Greetings,
 >
 > I have a fix ready for the following bug:
 >
 >     8028073 race condition in ObjectMonitor implementation causing deadlocks
 >     https://bugs.openjdk.java.net/browse/JDK-8028073
 >
 > On the surface, this is a very simple fix that relocates a few lines of
 > code, relocates and rewrites the comments associated with that code and
 > adds several new comments.
 >
 > Of course, in reality, the issue is much more complicated, but I'm
 > hoping to make it easy for anyone not acquainted with this issue to
 > understand what's going on.
 >
 > Here are the JDK9 webrev URLs:
 >
 > OpenJDK:
 > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
 >
 > Oracle internal:
 > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
 >
 > The simple summary:
 >
 > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
 >   it is possible for a JVM/TI monitor event handler to accidentally
 >   consume a ParkEvent.unpark() call meant for Java Monitor layer
 > - the original code fix was made on 2005.07.04 using this bug ID:
 >   https://bugs.openjdk.java.net/browse/JDK-5030359
 > - it's the right fix, but it's in the wrong place
 > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
 >   event handler is called because it is that event handler
 >   that can cause the hang
 >
 >
 > Testing
 > -------
 >
 > - a new StressMonitorWait test has been created that reliably
 >   reproduces the hang in JDK[6789]; see the bug's gory details
 >   for the specific versions where the hang has been reproduced
 >   - the test reliably reproduces the hang in 5 seconds on my
 >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
 >     the hang reliably on other machines
 >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
 >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
 >     bits did not reproduce the hang
 > - JPRT test job
 > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
 >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
 >   and Windows-X64:
 >   - vm.quick
 >   - Kitchensink (bigapps)
 >   - Weblogic+medrec (bigapps)
 >   - runThese (bigapps)
 >
 >
 > The Gory Details Start Here
 > ---------------------------
 >
 > This is the old location of the block of code that's being moved:
 >
 > src/share/vm/runtime/objectMonitor.cpp:
 >
 > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
 > <snip>
 > 1499    exit (true, Self) ;                    // exit the monitor
 > <snip>
 > 1513    if (node._notified != 0 && _succ == Self) {
 > 1514       node._event->unpark();
 > 1515    }
 >
 >
 > This is the new location of the block of code that's being moved:
 >
 > src/share/vm/runtime/objectMonitor.cpp:
 >
 > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
 > <snip>
 > 1601      if (JvmtiExport::should_post_monitor_waited()) {
 > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
 > <snip>
 > 1604        if (node._notified != 0 && _succ == Self) {
 > <snip>
 > 1620          node._event->unpark();
 > 1621        }
 >
 >
 > The Risks
 > ---------
 >
 > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
 >   is enabled:
 >   - previously it was always executed
 >   - while the old code was not effective for the hang that is being
 >     fixed with this bug, it is possible that the old code prevented
 >     a different bug in the successor protocol from manifesting
 >   - thorough analysis of the successor protocol did not reveal a
 >     case where the old code was needed in the old location
 > - Thorough analysis indicates that the other JVM/TI monitor events
 >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
 >   - the successor protocol is complicated and the analysis could
 >     be wrong when certain options are used
 >   - comments were added to each location where a JVM/TI monitor
 >     event handler is called documenting why a fix like this one
 >     is not needed there
 >   - if the analysis is wrong, the new comments show where a new
 >     code change would be needed
 >
 >
 > The Scenario
 > ------------
 >
 > I've created a scenario that reproduces this hang:
 >
 > T1 - enters monitor and calls monitor.wait()
 > T2 - enters the monitor, calls monitor.notify() and exits the monitor
 > T3 - enters and exits the monitor
 > T4 - enters the monitor, delays for 5 seconds, exits the monitor
 >
 > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
 > handler that: enters a raw monitor, waits for 1ms, exits the raw monitor.
 >
 > Here are the six events necessary to make this hang happen:
 >
 > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
 > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
 >
 > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
 > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
 >
 > // KEY-EVENT-3a: T3 made T1 the successor
 > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
 >
 > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
 > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
 >
 > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
 > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
 >
 > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
 > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
 > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
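
The lost wakeup in the events above can be modeled in a few lines of
Java. This is a deliberately simplified, single-threaded sketch: the
class SharedParkEventSketch, the one-permit ParkEvent and tryPark() are
hypothetical stand-ins (the real ParkEvent blocks the thread), and the
reissue step only mirrors the condition of the relocated code
(notified and still the successor).

```java
public class SharedParkEventSketch {
    // A one-permit event: unpark() stores a permit, tryPark() consumes it.
    static final class ParkEvent {
        private boolean permit;

        void unpark() {
            permit = true;
        }

        // true = park would return at once; false = park would block.
        boolean tryPark() {
            boolean hadPermit = permit;
            permit = false;
            return hadPermit;
        }
    }

    // Replays the key events for T1; returns false if T1 would be stuck.
    static boolean t1CanProceed(boolean reissueAfterHandler) {
        ParkEvent event = new ParkEvent();
        boolean notified = true;       // T2 called notify() on the monitor
        boolean t1IsSuccessor = true;  // T3 made T1 the successor ...
        event.unpark();                // ... and unparked it on exit
        event.tryPark();               // the agent's raw monitor wait eats the permit
        if (reissueAfterHandler && notified && t1IsSuccessor) {
            event.unpark();            // relocated fix: restore the permit
        }
        return event.tryPark();        // T1 parks while re-entering the monitor
    }

    public static void main(String[] args) {
        System.out.println("without reissue: " + t1CanProceed(false));
        System.out.println("with reissue:    " + t1CanProceed(true));
    }
}
```

Without the reissue T1 finds no permit and nobody is left to provide
one, because T4 took the quick exit path; with the reissue after the
event handler, T1 can proceed.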
 >
 >
 > This bug is intertwined with:
 >
 > - The ObjectMonitor successor protocol
 > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
 >
 > There is a very long successor.notes attachment to JDK-8028073 that
 > attempts to describe the ObjectMonitor successor protocol. It's good
 > for putting pretty much anyone to sleep.
 >
 > Since this hang reproduces back to JDK6, this bug is taking the easily
 > backported solution of moving the original fix to the right location.
 > The following new bug has been filed for possible future work in this
 > area by the Serviceability Team:
 >
 >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
 >     https://bugs.openjdk.java.net/browse/JDK-8033399
 >
 >
 > The Symptoms
 > ------------
 >
 > With intermittent hangs like this, it is useful to know what to look
 > for in order to determine if you are running into this issue:
 >
 > - if you aren't using a debugger or a profiler or some other
 >   JVM/TI agent, then this hang is not the same as yours
 > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
 >   event handler, then this hang is not the same as yours
 > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
 >   JVM/TI RawMonitors, then this hang is not the same as yours
 > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
 >   back into Java code, then you might just be insane and this
 >   hang might be similar to yours. However, using a Java callback
 >   in an event handler is an even bigger problem/risk so fix that
 >   first.
 > - if you have one or more threads blocked like this and making no
 >   progress, then this hang might be the same as yours:
 >
 > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
 >    java.lang.Thread.State: BLOCKED (on object monitor)
 >    JavaThread state: _thread_blocked
 > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
 >    JavaThread state: _thread_blocked
 >         at java.lang.Object.wait(Native Method)
 >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
 >         at java.lang.Object.wait(Object.java:502)
 >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
 >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
 >
 > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
 >    java.lang.Thread.State: BLOCKED (on object monitor)
 >    JavaThread state: _thread_blocked
 > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
 >    JavaThread state: _thread_blocked
 >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
 >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
 >
 > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
 >    java.lang.Thread.State: BLOCKED (on object monitor)
 >    JavaThread state: _thread_blocked
 > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
 >    JavaThread state: _thread_blocked
 >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
 >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
 >
 > Key symptoms in thread T1:
 >
 > - had the object locked:
 >
 >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
 >
 > - did an Object.wait():
 >
 >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
 >
 > - is blocked on reentry:
 >
 >   waiting for monitor entry [0xfffffd7fc0231000]
 >
 > Key symptoms in thread T2:
 >
 > - is blocked waiting to lock the object:
 >
 >   waiting for monitor entry [0xfffffd7fc0130000]
 >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
 >
 > Key symptoms in thread T3:
 >
 > - is blocked waiting to lock the object:
 >
 >   waiting for monitor entry [0xfffffd7fc002f000]
 >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
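
The lost-wakeup part of the scenario above (KEY-EVENT-2 and KEY-EVENT-6) can be modelled in a few lines. This is only a sketch under stated assumptions: Permit, SharedPermitDemo and wakeupSurvives are hypothetical names invented for illustration, and the one-bit permit is a simplification, not HotSpot's ParkEvent or the ObjectMonitor code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SharedPermitDemo {
    /** Hypothetical one-shot wakeup token, loosely modelled on a park/unpark permit. */
    static final class Permit {
        private final AtomicBoolean available = new AtomicBoolean(false);

        void unpark() { available.set(true); }

        /** Consumes the permit if present; returns whether one was present. */
        boolean tryConsume() { return available.getAndSet(false); }
    }

    /** Returns true if the monitor re-entry path still finds its wakeup. */
    static boolean wakeupSurvives(boolean separateRawMonitorPermit) {
        Permit monitorPermit = new Permit();
        // Assumption: with a shared permit, raw monitor waits use the same token.
        Permit rawMonitorPermit = separateRawMonitorPermit ? new Permit() : monitorPermit;

        monitorPermit.unpark();            // the exiting thread wakes the successor
        rawMonitorPermit.tryConsume();     // the event handler's timed raw monitor wait
        return monitorPermit.tryConsume(); // what the successor sees when it goes to park
    }

    public static void main(String[] args) {
        System.out.println("shared permit:   " + wakeupSurvives(false));
        System.out.println("separate permit: " + wakeupSurvives(true));
    }
}
```

With the shared permit the wakeup is gone by the time the successor looks for it; with a separate permit it is still there. The separate-permit case corresponds loosely to the idea filed as JDK-8033399, not to the fix under review here.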


From erik.gahlin at oracle.com  Sat Feb 22 07:56:51 2014
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Sat, 22 Feb 2014 16:56:51 +0100
Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong
	object
In-Reply-To: <530388CC.9080807@oracle.com>
References: <530388CC.9080807@oracle.com>
Message-ID: <5308C8C3.4020907@oracle.com>

Looks good.

/Erik

Jaroslav Bachorik wrote 2/18/14 5:22 PM:
> Please, review the following test change.
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168
> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00
>
> The test fails because of falsely evaluating the thread being parked 
> as actually waiting on a monitor. This is because there is no 
> difference in java thread state for those two situations. The test is 
> using Phaser for synchronization between the checked and checking 
> thread to make sure an appropriate code section is entered before 
> performing asserts. Then it checks the checked thread state and waits 
> till it becomes WAITING. Unfortunately, when Phaser needs to wait it 
> parks the thread and sets the thread state to WAITING. From now on the 
> test is in a completely random state and the result will largely 
> depend on timing - thus failing intermittently.
>
> The solution is to use an additional volatile variable to prevent 
> falsely indicating the park() induced WAITING state.
>
> Thanks,
>
> -JB-
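
The ambiguity described in the quoted request can be reproduced in isolation. The following is a minimal sketch, not the Locks.java test; ParkVsWaitState and its methods are hypothetical names made up for illustration. It starts one thread that parks and one that waits on a monitor, and reports the java.lang.Thread.State observed for each.

```java
import java.util.concurrent.locks.LockSupport;

class ParkVsWaitState {
    /** Starts a daemon thread and polls (up to 5 seconds) until it reports WAITING. */
    static Thread.State observe(Runnable blocker) {
        Thread t = new Thread(blocker);
        t.setDaemon(true); // the thread stays blocked; do not keep the JVM alive for it
        t.start();
        long deadline = System.nanoTime() + 5_000_000_000L;
        Thread.State s = t.getState();
        while (s != Thread.State.WAITING && System.nanoTime() < deadline) {
            Thread.yield();
            s = t.getState();
        }
        return s;
    }

    /** State of a thread blocked in LockSupport.park(). */
    static Thread.State parkedState() {
        return observe(() -> {
            while (true) {
                LockSupport.park(); // re-park after any spurious return
            }
        });
    }

    /** State of a thread blocked in Object.wait() on a monitor it owns. */
    static Thread.State monitorWaitState() {
        Object lock = new Object();
        return observe(() -> {
            synchronized (lock) {
                while (true) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
    }

    public static void main(String[] args) {
        System.out.println("parked:       " + parkedState());
        System.out.println("monitor wait: " + monitorWaitState());
    }
}
```

Both threads should report WAITING, so a checker that looks only at the thread state cannot tell the two situations apart and needs some extra signal from the checked thread.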


From Alan.Bateman at oracle.com  Sun Feb 23 01:19:33 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Sun, 23 Feb 2014 09:19:33 +0000
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <5304F64E.4000206@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
	<530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com>
	<53031AE4.2050700@oracle.com> <5303B85D.10505@oracle.com>
	<5304E4B8.6040908@oracle.com> <5304F64E.4000206@oracle.com>
Message-ID: <5309BD25.2030407@oracle.com>

On 19/02/2014 18:22, Mikael Vidstedt wrote:
> :
>
> The documented grammar in the comment only mentions "SPACE" and the 
> code below doesn't make any references to \t. As a matter of fact, it 
> only checks for one single, mandatory SPACE after the colon (enforced 
> at line 535-536) and doesn't care to remove any space characters at 
> the end of the value. The while loop only deals with continuations. If 
> additional spaces do exist they will as far as I can tell be part of 
> the value. Are they trimmed later? I'm assuming it would be nice to 
> have both parsers (parse_manifest & JarFacade) behave the same way?
>
> Here's what it would look like to only check for space, but still eat 
> any additional spaces which doesn't match what 
> parse_manifest/parse_nv_pair does:
>
> http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/
>
Sorry for the delay getting back to you on this (I've been busy with 
other things).

I checked the JAR File Specification, which in turn references RFC 822 
as the "inspiration" for the name-value pairs. The SPACE token is just 
ASCII SP. So I agree it's just ASCII SP that needs to be handled here, 
not LWSP-char which includes ASCII HT.

Looking at JDK-6274276, the trimming was done to avoid 
hard-to-diagnose problems with leading/trailing spaces. It's possible 
that this is inconsistent with other areas where JAR file attributes are 
used. I would suggest leaving it as is for now as this is potentially 
changing behavior in several areas.

-Alan
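
To make the two behaviours under discussion concrete, here is a small sketch. It is not the parse_manifest or JarFacade code; ManifestValueDemo, strictValue and spaceTrimmedValue are hypothetical names used only for illustration. The first method requires exactly one ASCII SP after the colon and keeps everything else; the second also strips leading and trailing ASCII SP, but leaves HT alone.

```java
class ManifestValueDemo {
    /** Requires "name: value" with one mandatory SP; the rest is returned verbatim. */
    static String strictValue(String line) {
        int colon = line.indexOf(':');
        if (colon < 0 || colon + 1 >= line.length() || line.charAt(colon + 1) != ' ') {
            throw new IllegalArgumentException("expected \": \" after the attribute name");
        }
        return line.substring(colon + 2);
    }

    /** Same as strictValue, then strips leading/trailing ASCII SP only (HT is kept). */
    static String spaceTrimmedValue(String line) {
        String value = strictValue(line);
        int start = 0;
        int end = value.length();
        while (start < end && value.charAt(start) == ' ') {
            start++;
        }
        while (end > start && value.charAt(end - 1) == ' ') {
            end--;
        }
        return value.substring(start, end);
    }

    public static void main(String[] args) {
        String line = "Agent-Class:  Foo ";
        System.out.println("[" + strictValue(line) + "]");
        System.out.println("[" + spaceTrimmedValue(line) + "]");
    }
}
```

For the line "Agent-Class:  Foo " the strict variant yields " Foo " with the extra spaces as part of the value, while the trimming variant yields "Foo".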

From serguei.spitsyn at oracle.com  Sun Feb 23 23:32:33 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 23 Feb 2014 23:32:33 -0800
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <53065FDA.8050608@oracle.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>	<530505DA.2070305@oracle.com>	<530592EC.20207@oracle.com>	<CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>	<02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap>
	<53065FDA.8050608@oracle.com>
Message-ID: <530AF591.9070401@oracle.com>

The fix has been integrated into the jdk9/hs-rt repo.

I've also filed a qe-test/hotspot Enhancement to integrate the test
attached to the bug report into the VM SQE test base:
   https://bugs.openjdk.java.net/browse/INTJDK-7609362

Thanks,
Serguei


On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote:
> Hi Axel,
>
> We decided that I will sponsor your change.
> My current plan is to test and push it over weekend.
>
> Thanks,
> Serguei
>
>
> On 2/20/14 1:05 AM, Siebenborn, Axel wrote:
>> Yes, I need a sponsor for this change.
>> Thanks for the reviews,
>> Axel
>>
>>
>>> -----Original Message-----
>>> From: Volker Simonis [mailto:volker.simonis at gmail.com]
>>> Sent: Thursday, 20 February 2014 09:21
>>> To: serguei.spitsyn at oracle.com
>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net;
>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel
>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from 
>>> expression stack
>>> are not visible as roots via jvmti FollowReferences and HeapDump
>>>
>>> Axel is covered by the SAP OCA.
>>>
>>> Thanks,
>>> Volker
>>>
>>>
>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
>>> <serguei.spitsyn at oracle.com> wrote:
>>>> It looks good to me too.
>>>> But does Axel need to sign an OCA ?
>>>>
>>>> This is a link to follow:
>>>>    http://openjdk.java.net/contribute/
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>>
>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>>>>
>>>>
>>>> This change looks good.  Do you need a sponsor?
>>>> thanks,
>>>> Coleen
>>>>
>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>>>>
>>>> Hi all,
>>>>
>>>> could I have a review for the following change?
>>>>
>>>>
>>>>
>>>> It adds code to iterate over the slots of the expression stack.
>>>>
>>>> Found object references are used as heap roots, the same way as object
>>>> references in locals slots.
>>>>
>>>>
>>>>
>>>> Bug:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8034867
>>>>
>>>>
>>>>
>>>> Webrev:
>>>>
>>>> http://sapjvm.com/as/webrevs/8034867
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Axel
>>>>
>>>>
>>>>
>


From staffan.larsen at oracle.com  Mon Feb 24 00:07:00 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 24 Feb 2014 09:07:00 +0100
Subject: RFR(XS): JDK-8034203: Change JavaDoc for
	com.sun.jdi.request.EventRequest.setEnabled(boolean val)
In-Reply-To: <53032B48.2060003@oracle.com>
References: <53032B48.2060003@oracle.com>
Message-ID: <72947A99-0450-45A2-9190-9DCBAF0C3CDE@oracle.com>

Looks good!

Thanks,
/Staffan

On 18 feb 2014, at 10:43, Fredrik Arvidsson <fredrik.arvidsson at oracle.com> wrote:

> Hi
> 
> Please help me review this small change:
> 
> This is a minor change to the JavaDoc in the JDI API. The problem was discovered when trying to fix a failing test. 
> 
> Webrev: 8034203
> Bug: JDK-8034203
> CCC: 8034203
> 
> Thanks
> /Fredrik


From volker.simonis at gmail.com  Mon Feb 24 00:39:36 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 24 Feb 2014 09:39:36 +0100
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <530AF591.9070401@oracle.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
	<530505DA.2070305@oracle.com> <530592EC.20207@oracle.com>
	<CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
	<02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap>
	<53065FDA.8050608@oracle.com> <530AF591.9070401@oracle.com>
Message-ID: <CA+3eh10-VFm4zN-EcP8Xt3aoisrkejpigxhFS2rw1UHmF+SOvA@mail.gmail.com>

Hi Serguei,

thanks a lot for sponsoring this change.

It seems that https://bugs.openjdk.java.net/browse/INTJDK-7609362 is
not visible.
Could you please make it readable for everybody.

Thank you and best regards,
Volker


On Mon, Feb 24, 2014 at 8:32 AM, serguei.spitsyn at oracle.com
<serguei.spitsyn at oracle.com> wrote:
> The fix has been integrated into the jdk9/hs-rt repo.
>
> I've also filed a qe-test/hotspot Enhancement to integrate the test
> attached to the bug report into the VM SQE test base:
>   https://bugs.openjdk.java.net/browse/INTJDK-7609362
>
> Thanks,
> Serguei
>
>
>
> On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote:
>>
>> Hi Axel,
>>
>> We decided that I will sponsor your change.
>> My current plan is to test and push it over weekend.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 2/20/14 1:05 AM, Siebenborn, Axel wrote:
>>>
>>> Yes, I need a sponsor for this change.
>>> Thanks for the reviews,
>>> Axel
>>>
>>>
>>>> -----Original Message-----
>>>> From: Volker Simonis [mailto:volker.simonis at gmail.com]
>>>> Sent: Thursday, 20 February 2014 09:21
>>>> To: serguei.spitsyn at oracle.com
>>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net;
>>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel
>>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression
>>>> stack
>>>> are not visible as roots via jvmti FollowReferences and HeapDump
>>>>
>>>> Axel is covered by the SAP OCA.
>>>>
>>>> Thanks,
>>>> Volker
>>>>
>>>>
>>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
>>>> <serguei.spitsyn at oracle.com> wrote:
>>>>>
>>>>> It looks good to me too.
>>>>> But does Axel need to sign an OCA ?
>>>>>
>>>>> This is a link to follow:
>>>>>    http://openjdk.java.net/contribute/
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>
>>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>>>>>
>>>>>
>>>>> This change looks good.  Do you need a sponsor?
>>>>> thanks,
>>>>> Coleen
>>>>>
>>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> could I have a review for the following change?
>>>>>
>>>>>
>>>>>
>>>>> It adds code to iterate over the slots of the expression stack.
>>>>>
>>>>> Found object references are used as heap roots, the same way as object
>>>>> references in locals slots.
>>>>>
>>>>>
>>>>>
>>>>> Bug:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8034867
>>>>>
>>>>>
>>>>>
>>>>> Webrev:
>>>>>
>>>>> http://sapjvm.com/as/webrevs/8034867
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Axel
>>>>>
>>>>>
>>>>>
>>
>

From jaroslav.bachorik at oracle.com  Mon Feb 24 01:04:30 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 24 Feb 2014 10:04:30 +0100
Subject: RFR 4505697: nsk/jdi/ExceptionEvent/_itself_/exevent006 and
	exevent008 tests fail with InvocationTargetException
In-Reply-To: <5306FF49.4010607@oracle.com>
References: <52FDF8DD.9090602@oracle.com>
	<52FE6486.8000603@oracle.com>	<52FE7D88.8000701@oracle.com>
	<52FE951E.1060002@oracle.com>	<5301C2AB.2070600@oracle.com>
	<53033378.2060300@oracle.com> <5304D4CC.1020007@oracle.com>
	<5305DBB0.3080809@oracle.com> <53060611.3040209@oracle.com>
	<5306FF49.4010607@oracle.com>
Message-ID: <530B0B1E.3060508@oracle.com>

On 21.2.2014 08:24, David Holmes wrote:
> On 20/02/2014 11:41 PM, Jaroslav Bachorik wrote:
>> On 20.2.2014 11:40, David Holmes wrote:
>>> Hi Jaroslav,
>>>
>>> instanceKlass.cpp:
>>>
>>> Comment is wrong:
>>>
>>> 913     // JVMTI internal flag reset is needed in order to report
>>> InvocationTargetException
>>>
>>> It will be ExceptionInInitializerError
>>
>> Will fix. Copypaste ...
>>
>>>
>>> You added this:
>>>
>>>   917
>>> this_oop->set_initialization_state_and_notify(initialization_error,
>>> THREAD);
>>>    918       CLEAR_PENDING_EXCEPTION;   // ignore any exception thrown,
>>> class initialization error is thrown below
>>> + 919       // JVMTI has already reported the pending exception
>>> + 920       // JVMTI internal flag reset is needed in order to report
>>> InvocationTargetException
>>> + 921       JvmtiExport::clear_detected_exception((JavaThread*)THREAD);
>>>
>>> but there are a number of places where
>>> set_initialization_state_and_notify is called when a pending exception
>>> has been cleared, and then CLEAR_PENDING_EXCEPTION is called again, but
>>> you didn't modify those other locations. They will rethrow the original
>>> exception so I suppose that is okay from JVMTI's perspective. But the
>>> flip-side of this is that if set_initialization_state_and_notify does
>>> throw an exception, JVMTI will never see it.
>>
>> I don't know if it is supposed to see it. It seems that any exception
>> thrown from set_initialization_state_and_notify is thoroughly ignored
>> and hidden from the outer world. Perhaps someone more experienced in
>> JVMTI than me would like to chime in here? Serguei?
>
> set_initialization_state_and_notify is often called when an exception
> has already occurred during the class loading/initialization process. It
> is that original exception that we want to propagate but meanwhile we
> have to perform this action to update the state and wakeup any waiters.
> So we cache the original exception, clear it, do the state update and
> then clear any pending exception (I think the only exception possible
> here is OOME!), then rethrow the original. If this action did indeed
> throw OOME then we might not be able to wake up the waiter(s) and that
> might lead to a hang. While a debug VM could use TraceExceptions to
> (hopefully) spot the OOME, in a product VM it would be invisible, even
> if a JVMTI agent was tracking exceptions. So I think it should be
> visible to JVMTI. I would like to hear other opinions though.
>
> However this is going beyond the scope of fixing these particular tests
> so I'm fine if this is simply recorded in another bug for future clean up.

Created https://bugs.openjdk.java.net/browse/JDK-8035646

Thanks for the review, everyone.

-JB-

>
> Thanks,
> David
>
>> -JB-
>>
>>>
>>> ---
>>>
>>> jvm.cpp
>>>
>>> Comment is wrong again - not InvocationTargetException.
>>>
>>> ---
>>>
>>> David
>>> ------
>>>
>>>
>>>
>>>
>>> On 20/02/2014 1:59 AM, Jaroslav Bachorik wrote:
>>>> On 18.2.2014 11:18, serguei.spitsyn at oracle.com wrote:
>>>>> On 2/17/14 12:04 AM, Jaroslav Bachorik wrote:
>>>>>> On 14.2.2014 23:13, serguei.spitsyn at oracle.com wrote:
>>>>>>> On 2/14/14 12:33 PM, Daniel D. Daugherty wrote:
>>>>>>>> On 2/14/14 11:46 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Jaroslav,
>>>>>>>>>
>>>>>>>>> It looks good in general modulo indent comments from Dan.
>>>>>>>>>
>>>>>>>>> But I have a doubt that acquiring the JvmtiThreadState_lock is
>>>>>>>>> needed
>>>>>>>>> or right thing to do in the
>>>>>>>>> JvmtiExport::clear_detected_exception().
>>>>>>>>> It seems, both clear_exception_detected() and
>>>>>>>>> set_exception_detected() are always
>>>>>>>>> called on current thread and so, it has to be safe to do without
>>>>>>>>> acquiring any locks.
>>>>>>>>
>>>>>>>> My JVM/TI-foo is rusty, but I believe that JvmtiThreadState stuff
>>>>>>>> can also be queried/modified by other threads so grabbing the
>>>>>>>> associated lock is a good idea.
>>>>>>>
>>>>>>> The lock synchronization is cooperative.
>>>>>>> It does not help much if the lock is not acquired in other places.
>>>>>>> I can be wrong, but I've not found yet any place in the code where
>>>>>>> the
>>>>>>> clear_exception_detected() and set_exception_detected() are called
>>>>>>> under protection of the JvmtiThreadState_lock.
>>>>>>
>>>>>> I copied the locking over from
>>>>>> "JvmtiExport::cleanup_thread(JavaThread* thread)". That method is
>>>>>> also
>>>>>> supposed to work only with the current thread but acquires the lock
>>>>>> nonetheless. But if you are sure that the lock is not required I have
>>>>>> no objections removing it.
>>>>>
>>>>> I'm suggesting to remove it, as it is not used in other places in the
>>>>> code.
>>>>> It is going to be confusing if it is used in one place and missed in
>>>>> others.
>>>>
>>>> I've removed the lock and applied the same cleanup logic to other
>>>> places
>>>> where exceptions are rewrapped.
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.02
>>>> JPRT run:
>>>> http://prt-web.us.oracle.com//archive/2014/02/2014-02-19-114618.jbachorik.hotspot/
>>>>
>>>>
>>>>
>>>>
>>>> Aurora Adhoc:
>>>> http://aurora.ru.oracle.com//faces/Batch.xhtml?batchName=418853.VMSQE.adhoc.JPRT.full
>>>>
>>>>
>>>>
>>>> (still running at the moment; no failures so far)
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> And I'm repeating my question about pre-integration testing
>>>>>>>>> (Dan is
>>>>>>>>> asking about the same).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2/14/14 3:07 AM, Jaroslav Bachorik wrote:
>>>>>>>>>> This is a round-0 review request.
>>>>>>>>>>
>>>>>>>>>> The reflection code intercepting the exceptions thrown in the
>>>>>>>>>> invoked methods does not play nicely with JVMTI (which, in this
>>>>>>>>>> case, propagates to JDI).
>>>>>>>>>>
>>>>>>>>>> The reflection code lacks the traditional error handler -
>>>>>>>>>> therefore,
>>>>>>>>>> upon throwing the NumberFormatException, the stack is searched
>>>>>>>>>> for
>>>>>>>>>> appropriate handlers and none are found. This leaves the
>>>>>>>>>> "exception_detected" flag set to true while normally it would be
>>>>>>>>>> reset to false once the exception is handled. The reflection code
>>>>>>>>>> then goes on and wraps the NumberFormatException into
>>>>>>>>>> InvocationTargetException and throws it. But, alas, the
>>>>>>>>>> "exception_detected" flag is still set to true and no JVMTI
>>>>>>>>>> exception event will be sent out.
>>>>>>>>>>
>>>>>>>>>> The proposed solution is to call
>>>>>>>>>> thread->jvmti_thread_state()->clear_exception_detected() at the
>>>>>>>>>> appropriate places in the reflection code to reset the
>>>>>>>>>> "exception_detected" flag and enable the
>>>>>>>>>> InvocationTargetException
>>>>>>>>>> to be properly reported over JVMTI.
>>>>>>>>>>
>>>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-4505697
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/4505697/webrev.00
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> -JB-
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
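
For readers following the thread, the Java-level behaviour that the original request describes (an exception thrown in a reflectively invoked method being rewrapped) looks like this. This is a sketch only; ReflectionWrapDemo and describeFailure are hypothetical names, and using Integer.parseInt as the target is an assumption, not taken from the failing nsk tests.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class ReflectionWrapDemo {
    /** Invokes Integer.parseInt reflectively with bad input and describes what is thrown. */
    static String describeFailure() {
        try {
            Method parse = Integer.class.getMethod("parseInt", String.class);
            parse.invoke(null, "not-a-number"); // the target throws NumberFormatException
            return "no exception";
        } catch (InvocationTargetException e) {
            // Reflection catches the original exception and throws a wrapper instead.
            return e.getClass().getSimpleName()
                    + " caused by " + e.getCause().getClass().getSimpleName();
        } catch (ReflectiveOperationException e) {
            return "unexpected: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure());
    }
}
```

Two exceptions are thrown in sequence on the same thread: the NumberFormatException inside the target, then the InvocationTargetException that wraps it. The second one is the event that, per the discussion above, was not being reported over JVMTI.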


From staffan.larsen at oracle.com  Mon Feb 24 01:16:52 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 24 Feb 2014 10:16:52 +0100
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <CA+3eh10-VFm4zN-EcP8Xt3aoisrkejpigxhFS2rw1UHmF+SOvA@mail.gmail.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
	<530505DA.2070305@oracle.com> <530592EC.20207@oracle.com>
	<CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
	<02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap>
	<53065FDA.8050608@oracle.com> <530AF591.9070401@oracle.com>
	<CA+3eh10-VFm4zN-EcP8Xt3aoisrkejpigxhFS2rw1UHmF+SOvA@mail.gmail.com>
Message-ID: <3D7EB0F4-128E-4933-88DA-7B5849BE2672@oracle.com>


On 24 feb 2014, at 09:39, Volker Simonis <volker.simonis at gmail.com> wrote:

> Hi Serguei,
> 
> thanks a lot for sponsoring this change.
> 
> It seems that https://bugs.openjdk.java.net/browse/INTJDK-7609362 is
> not visible.
> Could you please make it readable for everybody.

Since it is filed on an internal test suite it's not possible to make the bug visible to everyone.

/Staffan

> 
> Thank you and best regards,
> Volker
> 
> 
> On Mon, Feb 24, 2014 at 8:32 AM, serguei.spitsyn at oracle.com
> <serguei.spitsyn at oracle.com> wrote:
>> The fix has been integrated into the jdk9/hs-rt repo.
>> 
>> I've also filed a qe-test/hotspot Enhancement to integrate the test
>> attached to the bug report into the VM SQE test base:
>>  https://bugs.openjdk.java.net/browse/INTJDK-7609362
>> 
>> Thanks,
>> Serguei
>> 
>> 
>> 
>> On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote:
>>> 
>>> Hi Axel,
>>> 
>>> We decided that I will sponsor your change.
>>> My current plan is to test and push it over weekend.
>>> 
>>> Thanks,
>>> Serguei
>>> 
>>> 
>>> On 2/20/14 1:05 AM, Siebenborn, Axel wrote:
>>>> 
>>>> Yes, I need a sponsor for this change.
>>>> Thanks for the reviews,
>>>> Axel
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: Volker Simonis [mailto:volker.simonis at gmail.com]
>>>>> Sent: Thursday, 20 February 2014 09:21
>>>>> To: serguei.spitsyn at oracle.com
>>>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net;
>>>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel
>>>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression
>>>>> stack
>>>>> are not visible as roots via jvmti FollowReferences and HeapDump
>>>>> 
>>>>> Axel is covered by the SAP OCA.
>>>>> 
>>>>> Thanks,
>>>>> Volker
>>>>> 
>>>>> 
>>>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
>>>>> <serguei.spitsyn at oracle.com> wrote:
>>>>>> 
>>>>>> It looks good to me too.
>>>>>> But does Axel need to sign an OCA ?
>>>>>> 
>>>>>> This is a link to follow:
>>>>>>   http://openjdk.java.net/contribute/
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Serguei
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>>>>>> 
>>>>>> 
>>>>>> This change looks good.  Do you need a sponsor?
>>>>>> thanks,
>>>>>> Coleen
>>>>>> 
>>>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> could I have a review for the following change?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> It adds code to iterate over the slots of the expression stack.
>>>>>> 
>>>>>> Found object references are used as heap roots, the same way as object
>>>>>> references in locals slots.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Bug:
>>>>>> 
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8034867
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Webrev:
>>>>>> 
>>>>>> http://sapjvm.com/as/webrevs/8034867
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Axel
>>>>>> 
>>>>>> 
>>>>>> 
>>> 
>> 


From markus.gronlund at oracle.com  Mon Feb 24 01:42:42 2014
From: markus.gronlund at oracle.com (Markus Gronlund)
Date: Mon, 24 Feb 2014 01:42:42 -0800 (PST)
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
In-Reply-To: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>
References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>
Message-ID: <ad8bc97a-9d9b-4596-ae95-5a3601622f29@default>

Hi Staffan,

I would think this is the correct fix.

The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp.

In addition, in other routines in constantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass, for example in ConstantPool::resolve_constant_at_impl().

Thanks
Markus


-----Original Message-----
From: Staffan Larsen 
Sent: 21 February 2014 15:11
To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to

This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path.

I have also added a test that causes a JVM crash without the fix. 

The updates to the test library are all code copied from the jdk version of the test library.

webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/
bug: https://bugs.openjdk.java.net/browse/JDK-8035150

Thanks,
/Staffan
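
The pairing of "error" tags with their "success" tags described above can be pictured with a toy model. This is a Java sketch of the switch shape only, not the C++ in constantPool.cpp; ConstantTagSketch, Tag and copyArm are hypothetical names, and the enum constants merely echo the JVM_CONSTANT_* names mentioned in this thread.

```java
class ConstantTagSketch {
    /** Hypothetical stand-ins for a few constant pool tags. */
    enum Tag {
        UNRESOLVED_CLASS, UNRESOLVED_CLASS_IN_ERROR,
        METHOD_HANDLE, METHOD_HANDLE_IN_ERROR,
        METHOD_TYPE, METHOD_TYPE_IN_ERROR,
        INVALID
    }

    /** Names the switch arm that would handle the tag; error tags share their base tag's arm. */
    static String copyArm(Tag tag) {
        switch (tag) {
            case UNRESOLVED_CLASS:
            case UNRESOLVED_CLASS_IN_ERROR: // without this label the tag falls to default
                return "class";
            case METHOD_HANDLE:
            case METHOD_HANDLE_IN_ERROR:
                return "method handle";
            case METHOD_TYPE:
            case METHOD_TYPE_IN_ERROR:
                return "method type";
            default:
                // Stands in for the ShouldNotReachHere() named in the bug title.
                throw new IllegalStateException("ShouldNotReachHere: " + tag);
        }
    }

    public static void main(String[] args) {
        for (Tag tag : Tag.values()) {
            if (tag != Tag.INVALID) {
                System.out.println(tag + " -> " + copyArm(tag));
            }
        }
    }
}
```

In this model a tag with no case label ends up in the default arm, which is the shape of failure the bug title suggests for an unresolved class entry that is in the error state.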

From volker.simonis at gmail.com  Mon Feb 24 01:43:21 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 24 Feb 2014 10:43:21 +0100
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <3D7EB0F4-128E-4933-88DA-7B5849BE2672@oracle.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
	<530505DA.2070305@oracle.com> <530592EC.20207@oracle.com>
	<CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
	<02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap>
	<53065FDA.8050608@oracle.com> <530AF591.9070401@oracle.com>
	<CA+3eh10-VFm4zN-EcP8Xt3aoisrkejpigxhFS2rw1UHmF+SOvA@mail.gmail.com>
	<3D7EB0F4-128E-4933-88DA-7B5849BE2672@oracle.com>
Message-ID: <CA+3eh1256dtfupidvDyUFv=6YbXEyM9nNfV_=rMHvUrq4+-AnQ@mail.gmail.com>

On Mon, Feb 24, 2014 at 10:16 AM, Staffan Larsen
<staffan.larsen at oracle.com> wrote:
>
> On 24 feb 2014, at 09:39, Volker Simonis <volker.simonis at gmail.com> wrote:
>
>> Hi Serguei,
>>
>> thanks a lot for sponsoring this change.
>>
>> It seems that https://bugs.openjdk.java.net/browse/INTJDK-7609362 is
>> not visible.
>> Could you please make it readable for everybody.
>
> Since it is filed on an internal test suite it's not possible to make the bug visible to everyone.
>

OK, thanks. I mis-read Serguei's mail. I thought he wanted to create a
jdk regression test (although that's a little complicated because of
the need to create a native shared library).

Regards,
Volker

> /Staffan
>
>>
>> Thank you and best regards,
>> Volker
>>
>>
>> On Mon, Feb 24, 2014 at 8:32 AM, serguei.spitsyn at oracle.com
>> <serguei.spitsyn at oracle.com> wrote:
>>> The fix has been integrated into the jdk9/hs-rt repo.
>>>
>>> I've also filed a qe-test/hotspot Enhancement to integrate the test
>>> attached to the bug report into the VM SQE test base:
>>>  https://bugs.openjdk.java.net/browse/INTJDK-7609362
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>
>>> On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>> Hi Axel,
>>>>
>>>> We decided that I will sponsor your change.
>>>> My current plan is to test and push it over weekend.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 2/20/14 1:05 AM, Siebenborn, Axel wrote:
>>>>>
>>>>> Yes, I need a sponsor for this change.
>>>>> Thanks for the reviews,
>>>>> Axel
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Volker Simonis [mailto:volker.simonis at gmail.com]
>>>>>> Sent: Thursday, 20 February 2014 09:21
>>>>>> To: serguei.spitsyn at oracle.com
>>>>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net;
>>>>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel
>>>>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression
>>>>>> stack
>>>>>> are not visible as roots via jvmti FollowReferences and HeapDump
>>>>>>
>>>>>> Axel is covered by the SAP OCA.
>>>>>>
>>>>>> Thanks,
>>>>>> Volker
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
>>>>>> <serguei.spitsyn at oracle.com> wrote:
>>>>>>>
>>>>>>> It looks good to me too.
>>>>>>> But does Axel need to sign an OCA ?
>>>>>>>
>>>>>>> This is a link to follow:
>>>>>>>   http://openjdk.java.net/contribute/
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>>>>>>>
>>>>>>>
>>>>>>> This change looks good.  Do you need a sponsor?
>>>>>>> thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> could I have a review for the following change?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> It adds code to iterate over the slots of the expression stack.
>>>>>>>
>>>>>>> Found object references are used as heap roots, the same way as object
>>>>>>> references in locals slots.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Bug:
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8034867
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Webrev:
>>>>>>>
>>>>>>> http://sapjvm.com/as/webrevs/8034867
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Axel
>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>>>
>

From staffan.larsen at oracle.com  Mon Feb 24 02:17:41 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 24 Feb 2014 11:17:41 +0100
Subject: RFR(XS): JDK-8034867 - Object references from expression stack
	are not visible as roots via jvmti FollowReferences and HeapDump
In-Reply-To: <CA+3eh1256dtfupidvDyUFv=6YbXEyM9nNfV_=rMHvUrq4+-AnQ@mail.gmail.com>
References: <02D5D45C1F8DB848A7AE20E80EE61A5C3980BF2A@DEWDFEMB20C.global.corp.sap>
	<530505DA.2070305@oracle.com> <530592EC.20207@oracle.com>
	<CA+3eh109WFe6NQNWKZT7152Jnk8vO3_sG1TohSL0_Z5Z6L4JrA@mail.gmail.com>
	<02D5D45C1F8DB848A7AE20E80EE61A5C3980D881@DEWDFEMB20C.global.corp.sap>
	<53065FDA.8050608@oracle.com> <530AF591.9070401@oracle.com>
	<CA+3eh10-VFm4zN-EcP8Xt3aoisrkejpigxhFS2rw1UHmF+SOvA@mail.gmail.com>
	<3D7EB0F4-128E-4933-88DA-7B5849BE2672@oracle.com>
	<CA+3eh1256dtfupidvDyUFv=6YbXEyM9nNfV_=rMHvUrq4+-AnQ@mail.gmail.com>
Message-ID: <244E1F17-0104-411E-AE16-CF7027E46A1E@oracle.com>


On 24 feb 2014, at 10:43, Volker Simonis <volker.simonis at gmail.com> wrote:

> On Mon, Feb 24, 2014 at 10:16 AM, Staffan Larsen
> <staffan.larsen at oracle.com> wrote:
>> 
>> On 24 feb 2014, at 09:39, Volker Simonis <volker.simonis at gmail.com> wrote:
>> 
>>> Hi Serguei,
>>> 
>>> thanks a lot for sponsoring this change.
>>> 
>>> It seems that https://bugs.openjdk.java.net/browse/INTJDK-7609362 is
>>> not visible.
>>> Could you please make it readable for everybody.
>> 
>> Since it is filed on an internal test suite it's not possible to make the bug visible to everyone.
>> 
> 
> OK, thanks. I mis-read Serguei's mail. I thought he wanted to create a
> jdk regression test (although that's a little complicated because of
> the need to create a native shared library).

Yes, that is the root problem. Working on fixing it...

/Staffan

> 
> Regards,
> Volker
> 
>> /Staffan
>> 
>>> 
>>> Thank you and best regards,
>>> Volker
>>> 
>>> 
>>> On Mon, Feb 24, 2014 at 8:32 AM, serguei.spitsyn at oracle.com
>>> <serguei.spitsyn at oracle.com> wrote:
>>>> The fix has been integrated into the jdk9/hs-rt repo.
>>>> 
>>>> I've also filed a qe-test/hotspot Enhancement to integrate the test
>>>> attached to the bug report into the VM SQE test base:
>>>> https://bugs.openjdk.java.net/browse/INTJDK-7609362
>>>> 
>>>> Thanks,
>>>> Serguei
>>>> 
>>>> 
>>>> 
>>>> On 2/20/14 12:04 PM, serguei.spitsyn at oracle.com wrote:
>>>>> 
>>>>> Hi Axel,
>>>>> 
>>>>> We decided that I will sponsor your change.
>>>>> My current plan is to test and push it over weekend.
>>>>> 
>>>>> Thanks,
>>>>> Serguei
>>>>> 
>>>>> 
>>>>> On 2/20/14 1:05 AM, Siebenborn, Axel wrote:
>>>>>> 
>>>>>> Yes, I need a sponsor for this change.
>>>>>> Thanks for the reviews,
>>>>>> Axel
>>>>>> 
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Volker Simonis [mailto:volker.simonis at gmail.com]
>>>>>>> Sent: Donnerstag, 20. Februar 2014 09:21
>>>>>>> To: serguei.spitsyn at oracle.com
>>>>>>> Cc: Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net;
>>>>>>> serviceability-dev at openjdk.java.net; Siebenborn, Axel
>>>>>>> Subject: Re: RFR(XS): JDK-8034867 - Object references from expression
>>>>>>> stack
>>>>>>> are not visible as roots via jvmti FollowReferences and HeapDump
>>>>>>> 
>>>>>>> Axel is covered by the SAP OCA.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Volker
>>>>>>> 
>>>>>>> 
>>>>>>> On Thu, Feb 20, 2014 at 6:30 AM, serguei.spitsyn at oracle.com
>>>>>>> <serguei.spitsyn at oracle.com> wrote:
>>>>>>>> 
>>>>>>>> It looks good to me too.
>>>>>>>> But does Axel need to sign an OCA ?
>>>>>>>> 
>>>>>>>> This is a link to follow:
>>>>>>>>  http://openjdk.java.net/contribute/
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 2/19/14 11:28 AM, Coleen Phillimore wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> This change looks good.  Do you need a sponsor?
>>>>>>>> thanks,
>>>>>>>> Coleen
>>>>>>>> 
>>>>>>>> On 2/13/14 11:57 AM, Siebenborn, Axel wrote:
>>>>>>>> 
>>>>>>>> Hi all,
>>>>>>>> 
>>>>>>>> could I have a review for the following change?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> It adds code to iterate over the slots of the expression stack.
>>>>>>>> 
>>>>>>>> Found object references are used as heap roots, the same way as object
>>>>>>>> references in locals slots.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Bug:
>>>>>>>> 
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8034867
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Webrev:
>>>>>>>> 
>>>>>>>> http://sapjvm.com/as/webrevs/8034867
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>> Axel


From mikael.vidstedt at oracle.com  Mon Feb 24 11:22:33 2014
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Mon, 24 Feb 2014 11:22:33 -0800
Subject: 8034856/8034857: More gcc warnings
In-Reply-To: <5309BD25.2030407@oracle.com>
References: <52FCC60B.6040205@oracle.com>
	<6E073B44-2FF7-419F-A094-7CD5C3AE0788@oracle.com>
	<52FD0DA2.7030201@oracle.com> <52FD35C5.1000306@oracle.com>
	<5300F36F.9010205@oracle.com>
	<CA+kOe09Uc3LQG7Ui7fhxsQABacrPdvEpRVGoKxnoY-0FHm4Ntw@mail.gmail.com>
	<4DCAB60B-5D60-4873-A5A4-ACF123F10045@oracle.com>
	<530225D2.4040709@oracle.com> <5302DA88.9060306@oracle.com>
	<53031AE4.2050700@oracle.com> <5303B85D.10505@oracle.com>
	<5304E4B8.6040908@oracle.com> <5304F64E.4000206@oracle.com>
	<5309BD25.2030407@oracle.com>
Message-ID: <530B9BF9.40102@oracle.com>


On 2014-02-23 01:19, Alan Bateman wrote:
> On 19/02/2014 18:22, Mikael Vidstedt wrote:
>> :
>>
>> The documented grammar in the comment only mentions "SPACE" and the 
>> code below doesn't make any references to \t. As a matter of fact, it 
>> only checks for one single, mandatory SPACE after the colon (enforced 
>> at line 535-536) and doesn't care to remove any space characters at 
>> the end of the value. The while loop only deals with continuations. 
>> If additional spaces do exist they will as far as I can tell be part 
>> of the value. Are they trimmed later? I'm assuming it would be nice 
>> to have both parsers (parse_manifest & JarFacade) behave the same way?
>>
>> Here's what it would look like to only check for space, but still eat 
>> any additional spaces which doesn't match what 
>> parse_manifest/parse_nv_pair does:
>>
>> http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/
>>
> Sorry for the delay getting back to you on this (I've been busy with 
> other things).
>
> I checked the JAR File Specification, which in turn references RFC 822 
> as the "inspiration" for the name-value pairs. The SPACE token is just 
> ASCII SP. So I agree it's just ASCII SP that needs to be handled here, 
> not LWSP-char which includes ASCII HT.
>
> Looking at JDK-6274276 then the trimming was done to avoid 
> hard-to-diagnose problems with leading/trailing spaces. It's possible 
> that this is inconsistent with other areas where JAR file attributes 
> are used. I would suggest leaving it as is for now as this is 
> potentially changing behavior in several areas.

That sounds reasonable. I'll keep the webrev.01 approach - only check 
for and trim ASCII SP.

Thanks for the review!

Cheers,
Mikael
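
Editorial sketch, not part of the webrev (the actual change is in C code): the
rule agreed on above, "only check for and trim ASCII SP", could look roughly
like this in Java. The class and method names are hypothetical and chosen only
for illustration; ASCII HT (tab) is deliberately left untouched.

```java
public final class SpaceTrim {

    /**
     * Strips leading and trailing ASCII SP (0x20) only.
     * Tabs and other whitespace characters are kept as part of the value.
     * Hypothetical helper, not taken from the JDK sources.
     */
    public static String trimAsciiSpaces(String value) {
        int start = 0;
        int end = value.length();
        while (start < end && value.charAt(start) == ' ') {
            start++;
        }
        while (end > start && value.charAt(end - 1) == ' ') {
            end--;
        }
        return value.substring(start, end);
    }

    public static void main(String[] args) {
        // Surrounding spaces are removed.
        System.out.println("[" + trimAsciiSpaces("  lib/agent.jar  ") + "]");
        // Tabs survive because only ASCII SP is trimmed.
        System.out.println("[" + trimAsciiSpaces(" \tvalue\t ").replace("\t", "\\t") + "]");
    }
}
```

Unlike String.trim(), which removes every character up to U+0020 (including
tab), this keeps anything that is not a plain space.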


From kevin.walls at oracle.com  Tue Feb 25 01:54:31 2014
From: kevin.walls at oracle.com (Kevin Walls)
Date: Tue, 25 Feb 2014 09:54:31 +0000
Subject: RR(S): 8028623 and 8032466: serviceability agent hashcode changes.
In-Reply-To: <52E0DDE2.6050601@oracle.com>
References: <52E0033B.80109@oracle.com>	<05BFE552-8255-4C60-AE82-981E0FD9E5AD@oracle.com>
	<52E0DDE2.6050601@oracle.com>
Message-ID: <530C6857.6000601@oracle.com>

Hi -

I'm going to backport this (8028623 hashcode change, and testcase tweak 
8032466) to jdk8u (and after that, to jdk7u).  The same changesets work 
and test fine.

8u webrev, planning to push to 
http://hg.openjdk.java.net/jdk8u/hs-dev/hotspot:

http://cr.openjdk.java.net/~kevinw/8028623/8028623.8u.001/

This email is a further review/request, or if that is redundant it's 
just a warning in case anybody thinks that is a bad idea for some 
reason.  Feel free to speak, but I would rather like to presume the same 
reviewers are OK with it...

Thanks
Kevin


On 23/01/14 09:16, Kevin Walls wrote:
> Thanks Dmitry, Staffan!
>
> On 23/01/14 07:05, Staffan Larsen wrote:
>> Looks good!
>>
>> Thanks,
>> /Staffan
>>
>> On 22 jan 2014, at 18:43, Kevin Walls <kevin.walls at oracle.com> wrote:
>>
>>> Hi,
>>>
>>> Can I just get a review of this jtreg tag change in a testcase.  The 
>>> testcase specifically contains a utf8 character and to make sure it 
>>> compiles everywhere we need to specify -encoding on the @compile 
>>> line, and then add an @run line:
>>>
>>> webrev
>>> http://cr.openjdk.java.net/~kevinw/8032466/webrev.00/
>>>
>>> bug
>>> https://bugs.openjdk.java.net/browse/JDK-8032466
>>>
>>> (if the testcase runs fine without this change, in jtreg locally you 
>>> can make it fail by setting -encoding ascii)
>>>
>>> Thanks!
>>> Kevin
>


From jaroslav.bachorik at oracle.com  Tue Feb 25 05:45:13 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 25 Feb 2014 14:45:13 +0100
Subject: RFR 8034168: ThreadMXBean/Locks.java failed, blocked on wrong
	object
In-Reply-To: <CA+kOe0-TziqPS6PUpDVodeJiVF+Ms_JrRrQ2nFvqFindLikhVw@mail.gmail.com>
References: <530388CC.9080807@oracle.com>	<CA+kOe09mG0g_5d=pa51Dgj-kLRfkf1J-HSmT9P9zctqA1ufr5w@mail.gmail.com>	<53045D76.5080907@oracle.com>	<CA+kOe08BRfbu63CAzAZ4p7ZpYfmHGBBSQVvFaMVg6-2+xeuV6A@mail.gmail.com>	<53060E57.6010300@oracle.com>
	<CA+kOe0-TziqPS6PUpDVodeJiVF+Ms_JrRrQ2nFvqFindLikhVw@mail.gmail.com>
Message-ID: <530C9E69.9070501@oracle.com>

On 20.2.2014 18:04, Martin Buchholz wrote:
> I think David is too pessimistic about Thread.yield being ineffective on
> Java SE implementations (OTOH David is a Java Embedded expert).  In
> practice an implementation that never thread switched out of a yield() loop
> would not pass the tck.  As for theory: it's true that Thread.yield has no
> progress guarantees, but then neither does Thread.sleep.  A perverse
> implementation can always starve any thread it feels like.
>
> Anyways, there is nothing wrong with your sleep loop.  Except maybe you
> want to time out eventually.

The test harness should take care of timeout - at least that's what I 
can see as the preferred approach in the recent test fixes.

Since David doesn't seem to have a strong opinion about Thread.sleep() 
vs. Thread.yield() here and I am rather impartial to that as well I can 
change the fix to use Thread.yield() instead if you are sure it won't 
cause any troubles.

Thanks,

-JB-

>
>
> On Thu, Feb 20, 2014 at 6:16 AM, Jaroslav Bachorik <
> jaroslav.bachorik at oracle.com> wrote:
>
>> This was discussed when reviewing ThreadMXBean/
>> SynchronizationStatistics.java
>>
>> Regarding a busy wait checking the thread states and issuing
>> Thread.yield() now and then David Holmes wrote:
>> "Not elegant and not completely reliable either. Probably adequate on a
>> multi-core system but single-core and with some schedulers it could just
>> be a busy spin." [1]
>>
>> As far as I understand, the only benefit of using Thread.yield() instead
>> of Thread.sleep() would be 10-100ms shorter execution time, right?
>>
>> -JB-
>>
>> [1] http://mail.openjdk.java.net/pipermail/jmx-dev/2013-
>> October/000484.html
>>
>>
>> On 19.2.2014 17:56, Martin Buchholz wrote:
>>
>>> The jsr166 tck tests make systematic use of Thread.yield, e.g.
>>>
>>>
>>>       /**
>>>        * Spin-waits up to the specified number of milliseconds for the given
>>>        * thread to enter a wait state: BLOCKED, WAITING, or TIMED_WAITING.
>>>        */
>>>       void waitForThreadToEnterWaitState(Thread thread, long timeoutMillis) {
>>>           for (;;) {
>>>               Thread.State s = thread.getState();
>>>               if (s == Thread.State.BLOCKED ||
>>>                   s == Thread.State.WAITING ||
>>>                   s == Thread.State.TIMED_WAITING)
>>>                   return;
>>>               else if (s == Thread.State.TERMINATED)
>>>                   fail("Unexpected thread termination");
>>>               else if (millisElapsedSince(startTime) > timeoutMillis) {
>>>                   threadAssertTrue(thread.isAlive());
>>>                   return;
>>>               }
>>>               Thread.yield();
>>>           }
>>>       }
>>>
>>>
>>>
>>> On Tue, Feb 18, 2014 at 11:29 PM, Jaroslav Bachorik <
>>> jaroslav.bachorik at oracle.com> wrote:
>>>
>>>   On 18.2.2014 18:06, Martin Buchholz wrote:
>>>>
>>>>   Not checking any details, but tests that want to wait for a particular
>>>>> thread state are a good reason to use
>>>>>
>>>>> volatile boolean flag;
>>>>> ...
>>>>> while (!flag) Thread.yield();
>>>>>
>>>>> I prefer calling Thread.yield to sleeping in this special case, in part
>>>>> because I don't want to rely on the implementation of sleep, while yield
>>>>> is
>>>>> semantically a no-op.  (Also sleeping 100ms is a long time for a
>>>>> computer)
>>>>>
>>>>>
>>>> There were discussions for a similar fix regarding Thread.yield(). The
>>>> concern was that using Thread.yield() in a tight loop might very easily
>>>> lead to starvation on single core machines. Therefore Thread.sleep(10) is
>>>> used to be sure the flag setting thread has actually a chance to
>>>> progress.
>>>>
>>>> -JB-
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> On Tue, Feb 18, 2014 at 8:22 AM, Jaroslav Bachorik <
>>>>> jaroslav.bachorik at oracle.com> wrote:
>>>>>
>>>>>    Please, review the following test change.
>>>>>
>>>>>>
>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8034168
>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8034168/webrev.00
>>>>>>
>>>>>> The test fails because of falsely evaluating the thread being parked as
>>>>>> actually waiting on a monitor. This is because there is no difference
>>>>>> in
>>>>>> java thread state for those two situations. The test is using Phaser
>>>>>> for
>>>>>> synchronization between the checked and checking thread to make sure an
>>>>>> appropriate code section is entered before performing asserts. Then it
>>>>>> checks the checked thread state and waits till it becomes WAITING.
>>>>>> Unfortunately, when Phaser needs to wait it parks the thread and sets
>>>>>> the
>>>>>> thread state to WAITING. From now on the test is in a completely random
>>>>>> state and the result will largely depend on timing - thus failing
>>>>>> intermittently.
>>>>>>
>>>>>> The solution is to use an additional volatile variable to prevent
>>>>>> falsely
>>>>>> indicating the park() induced WAITING state.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
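
Editorial sketch of the approach described in the original review request
quoted above: a volatile flag is combined with the thread-state check, so that
a WAITING state caused by park() is not mistaken for a wait on the monitor.
This is a minimal illustration under assumptions, not the code from the
webrev; the class, field and method names are hypothetical. It polls with
Thread.sleep(10) as in the proposed fix and has no timeout of its own, on the
assumption that the test harness enforces one.

```java
public final class WaitStateCheck {

    // Set by the worker immediately before it blocks in Object.wait().
    public static volatile boolean aboutToWait = false;
    private static final Object lock = new Object();

    /** Polls until the worker has set the flag and reports WAITING. */
    public static void awaitMonitorWait(Thread worker) throws InterruptedException {
        while (!aboutToWait || worker.getState() != Thread.State.WAITING) {
            Thread.sleep(10); // give the worker a chance to make progress
        }
    }

    /** Runs one worker through wait/notify and returns its final state. */
    public static Thread.State demo() {
        Thread worker = new Thread(() -> {
            synchronized (lock) {
                aboutToWait = true;
                try {
                    while (aboutToWait) { // guards against spurious wakeups
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        worker.start();
        try {
            awaitMonitorWait(worker);
            // Here the worker is known to be waiting on 'lock', not parked elsewhere.
            synchronized (lock) {
                aboutToWait = false;
                lock.notifyAll();
            }
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.getState();
    }

    public static void main(String[] args) {
        System.out.println("worker state after join: " + demo());
    }
}
```

Replacing Thread.sleep(10) with Thread.yield() gives the variant Martin
prefers; the flag check stays the same either way.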


From jaroslav.bachorik at oracle.com  Tue Feb 25 05:52:07 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 25 Feb 2014 14:52:07 +0100
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <53049009.9020204@oracle.com>
References: <529EF58F.5000701@oracle.com>
	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>
	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>
	<52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com>
	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>
	<52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com>
	<52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com>
	<53049009.9020204@oracle.com>
Message-ID: <530CA007.5080006@oracle.com>

Thumbs up. (not a "reviewer", though)

-JB-
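
Editorial sketch of the port-selection idea from the discussion quoted below
(pick a random port between 49152 and 65535, try to bind, retry on conflict,
and hand back the open socket rather than a bare port number). This is an
illustration under assumptions, not code from the webrevs or from the test
library; the class and method names and the attempt limit are hypothetical.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

public final class PortRetry {

    static final int MIN_PORT = 49152;
    static final int MAX_PORT = 65535;

    /**
     * Tries random ports in the dynamic range until one can be bound.
     * Returns the bound socket, so the port cannot be taken by another
     * process between the check and the use.
     */
    public static ServerSocket bindRandomPort(int maxAttempts) throws IOException {
        BindException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return new ServerSocket(port);
            } catch (BindException e) {
                last = e; // port already in use: pick another one
            }
        }
        throw new IOException("no free port after " + maxAttempts + " attempts", last);
    }

    /** Binds one socket and returns its port, or -1 if binding failed. */
    public static int demo() {
        try (ServerSocket socket = bindRandomPort(20)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("bound port: " + demo());
    }
}
```

When only a port number can be passed on (for example as a property value to
a sub-process), the same retry loop has to wrap the start of that process
instead, as in the JstatdTest approach mentioned in the thread.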

On 19.2.2014 12:05, taras ledkov wrote:
> Hi,
>
> Imports are fixed:
>
> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>
> On 05.02.2014 17:20, Jaroslav Bachorik wrote:
>> Hi Taras,
>>
>> thanks for taking care of this.
>>
>> The changes look fine to me.
>>
>> One minor nit is unused imports of the library classes in
>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java".
>>
>> It does not use any of those classes as its base class
>> "AbstractFilePermissionTest" does all the heavy lifting.
>>
>> Cheers,
>>
>> -JB-
>>
>> On 5.2.2014 13:42, taras ledkov wrote:
>>> Hi,
>>>
>>> So please take a look at the review against JDK9.
>>> The reviewed patch had not been integrated into JDK8.
>>>
>>> Port to JDK9 is identical. The difference: the ProcessTools.java has
>>> been already patched by Jaroslav.
>>>
>>> Webrev for jdk part:
>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/
>>>
>>> Webrev for hs part:
>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>>
>>>
>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote:
>>>> Hi Taras,
>>>>
>>>> On 21.1.2014 10:30, taras ledkov wrote:
>>>>> Hi Jaroslav,
>>>>>
>>>>> Could you please review the last changes?
>>>>> Are you OK?
>>>>
>>>> Yes, the change looks ok. But I think we will need to get back to this
>>>> problem eventually and implement a central port dispatcher if we
>>>> want to
>>>> be 100% sure the port conflicts wouldn't occur. But your changes reduce
>>>> the chance significantly.
>>>>
>>>> Thanks for taking care of this.
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>>>
>>>>>> Thanks,
>>>>>> /Staffan
>>>>>>
>>>>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Staffan,
>>>>>>>
>>>>>>> I fixed the tests according to your comments.
>>>>>>> Are you OK?
>>>>>>>
>>>>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please take a look at the new review.
>>>>>>>>
>>>>>>>> Webrev for jdk part:
>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>>>>
>>>>>>>> Webrev for hs part:
>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>>>>
>>>>>>>> My answers are inline:
>>>>>>>>
>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>>>>> Hi Taras,
>>>>>>>>>
>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java.
>>>>>>>>> Here's a couple of comments:
>>>>>>>>>
>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>>>>> This test could be simplified by not specifying an address at all.
>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp,
>>>>>>>>> there
>>>>>>>>> is no reason to specify an address. If address is unspecified (and
>>>>>>>>> server=y), the connector will pick an address and print it to the
>>>>>>>>> command line. Thus the only change that needs to be done is to
>>>>>>>>> remove
>>>>>>>>> ",address=8888" from the @run command.
>>>>>>>> fixed
>>>>>>>>
>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>>>>> These tests do not compile cleanly with an empty JTwork
>>>>>>>>> directory. It
>>>>>>>>> seems that having one @build for each class does not work well -
>>>>>>>>> when
>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving
>>>>>>>>> all
>>>>>>>>> classes to one @build statement solved this problem for me.
>>>>>>>> fixed
>>>>>>>>
>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>>>>   187         Future<Void> stdoutTask = stdout.process();
>>>>>>>>>   188         Future<Void> stderrTask = stderr.process();
>>>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>>>> fixed
>>>>>>>>
>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>>>>> At first I thought something was wrong with this file - the
>>>>>>>>> diff is
>>>>>>>>> very weird. Then I realized you renamed an old file and created a
>>>>>>>>> new
>>>>>>>>> file using the old name.
>>>>>>>> You are right. I did it to keep the test name.
>>>>>>>>
>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks like
>>>>>>>>> you
>>>>>>>>> delete the files at the beginning of the test in any case.
>>>>>>>> I think yes. In the first place, this functionality was in the old
>>>>>>>> code.
>>>>>>>> In the second place, a file without write permission may be a
>>>>>>>> problem
>>>>>>>> for a further cleanup (not by the test, for example for the tests
>>>>>>>> launcher scripts etc.)
>>>>>>>>
>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>>>>> sub-classes, the other has a getter method.
>>>>>>>> fixed
>>>>>>>>
>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an
>>>>>>>>> ex.printStackTrace().
>>>>>>>>> (Same thing for all other places where you call
>>>>>>>>> Integer.parseInt())
>>>>>>>> fixed
>>>>>>>>
>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>>>>> This file is never used as far as I can see.
>>>>>>>> It is used by PasswordFilePermissionTest &
>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>>>>> AbstractFilePermissionTest : 162).
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> /Staffan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Please take a look at the review with fixed issues about
>>>>>>>>>> trying to
>>>>>>>>>> launch test that needs free port several times.
>>>>>>>>>>
>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>>>
>>>>>>>>>> Webrev for hs part:
>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>>>
>>>>>>>>>> Please pay attention to the new method
>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, Consumer<String>),
>>>>>>>>>> which is used to analyze all output of a sub-process. It shares
>>>>>>>>>> code with ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>> Predicate<String>, long, TimeUnit), which is used to determine
>>>>>>>>>> the warm-up moment.
>>>>>>>>>>
>>>>>>>>>> I think ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>> Predicate<String>, long, TimeUnit) could be changed to attach a
>>>>>>>>>> LinePump to stderr as well, if there is no serious reason for
>>>>>>>>>> restricting the warm-up analysis to the stdout stream.
>>>>>>>>>>
>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I've consulted with the Serviceability engineers (added them to
>>>>>>>>>>> the CC list) and for now they would like the tests to solve this
>>>>>>>>>>> problem as follows:
>>>>>>>>>>>
>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Katja
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>>>> Guys.
>>>>>>>>>>>>
>>>>>>>>>>>> Let me try to sum up what was said before and maybe suggest a
>>>>>>>>>>>> compromise.
>>>>>>>>>>>>
>>>>>>>>>>>> 1. There is a desire to have a support port allocation on the
>>>>>>>>>>>> level of
>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that
>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it
>>>>>>>>>>>> is a
>>>>>>>>>>>> test harness API or a library API does not really matter from
>>>>>>>>>>>> usage
>>>>>>>>>>>> point of view.
>>>>>>>>>>>>
>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable,
>>>>>>>>>>>> whatever
>>>>>>>>>>>> port
>>>>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>>>>> perform
>>>>>>>>>>>> the test logic with different ports until the test succeeds.
>>>>>>>>>>>>
>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of
>>>>>>>>>>>> course,
>>>>>>>>>>>> but
>>>>>>>>>>>> better be used in conjunction with a meaningful port selection
>>>>>>>>>>>> algorithm.
>>>>>>>>>>>>
>>>>>>>>>>>> At the same time, copying a loop-until-success logic from one
>>>>>>>>>>>> test to
>>>>>>>>>>>> another may be not the best solution. Library could help with
>>>>>>>>>>>> that I
>>>>>>>>>>>> believe. There only need to be an API method which takes
>>>>>>>>>>>> behavior as a
>>>>>>>>>>>> parameter and run it until it succeeds. Something like:
>>>>>>>>>>>> public <T> runOnAFreePort(Function<T, Integer>)
>>>>>>>>>>>> or similar. There could be arguments of how/whether to
>>>>>>>>>>>> implement
>>>>>>>>>>>> it,
>>>>>>>>>>>> the solution would not work for shell tests, etc, but still ...
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>>>>> agreement in
>>>>>>>>>>>> the library, etc.
>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate the
>>>>>>>>>>>> changes.
>>>>>>>>>>>>
>>>>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>>>>
>>>>>>>>>>>> I personally prefer #1 for the reason that the changes already
>>>>>>>>>>>> supposed to make the tests more stable and also there are many
>>>>>>>>>>>> more
>>>>>>>>>>>> tests which use ports, so the scope of the problem is
>>>>>>>>>>>> bigger
>>>>>>>>>>>> than these.
>>>>>>>>>>>>
>>>>>>>>>>>> Shura
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort()
>>>>>>>>>>>>> does not
>>>>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>>>>> process.
>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However,
>>>>>>>>>>>>> there is a
>>>>>>>>>>>>> solution.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please, look at the *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>>>>> start a
>>>>>>>>>>>>> process with a free port and then check if
>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has
>>>>>>>>>>>>> been
>>>>>>>>>>>>> thrown.
>>>>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Whatever logic is chosen to select a free port, it is the
>>>>>>>>>>>>>> library's responsibility to implement it, wouldn't you agree?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we
>>>>>>>>>>>>>> could do
>>>>>>>>>>>>>> it later in the library.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the
>>>>>>>>>>>>>>>> port is
>>>>>>>>>>>>>>>> free.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Moreover, the port returned by getFreePort()[1] remains
>>>>>>>>>>>>>>>> inaccessible for some time. How long depends on the system
>>>>>>>>>>>>>>>> setup; take a look at the discussions around SO_REUSEPORT for
>>>>>>>>>>>>>>>> Linux or SO_REUSEADDR and SO_LINGER for BSD.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So from a stability point of view it's better to just return a
>>>>>>>>>>>>>>>> random number between 49152 and 65535.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When
>>>>>>>>>>>>>>> there are
>>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port,
>>>>>>>>>>>>>>> nothing
>>>>>>>>>>>>>>> prevents the random function to return the same port to
>>>>>>>>>>>>>>> all of
>>>>>>>>>>>>>>> them.
>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and
>>>>>>>>>>>>>>> cause
>>>>>>>>>>>>>>> problems with timing when a port used by a previous test is
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>> fully
>>>>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry
>>>>>>>>>>>>>>> pointed out
>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it
>>>>>>>>>>>>>>> later
>>>>>>>>>>>>>>> there
>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated
>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>> remain
>>>>>>>>>>>>>>> unallocated also for the next few milliseconds.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating
>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>> provided
>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> chance
>>>>>>>>>>>>>>> of intermittent failures due to a port being in use.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> public static int getFreePort() throws InterruptedException, IOException {
>>>>>>>>>>>>>>>>     int port = -1;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     while (port <= 0) {
>>>>>>>>>>>>>>>>         Thread.sleep(100);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         ServerSocket serverSocket = null;
>>>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>>>             serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>>>>             port = serverSocket.getLocalPort();
>>>>>>>>>>>>>>>>         } finally {
>>>>>>>>>>>>>>>>             serverSocket.close();
>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     return port;
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> fyi, the jdk.testlibrary.Utils.getFreePort() method will
>>>>>>>>>>>>>>>>> open a free socket, close it and return the port number.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have the
>>>>>>>>>>>>>>>>> system
>>>>>>>>>>>>>>>>> assign
>>>>>>>>>>>>>>>>> the port #.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> *The only* correct way to take a really free port is:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 1. Choose a random number between 49152 and 65535
>>>>>>>>>>>>>>>>>> 2. Open a socket
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> if the socket fails - repeat step 1
>>>>>>>>>>>>>>>>>> if the socket is OK - return the *socket*
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass the
>>>>>>>>>>>>>>>>>> port number as a property value) you shouldn't do any
>>>>>>>>>>>>>>>>>> pre-check, as it has no value: as soon as you close the
>>>>>>>>>>>>>>>>>> socket someone can take the port.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So just choose a random number within the range above and
>>>>>>>>>>>>>>>>>> let the networking code that opens the socket handle the
>>>>>>>>>>>>>>>>>> port conflict.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova & Jaroslav
>>>>>>>>>>>>>>>>>>>   Bachorik, some shell tests have been converted to Java-based
>>>>>>>>>>>>>>>>>>>   tests.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - The PasswordFilePermissionTest & SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>>>>>   tests looked very similar, so a common parent class was
>>>>>>>>>>>>>>>>>>>   created for them: AbstractFilePermissionTest.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've renamed to
>>>>>>>>>>>>>>>>>>>   RmiRegistrySslTestApp.java. The Java code that replaces the
>>>>>>>>>>>>>>>>>>>   old shell script RmiRegistrySslTest.sh is called
>>>>>>>>>>>>>>>>>>>   RmiRegistrySslTest.java, hence the huge diff.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines similar to
>>>>>>>>>>>>>>>>>>>   AbstractFilePermissionTest.java; I nevertheless decided not
>>>>>>>>>>>>>>>>>>>   to complicate the code further and to leave it as is. Please
>>>>>>>>>>>>>>>>>>>   let me know if this is somehow not acceptable.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java, which is added to the
>>>>>>>>>>>>>>>>>>>   hotspot repository, is taken from this patch:
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - These tests will need additional changes once the test
>>>>>>>>>>>>>>>>>>>   library process tools support command line options
>>>>>>>>>>>>>>>>>>>   inheritance
>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> With best regards,
>>>>>>>>>> Taras Ledkov
>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> With best regards,
>>>>>>> Taras Ledkov
>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>> skype: taras_ledkov
>>>>>>> Phone: 7(812)3346-157
>>>>>>
>>>>>
>>>>
>>>
>>
>
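
For reference, the two port-selection approaches discussed in the quoted
thread above (pick a random port in 49152-65535, retry on failure and hand
back the open socket; or bind to port 0 and let the system assign one) could
look roughly like this in Java. This is only a sketch: the class name
FreePortSketch, the method names and the retry limit are assumptions made
for illustration and are not part of jdk.testlibrary.Utils.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, for illustration only.
final class FreePortSketch {
    static final int MIN_PORT = 49152;   // start of the dynamic/private port range
    static final int MAX_PORT = 65535;

    /**
     * Picks a random port in [MIN_PORT, MAX_PORT] and tries to bind to it.
     * On failure another port is tried, up to maxAttempts times.
     * The caller gets the open socket, not just a port number, so nobody
     * else can grab the port between the check and the use.
     */
    static ServerSocket openOnRandomPort(int maxAttempts) throws IOException {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e;                // port busy or not bindable: try another one
            }
        }
        throw new IOException("no free port after " + maxAttempts + " attempts", last);
    }

    /** Binds to port 0 so the operating system assigns a free port. */
    static ServerSocket openOnSystemAssignedPort() throws IOException {
        return new ServerSocket(0);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket random = openOnRandomPort(50);
             ServerSocket assigned = openOnSystemAssignedPort()) {
            System.out.println("random port:   " + random.getLocalPort());
            System.out.println("assigned port: " + assigned.getLocalPort());
        }
    }
}
```

Both methods return the socket itself. If only a port number can be passed
on (for example as a property value), the code that finally opens the socket
still has to handle a conflict, as noted in the thread.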


From daniel.daugherty at oracle.com  Tue Feb 25 08:03:36 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 25 Feb 2014 09:03:36 -0700
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <53081C1F.2070008@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
Message-ID: <530CBED8.4050509@oracle.com>

Ping! Still haven't heard from anyone on this backport...

Dan


On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> This is a code review request for the JDK8u-hs-dev backport of the
> following ObjectMonitor-JVM/TI hang fix:
>
>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>     https://bugs.openjdk.java.net/browse/JDK-8028073
>
> Here is the JDK8u-hs-dev webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>
> This is _almost_ a straightforward backport of the JDK9 fix. The only
> difference to the fix was discussed at the end of the JDK9 review and
> was determined to only be needed in versions of HotSpot without the
> fix for 8028280:
>
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 
>
>
> 8028280 has not yet been backported to JDK8u-hs-dev.
>
> The easiest way to review the backport is to download the two patch
> files from the webrevs and compare them with something like:
>
>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>
> The same testing has been performed on the JDK8u-hs-dev version as
> with the JDK9-hs-runtime version.
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
> > Greetings,
> >
> > I have a fix ready for the following bug:
> >
> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
> >     https://bugs.openjdk.java.net/browse/JDK-8028073
> >
> > On the surface, this is a very simple fix that relocates a few lines of
> > code, relocates and rewrites the comments associated with that code and
> > adds several new comments.
> >
> > Of course, in reality, the issue is much more complicated, but I'm
> > hoping to make it easy for anyone not acquainted with this issue to
> > understand what's going on.
> >
> > Here are the JDK9 webrev URLs:
> >
> > OpenJDK:
> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
> >
> > Oracle internal:
> > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
> >
> > The simple summary:
> >
> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
> >   it is possible for a JVM/TI monitor event handler to accidentally
> >   consume a ParkEvent.unpark() call meant for the Java Monitor layer
> > - the original code fix was made on 2005.07.04 using this bug ID:
> >   https://bugs.openjdk.java.net/browse/JDK-5030359
> > - it's the right fix, but it's in the wrong place
> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
> >   event handler is called because it is that event handler
> >   that can cause the hang
> >
> >
> > Testing
> > -------
> >
> > - a new StressMonitorWait test has been created that reliably
> >   reproduces the hang in JDK[6789]; see the bug's gory details
> >   for the specific versions where the hang has been reproduced
> >   - the test reliably reproduces the hang in 5 seconds on my
> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
> >     the hang reliably on other machines
> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
> >     bits did not reproduce the hang
> > - JPRT test job
> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
> >   and Windows-X64:
> >   - vm.quick
> >   - Kitchensink (bigapps)
> >   - Weblogic+medrec (bigapps)
> >   - runThese (bigapps)
> >
> >
> > The Gory Details Start Here
> > ---------------------------
> >
> > This is the old location of the block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1499    exit (true, Self) ;                    // exit the monitor
> > <snip>
> > 1513    if (node._notified != 0 && _succ == Self) {
> > 1514       node._event->unpark();
> > 1515    }
> >
> >
> > This is the new location of the block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
> > <snip>
> > 1604        if (node._notified != 0 && _succ == Self) {
> > <snip>
> > 1620          node._event->unpark();
> > 1621        }
> >
> >
> > The Risks
> > ---------
> >
> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
> >   is enabled:
> >   - previously it was always executed
> >   - while the old code was not effective for the hang that is being
> >     fixed with this bug, it is possible that the old code prevented
> >     a different bug in the successor protocol from manifesting
> >   - thorough analysis of the successor protocol did not reveal a
> >     case where the old code was needed in the old location
> > - Thorough analysis indicates that the other JVM/TI monitor events
> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
> >   - the successor protocol is complicated and the analysis could
> >     be wrong when certain options are used
> >   - comments were added to each location where a JVM/TI monitor
> >     event handler is called documenting why a fix like this one
> >     is not needed there
> >   - if the analysis is wrong, the new comments show where a new
> >     code change would be needed
> >
> >
> > The Scenario
> > ------------
> >
> > I've created a scenario that reproduces this hang:
> >
> > T1 - enters monitor and calls monitor.wait()
> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
> > T3 - enters and exits the monitor
> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
> >
> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
> >
> > Here are the six events necessary to make this hang happen:
> >
> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
> >
> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
> >
> > // KEY-EVENT-3a: T3 made T1 the successor
> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
> >
> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
> >
> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
> >
> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
> >
> >
> > This bug is intertwined with:
> >
> > - The ObjectMonitor successor protocol
> > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
> >
> > There is a very long successor.notes attachment to JDK-8028073 that
> > attempts to describe the ObjectMonitor successor protocol. It's good
> > for putting pretty much anyone to sleep.
> >
> > Since this hang reproduces back to JDK6, this bug is taking the easily
> > backported solution of moving the original fix to the right location.
> > The following new bug has been filed for possible future work in this
> > area by the Serviceability Team:
> >
> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
> >     https://bugs.openjdk.java.net/browse/JDK-8033399
> >
> >
> > The Symptoms
> > ------------
> >
> > With intermittent hangs like this, it is useful to know what to look
> > for in order to determine if you are running into this issue:
> >
> > - if you aren't using a debugger or a profiler or some other
> >   JVM/TI agent, then this hang is not the same as yours
> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
> >   event handler, then this hang is not the same as yours
> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
> >   JVM/TI RawMonitors, then this hang is not the same as yours
> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
> >   back into Java code, then you might just be insane and this
> >   hang might be similar to yours. However, using a Java callback
> >   in an event handler is an even bigger problem/risk so fix that
> >   first.
> > - if you have one or more threads blocked like this and making no
> >   progress, then this hang might be the same as yours:
> >
> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at java.lang.Object.wait(Native Method)
> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >         at java.lang.Object.wait(Object.java:502)
> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > Key symptoms in thread T1:
> >
> > - had the object locked:
> >
> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > - did an Object.wait():
> >
> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > - is blocked on reentry:
> >
> >   waiting for monitor entry [0xfffffd7fc0231000]
> >
> > Key symptoms in thread T2:
> >
> > - is blocked waiting to lock the object:
> >
> >   waiting for monitor entry [0xfffffd7fc0130000]
> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > Key symptoms in thread T3:
> >
> > - is blocked waiting to lock the object:
> >
> >   waiting for monitor entry [0xfffffd7fc002f000]
> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
>
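
The "consumed unpark()" idea from the simple summary in the quoted message
can be illustrated with a small Java analogy. Here the single per-thread
permit of java.util.concurrent.locks.LockSupport stands in for the
ParkEvent shared by Java Monitors and JVM/TI RawMonitors. This is an
assumption-laden sketch, not HotSpot's ParkEvent code: the class name
SharedPermitSketch and the timeouts are made up, and timed parks are used
so the demo ends instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

// Analogy only: LockSupport's per-thread permit plays the role of the shared ParkEvent.
final class SharedPermitSketch {
    /**
     * Returns {firstParkNanos, secondParkNanos}. The first park plays the
     * "event handler" role, the second the "monitor reentry" role.
     */
    static long[] demo(long timeoutMillis) {
        long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);

        // A wakeup is posted for this thread (think: unpark() by the exiting thread).
        LockSupport.unpark(Thread.currentThread());

        // An unrelated wait on the same thread finds the permit and consumes it.
        long t0 = System.nanoTime();
        LockSupport.parkNanos(timeoutNanos);
        long first = System.nanoTime() - t0;

        // The wait the wakeup was meant for now finds no permit. Only the
        // timeout (or a spurious return) gets the thread going again.
        long t1 = System.nanoTime();
        LockSupport.parkNanos(timeoutNanos);
        long second = System.nanoTime() - t1;

        return new long[] {first, second};
    }

    public static void main(String[] args) {
        long[] r = demo(200);
        System.out.printf("first park: %d us, second park: %d us%n",
                r[0] / 1000, r[1] / 1000);
    }
}
```

With an untimed park() in the second position and nobody left to call
unpark(), the thread would stay parked, which is the shape of the hang
described in KEY-EVENT-6.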
>
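
The four thread roles from "The Scenario" in the quoted message can be
sketched in plain Java as below. This is not the StressMonitorWait test
from the bug report: the class name MonitorScenarioSketch, the latch used
to order T1 and T2, and the short T4 delay are assumptions for
illustration. Without a JVM/TI agent that handles
JVMTI_EVENT_MONITOR_WAITED with a RawMonitor wait, this code is expected
to finish normally; it only shows who does what.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the T1..T4 roles; it does not reproduce the hang by itself.
final class MonitorScenarioSketch {
    private final Object monitor = new Object();
    private final CountDownLatch t1InMonitor = new CountDownLatch(1);
    private boolean notified;                        // guarded by monitor

    /** Runs the four roles once; returns true if every thread finished in time. */
    static boolean runOnce(long t4DelayMillis, long timeoutMillis) {
        MonitorScenarioSketch s = new MonitorScenarioSketch();
        Thread[] all = {
            new Thread(s::waitOnMonitor, "T1"),
            new Thread(s::notifyWaiter, "T2"),
            new Thread(s::enterAndExit, "T3"),
            new Thread(() -> s.enterDelayExit(t4DelayMillis), "T4"),
        };
        for (Thread t : all) { t.setDaemon(true); t.start(); }
        try {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            for (Thread t : all) {
                long leftMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (leftMs > 0) t.join(leftMs);      // join(0) would wait forever
                if (t.isAlive()) return false;       // a stuck thread: report it
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // T1: enters the monitor and calls monitor.wait()
    private void waitOnMonitor() {
        synchronized (monitor) {
            t1InMonitor.countDown();                 // T2 may proceed once we wait()
            try {
                while (!notified) monitor.wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // T2: enters the monitor, calls monitor.notify() and exits the monitor
    private void notifyWaiter() {
        try {
            t1InMonitor.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        synchronized (monitor) { notified = true; monitor.notify(); }
    }

    // T3: enters and exits the monitor
    private void enterAndExit() {
        synchronized (monitor) { /* nothing to do while holding the monitor */ }
    }

    // T4: enters the monitor, delays, exits the monitor
    private void enterDelayExit(long delayMillis) {
        synchronized (monitor) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("all threads finished: " + runOnce(50, 10_000));
    }
}
```

The quoted scenario uses a 5 second delay in T4; the delay is a parameter
here so a single run stays short.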


From ron.durbin at oracle.com  Tue Feb 25 12:00:07 2014
From: ron.durbin at oracle.com (Ron Durbin)
Date: Tue, 25 Feb 2014 12:00:07 -0800 (PST)
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <530CBED8.4050509@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<530CBED8.4050509@oracle.com>
Message-ID: <1a31f034-a44c-4c0c-b350-ea626ed81147@default>

Code looks good.
Thx for the local in-person review.

Thx Ron


From serguei.spitsyn at oracle.com  Tue Feb 25 12:43:41 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 25 Feb 2014 12:43:41 -0800
Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(),
	"cur_stack_depth out of sync")
Message-ID: <530D007D.4040602@oracle.com>

Please, review the fix for:
   https://bugs.openjdk.java.net/browse/JDK-6471769


Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1

Summary:

   This is another Test Stabilization issue.
   The fix is very similar to other JVMTI stabilization fixes:
   it uses safepoints for updating the PopFrame data instead of relying
   on the suspend equivalent condition mechanism
   (JvmtiEnv::is_thread_fully_suspended()), which is not adequate from
   the reliability point of view.

Testing:
   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi


Thanks,
Serguei


From daniel.daugherty at oracle.com  Tue Feb 25 13:51:32 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 25 Feb 2014 14:51:32 -0700
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <1a31f034-a44c-4c0c-b350-ea626ed81147@default>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<530CBED8.4050509@oracle.com>
	<1a31f034-a44c-4c0c-b350-ea626ed81147@default>
Message-ID: <530D1064.6080304@oracle.com>

Thanks Ron!

Dan


On 2/25/14 1:00 PM, Ron Durbin wrote:
> Code looks good.
> Thx for the local in person review.
>
> Thx Ron
>
>> -----Original Message-----
>> From: Daniel D. Daugherty
>> Sent: Tuesday, February 25, 2014 9:04 AM
>> To: David Holmes; Serguei Spitsyn; Dave Dice; Karen Kinnear
>> Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix (8028073)
>>
>> Ping! Still haven't heard from anyone on this backport...
>>
>> Dan
>>
>>
>> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> This is a code review request for the JDK8u-hs-dev backport of the
>>> following ObjectMonitor-JVM/TI hang fix:
>>>
>>>      8028073 race condition in ObjectMonitor implementation causing deadlocks
>>>      https://bugs.openjdk.java.net/browse/JDK-8028073
>>>
>>> Here is the JDK8u-hs-dev webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>>>
>>> This is _almost_ a straightforward backport of the JDK9 fix. The only
>>> difference to the fix was discussed at the end of the JDK9 review and
>>> was determined to only be needed in versions of HotSpot without the
>>> fix for 8028280:
>>>
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html
>>>
>>>
>>> 8028280 has not yet been backported to JDK8u-hs-dev.
>>>
>>> The easiest way to review the backport is to download the two patch
>>> files from the webrevs and compare them with something like:
>>>
>>>      jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>>>
>>> The same testing has been performed on the JDK8u-hs-dev version as
>>> with the JDK9-hs-runtime version.
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a fix ready for the following bug:
>>>>
>>>>      8028073 race condition in ObjectMonitor implementation causing deadlocks
>>>>      https://bugs.openjdk.java.net/browse/JDK-8028073
>>>>
>>>> On the surface, this is a very simple fix that relocates a few lines of
>>>> code, relocates and rewrites the comments associated with that code and
>>>> adds several new comments.
>>>>
>>>> Of course, in reality, the issue is much more complicated, but I'm
>>>> hoping to make it easy for anyone not acquainted with this issue to
>>>> understand what's going on.
>>>>
>>>> Here are the JDK9 webrev URLs:
>>>>
>>>> OpenJDK:
>>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>>
>>>> Oracle internal:
>>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>>>
>>>> The simple summary:
>>>>
>>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>>    it is possible for a JVM/TI monitor event handler to accidentally
>>>>    consume a ParkEvent.unpark() call meant for Java Monitor layer
>>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>>    https://bugs.openjdk.java.net/browse/JDK-5030359
>>>> - it's the right fix, but it's in the wrong place
>>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>>    event handler is called because it is that event handler
>>>>    that can cause the hang
>>>>
>>>>
>>>> Testing
>>>> -------
>>>>
>>>> - a new StressMonitorWait test has been created that reliably
>>>>    reproduces the hang in JDK[6789]; see the bug's gory details
>>>>    for the specific versions where the hang has been reproduced
>>>>    - the test reliably reproduces the hang in 5 seconds on my
>>>>      T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>>      the hang reliably on other machines
>>>>    - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>>      Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>>      bits did not reproduce the hang
>>>> - JPRT test job
>>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>>    Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>>    and Windows-X64:
>>>>    - vm.quick
>>>>    - Kitchensink (bigapps)
>>>>    - Weblogic+medrec (bigapps)
>>>>    - runThese (bigapps)
>>>>
>>>>
>>>> The Gory Details Start Here
>>>> ---------------------------
>>>>
>>>> This is the old location of block of code that's being moved:
>>>>
>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>
>>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>> <snip>
>>>> 1499    exit (true, Self) ;                    // exit the monitor
>>>> <snip>
>>>> 1513    if (node._notified != 0 && _succ == Self) {
>>>> 1514       node._event->unpark();
>>>> 1515    }
>>>>
>>>>
>>>> This is the new location of block of code that's being moved:
>>>>
>>>> src/share/vm/runtime/objectMonitor.cpp:
>>>>
>>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>>> <snip>
>>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>>> <snip>
>>>> 1604        if (node._notified != 0 && _succ == Self) {
>>>> <snip>
>>>> 1620          node._event->unpark();
>>>> 1621        }
>>>>
>>>>
>>>> The Risks
>>>> ---------
>>>>
>>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>>    is enabled:
>>>>    - previously it was always executed
>>>>    - while the old code was not effective for the hang that is being
>>>>      fixed with this bug, it is possible that the old code prevented
>>>>      a different bug in the successor protocol from manifesting
>>>>    - thorough analysis of the successor protocol did not reveal a
>>>>      case where the old code was needed in the old location
>>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>>    do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>>    - the successor protocol is complicated and the analysis could
>>>>      be wrong when certain options are used
>>>>    - comments were added to each location where a JVM/TI monitor
>>>>      event handler is called documenting why a fix like this one
>>>>      is not needed there
>>>>    - if the analysis is wrong, the new comments show where a new
>>>>      code change would be needed
>>>>
>>>>
>>>> The Scenario
>>>> ------------
>>>>
>>>> I've created a scenario that reproduces this hang:
>>>>
>>>> T1 - enters monitor and calls monitor.wait()
>>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>>> T3 - enters and exits the monitor
>>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>>
>>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>>>
>>>> Here are the six events necessary to make this hang happen:
>>>>
>>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>>>
>>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>>
>>>> // KEY-EVENT-3a: T3 made T1 the successor
>>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>>
>>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>>
>>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>>
>>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>>>
>>>> This bug is intertwined with:
>>>>
>>>> - The ObjectMonitor successor protocol
>>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>>>
>>>> There is a very long successor.notes attachment to JDK-8028073 that
>>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>>> for putting pretty much anyone to sleep.
>>>>
>>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>>> backported solution of moving the original fix to the right location.
>>>> The following new bug has been filed for possible future work in this
>>>> area by the Serviceability Team:
>>>>
>>>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>>      https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>
>>>>
>>>> The Symptoms
>>>> ------------
>>>>
>>>> With intermittent hangs like this, it is useful to know what to look
>>>> for in order to determine if you are running into this issue:
>>>>
>>>> - if you aren't using a debugger or a profiler or some other
>>>>    JVM/TI agent, then this hang is not the same as yours
>>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>>    event handler, then this hang is not the same as yours
>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>>    JVM/TI RawMonitors, then this hang is not the same as yours
>>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>>    back into Java code, then you might just be insane and this
>>>>    hang might be similar to yours. However, using a Java callback
>>>>    in an event handler is an even bigger problem/risk so fix that
>>>>    first.
>>>> - if you have one or more threads blocked like this and making no
>>>>    progress, then this hang might be the same as yours:
>>>>
>>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>>     JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>     JavaThread state: _thread_blocked
>>>>          at java.lang.Object.wait(Native Method)
>>>>          - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>          at java.lang.Object.wait(Object.java:502)
>>>>          at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>>          - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>>     JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>     JavaThread state: _thread_blocked
>>>>          at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>>     JavaThread state: _thread_blocked
>>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>>     JavaThread state: _thread_blocked
>>>>          at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> Key symptoms in thread T1:
>>>>
>>>> - had the object locked:
>>>>
>>>>    locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> - did an Object.wait():
>>>>
>>>>    waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> - is blocked on reentry:
>>>>
>>>>    waiting for monitor entry [0xfffffd7fc0231000]
>>>>
>>>> Key symptoms in thread T2:
>>>>
>>>> - is blocked waiting to lock the object:
>>>>
>>>>    waiting for monitor entry [0xfffffd7fc0130000]
>>>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>>
>>>> Key symptoms in thread T3:
>>>>
>>>> - is blocked waiting to lock the object:
>>>>
>>>>    waiting for monitor entry [0xfffffd7fc002f000]
>>>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>>


From serguei.spitsyn at oracle.com  Tue Feb 25 13:54:49 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 25 Feb 2014 13:54:49 -0800
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <530CBED8.4050509@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<530CBED8.4050509@oracle.com>
Message-ID: <530D1129.4060605@oracle.com>

Will review it today.
You know, your fix is tricky. :)

Thanks,
Serguei

On 2/25/14 8:03 AM, Daniel D. Daugherty wrote:
> Ping! Still haven't heard from anyone on this backport...
>
> Dan
>
>
> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> This is a code review request for the JDK8u-hs-dev backport of the
>> following ObjectMonitor-JVM/TI hang fix:
>>
>>     8028073 race condition in ObjectMonitor implementation causing 
>> deadlocks
>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>
>> Here is the JDK8u-hs-dev webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>>
>> This is _almost_ a straightforward backport of the JDK9 fix. The only
>> difference to the fix was discussed at the end of the JDK9 review and
>> was determined to only be needed in versions of HotSpot without the
>> fix for 8028280:
>>
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 
>>
>>
>> 8028280 has not yet been backported to JDK8u-hs-dev.
>>
>> The easiest way to review the backport is to download the two patch
>> files from the webrevs and compare them with something like:
>>
>>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>>
>> The same testing has been performed on the JDK8u-hs-dev version as
>> with the JDK9-hs-runtime version.
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
>> > Greetings,
>> >
>> > I have a fix ready for the following bug:
>> >
>> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
>> >     https://bugs.openjdk.java.net/browse/JDK-8028073
>> >
>> > On the surface, this is a very simple fix that relocates a few lines of
>> > code, relocates and rewrites the comments associated with that code and
>> > adds several new comments.
>> >
>> > Of course, in reality, the issue is much more complicated, but I'm
>> > hoping to make it easy for anyone not acquainted with this issue to
>> > understand what's going on.
>> >
>> > Here are the JDK9 webrev URLs:
>> >
>> > OpenJDK:
>> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>> >
>> > Oracle internal:
>> > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>> >
>> > The simple summary:
>> >
>> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>> >   it is possible for a JVM/TI monitor event handler to accidentally
>> >   consume a ParkEvent.unpark() call meant for Java Monitor layer
>> > - the original code fix was made on 2005.07.04 using this bug ID:
>> >   https://bugs.openjdk.java.net/browse/JDK-5030359
>> > - it's the right fix, but it's in the wrong place
>> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>> >   event handler is called because it is that event handler
>> >   that can cause the hang
>> >
>> >
>> > Testing
>> > -------
>> >
>> > - a new StressMonitorWait test has been created that reliably
>> >   reproduces the hang in JDK[6789]; see the bug's gory details
>> >   for the specific versions where the hang has been reproduced
>> >   - the test reliably reproduces the hang in 5 seconds on my
>> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>> >     the hang reliably on other machines
>> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>> >     bits did not reproduce the hang
>> > - JPRT test job
>> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>> >   and Windows-X64:
>> >   - vm.quick
>> >   - Kitchensink (bigapps)
>> >   - Weblogic+medrec (bigapps)
>> >   - runThese (bigapps)
>> >
>> >
>> > The Gory Details Start Here
>> > ---------------------------
>> >
>> > This is the old location of block of code that's being moved:
>> >
>> > src/share/vm/runtime/objectMonitor.cpp:
>> >
>> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> > <snip>
>> > 1499    exit (true, Self) ;                    // exit the monitor
>> > <snip>
>> > 1513    if (node._notified != 0 && _succ == Self) {
>> > 1514       node._event->unpark();
>> > 1515    }
>> >
>> >
>> > This is the new location of block of code that's being moved:
>> >
>> > src/share/vm/runtime/objectMonitor.cpp:
>> >
>> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> > <snip>
>> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
>> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>> > <snip>
>> > 1604        if (node._notified != 0 && _succ == Self) {
>> > <snip>
>> > 1620          node._event->unpark();
>> > 1621        }
>> >
>> >
>> > The Risks
>> > ---------
>> >
>> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>> >   is enabled:
>> >   - previously it was always executed
>> >   - while the old code was not effective for the hang that is being
>> >     fixed with this bug, it is possible that the old code prevented
>> >     a different bug in the successor protocol from manifesting
>> >   - thorough analysis of the successor protocol did not reveal a
>> >     case where the old code was needed in the old location
>> > - Thorough analysis indicates that the other JVM/TI monitor events
>> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>> >   - the successor protocol is complicated and the analysis could
>> >     be wrong when certain options are used
>> >   - comments were added to each location where a JVM/TI monitor
>> >     event handler is called documenting why a fix like this one
>> >     is not needed there
>> >   - if the analysis is wrong, the new comments show where a new
>> >     code change would be needed
>> >
>> >
>> > The Scenario
>> > ------------
>> >
>> > I've created a scenario that reproduces this hang:
>> >
>> > T1 - enters monitor and calls monitor.wait()
>> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
>> > T3 - enters and exits the monitor
>> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
>> >
>> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>> >
>> > Here are the six events necessary to make this hang happen:
>> >
>> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>> >
>> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>> >
>> > // KEY-EVENT-3a: T3 made T1 the successor
>> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>> >
>> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>> >
>> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>> >
>> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>> >
>> >
>> > This bug is intertwined with:
>> >
>> > - The ObjectMonitor successor protocol
>> > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>> >
>> > There is a very long successor.notes attachment to JDK-8028073 that
>> > attempts to describe the ObjectMonitor successor protocol. It's good
>> > for putting pretty much anyone to sleep.
>> >
>> > Since this hang reproduces back to JDK6, this bug is taking the easily
>> > backported solution of moving the original fix to the right location.
>> > The following new bug has been filed for possible future work in this
>> > area by the Serviceability Team:
>> >
>> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>> >     https://bugs.openjdk.java.net/browse/JDK-8033399
>> >
>> >
>> > The Symptoms
>> > ------------
>> >
>> > With intermittent hangs like this, it is useful to know what to look
>> > for in order to determine if you are running into this issue:
>> >
>> > - if you aren't using a debugger or a profiler or some other
>> >   JVM/TI agent, then this hang is not the same as yours
>> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>> >   event handler, then this hang is not the same as yours
>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>> >   JVM/TI RawMonitors, then this hang is not the same as yours
>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>> >   back into Java code, then you might just be insane and this
>> >   hang might be similar to yours. However, using a Java callback
>> >   in an event handler is an even bigger problem/risk so fix that
>> >   first.
>> > - if you have one or more threads blocked like this and making no
>> >   progress, then this hang might be the same as yours:
>> >
>> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at java.lang.Object.wait(Native Method)
>> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >         at java.lang.Object.wait(Object.java:502)
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > Key symptoms in thread T1:
>> >
>> > - had the object locked:
>> >
>> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > - did an Object.wait():
>> >
>> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > - is blocked on reentry:
>> >
>> >   waiting for monitor entry [0xfffffd7fc0231000]
>> >
>> > Key symptoms in thread T2:
>> >
>> > - is blocked waiting to lock the object:
>> >
>> >   waiting for monitor entry [0xfffffd7fc0130000]
>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > Key symptoms in thread T3:
>> >
>> > - is blocked waiting to lock the object:
>> >
>> >   waiting for monitor entry [0xfffffd7fc002f000]
>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>>
>>
>
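
For anyone who wants the lost-wakeup mechanism behind the six KEY-EVENTs in
a smaller form, here is a minimal single-threaded sketch. It is not HotSpot
code: PermitEvent, Waiter, handler_raw_monitor_wait() and
reenter_after_wait() are hypothetical names, and the permit is modeled as a
plain bool so the outcome is deterministic. It only tries to show why
re-posting the unpark() after the event handler matters when the Java
monitor layer and the raw monitor layer share one event.

```cpp
#include <cassert>

// Hypothetical stand-in for a one-permit park/unpark event (not ParkEvent).
struct PermitEvent {
  bool permit = false;
  void unpark() { permit = true; }  // at most one permit is ever stored
  // Returns true if a permit was pending (a real park() would return at
  // once); false means a real park() would block.
  bool try_park() {
    bool had = permit;
    permit = false;
    return had;
  }
};

// Hypothetical waiter state; one event is shared by both monitor layers.
struct Waiter {
  PermitEvent event;
  bool notified = false;
  bool is_successor = false;
};

// Models an event handler doing a timed raw-monitor wait on the shared
// event: it drains whatever permit is pending.
void handler_raw_monitor_wait(Waiter& w) { w.event.try_park(); }

// Returns true if the waiter's next park would find a permit (no hang).
// repost_after_handler == false models the old placement, where nothing
// re-posts the permit once the handler has run.
bool reenter_after_wait(Waiter& w, bool repost_after_handler) {
  handler_raw_monitor_wait(w);
  if (repost_after_handler && w.notified && w.is_successor) {
    w.event.unpark();  // re-post the wakeup the handler may have consumed
  }
  return w.event.try_park();
}

// State from the scenario: the waiter was notified, and an exiting thread
// made it the successor and unparked it.
Waiter make_notified_successor() {
  Waiter w;
  w.notified = true;
  w.is_successor = true;
  w.event.unpark();
  return w;
}

void check_scenario() {
  Waiter old_order = make_notified_successor();
  assert(!reenter_after_wait(old_order, false));  // permit eaten: would hang
  Waiter new_order = make_notified_successor();
  assert(reenter_after_wait(new_order, true));    // permit re-posted: proceeds
}
```

The sketch leaves out T4 and the quick exit path entirely; it only covers
the part where the handler eats the permit that the successor was relying on.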


From daniel.daugherty at oracle.com  Tue Feb 25 14:57:20 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 25 Feb 2014 15:57:20 -0700
Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(),
	"cur_stack_depth out of sync")
In-Reply-To: <530D007D.4040602@oracle.com>
References: <530D007D.4040602@oracle.com>
Message-ID: <530D1FD0.8070308@oracle.com>

On 2/25/14 1:43 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>   https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>

src/share/vm/runtime/vm_operations.hpp
     No comments.

src/share/vm/prims/jvmtiEnvBase.hpp
     No comments.

src/share/vm/prims/jvmtiEnv.cpp
     No comments.

src/share/vm/prims/jvmtiEnvThreadState.cpp
     No comments.

src/share/vm/prims/jvmtiThreadState.cpp
     line 66:   _cur_stack_depth = UNKNOWN_STACK_DEPTH;
         This looks like the key piece of this fix with respect to the
         assert() in the bug report. I suspect that the first call to
         JvmtiThreadState::cur_stack_depth() is racing with another
         thread that happens to do something else that inits or sets
         _cur_stack_depth to an acceptable value.

     line 251:     "must be current thread or at safepont");
     line 284:     "must be current thread or at safepont");
         typo: 'safepont' -> 'safepoint'

Thumbs up! No need to re-review the typo fixes.

Dan


>
> Summary:
>
>   This is another Test Stabilization issue.
>   The fix is very similar to other JVMTI stabilization fixes.
>   It is to use safepoints for updating the PopFrame data instead of
>   relying on the suspend equivalent condition mechanism
>   (JvmtiEnv::is_thread_fully_suspended()), which is not adequate from
>   the reliability point of view.
>
> Testing:
>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>
>
> Thanks,
> Serguei
>
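
The role of a sentinel such as UNKNOWN_STACK_DEPTH in the review comment
above can be sketched roughly as follows. This is an illustration only,
assuming a cached depth that is computed lazily on first use; DepthCache,
kUnknownDepth and the count_frames callback are hypothetical names and this
is not the jvmtiThreadState.cpp code.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical cache of a thread's stack depth with an "unknown" sentinel.
class DepthCache {
 public:
  static constexpr int kUnknownDepth = -1;  // hypothetical sentinel value

  explicit DepthCache(std::function<int()> count_frames)
      : count_frames_(std::move(count_frames)) {}

  // Forget the cached value; the next depth() call recounts the frames.
  void invalidate() { depth_ = kUnknownDepth; }

  // Incremental updates are applied only once a real value is cached, so
  // an unknown depth can never drift out of sync with the real stack.
  void on_frame_push() { if (depth_ != kUnknownDepth) ++depth_; }
  void on_frame_pop()  { if (depth_ != kUnknownDepth) --depth_; }

  // Lazily computes the depth on first use after construction/invalidate().
  int depth() {
    if (depth_ == kUnknownDepth) depth_ = count_frames_();
    return depth_;
  }

 private:
  std::function<int()> count_frames_;
  int depth_ = kUnknownDepth;
};

void check_depth_cache() {
  int real_frames = 3;
  DepthCache cache([&real_frames] { return real_frames; });
  cache.on_frame_push();           // ignored: depth is still unknown
  real_frames = 4;
  assert(cache.depth() == 4);      // first use counts the real frames
  cache.on_frame_pop();
  assert(cache.depth() == 3);      // tracked incrementally from now on
}
```

Starting from the sentinel means that the first reader always counts the
frames itself instead of trusting a value that another thread may or may
not have set yet.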


From Karen.Kinnear at Oracle.com  Tue Feb 25 15:11:10 2014
From: Karen.Kinnear at Oracle.com (Karen Kinnear)
Date: Tue, 25 Feb 2014 18:11:10 -0500
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <53081C1F.2070008@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
Message-ID: <8100537C-485E-4699-926B-CB1A4BF36DA7@Oracle.com>

Dan,

Code looks good.  This makes sense to me. Thank you for the detailed comments and testing.

thanks,
Karen

p.s. sorry - you would think getting three copies of the review request would mean
I would not completely overlook this in my emails :-)

On Feb 21, 2014, at 10:40 PM, Daniel D. Daugherty wrote:

> Greetings,
> 
> This is a code review request for the JDK8u-hs-dev backport of the
> following ObjectMonitor-JVM/TI hang fix:
> 
>    8028073 race condition in ObjectMonitor implementation causing deadlocks
>    https://bugs.openjdk.java.net/browse/JDK-8028073
> 
> Here is the JDK8u-hs-dev webrev URL:
> 
> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
> 
> This is _almost_ a straightforward backport of the JDK9 fix. The only
> difference to the fix was discussed at the end of the JDK9 review and
> was determined to only be needed in versions of HotSpot without the
> fix for 8028280:
> 
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html
> 
> 8028280 has not yet been backported to JDK8u-hs-dev.
> 
> The easiest way to review the backport is to download the two patch
> files from the webrevs and compare them with something like:
> 
>    jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
> 
> The same testing has been performed on the JDK8u-hs-dev version as
> with the JDK9-hs-runtime version.
> 
> Thanks, in advance, for any comments, questions or suggestions.
> 
> Dan
> 
> 
> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
> > Greetings,
> >
> > I have a fix ready for the following bug:
> >
> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
> >     https://bugs.openjdk.java.net/browse/JDK-8028073
> >
> > On the surface, this is a very simple fix that relocates a few lines of
> > code, relocates and rewrites the comments associated with that code and
> > adds several new comments.
> >
> > Of course, in reality, the issue is much more complicated, but I'm
> > hoping to make it easy for anyone not acquainted with this issue to
> > understand what's going on.
> >
> > Here are the JDK9 webrev URLs:
> >
> > OpenJDK:
> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
> >
> > Oracle internal:
> > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
> >
> > The simple summary:
> >
> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
> >   it is possible for a JVM/TI monitor event handler to accidentally
> >   consume a ParkEvent.unpark() call meant for Java Monitor layer
> > - the original code fix was made on 2005.07.04 using this bug ID:
> >   https://bugs.openjdk.java.net/browse/JDK-5030359
> > - it's the right fix, but it's in the wrong place
> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
> >   event handler is called because it is that event handler
> >   that can cause the hang
> >
> >
> > Testing
> > -------
> >
> > - a new StressMonitorWait test has been created that reliably
> >   reproduces the hang in JDK[6789]; see the bug's gory details
> >   for the specific versions where the hang has been reproduced
> >   - the test reliably reproduces the hang in 5 seconds on my
> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
> >     the hang reliably on other machines
> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
> >     bits did not reproduce the hang
> > - JPRT test job
> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
> >   and Windows-X64:
> >   - vm.quick
> >   - Kitchensink (bigapps)
> >   - Weblogic+medrec (bigapps)
> >   - runThese (bigapps)
> >
> >
> > The Gory Details Start Here
> > ---------------------------
> >
> > This is the old location of block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1499    exit (true, Self) ;                    // exit the monitor
> > <snip>
> > 1513    if (node._notified != 0 && _succ == Self) {
> > 1514       node._event->unpark();
> > 1515    }
> >
> >
> > This is the new location of block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
> > <snip>
> > 1604        if (node._notified != 0 && _succ == Self) {
> > <snip>
> > 1620          node._event->unpark();
> > 1621        }
> >
> >
> > The Risks
> > ---------
> >
> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
> >   is enabled:
> >   - previously it was always executed
> >   - while the old code was not effective for the hang that is being
> >     fixed with this bug, it is possible that the old code prevented
> >     a different bug in the successor protocol from manifesting
> >   - thorough analysis of the successor protocol did not reveal a
> >     case where the old code was needed in the old location
> > - Thorough analysis indicates that the other JVM/TI monitor events
> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
> >   - the successor protocol is complicated and the analysis could
> >     be wrong when certain options are used
> >   - comments were added to each location where a JVM/TI monitor
> >     event handler is called documenting why a fix like this one
> >     is not needed there
> >   - if the analysis is wrong, the new comments show where a new
> >     code change would be needed
> >
> >
> > The Scenario
> > ------------
> >
> > I've created a scenario that reproduces this hang:
> >
> > T1 - enters monitor and calls monitor.wait()
> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
> > T3 - enters and exits the monitor
> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
> >
> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
> >
> > Here are the six events necessary to make this hang happen:
> >
> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
> >
> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
> >
> > // KEY-EVENT-3a: T3 made T1 the successor
> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
> >
> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
> >
> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
> >
> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
> >
> >
> > This bug is intertwined with:
> >
> > - The ObjectMonitor successor protocol
> > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
> >
> > There is a very long successor.notes attachment to JDK-8028073 that
> > attempts to describe the ObjectMonitor successor protocol. It's good
> > for putting pretty much anyone to sleep.
> >
> > Since this hang reproduces back to JDK6, this bug is taking the easily
> > backported solution of moving the original fix to the right location.
> > The following new bug has been filed for possible future work in this
> > area by the Serviceability Team:
> >
> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
> >     https://bugs.openjdk.java.net/browse/JDK-8033399
> >
> >
> > The Symptoms
> > ------------
> >
> > With intermittent hangs like this, it is useful to know what to look
> > for in order to determine if you are running into this issue:
> >
> > - if you aren't using a debugger or a profiler or some other
> >   JVM/TI agent, then this hang is not the same as yours
> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
> >   event handler, then this hang is not the same as yours
> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
> >   JVM/TI RawMonitors, then this hang is not the same as yours
> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
> >   back into Java code, then you might just be insane and this
> >   hang might be similar to yours. However, using a Java callback
> >   in an event handler is an even bigger problem/risk so fix that
> >   first.
> > - if you have one or more threads blocked like this and making no
> >   progress, then this hang might be the same as yours:
> >
> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at java.lang.Object.wait(Native Method)
> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >         at java.lang.Object.wait(Object.java:502)
> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > Key symptoms in thread T1:
> >
> > - had the object locked:
> >
> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > - did an Object.wait():
> >
> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > - is blocked on reentry:
> >
> >   waiting for monitor entry [0xfffffd7fc0231000]
> >
> > Key symptoms in thread T2:
> >
> > - is blocked waiting to lock the object:
> >
> >   waiting for monitor entry [0xfffffd7fc0130000]
> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > Key symptoms in thread T3:
> >
> > - is blocked waiting to lock the object:
> >
> >   waiting for monitor entry [0xfffffd7fc002f000]
> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> 


From daniel.daugherty at oracle.com  Tue Feb 25 15:26:16 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 25 Feb 2014 16:26:16 -0700
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <8100537C-485E-4699-926B-CB1A4BF36DA7@Oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<8100537C-485E-4699-926B-CB1A4BF36DA7@Oracle.com>
Message-ID: <530D2698.9030503@oracle.com>

On 2/25/14 4:11 PM, Karen Kinnear wrote:
> Dan,
>
> Code looks good.  This makes sense to me. Thank you for the detailed comments and testing.

Thanks for the review!


> thanks,
> Karen
>
> p.s. sorry - you would think getting three copies of the review request would mean
> I would not completely overlook this in my emails :-)

No worries. I'm using the backports as fillers for my other tasks...
sometimes I need a break from the Win7 DBI... :-)

Dan


> On Feb 21, 2014, at 10:40 PM, Daniel D. Daugherty wrote:
>
>> Greetings,
>>
>> This is a code review request for the JDK8u-hs-dev backport of the
>> following ObjectMonitor-JVM/TI hang fix:
>>
>>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>
>> Here is the JDK8u-hs-dev webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>>
>> This is _almost_ a straightforward backport of the JDK9 fix. The only
>> difference to the fix was discussed at the end of the JDK9 review and
>> was determined to only be needed in versions of HotSpot without the
>> fix for 8028280:
>>
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html
>>
>> 8028280 has not yet been backported to JDK8u-hs-dev.
>>
>> The easiest way to review the backport is to download the two patch
>> files from the webrevs and compare them with something like:
>>
>>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>>
>> The same testing has been performed on the JDK8u-hs-dev version as
>> with the JDK9-hs-runtime version.
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a fix ready for the following bug:
>>>
>>>      8028073 race condition in ObjectMonitor implementation causing deadlocks
>>>      https://bugs.openjdk.java.net/browse/JDK-8028073
>>>
>>> On the surface, this is a very simple fix that relocates a few lines of
>>> code, relocates and rewrites the comments associated with that code and
>>> adds several new comments.
>>>
>>> Of course, in reality, the issue is much more complicated, but I'm
>>> hoping to make it easy for anyone not acquainted with this issue to
>>> understand what's going on.
>>>
>>> Here are the JDK9 webrev URLs:
>>>
>>> OpenJDK:
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> Oracle internal:
>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>>
>>> The simple summary:
>>>
>>> - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>>    it is possible for a JVM/TI monitor event handler to accidentally
>>>    consume a ParkEvent.unpark() call meant for Java Monitor layer
>>> - the original code fix was made on 2005.07.04 using this bug ID:
>>>    https://bugs.openjdk.java.net/browse/JDK-5030359
>>> - it's the right fix, but it's in the wrong place
>>> - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>>    event handler is called because it is that event handler
>>>    that can cause the hang
>>>
>>>
>>> Testing
>>> -------
>>>
>>> - a new StressMonitorWait test has been created that reliably
>>>    reproduces the hang in JDK[6789]; see the bug's gory details
>>>    for the specific versions where the hang has been reproduced
>>>    - the test reliably reproduces the hang in 5 seconds on my
>>>      T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>>      the hang reliably on other machines
>>>    - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>>      Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>>      bits did not reproduce the hang
>>> - JPRT test job
>>> - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>>    Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>>    and Windows-X64:
>>>    - vm.quick
>>>    - Kitchensink (bigapps)
>>>    - Weblogic+medrec (bigapps)
>>>    - runThese (bigapps)
>>>
>>>
>>> The Gory Details Start Here
>>> ---------------------------
>>>
>>> This is the old location of block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> <snip>
>>> 1499    exit (true, Self) ;                    // exit the monitor
>>> <snip>
>>> 1513    if (node._notified != 0 && _succ == Self) {
>>> 1514       node._event->unpark();
>>> 1515    }
>>>
>>>
>>> This is the new location of block of code that's being moved:
>>>
>>> src/share/vm/runtime/objectMonitor.cpp:
>>>
>>> 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> <snip>
>>> 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>> 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>> <snip>
>>> 1604        if (node._notified != 0 && _succ == Self) {
>>> <snip>
>>> 1620          node._event->unpark();
>>> 1621        }
>>>
>>>
>>> The Risks
>>> ---------
>>>
>>> - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>>    is enabled:
>>>    - previously it was always executed
>>>    - while the old code was not effective for the hang that is being
>>>      fixed with this bug, it is possible that the old code prevented
>>>      a different bug in the successor protocol from manifesting
>>>    - thorough analysis of the successor protocol did not reveal a
>>>      case where the old code was needed in the old location
>>> - Thorough analysis indicates that the other JVM/TI monitor events
>>>    do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>>    - the successor protocol is complicated and the analysis could
>>>      be wrong when certain options are used
>>>    - comments were added to each location where a JVM/TI monitor
>>>      event handler is called documenting why a fix like this one
>>>      is not needed there
>>>    - if the analysis is wrong, the new comments show where a new
>>>      code change would be needed
>>>
>>>
>>> The Scenario
>>> ------------
>>>
>>> I've created a scenario that reproduces this hang:
>>>
>>> T1 - enters monitor and calls monitor.wait()
>>> T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>> T3 - enters and exits the monitor
>>> T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>>
>>> A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>> handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>>
>>> Here are the six events necessary to make this hang happen:
>>>
>>> // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>> // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>>
>>> // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>> // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>>
>>> // KEY-EVENT-3a: T3 made T1 the successor
>>> // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>>
>>> // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>> // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>>
>>> // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>> // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>>
>>> // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>> // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>> // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>>
>>>
>>> This bug is intertwined with:
>>>
>>> - The ObjectMonitor successor protocol
>>> - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>>
>>> There is a very long successor.notes attachment to JDK-8028073 that
>>> attempts to describe the ObjectMonitor successor protocol. It's good
>>> for putting pretty much anyone to sleep.
>>>
>>> Since this hang reproduces back to JDK6, this bug is taking the easily
>>> backported solution of moving the original fix to the right location.
>>> The following new bug has been filed for possible future work in this
>>> area by the Serviceability Team:
>>>
>>>      8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>>      https://bugs.openjdk.java.net/browse/JDK-8033399
>>>
>>>
>>> The Symptoms
>>> ------------
>>>
>>> With intermittent hangs like this, it is useful to know what to look
>>> for in order to determine if you are running into this issue:
>>>
>>> - if you aren't using a debugger or a profiler or some other
>>>    JVM/TI agent, then this hang is not the same as yours
>>> - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>>    event handler, then this hang is not the same as yours
>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>>    JVM/TI RawMonitors, then this hang is not the same as yours
>>> - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>>    back into Java code, then you might just be insane and this
>>>    hang might be similar to yours. However, using a Java callback
>>>    in an event handler is an even bigger problem/risk so fix that
>>>    first.
>>> - if you have one or more threads blocked like this and making no
>>>    progress, then this hang might be the same as yours:
>>>
>>> "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>     JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>     JavaThread state: _thread_blocked
>>>          at java.lang.Object.wait(Native Method)
>>>          - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>          at java.lang.Object.wait(Object.java:502)
>>>          at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>>          - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>     JavaThread state: _thread_blocked
>>> Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>     JavaThread state: _thread_blocked
>>>          at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>>     JavaThread state: _thread_blocked
>>> Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>     JavaThread state: _thread_blocked
>>>          at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>>          - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T1:
>>>
>>> - had the object locked:
>>>
>>>    locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - did an Object.wait():
>>>
>>>    waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> - is blocked on reentry:
>>>
>>>    waiting for monitor entry [0xfffffd7fc0231000]
>>>
>>> Key symptoms in thread T2:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>    waiting for monitor entry [0xfffffd7fc0130000]
>>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>> Key symptoms in thread T3:
>>>
>>> - is blocked waiting to lock the object:
>>>
>>>    waiting for monitor entry [0xfffffd7fc002f000]
>>>    waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)


From daniel.daugherty at oracle.com  Tue Feb 25 15:26:43 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 25 Feb 2014 16:26:43 -0700
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <530D1129.4060605@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<530CBED8.4050509@oracle.com> <530D1129.4060605@oracle.com>
Message-ID: <530D26B3.6000702@oracle.com>

On 2/25/14 2:54 PM, serguei.spitsyn at oracle.com wrote:
> Will review it today.
> You know, your fix is tricky. :)

Thanks. I will wait for your review.

Dan


>
> Thanks,
> Serguei
>
> On 2/25/14 8:03 AM, Daniel D. Daugherty wrote:
>> Ping! Still haven't heard from anyone on this backport...
>>
>> Dan
>>
>>
>> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> This is a code review request for the JDK8u-hs-dev backport of the
>>> following ObjectMonitor-JVM/TI hang fix:
>>>
>>>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>>
>>> Here is the JDK8u-hs-dev webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>>>
>>> This is _almost_ a straightforward backport of the JDK9 fix. The only
>>> difference to the fix was discussed at the end of the JDK9 review and
>>> was determined to only be needed in versions of HotSpot without the
>>> fix for 8028280:
>>>
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 
>>>
>>>
>>> 8028280 has not yet been backported to JDK8u-hs-dev.
>>>
>>> The easiest way to review the backport is to download the two patch
>>> files from the webrevs and compare them with something like:
>>>
>>>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>>>
>>> The same testing has been performed on the JDK8u-hs-dev version as
>>> with the JDK9-hs-runtime version.
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
>>> > Greetings,
>>> >
>>> > I have a fix ready for the following bug:
>>> >
>>> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>> >     https://bugs.openjdk.java.net/browse/JDK-8028073
>>> >
>>> > On the surface, this is a very simple fix that relocates a few lines of
>>> > code, relocates and rewrites the comments associated with that code and
>>> > adds several new comments.
>>> >
>>> > Of course, in reality, the issue is much more complicated, but I'm
>>> > hoping to make it easy for anyone not acquainted with this issue to
>>> > understand what's going on.
>>> >
>>> > Here are the JDK9 webrev URLs:
>>> >
>>> > OpenJDK:
>>> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>> >
>>> > Oracle internal:
>>> > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>> >
>>> > The simple summary:
>>> >
>>> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>> >   it is possible for a JVM/TI monitor event handler to accidentally
>>> >   consume a ParkEvent.unpark() call meant for Java Monitor layer
>>> > - the original code fix was made on 2005.07.04 using this bug ID:
>>> >   https://bugs.openjdk.java.net/browse/JDK-5030359
>>> > - it's the right fix, but it's in the wrong place
>>> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>> >   event handler is called because it is that event handler
>>> >   that can cause the hang
>>> >
>>> >
>>> > Testing
>>> > -------
>>> >
>>> > - a new StressMonitorWait test has been created that reliably
>>> >   reproduces the hang in JDK[6789]; see the bug's gory details
>>> >   for the specific versions where the hang has been reproduced
>>> >   - the test reliably reproduces the hang in 5 seconds on my
>>> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>> >     the hang reliably on other machines
>>> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>> >     bits did not reproduce the hang
>>> > - JPRT test job
>>> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>> >   and Windows-X64:
>>> >   - vm.quick
>>> >   - Kitchensink (bigapps)
>>> >   - Weblogic+medrec (bigapps)
>>> >   - runThese (bigapps)
>>> >
>>> >
>>> > The Gory Details Start Here
>>> > ---------------------------
>>> >
>>> > This is the old location of block of code that's being moved:
>>> >
>>> > src/share/vm/runtime/objectMonitor.cpp:
>>> >
>>> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> > <snip>
>>> > 1499    exit (true, Self) ;                    // exit the monitor
>>> > <snip>
>>> > 1513    if (node._notified != 0 && _succ == Self) {
>>> > 1514       node._event->unpark();
>>> > 1515    }
>>> >
>>> >
>>> > This is the new location of block of code that's being moved:
>>> >
>>> > src/share/vm/runtime/objectMonitor.cpp:
>>> >
>>> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> > <snip>
>>> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>> > <snip>
>>> > 1604        if (node._notified != 0 && _succ == Self) {
>>> > <snip>
>>> > 1620          node._event->unpark();
>>> > 1621        }
>>> >
>>> >
>>> > The Risks
>>> > ---------
>>> >
>>> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>> >   is enabled:
>>> >   - previously it was always executed
>>> >   - while the old code was not effective for the hang that is being
>>> >     fixed with this bug, it is possible that the old code prevented
>>> >     a different bug in the successor protocol from manifesting
>>> >   - thorough analysis of the successor protocol did not reveal a
>>> >     case where the old code was needed in the old location
>>> > - Thorough analysis indicates that the other JVM/TI monitor events
>>> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>> >   - the successor protocol is complicated and the analysis could
>>> >     be wrong when certain options are used
>>> >   - comments were added to each location where a JVM/TI monitor
>>> >     event handler is called documenting why a fix like this one
>>> >     is not needed there
>>> >   - if the analysis is wrong, the new comments show where a new
>>> >     code change would be needed
>>> >
>>> >
>>> > The Scenario
>>> > ------------
>>> >
>>> > I've created a scenario that reproduces this hang:
>>> >
>>> > T1 - enters monitor and calls monitor.wait()
>>> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>> > T3 - enters and exits the monitor
>>> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>> >
>>> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>> >
>>> > Here are the six events necessary to make this hang happen:
>>> >
>>> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>> >
>>> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>> >
>>> > // KEY-EVENT-3a: T3 made T1 the successor
>>> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>> >
>>> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>> >
>>> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>> >
>>> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>> >
>>> >
>>> > This bug is intertwined with:
>>> >
>>> > - The ObjectMonitor successor protocol
>>> > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>> >
>>> > There is a very long successor.notes attachment to JDK-8028073 that
>>> > attempts to describe the ObjectMonitor successor protocol. It's good
>>> > for putting pretty much anyone to sleep.
>>> >
>>> > Since this hang reproduces back to JDK6, this bug is taking the easily
>>> > backported solution of moving the original fix to the right location.
>>> > The following new bug has been filed for possible future work in this
>>> > area by the Serviceability Team:
>>> >
>>> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>> >     https://bugs.openjdk.java.net/browse/JDK-8033399
>>> >
>>> >
>>> > The Symptoms
>>> > ------------
>>> >
>>> > With intermittent hangs like this, it is useful to know what to look
>>> > for in order to determine if you are running into this issue:
>>> >
>>> > - if you aren't using a debugger or a profiler or some other
>>> >   JVM/TI agent, then this hang is not the same as yours
>>> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>> >   event handler, then this hang is not the same as yours
>>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>> >   JVM/TI RawMonitors, then this hang is not the same as yours
>>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>> >   back into Java code, then you might just be insane and this
>>> >   hang might be similar to yours. However, using a Java callback
>>> >   in an event handler is an even bigger problem/risk so fix that
>>> >   first.
>>> > - if you have one or more threads blocked like this and making no
>>> >   progress, then this hang might be the same as yours:
>>> >
>>> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at java.lang.Object.wait(Native Method)
>>> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >         at java.lang.Object.wait(Object.java:502)
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > Key symptoms in thread T1:
>>> >
>>> > - had the object locked:
>>> >
>>> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > - did an Object.wait():
>>> >
>>> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > - is blocked on reentry:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc0231000]
>>> >
>>> > Key symptoms in thread T2:
>>> >
>>> > - is blocked waiting to lock the object:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc0130000]
>>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > Key symptoms in thread T3:
>>> >
>>> > - is blocked waiting to lock the object:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc002f000]
>>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Tue Feb 25 15:35:54 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 25 Feb 2014 15:35:54 -0800
Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(),
	"cur_stack_depth out of sync")
In-Reply-To: <530D1FD0.8070308@oracle.com>
References: <530D007D.4040602@oracle.com> <530D1FD0.8070308@oracle.com>
Message-ID: <530D28DA.1070308@oracle.com>

On 2/25/14 2:57 PM, Daniel D. Daugherty wrote:
> On 2/25/14 1:43 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>
>
> src/share/vm/runtime/vm_operations.hpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnvBase.hpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnv.cpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnvThreadState.cpp
>     No comments.
>
> src/share/vm/prims/jvmtiThreadState.cpp
>     line 66:   _cur_stack_depth = UNKNOWN_STACK_DEPTH;
>         This looks like the key piece of this fix with respect to the
>         assert() in the bug report. I suspect that the first call to
>         JvmtiThreadState::cur_stack_depth() is racing with another
>         thread that happens to do something else that inits or sets
>         _cur_stack_depth to an acceptable value.

Another potential cause of the issue is that the cur_stack_depth() is 
called from the
update_for_pop_top_frame() which used to be called under suspend 
equivalent condition.
The update_for_pop_top_frame() has been changed to be called at a 
safepoint now.
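
To make the role of the new initial value clearer, here is a rough C++ sketch
of a lazily computed, cached stack depth. It is a simplified model under
stated assumptions and not the code in the webrev: ToyThreadState,
push_frame() and invalidate() are hypothetical names, the sentinel value is
assumed, and the real cur_stack_depth() may check more conditions than shown
here.

```cpp
#include <cassert>

// Hypothetical, simplified model. Only UNKNOWN_STACK_DEPTH, _cur_stack_depth,
// cur_stack_depth() and count_frames() are names taken from this thread.
class ToyThreadState {
 public:
  enum { UNKNOWN_STACK_DEPTH = -99 };  // the sentinel value is an assumption

  explicit ToyThreadState(int frames)
      : _frames(frames), _cur_stack_depth(UNKNOWN_STACK_DEPTH) {}

  // Stands in for a real stack walk of the target thread.
  int count_frames() const { return _frames; }

  // Compute the depth on first use; afterwards check the cached value.
  int cur_stack_depth() {
    if (_cur_stack_depth == UNKNOWN_STACK_DEPTH) {
      _cur_stack_depth = count_frames();
    } else {
      assert(_cur_stack_depth == count_frames() &&
             "cur_stack_depth out of sync");
    }
    return _cur_stack_depth;
  }

  // Hypothetical update hook: keeps the cache in step with the stack,
  // but only once the cache holds a real value.
  void push_frame() {
    ++_frames;
    if (_cur_stack_depth != UNKNOWN_STACK_DEPTH) {
      ++_cur_stack_depth;
    }
  }

  // Hypothetical reset: forces the next query to recount the frames.
  void invalidate() { _cur_stack_depth = UNKNOWN_STACK_DEPTH; }

 private:
  int _frames;           // ground truth in this toy model
  int _cur_stack_depth;  // cached copy, or UNKNOWN_STACK_DEPTH
};
```

Starting from the sentinel means the first query always recounts instead of
comparing against a stale or uninitialized value, so the assert can only fire
when an update was really missed.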

>
>     line 251:     "must be current thread or at safepont");
>     line 284:     "must be current thread or at safepont");
>         typo: 'safepont' -> 'safepoint'
>
> Thumbs up! No need to re-review the typo fixes.

Sure. I'll fix it before pushing.

Thanks, Dan!
Serguei

>
> Dan
>
>
>>
>> Summary:
>>
>>   This is another Test Stabilization issue.
>>   The fix is very similar to other JVMTI stabilization fixes.
>>   It is to use safepoints for updating the PopFrame data instead of 
>> relying on the
>>   suspend equivalent condition mechanism 
>> (JvmtiEnv::is_thread_fully_suspended())
>>   which is not adequate from the reliability point of view.
>>
>> Testing:
>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>>
>


From serguei.spitsyn at oracle.com  Tue Feb 25 16:20:40 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 25 Feb 2014 16:20:40 -0800
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <530CBED8.4050509@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<530CBED8.4050509@oracle.com>
Message-ID: <530D3358.5090304@oracle.com>

Dan,

The fix looks good to me.
I like the comments.
They help to understand this aspect of the protocol.

Thanks,
Serguei


On 2/25/14 8:03 AM, Daniel D. Daugherty wrote:
> Ping! Still haven't heard from anyone on this backport...
>
> Dan
>
>
> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> This is a code review request for the JDK8u-hs-dev backport of the
>> following ObjectMonitor-JVM/TI hang fix:
>>
>>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>
>> Here is the JDK8u-hs-dev webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>>
>> This is _almost_ a straightforward backport of the JDK9 fix. The only
>> difference to the fix was discussed at the end of the JDK9 review and
>> was determined to only be needed in versions of HotSpot without the
>> fix for 8028280:
>>
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 
>>
>>
>> 8028280 has not yet been backported to JDK8u-hs-dev.
>>
>> The easiest way to review the backport is to download the two patch
>> files from the webrevs and compare them with something like:
>>
>>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>>
>> The same testing has been performed on the JDK8u-hs-dev version as
>> with the JDK9-hs-runtime version.
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
>> > Greetings,
>> >
>> > I have a fix ready for the following bug:
>> >
>> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
>> >     https://bugs.openjdk.java.net/browse/JDK-8028073
>> >
>> > On the surface, this is a very simple fix that relocates a few lines of
>> > code, relocates and rewrites the comments associated with that code and
>> > adds several new comments.
>> >
>> > Of course, in reality, the issue is much more complicated, but I'm
>> > hoping to make it easy for anyone not acquainted with this issue to
>> > understand what's going on.
>> >
>> > Here are the JDK9 webrev URLs:
>> >
>> > OpenJDK:
>> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>> >
>> > Oracle internal:
>> > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>> >
>> > The simple summary:
>> >
>> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>> >   it is possible for a JVM/TI monitor event handler to accidentally
>> >   consume a ParkEvent.unpark() call meant for the Java Monitor layer
>> > - the original code fix was made on 2005.07.04 using this bug ID:
>> >   https://bugs.openjdk.java.net/browse/JDK-5030359
>> > - it's the right fix, but it's in the wrong place
>> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>> >   event handler is called because it is that event handler
>> >   that can cause the hang
>> >
>> >
>> > Testing
>> > -------
>> >
>> > - a new StressMonitorWait test has been created that reliably
>> >   reproduces the hang in JDK[6789]; see the bug's gory details
>> >   for the specific versions where the hang has been reproduced
>> >   - the test reliably reproduces the hang in 5 seconds on my
>> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>> >     the hang reliably on other machines
>> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>> >     bits did not reproduce the hang
>> > - JPRT test job
>> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>> >   and Windows-X64:
>> >   - vm.quick
>> >   - Kitchensink (bigapps)
>> >   - Weblogic+medrec (bigapps)
>> >   - runThese (bigapps)
>> >
>> >
>> > The Gory Details Start Here
>> > ---------------------------
>> >
>> > This is the old location of the block of code that's being moved:
>> >
>> > src/share/vm/runtime/objectMonitor.cpp:
>> >
>> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> > <snip>
>> > 1499    exit (true, Self) ;                    // exit the monitor
>> > <snip>
>> > 1513    if (node._notified != 0 && _succ == Self) {
>> > 1514       node._event->unpark();
>> > 1515    }
>> >
>> >
>> > This is the new location of the block of code that's being moved:
>> >
>> > src/share/vm/runtime/objectMonitor.cpp:
>> >
>> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> > <snip>
>> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
>> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>> > <snip>
>> > 1604        if (node._notified != 0 && _succ == Self) {
>> > <snip>
>> > 1620          node._event->unpark();
>> > 1621        }
>> >
>> >
>> > The Risks
>> > ---------
>> >
>> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>> >   is enabled:
>> >   - previously it was always executed
>> >   - while the old code was not effective for the hang that is being
>> >     fixed with this bug, it is possible that the old code prevented
>> >     a different bug in the successor protocol from manifesting
>> >   - thorough analysis of the successor protocol did not reveal a
>> >     case where the old code was needed in the old location
>> > - Thorough analysis indicates that the other JVM/TI monitor events
>> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>> >   - the successor protocol is complicated and the analysis could
>> >     be wrong when certain options are used
>> >   - comments were added to each location where a JVM/TI monitor
>> >     event handler is called documenting why a fix like this one
>> >     is not needed there
>> >   - if the analysis is wrong, the new comments show where a new
>> >     code change would be needed
>> >
>> >
>> > The Scenario
>> > ------------
>> >
>> > I've created a scenario that reproduces this hang:
>> >
>> > T1 - enters monitor and calls monitor.wait()
>> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
>> > T3 - enters and exits the monitor
>> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
>> >
>> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>> >
>> > Here are the six events necessary to make this hang happen:
>> >
>> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>> >
>> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>> >
>> > // KEY-EVENT-3a: T3 made T1 the successor
>> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>> >
>> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>> >
>> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>> >
>> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>> >
>> >
>> > This bug is intertwined with:
>> >
>> > - The ObjectMonitor successor protocol
>> > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>> >
>> > There is a very long successor.notes attachment to JDK-8028073 that
>> > attempts to describe the ObjectMonitor successor protocol. It's good
>> > for putting pretty much anyone to sleep.
>> >
>> > Since this hang reproduces back to JDK6, this bug is taking the easily
>> > backported solution of moving the original fix to the right location.
>> > The following new bug has been filed for possible future work in this
>> > area by the Serviceability Team:
>> >
>> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>> >     https://bugs.openjdk.java.net/browse/JDK-8033399
>> >
>> >
>> > The Symptoms
>> > ------------
>> >
>> > With intermittent hangs like this, it is useful to know what to look
>> > for in order to determine if you are running into this issue:
>> >
>> > - if you aren't using a debugger or a profiler or some other
>> >   JVM/TI agent, then this hang is not the same as yours
>> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>> >   event handler, then this hang is not the same as yours
>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>> >   JVM/TI RawMonitors, then this hang is not the same as yours
>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>> >   back into Java code, then you might just be insane and this
>> >   hang might be similar to yours. However, using a Java callback
>> >   in an event handler is an even bigger problem/risk so fix that
>> >   first.
>> > - if you have one or more threads blocked like this and making no
>> >   progress, then this hang might be the same as yours:
>> >
>> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at java.lang.Object.wait(Native Method)
>> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >         at java.lang.Object.wait(Object.java:502)
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > Key symptoms in thread T1:
>> >
>> > - had the object locked:
>> >
>> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > - did an Object.wait():
>> >
>> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > - is blocked on reentry:
>> >
>> >   waiting for monitor entry [0xfffffd7fc0231000]
>> >
>> > Key symptoms in thread T2:
>> >
>> > - is blocked waiting to lock the object:
>> >
>> >   waiting for monitor entry [0xfffffd7fc0130000]
>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > Key symptoms in thread T3:
>> >
>> > - is blocked waiting to lock the object:
>> >
>> >   waiting for monitor entry [0xfffffd7fc002f000]
>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>>
>>
>
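
For anyone skimming the thread: the six key events quoted above can be
replayed in a toy model. This is only a sketch under assumptions, not
HotSpot code. Event, try_park(), waited_handler() and t1_hangs() are
made-up names, the event is reduced to a single boolean permit, and the
node._notified half of the real condition is left out. It shows why
re-issuing the unpark() after the event handler has run avoids the hang.

```cpp
#include <cassert>

// Toy single-permit event, loosely modeled on the ParkEvent idea
// (hypothetical type, not the HotSpot class).
struct Event {
  bool permit = false;
  void unpark() { permit = true; }
  // Consumes a pending permit. Returns false where a real park() would block.
  bool try_park() {
    bool had_permit = permit;
    permit = false;
    return had_permit;
  }
};

// Stand-in for a MONITOR_WAITED handler doing a short timed raw monitor
// wait: the timed wait on the shared event swallows any pending permit.
void waited_handler(Event& shared) { shared.try_park(); }

// Replays the key events once; returns true if T1 ends up parked forever.
bool t1_hangs(bool redo_unpark_after_handler) {
  const int NONE = 0, T1 = 1;
  Event shared;     // one event used by both monitor layers
  int succ = NONE;  // stands in for the _succ field

  // KEY-EVENT-1: T3 exits, makes T1 the successor and unparks it.
  succ = T1;
  shared.unpark();

  // KEY-EVENT-2: the JVM/TI event handler consumes that unpark.
  waited_handler(shared);

  // Relocated fix: after the handler, a thread that is still the
  // successor re-issues the unpark on its own event.
  if (redo_unpark_after_handler && succ == T1) shared.unpark();

  // KEY-EVENT-3..5: T4 takes the monitor, T1's spin leaves _succ alone,
  // and T4's quick exit path unparks nobody.
  assert(succ == T1);

  // KEY-EVENT-6: T1 parks; without a pending permit nothing wakes it.
  return !shared.try_park();
}
```

Calling t1_hangs(false) models the old placement (the unpark is redone
before the handler runs, so it does not help), and t1_hangs(true) models
the new placement after the handler.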


From daniel.daugherty at oracle.com  Tue Feb 25 16:28:25 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 25 Feb 2014 17:28:25 -0700
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <530D3358.5090304@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<530CBED8.4050509@oracle.com> <530D3358.5090304@oracle.com>
Message-ID: <530D3529.9080007@oracle.com>

Thanks for the review!

Dan


On 2/25/14 5:20 PM, serguei.spitsyn at oracle.com wrote:
> Dan,
>
> The fix looks good to me.
> I like the comments.
> They help to understand this aspect of the protocol.
>
> Thanks,
> Serguei
>
>
> On 2/25/14 8:03 AM, Daniel D. Daugherty wrote:
>> Ping! Still haven't heard from anyone on this backport...
>>
>> Dan
>>
>>
>> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> This is a code review request for the JDK8u-hs-dev backport of the
>>> following ObjectMonitor-JVM/TI hang fix:
>>>
>>>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>>
>>> Here is the JDK8u-hs-dev webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>>>
>>> This is _almost_ a straightforward backport of the JDK9 fix. The only
>>> difference to the fix was discussed at the end of the JDK9 review and
>>> was determined to only be needed in versions of HotSpot without the
>>> fix for 8028280:
>>>
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 
>>>
>>>
>>> 8028280 has not yet been backported to JDK8u-hs-dev.
>>>
>>> The easiest way to review the backport is to download the two patch
>>> files from the webrevs and compare them with something like:
>>>
>>>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>>>
>>> The same testing has been performed on the JDK8u-hs-dev version as
>>> with the JDK9-hs-runtime version.
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
>>> > Greetings,
>>> >
>>> > I have a fix ready for the following bug:
>>> >
>>> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>> >     https://bugs.openjdk.java.net/browse/JDK-8028073
>>> >
>>> > On the surface, this is a very simple fix that relocates a few lines of
>>> > code, relocates and rewrites the comments associated with that code and
>>> > adds several new comments.
>>> >
>>> > Of course, in reality, the issue is much more complicated, but I'm
>>> > hoping to make it easy for anyone not acquainted with this issue to
>>> > understand what's going on.
>>> >
>>> > Here are the JDK9 webrev URLs:
>>> >
>>> > OpenJDK:
>>> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>> >
>>> > Oracle internal:
>>> > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>>> >
>>> > The simple summary:
>>> >
>>> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>> >   it is possible for a JVM/TI monitor event handler to accidentally
>>> >   consume a ParkEvent.unpark() call meant for the Java Monitor layer
>>> > - the original code fix was made on 2005.07.04 using this bug ID:
>>> >   https://bugs.openjdk.java.net/browse/JDK-5030359
>>> > - it's the right fix, but it's in the wrong place
>>> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>> >   event handler is called because it is that event handler
>>> >   that can cause the hang
>>> >
>>> >
>>> > Testing
>>> > -------
>>> >
>>> > - a new StressMonitorWait test has been created that reliably
>>> >   reproduces the hang in JDK[6789]; see the bug's gory details
>>> >   for the specific versions where the hang has been reproduced
>>> >   - the test reliably reproduces the hang in 5 seconds on my
>>> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>> >     the hang reliably on other machines
>>> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>> >     bits did not reproduce the hang
>>> > - JPRT test job
>>> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>> >   and Windows-X64:
>>> >   - vm.quick
>>> >   - Kitchensink (bigapps)
>>> >   - Weblogic+medrec (bigapps)
>>> >   - runThese (bigapps)
>>> >
>>> >
>>> > The Gory Details Start Here
>>> > ---------------------------
>>> >
>>> > This is the old location of the block of code that's being moved:
>>> >
>>> > src/share/vm/runtime/objectMonitor.cpp:
>>> >
>>> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> > <snip>
>>> > 1499    exit (true, Self) ;                    // exit the monitor
>>> > <snip>
>>> > 1513    if (node._notified != 0 && _succ == Self) {
>>> > 1514       node._event->unpark();
>>> > 1515    }
>>> >
>>> >
>>> > This is the new location of the block of code that's being moved:
>>> >
>>> > src/share/vm/runtime/objectMonitor.cpp:
>>> >
>>> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> > <snip>
>>> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>> > <snip>
>>> > 1604        if (node._notified != 0 && _succ == Self) {
>>> > <snip>
>>> > 1620          node._event->unpark();
>>> > 1621        }
>>> >
>>> >
>>> > The Risks
>>> > ---------
>>> >
>>> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>>> >   is enabled:
>>> >   - previously it was always executed
>>> >   - while the old code was not effective for the hang that is being
>>> >     fixed with this bug, it is possible that the old code prevented
>>> >     a different bug in the successor protocol from manifesting
>>> >   - thorough analysis of the successor protocol did not reveal a
>>> >     case where the old code was needed in the old location
>>> > - Thorough analysis indicates that the other JVM/TI monitor events
>>> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>> >   - the successor protocol is complicated and the analysis could
>>> >     be wrong when certain options are used
>>> >   - comments were added to each location where a JVM/TI monitor
>>> >     event handler is called documenting why a fix like this one
>>> >     is not needed there
>>> >   - if the analysis is wrong, the new comments show where a new
>>> >     code change would be needed
>>> >
>>> >
>>> > The Scenario
>>> > ------------
>>> >
>>> > I've created a scenario that reproduces this hang:
>>> >
>>> > T1 - enters monitor and calls monitor.wait()
>>> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>> > T3 - enters and exits the monitor
>>> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>> >
>>> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>> >
>>> > Here are the six events necessary to make this hang happen:
>>> >
>>> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>> >
>>> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>> >
>>> > // KEY-EVENT-3a: T3 made T1 the successor
>>> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>> >
>>> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>> >
>>> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>> >
>>> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>> >
>>> >
>>> > This bug is intertwined with:
>>> >
>>> > - The ObjectMonitor successor protocol
>>> > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
>>> >
>>> > There is a very long successor.notes attachment to JDK-8028073 that
>>> > attempts to describe the ObjectMonitor successor protocol. It's good
>>> > for putting pretty much anyone to sleep.
>>> >
>>> > Since this hang reproduces back to JDK6, this bug is taking the easily
>>> > backported solution of moving the original fix to the right location.
>>> > The following new bug has been filed for possible future work in this
>>> > area by the Serviceability Team:
>>> >
>>> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>> >     https://bugs.openjdk.java.net/browse/JDK-8033399
>>> >
>>> >
>>> > The Symptoms
>>> > ------------
>>> >
>>> > With intermittent hangs like this, it is useful to know what to look
>>> > for in order to determine if you are running into this issue:
>>> >
>>> > - if you aren't using a debugger or a profiler or some other
>>> >   JVM/TI agent, then this hang is not the same as yours
>>> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>> >   event handler, then this hang is not the same as yours
>>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>> >   JVM/TI RawMonitors, then this hang is not the same as yours
>>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>> >   back into Java code, then you might just be insane and this
>>> >   hang might be similar to yours. However, using a Java callback
>>> >   in an event handler is an even bigger problem/risk so fix that
>>> >   first.
>>> > - if you have one or more threads blocked like this and making no
>>> >   progress, then this hang might be the same as yours:
>>> >
>>> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at java.lang.Object.wait(Native Method)
>>> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >         at java.lang.Object.wait(Object.java:502)
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > Key symptoms in thread T1:
>>> >
>>> > - had the object locked:
>>> >
>>> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > - did an Object.wait():
>>> >
>>> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > - is blocked on reentry:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc0231000]
>>> >
>>> > Key symptoms in thread T2:
>>> >
>>> > - is blocked waiting to lock the object:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc0130000]
>>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > Key symptoms in thread T3:
>>> >
>>> > - is blocked waiting to lock the object:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc002f000]
>>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>>
>>>
>>
>


From daniel.daugherty at oracle.com  Tue Feb 25 16:48:07 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 25 Feb 2014 17:48:07 -0700
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
In-Reply-To: <ad8bc97a-9d9b-4596-ae95-5a3601622f29@default>
References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>
	<ad8bc97a-9d9b-4596-ae95-5a3601622f29@default>
Message-ID: <530D39C7.6080009@oracle.com>

I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with
JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to()
switch looks like the right thing to do.
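
A minimal sketch of that pairing, assuming made-up names: the Tag enum
and copied_tag() below are hypothetical stand-ins, not the real
JVM_CONSTANT_* definitions or the body of copy_entry_to(). It also
assumes the copy writes the plain, non-error tag, which is a guess made
for illustration only.

```cpp
#include <cassert>

// Hypothetical constant pool tags (illustrative values only).
enum class Tag {
  UnresolvedClass,
  UnresolvedClassInError,
  MethodHandle,
  MethodHandleInError,
  MethodType,
  MethodTypeInError
};

// Returns the tag the destination entry would get. Each "InError" tag
// shares a case with its plain counterpart, so no error tag falls
// through to the should-not-reach-here path.
Tag copied_tag(Tag from) {
  switch (from) {
    case Tag::UnresolvedClass:
    case Tag::UnresolvedClassInError:
      return Tag::UnresolvedClass;
    case Tag::MethodHandle:
    case Tag::MethodHandleInError:
      return Tag::MethodHandle;
    case Tag::MethodType:
    case Tag::MethodTypeInError:
      return Tag::MethodType;
  }
  // Analogue of ShouldNotReachHere() for a tag the switch does not know.
  assert(false && "unexpected constant pool tag");
  return from;
}
```

Without the UnresolvedClassInError label in the first case, that tag
would end up at the trailing assert, which is the shape of the crash
being fixed.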

The usual questions:

- why wasn't this failure mode seen before JDK8?
- was this failure caught somewhere else before JDK8 and changes
   in JDK8 exposed a new code path?

Reasoning about this from a 30,000 foot view, I don't see any reason
why you can't redefine a class that has a constant pool ref that
refers to a class in error. You won't be able to use the error'ed
class, but there's no reason it can't be in there... Or does that
violate the rule that you can't redefine a class that isn't fully
linked (what ever that means...)???

So what does your new test do on JDK7 or JDK6? Just curious...

Dan


On 2/24/14 2:42 AM, Markus Gronlund wrote:
> Hi Staffan,
>
> I would think this is the correct fix.
>
> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp.
>
> In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass -  ConstantPool::resolve_constant_at_impl() for example.
>
> Thanks
> Markus
>
>
> -----Original Message-----
> From: Staffan Larsen
> Sent: den 21 februari 2014 15:11
> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net
> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
>
> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path.
>
> I have also added a test that causes a JVM crash without the fix.
>
> The updates to the test library are all code copied from the jdk version of the test library.
>
> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8035150
>
> Thanks,
> /Staffan


From christian.thalinger at oracle.com  Tue Feb 25 17:55:20 2014
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 25 Feb 2014 17:55:20 -0800
Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(),
	"cur_stack_depth out of sync")
In-Reply-To: <530D007D.4040602@oracle.com>
References: <530D007D.4040602@oracle.com>
Message-ID: <254DB943-0858-49A3-8BF3-3323F9BA4722@oracle.com>

Looks good.

On Feb 25, 2014, at 12:43 PM, serguei.spitsyn at oracle.com wrote:

> Please, review the fix for:
>  https://bugs.openjdk.java.net/browse/JDK-6471769
> 
> 
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1
> 
> Summary:
> 
>  This is another Test Stabilization issue.
>  The fix is very similar to other JVMTI stabilization fixes.
>  It is to use safepoints for updating the PopFrame data instead of relying on the
>  suspend equivalent condition mechanism (JvmtiEnv::is_thread_fully_suspended())
>  which is not adequate from the reliability point of view.
> 
> Testing:
>  In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
> 
> 
> Thanks,
> Serguei
> 


From serguei.spitsyn at oracle.com  Tue Feb 25 18:11:10 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 25 Feb 2014 18:11:10 -0800
Subject: RFR 6471769: Error: assert(_cur_stack_depth == count_frames(),
	"cur_stack_depth out of sync")
In-Reply-To: <254DB943-0858-49A3-8BF3-3323F9BA4722@oracle.com>
References: <530D007D.4040602@oracle.com>
	<254DB943-0858-49A3-8BF3-3323F9BA4722@oracle.com>
Message-ID: <530D4D3E.3030707@oracle.com>

Thanks, Christian!
Serguei

On 2/25/14 5:55 PM, Christian Thalinger wrote:
> Looks good.
>
> On Feb 25, 2014, at 12:43 PM, serguei.spitsyn at oracle.com wrote:
>
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1
>>
>> Summary:
>>
>>   This is another Test Stabilization issue.
>>   The fix is very similar to other JVMTI stabilization fixes.
>>   It is to use safepoints for updating the PopFrame data instead of relying on the
>>   suspend equivalent condition mechanism (JvmtiEnv::is_thread_fully_suspended())
>>   which is not adequate from the reliability point of view.
>>
>> Testing:
>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>>


From staffan.larsen at oracle.com  Wed Feb 26 00:31:59 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Wed, 26 Feb 2014 09:31:59 +0100
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
In-Reply-To: <530D39C7.6080009@oracle.com>
References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>
	<ad8bc97a-9d9b-4596-ae95-5a3601622f29@default>
	<530D39C7.6080009@oracle.com>
Message-ID: <CB246ADC-BDDD-4928-893A-9D2C5D602065@oracle.com>


On 26 feb 2014, at 01:48, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:

> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with
> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to()
> switch looks like the right thing to do.

Good - thanks.

> 
> The usual questions:
> 
> - why wasn't this failure mode seen before JDK8?

No tests for this ? ;)

> - was this failure caught somewhere else before JDK8 and changes
>  in JDK8 exposed a new code path?
> 
> Reasoning about this from a 30,000 foot view, I don't see any reason
> why you can't redefine a class that has a constant pool ref that
> refers to a class in error. You won't be able to use the error'ed
> class, but there's no reason it can't be in there... Or does that
> violate the rule that you can't redefine a class that isn't fully
> linked (what ever that means...)???
> 
> So what does your new test do on JDK7 or JDK6? Just curious...

The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I don't know why it passes on jdk7, do you think it's important to track it down?

/Staffan

> 
> Dan
> 
> 
> On 2/24/14 2:42 AM, Markus Gronlund wrote:
>> Hi Staffan,
>> 
>> I would think this is the correct fix.
>> 
>> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp.
>> 
>> In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass -  ConstantPool::resolve_constant_at_impl() for example.
>> 
>> Thanks
>> Markus
>> 
>> 
>> -----Original Message-----
>> From: Staffan Larsen
>> Sent: den 21 februari 2014 15:11
>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net
>> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
>> 
>> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path.
>> 
>> I have also added a test that causes a JVM crash without the fix.
>> 
>> The updates to the test library are all code copied from the jdk version of the test library.
>> 
>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150
>> 
>> Thanks,
>> /Staffan
> 


From daniel.daugherty at oracle.com  Wed Feb 26 06:03:55 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 26 Feb 2014 07:03:55 -0700
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
In-Reply-To: <CB246ADC-BDDD-4928-893A-9D2C5D602065@oracle.com>
References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>
	<ad8bc97a-9d9b-4596-ae95-5a3601622f29@default>
	<530D39C7.6080009@oracle.com>
	<CB246ADC-BDDD-4928-893A-9D2C5D602065@oracle.com>
Message-ID: <530DF44B.7060209@oracle.com>

On 2/26/14 1:31 AM, Staffan Larsen wrote:
> On 26 feb 2014, at 01:48, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>
>> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with
>> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to()
>> switch looks like the right thing to do.
> Good - thanks.
>
>> The usual questions:
>>
>> - why wasn't this failure mode seen before JDK8?
> No tests for this ? ;)

I should have been more clear... :-) Why hasn't the NetBeans profiler
run into this before? That profiler is a wonderful test for the
RedefineClasses/RetransformClasses stuff...


>
>> - was this failure caught somewhere else before JDK8 and changes
>>   in JDK8 exposed a new code path?
>>
>> Reasoning about this from a 30,000 foot view, I don't see any reason
>> why you can't redefine a class that has a constant pool ref that
>> refers to a class in error. You won't be able to use the error'ed
>> class, but there's no reason it can't be in there... Or does that
>> violate the rule that you can't redefine a class that isn't fully
>> linked (what ever that means...)???
>>
>> So what does your new test do on JDK7 or JDK6? Just curious?
> The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I don't know why it passes on jdk7, do you think it's important to track it down?

The fact that it passes on JDK7 is the useful piece of data.
Figuring out why is much less important. BTW, which JDK7
version? One of the updates or GA/FCS?

Dan


>
> /Staffan
>
>> Dan
>>
>>
>> On 2/24/14 2:42 AM, Markus Gronlund wrote:
>>> Hi Staffan,
>>>
>>> I would think this is the correct fix.
>>>
>>> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp.
>>>
>>> In addition, in other routines in constantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass - ConstantPool::resolve_constant_at_impl() for example.
>>>
>>> Thanks
>>> Markus
>>>
>>>
>>> -----Original Message-----
>>> From: Staffan Larsen
>>> Sent: den 21 februari 2014 15:11
>>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net
>>> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
>>>
>>> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path.
>>>
>>> I have also added a test that causes a JVM crash without the fix.
>>>
>>> The updates to the test library are all code copied from the jdk version of the test library.
>>>
>>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150
>>>
>>> Thanks,
>>> /Staffan


From staffan.larsen at oracle.com  Wed Feb 26 06:15:43 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Wed, 26 Feb 2014 15:15:43 +0100
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
In-Reply-To: <530DF44B.7060209@oracle.com>
References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>
	<ad8bc97a-9d9b-4596-ae95-5a3601622f29@default>
	<530D39C7.6080009@oracle.com>
	<CB246ADC-BDDD-4928-893A-9D2C5D602065@oracle.com>
	<530DF44B.7060209@oracle.com>
Message-ID: <F9F51E64-23F0-4BF8-BF43-B4E2DF560A3F@oracle.com>


On 26 feb 2014, at 15:03, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:

> On 2/26/14 1:31 AM, Staffan Larsen wrote:
>> On 26 feb 2014, at 01:48, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>> 
>>> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with
>>> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to()
>>> switch looks like the right thing to do.
>> Good - thanks.
>> 
>>> The usual questions:
>>> 
>>> - why wasn't this failure mode seen before JDK8?
>> No tests for this ? ;)
> 
> I should have been more clear... :-) Why hasn't the NetBeans profiler
> run into this before? That profiler is a wonderful test for the
> RedefineClasses/RetransformClasses stuff...

Ah, ok. No idea...

> 
> 
>> 
>>> - was this failure caught somewhere else before JDK8 and changes
>>>  in JDK8 exposed a new code path?
>>> 
>>> Reasoning about this from a 30,000 foot view, I don't see any reason
>>> why you can't redefine a class that has a constant pool ref that
>>> refers to a class in error. You won't be able to use the error'ed
>>> class, but there's no reason it can't be in there... Or does that
>>> violate the rule that you can't redefine a class that isn't fully
>>> linked (what ever that means...)???
>>> 
>>> So what does your new test do on JDK7 or JDK6? Just curious?
>> The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I don't know why it passes on jdk7, do you think it's important to track it down?
> 
> The fact that it passes on JDK7 is the useful piece of data.
> Figuring out why is much less important. BTW, which JDK7
> version? One of the updates or GA/FCS?

I used 7u45, but now I tested with 7u4 as well - passes there, too.

Are you ok with pushing the change?

Thanks,
/Staffan

> 
> Dan
> 
> 
>> 
>> /Staffan
>> 
>>> Dan
>>> 
>>> 
>>> On 2/24/14 2:42 AM, Markus Gronlund wrote:
>>>> Hi Staffan,
>>>> 
>>>> I would think this is the correct fix.
>>>> 
>>>> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp.
>>>> 
>>>> In addition, in other routines in constantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass - ConstantPool::resolve_constant_at_impl() for example.
>>>> 
>>>> Thanks
>>>> Markus
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Staffan Larsen
>>>> Sent: den 21 februari 2014 15:11
>>>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net
>>>> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
>>>> 
>>>> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path.
>>>> 
>>>> I have also added a test that causes a JVM crash without the fix.
>>>> 
>>>> The updates to the test library are all code copied from the jdk version of the test library.
>>>> 
>>>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150
>>>> 
>>>> Thanks,
>>>> /Staffan


From taras.ledkov at oracle.com  Wed Feb 26 06:24:28 2014
From: taras.ledkov at oracle.com (taras ledkov)
Date: Wed, 26 Feb 2014 18:24:28 +0400
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <530CA007.5080006@oracle.com>
References: <529EF58F.5000701@oracle.com>
	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>
	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>
	<52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com>
	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>
	<52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com>
	<52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com>
	<53049009.9020204@oracle.com> <530CA007.5080006@oracle.com>
Message-ID: <530DF91C.5010000@oracle.com>

Hi,

Alan, Mandy could you please review the fix: 
https://bugs.openjdk.java.net/browse/JDK-7195249.

I discussed the fix with Jaroslav and Staffan and they have approved it,
but they are not reviewers.

Webrev for jdk part:
http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/

Webrev for hs part:
http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/

On 25.02.2014 17:52, Jaroslav Bachorik wrote:
> Thumbs up. (not a "reviewer", though)
>
> -JB-
>
> On 19.2.2014 12:05, taras ledkov wrote:
>> Hi,
>>
>> Imports are fixed:
>>
>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>>
>> On 05.02.2014 17:20, Jaroslav Bachorik wrote:
>>> Hi Taras,
>>>
>>> thanks for taking care of this.
>>>
>>> The changes look fine to me.
>>>
>>> One minor nit is unused imports of the library classes in
>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java".
>>>
>>>
>>> It does not use any of those classes as its base class
>>> "AbstractFilePermissionTest" does all the heavy lifting.
>>>
>>> Cheers,
>>>
>>> -JB-
>>>
>>> On 5.2.2014 13:42, taras ledkov wrote:
>>>> Hi,
>>>>
>>>> So please take a look at the review against JDK9.
>>>> The reviewed patch had not been integrated into JDK8.
>>>>
>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has
>>>> been already patched by Jaroslav.
>>>>
>>>> Webrev for jdk part:
>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/
>>>>
>>>> Webrev for hs part:
>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>>>
>>>>
>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote:
>>>>> Hi Taras,
>>>>>
>>>>> On 21.1.2014 10:30, taras ledkov wrote:
>>>>>> Hi Jaroslav,
>>>>>>
>>>>>> Could you please review the last changes?
>>>>>> Are you OK?
>>>>>
>>>>> Yes, the change looks ok. But I think we will need to get back to this
>>>>> problem eventually and implement a central port dispatcher if we
>>>>> want to
>>>>> be 100% sure the port conflicts wouldn't occur. But your changes
>>>>> reduce
>>>>> the chance significantly.
>>>>>
>>>>> Thanks for taking care of this.
>>>>>
>>>>> -JB-
>>>>>
>>>>>>
>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Staffan,
>>>>>>>>
>>>>>>>> I fixed the tests according to your comments.
>>>>>>>> Are you OK?
>>>>>>>>
>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please take a look at the new review.
>>>>>>>>>
>>>>>>>>> Webrev for jdk part:
>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>>>>>
>>>>>>>>> Webrev for hs part:
>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>>>>>
>>>>>>>>> My answers are inline:
>>>>>>>>>
>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>>>>>> Hi Taras,
>>>>>>>>>>
>>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java.
>>>>>>>>>> Here's a couple of comments:
>>>>>>>>>>
>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>>>>>> This test could be simplified by not specifying an address at
>>>>>>>>>> all.
>>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp,
>>>>>>>>>> there
>>>>>>>>>> is no reason to specify an address. If address is unspecified
>>>>>>>>>> (and
>>>>>>>>>> server=y), the connector will pick an address and print it to the
>>>>>>>>>> command line. Thus the only change that needs to be done is to
>>>>>>>>>> remove
>>>>>>>>>> ",address=8888" from the @run command.
>>>>>>>>> fixed
>>>>>>>>>
>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>>>>>> These tests do not compile cleanly with an empty JTwork
>>>>>>>>>> directory. It
>>>>>>>>>> seems that having one @build for each class does not work well -
>>>>>>>>>> when
>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving
>>>>>>>>>> all
>>>>>>>>>> classes to one @build statement solved this problem for me.
>>>>>>>>> fixed
>>>>>>>>>
>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>>>>>   187         Future<Void> stdoutTask = stdout.process();
>>>>>>>>>>   188         Future<Void> stderrTask = stderr.process();
>>>>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>>>>> fixed
>>>>>>>>>
>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>>>>>> At first I thought something was wrong with this file - the
>>>>>>>>>> diff is
>>>>>>>>>> very weird. Then I realized you renamed an old file and created a
>>>>>>>>>> new
>>>>>>>>>> file using the old name.
>>>>>>>>> You are right. I did it to keep the test name.
>>>>>>>>>
>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>>>>>
>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks
>>>>>>>>>> like
>>>>>>>>>> you
>>>>>>>>>> delete the files at the beginning of the test in any case.
>>>>>>>>> I think yes. In the first place, this functionality was in the old
>>>>>>>>> code.
>>>>>>>>> In the second place, a file without write permission may be a
>>>>>>>>> problem
>>>>>>>>> for a further cleanup (not by the test, for example for the tests
>>>>>>>>> launcher scripts etc.)
>>>>>>>>>
>>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>>>>>> sub-classes, the other has a getter method.
>>>>>>>>> fixed
>>>>>>>>>
>>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an
>>>>>>>>>> ex.printStackTrace().
>>>>>>>>>> (Same thing for all other places where you call
>>>>>>>>>> Integer.parseInt())
>>>>>>>>> fixed
>>>>>>>>>
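Staffan's point about not swallowing exceptions around Integer.parseInt() might look like the sketch below. The class name, the method parsePort and the fallback parameter are assumptions made for illustration; they are not taken from the webrev.

```java
class PortParsing {
    // Hypothetical helper: parses a port number, reporting (not hiding)
    // a malformed value before falling back to the supplied default.
    static int parsePort(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException ex) {
            // Print the stack trace so the bad input shows up in the test log.
            ex.printStackTrace();
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePort("8888", -1));
        System.out.println(parsePort("not-a-port", -1));
    }
}
```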
>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>>>>>> This file is never used as far as I can see.
>>>>>>>>> It is used by PasswordFilePermissionTest &
>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>>>>>> AbstractFilePermissionTest : 162).
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> /Staffan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Please take a look at the review with fixed issues about
>>>>>>>>>>> trying to
>>>>>>>>>>> launch test that needs free port several times.
>>>>>>>>>>>
>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Please pay attention to the new method
>>>>>>>>>>> ProcessTools.startProcess(String,
>>>>>>>>>>> ProcessBuilder, Consumer<String>) that is used to analyze all
>>>>>>>>>>> output
>>>>>>>>>>> of a sub-process.  It shares code with
>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>> Predicate<String>,
>>>>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment.
>>>>>>>>>>>
>>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>> Predicate<String>, long, TimeUnit) may be changed by adding
>>>>>>>>>>> LinePump
>>>>>>>>>>> to stderr if there is no serious reason for restricting the
>>>>>>>>>>> warm-up
>>>>>>>>>>> analysis to stdout stream.
>>>>>>>>>>>
>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I've consulted with Serviceability engineers (added them to the CC
>>>>>>>>>>>> list) and
>>>>>>>>>>>> they would like to see the tests solve this problem as follows for now:
>>>>>>>>>>>>
>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Katja
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>>>>> Guys.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me try to sum up what was said before and maybe suggest a
>>>>>>>>>>>>> compromise.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. There is a desire to have support for port allocation on the
>>>>>>>>>>>>> level of
>>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that
>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it
>>>>>>>>>>>>> is a
>>>>>>>>>>>>> test harness API or a library API does not really matter from
>>>>>>>>>>>>> usage
>>>>>>>>>>>>> point of view.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable,
>>>>>>>>>>>>> whatever
>>>>>>>>>>>>> port
>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>>>>>> perform
>>>>>>>>>>>>> the test logic with different ports until the test succeeds.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of
>>>>>>>>>>>>> course,
>>>>>>>>>>>>> but
>>>>>>>>>>>>> better be used in conjunction with a meaningful port selection
>>>>>>>>>>>>> algorithm.
>>>>>>>>>>>>>
>>>>>>>>>>>>> At the same time, copying loop-until-success logic from one
>>>>>>>>>>>>> test to
>>>>>>>>>>>>> another may not be the best solution. The library could help with
>>>>>>>>>>>>> that, I
>>>>>>>>>>>>> believe. There only needs to be an API method which takes
>>>>>>>>>>>>> behavior as a
>>>>>>>>>>>>> parameter and runs it until it succeeds. Something like:
>>>>>>>>>>>>> public <T> runOnAFreePort(Function<T, Integer>)
>>>>>>>>>>>>> or similar. There could be arguments of how/whether to
>>>>>>>>>>>>> implement
>>>>>>>>>>>>> it,
>>>>>>>>>>>>> the solution would not work for shell tests, etc, but still
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>
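A minimal sketch of what such a run-until-success helper could look like. Everything here is an assumption layered on Shura's one-line proposal: the PortAction interface, the candidatePort() helper, the maxAttempts parameter and the choice to retry only on java.net.BindException are illustrative, not an existing test-library API.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

class FreePortRetry {
    /** Hypothetical callback: receives a candidate port and returns a result. */
    @FunctionalInterface
    interface PortAction<T> {
        T run(int port) throws IOException;
    }

    /** Asks the OS for an ephemeral port, releases it, and returns the number. */
    static int candidatePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    /** Runs the action on fresh candidate ports until it succeeds or attempts run out. */
    static <T> T runOnAFreePort(PortAction<T> action, int maxAttempts) throws IOException {
        BindException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = candidatePort();
            try {
                return action.run(port);
            } catch (BindException e) {
                // The port was taken between the probe and the real bind; try another.
                last = e;
            }
        }
        throw new IOException("no usable port after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        Integer bound = runOnAFreePort(port -> {
            try (ServerSocket server = new ServerSocket(port)) {
                return server.getLocalPort();
            }
        }, 5);
        System.out.println("bound to port " + bound);
    }
}
```

The retry lives in one place, so individual tests only supply the behavior that needs a port. As noted in the thread, this shape does not help shell tests.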
>>>>>>>>>>>>>
>>>>>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>>>>>> agreement in
>>>>>>>>>>>>> the library, etc.
>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate
>>>>>>>>>>>>> the
>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I personally prefer #1 for the reason that the changes are already
>>>>>>>>>>>>> supposed to make the tests more stable and also there are many
>>>>>>>>>>>>> more
>>>>>>>>>>>>> tests which use ports, so the scope of the problem is
>>>>>>>>>>>>> bigger
>>>>>>>>>>>>> than these.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Shura
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort()
>>>>>>>>>>>>>> does not
>>>>>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>>>>>> process.
>>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However,
>>>>>>>>>>>>>> there is a
>>>>>>>>>>>>>> solution.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please, look at the
>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>>>>>> start a
>>>>>>>>>>>>>> process with a free port and then check if
>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has
>>>>>>>>>>>>>> been
>>>>>>>>>>>>>> thrown.
>>>>>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> library's responsibility to implement it, wouldn't you
>>>>>>>>>>>>>>> agree?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we
>>>>>>>>>>>>>>> could do
>>>>>>>>>>>>>>> it later in the library.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the
>>>>>>>>>>>>>>>>> port is
>>>>>>>>>>>>>>>>> free.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Moreover, the port returned by getFreePort()[1] remains
>>>>>>>>>>>>>>>>> inaccessible
>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>> some time - it depends on system setup; take a look at the
>>>>>>>>>>>>>>>>> discussions
>>>>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and
>>>>>>>>>>>>>>>>> SO_LINGER
>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>> BSD.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So from stability point of view it's better to just return
>>>>>>>>>>>>>>>>> random
>>>>>>>>>>>>>>>>> number
>>>>>>>>>>>>>>>>> between 49152 and 65535.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When
>>>>>>>>>>>>>>>> there are
>>>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port,
>>>>>>>>>>>>>>>> nothing
>>>>>>>>>>>>>>>> prevents the random function to return the same port to
>>>>>>>>>>>>>>>> all of
>>>>>>>>>>>>>>>> them.
>>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and
>>>>>>>>>>>>>>>> cause
>>>>>>>>>>>>>>>> problems with timing when a port used by a previous test is
>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>> fully
>>>>>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry
>>>>>>>>>>>>>>>> pointed out
>>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it
>>>>>>>>>>>>>>>> later
>>>>>>>>>>>>>>>> there
>>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated
>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>> remain
>>>>>>>>>>>>>>>> unallocated also for the next few milliseconds.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating
>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>> provided
>>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> chance
>>>>>>>>>>>>>>>> of intermittent failures due to a port being in use.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> public static int getFreePort() throws InterruptedException, IOException {
>>>>>>>>>>>>>>>>>     int port = -1;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     while (port <= 0) {
>>>>>>>>>>>>>>>>>         Thread.sleep(100);
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         ServerSocket serverSocket = null;
>>>>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>>>>             serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>>>>>             port = serverSocket.getLocalPort();
>>>>>>>>>>>>>>>>>         } finally {
>>>>>>>>>>>>>>>>>             serverSocket.close();
>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     return port;
>>>>>>>>>>>>>>>>> }
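The weakness of a probe-then-close helper like the one quoted above can be reproduced in a few lines. In this sketch the class and method names are made up for illustration, and the "intruder" socket simply stands in for another test that grabs the port after the probe has released it.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

class ProbeRace {
    /** Same idea as getFreePort(): bind to port 0, note the number, close. */
    static int probeFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    /** Returns true if a port reported as free could no longer be bound afterwards. */
    static boolean probedPortCanBeLost() throws IOException {
        int port = probeFreePort();
        // Another process or test takes the port after the probe closed it.
        try (ServerSocket intruder = new ServerSocket(port)) {
            // The original caller now tries to use the "free" port it was given.
            try (ServerSocket late = new ServerSocket(port)) {
                return false;
            } catch (BindException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("probed port lost: " + probedPortCanBeLost());
    }
}
```

The probe result is only a hint: nothing reserves the port between the close and the later bind, which is why the thread converges on retrying or on holding the socket.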
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> fyi, the jdk.testlibrary.Utils.getFreePort() method will
>>>>>>>>>>>>>>>>>> open a
>>>>>>>>>>>>>>>>>> free
>>>>>>>>>>>>>>>>>> socket, close it and return
>>>>>>>>>>>>>>>>>> the port number.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> system
>>>>>>>>>>>>>>>>>> assign
>>>>>>>>>>>>>>>>>> the port #.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 1. Choose a random number between 49152 and 65535
>>>>>>>>>>>>>>>>>>> 2. Open socket
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1
>>>>>>>>>>>>>>>>>>> if socket OK - return *socket*
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass
>>>>>>>>>>>>>>>>>>> the port
>>>>>>>>>>>>>>>>>>> number as
>>>>>>>>>>>>>>>>>>> a property value) you shouldn't do any pre-check as it
>>>>>>>>>>>>>>>>>>> has no
>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>>>> as soon as you close the socket someone can take the port.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above
>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>> let
>>>>>>>>>>>>>>>>>>> networking
>>>>>>>>>>>>>>>>>>> code opening socket to handle port conflict.
>>>>>>>>>>>>>>>>>>>
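Dmitry's two steps (pick a random port in 49152-65535, open a socket, repeat on failure, and hand back the socket rather than the number) could be sketched as follows. The class name, method name and the maxAttempts bound are assumptions added for illustration; the thread itself does not put a limit on retries.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

class RandomPortSocket {
    static final int MIN_PORT = 49152;
    static final int MAX_PORT = 65535;

    /** Returns an open, bound socket so the port cannot be taken by anyone else. */
    static ServerSocket openOnRandomPort(int maxAttempts) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Step 1: choose a random number in the dynamic port range.
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                // Step 2: open the socket; success means the port is ours.
                return new ServerSocket(port);
            } catch (IOException e) {
                // Port in use (or otherwise unavailable): go back to step 1.
                last = e;
            }
        }
        throw new IOException("could not bind after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket socket = openOnRandomPort(10)) {
            System.out.println("listening on " + socket.getLocalPort());
        }
    }
}
```

Returning the socket is the point of the design: the caller keeps ownership of the port for as long as it holds the object, so there is no window in which another process can claim it.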
>>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova &
>>>>>>>>>>>>>>>>>>>> Jaroslav
>>>>>>>>>>>>>>>>>>>> Bachorik
>>>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest &
>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>>>>> looked
>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for
>>>>>>>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've
>>>>>>>>>>>>>>>>>>>> renamed to
>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace
>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>> shell
>>>>>>>>>>>>>>>>>>>> script
>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called
>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java,
>>>>>>>>>>>>>>>>>>>> hence the
>>>>>>>>>>>>>>>>>>>> huge
>>>>>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines
>>>>>>>>>>>>>>>>>>>> similar
>>>>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided
>>>>>>>>>>>>>>>>>>>> to not
>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please
>>>>>>>>>>>>>>>>>>>> let me
>>>>>>>>>>>>>>>>>>>> know if
>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is
>>>>>>>>>>>>>>>>>>>> added to
>>>>>>>>>>>>>>>>>>>> hotspot
>>>>>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test
>>>>>>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> With best regards,
>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> With best regards,
>>>>>>>> Taras Ledkov
>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>> skype: taras_ledkov
>>>>>>>> Phone: 7(812)3346-157
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-- 
With best regards,
Taras Ledkov
Mail-To: taras.ledkov at oracle.com
skype: taras_ledkov
Phone: 7(812)3346-157

From daniel.daugherty at oracle.com  Wed Feb 26 06:28:33 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 26 Feb 2014 07:28:33 -0700
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
In-Reply-To: <F9F51E64-23F0-4BF8-BF43-B4E2DF560A3F@oracle.com>
References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>
	<ad8bc97a-9d9b-4596-ae95-5a3601622f29@default>
	<530D39C7.6080009@oracle.com>
	<CB246ADC-BDDD-4928-893A-9D2C5D602065@oracle.com>
	<530DF44B.7060209@oracle.com>
	<F9F51E64-23F0-4BF8-BF43-B4E2DF560A3F@oracle.com>
Message-ID: <530DFA11.90303@oracle.com>

On 2/26/14 7:15 AM, Staffan Larsen wrote:
>
> On 26 feb 2014, at 15:03, Daniel D. Daugherty
> <daniel.daugherty at oracle.com> wrote:
>
>> On 2/26/14 1:31 AM, Staffan Larsen wrote:
>>> On 26 feb 2014, at 01:48, Daniel D. Daugherty
>>> <daniel.daugherty at oracle.com> wrote:
>>>
>>>> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with
>>>> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to()
>>>> switch looks like the right thing to do.
>>> Good - thanks.
>>>
>>>> The usual questions:
>>>>
>>>> - why wasn't this failure mode seen before JDK8?
>>> No tests for this ? ;)
>>
>> I should have been more clear... :-) Why hasn't the NetBeans profiler
>> run into this before? That profiler is a wonderful test for the
>> RedefineClasses/RetransformClasses stuff...
>
> Ah, ok. No idea...
>
>>
>>
>>>
>>>> - was this failure caught somewhere else before JDK8 and changes
>>>>  in JDK8 exposed a new code path?
>>>>
>>>> Reasoning about this from a 30,000 foot view, I don't see any reason
>>>> why you can't redefine a class that has a constant pool ref that
>>>> refers to a class in error. You won't be able to use the error'ed
>>>> class, but there's no reason it can't be in there... Or does that
>>>> violate the rule that you can't redefine a class that isn't fully
>>>> linked (what ever that means...)???
>>>>
>>>> So what does your new test do on JDK7 or JDK6? Just curious?
>>> The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I
>>> don't know why it passes on jdk7, do you think it's important to
>>> track it down?
>>
>> The fact that it passes on JDK7 is the useful piece of data.
>> Figuring out why is much less important. BTW, which JDK7
>> version? One of the updates or GA/FCS?
>
> I used 7u45, but now I tested with 7u4 as well - passes there, too.

Sounds like the change/breakage is limited to JDK8 so that's
a relief.


>
> Are you ok with pushing the change?

Very much so.

Dan


>
> Thanks,
> /Staffan
>
>>
>> Dan
>>
>>
>>>
>>> /Staffan
>>>
>>>> Dan
>>>>
>>>>
>>>> On 2/24/14 2:42 AM, Markus Gronlund wrote:
>>>>> Hi Staffan,
>>>>>
>>>>> I would think this is the correct fix.
>>>>>
>>>>> The other two constant pool "error" tags, besides 
>>>>> UnresolvedClassInError, which signal constant pool resolution 
>>>>> errors are MethodTypeInError and MethodHandleInError - these error 
>>>>> tags are associated with their corresponding "success" tags in 
>>>>> switch targets in ConstantPool::copy_entry_to(), as well as in 
>>>>> additional routines in constantPool.cpp.
>>>>>
>>>>> In addition, in other routines in constantPool.cpp, the error tag 
>>>>> JVM_CONSTANT_UnresolvedClassInError is associated with 
>>>>> JVM_CONSTANT_UnresolvedClass - 
>>>>>  ConstantPool::resolve_constant_at_impl() for example.
>>>>>
>>>>> Thanks
>>>>> Markus
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Staffan Larsen
>>>>> Sent: den 21 februari 2014 15:11
>>>>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net 
>>>>> serviceability-dev at openjdk.java.net
>>>>> Subject: RFR: 8035150 ShouldNotReachHere() in 
>>>>> ConstantPool::copy_entry_to
>>>>>
>>>>> This is an attempt to solve a crash while redefining a class that 
>>>>> has unresolved class references in its constant pool. I would 
>>>>> appreciate some extra scrutiny here since I am unfamiliar with 
>>>>> this code path.
>>>>>
>>>>> I have also added a test that causes a JVM crash without the fix.
>>>>>
>>>>> The updates to the test library are all code copied from the jdk 
>>>>> version of the test library.
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150
>>>>>
>>>>> Thanks,
>>>>> /Staffan
>
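
[Editorial sketch, not part of the original mail.] The fix discussed above
pairs an "error" tag with its "success" tag in the same switch target, so
that an entry in the error state no longer falls through to the
ShouldNotReachHere() default. The real code is C++ in constantPool.cpp; what
follows is only a minimal Java model of that idea. The class, enum and
method names are hypothetical and do not exist in HotSpot.

```java
/** Hypothetical Java model of a tag-dispatching copy routine; all names are illustrative. */
final class CopyEntrySketch {

    /** A reduced, made-up set of constant pool tags for this model. */
    enum Tag {
        CLASS,
        UNRESOLVED_CLASS, UNRESOLVED_CLASS_IN_ERROR,
        METHOD_TYPE, METHOD_TYPE_IN_ERROR,
        METHOD_HANDLE, METHOD_HANDLE_IN_ERROR,
        INVALID
    }

    /** Returns the name of the handler that would copy an entry with the given tag. */
    static String handlerFor(Tag tag) {
        switch (tag) {
            case CLASS:
                return "class";
            // Each error tag shares the switch target of its "success" tag.
            case UNRESOLVED_CLASS:
            case UNRESOLVED_CLASS_IN_ERROR:
                return "unresolved-class";
            case METHOD_TYPE:
            case METHOD_TYPE_IN_ERROR:
                return "method-type";
            case METHOD_HANDLE:
            case METHOD_HANDLE_IN_ERROR:
                return "method-handle";
            default:
                // Plays the role of ShouldNotReachHere() in this model.
                throw new IllegalStateException("unexpected tag: " + tag);
        }
    }

    public static void main(String[] args) {
        for (Tag t : Tag.values()) {
            if (t != Tag.INVALID) {
                System.out.println(t + " -> " + handlerFor(t));
            }
        }
    }
}
```

In this model, dropping the UNRESOLVED_CLASS_IN_ERROR label would send that
tag to the default branch, which is the shape of the crash reported in the
bug.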

From staffan.larsen at oracle.com  Wed Feb 26 06:33:22 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Wed, 26 Feb 2014 15:33:22 +0100
Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
In-Reply-To: <530DFA11.90303@oracle.com>
References: <19821D33-93B6-43FA-85DC-070DB4800911@oracle.com>
	<ad8bc97a-9d9b-4596-ae95-5a3601622f29@default>
	<530D39C7.6080009@oracle.com>
	<CB246ADC-BDDD-4928-893A-9D2C5D602065@oracle.com>
	<530DF44B.7060209@oracle.com>
	<F9F51E64-23F0-4BF8-BF43-B4E2DF560A3F@oracle.com>
	<530DFA11.90303@oracle.com>
Message-ID: <4BECFB25-0704-40DE-AFC7-0948DD6531AF@oracle.com>

Thanks Dan and Markus!

On 26 feb 2014, at 15:28, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:

> On 2/26/14 7:15 AM, Staffan Larsen wrote:
>> 
>> On 26 feb 2014, at 15:03, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>> 
>>> On 2/26/14 1:31 AM, Staffan Larsen wrote:
>>>> On 26 feb 2014, at 01:48, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>>>> 
>>>>> I concur with Markus. Pairing JVM_CONSTANT_UnresolvedClassInError with
>>>>> JVM_CONSTANT_UnresolvedClass in the ConstantPool::copy_entry_to()
>>>>> switch looks like the right thing to do.
>>>> Good - thanks.
>>>> 
>>>>> The usual questions:
>>>>> 
>>>>> - why wasn't this failure mode seen before JDK8?
>>>> No tests for this ? ;)
>>> 
>>> I should have been more clear... :-) Why hasn't the NetBeans profiler
>>> run into this before? That profiler is a wonderful test for the
>>> RedefineClasses/RetransformClasses stuff?
>> 
>> Ah, ok. No idea...
>> 
>>> 
>>> 
>>>> 
>>>>> - was this failure caught somewhere else before JDK8 and changes
>>>>>  in JDK8 exposed a new code path?
>>>>> 
>>>>> Reasoning about this from a 30,000 foot view, I don't see any reason
>>>>> why you can't redefine a class that has a constant pool ref that
>>>>> refers to a class in error. You won't be able to use the error'ed
>>>>> class, but there's no reason it can't be in there... Or does that
>>>>> violate the rule that you can't redefine a class that isn't fully
>>>>> linked (whatever that means...)???
>>>>> 
>>>>> So what does your new test do on JDK7 or JDK6? Just curious?
>>>> The test passes on jdk7, but fails on jdk8. (I don't have a jdk6). I don't know why it passes on jdk7, do you think it's important to track it down?
>>> 
>>> The fact that it passes on JDK7 is the useful piece of data.
>>> Figuring out why is much less important. BTW, which JDK7
>>> version? One of the updates or GA/FCS?
>> 
>> I used 7u45, but now I tested with 7u4 as well - passes there, too.
> 
> Sounds like the change/breakage is limited to JDK8 so that's
> a relief.
> 
> 
>> 
>> Are you ok with pushing the change?
> 
> Very much so.
> 
> Dan
> 
> 
>> 
>> Thanks,
>> /Staffan
>> 
>>> 
>>> Dan
>>> 
>>> 
>>>> 
>>>> /Staffan
>>>> 
>>>>> Dan
>>>>> 
>>>>> 
>>>>> On 2/24/14 2:42 AM, Markus Gronlund wrote:
>>>>>> Hi Staffan,
>>>>>> 
>>>>>> I would think this is the correct fix.
>>>>>> 
>>>>>> The other two constant pool "error" tags, besides UnresolvedClassInError, which signal constant pool resolution errors are MethodTypeInError and MethodHandleInError - these error tags are associated with their corresponding "success" tags in switch targets in ConstantPool::copy_entry_to(), as well as in additional routines in constantPool.cpp.
>>>>>> 
>>>>>> In addition, in other routines in ConstantPool.cpp, the error tag JVM_CONSTANT_UnresolvedClassInError is associated with JVM_CONSTANT_UnresolvedClass -  ConstantPool::resolve_constant_at_impl() for example.
>>>>>> 
>>>>>> Thanks
>>>>>> Markus
>>>>>> 
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Staffan Larsen
>>>>>> Sent: den 21 februari 2014 15:11
>>>>>> To: hotspot-runtime-dev; serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net
>>>>>> Subject: RFR: 8035150 ShouldNotReachHere() in ConstantPool::copy_entry_to
>>>>>> 
>>>>>> This is an attempt to solve a crash while redefining a class that has unresolved class references in its constant pool. I would appreciate some extra scrutiny here since I am unfamiliar with this code path.
>>>>>> 
>>>>>> I have also added a test that causes a JVM crash without the fix.
>>>>>> 
>>>>>> The updates to the test library are all code copied from the jdk version of the test library.
>>>>>> 
>>>>>> webrev: http://cr.openjdk.java.net/~sla/8035150/webrev.00/
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8035150
>>>>>> 
>>>>>> Thanks,
>>>>>> /Staffan
>> 
> 

From staffan.larsen at oracle.com  Wed Feb 26 06:36:11 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Wed, 26 Feb 2014 15:36:11 +0100
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <530DF91C.5010000@oracle.com>
References: <529EF58F.5000701@oracle.com>
	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>
	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>
	<52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com>
	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>
	<52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com>
	<52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com>
	<53049009.9020204@oracle.com> <530CA007.5080006@oracle.com>
	<530DF91C.5010000@oracle.com>
Message-ID: <2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com>


On 26 feb 2014, at 15:24, taras ledkov <taras.ledkov at oracle.com> wrote:

> Hi,
> 
> Alan, Mandy could you please review the fix: https://bugs.openjdk.java.net/browse/JDK-7195249.
> 
> I had the discussion with Jaroslav and Staffan and they have approved my fix, but they are not reviewers.

I am a Reviewer.

Thanks,
/Staffan

> 
> Webrev for jdk part:
> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
> 
> Webrev for hs part:
> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
> 
> On 25.02.2014 17:52, Jaroslav Bachorik wrote:
>> Thumbs up. (not a "reviewer", though)
>> 
>> -JB-
>> 
>> On 19.2.2014 12:05, taras ledkov wrote:
>>> Hi,
>>> 
>>> Imports are fixed:
>>> 
>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>>> 
>>> On 05.02.2014 17:20, Jaroslav Bachorik wrote:
>>>> Hi Taras,
>>>> 
>>>> thanks for taking care of this.
>>>> 
>>>> The changes look fine to me.
>>>> 
>>>> One minor nit is unused imports of the library classes in
>>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java".
>>>> 
>>>> 
>>>> It does not use any of those classes as its base class
>>>> "AbstractFilePermissionTest" does all the heavy lifting.
>>>> 
>>>> Cheers,
>>>> 
>>>> -JB-
>>>> 
>>>> On 5.2.2014 13:42, taras ledkov wrote:
>>>>> Hi,
>>>>> 
>>>>> So please take a look at the review against JDK9.
>>>>> The reviewed patch had not been integrated into JDK8.
>>>>> 
>>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has
>>>>> been already patched by Jaroslav.
>>>>> 
>>>>> Webrev for jdk part:
>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/
>>>>> 
>>>>> Webrev for hs part:
>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>>>> 
>>>>> 
>>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote:
>>>>>> Hi Taras,
>>>>>> 
>>>>>> On 21.1.2014 10:30, taras ledkov wrote:
>>>>>>> Hi Jaroslav,
>>>>>>> 
>>>>>>> Could you please review the last changes?
>>>>>>> Are you OK?
>>>>>> 
>>>>>> Yes, the change looks ok. But I think we will need to get back to this
>>>>>> problem eventually and implement a central port dispatcher if we
>>>>>> want to
>>>>>> be 100% sure the port conflicts wouldn't occur. But your changes
>>>>>> reduce
>>>>>> the chance significantly.
>>>>>> 
>>>>>> Thanks for taking care of this.
>>>>>> 
>>>>>> -JB-
>>>>>> 
>>>>>>> 
>>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>>>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> /Staffan
>>>>>>>> 
>>>>>>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Staffan,
>>>>>>>>> 
>>>>>>>>> I fixed the tests according to your comments.
>>>>>>>>> Are you OK?
>>>>>>>>> 
>>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> Please take a look at the new review.
>>>>>>>>>> 
>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>>>>>> 
>>>>>>>>>> Webrev for hs part:
>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>>>>>> 
>>>>>>>>>> My answers are inline:
>>>>>>>>>> 
>>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>>>>>>> Hi Taras,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java.
>>>>>>>>>>> Here's a couple of comments:
>>>>>>>>>>> 
>>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>>>>>>> This test could be simplified by not specifying an address at
>>>>>>>>>>> all.
>>>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp,
>>>>>>>>>>> there
>>>>>>>>>>> is no reason to specify an address. If address is unspecified
>>>>>>>>>>> (and
>>>>>>>>>>> server=y), the connector will pick an address and print it to the
>>>>>>>>>>> command line. Thus the only change that needs to be done is to
>>>>>>>>>>> remove
>>>>>>>>>>> ",address=8888" from the @run command.
>>>>>>>>>> fixed
>>>>>>>>>> 
>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>>>>>>> These tests do not compile cleanly with an empty JTwork
>>>>>>>>>>> directory. It
>>>>>>>>>>> seems that having one @build for each class does not work well -
>>>>>>>>>>> when
>>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving
>>>>>>>>>>> all
>>>>>>>>>>> classes to one @build statement solved this problem for me.
>>>>>>>>>> fixed
>>>>>>>>>> 
>>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>>>>>>  187         Future<Void> stdoutTask = stdout.process();
>>>>>>>>>>>  188         Future<Void> stderrTask = stderr.process();
>>>>>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>>>>>> fixed
>>>>>>>>>> 
>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>>>>>>> At first I thought something was wrong with this file - the
>>>>>>>>>>> diff is
>>>>>>>>>>> very weird. Then I realized you renamed an old file and created a
>>>>>>>>>>> new
>>>>>>>>>>> file using the old name.
>>>>>>>>>> You are right. I did it to keep the test name.
>>>>>>>>>> 
>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks
>>>>>>>>>>> like
>>>>>>>>>>> you
>>>>>>>>>>> delete the files at the beginning of the test in any case.
>>>>>>>>>> I think yes. In the first place, this functionality was in the old
>>>>>>>>>> code.
>>>>>>>>>> In the second place, a file without write permission may be a
>>>>>>>>>> problem
>>>>>>>>>> for a further cleanup (not by the test, for example for the tests
>>>>>>>>>> launcher scripts etc.)
>>>>>>>>>> 
>>>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>>>>>>> sub-classes, the other has a getter method.
>>>>>>>>>> fixed
>>>>>>>>>> 
>>>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an
>>>>>>>>>>> ex.printStackTrace().
>>>>>>>>>>> (Same thing for all other places where you call
>>>>>>>>>>> Integer.parseInt())
>>>>>>>>>> fixed
>>>>>>>>>> 
>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>>>>>>> This file is never used as far as I can see.
>>>>>>>>>> It is used by PasswordFilePermissionTest &
>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>>>>>>> AbstractFilePermissionTest : 162).
>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> /Staffan
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> Please take a look at the review with fixed issues about
>>>>>>>>>>>> trying to
>>>>>>>>>>>> launch test that needs free port several times.
>>>>>>>>>>>> 
>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>>>>> 
>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>>>>> 
>>>>>>>>>>>> Please pay attention to the new method
>>>>>>>>>>>> ProcessTools.startProcess(String,
>>>>>>>>>>>> ProcessBuilder, Consumer<String>) that is used to analyze all
>>>>>>>>>>>> output
>>>>>>>>>>>> of a sub-process.  It has a common part with
>>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>> Predicate<String>,
>>>>>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment.
>>>>>>>>>>>> 
>>>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>> Predicate<String>, long, TimeUnit) may be changed by adding
>>>>>>>>>>>> LinePump
>>>>>>>>>>>> to stderr if there is no serious reason for restricting the
>>>>>>>>>>>> warm-up
>>>>>>>>>>>> analysis to the stdout stream.
>>>>>>>>>>>> 
>>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I've consulted with the Serviceability engineers (added them to
>>>>>>>>>>>>> the CC list) and they would like the tests to solve this
>>>>>>>>>>>>> problem, for now, with option:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Katja
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>>>>>> Guys.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Let me try to sum up what was said before and may be suggest a
>>>>>>>>>>>>>> compromise.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1. There is a desire to have a support port allocation on the
>>>>>>>>>>>>>> level of
>>>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that
>>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it
>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>> test harness API or a library API does not really matter from
>>>>>>>>>>>>>> usage
>>>>>>>>>>>>>> point of view.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable,
>>>>>>>>>>>>>> whatever
>>>>>>>>>>>>>> port
>>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>>>>>>> perform
>>>>>>>>>>>>>> the test logic with different ports until the test succeeds.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of
>>>>>>>>>>>>>> course,
>>>>>>>>>>>>>> but
>>>>>>>>>>>>>> better be used in conjunction with a meaningful port selection
>>>>>>>>>>>>>> algorithm.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> At the same time, copying loop-until-success logic from one
>>>>>>>>>>>>>> test to another may not be the best solution. A library could
>>>>>>>>>>>>>> help with that, I believe. There only needs to be an API method
>>>>>>>>>>>>>> which takes the behavior as a parameter and runs it until it
>>>>>>>>>>>>>> succeeds. Something like:
>>>>>>>>>>>>>> public <T> runOnAFreePort(Function<T, Integer>)
>>>>>>>>>>>>>> or similar. There could be arguments of how/whether to
>>>>>>>>>>>>>> implement
>>>>>>>>>>>>>> it,
>>>>>>>>>>>>>> the solution would not work for shell tests, etc, but still
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>>>>>>> agreement in
>>>>>>>>>>>>>> the library, etc.
>>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I personally prefer #1 for the reason that the changes are
>>>>>>>>>>>>>> already supposed to make the tests more stable, and also there
>>>>>>>>>>>>>> are many more tests which use ports, so the scope of the
>>>>>>>>>>>>>> problem is bigger than these.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Shura
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort()
>>>>>>>>>>>>>>> does not
>>>>>>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>>>>>>> process.
>>>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However,
>>>>>>>>>>>>>>> there is a
>>>>>>>>>>>>>>> solution.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Please, look at the
>>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>>>>>>> start a
>>>>>>>>>>>>>>> process with a free port and then check if
>>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has
>>>>>>>>>>>>>>> been
>>>>>>>>>>>>>>> thrown.
>>>>>>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Whatever logic is chosen to select a free port, it is the
>>>>>>>>>>>>>>>> library's responsibility to implement it, wouldn't you agree?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we
>>>>>>>>>>>>>>>> could do
>>>>>>>>>>>>>>>> it later in the library.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the
>>>>>>>>>>>>>>>>>> port is
>>>>>>>>>>>>>>>>>> free.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Moreover, the port returned by getFreePort()[1] remains
>>>>>>>>>>>>>>>>>> inaccessible for some time - it depends on the system setup;
>>>>>>>>>>>>>>>>>> take a look at the discussions around SO_REUSEPORT for Linux
>>>>>>>>>>>>>>>>>> or SO_REUSEADDR and SO_LINGER for BSD.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> So from stability point of view it's better to just return
>>>>>>>>>>>>>>>>>> random
>>>>>>>>>>>>>>>>>> number
>>>>>>>>>>>>>>>>>> between 49152 and 65535.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When
>>>>>>>>>>>>>>>>> there are
>>>>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port,
>>>>>>>>>>>>>>>>> nothing
>>>>>>>>>>>>>>>>> prevents the random function to return the same port to
>>>>>>>>>>>>>>>>> all of
>>>>>>>>>>>>>>>>> them.
>>>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and
>>>>>>>>>>>>>>>>> cause
>>>>>>>>>>>>>>>>> problems with timing when a port used by a previous test is
>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>> fully
>>>>>>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry
>>>>>>>>>>>>>>>>> pointed out
>>>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it
>>>>>>>>>>>>>>>>> later
>>>>>>>>>>>>>>>>> there
>>>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated
>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>> remain
>>>>>>>>>>>>>>>>> unallocated also for the next few milliseconds.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating
>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>> provided
>>>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> chance
>>>>>>>>>>>>>>>>> of intermittent failures due to a port being in use.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>     public static int getFreePort()
>>>>>>>>>>>>>>>>>>             throws InterruptedException, IOException {
>>>>>>>>>>>>>>>>>>         int port = -1;
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>         while (port <= 0) {
>>>>>>>>>>>>>>>>>>             Thread.sleep(100);
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>             ServerSocket serverSocket = null;
>>>>>>>>>>>>>>>>>>             try {
>>>>>>>>>>>>>>>>>>                 serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>>>>>>                 port = serverSocket.getLocalPort();
>>>>>>>>>>>>>>>>>>             } finally {
>>>>>>>>>>>>>>>>>>                 serverSocket.close();
>>>>>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>         return port;
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> fyi,  The jdk.testlibrary.Utils.getFreePort() method will
>>>>>>>>>>>>>>>>>>> open a free socket, close it and return the port number.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> system
>>>>>>>>>>>>>>>>>>> assign
>>>>>>>>>>>>>>>>>>> the port #.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 1. Choose a random number between 49152 and 65535
>>>>>>>>>>>>>>>>>>>> 2. Open socket
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1
>>>>>>>>>>>>>>>>>>>> if socket OK - return *socket*
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass the
>>>>>>>>>>>>>>>>>>>> port number as a property value) you shouldn't do any
>>>>>>>>>>>>>>>>>>>> pre-check, as it has no value: as soon as you close the
>>>>>>>>>>>>>>>>>>>> socket someone can take the port.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above and
>>>>>>>>>>>>>>>>>>>> let the networking code that opens the socket handle the
>>>>>>>>>>>>>>>>>>>> port conflict.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova &
>>>>>>>>>>>>>>>>>>>>> Jaroslav
>>>>>>>>>>>>>>>>>>>>> Bachorik
>>>>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest &
>>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>>>>>> looked
>>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for
>>>>>>>>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've
>>>>>>>>>>>>>>>>>>>>> renamed to
>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace
>>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>> shell
>>>>>>>>>>>>>>>>>>>>> script
>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called
>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java,
>>>>>>>>>>>>>>>>>>>>> hence the
>>>>>>>>>>>>>>>>>>>>> huge
>>>>>>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines
>>>>>>>>>>>>>>>>>>>>> similar
>>>>>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided
>>>>>>>>>>>>>>>>>>>>> to not
>>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please
>>>>>>>>>>>>>>>>>>>>> let me
>>>>>>>>>>>>>>>>>>>>> know if
>>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is
>>>>>>>>>>>>>>>>>>>>> added to
>>>>>>>>>>>>>>>>>>>>> hotspot
>>>>>>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test
>>>>>>>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> With best regards,
>>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> With best regards,
>>>>>>>>> Taras Ledkov
>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>> skype: taras_ledkov
>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
> -- 
> With best regards,
> Taras Ledkov
> Mail-To: taras.ledkov at oracle.com
> skype: taras_ledkov
> Phone: 7(812)3346-157
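
[Editorial sketch, not part of the original mail.] The quoted thread above
makes two suggestions: pick a random port in 49152-65535 and return the open
*socket* rather than the port number, and offer a library method that takes
the test behavior as a parameter and retries it until it succeeds. A minimal
Java sketch of both is given below. FreePortSketch, PortAction,
openOnRandomPort, the retry limit of 20 and this particular shape of
runOnAFreePort are assumptions made for illustration; none of them is an API
of jdk.testlibrary. A custom PortAction interface is used instead of
java.util.function.Function only so that the action may throw IOException.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper illustrating the ideas from the thread; not a test library API. */
final class FreePortSketch {
    static final int MIN_PORT = 49152;   // lower bound of the range named in the thread
    static final int MAX_PORT = 65535;   // upper bound of the range named in the thread
    static final int MAX_ATTEMPTS = 20;  // arbitrary retry limit chosen for this sketch

    /** Test behavior that needs a port number and may fail to bind to it. */
    interface PortAction<T> {
        T run(int port) throws IOException;
    }

    private FreePortSketch() {}

    /** Picks a random port number within [MIN_PORT, MAX_PORT]. */
    static int randomPort() {
        return ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
    }

    /**
     * Opens a server socket on a random port and returns the socket itself,
     * so the caller keeps ownership of the port for as long as it is open.
     */
    static ServerSocket openOnRandomPort() throws IOException {
        IOException last = null;
        for (int i = 0; i < MAX_ATTEMPTS; i++) {
            try {
                return new ServerSocket(randomPort());
            } catch (BindException e) {
                last = e; // port is taken, try another one
            }
        }
        throw new IOException("no free port after " + MAX_ATTEMPTS + " attempts", last);
    }

    /**
     * Runs the action with a random port and retries with a new port
     * whenever the action reports a bind failure.
     */
    static <T> T runOnAFreePort(PortAction<T> action) throws IOException {
        IOException last = null;
        for (int i = 0; i < MAX_ATTEMPTS; i++) {
            try {
                return action.run(randomPort());
            } catch (BindException e) {
                last = e; // conflict detected by the code that opened the socket
            }
        }
        throw new IOException("no free port after " + MAX_ATTEMPTS + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        int used = runOnAFreePort(port -> {
            try (ServerSocket s = new ServerSocket(port)) {
                return s.getLocalPort();
            }
        });
        System.out.println("action ran on port " + used);
    }
}
```

This does not remove the race described in the thread for tests that must
pass a port number to another process; it only moves the retry loop into one
place. A test that starts a sub-process would have to translate that
process's "port already in use" failure into the exception the loop retries
on.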


From brent.christian at oracle.com  Wed Feb 26 11:34:51 2014
From: brent.christian at oracle.com (Brent Christian)
Date: Wed, 26 Feb 2014 11:34:51 -0800
Subject: RFR 6835233 : Fedora 9 jdk regression test failed:
	java/lang/instrument/ParallelTransformerLoader.sh
Message-ID: <530E41DB.20801@oracle.com>

File under "chipping away at test stabilization issues."

https://bugs.openjdk.java.net/browse/JDK-6835233

I've done some repeated runs of this test on my Linux machine.  The test 
fails every time with 6u3.  It fails intermittently on 7 (after 145 
iterations for 7u45, and 62 iterations for 7u60b07).  I have not been 
able to reproduce the failure on 8 or 9, running 1000 iterations each on 
8b115, 8b129, and 9b02.

I would like to resolve this bug by removing the "@ignore" tag for JDK 
9, and bring the test back into rotation.  If the failure comes back, 
I'll submit a new issue for further investigation.

The change is:

  # @bug 5088398
-# @ignore until bug 6835233 dealt with
  # @summary Test parallel class loading by parallel transformers.

Thanks,
-Brent

From sean.coffey at oracle.com  Wed Feb 26 15:07:08 2014
From: sean.coffey at oracle.com (sean.coffey at oracle.com)
Date: Wed, 26 Feb 2014 23:07:08 +0000
Subject: hg: jdk8/tl/corba: 8035618: Four api/org_omg/CORBA TCK tests fail
	under plugin only
Message-ID: <20140226230711.5750362F74@hg.openjdk.java.net>

Changeset: 0683ee308085
Author:    coffeys
Date:      2014-02-26 23:04 +0000
URL:       http://hg.openjdk.java.net/jdk8/tl/corba/rev/0683ee308085

8035618: Four api/org_omg/CORBA TCK tests fail under plugin only
Reviewed-by: mchung, chegar

! src/share/classes/com/sun/corba/se/spi/orb/ORB.java


From staffan.larsen at oracle.com  Wed Feb 26 22:58:48 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 27 Feb 2014 07:58:48 +0100
Subject: RFR 6835233 : Fedora 9 jdk regression test failed:
	java/lang/instrument/ParallelTransformerLoader.sh
In-Reply-To: <530E41DB.20801@oracle.com>
References: <530E41DB.20801@oracle.com>
Message-ID: <5A25AA54-BD79-4590-8230-0C875CAF181D@oracle.com>

Looks good!

Thanks,
/Staffan

On 26 feb 2014, at 20:34, Brent Christian <brent.christian at oracle.com> wrote:

> File under "chipping away at test stabilization issues."
> 
> https://bugs.openjdk.java.net/browse/JDK-6835233
> 
> I've done some repeated runs of this test on my Linux machine.  The test fails every time with 6u3.  It fails intermittently on 7 (after 145 iterations for 7u45, and 62 iterations for 7u60b07).  I have not been able to reproduce the failure on 8 or 9, running 1000 iterations each on 8b115, 8b129, and 9b02.
> 
> I would like to resolve this bug by removing the "@ignore" tag for JDK 9, and bring the test back into rotation.  If the failure comes back, I'll submit a new issue for further investigation.
> 
> The change is:
> 
> # @bug 5088398
> -# @ignore until bug 6835233 dealt with
> # @summary Test parallel class loading by parallel transformers.
> 
> Thanks,
> -Brent


From Alan.Bateman at oracle.com  Wed Feb 26 23:47:57 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 27 Feb 2014 07:47:57 +0000
Subject: RFR 6835233 : Fedora 9 jdk regression test failed:
	java/lang/instrument/ParallelTransformerLoader.sh
In-Reply-To: <530E41DB.20801@oracle.com>
References: <530E41DB.20801@oracle.com>
Message-ID: <530EEDAD.9050400@oracle.com>

On 26/02/2014 19:34, Brent Christian wrote:
> :
>
> The change is:
>
>  # @bug 5088398
> -# @ignore until bug 6835233 dealt with
>  # @summary Test parallel class loading by parallel transformers.
This looks okay to me too, I assume that if there is any residual issue 
that it will show up quickly once the test is running again.

-Alan

From serguei.spitsyn at oracle.com  Thu Feb 27 00:25:00 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 27 Feb 2014 00:25:00 -0800
Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <530D007D.4040602@oracle.com>
References: <530D007D.4040602@oracle.com>
Message-ID: <530EF65C.5040707@oracle.com>

Please, review the fix for:
   https://bugs.openjdk.java.net/browse/JDK-6471769


Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2

Summary:

   This is the second round of review because the JTREG com/sun/jdi tests
   discovered a regression in the first-round change. The issue was the
   lock synchronization in JvmtiEventController::clear_frame_pop(), which
   is not allowed at safepoints.

   As a result, I've changed JvmtiEnv::NotifyFramePop to use a VM
   operation for safety. I've also removed the lock synchronization from
   the 3 impacted JvmtiEventController:: functions: set_frame_pop(),
   clear_frame_pop() and clear_to_frame_pop().

Testing:
   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi


Thanks,
Serguei


On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>   https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>
>
> Summary:
>
>   This is another Test Stabilization issue.
>   The fix is very similar to other JVMTI stabilization fixes.
>   It is to use safepoints for updating the PopFrame data instead of 
> relying on the
>   suspend equivalent condition mechanism 
> (JvmtiEnv::is_thread_fully_suspended())
>   which is not adequate from the reliability point of view.
>
> Testing:
>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>
>
> Thanks,
> Serguei
>


From mattias.tobiasson at oracle.com  Thu Feb 27 02:43:57 2014
From: mattias.tobiasson at oracle.com (Mattias Tobiasson)
Date: Thu, 27 Feb 2014 02:43:57 -0800 (PST)
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace
Message-ID: <4b9ed302-c006-4829-b732-2ffc123d84f4@default>

Hi,
Could you please review this test fix.

The test verifies that MemoryPoolMXBean sends a notification when used memory has reached the threshold.
The flag thresholdExceeded marks if we have reached the memory threshold. When the flag is set, the test slows down to give time for the notification to be received. 
The problem is that thresholdExceeded is overwritten on every iteration of the loop. As a result, the test continues to allocate memory at full speed, and we may get an OutOfMemoryError before we get the notification. Instead, the flag should be set if any pool has reached the threshold.

bug:
https://bugs.openjdk.java.net/browse/JDK-8031065

webrev:
http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/

Mattias
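
To illustrate the flag problem described above, here is a minimal sketch. It is not the actual LowMemoryTest2 code: the class name, the method names, and the boolean-array input (one entry per memory pool) are hypothetical. It only contrasts overwriting the flag on each iteration with accumulating it across all pools.

```java
// Hypothetical illustration only; not taken from the test or the webrev.
class ThresholdFlagSketch {

    // Overwrites the flag on every iteration, so the result depends
    // only on the last pool that was examined.
    public static boolean overwrittenFlag(boolean[] poolExceeded) {
        boolean thresholdExceeded = false;
        for (boolean exceeded : poolExceeded) {
            thresholdExceeded = exceeded;
        }
        return thresholdExceeded;
    }

    // Accumulates the flag, so it is true if any pool reached its threshold.
    public static boolean accumulatedFlag(boolean[] poolExceeded) {
        boolean thresholdExceeded = false;
        for (boolean exceeded : poolExceeded) {
            thresholdExceeded |= exceeded;
        }
        return thresholdExceeded;
    }

    public static void main(String[] args) {
        // First pool is over its threshold, second pool is not.
        boolean[] pools = { true, false };
        System.out.println("overwritten:  " + overwrittenFlag(pools));
        System.out.println("accumulated:  " + accumulatedFlag(pools));
    }
}
```

With the first pool over its threshold and the second one not, the overwritten flag ends up false, so a loop relying on it would not slow down.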

From staffan.larsen at oracle.com  Thu Feb 27 03:38:15 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 27 Feb 2014 12:38:15 +0100
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
In-Reply-To: <4b9ed302-c006-4829-b732-2ffc123d84f4@default>
References: <4b9ed302-c006-4829-b732-2ffc123d84f4@default>
Message-ID: <0FF85AAC-03C4-49BA-8F6F-4FCEA09138EA@oracle.com>

Looks good!

Thanks,
/Staffan

On 27 feb 2014, at 11:43, Mattias Tobiasson <mattias.tobiasson at oracle.com> wrote:

> Hi,
> Could you please review this test fix.
> 
> The test verifies that MemoryPoolMXBean sends a notification when used memory has reached the threshold.
> The flag thresholdExceeded marks if we have reached the memory threshold. When the flag is set, the test slows down to give time for the notification to be received. 
> The problem is that thresholdExceeded is overwritten every time in the loop. Instead it should be set if any pool has reached the threshold. This means that the test continues to allocate memory at full speed, and we may get an OutOfMemory before we get the notification.
> 
> bug:
> https://bugs.openjdk.java.net/browse/JDK-8031065
> 
> webrev:
> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
> 
> Mattias


From taras.ledkov at oracle.com  Thu Feb 27 03:39:38 2014
From: taras.ledkov at oracle.com (taras ledkov)
Date: Thu, 27 Feb 2014 15:39:38 +0400
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com>
References: <529EF58F.5000701@oracle.com>
	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>
	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>
	<52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com>
	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>
	<52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com>
	<52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com>
	<53049009.9020204@oracle.com> <530CA007.5080006@oracle.com>
	<530DF91C.5010000@oracle.com>
	<2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com>
Message-ID: <530F23FA.6070805@oracle.com>

Staffan, Excuse me.
I didn't know about that.
I looked at the http://openjdk.java.net/census#serviceability.

Now I'm in need of the second review.

On 26.02.2014 18:36, Staffan Larsen wrote:
>
> On 26 feb 2014, at 15:24, taras ledkov <taras.ledkov at oracle.com> wrote:
>
>> Hi,
>>
>> Alan, Mandy could you please review the fix: https://bugs.openjdk.java.net/browse/JDK-7195249.
>>
>> I had the discussion with Jaroslav and Staffan and they have approved my fix, but they are not reviewers.
>
> I am a Reviewer.
>
> Thanks,
> /Staffan
>
>>
>> Webrev for jdk part:
>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>>
>> Webrev for hs part:
>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>
>> On 25.02.2014 17:52, Jaroslav Bachorik wrote:
>>> Thumbs up. (not a "reviewer", though)
>>>
>>> -JB-
>>>
>>> On 19.2.2014 12:05, taras ledkov wrote:
>>>> Hi,
>>>>
>>>> Imports are fixed:
>>>>
>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>>>>
>>>> On 05.02.2014 17:20, Jaroslav Bachorik wrote:
>>>>> Hi Taras,
>>>>>
>>>>> thanks for taking care of this.
>>>>>
>>>>> The changes look fine to me.
>>>>>
>>>>> One minor nit is unused imports of the library classes in
>>>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java".
>>>>>
>>>>>
>>>>> It does not use any of those classes as its base class
>>>>> "AbstractFilePermissionTest" does all the heavy lifting.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> -JB-
>>>>>
>>>>> On 5.2.2014 13:42, taras ledkov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> So please take a look at the review against JDK9.
>>>>>> The reviewed patch had not been integrated into JDK8.
>>>>>>
>>>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has
>>>>>> been already patched by Jaroslav.
>>>>>>
>>>>>> Webrev for jdk part:
>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/
>>>>>>
>>>>>> Webrev for hs part:
>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>>>>>
>>>>>>
>>>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote:
>>>>>>> Hi Taras,
>>>>>>>
>>>>>>> On 21.1.2014 10:30, taras ledkov wrote:
>>>>>>>> Hi Jaroslav,
>>>>>>>>
>>>>>>>> Could you please review the last changes?
>>>>>>>> Are you OK?
>>>>>>>
>>>>>>> Yes, the change looks ok. But I think we will need to get back to this
>>>>>>> problem eventually and implement a central port dispatcher if we
>>>>>>> want to
>>>>>>> be 100% sure the port conflicts wouldn't occur. But your changes
>>>>>>> reduce
>>>>>>> the chance significantly.
>>>>>>>
>>>>>>> Thanks for taking care of this.
>>>>>>>
>>>>>>> -JB-
>>>>>>>
>>>>>>>>
>>>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>>>>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> /Staffan
>>>>>>>>>
>>>>>>>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Staffan,
>>>>>>>>>>
>>>>>>>>>> I fixed the tests according to your comments.
>>>>>>>>>> Are you OK?
>>>>>>>>>>
>>>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Please take a look at the new review.
>>>>>>>>>>>
>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> My answers are inline:
>>>>>>>>>>>
>>>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>>>>>>>> Hi Taras,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java.
>>>>>>>>>>>> Here's a couple of comments:
>>>>>>>>>>>>
>>>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>>>>>>>> This test could be simplified by not specifying an address at
>>>>>>>>>>>> all.
>>>>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp,
>>>>>>>>>>>> there
>>>>>>>>>>>> is no reason to specify an address. If address is unspecified
>>>>>>>>>>>> (and
>>>>>>>>>>>> server=y), the connector will pick an address and print it to the
>>>>>>>>>>>> command line. Thus the only change that needs to be done is to
>>>>>>>>>>>> remove
>>>>>>>>>>>> ",address=8888" from the @run command.
>>>>>>>>>>> fixed
>>>>>>>>>>>
>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>>>>>>>> These tests do not compile cleanly with an empty JTwork
>>>>>>>>>>>> directory. It
>>>>>>>>>>>> seems that having one @build for each class does not work well -
>>>>>>>>>>>> when
>>>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving
>>>>>>>>>>>> all
>>>>>>>>>>>> classes to one @build statement solved this problem for me.
>>>>>>>>>>> fixed
>>>>>>>>>>>
>>>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>>>>>>>   187         Future<Void> stdoutTask = stdout.process();
>>>>>>>>>>>>   188         Future<Void> stderrTask = stderr.process();
>>>>>>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>>>>>>> fixed
>>>>>>>>>>>
>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>>>>>>>> At first I thought something was wrong with this file - the
>>>>>>>>>>>> diff is
>>>>>>>>>>>> very weird. Then I realized you renamed an old file and created a
>>>>>>>>>>>> new
>>>>>>>>>>>> file using the old name.
>>>>>>>>>>> You are right. I did it to keep the test name.
>>>>>>>>>>>
>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>>>>>>>
>>>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks
>>>>>>>>>>>> like
>>>>>>>>>>>> you
>>>>>>>>>>>> delete the files at the beginning of the test in any case.
>>>>>>>>>>> I think yes. In the first place, this functionality was in the old
>>>>>>>>>>> code.
>>>>>>>>>>> In the second place, a file without write permission may be a
>>>>>>>>>>> problem
>>>>>>>>>>> for later cleanup (not by the test itself but, for example, by the
>>>>>>>>>>> test launcher scripts etc.)
>>>>>>>>>>>
>>>>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>>>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>>>>>>>> sub-classes, the other has a getter method.
>>>>>>>>>>> fixed
>>>>>>>>>>>
>>>>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an
>>>>>>>>>>>> ex.printStackTrace().
>>>>>>>>>>>> (Same thing for all other places where you call
>>>>>>>>>>>> Integer.parseInt())
>>>>>>>>>>> fixed
>>>>>>>>>>>
>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>>>>>>>> This file is never used as far as I can see.
>>>>>>>>>>> It is used by PasswordFilePermissionTest &
>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>>>>>>>> AbstractFilePermissionTest : 162).
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please take a look at the review with fixed issues about
>>>>>>>>>>>>> trying to
>>>>>>>>>>>>> launch test that needs free port several times.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please pay attention to the new method
>>>>>>>>>>>>> ProcessTools.startProcess(String,
>>>>>>>>>>>>> ProcessBuilder, Consumer<String>) that is used to analyze all
>>>>>>>>>>>>> output
>>>>>>>>>>>>> of a sub-process.  It has a common part with
>>>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>>> Predicate<String>,
>>>>>>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>>> Predicate<String>, long, TimeUnit) may be changed by adding
>>>>>>>>>>>>> LinePump
>>>>>>>>>>>>> to stderr if there is no serious reason for restricting the
>>>>>>>>>>>>> warm-up
>>>>>>>>>>>>> analysis to the stdout stream.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've consulted with Serviceability engineers (added them to the
>>>>>>>>>>>>>> CC list) and
>>>>>>>>>>>>>> they would like the tests to solve this problem as follows for now:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>>>>>>> Guys.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let me try to sum up what was said before and maybe suggest a
>>>>>>>>>>>>>>> compromise.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. There is a desire to have support for port allocation at the
>>>>>>>>>>>>>>> level of
>>>>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that
>>>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it
>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>> test harness API or a library API does not really matter from
>>>>>>>>>>>>>>> a usage
>>>>>>>>>>>>>>> point of view.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable,
>>>>>>>>>>>>>>> whatever
>>>>>>>>>>>>>>> port
>>>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>>>>>>>> perform
>>>>>>>>>>>>>>> the test logic with different ports until the test succeeds.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of
>>>>>>>>>>>>>>> course,
>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>> better be used in conjunction with a meaningful port selection
>>>>>>>>>>>>>>> algorithm.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> At the same time, copying loop-until-success logic from one
>>>>>>>>>>>>>>> test to another may not be the best solution. A library could
>>>>>>>>>>>>>>> help with that, I believe. There only needs to be an API method
>>>>>>>>>>>>>>> which takes behavior as a parameter and runs it until it
>>>>>>>>>>>>>>> succeeds. Something like:
>>>>>>>>>>>>>>> public <T> runOnAFreePort(Function<T, Integer>)
>>>>>>>>>>>>>>> or similar. There could be arguments about how/whether to
>>>>>>>>>>>>>>> implement it, the solution would not work for shell tests, etc,
>>>>>>>>>>>>>>> but still ...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>>>>>>>> agreement in
>>>>>>>>>>>>>>> the library, etc.
>>>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I personally prefer #1 because the changes are already
>>>>>>>>>>>>>>> supposed to make the tests more stable, and there are many
>>>>>>>>>>>>>>> more tests which use ports, so the scope of the problem is
>>>>>>>>>>>>>>> bigger than these.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Shura
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort()
>>>>>>>>>>>>>>>> does not
>>>>>>>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>>>>>>>> process.
>>>>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However,
>>>>>>>>>>>>>>>> there is a
>>>>>>>>>>>>>>>> solution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please, look at the
>>>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>>>>>>>> start a
>>>>>>>>>>>>>>>> process with a free port and then check if
>>>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has
>>>>>>>>>>>>>>>> been
>>>>>>>>>>>>>>>> thrown.
>>>>>>>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Whatever logic is chosen to select a free port, it is the
>>>>>>>>>>>>>>>>> library's responsibility to implement it, wouldn't you
>>>>>>>>>>>>>>>>> agree?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Should we decide to replace the port selection logic, we
>>>>>>>>>>>>>>>>> could do it later in the library.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the
>>>>>>>>>>>>>>>>>>> port is
>>>>>>>>>>>>>>>>>>> free.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Moreover, the port returned by getFreePort()[1] remains
>>>>>>>>>>>>>>>>>>> inaccessible for some time - it depends on the system
>>>>>>>>>>>>>>>>>>> setup; take a look at the discussions around SO_REUSEPORT
>>>>>>>>>>>>>>>>>>> for Linux or SO_REUSEADDR and SO_LINGER for BSD.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So from a stability point of view it's better to just
>>>>>>>>>>>>>>>>>>> return a random number between 49152 and 65535.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When
>>>>>>>>>>>>>>>>>> more tests are run in parallel, all of them requiring a
>>>>>>>>>>>>>>>>>> free port, nothing prevents the random function from
>>>>>>>>>>>>>>>>>> returning the same port to all of them. Also, two
>>>>>>>>>>>>>>>>>> subsequent requests can return the same port and cause
>>>>>>>>>>>>>>>>>> timing problems when a port used by a previous test is not
>>>>>>>>>>>>>>>>>> fully ready to be assigned to a different socket. And as
>>>>>>>>>>>>>>>>>> Dmitry pointed out, unless one can keep hold of the
>>>>>>>>>>>>>>>>>> allocated socket and use it later, there is no guarantee
>>>>>>>>>>>>>>>>>> that a port which was tested as unallocated will remain
>>>>>>>>>>>>>>>>>> unallocated for the next few milliseconds.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The only fail-proof solution would be a port allocating
>>>>>>>>>>>>>>>>>> service provided by the harness. Until then we can only
>>>>>>>>>>>>>>>>>> (hopefully) decrease the chance of intermittent failures
>>>>>>>>>>>>>>>>>> due to a port being in use.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> public static int getFreePort() throws InterruptedException,
>>>>>>>>>>>>>>>>>>>         IOException {
>>>>>>>>>>>>>>>>>>>     int port = -1;
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     while (port <= 0) {
>>>>>>>>>>>>>>>>>>>         Thread.sleep(100);
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>         ServerSocket serverSocket = null;
>>>>>>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>>>>>>             serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>>>>>>>             port = serverSocket.getLocalPort();
>>>>>>>>>>>>>>>>>>>         } finally {
>>>>>>>>>>>>>>>>>>>             serverSocket.close();
>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     return port;
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> fyi, the jdk.testlibrary.Utils.getFreePort() method will
>>>>>>>>>>>>>>>>>>>> open a free socket, close it and return the port number.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have
>>>>>>>>>>>>>>>>>>>> the system assign the port #.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> *The only* correct way to take a really free port is:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 1. Choose a random number between 49152 and 65535
>>>>>>>>>>>>>>>>>>>>> 2. Open a socket
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> if the socket fails - repeat step 1
>>>>>>>>>>>>>>>>>>>>> if the socket is OK - return the *socket*
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass
>>>>>>>>>>>>>>>>>>>>> the port number as a property value) you shouldn't do any
>>>>>>>>>>>>>>>>>>>>> pre-check as it has no value - as soon as you close the
>>>>>>>>>>>>>>>>>>>>> socket someone can take the port.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above
>>>>>>>>>>>>>>>>>>>>> and let the networking code that opens the socket handle
>>>>>>>>>>>>>>>>>>>>> the port conflict.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova &
>>>>>>>>>>>>>>>>>>>>>> Jaroslav
>>>>>>>>>>>>>>>>>>>>>> Bachorik
>>>>>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest &
>>>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>>>>>>> looked
>>>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for
>>>>>>>>>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've
>>>>>>>>>>>>>>>>>>>>>> renamed to
>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace
>>>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>> shell
>>>>>>>>>>>>>>>>>>>>>> script
>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called
>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java,
>>>>>>>>>>>>>>>>>>>>>> hence the
>>>>>>>>>>>>>>>>>>>>>> huge
>>>>>>>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines
>>>>>>>>>>>>>>>>>>>>>> similar
>>>>>>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided
>>>>>>>>>>>>>>>>>>>>>> to not
>>>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please
>>>>>>>>>>>>>>>>>>>>>> let me
>>>>>>>>>>>>>>>>>>>>>> know if
>>>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is
>>>>>>>>>>>>>>>>>>>>>> added to
>>>>>>>>>>>>>>>>>>>>>> hotspot
>>>>>>>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test
>>>>>>>>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> With best regards,
>>>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> With best regards,
>>>>>>>>>> Taras Ledkov
>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>> --
>> With best regards,
>> Taras Ledkov
>> Mail-To: taras.ledkov at oracle.com
>> skype: taras_ledkov
>> Phone: 7(812)3346-157
>
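
For reference, the approach Dmitry describes in the quoted thread (choose a random number between 49152 and 65535, open a socket, repeat on failure, and return the socket rather than the port number) could look roughly like the sketch below. This is only an illustration under stated assumptions: the class name, the method name and the maxAttempts parameter are hypothetical and are not part of jdk.testlibrary or of the webrevs under review.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, for illustration only.
class RandomPortSocketSketch {
    static final int MIN_PORT = 49152;
    static final int MAX_PORT = 65535;

    // Returns an open ServerSocket bound to a random port in the range.
    // The caller keeps the socket (and closes it when done), so the port
    // cannot be taken by another process in the meantime.
    public static ServerSocket openOnRandomPort(int maxAttempts) throws IOException {
        IOException lastFailure = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Upper bound of nextInt is exclusive, hence MAX_PORT + 1.
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return new ServerSocket(port);
            } catch (BindException e) {
                // Port is already in use: remember the failure and try another one.
                lastFailure = e;
            }
        }
        throw new IOException("No free port found after " + maxAttempts + " attempts",
                              lastFailure);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket socket = openOnRandomPort(20)) {
            System.out.println("Bound to port " + socket.getLocalPort());
        }
    }
}
```

This does not help when only the port number can be passed on (for example as a property value to a sub-process); in that case the thread settles on retrying the whole test step when the port turns out to be in use.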

-- 
With best regards,
Taras Ledkov
Mail-To: taras.ledkov at oracle.com
skype: taras_ledkov
Phone: 7(812)3346-157

From daniel.fuchs at oracle.com  Thu Feb 27 03:49:18 2014
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Thu, 27 Feb 2014 12:49:18 +0100
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
In-Reply-To: <4b9ed302-c006-4829-b732-2ffc123d84f4@default>
References: <4b9ed302-c006-4829-b732-2ffc123d84f4@default>
Message-ID: <530F263E.8090408@oracle.com>

On 2/27/14 11:43 AM, Mattias Tobiasson wrote:
> Hi,
> Could you please review this test fix.
>
> The test verifies that MemoryPoolMXBean sends a notification when used memory has reached the threshold.
> The flag thresholdExceeded marks if we have reached the memory threshold. When the flag is set, the test slows down to give time for the notification to be received.
> The problem is that thresholdExceeded is overwritten every time in the loop. Instead it should be set if any pool has reached the threshold. This means that the test continues to allocate memory at full speed, and we may get an OutOfMemory before we get the notification.
>

Hi Mattias,

I wonder whether you should also stop calling loadNext() once
thresholdExceeded is true?

best regards,

-- daniel

> bug:
> https://bugs.openjdk.java.net/browse/JDK-8031065
>
> webrev:
> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
>
> Mattias
>


From staffan.larsen at oracle.com  Thu Feb 27 03:50:13 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 27 Feb 2014 12:50:13 +0100
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <530F23FA.6070805@oracle.com>
References: <529EF58F.5000701@oracle.com>
	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>
	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>
	<52D6A61A.5020109@oracle.com> <52DD3BB6.2070607@oracle.com>
	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>
	<52DE3E3D.5070903@oracle.com> <52DE41BB.40309@oracle.com>
	<52F231D3.1050208@oracle.com> <52F23A97.7020807@oracle.com>
	<53049009.9020204@oracle.com> <530CA007.5080006@oracle.com>
	<530DF91C.5010000@oracle.com>
	<2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com>
	<530F23FA.6070805@oracle.com>
Message-ID: <840B8EC7-DAF9-44BB-9D53-33FA0DB11F9D@oracle.com>


On 27 feb 2014, at 12:39, taras ledkov <taras.ledkov at oracle.com> wrote:

> Staffan, Excuse me.

No problem.

> I didn't know about that.
> I looked at the http://openjdk.java.net/census#serviceability.

In this case the relevant project is jdk9 since that is where the change should happen: http://openjdk.java.net/census#jdk9

The "serviceability" group is mostly dormant.

> Now I'm in need of the second review.

For the jdk9 project the only requirement is one review from an official Reviewer. You currently have my review (an official Reviewer) and Jaroslav's review (not an official Reviewer). So you are all set.

Since you are not a Committer you cannot push your changes yourself, but need a sponsor to do that. I can do that for you if you send me the "hg export" output for the changeset that passes jcheck.

Thanks,
/Staffan


> 
> On 26.02.2014 18:36, Staffan Larsen wrote:
>> 
>> On 26 feb 2014, at 15:24, taras ledkov <taras.ledkov at oracle.com> wrote:
>> 
>>> Hi,
>>> 
>>> Alan, Mandy could you please review the fix: https://bugs.openjdk.java.net/browse/JDK-7195249.
>>> 
>>> I had the discussion with Jaroslav and Staffan and they have approved my fix, but they are not reviewers.
>> 
>> I am a Reviewer.
>> 
>> Thanks,
>> /Staffan
>> 
>>> 
>>> Webrev for jdk part:
>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>>> 
>>> Webrev for hs part:
>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>> 
>>> On 25.02.2014 17:52, Jaroslav Bachorik wrote:
>>>> Thumbs up. (not a "reviewer", though)
>>>> 
>>>> -JB-
>>>> 
>>>> On 19.2.2014 12:05, taras ledkov wrote:
>>>>> Hi,
>>>>> 
>>>>> Imports are fixed:
>>>>> 
>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>>>>> 
>>>>> On 05.02.2014 17:20, Jaroslav Bachorik wrote:
>>>>>> Hi Taras,
>>>>>> 
>>>>>> thanks for taking care of this.
>>>>>> 
>>>>>> The changes look fine to me.
>>>>>> 
>>>>>> One minor nit is unused imports of the library classes in
>>>>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java".
>>>>>> 
>>>>>> 
>>>>>> It does not use any of those classes as its base class
>>>>>> "AbstractFilePermissionTest" does all the heavy lifting.
>>>>>> 
>>>>>> Cheers,
>>>>>> 
>>>>>> -JB-
>>>>>> 
>>>>>> On 5.2.2014 13:42, taras ledkov wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> So please take a look at the review against JDK9.
>>>>>>> The reviewed patch had not been integrated into JDK8.
>>>>>>> 
>>>>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has
>>>>>>> been already patched by Jaroslav.
>>>>>>> 
>>>>>>> Webrev for jdk part:
>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/
>>>>>>> 
>>>>>>> Webrev for hs part:
>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>>>>>> 
>>>>>>> 
>>>>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote:
>>>>>>>> Hi Taras,
>>>>>>>> 
>>>>>>>> On 21.1.2014 10:30, taras ledkov wrote:
>>>>>>>>> Hi Jaroslav,
>>>>>>>>> 
>>>>>>>>> Could you please review the last changes?
>>>>>>>>> Are you OK?
>>>>>>>> 
>>>>>>>> Yes, the change looks ok. But I think we will need to get back to this
>>>>>>>> problem eventually and implement a central port dispatcher if we
>>>>>>>> want to
>>>>>>>> be 100% sure the port conflicts wouldn't occur. But your changes
>>>>>>>> reduce
>>>>>>>> the chance significantly.
>>>>>>>> 
>>>>>>>> Thanks for taking care of this.
>>>>>>>> 
>>>>>>>> -JB-
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>>>>>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> /Staffan
>>>>>>>>>> 
>>>>>>>>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Staffan,
>>>>>>>>>>> 
>>>>>>>>>>> I fixed the tests according to your comments.
>>>>>>>>>>> Are you OK?
>>>>>>>>>>> 
>>>>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> Please take a look at the new review.
>>>>>>>>>>>> 
>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>>>>>>>> 
>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>>>>>>>> 
>>>>>>>>>>>> My answers are inline:
>>>>>>>>>>>> 
>>>>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>>>>>>>>> Hi Taras,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks for doing this clean up and conversion of tests into Java.
>>>>>>>>>>>>> Here's a couple of comments:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>>>>>>>>> This test could be simplified by not specifying an address at
>>>>>>>>>>>>> all.
>>>>>>>>>>>>> Since the test never connects to the JVM started with -Xrunjdwp,
>>>>>>>>>>>>> there
>>>>>>>>>>>>> is no reason to specify an address. If address is unspecified
>>>>>>>>>>>>> (and
>>>>>>>>>>>>> server=y), the connector will pick an address and print it to the
>>>>>>>>>>>>> command line. Thus the only change that needs to be done is to
>>>>>>>>>>>>> remove
>>>>>>>>>>>>> ",address=8888" from the @run command.
>>>>>>>>>>>> fixed
>>>>>>>>>>>> 
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>>>>>>>>> These tests do not compile cleanly with an empty JTwork
>>>>>>>>>>>>> directory. It
>>>>>>>>>>>>> seems that having one @build for each class does not work well -
>>>>>>>>>>>>> when
>>>>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger. Moving
>>>>>>>>>>>>> all
>>>>>>>>>>>>> classes to one @build statement solved this problem for me.
>>>>>>>>>>>> fixed
>>>>>>>>>>>> 
>>>>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>>>>>>>>  187         Future<Void> stdoutTask = stdout.process();
>>>>>>>>>>>>>  188         Future<Void> stderrTask = stderr.process();
>>>>>>>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>>>>>>>> fixed
>>>>>>>>>>>> 
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>>>>>>>>> At first I thought something was wrong with this file - the
>>>>>>>>>>>>> diff is
>>>>>>>>>>>>> very weird. Then I realized you renamed an old file and created a
>>>>>>>>>>>>> new
>>>>>>>>>>>>> file using the old name.
>>>>>>>>>>>> You are right. I did it to keep the test name.
>>>>>>>>>>>> 
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks
>>>>>>>>>>>>> like
>>>>>>>>>>>>> you
>>>>>>>>>>>>> delete the files at the beginning of the test in any case.
>>>>>>>>>>>> I think yes. In the first place, this functionality was at the old
>>>>>>>>>>>> code.
>>>>>>>>>>>> In the second place, a file without write permission may be a
>>>>>>>>>>>> problem
>>>>>>>>>>>> for a further cleanup (not by the test, for example for the tests
>>>>>>>>>>>> launcher scripts etc.)
>>>>>>>>>>>> 
>>>>>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>>>>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>>>>>>>>> sub-classes, the other has a getter method.
>>>>>>>>>>>> fixed
>>>>>>>>>>>> 
>>>>>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an
>>>>>>>>>>>>> ex.printStackTrace().
>>>>>>>>>>>>> (Same thing for all other places where you call
>>>>>>>>>>>>> Integer.parseInt())
>>>>>>>>>>>> fixed
>>>>>>>>>>>> 
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>>>>>>>>> This file is never used as far as I can see.
>>>>>>>>>>>> It is used by PasswordFilePermissionTest &
>>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>>>>>>>>> AbstractFilePermissionTest : 162).
>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Please take a look at the review with fixed issues about
>>>>>>>>>>>>>> trying to
>>>>>>>>>>>>>> launch test that needs free port several times.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Pay your attention to new method
>>>>>>>>>>>>>> ProcessTools.startProcess(String,
>>>>>>>>>>>>>> ProcessBuilder, Consumer<String>) that is used to analyze all
>>>>>>>>>>>>>> output
>>>>>>>>>>>>>> of a sub-process.  It has common part with
>>>>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>>>> Predicate<String>,
>>>>>>>>>>>>>> long, TimeUnit) that is used to determine the warm-up moment.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I think the ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>>>> Predicate<String>, long, TimeUnit) may be changed by adding
>>>>>>>>>>>>>> LinePump
>>>>>>>>>>>>>> to stderr if there is not serious reason for restricting the
>>>>>>>>>>>>>> warm-up
>>>>>>>>>>>>>> analysis to stdout stream.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I've consulted with Serviceability engineers (add them to CC
>>>>>>>>>>>>>>> list) and
>>>>>>>>>>>>>>> they would like to see tests to solve these problem so far:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>>>>>>>> Guys.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Let me try to sum up what was said before and may be suggest a
>>>>>>>>>>>>>>>> compromise.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 1. There is a desire to have a support port allocation on the
>>>>>>>>>>>>>>>> level of
>>>>>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that
>>>>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249). Whether it
>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>> test harness API or a library API does not really matter from
>>>>>>>>>>>>>>>> usage
>>>>>>>>>>>>>>>> point of view.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable,
>>>>>>>>>>>>>>>> whatever
>>>>>>>>>>>>>>>> port
>>>>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>>>>>>>>> perform
>>>>>>>>>>>>>>>> the test logic with different ports until the test succeeds.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of
>>>>>>>>>>>>>>>> course,
>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>> better be used in conjunction with a meaningful port selection
>>>>>>>>>>>>>>>> algorithm.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> At the same time, copying a loop-until-success logic from one
>>>>>>>>>>>>>>>> test to
>>>>>>>>>>>>>>>> another may be not the best solution. Library could help with
>>>>>>>>>>>>>>>> that I
>>>>>>>>>>>>>>>> believe. There only need to be an API method which takes
>>>>>>>>>>>>>>>> behavior as a
>>>>>>>>>>>>>>>> parameter and run it until it succeeds. Something like:
>>>>>>>>>>>>>>>> public <T> runOnAFreePort(Function<T, Integer>)
>>>>>>>>>>>>>>>> or similar. There could be arguments of how/whether to
>>>>>>>>>>>>>>>> implement
>>>>>>>>>>>>>>>> it,
>>>>>>>>>>>>>>>> the solution would not work for shell tests, etc, but still
>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>>>>>>>>> agreement in
>>>>>>>>>>>>>>>> the library, etc.
>>>>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I personally prefer #1 for the reason that the changes already
>>>>>>>>>>>>>>>> supposed to make the tests more stable and also there are many
>>>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>>> tests which use ports, so the scope of the problem is
>>>>>>>>>>>>>>>> bigger
>>>>>>>>>>>>>>>> than these.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Shura
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I agree with the previous comments, that Utils.getFreePort()
>>>>>>>>>>>>>>>>> does not
>>>>>>>>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>>>>>>>>> process.
>>>>>>>>>>>>>>>>> Unfortunately I don't think the library can do more. However,
>>>>>>>>>>>>>>>>> there is a
>>>>>>>>>>>>>>>>> solution.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Please, look at the
>>>>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>>>>>>>>> start a
>>>>>>>>>>>>>>>>> process with a free port and then check if
>>>>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has
>>>>>>>>>>>>>>>>> been
>>>>>>>>>>>>>>>>> thrown.
>>>>>>>>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Whatever logic is to be chosen to select a free port, it is
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> library responsibility to implements it, would not you
>>>>>>>>>>>>>>>>>> agree?
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests as is.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Should we decide to replace logic of the port selection, we
>>>>>>>>>>>>>>>>>> could do
>>>>>>>>>>>>>>>>>> it later in the library.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee that the
>>>>>>>>>>>>>>>>>>>> port is
>>>>>>>>>>>>>>>>>>>> free.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Moreover, port returned by getFreePort()[1] remains not
>>>>>>>>>>>>>>>>>>>> accessible
>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> some time - it depends to system setup, take a look to
>>>>>>>>>>>>>>>>>>>> discussions
>>>>>>>>>>>>>>>>>>>> around SO_REUSEPORT for Linux or SO_REUSEADDR and
>>>>>>>>>>>>>>>>>>>> SO_LINGER
>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> BSD.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> So from stability point of view it's better to just return
>>>>>>>>>>>>>>>>>>>> random
>>>>>>>>>>>>>>>>>>>> number
>>>>>>>>>>>>>>>>>>>> between 49152 and 65535.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much. When
>>>>>>>>>>>>>>>>>>> there are
>>>>>>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free port,
>>>>>>>>>>>>>>>>>>> nothing
>>>>>>>>>>>>>>>>>>> prevents the random function to return the same port to
>>>>>>>>>>>>>>>>>>> all of
>>>>>>>>>>>>>>>>>>> them.
>>>>>>>>>>>>>>>>>>> Also, two subsequent requests can return the same port and
>>>>>>>>>>>>>>>>>>> cause
>>>>>>>>>>>>>>>>>>> problems with timing when a port used by a previous test is
>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>> fully
>>>>>>>>>>>>>>>>>>> ready to be assigned to a different socket. And as Dmitry
>>>>>>>>>>>>>>>>>>> pointed out
>>>>>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and use it
>>>>>>>>>>>>>>>>>>> later
>>>>>>>>>>>>>>>>>>> there
>>>>>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated
>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>> remain
>>>>>>>>>>>>>>>>>>> unallocated also for the next few milliseconds.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating
>>>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>>>> provided
>>>>>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully) decrease
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> chance
>>>>>>>>>>>>>>>>>>> of intermittent failures due to a port being in use.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> public static int getFreePort() throws InterruptedException, IOException {
>>>>>>>>>>>>>>>>>>>>     int port = -1;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     while (port <= 0) {
>>>>>>>>>>>>>>>>>>>>         Thread.sleep(100);
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         ServerSocket serverSocket = null;
>>>>>>>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>>>>>>>             serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>>>>>>>>             port = serverSocket.getLocalPort();
>>>>>>>>>>>>>>>>>>>>         } finally {
>>>>>>>>>>>>>>>>>>>>             serverSocket.close();
>>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     return port;
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> fyi,  The jdk.testlibrary.Utils.getFreePort() method will
>>>>>>>>>>>>>>>>>>>>> open a free socket, close it and return
>>>>>>>>>>>>>>>>>>>>> the port number.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> system
>>>>>>>>>>>>>>>>>>>>> assign
>>>>>>>>>>>>>>>>>>>>> the port #.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 1. Choose random number between 49152 and 65535
>>>>>>>>>>>>>>>>>>>>>> 2. Open socket
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1
>>>>>>>>>>>>>>>>>>>>>> if socket OK - return *socket*
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass
>>>>>>>>>>>>>>>>>>>>>> port
>>>>>>>>>>>>>>>>>>>>>> number as
>>>>>>>>>>>>>>>>>>>>>> property value) you shouldn't do any pre-check as it
>>>>>>>>>>>>>>>>>>>>>> has no
>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>> - as soon as you close the socket someone can take the port.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above
>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>> let
>>>>>>>>>>>>>>>>>>>>>> networking
>>>>>>>>>>>>>>>>>>>>>> code opening socket to handle port conflict.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova &
>>>>>>>>>>>>>>>>>>>>>>> Jaroslav
>>>>>>>>>>>>>>>>>>>>>>> Bachorik
>>>>>>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest &
>>>>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>>>>>>>> looked
>>>>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was created for
>>>>>>>>>>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've
>>>>>>>>>>>>>>>>>>>>>>> renamed to
>>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace
>>>>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>> shell
>>>>>>>>>>>>>>>>>>>>>>> script
>>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called
>>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java,
>>>>>>>>>>>>>>>>>>>>>>> hence the
>>>>>>>>>>>>>>>>>>>>>>> huge
>>>>>>>>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines
>>>>>>>>>>>>>>>>>>>>>>> similar
>>>>>>>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless decided
>>>>>>>>>>>>>>>>>>>>>>> to not
>>>>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is. Please
>>>>>>>>>>>>>>>>>>>>>>> let me
>>>>>>>>>>>>>>>>>>>>>>> know if
>>>>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is
>>>>>>>>>>>>>>>>>>>>>>> added to
>>>>>>>>>>>>>>>>>>>>>>> hotspot
>>>>>>>>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test
>>>>>>>>>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> With best regards,
>>>>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> With best regards,
>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> --
>>> With best regards,
>>> Taras Ledkov
>>> Mail-To: taras.ledkov at oracle.com
>>> skype: taras_ledkov
>>> Phone: 7(812)3346-157
>> 
> 
> -- 
> With best regards,
> Taras Ledkov
> Mail-To: taras.ledkov at oracle.com
> skype: taras_ledkov
> Phone: 7(812)3346-157


From jaroslav.bachorik at oracle.com  Thu Feb 27 03:56:13 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 27 Feb 2014 12:56:13 +0100
Subject: Review request for 7195249: Some jtreg tests use hard coded ports
In-Reply-To: <530F23FA.6070805@oracle.com>
References: <529EF58F.5000701@oracle.com>	<52A58687.6020708@oracle.com>	<52A5953A.5040102@oracle.com>	<52A7061E.8040002@oracle.com>	<52BC2A7D.3070403@oracle.com>	<B27290D9-2EA6-4330-9A31-59957AA968ED@oracle.com>	<52D6A61A.5020109@oracle.com>
	<52DD3BB6.2070607@oracle.com>	<A2DEA977-75EF-4AC2-AC86-E22B8BBFFE97@oracle.com>	<52DE3E3D.5070903@oracle.com>
	<52DE41BB.40309@oracle.com>	<52F231D3.1050208@oracle.com>
	<52F23A97.7020807@oracle.com>	<53049009.9020204@oracle.com>
	<530CA007.5080006@oracle.com>	<530DF91C.5010000@oracle.com>	<2C889ABA-D705-4518-85EC-C5686DC43647@oracle.com>
	<530F23FA.6070805@oracle.com>
Message-ID: <530F27DD.3000704@oracle.com>

On 27.2.2014 12:39, taras ledkov wrote:
> Staffan, Excuse me.
> I didn't know about that.
> I looked at the http://openjdk.java.net/census#serviceability.
>
> Now I'm in need of the second review.

You only need one Reviewer to approve. The other approvals may come from 
non-Reviewers.

-JB-

>
> On 26.02.2014 18:36, Staffan Larsen wrote:
>>
>> On 26 feb 2014, at 15:24, taras ledkov <taras.ledkov at oracle.com> wrote:
>>
>>> Hi,
>>>
>>> Alan, Mandy could you please review the fix:
>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>
>>> I had the discussion with Jaroslav and Staffan and they have approved
>>> my fix, but they are not reviewers.
>>
>> I am a Reviewer.
>>
>> Thanks,
>> /Staffan
>>
>>>
>>> Webrev for jdk part:
>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>>>
>>> Webrev for hs part:
>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>>
>>> On 25.02.2014 17:52, Jaroslav Bachorik wrote:
>>>> Thumbs up. (not a "reviewer", though)
>>>>
>>>> -JB-
>>>>
>>>> On 19.2.2014 12:05, taras ledkov wrote:
>>>>> Hi,
>>>>>
>>>>> Imports are fixed:
>>>>>
>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.04/
>>>>>
>>>>> On 05.02.2014 17:20, Jaroslav Bachorik wrote:
>>>>>> Hi Taras,
>>>>>>
>>>>>> thanks for taking care of this.
>>>>>>
>>>>>> The changes look fine to me.
>>>>>>
>>>>>> One minor nit is unused imports of the library classes in
>>>>>> "test/sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java".
>>>>>>
>>>>>>
>>>>>>
>>>>>> It does not use any of those classes as its base class
>>>>>> "AbstractFilePermissionTest" does all the heavy lifting.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>> On 5.2.2014 13:42, taras ledkov wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> So please take a look at the review against JDK9.
>>>>>>> The reviewed patch had not been integrated into JDK8.
>>>>>>>
>>>>>>> Port to JDK9 is identical. The difference: the ProcessTools.java has
>>>>>>> been already patched by Jaroslav.
>>>>>>>
>>>>>>> Webrev for jdk part:
>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.03/
>>>>>>>
>>>>>>> Webrev for hs part:
>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.03/
>>>>>>>
>>>>>>>
>>>>>>> On 21.01.2014 13:45, Jaroslav Bachorik wrote:
>>>>>>>> Hi Taras,
>>>>>>>>
>>>>>>>> On 21.1.2014 10:30, taras ledkov wrote:
>>>>>>>>> Hi Jaroslav,
>>>>>>>>>
>>>>>>>>> Could you please review the last changes?
>>>>>>>>> Are you OK?
>>>>>>>>
>>>>>>>> Yes, the change looks ok. But I think we will need to get back
>>>>>>>> to this
>>>>>>>> problem eventually and implement a central port dispatcher if we
>>>>>>>> want to
>>>>>>>> be 100% sure the port conflicts wouldn't occur. But your changes
>>>>>>>> reduce
>>>>>>>> the chance significantly.
>>>>>>>>
>>>>>>>> Thanks for taking care of this.
>>>>>>>>
>>>>>>>> -JB-
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 20.01.2014 19:21, Staffan Larsen wrote:
>>>>>>>>>> Sorry for not replying earlier. Yes, I'm ok with these changes.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> /Staffan
>>>>>>>>>>
>>>>>>>>>> On 20 jan 2014, at 16:07, taras ledkov <taras.ledkov at oracle.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Staffan,
>>>>>>>>>>>
>>>>>>>>>>> I fixed the tests according to your comments.
>>>>>>>>>>> Are you OK?
>>>>>>>>>>>
>>>>>>>>>>> On 15.01.2014 19:15, taras ledkov wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Please take a look at the new review.
>>>>>>>>>>>>
>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>> My answers are inline:
>>>>>>>>>>>>
>>>>>>>>>>>> On 08.01.2014 17:46, Staffan Larsen wrote:
>>>>>>>>>>>>> Hi Taras,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for doing this clean up and conversion of tests into
>>>>>>>>>>>>> Java.
>>>>>>>>>>>>> Here's a couple of comments:
>>>>>>>>>>>>>
>>>>>>>>>>>>> test/runtime/6294277/SourceDebugExtension.java:
>>>>>>>>>>>>> This test could be simplified by not specifying an address at
>>>>>>>>>>>>> all.
>>>>>>>>>>>>> Since the test never connects to the JVM started with
>>>>>>>>>>>>> -Xrunjdwp,
>>>>>>>>>>>>> there
>>>>>>>>>>>>> is no reason to specify an address. If address is unspecified
>>>>>>>>>>>>> (and
>>>>>>>>>>>>> server=y), the connector will pick an address and print it
>>>>>>>>>>>>> to the
>>>>>>>>>>>>> command line. Thus the only change that needs to be done is to
>>>>>>>>>>>>> remove
>>>>>>>>>>>>> ",address=8888" from the @run command.
>>>>>>>>>>>> fixed
>>>>>>>>>>>>
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiBootstrapTest.sh:
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiSslBootstrapTest.sh:
>>>>>>>>>>>>>
>>>>>>>>>>>>> These tests do not compile cleanly with an empty JTwork
>>>>>>>>>>>>> directory. It
>>>>>>>>>>>>> seems that having one @build for each class does not work
>>>>>>>>>>>>> well -
>>>>>>>>>>>>> when
>>>>>>>>>>>>> compiling RmiBootstrapTest.java it cannot find TestLogger.
>>>>>>>>>>>>> Moving
>>>>>>>>>>>>> all
>>>>>>>>>>>>> classes to one @build statement solved this problem for me.
>>>>>>>>>>>> fixed
>>>>>>>>>>>>
>>>>>>>>>>>>> test/lib/testlibrary/jdk/testlibrary/ProcessTools.java:
>>>>>>>>>>>>>   187         Future<Void> stdoutTask = stdout.process();
>>>>>>>>>>>>>   188         Future<Void> stderrTask = stderr.process();
>>>>>>>>>>>>> The stdoutTask and stderrTask variables are unused.
>>>>>>>>>>>> fixed
>>>>>>>>>>>>
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java:
>>>>>>>>>>>>>
>>>>>>>>>>>>> At first I thought something was wrong with this file - the
>>>>>>>>>>>>> diff is
>>>>>>>>>>>>> very weird. Then I realized you renamed an old file and
>>>>>>>>>>>>> created a
>>>>>>>>>>>>> new
>>>>>>>>>>>>> file using the old name.
>>>>>>>>>>>> You are right. I did it to keep the test name.
>>>>>>>>>>>>
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/AbstractFilePermissionTest.java:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Is resetPasswordFilePermission() really necessary? It looks
>>>>>>>>>>>>> like
>>>>>>>>>>>>> you
>>>>>>>>>>>>> delete the files at the beginning of the test in any case.
>>>>>>>>>>>> I think yes. In the first place, this functionality was in the old
>>>>>>>>>>>> code.
>>>>>>>>>>>> In the second place, a file without write permission may be a
>>>>>>>>>>>> problem
>>>>>>>>>>>> for a further cleanup (not by the test, for example for the
>>>>>>>>>>>> tests
>>>>>>>>>>>> launcher scripts etc.)
>>>>>>>>>>>>
>>>>>>>>>>>>> - I find the names and usage of "mgmt" and "file2PermissionTest"
>>>>>>>>>>>>> confusing. They are both Paths. One is used directly by the
>>>>>>>>>>>>> sub-classes, the other has a getter method.
>>>>>>>>>>>> fixed
>>>>>>>>>>>>
>>>>>>>>>>>>> - Lines 57-58: Don't swallow exceptions, add an
>>>>>>>>>>>>> ex.printStackTrace().
>>>>>>>>>>>>> (Same thing for all other places where you call
>>>>>>>>>>>>> Integer.parseInt())
>>>>>>>>>>>> fixed
>>>>>>>>>>>>
>>>>>>>>>>>>> test/sun/management/jmxremote/bootstrap/Dummy.java:
>>>>>>>>>>>>> This file is never used as far as I can see.
>>>>>>>>>>>> It is used by PasswordFilePermissionTest &
>>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>>> via the AbstractFilePermissionTest (see the doTest method,
>>>>>>>>>>>> AbstractFilePermissionTest : 162).
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 26 dec 2013, at 14:09, taras ledkov
>>>>>>>>>>>>> <taras.ledkov at oracle.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please take a look at the review with fixed issues about
>>>>>>>>>>>>>> trying to
>>>>>>>>>>>>>> launch test that needs free port several times.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please pay attention to the new method
>>>>>>>>>>>>>> ProcessTools.startProcess(String, ProcessBuilder, Consumer<String>),
>>>>>>>>>>>>>> which is used to analyze all output of a sub-process. It shares
>>>>>>>>>>>>>> code with ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>>>> Predicate<String>, long, TimeUnit), which is used to determine the
>>>>>>>>>>>>>> warm-up moment.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think ProcessTools.startProcess(String, ProcessBuilder,
>>>>>>>>>>>>>> Predicate<String>, long, TimeUnit) may be changed by adding a
>>>>>>>>>>>>>> LinePump to stderr if there is no serious reason for restricting
>>>>>>>>>>>>>> the warm-up analysis to the stdout stream.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10.12.2013 16:16, Yekaterina Kantserova wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've consulted with the Serviceability engineers (added them to the
>>>>>>>>>>>>>>> CC list) and they would so far like to see the tests solve this
>>>>>>>>>>>>>>> problem as follows:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/09/2013 11:02 AM, Alexandre (Shura) Iline wrote:
>>>>>>>>>>>>>>>> Guys.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Let me try to sum up what was said before and maybe suggest a
>>>>>>>>>>>>>>>> compromise.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. There is a desire to have a support port allocation
>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>> level of
>>>>>>>>>>>>>>>> a JTReg suite execution. Taras created a bug for that
>>>>>>>>>>>>>>>> (https://bugs.openjdk.java.net/browse/JDK-7195249).
>>>>>>>>>>>>>>>> Whether it
>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>> test harness API or a library API does not really matter
>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>> usage
>>>>>>>>>>>>>>>> point of view.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. There is no way to make the tests absolutely stable,
>>>>>>>>>>>>>>>> whatever
>>>>>>>>>>>>>>>> port
>>>>>>>>>>>>>>>> allocation logic is used. The best we could do is to try to
>>>>>>>>>>>>>>>> perform
>>>>>>>>>>>>>>>> the test logic with different ports until the test
>>>>>>>>>>>>>>>> succeeds.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Both arguments make sense. #2 is the ultimate answer, of
>>>>>>>>>>>>>>>> course,
>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>> better be used in conjunction with a meaningful port
>>>>>>>>>>>>>>>> selection
>>>>>>>>>>>>>>>> algorithm.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> At the same time, copying loop-until-success logic from one test
>>>>>>>>>>>>>>>> to another may not be the best solution. A library could help with
>>>>>>>>>>>>>>>> that, I believe. There only needs to be an API method which takes
>>>>>>>>>>>>>>>> behavior as a parameter and runs it until it succeeds. Something
>>>>>>>>>>>>>>>> like:
>>>>>>>>>>>>>>>> public <T> runOnAFreePort(Function<T, Integer>)
>>>>>>>>>>>>>>>> or similar. There could be arguments of how/whether to
>>>>>>>>>>>>>>>> implement
>>>>>>>>>>>>>>>> it,
>>>>>>>>>>>>>>>> the solution would not work for shell tests, etc, but still
>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> With the tests in question though, we have a few options.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Integrate tests as is. Get to it later after reaching
>>>>>>>>>>>>>>>> agreement in
>>>>>>>>>>>>>>>> the library, etc.
>>>>>>>>>>>>>>>> 2. Implement loops in every test.
>>>>>>>>>>>>>>>> 3. Wait for the library to be ready and only then integrate
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please let us know which one is closer to your heart.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I personally prefer #1 for the reason that the changes are already
>>>>>>>>>>>>>>>> supposed to make the tests more stable and also there are many
>>>>>>>>>>>>>>>> more tests which use ports, so the scope of the problem is bigger
>>>>>>>>>>>>>>>> than these.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Shura
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I agree with the previous comments, that
>>>>>>>>>>>>>>>>> Utils.getFreePort()
>>>>>>>>>>>>>>>>> does not
>>>>>>>>>>>>>>>>> guarantee the port will be still free when you start your
>>>>>>>>>>>>>>>>> process.
>>>>>>>>>>>>>>>>> Unfortunately I don't think the library can do more.
>>>>>>>>>>>>>>>>> However,
>>>>>>>>>>>>>>>>> there is a
>>>>>>>>>>>>>>>>> solution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please, look at the
>>>>>>>>>>>>>>>>> *jdk/test/sun/tools/jstatd/JstatdTest.java
>>>>>>>>>>>>>>>>> tryToSetupJstatdProcess()*. In brief, the test will try to
>>>>>>>>>>>>>>>>> start a
>>>>>>>>>>>>>>>>> process with a free port and then check if
>>>>>>>>>>>>>>>>> /java.rmi.server.ExportException: Port already in use/ has
>>>>>>>>>>>>>>>>> been
>>>>>>>>>>>>>>>>> thrown.
>>>>>>>>>>>>>>>>> If yes, you have to retry.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Katja
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 12/02/2013 01:39 PM, taras ledkov wrote:
>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Whatever logic is chosen to select a free port, it is the
>>>>>>>>>>>>>>>>>> library's responsibility to implement it, wouldn't you agree?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hence what I am suggesting is to integrate the tests
>>>>>>>>>>>>>>>>>> as is.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Should we decide to replace logic of the port
>>>>>>>>>>>>>>>>>> selection, we
>>>>>>>>>>>>>>>>>> could do
>>>>>>>>>>>>>>>>>> it later in the library.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 21.11.2013 15:00, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>>>>> On 20.11.2013 18:38, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>>>> Roger,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> As soon as we close a socket nobody can guarantee
>>>>>>>>>>>>>>>>>>>> that the
>>>>>>>>>>>>>>>>>>>> port is
>>>>>>>>>>>>>>>>>>>> free.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Moreover, the port returned by getFreePort()[1] remains
>>>>>>>>>>>>>>>>>>>> inaccessible for some time. How long depends on the system
>>>>>>>>>>>>>>>>>>>> setup; take a look at the discussions around SO_REUSEPORT for
>>>>>>>>>>>>>>>>>>>> Linux or SO_REUSEADDR and SO_LINGER for BSD.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> So from stability point of view it's better to just
>>>>>>>>>>>>>>>>>>>> return
>>>>>>>>>>>>>>>>>>>> random
>>>>>>>>>>>>>>>>>>>> number
>>>>>>>>>>>>>>>>>>>> between 49152 and 65535.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Well, this doesn't seem to improve the odds by much.
>>>>>>>>>>>>>>>>>>> When
>>>>>>>>>>>>>>>>>>> there are
>>>>>>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>>>>>> tests run in parallel, all of them requiring a free
>>>>>>>>>>>>>>>>>>> port,
>>>>>>>>>>>>>>>>>>> nothing
>>>>>>>>>>>>>>>>>>> prevents the random function to return the same port to
>>>>>>>>>>>>>>>>>>> all of
>>>>>>>>>>>>>>>>>>> them.
>>>>>>>>>>>>>>>>>>> Also, two subsequent requests can return the same
>>>>>>>>>>>>>>>>>>> port and
>>>>>>>>>>>>>>>>>>> cause
>>>>>>>>>>>>>>>>>>> problems with timing when a port used by a previous
>>>>>>>>>>>>>>>>>>> test is
>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>> fully
>>>>>>>>>>>>>>>>>>> ready to be assigned to a different socket. And as
>>>>>>>>>>>>>>>>>>> Dmitry
>>>>>>>>>>>>>>>>>>> pointed out
>>>>>>>>>>>>>>>>>>> unless one can keep hold of the allocated socket and
>>>>>>>>>>>>>>>>>>> use it
>>>>>>>>>>>>>>>>>>> later
>>>>>>>>>>>>>>>>>>> there
>>>>>>>>>>>>>>>>>>> is no guarantee that a port which was tested unallocated
>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>> remain
>>>>>>>>>>>>>>>>>>> unallocated also for the next few milliseconds.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The only fail proof solution would be a port allocating
>>>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>>>> provided
>>>>>>>>>>>>>>>>>>> by the harness. Until then we can only (hopefully)
>>>>>>>>>>>>>>>>>>> decrease
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> chance
>>>>>>>>>>>>>>>>>>> of intermittent failures due to a port being in use.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> public static int getFreePort() throws InterruptedException, IOException {
>>>>>>>>>>>>>>>>>>>>     int port = -1;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     while (port <= 0) {
>>>>>>>>>>>>>>>>>>>>         Thread.sleep(100);
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         ServerSocket serverSocket = null;
>>>>>>>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>>>>>>>             serverSocket = new ServerSocket(0);
>>>>>>>>>>>>>>>>>>>>             port = serverSocket.getLocalPort();
>>>>>>>>>>>>>>>>>>>>         } finally {
>>>>>>>>>>>>>>>>>>>>             serverSocket.close();
>>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     return port;
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 2013-11-20 19:40, roger riggs wrote:
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> fyi, the jdk.testlibrary.Utils.getFreePort() method will open a
>>>>>>>>>>>>>>>>>>>>> free socket, close it and return the port number.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> And as Alan recommended, use (0) when possible to have
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> system
>>>>>>>>>>>>>>>>>>>>> assign
>>>>>>>>>>>>>>>>>>>>> the port #.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Roger
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 11/20/2013 8:04 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>>>>>>>>>>>> Taras,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> *The only* correct way to take really free port is:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 1. Choose a random number between 49152 and 65535
>>>>>>>>>>>>>>>>>>>>>> 2. Open socket
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> if socket fails - repeat step 1
>>>>>>>>>>>>>>>>>>>>>> if socket OK - return *socket*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If you can't keep the socket open (e.g. you have to pass the port
>>>>>>>>>>>>>>>>>>>>>> number as a property value) you shouldn't do any pre-check, as it
>>>>>>>>>>>>>>>>>>>>>> has no value: as soon as you close the socket someone else can
>>>>>>>>>>>>>>>>>>>>>> take the port.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> So just choose a random number within the range above and let the
>>>>>>>>>>>>>>>>>>>>>> networking code that opens the socket handle the port conflict.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 2013-11-20 15:54, taras ledkov wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I am working on bug
>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-7195249.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> There are two webrevs:
>>>>>>>>>>>>>>>>>>>>>>> Webrev for jdk part:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/jdk/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Webrev for hs part:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~anazarov/7195249/hs/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Please take a look at some notes:
>>>>>>>>>>>>>>>>>>>>>>> - After discussing with Yekaterina Kantserova &
>>>>>>>>>>>>>>>>>>>>>>> Jaroslav
>>>>>>>>>>>>>>>>>>>>>>> Bachorik
>>>>>>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>>>> shell tests have been converted to java based tests
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - PasswordFilePermissionTest &
>>>>>>>>>>>>>>>>>>>>>>> SSLConfigFilePermissionTest
>>>>>>>>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>>>>>>>> looked
>>>>>>>>>>>>>>>>>>>>>>> very similar, so a common parent class was
>>>>>>>>>>>>>>>>>>>>>>> created for
>>>>>>>>>>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - What was called RmiRegistrySslTest.java I've
>>>>>>>>>>>>>>>>>>>>>>> renamed to
>>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTestApp.java. The java code to replace
>>>>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>> shell
>>>>>>>>>>>>>>>>>>>>>>> script
>>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.sh is called
>>>>>>>>>>>>>>>>>>>>>>> RmiRegistrySslTest.java,
>>>>>>>>>>>>>>>>>>>>>>> hence the
>>>>>>>>>>>>>>>>>>>>>>> huge
>>>>>>>>>>>>>>>>>>>>>>> diff.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - The new RmiRegistrySslTest.java has some lines
>>>>>>>>>>>>>>>>>>>>>>> similar
>>>>>>>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>>>>>>>> AbstractFilePermissionTest.java, I nevertheless
>>>>>>>>>>>>>>>>>>>>>>> decided
>>>>>>>>>>>>>>>>>>>>>>> to not
>>>>>>>>>>>>>>>>>>>>>>> complicate the code further and leave it as is.
>>>>>>>>>>>>>>>>>>>>>>> Please
>>>>>>>>>>>>>>>>>>>>>>> let me
>>>>>>>>>>>>>>>>>>>>>>> know if
>>>>>>>>>>>>>>>>>>>>>>> this is somehow not acceptable
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - com/oracle/java/testlibrary/Utils.java that is
>>>>>>>>>>>>>>>>>>>>>>> added to
>>>>>>>>>>>>>>>>>>>>>>> hotspot
>>>>>>>>>>>>>>>>>>>>>>> repository is taken from this patch:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ykantser/8023138/webrev.00/test/lib/testlibrary/jdk/testlibrary/Utils.java.sdiff.html
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - These tests will need additional changes when test
>>>>>>>>>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>>>>>>>> tools will support command line options inheritance
>>>>>>>>>>>>>>>>>>>>>>> (http://mail.openjdk.java.net/pipermail/serviceability-dev/2013-November/013235.html)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> With best regards,
>>>>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> With best regards,
>>>>>>>>>>> Taras Ledkov
>>>>>>>>>>> Mail-To: taras.ledkov at oracle.com
>>>>>>>>>>> skype: taras_ledkov
>>>>>>>>>>> Phone: 7(812)3346-157
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>> --
>>> With best regards,
>>> Taras Ledkov
>>> Mail-To: taras.ledkov at oracle.com
>>> skype: taras_ledkov
>>> Phone: 7(812)3346-157
>>
>
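
The procedure quoted in the thread above (choose a random number between
49152 and 65535, open a socket, repeat on failure, return the *socket*) can
be sketched in Java roughly as follows. This is only an illustration under
stated assumptions, not code from the webrev: the class name FreePortSketch,
the method openOnRandomPort and the maxAttempts limit are hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of jdk.testlibrary.
final class FreePortSketch {
    static final int MIN_PORT = 49152; // range quoted in the thread
    static final int MAX_PORT = 65535;

    // Tries random ports in the range until a bind succeeds and returns the
    // still-open socket, so the port cannot be taken by someone else.
    static ServerSocket openOnRandomPort(int maxAttempts) throws IOException {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e; // port busy or unavailable: pick another one
            }
        }
        throw new IOException("no free port after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket socket = openOnRandomPort(20)) {
            System.out.println("listening on port " + socket.getLocalPort());
        }
    }
}
```

Returning the open ServerSocket rather than the port number is the point of
the sketch: a caller that can only pass a port number on still has to handle
a bind failure and retry.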


From shanliang.jiang at oracle.com  Thu Feb 27 03:59:47 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Thu, 27 Feb 2014 12:59:47 +0100
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
In-Reply-To: <530F263E.8090408@oracle.com>
References: <4b9ed302-c006-4829-b732-2ffc123d84f4@default>
	<530F263E.8090408@oracle.com>
Message-ID: <530F28B3.4020809@oracle.com>

Daniel Fuchs wrote:
> On 2/27/14 11:43 AM, Mattias Tobiasson wrote:
>> Hi,
>> Could you please review this test fix.
>>
>> The test verifies that MemoryPoolMXBean sends a notification when 
>> used memory has reached the threshold.
>> The flag thresholdExceeded marks if we have reached the memory 
>> threshold. When the flag is set, the test slows down to give time for 
>> the notification to be received.
>> The problem is that thresholdExceeded is overwritten every time in 
>> the loop. Instead it should be set if any pool has reached the 
>> threshold. This means that the test continues to allocate memory at 
>> full speed, and we may get an OutOfMemory before we get the 
>> notification.
>>
>
> Hi Mattias,
>
> I wonder whether you should also stop calling loadNext() once
> thresholdExceeded is true?
Yes I am thinking this too.

Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 
will always be skipped.

Shanliang
>
> best regards,
>
> -- daniel
>
>> bug:
>> https://bugs.openjdk.java.net/browse/JDK-8031065
>>
>> webrev:
>> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
>>
>> Mattias
>>
>


From mattias.tobiasson at oracle.com  Thu Feb 27 04:16:10 2014
From: mattias.tobiasson at oracle.com (Mattias Tobiasson)
Date: Thu, 27 Feb 2014 04:16:10 -0800 (PST)
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
Message-ID: <0b126035-d2ba-461a-854f-943e4c3bd0ce@default>

Hi, thanks for the fast reviews.

I did think about no longer calling loadNext() after the flag has been set. The main reason for not doing that was that I wanted to change as little as possible. Now the test works as originally intended.
I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space.


About line 172, you are correct. I will just remove that line. Thanks!

Mattias
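
The flag problem described in the review request (thresholdExceeded being
overwritten on every loop iteration instead of being set once any pool has
reached its threshold) can be illustrated with a small Java sketch. It is
not the test's actual code: the class ThresholdFlagSketch and the list of
booleans standing in for per-pool checks are assumptions made for the
example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; each Boolean stands for one pool's check.
final class ThresholdFlagSketch {
    // Overwriting shape: the flag only reflects the last pool checked.
    static boolean overwritten(List<Boolean> poolExceeded) {
        boolean thresholdExceeded = false;
        for (boolean exceeded : poolExceeded) {
            thresholdExceeded = exceeded;
        }
        return thresholdExceeded;
    }

    // Accumulating shape: once any pool reports true, the flag stays set.
    static boolean accumulated(List<Boolean> poolExceeded) {
        boolean thresholdExceeded = false;
        for (boolean exceeded : poolExceeded) {
            thresholdExceeded |= exceeded;
        }
        return thresholdExceeded;
    }

    public static void main(String[] args) {
        List<Boolean> pools = Arrays.asList(true, false);
        System.out.println(overwritten(pools)); // prints false
        System.out.println(accumulated(pools)); // prints true
    }
}
```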

----- Original Message -----
From: shanliang.jiang at oracle.com
To: daniel.fuchs at oracle.com
Cc: mattias.tobiasson at oracle.com, serviceability-dev at openjdk.java.net
Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace

Daniel Fuchs wrote:
> On 2/27/14 11:43 AM, Mattias Tobiasson wrote:
>> Hi,
>> Could you please review this test fix.
>>
>> The test verifies that MemoryPoolMXBean sends a notification when 
>> used memory has reached the threshold.
>> The flag thresholdExceeded marks if we have reached the memory 
>> threshold. When the flag is set, the test slows down to give time for 
>> the notification to be received.
>> The problem is that thresholdExceeded is overwritten every time in 
>> the loop. Instead it should be set if any pool has reached the 
>> threshold. This means that the test continues to allocate memory at 
>> full speed, and we may get an OutOfMemory before we get the 
>> notification.
>>
>
> Hi Mattias,
>
> I wonder whether you should also stop calling loadNext() once
> thresholdExceeded is true?
Yes I am thinking this too.

Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 
will always be skipped.

Shanliang
>
> best regards,
>
> -- daniel
>
>> bug:
>> https://bugs.openjdk.java.net/browse/JDK-8031065
>>
>> webrev:
>> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
>>
>> Mattias
>>
>


From dmitry.samersoff at oracle.com  Thu Feb 27 04:53:47 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Thu, 27 Feb 2014 16:53:47 +0400
Subject: PING Re: RR(S): JDK-7189721 nightly tests failed on JDI attaching
In-Reply-To: <53077A0B.5010401@oracle.com>
References: <52FE2D9A.4030700@oracle.com> <53077A0B.5010401@oracle.com>
Message-ID: <530F355B.1010601@oracle.com>

Still waiting for review.

On 2014-02-21 20:08, Dmitry Samersoff wrote:
> On 2014-02-14 18:52, Dmitry Samersoff wrote:
>> Hi Everyone,
>>
>> Please, review the changes.
>>
>> http://cr.openjdk.java.net/~dsamersoff/JDK-7189721/webrev.01/
>>
>> Summary:
>>
>> If for some reason the ELF section containing the section names is not
>> loaded into the cache, the code attempts to read data through a NULL
>> section pointer.


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From jaroslav.bachorik at oracle.com  Thu Feb 27 04:58:22 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 27 Feb 2014 13:58:22 +0100
Subject: RFR 8035668: [test] Check that jstat does not cause problems when
	monitored application exits before jstat finishes its work
Message-ID: <530F366E.2040806@oracle.com>

Please, review the addition of the jstat related test.

Issue : https://bugs.openjdk.java.net/browse/JDK-8035668
Webrev: http://cr.openjdk.java.net/~jbachorik/8035668/webrev.00/

The test logic is simple: first the application to be monitored is
started. When the application is ready, jstat is launched; it
polls the monitored application's status periodically (every 100ms). At
this point the monitored application is signalled that it may
exit; it waits for 500ms and then exits. There should be no errors
in either the monitored application or jstat (checked by the exit
status of their processes) and neither should hang (detected by the
harness timeout).

Thanks,

-JB-

From roger.riggs at oracle.com  Thu Feb 27 07:34:01 2014
From: roger.riggs at oracle.com (roger riggs)
Date: Thu, 27 Feb 2014 10:34:01 -0500
Subject: RFR 9: 8035889: jdk testlibrary - add printing of values of failed
	assertions
In-Reply-To: <530EAE5F.3050308@oracle.com>
References: <530E4FD8.9040605@oracle.com> <530E9C6C.8080409@oracle.com>
	<530EAC78.4000301@Oracle.com> <530EAE5F.3050308@oracle.com>
Message-ID: <530F5AE9.8000405@oracle.com>

Hi Mandy,

I updated the webrev:
http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/

Alan suggested copying serviceability-dev so they have a chance to 
review if desired.

I want to investigate if it is possible to use the TestNG Assert classes
without the TestNG execution framework.
It would be necessary to compile/run against TestNG.jar but it might not
need the entire mechanism.

Thanks, Roger

On 2/26/2014 10:17 PM, Mandy Chung wrote:
> On 2/26/2014 7:09 PM, Roger Riggs wrote:
>> Hi Mandy,
>>
>> Yes, it might be more productive to switch the tests to TestNG.
>> But it did provide support in cases where TestNG could not be used,
>> for example in a directory of existing tests that had custom reporting.
>>
>> But I remember there is a problem with TestNG having a dependency for XML
>> which is not supported in Profile1 and a number of tests had to be 
>> disabled
>> in that configuration.  Will XML always be available.  Do we need to 
>> solve
>> or work around that problem with TestNG?
>>
>
> This is a good point.   When we want to test just the base module for 
> example, how can we run TestNG tests?  We need to address that certainly.
>
> My comment on TestNG is a question for new tests using this Asserts 
> class.  Your patch is fine to go (after taking out @library tag if I 
> got it correct).
>
> Mandy
>
>> Thanks, Roger
>>
>> On 2/26/14 9:01 PM, Mandy Chung wrote:
>>> Hi Roger,
>>>
>>> On 2/26/2014 12:34 PM, roger riggs wrote:
>>>> The testlibrary for the jdk should be printing the values in the 
>>>> failed
>>>> assertions to make debugging easier and quicker.
>>>>
>>>> The webrev adds the printing of the failed assertions and added 
>>>> methods
>>>> for formatting and unconditional fail methods.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/
>>>>
>>>
>>> AssertsTest.java: line 28:  @library doesn't look like it's needed. 
>>> There is no jdk/test/testlibrary directory and I think 
>>> jdk.testlibrary.* are found as relative to $test.src.
>>>
>>> Otherwise, the change looks okay.
>>>
>>> Now that jtreg supports TestNG and I wonder if this class should 
>>> retire some day (there are only about 10 existing tests using this 
>>> class).  Are you writing new tests using this Asserts class?
>>>
>>> Mandy
>>>
>>>> Bug:
>>>>    8035889: jdk testlibrary - add printing of values of failed 
>>>> assertions
>>>>
>>>> Thanks, Roger
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8035889
>>>>
>>>>
>>>
>>
>

From shanliang.jiang at oracle.com  Thu Feb 27 08:12:49 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Thu, 27 Feb 2014 17:12:49 +0100
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
In-Reply-To: <0b126035-d2ba-461a-854f-943e4c3bd0ce@default>
References: <0b126035-d2ba-461a-854f-943e4c3bd0ce@default>
Message-ID: <530F6401.8010708@oracle.com>

Mattias Tobiasson wrote:
> Hi, thanks for the fast reviews.
>
> I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended.
>   
I prefer to do like this too :)
> I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space.
>   
You do not need to add a timeout, only change Line 151
    for(;;)
to
    while(!listenerInvoked) {

and remove 160 -- 162

If the expected notification does not arrive, the test
harness has a timeout that stops the test.
   
This way makes the test more robust,  but I am OK with the current fix.

Thanks,
Shanliang
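
The loop change suggested above, replacing for(;;) with
while(!listenerInvoked), can be sketched as follows. This is a minimal
stand-alone illustration, not the test itself: the class ListenerLoopSketch,
the method allocateUntilNotified and the sleep that stands in for loadNext()
are hypothetical, and the flag is assumed to be set from a listener thread.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of looping until a listener has been invoked.
final class ListenerLoopSketch {
    // Written by the listener thread; volatile so the loop sees the update.
    static volatile boolean listenerInvoked = false;

    // Keeps "allocating" until the flag is set; returns the iteration count.
    static int allocateUntilNotified() throws InterruptedException {
        int iterations = 0;
        while (!listenerInvoked) {
            iterations++;                    // stands in for loadNext()
            TimeUnit.MILLISECONDS.sleep(10); // slow down between allocations
        }
        return iterations;
    }

    public static void main(String[] args) throws Exception {
        // Simulated notification delivery after a short delay.
        Thread listener = new Thread(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            listenerInvoked = true;
        });
        listener.start();
        int n = allocateUntilNotified();
        listener.join();
        System.out.println("loop ended after " + n + " iterations");
    }
}
```

If the flag is never set, this loop does not end on its own, which matches
the suggestion above to rely on the harness timeout in that case.
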
>
> About line 172, you are correct. I will just remove that line. Thanks!
>
> Mattias
>
> ----- Original Message -----
> From: shanliang.jiang at oracle.com
> To: daniel.fuchs at oracle.com
> Cc: mattias.tobiasson at oracle.com, serviceability-dev at openjdk.java.net
> Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
> Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace
>
> Daniel Fuchs wrote:
>   
>> On 2/27/14 11:43 AM, Mattias Tobiasson wrote:
>>     
>>> Hi,
>>> Could you please review this test fix.
>>>
>>> The test verifies that MemoryPoolMXBean sends a notification when 
>>> used memory has reached the threshold.
>>> The flag thresholdExceeded marks if we have reached the memory 
>>> threshold. When the flag is set, the test slows down to give time for 
>>> the notification to be received.
>>> The problem is that thresholdExceeded is overwritten every time in 
>>> the loop. Instead it should be set if any pool has reached the 
>>> threshold. This means that the test continues to allocate memory at 
>>> full speed, and we may get an OutOfMemory before we get the 
>>> notification.
>>>
>>>       
>> Hi Mattias,
>>
>> I wonder whether you should also stop calling loadNext() once
>> thresholdExceeded is true?
>>     
> Yes I am thinking this too.
>
> Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 
> will always be skipped.
>
> Shanliang
>   
>> best regards,
>>
>> -- daniel
>>
>>     
>>> bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8031065
>>>
>>> webrev:
>>> http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
>>>
>>> Mattias
>>>
>>>       
>
>   

From pavel.punegov at oracle.com  Thu Feb 27 08:20:54 2014
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Thu, 27 Feb 2014 20:20:54 +0400
Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error:
	Broken pipe" messages
Message-ID: <1701612.OzZc2qZ2Ld@work-dell>

Please review the fix for:
  https://bugs.openjdk.java.net/browse/JDK-6946101

webrev: 
  http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/


This change fixes two issues with the tests:
  1. The 'quit' command was incorrectly sent to JDB after the JDB process had
  already finished.
  2. Detection of unexpected JDB exits and process synchronization are improved.

Description of fix:
  1. Add an allowExit parameter to cmd() to indicate that the given command may
  finish JDB. E.g., the 'cont' command makes JDB run the debuggee to the end.
  If allowExit is not set for a command, assume that the command cannot finish
  execution, and fail the test if it does.

  2. Make the test fail if it tries to send the 'quit' or 'exit' commands. This
  makes it impossible to send quit/exit from a test by mistake. The scaffold
  will finish JDB by itself if JDB has not already been finished by a command
  with allowExit set. Add a dofinish() function as the only method that may
  exit JDB.

  3. Add proper synchronization to waitForFinish(). On all systems except
  SunOS use wait (from bash). On Solaris find the shell subprocess and wait for
  it to finish. This replaces the wait used on all other systems, because wait
  does not work in sh/ksh as it does in bash.

  4. Fix tests: add allowExit to tests where it's needed.

-- 
Thanks,
Pavel Punegov

From daniel.daugherty at oracle.com  Thu Feb 27 11:09:06 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 27 Feb 2014 12:09:06 -0700
Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <530EF65C.5040707@oracle.com>
References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com>
Message-ID: <530F8D52.2040400@oracle.com>

On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>   https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>

src/share/vm/runtime/vm_operations.hpp
     No comments.

src/share/vm/prims/jvmtiEnvBase.hpp
     No comments.

src/share/vm/prims/jvmtiEnv.cpp
     No comments.

src/share/vm/prims/jvmtiEnvThreadState.cpp
     No comments.

src/share/vm/prims/jvmtiEventController.cpp
     JvmtiEventController::set_frame_pop() is called by
     JvmtiEnvThreadState::set_frame_pop() which is called by
     JvmtiEnv::NotifyFramePop().

     The "MutexLocker mu(JvmtiThreadState_lock)" in
     JvmtiEventController::set_frame_pop() protected the work
     done by JvmtiEventControllerPrivate::set_frame_pop():

       ets->get_frame_pops()->set(fpop);
       recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());

     Since multiple threads can call JVM/TI NotifyFramePop() on the
     same target thread, what keeps the threads from messing with
     the list of frame pops simultaneously or messing with the
     thread enabled events bits in parallel?

     I suspect that this might also be an issue for
     JvmtiEventController::clear_frame_pop() and
     JvmtiEventController::clear_to_frame_pop().
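
     To make the concern concrete, here is a minimal standalone sketch
     (not HotSpot code) of several threads updating one shared frame pop
     set. FramePopSet, its std::mutex and
     request_frame_pops_concurrently() are hypothetical names used only
     for illustration; the mutex plays the role that
     JvmtiThreadState_lock played before the change. Without it, the
     concurrent insertions would be a data race.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-thread list of requested frame pops.
class FramePopSet {
 public:
  void set(int frame_number) {
    // Serializes writers, as a shared lock would in the real code.
    std::lock_guard<std::mutex> guard(_lock);
    _pops.insert(frame_number);
  }
  void clear(int frame_number) {
    std::lock_guard<std::mutex> guard(_lock);
    _pops.erase(frame_number);
  }
  std::size_t size() const {
    std::lock_guard<std::mutex> guard(_lock);
    return _pops.size();
  }

 private:
  mutable std::mutex _lock;
  std::set<int> _pops;
};

// Several "agent" threads request frame pops on the same target at once.
// Each thread inserts a disjoint range of frame numbers.
std::size_t request_frame_pops_concurrently(int threads, int pops_per_thread) {
  FramePopSet pops;
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&pops, t, pops_per_thread] {
      for (int i = 0; i < pops_per_thread; ++i) {
        pops.set(t * pops_per_thread + i);
      }
    });
  }
  for (auto& w : workers) {
    w.join();
  }
  return pops.size();
}
```

     With the lock in place every requested frame number ends up in the
     set; removing the lock_guard lines makes the outcome undefined.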

src/share/vm/prims/jvmtiThreadState.cpp
     No comments.

Dan


>
> Summary:
>
>   It is the 2-nd round of review because the JTREG com/sun/jdi tests 
> discovered a regression
>   in the first round change. The issue was in the 
> JvmtiEventController::clear_frame_pop()
>   lock synchronization that is not allowed at safepoints.
>
>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM 
> operation for safety.
>   Also, I've removed the lock synchronization from the 3 impacted 
> JvmtiEventController::
>   functions: set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>
> Testing:
>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>
>
> Thanks,
> Serguei
>
>
> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>
>>
>> Summary:
>>
>>   This is another Test Stabilization issue.
>>   The fix is very similar to other JVMTI stabilization fixes.
>>   It is to use safepoints for updating the PopFrame data instead of 
>> relying on the
>>   suspend equivalent condition mechanism 
>> (JvmtiEnv::is_thread_fully_suspended())
>>   which is not adequate from the reliability point of view.
>>
>> Testing:
>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>>
>


From serguei.spitsyn at oracle.com  Thu Feb 27 12:28:35 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 27 Feb 2014 12:28:35 -0800
Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <530F8D52.2040400@oracle.com>
References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com>
	<530F8D52.2040400@oracle.com>
Message-ID: <530F9FF3.7030209@oracle.com>

Dan,

Thank you a lot for reviewing this!

On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>
>
> src/share/vm/runtime/vm_operations.hpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnvBase.hpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnv.cpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnvThreadState.cpp
>     No comments.
>
> src/share/vm/prims/jvmtiEventController.cpp
>     JvmtiEventController::set_frame_pop() is called by
>     JvmtiEnvThreadState::set_frame_pop() which is called by
>     JvmtiEnv::NotifyFramePop().
>
>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>     JvmtiEventController::set_frame_pop() protected the work
>     done by JvmtiEventControllerPrivate::set_frame_pop():
>
>       ets->get_frame_pops()->set(fpop);
>       recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());

Your check is the right thing to do, thanks!
I should have explained this more clearly in this 2-nd review request.

The approach I've taken here is that all these code paths are executed
on the target thread or at a safepoint.

It is true for all 3 functions:
   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().

And the updated assert guards ensure that it is the case.
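
As a rough illustration of such a guard (a standalone sketch, not the
code from the webrev): CallContext, is_safe_to_update() and
set_frame_pop_checked() are hypothetical names, and plain integers stand
in for thread identities.

```cpp
#include <cassert>

// Hypothetical model of the state a real assert would consult.
struct CallContext {
  int  current_thread_id;  // thread executing the call
  int  target_thread_id;   // thread whose frame pops are being updated
  bool at_safepoint;       // true while the VM runs a safepoint operation
};

// The invariant described above: the caller is either the target thread
// itself or is running at a safepoint.
bool is_safe_to_update(const CallContext& ctx) {
  return ctx.current_thread_id == ctx.target_thread_id || ctx.at_safepoint;
}

// Records the requested frame number only after checking the invariant.
void set_frame_pop_checked(const CallContext& ctx, int frame_number,
                           int* last_set) {
  assert(is_safe_to_update(ctx) &&
         "must be the target thread or at a safepoint");
  *last_set = frame_number;
}
```

A call made from a third thread outside of a safepoint would trip the
assert in a debug build, which is the case the guards are meant to catch.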

It could be a good idea to add a No_Safepoint_Verifier for PopFrame()
and NotifyFramePop() to make sure the current/target thread does not
go to a safepoint until it has returned from update_for_pop_top_frame()
and set_frame_pop() respectively.
A No_Safepoint_Verifier may also be needed in
JvmtiExport::post_method_exit().

These are all places where these functions are called:
  prims/jvmtiEnv.cpp:
    state->env_thread_state(this)->set_frame_pop(frame_number);
    // JvmtiEnv::NotifyFramePop()
  prims/jvmtiExport.cpp:
    ets->clear_frame_pop(cur_frame_number);
    // JvmtiExport::post_method_exit()
  prims/jvmtiThreadState.cpp:
    ets->clear_frame_pop(popframe_number);
    // JvmtiThreadState::update_for_pop_top_frame()

The function JvmtiEnvThreadState::clear_to_frame_pop() is never called now.

Thanks,
Serguei


>
>     Since multiple threads can call JVM/TI NotifyFramePop() on the
>     same target thread, what keeps the threads from messing with
>     the list of frame pops simultaneously or messing with the
>     thread enabled events bits in parallel?
>
>     I suspect that this might also be an issue for
>     JvmtiEventController::clear_frame_pop() and
>     JvmtiEventController::clear_to_frame_pop() also.
>
> src/share/vm/prims/jvmtiThreadState.cpp
>     No comments.
>
> Dan
>
>
>>
>> Summary:
>>
>>   It is the 2-nd round of review because the JTREG com/sun/jdi tests 
>> discovered a regression
>>   in the first round change. The issue was in the 
>> JvmtiEventController::clear_frame_pop()
>>   lock synchronization that is not allowed at safepoints.
>>
>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM 
>> operation for safety.
>>   Also, I've removed the lock synchronization from the 3 impacted 
>> JvmtiEventController::
>>   functions: set_frame_pop(), clear_frame_pop() and 
>> clear_to_frame_pop().
>>
>> Testing:
>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>>
>>
>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for:
>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>
>>>
>>> Open webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>
>>>
>>> Summary:
>>>
>>>   This is another Test Stabilization issue.
>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>   It is to use safepoints for updating the PopFrame data instead of 
>>> relying on the
>>>   suspend equivalent condition mechanism 
>>> (JvmtiEnv::is_thread_fully_suspended())
>>>   which is not adequate from the reliability point of view.
>>>
>>> Testing:
>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>
>


From mandy.chung at oracle.com  Thu Feb 27 12:32:58 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Thu, 27 Feb 2014 12:32:58 -0800
Subject: RFR 9: 8035889: jdk testlibrary - add printing of values of failed
	assertions
In-Reply-To: <530F5AE9.8000405@oracle.com>
References: <530E4FD8.9040605@oracle.com>
	<530E9C6C.8080409@oracle.com>	<530EAC78.4000301@Oracle.com>
	<530EAE5F.3050308@oracle.com> <530F5AE9.8000405@oracle.com>
Message-ID: <530FA0F9.5000106@oracle.com>

Looks okay to me.

Mandy

On 2/27/14 7:34 AM, roger riggs wrote:
> Hi Mandy,
>
> I updated the webrev:
> http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/
>
> Alan suggested copying serviceability-dev so they have a chance to 
> review if desired.
>
> I want to investigate if it is possible to use the TestNG Assert 
> classes without
> the TestNG execution framework.
> It would be necessary to compile/run against TestNG.jar but it might not
> need the entire mechanism.
>
> Thanks, Roger
>
> On 2/26/2014 10:17 PM, Mandy Chung wrote:
>> On 2/26/2014 7:09 PM, Roger Riggs wrote:
>>> Hi Mandy,
>>>
>>> Yes, it might be more productive to switch the tests to TestNG.
>>> But it did provide support in cases where TestNG could not be used,
>>> for example in a directory of existing tests that had custom reporting.
>>>
>>> But I remember there is a problem with TestNG having a dependency 
>>> for XML
>>> which is not supported in Profile1 and a number of tests had to be 
>>> disabled
>>> in that configuration.  Will XML always be available.  Do we need to 
>>> solve
>>> or work around that problem with TestNG?
>>>
>>
>> This is a good point.   When we want to test just the base module for 
>> example, how can we run TestNG tests?  We need to address that certainly.
>>
>> My comment on TestNG is a question for new tests using this Asserts 
>> class.  Your patch is fine to go (after taking out @library tag if I 
>> got it correct).
>>
>> Mandy
>>
>>> Thanks, Roger
>>>
>>> On 2/26/14 9:01 PM, Mandy Chung wrote:
>>>> Hi Roger,
>>>>
>>>> On 2/26/2014 12:34 PM, roger riggs wrote:
>>>>> The testlibrary for the jdk should be printing the values in the 
>>>>> failed
>>>>> assertions to make debugging easier and quicker.
>>>>>
>>>>> The webrev adds the printing of the failed assertions and added 
>>>>> methods
>>>>> for formatting and unconditional fail methods.
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ 
>>>>>
>>>>>
>>>>
>>>> AssertsTest.java: line 28:  @library doesn't look like it's needed. 
>>>> There is no jdk/test/testlibrary directory and I think 
>>>> jdk.testlibrary.* are found as relative to $test.src.
>>>>
>>>> Otherwise, the change looks okay.
>>>>
>>>> Now that jtreg supports TestNG and I wonder if this class should 
>>>> retire some day (there are only about 10 existing tests using this 
>>>> class).  Are you writing new tests using this Asserts class?
>>>>
>>>> Mandy
>>>>
>>>>> Bug:
>>>>>    8035889: jdk testlibrary - add printing of values of failed 
>>>>> assertions
>>>>>
>>>>> Thanks, Roger
>>>>>
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8035889
>>>>>
>>>>>
>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Thu Feb 27 13:03:27 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 27 Feb 2014 13:03:27 -0800
Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <530F9FF3.7030209@oracle.com>
References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com>
	<530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com>
Message-ID: <530FA81F.8040309@oracle.com>

On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
> Dan,
>
> Thank you a lot for reviewing this!
>
> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for:
>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>
>>>
>>> Open webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>>
>>
>> src/share/vm/runtime/vm_operations.hpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiEnvBase.hpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiEnv.cpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiEventController.cpp
>>     JvmtiEventController::set_frame_pop() is called by
>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>     JvmtiEnv::NotifyFramePop().
>>
>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>     JvmtiEventController::set_frame_pop() protected the work
>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>
>>       ets->get_frame_pops()->set(fpop);
>>       recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());
>
> Your check is the right thing to do, thanks!
> I had to explain this more clearly in this 2-nd review request.
>
> The approach I've taken here is that all this code paths are executed
> on the target thread or at a safepoint.
>
> It is true for all 3 functions:
>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>
> And the updated assert guards ensure that it is the case.
>
> It could be a good idea to add a No_Safepoint_Verifier for PopFrame() 
> and NotifyFramePop()
> to make sure the current/target thread does not go to safepoint until 
> it is returned from
> update_for_pop_top_frame() and set_frame_pop() correspondingly.
> A No_Safepoint_Verifier can be also needed in the 
> JvmtiExport::post_method_exit().
>
> These are all places where these functions are called:
> prims/jvmtiEnv.cpp: 
> state->env_thread_state(this)->set_frame_pop(frame_number); // 
> JvmtiEnv::NotifyFramePop()
> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // 
> JvmtiExport::post_method_exit()
> prims/jvmtiThreadState.cpp: 
> ets->clear_frame_pop(popframe_number);              // 
> JvmtiThreadState::update_for_pop_top_frame()
>
> The function JvmtiEnvThreadState::clear_to_frame_pop() is never called 
> now.

There is still a concern about recompute_thread_enabled().
If it is normally always protected with the JvmtiThreadState_lock
then the approach above is not going to work.
I'm trying to check this now.

Thanks,
Serguei


>
> Thanks,
> Serguei
>
>
>
>
>
>>
>>     Since multiple threads can call JVM/TI NotifyFramePop() on the
>>     same target thread, what keeps the threads from messing with
>>     the list of frame pops simultaneously or messing with the
>>     thread enabled events bits in parallel?
>>
>>     I suspect that this might also be an issue for
>>     JvmtiEventController::clear_frame_pop() and
>>     JvmtiEventController::clear_to_frame_pop() also.
>>
>> src/share/vm/prims/jvmtiThreadState.cpp
>>     No comments.
>>
>> Dan
>>
>>
>>>
>>> Summary:
>>>
>>>   It is the 2-nd round of review because the JTREG com/sun/jdi tests 
>>> discovered a regression
>>>   in the first round change. The issue was in the 
>>> JvmtiEventController::clear_frame_pop()
>>>   lock synchronization that is not allowed at safepoints.
>>>
>>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM 
>>> operation for safety.
>>>   Also, I've removed the lock synchronization from the 3 impacted 
>>> JvmtiEventController::
>>>   functions: set_frame_pop(), clear_frame_pop() and 
>>> clear_to_frame_pop().
>>>
>>> Testing:
>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for:
>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>
>>>>
>>>> Open webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>>
>>>>
>>>> Summary:
>>>>
>>>>   This is another Test Stabilization issue.
>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>   It is to use safepoints for updating the PopFrame data instead of 
>>>> relying on the
>>>>   suspend equivalent condition mechanism 
>>>> (JvmtiEnv::is_thread_fully_suspended())
>>>>   which is not adequate from the reliability point of view.
>>>>
>>>> Testing:
>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Thu Feb 27 14:00:18 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 27 Feb 2014 14:00:18 -0800
Subject: 2-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <530FA81F.8040309@oracle.com>
References: <530D007D.4040602@oracle.com> <530EF65C.5040707@oracle.com>
	<530F8D52.2040400@oracle.com> <530F9FF3.7030209@oracle.com>
	<530FA81F.8040309@oracle.com>
Message-ID: <530FB572.4010507@oracle.com>

On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote:
> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
>> Dan,
>>
>> Thank you a lot for reviewing this!
>>
>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for:
>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>
>>>>
>>>> Open webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>>>
>>>
>>> src/share/vm/runtime/vm_operations.hpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvmtiEnv.cpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvmtiEventController.cpp
>>>     JvmtiEventController::set_frame_pop() is called by
>>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>>     JvmtiEnv::NotifyFramePop().
>>>
>>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>>     JvmtiEventController::set_frame_pop() protected the work
>>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>>
>>>       ets->get_frame_pops()->set(fpop);
>>>       recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());
>>
>> Your check is the right thing to do, thanks!
>> I had to explain this more clearly in this 2-nd review request.
>>
>> The approach I've taken here is that all this code paths are executed
>> on the target thread or at a safepoint.
>>
>> It is true for all 3 functions:
>>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>>
>> And the updated assert guards ensure that it is the case.
>>
>> It could be a good idea to add a No_Safepoint_Verifier for PopFrame() 
>> and NotifyFramePop()
>> to make sure the current/target thread does not go to safepoint until 
>> it is returned from
>> update_for_pop_top_frame() and set_frame_pop() correspondingly.
>> A No_Safepoint_Verifier can be also needed in the 
>> JvmtiExport::post_method_exit().
>>
>> These are all places where these functions are called:
>> prims/jvmtiEnv.cpp: 
>> state->env_thread_state(this)->set_frame_pop(frame_number); // 
>> JvmtiEnv::NotifyFramePop()
>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // 
>> JvmtiExport::post_method_exit()
>> prims/jvmtiThreadState.cpp: 
>> ets->clear_frame_pop(popframe_number);              // 
>> JvmtiThreadState::update_for_pop_top_frame()
>>
>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never 
>> called now.
>
> There is still a concern about recompute_thread_enabled().
> If it is normally always protected with the JvmtiThreadState_lock
> then the approach above is not going to work.
> I'm trying to check this now.

Dan,

I came to a conclusion that these 3 functions still must be protected
by the JvmtiThreadState_lock when they are called out of a safepoint.
It is a little bit ugly but has to be safe though.
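
A standalone sketch of the idea (not the actual webrev code), assuming a
global flag in place of a real safepoint query and a std::mutex in place
of JvmtiThreadState_lock; all names below are hypothetical:

```cpp
#include <cassert>
#include <mutex>

std::mutex thread_state_lock;  // stand-in for JvmtiThreadState_lock
bool at_safepoint = false;     // stand-in for a safepoint query
int  frame_pop_count = 0;      // stand-in for the protected state

// Takes the lock only when called outside of a safepoint. At a safepoint
// the other threads are stopped, so the update needs no lock, and locking
// there is what caused the regression in the first round.
// Returns true if the lock was held during the update.
bool set_frame_pop_guarded() {
  std::unique_lock<std::mutex> guard(thread_state_lock, std::defer_lock);
  if (!at_safepoint) {
    guard.lock();
  }
  ++frame_pop_count;
  return guard.owns_lock();
}
```

The deferred unique_lock releases the mutex on scope exit only if it was
actually acquired, so both paths share one update statement.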

Please, let me know if you see any problems with that.
I'll send a new webrev soon.

Thanks,
Serguei


>
> Thanks,
> Serguei
>
>
>>
>> Thanks,
>> Serguei
>>
>>
>>
>>
>>
>>>
>>>     Since multiple threads can call JVM/TI NotifyFramePop() on the
>>>     same target thread, what keeps the threads from messing with
>>>     the list of frame pops simultaneously or messing with the
>>>     thread enabled events bits in parallel?
>>>
>>>     I suspect that this might also be an issue for
>>>     JvmtiEventController::clear_frame_pop() and
>>>     JvmtiEventController::clear_to_frame_pop() also.
>>>
>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>     No comments.
>>>
>>> Dan
>>>
>>>
>>>>
>>>> Summary:
>>>>
>>>>   It is the 2-nd round of review because the JTREG com/sun/jdi 
>>>> tests discovered a regression
>>>>   in the first round change. The issue was in the 
>>>> JvmtiEventController::clear_frame_pop()
>>>>   lock synchronization that is not allowed at safepoints.
>>>>
>>>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a VM 
>>>> operation for safety.
>>>>   Also, I've removed the lock synchronization from the 3 impacted 
>>>> JvmtiEventController::
>>>>   functions: set_frame_pop(), clear_frame_pop() and 
>>>> clear_to_frame_pop().
>>>>
>>>> Testing:
>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Please, review the fix for:
>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>
>>>>>
>>>>> Open webrev:
>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>>>
>>>>>
>>>>> Summary:
>>>>>
>>>>>   This is another Test Stabilization issue.
>>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>>   It is to use safepoints for updating the PopFrame data instead 
>>>>> of relying on the
>>>>>   suspend equivalent condition mechanism 
>>>>> (JvmtiEnv::is_thread_fully_suspended())
>>>>>   which is not adequate from the reliability point of view.
>>>>>
>>>>> Testing:
>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>
>>>
>>
>


From daniel.daugherty at oracle.com  Thu Feb 27 14:47:15 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 27 Feb 2014 15:47:15 -0700
Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error:
	Broken pipe" messages
In-Reply-To: <1701612.OzZc2qZ2Ld@work-dell>
References: <1701612.OzZc2qZ2Ld@work-dell>
Message-ID: <530FC073.3060905@oracle.com>

On 2/27/14 9:20 AM, Pavel Punegov wrote:
> Please review the fix for:
>    https://bugs.openjdk.java.net/browse/JDK-6946101
>
> webrev:
>    http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/

test/com/sun/jdi/ShellScaffold.sh
     line 531: # allows JDB to exit"
         stray double-quote at end of comment

     line 563: dofail "It's not allowed to send quit and exit commands 
from the test"
         'and' should be 'or'

     line 819:  # Kill debugger, it could be hang
         Typo: 'hang' -> 'hung'

test/com/sun/jdi/ArrayLengthDumpTest.sh
test/com/sun/jdi/CatchAllTest.sh
test/com/sun/jdi/CatchCaughtTest.sh
test/com/sun/jdi/CatchPatternTest.sh
test/com/sun/jdi/CommandCommentDelimiter.sh
test/com/sun/jdi/DeferredStepTest.sh
test/com/sun/jdi/DeoptimizeWalk.sh
test/com/sun/jdi/EvalArgs.sh
test/com/sun/jdi/GetLocalVariables3Test.sh
test/com/sun/jdi/GetLocalVariables4Test.sh
test/com/sun/jdi/JdbExprTest.sh
test/com/sun/jdi/JdbLockTest.sh
test/com/sun/jdi/JdbMethodExitTest.sh
test/com/sun/jdi/JdbMissStep.sh
test/com/sun/jdi/MixedSuspendTest.sh
test/com/sun/jdi/NotAField.sh
test/com/sun/jdi/NullLocalVariable.sh
test/com/sun/jdi/Redefine-g.sh
test/com/sun/jdi/RedefineAnnotation.sh
test/com/sun/jdi/RedefineChangeClassOrder.sh
test/com/sun/jdi/RedefineClasses.sh
test/com/sun/jdi/RedefineException.sh
test/com/sun/jdi/RedefineFinal.sh
test/com/sun/jdi/RedefineImplementor.sh
test/com/sun/jdi/RedefineIntConstantToLong.sh
test/com/sun/jdi/RedefineMulti.sh
test/com/sun/jdi/RedefinePop.sh
test/com/sun/jdi/RedefineTTYLineNumber.sh
test/com/sun/jdi/StringConvertTest.sh
test/com/sun/jdi/WatchFramePop.sh
     I _think_ I understand the new test driver style:

     - get rid of all explicit 'cmd quit' usages because mydojdbCmds()
       now wraps the test's dojdbCmds with a 'quit' cmd
     - any test that previously ended with a 'cmd cont' is presumed to
       be OK if that 'cmd cont' caused jdb to execute off the end of
       main(); sounds reasonable to me
     - perfect example of the new logic to catch an errant run off the
       end is  test/com/sun/jdi/WatchFramePop.sh
       - the last jdb cmd is 'next'
       - and jdb is NOT supposed to run off the end
       - the new logic should catch this nicely; I _think_ the old
         logic would only catch a run off the end if someone manually
         checked the test result


Thumbs up!

Dan


>
>
> This change fixes two issues with the tests:
>    1. Fix incorrect 'quit' command sending to JDB when JDB process was finished.
>    2. Improve JDB unexpected exit detection and  process synchronization.
>
> Description of fix:
>    1. Add allowExit parameter to cmd() to show that the given command can finish
>    JDB. E.g., 'cont' command make JDB execute debuggee to the end.
>    If allowExit wasn't set for a command then  assume that it can't finish
>    execution, and fail the test if it did.
>
>    2. Make test fail if it tries to send 'quit' or 'exit' commands. This makes
>    it impossible to send quit/exit from test by mistake. Scaffold will
>    finish JDB by itself if JDB didn't finish before be a command with allowExit
>    set. Add dofinish() function to be the only method that may exit JDB.
>
>    3. Add proper synchronization into waitForFinish(). On all systems except
>    SunOS use wait (from bash). On Solaris find the shell subprocess and wait for
>    its finish. It replaces wait  used on all other systems, because it
>    doesn't work on sh/ksh as in bash.
>
>    4. Fix tests: add allowExit to tests where it's needed.
>


From serguei.spitsyn at oracle.com  Thu Feb 27 19:50:18 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 27 Feb 2014 19:50:18 -0800
Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <530FB572.4010507@oracle.com>
References: <530D007D.4040602@oracle.com>
	<530EF65C.5040707@oracle.com>	<530F8D52.2040400@oracle.com>
	<530F9FF3.7030209@oracle.com>	<530FA81F.8040309@oracle.com>
	<530FB572.4010507@oracle.com>
Message-ID: <5310077A.7010004@oracle.com>

Please, review the fix for:
   https://bugs.openjdk.java.net/browse/JDK-6471769


Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3

Summary:

   It is another attempt to fix the JTREG com/sun/jdi tests regression
   discovered in the first round change.
   The fix is to avoid lock synchronization at safepoints
   (jvmtiEventController.cpp).
   Thanks to Dan for catching the problem in the 2-nd round of review!

Testing:
   All tests passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi


Thanks,
Serguei


On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote:
> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote:
>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
>>> Dan,
>>>
>>> Thank you a lot for reviewing this!
>>>
>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Please, review the fix for:
>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>
>>>>>
>>>>> Open webrev:
>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>>>>
>>>>
>>>> src/share/vm/runtime/vm_operations.hpp
>>>>     No comments.
>>>>
>>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>>     No comments.
>>>>
>>>> src/share/vm/prims/jvmtiEnv.cpp
>>>>     No comments.
>>>>
>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>>     No comments.
>>>>
>>>> src/share/vm/prims/jvmtiEventController.cpp
>>>>     JvmtiEventController::set_frame_pop() is called by
>>>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>>>     JvmtiEnv::NotifyFramePop().
>>>>
>>>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>>>     JvmtiEventController::set_frame_pop() protected the work
>>>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>>>
>>>>       ets->get_frame_pops()->set(fpop);
>>>>       recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());
>>>
>>> Your check is the right thing to do, thanks!
>>> I had to explain this more clearly in this 2-nd review request.
>>>
>>> The approach I've taken here is that all this code paths are executed
>>> on the target thread or at a safepoint.
>>>
>>> It is true for all 3 functions:
>>>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>>>
>>> And the updated assert guards ensure that it is the case.
>>>
>>> It could be a good idea to add a No_Safepoint_Verifier for 
>>> PopFrame() and NotifyFramePop()
>>> to make sure the current/target thread does not go to safepoint 
>>> until it is returned from
>>> update_for_pop_top_frame() and set_frame_pop() correspondingly.
>>> A No_Safepoint_Verifier can be also needed in the 
>>> JvmtiExport::post_method_exit().
>>>
>>> These are all places where these functions are called:
>>> prims/jvmtiEnv.cpp: 
>>> state->env_thread_state(this)->set_frame_pop(frame_number); // 
>>> JvmtiEnv::NotifyFramePop()
>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); // 
>>> JvmtiExport::post_method_exit()
>>> prims/jvmtiThreadState.cpp: 
>>> ets->clear_frame_pop(popframe_number);              // 
>>> JvmtiThreadState::update_for_pop_top_frame()
>>>
>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never 
>>> called now.
>>
>> There is still a concern about recompute_thread_enabled().
>> If it is normally always protected with the JvmtiThreadState_lock
>> then the approach above is not going to work.
>> I'm trying to check this now.
>
> Dan,
>
> I came to a conclusion that these 3 functions still must be protected
> by the JvmtiThreadState_lock when they are called out of a safepoint.
> It is a little bit ugly but has to be safe though.
>
> Please, let me know if you see any problems with that.
> I'll send a new webrev soon.
>
> Thanks,
> Serguei
>
>
>>
>> Thanks,
>> Serguei
>>
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>
>>>
>>>
>>>>
>>>>     Since multiple threads can call JVM/TI NotifyFramePop() on the
>>>>     same target thread, what keeps the threads from messing with
>>>>     the list of frame pops simultaneously or messing with the
>>>>     thread enabled events bits in parallel?
>>>>
>>>>     I suspect that this might also be an issue for
>>>>     JvmtiEventController::clear_frame_pop() and
>>>>     JvmtiEventController::clear_to_frame_pop() also.
>>>>
>>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>>     No comments.
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Summary:
>>>>>
>>>>>   It is the 2-nd round of review because the JTREG com/sun/jdi 
>>>>> tests discovered a regression
>>>>>   in the first round change. The issue was in the 
>>>>> JvmtiEventController::clear_frame_pop()
>>>>>   lock synchronization that is not allowed at safepoints.
>>>>>
>>>>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a 
>>>>> VM operation for safety.
>>>>>   Also, I've removed the lock synchronization from the 3 impacted 
>>>>> JvmtiEventController::
>>>>>   functions: set_frame_pop(), clear_frame_pop() and 
>>>>> clear_to_frame_pop().
>>>>>
>>>>> Testing:
>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Please, review the fix for:
>>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>
>>>>>>
>>>>>> Open webrev:
>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>>>>
>>>>>>
>>>>>> Summary:
>>>>>>
>>>>>>   This is another Test Stabilization issue.
>>>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>>>   It is to use safepoints for updating the PopFrame data instead 
>>>>>> of relying on the
>>>>>>   suspend equivalent condition mechanism 
>>>>>> (JvmtiEnv::is_thread_fully_suspended())
>>>>>>   which is not adequate from the reliability point of view.
>>>>>>
>>>>>> Testing:
>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>
>>>>
>>>
>>
>


From david.holmes at oracle.com  Thu Feb 27 22:04:50 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 28 Feb 2014 16:04:50 +1000
Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <5310077A.7010004@oracle.com>
References: <530D007D.4040602@oracle.com>	<530EF65C.5040707@oracle.com>	<530F8D52.2040400@oracle.com>	<530F9FF3.7030209@oracle.com>	<530FA81F.8040309@oracle.com>	<530FB572.4010507@oracle.com>
	<5310077A.7010004@oracle.com>
Message-ID: <53102702.4070603@oracle.com>

Hi Serguei,

On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for:
>    https://bugs.openjdk.java.net/browse/JDK-6471769
>
>
> Open webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3
>
>
> Summary:
>
>    It is another attempt to fix the JTREG com/sun/jdi tests regression
>    discovered in the first round change.
>    The fix is to avoid lock synchronization at
> safepoints(jvmtiEventController.cpp).
>    Thanks to Dan for catching the problem in the 2-nd round of review!

The basic approach here seems sound.

I find the checking for cur->is_VMThread() somewhat overly conservative 
- if we are at a safepoint, and executing this code, then we must be the 
VMThread. But ok.

You could also use MutexLockerEx to avoid the need for locked and 
unlocked paths to a common call, but that's just stylistic. Though if 
you are grabbing the current thread anyway you can also use the 
MutexLocker calls that take the thread arg - to avoid a second look-up 
of the current thread.

David
-----

> Testing:
>    All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>
>
> Thanks,
> Serguei
>
>
> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote:
>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote:
>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
>>>> Dan,
>>>>
>>>> Thank you a lot for reviewing this!
>>>>
>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Please, review the fix for:
>>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>
>>>>>>
>>>>>> Open webrev:
>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2
>>>>>>
>>>>>
>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>     No comments.
>>>>>
>>>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>>>     No comments.
>>>>>
>>>>> src/share/vm/prims/jvmtiEnv.cpp
>>>>>     No comments.
>>>>>
>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>>>     No comments.
>>>>>
>>>>> src/share/vm/prims/jvmtiEventController.cpp
>>>>>     JvmtiEventController::set_frame_pop() is called by
>>>>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>>>>     JvmtiEnv::NotifyFramePop().
>>>>>
>>>>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>>>>     JvmtiEventController::set_frame_pop() protected the work
>>>>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>>>>
>>>>>       ets->get_frame_pops()->set(fpop);
>>>>>       recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());
>>>>
>>>> Your check is the right thing to do, thanks!
>>>> I should have explained this more clearly in this 2-nd review request.
>>>>
>>>> The approach I've taken here is that all these code paths are executed
>>>> on the target thread or at a safepoint.
>>>>
>>>> It is true for all 3 functions:
>>>>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>>>>
>>>> And the updated assert guards ensure that it is the case.
>>>>
>>>> It could be a good idea to add a No_Safepoint_Verifier for
>>>> PopFrame() and NotifyFramePop()
>>>> to make sure the current/target thread does not go to safepoint
>>>> until it is returned from
>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly.
>>>> A No_Safepoint_Verifier can be also needed in the
>>>> JvmtiExport::post_method_exit().
>>>>
>>>> These are all places where these functions are called:
>>>> prims/jvmtiEnv.cpp:
>>>>     state->env_thread_state(this)->set_frame_pop(frame_number); // JvmtiEnv::NotifyFramePop()
>>>> prims/jvmtiExport.cpp:
>>>>     ets->clear_frame_pop(cur_frame_number); // JvmtiExport::post_method_exit()
>>>> prims/jvmtiThreadState.cpp:
>>>>     ets->clear_frame_pop(popframe_number); // JvmtiThreadState::update_for_pop_top_frame()
>>>>
>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never
>>>> called now.
>>>
>>> There is still a concern about recompute_thread_enabled().
>>> If it is normally always protected with the JvmtiThreadState_lock
>>> then the approach above is not going to work.
>>> I'm trying to check this now.
>>
>> Dan,
>>
>> I came to a conclusion that these 3 functions still must be protected
>> by the JvmtiThreadState_lock when they are called out of a safepoint.
>> It is a little bit ugly but has to be safe though.
>>
>> Please, let me know if you see any problems with that.
>> I'll send a new webrev soon.
>>
>> Thanks,
>> Serguei
>>
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>     Since multiple threads can call JVM/TI NotifyFramePop() on the
>>>>>     same target thread, what keeps the threads from messing with
>>>>>     the list of frame pops simultaneously or messing with the
>>>>>     thread enabled events bits in parallel?
>>>>>
>>>>>     I suspect that this might also be an issue for
>>>>>     JvmtiEventController::clear_frame_pop() and
>>>>>     JvmtiEventController::clear_to_frame_pop() also.
>>>>>
>>>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>>>     No comments.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> Summary:
>>>>>>
>>>>>>   It is the 2-nd round of review because the JTREG com/sun/jdi
>>>>>> tests discovered a regression
>>>>>>   in the first round change. The issue was in the
>>>>>> JvmtiEventController::clear_frame_pop()
>>>>>>   lock synchronization that is not allowed at safepoints.
>>>>>>
>>>>>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a
>>>>>> VM operation for safety.
>>>>>>   Also, I've removed the lock synchronization from the 3 impacted
>>>>>> JvmtiEventController::
>>>>>>   functions: set_frame_pop(), clear_frame_pop() and
>>>>>> clear_to_frame_pop().
>>>>>>
>>>>>> Testing:
>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Please, review the fix for:
>>>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>
>>>>>>>
>>>>>>> Open webrev:
>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1
>>>>>>>
>>>>>>>
>>>>>>> Summary:
>>>>>>>
>>>>>>>   This is another Test Stabilization issue.
>>>>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>>>>   It is to use safepoints for updating the PopFrame data instead
>>>>>>> of relying on the
>>>>>>>   suspend equivalent condition mechanism
>>>>>>> (JvmtiEnv::is_thread_fully_suspended())
>>>>>>>   which is not adequate from the reliability point of view.
>>>>>>>
>>>>>>> Testing:
>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

From staffan.larsen at oracle.com  Fri Feb 28 00:31:29 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 28 Feb 2014 09:31:29 +0100
Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error:
	Broken pipe" messages
In-Reply-To: <530FC073.3060905@oracle.com>
References: <1701612.OzZc2qZ2Ld@work-dell> <530FC073.3060905@oracle.com>
Message-ID: <47FE9723-6797-4135-A103-E1E42675FE99@oracle.com>

Very nice change - looks good!

test/com/sun/jdi/ShellScaffold.sh 
  line 1000:     # mydojdbCmds() didn't finished because it waits for JDB message
    nit: finished -> finish

Just a note that this should be pushed through jdk9/dev and not jdk9/hs-comp (where the webrev was made).

Thanks,
/Staffan


On 27 feb 2014, at 23:47, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:

> On 2/27/14 9:20 AM, Pavel Punegov wrote:
>> Please review the fix for:
>>   https://bugs.openjdk.java.net/browse/JDK-6946101
>> 
>> webrev:
>>   http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/
> 
> test/com/sun/jdi/ShellScaffold.sh
>    line 531: # allows JDB to exit"
>        stray double-quote at end of comment
> 
>    line 563: dofail "It's not allowed to send quit and exit commands from the test"
>        'and' should be 'or'
> 
>    line 819:  # Kill debugger, it could be hang
>        Typo: 'hang' -> 'hung'
> 
> test/com/sun/jdi/ArrayLengthDumpTest.sh
> test/com/sun/jdi/CatchAllTest.sh
> test/com/sun/jdi/CatchCaughtTest.sh
> test/com/sun/jdi/CatchPatternTest.sh
> test/com/sun/jdi/CommandCommentDelimiter.sh
> test/com/sun/jdi/DeferredStepTest.sh
> test/com/sun/jdi/DeoptimizeWalk.sh
> test/com/sun/jdi/EvalArgs.sh
> test/com/sun/jdi/GetLocalVariables3Test.sh
> test/com/sun/jdi/GetLocalVariables4Test.sh
> test/com/sun/jdi/JdbExprTest.sh
> test/com/sun/jdi/JdbLockTest.sh
> test/com/sun/jdi/JdbMethodExitTest.sh
> test/com/sun/jdi/JdbMissStep.sh
> test/com/sun/jdi/MixedSuspendTest.sh
> test/com/sun/jdi/NotAField.sh
> test/com/sun/jdi/NullLocalVariable.sh
> test/com/sun/jdi/Redefine-g.sh
> test/com/sun/jdi/RedefineAnnotation.sh
> test/com/sun/jdi/RedefineChangeClassOrder.sh
> test/com/sun/jdi/RedefineClasses.sh
> test/com/sun/jdi/RedefineException.sh
> test/com/sun/jdi/RedefineFinal.sh
> test/com/sun/jdi/RedefineImplementor.sh
> test/com/sun/jdi/RedefineIntConstantToLong.sh
> test/com/sun/jdi/RedefineMulti.sh
> test/com/sun/jdi/RedefinePop.sh
> test/com/sun/jdi/RedefineTTYLineNumber.sh
> test/com/sun/jdi/StringConvertTest.sh
> test/com/sun/jdi/WatchFramePop.sh
>    I _think_ I understand the new test driver style:
> 
>    - get rid of all explicit 'cmd quit' usages because mydojdbCmds()
>      now wraps the test's dojdbCmds with a 'quit' cmd
>    - any test that previously ended with a 'cmd cont' is presumed to
>      be OK if that 'cmd cont' caused jdb to execute off the end of
>      main(); sounds reasonable to me
>    - perfect example of the new logic to catch an errant run off the
>      end is  test/com/sun/jdi/WatchFramePop.sh
>      - the last jdb cmd is 'next'
>      - and jdb is NOT supposed to run off the end
>      - the new logic should catch this nicely; I _think_ the old
>        logic would only catch a run off the end if someone manually
>        checked the test result
> 
> 
> Thumbs up!
> 
> Dan
> 
> 
>> 
>> 
>> This change fixes two issues with the tests:
>>   1. Fix incorrect sending of the 'quit' command to JDB after the JDB process has already finished.
>>   2. Improve detection of unexpected JDB exit and improve process synchronization.
>> 
>> Description of fix:
>>   1. Add allowExit parameter to cmd() to show that the given command can finish
>>   JDB. E.g., the 'cont' command makes JDB execute the debuggee to the end.
>>   If allowExit wasn't set for a command then assume that it can't finish
>>   execution, and fail the test if it did.
>> 
>>   2. Make test fail if it tries to send 'quit' or 'exit' commands. This makes
>>   it impossible to send quit/exit from test by mistake. Scaffold will
>>   finish JDB by itself if JDB wasn't already finished by a command with
>>   allowExit set. Add dofinish() function to be the only method that may exit JDB.
>> 
>>   3. Add proper synchronization into waitForFinish(). On all systems except
>>   SunOS use wait (from bash). On Solaris find the shell subprocess and wait for
>>   it to finish. This replaces the wait used on all other systems, because wait
>>   doesn't work in sh/ksh the way it does in bash.
>> 
>>   4. Fix tests: add allowExit to tests where it's needed.
>> 
> 


From staffan.larsen at oracle.com  Fri Feb 28 01:43:20 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 28 Feb 2014 10:43:20 +0100
Subject: RFR 9: 8035889: jdk testlibrary - add printing of values of
	failed assertions
In-Reply-To: <530F5AE9.8000405@oracle.com>
References: <530E4FD8.9040605@oracle.com> <530E9C6C.8080409@oracle.com>
	<530EAC78.4000301@Oracle.com> <530EAE5F.3050308@oracle.com>
	<530F5AE9.8000405@oracle.com>
Message-ID: <6A397A15-56DD-41FA-885F-4A3BF2415CB6@oracle.com>

Looks good to me!

Thanks,
/Staffan

On 27 feb 2014, at 16:34, roger riggs <roger.riggs at oracle.com> wrote:

> Hi Mandy,
> 
> I updated the webrev:  
>   http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ 
> 
> Alan suggested copying serviceability-dev so they have a chance to review if desired.
> 
> I want to investigate if it is possible to use the TestNG Assert classes without
> the TestNG execution framework.  
> It would be necessary to compile/run against TestNG.jar but it might not 
> need the entire mechanism.
> 
> Thanks, Roger
> 
> On 2/26/2014 10:17 PM, Mandy Chung wrote:
>> On 2/26/2014 7:09 PM, Roger Riggs wrote:
>>> Hi Mandy,
>>> 
>>> Yes, it might be more productive to switch the tests to TestNG.
>>> But it did provide support in cases where TestNG could not be used, 
>>> for example in a directory of existing tests that had custom reporting.
>>> 
>>> But I remember there is a problem with TestNG having a dependency on XML,
>>> which is not supported in Profile1, and a number of tests had to be disabled
>>> in that configuration.  Will XML always be available?  Do we need to solve
>>> or work around that problem with TestNG?
>>> 
>> 
>> This is a good point.   When we want to test just the base module for example, how can we run TestNG tests?  We need to address that certainly.
>> 
>> My comment on TestNG is a question for new tests using this Asserts class.  Your patch is fine to go (after taking out @library tag if I got it correct). 
>> 
>> Mandy
>> 
>>> Thanks, Roger
>>> 
>>> On 2/26/14 9:01 PM, Mandy Chung wrote:
>>>> Hi Roger, 
>>>> 
>>>> On 2/26/2014 12:34 PM, roger riggs wrote: 
>>>>> The testlibrary for the jdk should be printing the values in the failed 
>>>>> assertions to make debugging easier and quicker. 
>>>>> 
>>>>> The webrev adds the printing of the failed assertions and added methods 
>>>>> for formatting and unconditional fail methods. 
>>>>> 
>>>>> Webrev: 
>>>>> http://cr.openjdk.java.net/~rriggs/webrev-testlibrary-asserts-8035889/ 
>>>>> 
>>>> 
>>>> AssertsTest.java: line 28:  @library doesn't look like it's needed. There is no jdk/test/testlibrary directory and I think jdk.testlibrary.* are found as relative to $test.src. 
>>>> 
>>>> Otherwise, the change looks okay. 
>>>> 
>>>> Now that jtreg supports TestNG, I wonder if this class should be retired some day (there are only about 10 existing tests using this class).  Are you writing new tests using this Asserts class? 
>>>> 
>>>> Mandy 
>>>> 
>>>>> Bug: 
>>>>>    8035889: jdk testlibrary - add printing of values of failed assertions 
>>>>> 
>>>>> Thanks, Roger 
>>>>> 
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8035889 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
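
For readers following the thread without the webrev at hand, here is a minimal sketch of what "printing the values in the failed assertions" can look like. The class name AssertsSketch and the message format are hypothetical; this is not the jdk.testlibrary.Asserts code under review.

```java
import java.util.Objects;

public class AssertsSketch {
    // Hypothetical helper: fails with a message that includes both values,
    // so the log shows what was compared without re-running the test.
    static void assertEquals(Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new RuntimeException(
                "assertEquals failed: expected <" + expected + "> but was <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        assertEquals("a", "a");          // passes silently
        try {
            assertEquals(1, 2);          // fails; message carries both values
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the design is only that the failure message carries both operands; the real class may format them differently.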


From staffan.larsen at oracle.com  Fri Feb 28 01:47:02 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 28 Feb 2014 10:47:02 +0100
Subject: RFR 8035668: [test] Check that jstat does not cause problems when
	monitored application exits before jstat finishes its work
In-Reply-To: <530F366E.2040806@oracle.com>
References: <530F366E.2040806@oracle.com>
Message-ID: <939A40A3-C202-49B6-B454-841D972A727E@oracle.com>

You are missing the copyright-header.

Otherwise looks good.

Thanks,
/Staffan

On 27 feb 2014, at 13:58, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:

> Please, review the addition of the jstat related test.
> 
> Issue : https://bugs.openjdk.java.net/browse/JDK-8035668
> Webrev: http://cr.openjdk.java.net/~jbachorik/8035668/webrev.00/
> 
> The test logic is simple: first the application to be monitored is started. When the application is ready, jstat is launched; it polls the monitored application's status periodically (every 100ms). At that point the monitored application is signalled that it may exit; it waits for 500ms and exits. There should be no errors in either the monitored application or jstat (checked by the exit status of their processes) and they should not hang (detected by the harness timeout).
> 
> Thanks,
> 
> -JB-


From mattias.tobiasson at oracle.com  Fri Feb 28 02:33:08 2014
From: mattias.tobiasson at oracle.com (Mattias Tobiasson)
Date: Fri, 28 Feb 2014 02:33:08 -0800 (PST)
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
Message-ID: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default>

Hi,
I have updated the test and now stop allocating when we have reached the threshold.
Since we now do all allocations first and then just wait for the notification, I have split the loop into two separate loops to make it clearer.

To detect if we have reached the threshold I now check MemoryPoolMXBean.getUsageThresholdCount() > 0 instead of checking isUsageThresholdExceeded().
The reason is that the notification event is not generated immediately when isUsageThresholdExceeded() becomes true. The notification is only generated at the next GC. That is why the old test kept allocating after it had reached the threshold (to trigger another GC).

getUsageThresholdCount() is updated at the same time as the event is generated. So after getUsageThresholdCount() > 0, I can just wait for the notification without more allocations.
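
As an illustration only (not taken from the webrev), the check described above could be sketched as follows. The class and method names are hypothetical; the MemoryPoolMXBean calls are the standard java.lang.management API. The guard on isUsageThresholdSupported() is there because getUsageThresholdCount() throws UnsupportedOperationException for pools without usage-threshold support.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.List;

public class ThresholdCountCheck {
    // Returns true once any pool has recorded at least one threshold crossing.
    static boolean anyThresholdCrossed(List<MemoryPoolMXBean> pools) {
        for (MemoryPoolMXBean pool : pools) {
            // Skip pools that do not support usage thresholds at all.
            if (pool.isUsageThresholdSupported() && pool.getUsageThresholdCount() > 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<MemoryPoolMXBean> pools = ManagementFactory.getMemoryPoolMXBeans();
        System.out.println("crossed=" + anyThresholdCrossed(pools));
    }
}
```

A test would call anyThresholdCrossed() after each allocation step and stop allocating as soon as it returns true.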

webrev:
http://cr.openjdk.java.net/~mtobiass/8031065/webrev.01

bug:
https://bugs.openjdk.java.net/browse/JDK-8031065

Mattias

----- Original Message ----- 
From: shanliang.jiang at oracle.com 
To: mattias.tobiasson at oracle.com 
Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com 
Sent: Thursday, February 27, 2014 5:12:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna 
Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace 


Mattias Tobiasson wrote: 

Hi, thanks for the fast reviews.

I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended.

I prefer to do like this too :)


I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space.

You do not need to add a timeout, only change Line 151
for(;;) 
to 
while(!listenerInvoked) { 

and remove 160 -- 162 

in case that an expected notification is not arrived, the testing harness has a timeout to stop the test. 

This way makes the test more robust, but I am OK with the current fix. 
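
A rough sketch of the suggested wait loop, for illustration. The flag name listenerInvoked comes from the suggestion above; its type, the sleep interval, and the class and method names are assumptions, and the real test's listener is replaced here by a plain thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class WaitForListener {
    // Stand-in for the test's listenerInvoked flag (its real type is not shown in the thread).
    static final AtomicBoolean listenerInvoked = new AtomicBoolean(false);

    // Waits until the flag is set. There is no local timeout: a hung test is
    // assumed to be stopped by the surrounding harness. Returns false only if
    // the waiting thread is interrupted.
    static boolean awaitNotification() {
        while (!listenerInvoked.get()) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Simulated listener callback arriving on another thread.
        new Thread(() -> listenerInvoked.set(true)).start();
        System.out.println("notified=" + awaitNotification());
    }
}
```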

Thanks, 
Shanliang 


About line 172, you are correct. I will just remove that line. Thanks!

Mattias

----- Original Message -----
From: shanliang.jiang at oracle.com
To: daniel.fuchs at oracle.com
Cc: mattias.tobiasson at oracle.com, serviceability-dev at openjdk.java.net
Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace

Daniel Fuchs wrote: 

On 2/27/14 11:43 AM, Mattias Tobiasson wrote: 

Hi,
Could you please review this test fix.

The test verifies that MemoryPoolMXBean sends a notification when 
used memory has reached the threshold.
The flag thresholdExceeded marks if we have reached the memory 
threshold. When the flag is set, the test slows down to give time for 
the notification to be received.
The problem is that thresholdExceeded is overwritten every time in 
the loop. Instead it should be set if any pool has reached the 
threshold. This means that the test continues to allocate memory at 
full speed, and we may get an OutOfMemory before we get the 
notification.

Hi Mattias,

I wonder whether you should also stop calling loadNext() once
thresholdExceeded is true?

Yes I am thinking this too.

Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 
will always be skipped.

Shanliang 

best regards,

-- daniel 

bug: https://bugs.openjdk.java.net/browse/JDK-8031065
webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/

Mattias

From shanliang.jiang at oracle.com  Fri Feb 28 02:58:11 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Fri, 28 Feb 2014 11:58:11 +0100
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
In-Reply-To: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default>
References: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default>
Message-ID: <53106BC3.2060505@oracle.com>

Looks good!

It could be improved to not use the variable thresholdExceeded:

change Line 146 to
    while(true) {
and remove Line 143 and 158

Thanks,
Shanliang


Mattias Tobiasson wrote:
> Hi,
> I have updated the test and now stop allocating when we have reached the threshold.
> Since we now do all allocations first and then just wait for the notification, I have split the loop into two separate loops to make it clearer.
>
> To detect if we have reached the threshold I now check MemoryPoolMXBean.getUsageThresholdCount() > 0 instead of checking isUsageThresholdExceeded().
> The reason for that is because the notification event is not generated immediately when isUsageThresholdExceeded() = true. The notification is only generated at the next GC. So that is the reason for why the old test kept allocating after it had reached the threshold (to trigger another GC).
>
> getUsageThresholdCount() is updated at the same time as the event is generated. So after getUsageThresholdCount() > 0, I can just wait for the notification without more allocations.
>
> webrev:
> http://cr.openjdk.java.net/~mtobiass/8031065/webrev.01
>
> bug:
> https://bugs.openjdk.java.net/browse/JDK-8031065
>
> Mattias
>
> ----- Original Message ----- 
> From: shanliang.jiang at oracle.com 
> To: mattias.tobiasson at oracle.com 
> Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com 
> Sent: Thursday, February 27, 2014 5:12:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna 
> Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace 
>
>
> Mattias Tobiasson wrote: 
>
> Hi, thanks for the fast reviews.
>
> I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended.
>
> I prefer to do like this too :)
>
>
> I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space.
>
> You do not need to add a timeout, only change Line 151
> for(;;) 
> to 
> while(!listenerInvoked) { 
>
> and remove 160 -- 162 
>
> in case that an expected notification is not arrived, the testing harness has a timeout to stop the test. 
>
> This way makes the test more robust, but I am OK with the current fix. 
>
> Thanks, 
> Shanliang 
>
>
> About line 172, you are correct. I will just remove that line. Thanks!
>
> Mattias
>
> ----- Original Message -----
> From: shanliang.jiang at oracle.com
> To: daniel.fuchs at oracle.com
> Cc: mattias.tobiasson at oracle.com, serviceability-dev at openjdk.java.net
> Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
> Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace
>
> Daniel Fuchs wrote: 
>
> On 2/27/14 11:43 AM, Mattias Tobiasson wrote: 
>
> Hi,
> Could you please review this test fix.
>
> The test verifies that MemoryPoolMXBean sends a notification when 
> used memory has reached the threshold.
> The flag thresholdExceeded marks if we have reached the memory 
> threshold. When the flag is set, the test slows down to give time for 
> the notification to be received.
> The problem is that thresholdExceeded is overwritten every time in 
> the loop. Instead it should be set if any pool has reached the 
> threshold. This means that the test continues to allocate memory at 
> full speed, and we may get an OutOfMemory before we get the 
> notification.
>
> Hi Mattias,
>
> I wonder whether you should also stop calling loadNext() once
> thresholdExceeded is true?
>
> Yes I am thinking this too.
>
> Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 
> will always be skipped.
>
> Shanliang 
>
> best regards,
>
> -- daniel 
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8031065
> webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
>
> Mattias
>   


From daniel.fuchs at oracle.com  Fri Feb 28 03:05:47 2014
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Fri, 28 Feb 2014 12:05:47 +0100
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
In-Reply-To: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default>
References: <8e5e581d-c5a9-4cd1-9a40-9d2d2a456b2f@default>
Message-ID: <53106D8B.9040407@oracle.com>

Hi Mattias,

The new version looks good!

best regards,

-- daniel

On 2/28/14 11:33 AM, Mattias Tobiasson wrote:
> Hi,
> I have updated the test and now stop allocating when we have reached the threshold.
> Since we now do all allocations first and then just wait for the notification, I have split the loop into two separate loops to make it clearer.
>
> To detect if we have reached the threshold I now check MemoryPoolMXBean.getUsageThresholdCount() > 0 instead of checking isUsageThresholdExceeded().
> The reason for that is because the notification event is not generated immediately when isUsageThresholdExceeded() = true. The notification is only generated at the next GC. So that is the reason for why the old test kept allocating after it had reached the threshold (to trigger another GC).
>
> getUsageThresholdCount() is updated at the same time as the event is generated. So after getUsageThresholdCount() > 0, I can just wait for the notification without more allocations.
>
> webrev:
> http://cr.openjdk.java.net/~mtobiass/8031065/webrev.01
>
> bug:
> https://bugs.openjdk.java.net/browse/JDK-8031065
>
> Mattias
>
> ----- Original Message -----
> From: shanliang.jiang at oracle.com
> To: mattias.tobiasson at oracle.com
> Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com
> Sent: Thursday, February 27, 2014 5:12:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
> Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace
>
>
> Mattias Tobiasson wrote:
>
> Hi, thanks for the fast reviews.
>
> I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended.
>
> I prefer to do like this too :)
>
>
> I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space.
>
> You do not need to add a timeout, only change Line 151
> for(;;)
> to
> while(!listenerInvoked) {
>
> and remove 160 -- 162
>
> in case that an expected notification is not arrived, the testing harness has a timeout to stop the test.
>
> This way makes the test more robust, but I am OK with the current fix.
>
> Thanks,
> Shanliang
>
>
> About line 172, you are correct. I will just remove that line. Thanks!
>
> Mattias
>
> ----- Original Message -----
> From: shanliang.jiang at oracle.com
> To: daniel.fuchs at oracle.com
> Cc: mattias.tobiasson at oracle.com, serviceability-dev at openjdk.java.net
> Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
> Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace
>
> Daniel Fuchs wrote:
>
> On 2/27/14 11:43 AM, Mattias Tobiasson wrote:
>
> Hi,
> Could you please review this test fix.
>
> The test verifies that MemoryPoolMXBean sends a notification when
> used memory has reached the threshold.
> The flag thresholdExceeded marks if we have reached the memory
> threshold. When the flag is set, the test slows down to give time for
> the notification to be received.
> The problem is that thresholdExceeded is overwritten every time in
> the loop. Instead it should be set if any pool has reached the
> threshold. This means that the test continues to allocate memory at
> full speed, and we may get an OutOfMemory before we get the
> notification.
>
> Hi Mattias,
>
> I wonder whether you should also stop calling loadNext() once
> thresholdExceeded is true?
>
> Yes I am thinking this too.
>
> Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170
> will always be skipped.
>
> Shanliang
>
> best regards,
>
> -- daniel
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8031065
> webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
>
> Mattias
>


From mattias.tobiasson at oracle.com  Fri Feb 28 03:47:11 2014
From: mattias.tobiasson at oracle.com (Mattias Tobiasson)
Date: Fri, 28 Feb 2014 03:47:11 -0800 (PST)
Subject: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError:
	Metaspace
Message-ID: <69c2acbd-c1a0-47fa-b255-9d110fdeb135@default>

Thanks for the review.
Unfortunately I can not remove the thresholdExceeded, because the "break" only leaves the loop of memory pools.
I know nested loops are not perfect, but I did not want to change too much from the original test.
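
To illustrate the point about "break" (a simplified sketch, not the test's actual code): inside nested loops a plain break leaves only the inner loop, so the outer loop needs either a flag or a labeled break. The class name, method names, and the integer "usage" data below are made up for the example.

```java
import java.util.Arrays;
import java.util.List;

public class NestedLoopExit {
    // Flag version: 'break' leaves only the inner (per-pool) loop, so the
    // outer loop condition has to observe the flag.
    static int roundsWithFlag(List<int[]> rounds, int limit) {
        boolean thresholdExceeded = false;
        int executed = 0;
        for (int i = 0; i < rounds.size() && !thresholdExceeded; i++) {
            executed++;
            for (int usage : rounds.get(i)) {
                if (usage >= limit) {
                    thresholdExceeded = true;
                    break;
                }
            }
        }
        return executed;
    }

    // Labeled-break version: no flag, but the outer loop carries a label.
    static int roundsWithLabel(List<int[]> rounds, int limit) {
        int executed = 0;
        outer:
        for (int[] poolUsages : rounds) {
            executed++;
            for (int usage : poolUsages) {
                if (usage >= limit) {
                    break outer;
                }
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        List<int[]> rounds = Arrays.asList(new int[]{1, 2}, new int[]{3, 9}, new int[]{10, 10});
        // Both variants stop in the second round, where a value reaches the limit.
        System.out.println(roundsWithFlag(rounds, 9) + " " + roundsWithLabel(rounds, 9));
    }
}
```

Keeping the flag stays closer to the original test; the labeled break is the alternative if the variable were to be removed.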

Mattias

----- Original Message -----
From: shanliang.jiang at oracle.com
To: mattias.tobiasson at oracle.com
Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com
Sent: Friday, February 28, 2014 11:58:20 AM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace

Looks good!

It could be improved to not use the variable thresholdExceeded:

change Line 146 to
    while(true) {
and remove Line 143 and 158

Thanks,
Shanliang


Mattias Tobiasson wrote:
> Hi,
> I have updated the test and now stop allocating when we have reached the threshold.
> Since we now do all allocations first and then just wait for the notification, I have split the loop into two separate loops to make it clearer.
>
> To detect if we have reached the threshold I now check MemoryPoolMXBean.getUsageThresholdCount() > 0 instead of checking isUsageThresholdExceeded().
> The reason for that is because the notification event is not generated immediately when isUsageThresholdExceeded() = true. The notification is only generated at the next GC. So that is the reason for why the old test kept allocating after it had reached the threshold (to trigger another GC).
>
> getUsageThresholdCount() is updated at the same time as the event is generated. So after getUsageThresholdCount() > 0, I can just wait for the notification without more allocations.
>
> webrev:
> http://cr.openjdk.java.net/~mtobiass/8031065/webrev.01
>
> bug:
> https://bugs.openjdk.java.net/browse/JDK-8031065
>
> Mattias
>
> ----- Original Message ----- 
> From: shanliang.jiang at oracle.com 
> To: mattias.tobiasson at oracle.com 
> Cc: serviceability-dev at openjdk.java.net, daniel.fuchs at oracle.com 
> Sent: Thursday, February 27, 2014 5:12:51 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna 
> Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace 
>
>
> Mattias Tobiasson wrote: 
>
> Hi, thanks for the fast reviews.
>
> I did think about stop calling loadNext() after the flag has been set. The main reason for not doing that was just because I wanted to change as little as possible. Now the test works as originally intended. I prefer to do like this too :) 
>
>
> I do not mind removing the calls to loadNext(), but then we would need some timeout waiting for the callback. Currently the test "times out" with an OutOfMemory when we have allocated the remaining 20% of the space.
>
> You do not need to add a timeout, only change Line 151
> for(;;)
> to
> while(!listenerInvoked) {
>
> and remove 160 -- 162
>
> If an expected notification does not arrive, the test harness has a timeout that stops the test.
>
> This makes the test more robust, but I am OK with the current fix.
>
> Thanks, 
> Shanliang 
>
>
> About line 172, you are correct. I will just remove that line. Thanks!
>
> Mattias
>
> ----- Original Message -----
> From: shanliang.jiang at oracle.com
> To: daniel.fuchs at oracle.com
> Cc: mattias.tobiasson at oracle.com, serviceability-dev at openjdk.java.net
> Sent: Thursday, February 27, 2014 12:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
> Subject: Re: RFR(XS) 8031065: LowMemoryTest2.sh fails: OutOfMemoryError: Metaspace
>
> Daniel Fuchs wrote: 
>
> On 2/27/14 11:43 AM, Mattias Tobiasson wrote: 
>
> Hi,
> Could you please review this test fix.
>
> The test verifies that MemoryPoolMXBean sends a notification when 
> used memory has reached the threshold.
> The flag thresholdExceeded marks if we have reached the memory 
> threshold. When the flag is set, the test slows down to give time for 
> the notification to be received.
> The problem is that thresholdExceeded is overwritten every time in 
> the loop. Instead it should be set if any pool has reached the 
> threshold. This means that the test continues to allocate memory at 
> full speed, and we may get an OutOfMemory before we get the 
> notification.
>
> Hi Mattias,
>
> I wonder whether you should also stop calling loadNext() once
> thresholdExceeded is true?
>
> Yes I am thinking this too.
>
> Line 172 is unnecessary, after thresholdExceeded becomes true, Line 170 
> will always be skipped.
>
> Shanliang 
>
> best regards,
>
> -- daniel 
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8031065
> webrev: http://cr.openjdk.java.net/~ykantser/8031065/webrev.00/
>
> Mattias
>   


From mattias.tobiasson at oracle.com  Fri Feb 28 04:56:11 2014
From: mattias.tobiasson at oracle.com (Mattias Tobiasson)
Date: Fri, 28 Feb 2014 04:56:11 -0800 (PST)
Subject: RFR(XS) 6625574: MemoryMXBean/Pending.java fails: "pending
	finalization = 0 but expected > 0"
Message-ID: <78525bc3-f9ea-466d-abd0-b3ea762078bf@default>

Hi,
Could you please review this fix?

The test often fails when run with command line "-Xcomp"

The test creates some objects and stores them in a local variable in the function. The test expects those objects to survive until they are set to null. The problem seems to be that the optimizer realizes that the objects are never used and removes them before the test expects it.

The solution is to move the local variable to a public static field. Then the optimizer will not remove the objects earlier than expected.
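
As a rough illustration of the idea (a sketch only, not the code in the webrev; the class, field and method names are made up): objects referenced from a static field remain reachable until the field is cleared, whereas a local variable that is never read again may be treated as dead by the JIT compiler.

```java
public class KeepAlive {
    // A static field is reachable for as long as the class is loaded, so the
    // objects it refers to stay alive until the field is explicitly cleared.
    public static Object[] objs;

    /** Creates count objects and stores them in the static field. */
    public static void create(int count) {
        objs = new Object[count];
        for (int i = 0; i < count; i++) {
            objs[i] = new Object();
        }
    }

    /** Drops the only references, making the objects eligible for collection. */
    public static void release() {
        objs = null;
    }
}
```

A test following this pattern would call create(), do its checks while the objects are guaranteed to be alive, and only then call release().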

webrev:
http://cr.openjdk.java.net/~mtobiass/6625574/webrev.00

bug:
https://bugs.openjdk.java.net/browse/JDK-6625574

Mattias

From staffan.larsen at oracle.com  Fri Feb 28 04:58:38 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 28 Feb 2014 13:58:38 +0100
Subject: RFR(XS) 6625574: MemoryMXBean/Pending.java fails: "pending
	finalization = 0 but expected > 0"
In-Reply-To: <78525bc3-f9ea-466d-abd0-b3ea762078bf@default>
References: <78525bc3-f9ea-466d-abd0-b3ea762078bf@default>
Message-ID: <3F7E2914-034A-4737-8F87-64F5EDCF81DE@oracle.com>

Looks good!

Thanks,
/Staffan


From pavel.punegov at oracle.com  Fri Feb 28 06:36:23 2014
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Fri, 28 Feb 2014 18:36:23 +0400
Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error:
	Broken pipe" messages
In-Reply-To: <47FE9723-6797-4135-A103-E1E42675FE99@oracle.com>
References: <1701612.OzZc2qZ2Ld@work-dell> <530FC073.3060905@oracle.com>
	<47FE9723-6797-4135-A103-E1E42675FE99@oracle.com>
Message-ID: <1838555.9rITELihyr@work-dell>

New webrev, thanks to Igor I.:
  http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.01/

Fixed typos/grammar
Added the following lines to catch the situation where jdb exits because its
input stream closed prematurely (break in the while loop before the check):
   996     # jdb exited because its input stream closed prematurely
   997     jdbFailIfPresent "Input stream closed"

This can happen if the dojdbCmds() subshell was killed or exited prematurely.
Without this check the test fails with no complaint about the input stream.
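
For readers who do not know the shell helpers, the kind of check described here can be sketched in Java as follows. This is only an analogy under the assumption that jdbFailIfPresent fails the test when the given text appears in the captured jdb output; the class and method below are hypothetical and are not part of ShellScaffold.sh.

```java
public class OutputCheck {
    /** Fails by throwing if the captured debugger output contains the marker text. */
    public static void failIfPresent(String output, String marker) {
        if (output.contains(marker)) {
            throw new AssertionError("Unexpected text in output: " + marker);
        }
    }
}
```

In this sketch a call such as OutputCheck.failIfPresent(jdbOutput, "Input stream closed") would turn a prematurely closed input stream into an explicit failure with a clear message.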


On Fri, 28 of Feb 2014 09:31:29 Staffan Larsen wrote:
> Very nice change - looks good!
> 
> test/com/sun/jdi/ShellScaffold.sh
>   line 1000:     # mydojdbCmds() didn't finished because it waits for JDB message
>   nit: finished -> finish
Fixed

> Just a note that this should be pushed through jdk9/dev and not jdk9/hs-comp
> (where the webrev was made).
Igor made a webrev based on jdk9/dev and will sponsor me.


On 27 feb 2014, at 23:47, Daniel D. Daugherty <daniel.daugherty at oracle.com> 
wrote:
>> On 2/27/14 9:20 AM, Pavel Punegov wrote:
>>> Please review the fix for:
>>>   https://bugs.openjdk.java.net/browse/JDK-6946101
>>> 
>>> webrev:
>>>   http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.00/
>> 
>> test/com/sun/jdi/ShellScaffold.sh
>> 
>>    line 531: # allows JDB to exit"
>>    
>>        stray double-quote at end of comment
>>    
>>    line 563: dofail "It's not allowed to send quit and exit commands from
>>    the test"
>>    
>>        'and' should be 'or'
>>    
>>    line 819:  # Kill debugger, it could be hang
>>    
>>        Typo: 'hang' -> 'hung'
>> 
Fixed

>>    I _think_ I understand the new test driver style:
>>
>>    - get rid of all explicit 'cmd quit' usages because mydojdbCmds()
>>      now wraps the test's dojdbCmds with a 'quit' cmd 
Yep

>>    - any test that previously ended with a 'cmd cont' is presumed to
>>      be OK if that 'cmd cont' caused jdb to execute off the end of
>>      main(); sounds reasonable to me
The test JdbMethodExitTest.sh doesn't have allowExit set for the last
'cmd cont'. This 'cont' should run jdb to the breakpoint set with bkpt().
See the Java file inside the test:
  183        // test trace method exit
  184        traceExit1();
  185        bkpt();
  186 
  187   }

>>    - perfect example of the new logic to catch an errant run off the
>>      end is  test/com/sun/jdi/WatchFramePop.sh
>>      - the last jdb cmd is 'next'
>>      - and jdb is NOT supposed to run off the end
>>      - the new logic should catch this nicely; I _think_ the old
>>        logic would only catch a run off the end if someone manually
>>        checked the test result
My testing (see the bug comments) shows that the old logic can only catch the
absence of a message, via the failIfNotPresent() functions. If I remove these checks,
the tests silently pass even though we have sent 'cmd exit' inside the test.

-- 
Thanks,
Pavel Punegov

From staffan.larsen at oracle.com  Fri Feb 28 06:39:19 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 28 Feb 2014 15:39:19 +0100
Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error:
	Broken pipe" messages
In-Reply-To: <1838555.9rITELihyr@work-dell>
References: <1701612.OzZc2qZ2Ld@work-dell> <530FC073.3060905@oracle.com>
	<47FE9723-6797-4135-A103-E1E42675FE99@oracle.com>
	<1838555.9rITELihyr@work-dell>
Message-ID: <867D8A1C-031A-49F2-8AB4-ECD3BAEE8153@oracle.com>

Still good.

Thanks,
/Staffan


From stuart.marks at oracle.com  Fri Feb 28 09:27:26 2014
From: stuart.marks at oracle.com (Stuart Marks)
Date: Fri, 28 Feb 2014 09:27:26 -0800
Subject: RFR 6835233 : Fedora 9 jdk regression test failed:
	java/lang/instrument/ParallelTransformerLoader.sh
In-Reply-To: <530E41DB.20801@oracle.com>
References: <530E41DB.20801@oracle.com>
Message-ID: <5310C6FE.6040406@oracle.com>

On 2/26/14 11:34 AM, Brent Christian wrote:
> File under "chipping away at test stabilization issues."
>
> https://bugs.openjdk.java.net/browse/JDK-6835233
>
> I would like to resolve this bug by removing the "@ignore" tag for JDK 9, and
> bring the test back into rotation.  If the failure comes back, I'll submit a new
> issue for further investigation.

+1 from me on "chipping away at test stabilization issues." :-)

I guess there is some risk of adding new intermittent failures, but tackling 
@ignore'd tests is important too. Thanks for keeping an eye on this.

s'marks

From robert.field at oracle.com  Fri Feb 28 10:45:50 2014
From: robert.field at oracle.com (robert.field at oracle.com)
Date: Fri, 28 Feb 2014 18:45:50 +0000
Subject: hg: jdk8/tl/jdk: 8035777: Consistent Lambda construction
Message-ID: <20140228184805.22B6D6239C@hg.openjdk.java.net>

Changeset: 183a8c520b4a
Author:    rfield
Date:      2014-02-28 10:43 -0800
URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/183a8c520b4a

8035777: Consistent Lambda construction
Reviewed-by: ahgross, briangoetz, dlsmith

! src/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java
! src/share/classes/java/lang/invoke/TypeConvertingMethodAdapter.java
+ test/java/lang/invoke/lambda/LambdaReceiver.java
+ test/java/lang/invoke/lambda/LambdaReceiverBridge.java
+ test/java/lang/invoke/lambda/LambdaReceiver_anotherpkg/LambdaReceiver_A.java
+ test/java/lang/invoke/lambda/LambdaReturn.java
+ test/java/lang/invoke/lambda/MetafactoryArityTest.java
+ test/java/lang/invoke/lambda/MetafactoryParameterCastTest.java
+ test/java/lang/invoke/lambda/MetafactorySamReturnTest.java


From brent.christian at oracle.com  Fri Feb 28 11:01:55 2014
From: brent.christian at oracle.com (Brent Christian)
Date: Fri, 28 Feb 2014 11:01:55 -0800
Subject: RFR 6835233 : Fedora 9 jdk regression test failed:
	java/lang/instrument/ParallelTransformerLoader.sh
In-Reply-To: <5310C6FE.6040406@oracle.com>
References: <530E41DB.20801@oracle.com> <5310C6FE.6040406@oracle.com>
Message-ID: <5310DD23.1040706@oracle.com>

On 2/28/14 9:27 AM, Stuart Marks wrote:
>
> I guess there is some risk of adding new intermittent failures, but
> tackling @ignore'd tests is important too.

Right - the main risk is that we will see this test fail again at some 
point in the future.  I'll be keeping an eye out for that.

Thanks for the review, guys.

-Brent


From daniel.daugherty at oracle.com  Fri Feb 28 11:22:51 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 28 Feb 2014 12:22:51 -0700
Subject: RFR 6946101: ShellScaffold.sh based tests can issue "write error:
	Broken pipe" messages
In-Reply-To: <1838555.9rITELihyr@work-dell>
References: <1701612.OzZc2qZ2Ld@work-dell> <530FC073.3060905@oracle.com>
	<47FE9723-6797-4135-A103-E1E42675FE99@oracle.com>
	<1838555.9rITELihyr@work-dell>
Message-ID: <5310E20B.5080900@oracle.com>

On 2/28/14 7:36 AM, Pavel Punegov wrote:
> New webrev, thanks to Igor I.:
>    http://cr.openjdk.java.net/~iignatyev/ppunegov/6946101/webrev.01/

Thumbs up.

Dan




From serguei.spitsyn at oracle.com  Fri Feb 28 11:55:02 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 28 Feb 2014 11:55:02 -0800
Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <53102702.4070603@oracle.com>
References: <530D007D.4040602@oracle.com>	<530EF65C.5040707@oracle.com>	<530F8D52.2040400@oracle.com>	<530F9FF3.7030209@oracle.com>	<530FA81F.8040309@oracle.com>	<530FB572.4010507@oracle.com>
	<5310077A.7010004@oracle.com> <53102702.4070603@oracle.com>
Message-ID: <5310E996.9010800@oracle.com>

On 2/27/14 10:04 PM, David Holmes wrote:
> Hi Serguei,
>
> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>>    https://bugs.openjdk.java.net/browse/JDK-6471769
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 
>>
>>
>>
>> Summary:
>>
>>    It is another attempt to fix the JTREG com/sun/jdi tests regression
>>    discovered in the first round change.
>>    The fix is to avoid lock synchronization at safepoints
>>    (jvmtiEventController.cpp).
>>    Thanks to Dan for catching the problem in the 2-nd round of review!
>
> The basic approach here seems sound.

Thank you for reviewing the fix!

>
> I find the checking for cur->is_VMThread() somewhat overly 
> conservative - if we are at a safepoint, and executing this code, then 
> we must be the VMThread. But ok.

Agreed and simplified. Thanks!

>
> You could also use MutexLockerEx to avoid the need for locked and 
> unlocked paths to a common call, but that's just stylistic. Though if 
> you are grabbing the current thread anyway you can also use the 
> MutexLocker calls that take the thread arg - to avoid a second look-up 
> of the current thread.

Thank you for the reminder. I keep forgetting about it.
I will check which is better here; I just do not want to rerun all the testing.
But I'm in favor of making it simpler. :)
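
As a Java analogy of the "single path" idea (a sketch only; HotSpot itself is C++, and this assumes the suggested locker simply skips locking when it is given no mutex), a guard object can take an optional lock so that the safepoint and non-safepoint cases share one call. The class and method names below are hypothetical.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class OptionalLocker implements AutoCloseable {
    private final Lock lock; // null means "do not lock"

    public OptionalLocker(Lock lock) {
        this.lock = lock;
        if (lock != null) {
            lock.lock();
        }
    }

    @Override
    public void close() {
        if (lock != null) {
            lock.unlock();
        }
    }

    /**
     * Single code path for both cases; returns whether the lock was held
     * by the current thread while the guarded section ran.
     */
    public static boolean runGuarded(ReentrantLock stateLock, boolean atSafepoint) {
        try (OptionalLocker guard = new OptionalLocker(atSafepoint ? null : stateLock)) {
            return stateLock.isHeldByCurrentThread(); // the shared work would go here
        }
    }
}
```

The point of the pattern is that the decision whether to lock is made once, when the guard is constructed, instead of duplicating the call in a locked and an unlocked branch.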

Thanks,
Serguei

>
> David
> -----
>
>> Testing:
>>    All tests passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>
>>
>> Thanks,
>> Serguei
>>
>>
>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Dan,
>>>>>
>>>>> Thank you a lot for reviewing this!
>>>>>
>>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Please, review the fix for:
>>>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>
>>>>>>>
>>>>>>> Open webrev:
>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/share/vm/prims/jvmtiEnv.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/share/vm/prims/jvmtiEventController.cpp
>>>>>>     JvmtiEventController::set_frame_pop() is called by
>>>>>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>>>>>     JvmtiEnv::NotifyFramePop().
>>>>>>
>>>>>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>>>>>     JvmtiEventController::set_frame_pop() protected the work
>>>>>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>>>>>
>>>>>>       ets->get_frame_pops()->set(fpop);
>>>>>>       recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());
>>>>>
>>>>> Your check is the right thing to do, thanks!
>>>>> I should have explained this more clearly in this 2-nd review request.
>>>>>
>>>>> The approach I've taken here is that all these code paths are executed
>>>>> on the target thread or at a safepoint.
>>>>>
>>>>> It is true for all 3 functions:
>>>>>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>>>>>
>>>>> And the updated assert guards ensure that it is the case.
>>>>>
>>>>> It could be a good idea to add a No_Safepoint_Verifier for
>>>>> PopFrame() and NotifyFramePop() to make sure the current/target
>>>>> thread does not go to a safepoint until it has returned from
>>>>> update_for_pop_top_frame() and set_frame_pop() respectively.
>>>>> A No_Safepoint_Verifier may also be needed in
>>>>> JvmtiExport::post_method_exit().
>>>>>
>>>>> These are all the places where these functions are called:
>>>>>   prims/jvmtiEnv.cpp:
>>>>>     state->env_thread_state(this)->set_frame_pop(frame_number);  // JvmtiEnv::NotifyFramePop()
>>>>>   prims/jvmtiExport.cpp:
>>>>>     ets->clear_frame_pop(cur_frame_number);  // JvmtiExport::post_method_exit()
>>>>>   prims/jvmtiThreadState.cpp:
>>>>>     ets->clear_frame_pop(popframe_number);  // JvmtiThreadState::update_for_pop_top_frame()
>>>>>
>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never
>>>>> called now.
>>>>
>>>> There is still a concern about recompute_thread_enabled().
>>>> If it is normally always protected with the JvmtiThreadState_lock
>>>> then the approach above is not going to work.
>>>> I'm trying to check this now.
>>>
>>> Dan,
>>>
>>> I came to the conclusion that these 3 functions must still be protected
>>> by the JvmtiThreadState_lock when they are called outside of a safepoint.
>>> It is a little bit ugly, but it has to be safe.
>>>
>>> Please, let me know if you see any problems with that.
>>> I'll send a new webrev soon.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>     Since multiple threads can call JVM/TI NotifyFramePop() on the
>>>>>>     same target thread, what keeps the threads from messing with
>>>>>>     the list of frame pops simultaneously or messing with the
>>>>>>     thread enabled events bits in parallel?
>>>>>>
>>>>>>     I suspect that this might also be an issue for
>>>>>>     JvmtiEventController::clear_frame_pop() and
>>>>>>     JvmtiEventController::clear_to_frame_pop().
>>>>>>
>>>>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Summary:
>>>>>>>
>>>>>>>   It is the 2-nd round of review because the JTREG com/sun/jdi
>>>>>>> tests discovered a regression
>>>>>>>   in the first round change. The issue was in the
>>>>>>> JvmtiEventController::clear_frame_pop()
>>>>>>>   lock synchronization that is not allowed at safepoints.
>>>>>>>
>>>>>>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a
>>>>>>> VM operation for safety.
>>>>>>>   Also, I've removed the lock synchronization from the 3 impacted
>>>>>>> JvmtiEventController::
>>>>>>>   functions: set_frame_pop(), clear_frame_pop() and
>>>>>>> clear_to_frame_pop().
>>>>>>>
>>>>>>> Testing:
>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Please, review the fix for:
>>>>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>
>>>>>>>>
>>>>>>>> Open webrev:
>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Summary:
>>>>>>>>
>>>>>>>>   This is another Test Stabilization issue.
>>>>>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>>>>>   It is to use safepoints for updating the PopFrame data instead
>>>>>>>> of relying on the
>>>>>>>>   suspend equivalent condition mechanism
>>>>>>>> (JvmtiEnv::is_thread_fully_suspended())
>>>>>>>>   which is not adequate from the reliability point of view.
>>>>>>>>
>>>>>>>> Testing:
>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>


From daniel.daugherty at oracle.com  Fri Feb 28 13:12:57 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 28 Feb 2014 14:12:57 -0700
Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <5310E996.9010800@oracle.com>
References: <530D007D.4040602@oracle.com>	<530EF65C.5040707@oracle.com>	<530F8D52.2040400@oracle.com>	<530F9FF3.7030209@oracle.com>	<530FA81F.8040309@oracle.com>	<530FB572.4010507@oracle.com>
	<5310077A.7010004@oracle.com> <53102702.4070603@oracle.com>
	<5310E996.9010800@oracle.com>
Message-ID: <5310FBD9.5090607@oracle.com>

On 2/28/14 12:55 PM, serguei.spitsyn at oracle.com wrote:
> On 2/27/14 10:04 PM, David Holmes wrote:
>> Hi Serguei,
>>
>> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for:
>>>    https://bugs.openjdk.java.net/browse/JDK-6471769
>>>
>>>
>>> Open webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 
>>>

Thumbs up! (including the tweaks in the .4 version)

src/share/vm/runtime/vm_operations.hpp
     No comments.

src/share/vm/prims/jvmtiEnvBase.hpp
     line 375: bool allow_nested_vm_operations() const        { return true; }
         Does VM_SetFramePop really have to permit nested VMops?

src/share/vm/prims/jvmtiEnv.cpp
     No comments.

src/share/vm/prims/jvmtiEnvThreadState.cpp
     No comments.

src/share/vm/prims/jvmtiEventController.cpp
     No comments.

src/share/vm/prims/jvmtiThreadState.cpp
     No comments.

Dan


>>>
>>>
>>> Summary:
>>>
>>>    It is another attempt to fix the JTREG com/sun/jdi tests regression
>>>    discovered in the first round change.
>>>    The fix is to avoid lock synchronization at safepoints
>>>    (jvmtiEventController.cpp).
>>>    Thanks to Dan for catching the problem in the 2-nd round of review!
>>
>> The basic approach here seems sound.
>
> Thank you for reviewing the fix!
>
>>
>> I find the checking for cur->is_VMThread() somewhat overly 
>> conservative - if we are at a safepoint, and executing this code, 
>> then we must be the VMThread. But ok.
>
> Agreed and simplified. Thanks!
>
>>
>> You could also use MutexLockerEx to avoid the need for locked and 
>> unlocked paths to a common call, but that's just stylistic. Though if 
>> you are grabbing the current thread anyway you can also use the 
>> MutexLocker calls that take the thread arg - to avoid a second 
>> look-up of the current thread.
>
> Thank you for the reminder. I keep forgetting about it.
> I will check which is better here; I just do not want to rerun all the
> testing.
> But I'm in favor of making it simpler. :)
>
> Thanks,
> Serguei
>
>>
>> David
>> -----
>>
>>> Testing:
>>>    All tests passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote:
>>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Dan,
>>>>>>
>>>>>> Thank you a lot for reviewing this!
>>>>>>
>>>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Please, review the fix for:
>>>>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>
>>>>>>>>
>>>>>>>> Open webrev:
>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>>     No comments.
>>>>>>>
>>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>>>>>     No comments.
>>>>>>>
>>>>>>> src/share/vm/prims/jvmtiEnv.cpp
>>>>>>>     No comments.
>>>>>>>
>>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>>>>>     No comments.
>>>>>>>
>>>>>>> src/share/vm/prims/jvmtiEventController.cpp
>>>>>>>     JvmtiEventController::set_frame_pop() is called by
>>>>>>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>>>>>>     JvmtiEnv::NotifyFramePop().
>>>>>>>
>>>>>>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>>>>>>     JvmtiEventController::set_frame_pop() protected the work
>>>>>>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>>>>>>
>>>>>>>       ets->get_frame_pops()->set(fpop);
>>>>>>>       recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());
>>>>>>
>>>>>> Your check is the right thing to do, thanks!
>>>>>> I should have explained this more clearly in this 2-nd review request.
>>>>>>
>>>>>> The approach I've taken here is that all these code paths are
>>>>>> executed on the target thread or at a safepoint.
>>>>>>
>>>>>> It is true for all 3 functions:
>>>>>>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>>>>>>
>>>>>> And the updated assert guards ensure that it is the case.
>>>>>>
>>>>>> It could be a good idea to add a No_Safepoint_Verifier for
>>>>>> PopFrame() and NotifyFramePop() to make sure the current/target
>>>>>> thread does not go to a safepoint until it has returned from
>>>>>> update_for_pop_top_frame() and set_frame_pop() respectively.
>>>>>> A No_Safepoint_Verifier may also be needed in
>>>>>> JvmtiExport::post_method_exit().
>>>>>>
>>>>>> These are all the places where these functions are called:
>>>>>>   prims/jvmtiEnv.cpp:
>>>>>>     state->env_thread_state(this)->set_frame_pop(frame_number);  // JvmtiEnv::NotifyFramePop()
>>>>>>   prims/jvmtiExport.cpp:
>>>>>>     ets->clear_frame_pop(cur_frame_number);  // JvmtiExport::post_method_exit()
>>>>>>   prims/jvmtiThreadState.cpp:
>>>>>>     ets->clear_frame_pop(popframe_number);  // JvmtiThreadState::update_for_pop_top_frame()
>>>>>>
>>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never
>>>>>> called now.
>>>>>
>>>>> There is still a concern about recompute_thread_enabled().
>>>>> If it is normally always protected with the JvmtiThreadState_lock
>>>>> then the approach above is not going to work.
>>>>> I'm trying to check this now.
>>>>
>>>> Dan,
>>>>
>>>> I came to a conclusion that these 3 functions still must be protected
>>>> by the JvmtiThreadState_lock when they are called out of a safepoint.
>>>> It is a little bit ugly but has to be safe though.
>>>>
>>>> Please, let me know if you see any problems with that.
>>>> I'll send a new webrev soon.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>     Since multiple threads can call JVM/TI NotifyFramePop() on the
>>>>>>>     same target thread, what keeps the threads from messing with
>>>>>>>     the list of frame pops simultaneously or messing with the
>>>>>>>     thread enabled events bits in parallel?
>>>>>>>
>>>>>>>     I suspect that this might also be an issue for
>>>>>>>     JvmtiEventController::clear_frame_pop() and
>>>>>>>     JvmtiEventController::clear_to_frame_pop() also.
>>>>>>>
>>>>>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>>>>>     No comments.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Summary:
>>>>>>>>
>>>>>>>>   It is the 2-nd round of review because the JTREG com/sun/jdi
>>>>>>>> tests discovered a regression
>>>>>>>>   in the first round change. The issue was in the
>>>>>>>> JvmtiEventController::clear_frame_pop()
>>>>>>>>   lock synchronization that is not allowed at safepoints.
>>>>>>>>
>>>>>>>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a
>>>>>>>> VM operation for safety.
>>>>>>>>   Also, I've removed the lock synchronization from the 3 impacted
>>>>>>>> JvmtiEventController::
>>>>>>>>   functions: set_frame_pop(), clear_frame_pop() and
>>>>>>>> clear_to_frame_pop().
>>>>>>>>
>>>>>>>> Testing:
>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Please, review the fix for:
>>>>>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Open webrev:
>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Summary:
>>>>>>>>>
>>>>>>>>>   This is another Test Stabilization issue.
>>>>>>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>>>>>>   It is to use safepoints for updating the PopFrame data instead
>>>>>>>>> of relying on the
>>>>>>>>>   suspend equivalent condition mechanism
>>>>>>>>> (JvmtiEnv::is_thread_fully_suspended())
>>>>>>>>>   which is not adequate from the reliability point of view.
>>>>>>>>>
>>>>>>>>> Testing:
>>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>


From frederic.parain at oracle.com  Fri Feb 28 13:19:42 2014
From: frederic.parain at oracle.com (frederic parain)
Date: Fri, 28 Feb 2014 22:19:42 +0100
Subject: RFR(S): JDK-8035952 Remove use of JVM_Open, JVM_Read and JVM_Close
	functions from serviceability code
Message-ID: <5310FD6E.7050305@oracle.com>

Greetings,

Please review this small changeset for CR JDK-8035952
"Remove use of JVM_Open, JVM_Read and JVM_Close functions from 
serviceability code"

Bug:
https://bugs.openjdk.java.net/browse/JDK-8035952

Webrev:
http://cr.openjdk.java.net/~fparain/8035952/webrev.00/

Tested with jdk_management test suite + manual tests
to verify consistency of values returned by new code.

Thank you,

Fred

-- 
Frederic Parain - Oracle
Grenoble Engineering Center - France
Phone: +33 4 76 18 81 17
Email: Frederic.Parain at oracle.com

From serguei.spitsyn at oracle.com  Fri Feb 28 13:24:47 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 28 Feb 2014 13:24:47 -0800
Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <5310FBD9.5090607@oracle.com>
References: <530D007D.4040602@oracle.com>	<530EF65C.5040707@oracle.com>	<530F8D52.2040400@oracle.com>	<530F9FF3.7030209@oracle.com>	<530FA81F.8040309@oracle.com>	<530FB572.4010507@oracle.com>
	<5310077A.7010004@oracle.com> <53102702.4070603@oracle.com>
	<5310E996.9010800@oracle.com> <5310FBD9.5090607@oracle.com>
Message-ID: <5310FE9F.1060606@oracle.com>

On 2/28/14 1:12 PM, Daniel D. Daugherty wrote:
> On 2/28/14 12:55 PM, serguei.spitsyn at oracle.com wrote:
>> On 2/27/14 10:04 PM, David Holmes wrote:
>>> Hi Serguei,
>>>
>>> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for:
>>>>    https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>
>>>>
>>>> Open webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 
>>>>
>
> Thumbs up! (including the tweaks in the .4 version)

Thanks a lot, Dan!

>
> src/share/vm/runtime/vm_operations.hpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnvBase.hpp
>     line 375: bool allow_nested_vm_operations() const        { return 
> true; }
>         Does VM_SetFramePop really have to permit nested VMops?

Yes.
It was a real surprise that it is really necessary.

Thanks!
Serguei

>
> src/share/vm/prims/jvmtiEnv.cpp
>     No comments.
>
> src/share/vm/prims/jvmtiEnvThreadState.cpp
>     No comments.
>
> src/share/vm/prims/jvmtiEventController.cpp
>     No comments.
>
> src/share/vm/prims/jvmtiThreadState.cpp
>     No comments.
>
> Dan
>
>
>
>>>>
>>>>
>>>> Summary:
>>>>
>>>>    It is another attempt to fix the JTREG com/sun/jdi tests regression
>>>>    discovered in the first round change.
>>>>    The fix is to avoid lock synchronization at
>>>> safepoints(jvmtiEventController.cpp).
>>>>    Thanks to Dan for catching the problem in the 2-nd round of review!
>>>
>>> The basic approach here seems sound.
>>
>> Thank you for reviewing the fix!
>>
>>>
>>> I find the checking for cur->is_VMThread() somewhat overly 
>>> conservative - if we are at a safepoint, and executing this code, 
>>> then we must be the VMThread. But ok.
>>
>> Agreed and simplified. Thanks!
>>
>>>
>>> You could also use MutexLockerEx to avoid the need for locked and 
>>> unlocked paths to a common call, but that's just stylistic. Though 
>>> if you are grabbing the current thread anyway you can also use the 
>>> MutexLocker calls that take the thread arg - to avoid a second 
>>> look-up of the current thread.
>>
>> Thank you for reminding. I keep forgetting about it.
>> Will check what is better here, just do not want to rerun the whole 
>> testing.
>> But I'm in favor to make it simpler. :)
>>
>> Thanks,
>> Serguei
>>
>>>
>>> David
>>> -----
>>>
>>>> Testing:
>>>>    All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg 
>>>> com/sun/jdi
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Dan,
>>>>>>>
>>>>>>> Thank you a lot for reviewing this!
>>>>>>>
>>>>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>>>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Please, review the fix for:
>>>>>>>>>   https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Open webrev:
>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>>>     No comments.
>>>>>>>>
>>>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>>>>>>     No comments.
>>>>>>>>
>>>>>>>> src/share/vm/prims/jvmtiEnv.cpp
>>>>>>>>     No comments.
>>>>>>>>
>>>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>>>>>>     No comments.
>>>>>>>>
>>>>>>>> src/share/vm/prims/jvmtiEventController.cpp
>>>>>>>>     JvmtiEventController::set_frame_pop() is called by
>>>>>>>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>>>>>>>     JvmtiEnv::NotifyFramePop().
>>>>>>>>
>>>>>>>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>>>>>>>     JvmtiEventController::set_frame_pop() protected the work
>>>>>>>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>>>>>>>
>>>>>>>>       ets->get_frame_pops()->set(fpop);
>>>>>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state());
>>>>>>>
>>>>>>> Your check is the right thing to do, thanks!
>>>>>>> I had to explain this more clearly in this 2-nd review request.
>>>>>>>
>>>>>>> The approach I've taken here is that all these code paths are
>>>>>>> executed
>>>>>>> on the target thread or at a safepoint.
>>>>>>>
>>>>>>> It is true for all 3 functions:
>>>>>>>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>>>>>>>
>>>>>>> And the updated assert guards ensure that it is the case.
>>>>>>>
>>>>>>> It could be a good idea to add a No_Safepoint_Verifier for
>>>>>>> PopFrame() and NotifyFramePop()
>>>>>>> to make sure the current/target thread does not go to safepoint
>>>>>>> until it is returned from
>>>>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly.
>>>>>>> A No_Safepoint_Verifier can be also needed in the
>>>>>>> JvmtiExport::post_method_exit().
>>>>>>>
>>>>>>> These are all places where these functions are called:
>>>>>>> prims/jvmtiEnv.cpp:
>>>>>>> state->env_thread_state(this)->set_frame_pop(frame_number); //
>>>>>>> JvmtiEnv::NotifyFramePop()
>>>>>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); //
>>>>>>> JvmtiExport::post_method_exit()
>>>>>>> prims/jvmtiThreadState.cpp:
>>>>>>> ets->clear_frame_pop(popframe_number); //
>>>>>>> JvmtiThreadState::update_for_pop_top_frame()
>>>>>>>
>>>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never
>>>>>>> called now.
>>>>>>
>>>>>> There is still a concern about recompute_thread_enabled().
>>>>>> If it is normally always protected with the JvmtiThreadState_lock
>>>>>> then the approach above is not going to work.
>>>>>> I'm trying to check this now.
>>>>>
>>>>> Dan,
>>>>>
>>>>> I came to a conclusion that these 3 functions still must be protected
>>>>> by the JvmtiThreadState_lock when they are called out of a safepoint.
>>>>> It is a little bit ugly but has to be safe though.
>>>>>
>>>>> Please, let me know if you see any problems with that.
>>>>> I'll send a new webrev soon.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>     Since multiple threads can call JVM/TI NotifyFramePop() on the
>>>>>>>>     same target thread, what keeps the threads from messing with
>>>>>>>>     the list of frame pops simultaneously or messing with the
>>>>>>>>     thread enabled events bits in parallel?
>>>>>>>>
>>>>>>>>     I suspect that this might also be an issue for
>>>>>>>>     JvmtiEventController::clear_frame_pop() and
>>>>>>>>     JvmtiEventController::clear_to_frame_pop() also.
>>>>>>>>
>>>>>>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>>>>>>     No comments.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Summary:
>>>>>>>>>
>>>>>>>>>   It is the 2-nd round of review because the JTREG com/sun/jdi
>>>>>>>>> tests discovered a regression
>>>>>>>>>   in the first round change. The issue was in the
>>>>>>>>> JvmtiEventController::clear_frame_pop()
>>>>>>>>>   lock synchronization that is not allowed at safepoints.
>>>>>>>>>
>>>>>>>>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a
>>>>>>>>> VM operation for safety.
>>>>>>>>>   Also, I've removed the lock synchronization from the 3 impacted
>>>>>>>>> JvmtiEventController::
>>>>>>>>>   functions: set_frame_pop(), clear_frame_pop() and
>>>>>>>>> clear_to_frame_pop().
>>>>>>>>>
>>>>>>>>> Testing:
>>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Please, review the fix for:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Open webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Summary:
>>>>>>>>>>
>>>>>>>>>>   This is another Test Stabilization issue.
>>>>>>>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>>>>>>>   It is to use safepoints for updating the PopFrame data instead
>>>>>>>>>> of relying on the
>>>>>>>>>>   suspend equivalent condition mechanism
>>>>>>>>>> (JvmtiEnv::is_thread_fully_suspended())
>>>>>>>>>>   which is not adequate from the reliability point of view.
>>>>>>>>>>
>>>>>>>>>> Testing:
>>>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
>


From daniel.daugherty at oracle.com  Fri Feb 28 13:26:43 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 28 Feb 2014 14:26:43 -0700
Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <5310FE9F.1060606@oracle.com>
References: <530D007D.4040602@oracle.com>	<530EF65C.5040707@oracle.com>	<530F8D52.2040400@oracle.com>	<530F9FF3.7030209@oracle.com>	<530FA81F.8040309@oracle.com>	<530FB572.4010507@oracle.com>
	<5310077A.7010004@oracle.com> <53102702.4070603@oracle.com>
	<5310E996.9010800@oracle.com> <5310FBD9.5090607@oracle.com>
	<5310FE9F.1060606@oracle.com>
Message-ID: <5310FF13.8050204@oracle.com>

On 2/28/14 2:24 PM, serguei.spitsyn at oracle.com wrote:
> On 2/28/14 1:12 PM, Daniel D. Daugherty wrote:
>> On 2/28/14 12:55 PM, serguei.spitsyn at oracle.com wrote:
>>> On 2/27/14 10:04 PM, David Holmes wrote:
>>>> Hi Serguei,
>>>>
>>>> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Please, review the fix for:
>>>>>    https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>
>>>>>
>>>>> Open webrev:
>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 
>>>>>
>>
>> Thumbs up! (including the tweaks in the .4 version)
>
> Thanks a lot, Dan!
>
>>
>> src/share/vm/runtime/vm_operations.hpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiEnvBase.hpp
>>     line 375: bool allow_nested_vm_operations() const        { return 
>> true; }
>>         Does VM_SetFramePop really have to permit nested VMops?
>
> Yes.
> It was a real surprise that it is really necessary.

Maybe add a one line comment saying something like:

// needs to permit nested because this VMop can be invoked from XXX VMop...

or something like that.

Dan


>
> Thanks!
> Serguei
>
>>
>> src/share/vm/prims/jvmtiEnv.cpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiEventController.cpp
>>     No comments.
>>
>> src/share/vm/prims/jvmtiThreadState.cpp
>>     No comments.
>>
>> Dan
>>
>>
>>
>>>>>
>>>>>
>>>>> Summary:
>>>>>
>>>>>    It is another attempt to fix the JTREG com/sun/jdi tests 
>>>>> regression
>>>>>    discovered in the first round change.
>>>>>    The fix is to avoid lock synchronization at
>>>>> safepoints(jvmtiEventController.cpp).
>>>>>    Thanks to Dan for catching the problem in the 2-nd round of 
>>>>> review!
>>>>
>>>> The basic approach here seems sound.
>>>
>>> Thank you for reviewing the fix!
>>>
>>>>
>>>> I find the checking for cur->is_VMThread() somewhat overly 
>>>> conservative - if we are at a safepoint, and executing this code, 
>>>> then we must be the VMThread. But ok.
>>>
>>> Agreed and simplified. Thanks!
>>>
>>>>
>>>> You could also use MutexLockerEx to avoid the need for locked and 
>>>> unlocked paths to a common call, but that's just stylistic. Though 
>>>> if you are grabbing the current thread anyway you can also use the 
>>>> MutexLocker calls that take the thread arg - to avoid a second 
>>>> look-up of the current thread.
>>>
>>> Thank you for reminding. I keep forgetting about it.
>>> Will check what is better here, just do not want to rerun the whole 
>>> testing.
>>> But I'm in favor to make it simpler. :)
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Testing:
>>>>>    All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg 
>>>>> com/sun/jdi
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Dan,
>>>>>>>>
>>>>>>>> Thank you a lot for reviewing this!
>>>>>>>>
>>>>>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>>>>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Please, review the fix for:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Open webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/share/vm/prims/jvmtiEnv.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/share/vm/prims/jvmtiEventController.cpp
>>>>>>>>>     JvmtiEventController::set_frame_pop() is called by
>>>>>>>>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>>>>>>>>     JvmtiEnv::NotifyFramePop().
>>>>>>>>>
>>>>>>>>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>>>>>>>>     JvmtiEventController::set_frame_pop() protected the work
>>>>>>>>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>>>>>>>>
>>>>>>>>>       ets->get_frame_pops()->set(fpop);
>>>>>>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); 
>>>>>>>>>
>>>>>>>>
>>>>>>>> Your check is the right thing to do, thanks!
>>>>>>>> I had to explain this more clearly in this 2-nd review request.
>>>>>>>>
>>>>>>>> The approach I've taken here is that all these code paths are
>>>>>>>> executed
>>>>>>>> on the target thread or at a safepoint.
>>>>>>>>
>>>>>>>> It is true for all 3 functions:
>>>>>>>>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>>>>>>>>
>>>>>>>> And the updated assert guards ensure that it is the case.
>>>>>>>>
>>>>>>>> It could be a good idea to add a No_Safepoint_Verifier for
>>>>>>>> PopFrame() and NotifyFramePop()
>>>>>>>> to make sure the current/target thread does not go to safepoint
>>>>>>>> until it is returned from
>>>>>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly.
>>>>>>>> A No_Safepoint_Verifier can be also needed in the
>>>>>>>> JvmtiExport::post_method_exit().
>>>>>>>>
>>>>>>>> These are all places where these functions are called:
>>>>>>>> prims/jvmtiEnv.cpp:
>>>>>>>> state->env_thread_state(this)->set_frame_pop(frame_number); //
>>>>>>>> JvmtiEnv::NotifyFramePop()
>>>>>>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); //
>>>>>>>> JvmtiExport::post_method_exit()
>>>>>>>> prims/jvmtiThreadState.cpp:
>>>>>>>> ets->clear_frame_pop(popframe_number); //
>>>>>>>> JvmtiThreadState::update_for_pop_top_frame()
>>>>>>>>
>>>>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never
>>>>>>>> called now.
>>>>>>>
>>>>>>> There is still a concern about recompute_thread_enabled().
>>>>>>> If it is normally always protected with the JvmtiThreadState_lock
>>>>>>> then the approach above is not going to work.
>>>>>>> I'm trying to check this now.
>>>>>>
>>>>>> Dan,
>>>>>>
>>>>>> I came to a conclusion that these 3 functions still must be 
>>>>>> protected
>>>>>> by the JvmtiThreadState_lock when they are called out of a 
>>>>>> safepoint.
>>>>>> It is a little bit ugly but has to be safe though.
>>>>>>
>>>>>> Please, let me know if you see any problems with that.
>>>>>> I'll send a new webrev soon.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>     Since multiple threads can call JVM/TI NotifyFramePop() on 
>>>>>>>>> the
>>>>>>>>>     same target thread, what keeps the threads from messing with
>>>>>>>>>     the list of frame pops simultaneously or messing with the
>>>>>>>>>     thread enabled events bits in parallel?
>>>>>>>>>
>>>>>>>>>     I suspect that this might also be an issue for
>>>>>>>>>     JvmtiEventController::clear_frame_pop() and
>>>>>>>>>     JvmtiEventController::clear_to_frame_pop() also.
>>>>>>>>>
>>>>>>>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Summary:
>>>>>>>>>>
>>>>>>>>>>   It is the 2-nd round of review because the JTREG com/sun/jdi
>>>>>>>>>> tests discovered a regression
>>>>>>>>>>   in the first round change. The issue was in the
>>>>>>>>>> JvmtiEventController::clear_frame_pop()
>>>>>>>>>>   lock synchronization that is not allowed at safepoints.
>>>>>>>>>>
>>>>>>>>>>   As a result I've changed the JvmtiEnv::NotifyFramePop to use a
>>>>>>>>>> VM operation for safety.
>>>>>>>>>>   Also, I've removed the lock synchronization from the 3 
>>>>>>>>>> impacted
>>>>>>>>>> JvmtiEventController::
>>>>>>>>>>   functions: set_frame_pop(), clear_frame_pop() and
>>>>>>>>>> clear_to_frame_pop().
>>>>>>>>>>
>>>>>>>>>> Testing:
>>>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Please, review the fix for:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Open webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Summary:
>>>>>>>>>>>
>>>>>>>>>>>   This is another Test Stabilization issue.
>>>>>>>>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>>>>>>>>   It is to use safepoints for updating the PopFrame data 
>>>>>>>>>>> instead
>>>>>>>>>>> of relying on the
>>>>>>>>>>>   suspend equivalent condition mechanism
>>>>>>>>>>> (JvmtiEnv::is_thread_fully_suspended())
>>>>>>>>>>>   which is not adequate from the reliability point of view.
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Fri Feb 28 13:33:46 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 28 Feb 2014 13:33:46 -0800
Subject: 3-nd round RFR 6471769: Error: assert(_cur_stack_depth ==
	count_frames(), "cur_stack_depth out of sync")
In-Reply-To: <5310FF13.8050204@oracle.com>
References: <530D007D.4040602@oracle.com>	<530EF65C.5040707@oracle.com>	<530F8D52.2040400@oracle.com>	<530F9FF3.7030209@oracle.com>	<530FA81F.8040309@oracle.com>	<530FB572.4010507@oracle.com>
	<5310077A.7010004@oracle.com> <53102702.4070603@oracle.com>
	<5310E996.9010800@oracle.com> <5310FBD9.5090607@oracle.com>
	<5310FE9F.1060606@oracle.com> <5310FF13.8050204@oracle.com>
Message-ID: <531100BA.1020303@oracle.com>

On 2/28/14 1:26 PM, Daniel D. Daugherty wrote:
> On 2/28/14 2:24 PM, serguei.spitsyn at oracle.com wrote:
>> On 2/28/14 1:12 PM, Daniel D. Daugherty wrote:
>>> On 2/28/14 12:55 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 2/27/14 10:04 PM, David Holmes wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> On 28/02/2014 1:50 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Please, review the fix for:
>>>>>>    https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>
>>>>>>
>>>>>> Open webrev:
>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.3 
>>>>>>
>>>
>>> Thumbs up! (including the tweaks in the .4 version)
>>
>> Thanks a lot, Dan!
>>
>>>
>>> src/share/vm/runtime/vm_operations.hpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>     line 375: bool allow_nested_vm_operations() const        { 
>>> return true; }
>>>         Does VM_SetFramePop really have to permit nested VMops?
>>
>> Yes.
>> It was a real surprise that it is really necessary.
>
> Maybe add a one line comment saying something like:
>
> // needs to permit nested because this VMop can be invoked from XXX 
> VMop...
>
> or something like that.

Ok, I'll add a comment before the push.

The latest public webrev with the simplification fixes suggested by 
David (no above comment yet):
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.4/

Thanks,
Serguei

>
> Dan
>
>
>>
>> Thanks!
>> Serguei
>>
>>>
>>> src/share/vm/prims/jvmtiEnv.cpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvmtiEventController.cpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>     No comments.
>>>
>>> Dan
>>>
>>>
>>>
>>>>>>
>>>>>>
>>>>>> Summary:
>>>>>>
>>>>>>    It is another attempt to fix the JTREG com/sun/jdi tests 
>>>>>> regression
>>>>>>    discovered in the first round change.
>>>>>>    The fix is to avoid lock synchronization at
>>>>>> safepoints(jvmtiEventController.cpp).
>>>>>>    Thanks to Dan for catching the problem in the 2-nd round of 
>>>>>> review!
>>>>>
>>>>> The basic approach here seems sound.
>>>>
>>>> Thank you for reviewing the fix!
>>>>
>>>>>
>>>>> I find the checking for cur->is_VMThread() somewhat overly 
>>>>> conservative - if we are at a safepoint, and executing this code, 
>>>>> then we must be the VMThread. But ok.
>>>>
>>>> Agreed and simplified. Thanks!
>>>>
>>>>>
>>>>> You could also use MutexLockerEx to avoid the need for locked and 
>>>>> unlocked paths to a common call, but that's just stylistic. Though 
>>>>> if you are grabbing the current thread anyway you can also use the 
>>>>> MutexLocker calls that take the thread arg - to avoid a second 
>>>>> look-up of the current thread.
>>>>
>>>> Thank you for reminding. I keep forgetting about it.
>>>> Will check what is better here, just do not want to rerun the whole 
>>>> testing.
>>>> But I'm in favor to make it simpler. :)
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Testing:
>>>>>>    All tests are passed: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg 
>>>>>> com/sun/jdi
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 2/27/14 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> On 2/27/14 1:03 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> On 2/27/14 12:28 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Dan,
>>>>>>>>>
>>>>>>>>> Thank you a lot for reviewing this!
>>>>>>>>>
>>>>>>>>> On 2/27/14 11:09 AM, Daniel D. Daugherty wrote:
>>>>>>>>>> On 2/27/14 1:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Please, review the fix for:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Open webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.2 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> src/share/vm/runtime/vm_operations.hpp
>>>>>>>>>>     No comments.
>>>>>>>>>>
>>>>>>>>>> src/share/vm/prims/jvmtiEnvBase.hpp
>>>>>>>>>>     No comments.
>>>>>>>>>>
>>>>>>>>>> src/share/vm/prims/jvmtiEnv.cpp
>>>>>>>>>>     No comments.
>>>>>>>>>>
>>>>>>>>>> src/share/vm/prims/jvmtiEnvThreadState.cpp
>>>>>>>>>>     No comments.
>>>>>>>>>>
>>>>>>>>>> src/share/vm/prims/jvmtiEventController.cpp
>>>>>>>>>>     JvmtiEventController::set_frame_pop() is called by
>>>>>>>>>>     JvmtiEnvThreadState::set_frame_pop() which is called by
>>>>>>>>>>     JvmtiEnv::NotifyFramePop().
>>>>>>>>>>
>>>>>>>>>>     The "MutexLocker mu(JvmtiThreadState_lock)" in
>>>>>>>>>>     JvmtiEventController::set_frame_pop() protected the work
>>>>>>>>>>     done by JvmtiEventControllerPrivate::set_frame_pop():
>>>>>>>>>>
>>>>>>>>>>       ets->get_frame_pops()->set(fpop);
>>>>>>>>>> recompute_thread_enabled(ets->get_thread()->jvmti_thread_state()); 
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Your check is the right thing to do, thanks!
>>>>>>>>> I had to explain this more clearly in this 2-nd review request.
>>>>>>>>>
>>>>>>>>> The approach I've taken here is that all these code paths are
>>>>>>>>> executed
>>>>>>>>> on the target thread or at a safepoint.
>>>>>>>>>
>>>>>>>>> It is true for all 3 functions:
>>>>>>>>>   set_frame_pop(), clear_frame_pop() and clear_to_frame_pop().
>>>>>>>>>
>>>>>>>>> And the updated assert guards ensure that it is the case.
>>>>>>>>>
>>>>>>>>> It could be a good idea to add a No_Safepoint_Verifier for
>>>>>>>>> PopFrame() and NotifyFramePop()
>>>>>>>>> to make sure the current/target thread does not go to safepoint
>>>>>>>>> until it is returned from
>>>>>>>>> update_for_pop_top_frame() and set_frame_pop() correspondingly.
>>>>>>>>> A No_Safepoint_Verifier can be also needed in the
>>>>>>>>> JvmtiExport::post_method_exit().
>>>>>>>>>
>>>>>>>>> These are all places where these functions are called:
>>>>>>>>> prims/jvmtiEnv.cpp:
>>>>>>>>> state->env_thread_state(this)->set_frame_pop(frame_number); //
>>>>>>>>> JvmtiEnv::NotifyFramePop()
>>>>>>>>> prims/jvmtiExport.cpp: ets->clear_frame_pop(cur_frame_number); //
>>>>>>>>> JvmtiExport::post_method_exit()
>>>>>>>>> prims/jvmtiThreadState.cpp:
>>>>>>>>> ets->clear_frame_pop(popframe_number); //
>>>>>>>>> JvmtiThreadState::update_for_pop_top_frame()
>>>>>>>>>
>>>>>>>>> The function JvmtiEnvThreadState::clear_to_frame_pop() is never
>>>>>>>>> called now.
>>>>>>>>
>>>>>>>> There is still a concern about recompute_thread_enabled().
>>>>>>>> If it is normally always protected with the JvmtiThreadState_lock
>>>>>>>> then the approach above is not going to work.
>>>>>>>> I'm trying to check this now.
>>>>>>>
>>>>>>> Dan,
>>>>>>>
>>>>>>> I came to a conclusion that these 3 functions still must be 
>>>>>>> protected
>>>>>>> by the JvmtiThreadState_lock when they are called out of a 
>>>>>>> safepoint.
>>>>>>> It is a little bit ugly but has to be safe though.
>>>>>>>
>>>>>>> Please, let me know if you see any problems with that.
>>>>>>> I'll send a new webrev soon.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     Since multiple threads can call JVM/TI NotifyFramePop() 
>>>>>>>>>> on the
>>>>>>>>>>     same target thread, what keeps the threads from messing with
>>>>>>>>>>     the list of frame pops simultaneously or messing with the
>>>>>>>>>>     thread enabled events bits in parallel?
>>>>>>>>>>
>>>>>>>>>>     I suspect that this might also be an issue for
>>>>>>>>>>     JvmtiEventController::clear_frame_pop() and
>>>>>>>>>>     JvmtiEventController::clear_to_frame_pop() also.
>>>>>>>>>>
>>>>>>>>>> src/share/vm/prims/jvmtiThreadState.cpp
>>>>>>>>>>     No comments.
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Summary:
>>>>>>>>>>>
>>>>>>>>>>>   This is the 2nd round of review because the JTREG com/sun/jdi
>>>>>>>>>>>   tests discovered a regression in the first round change.
>>>>>>>>>>>   The issue was in the JvmtiEventController::clear_frame_pop()
>>>>>>>>>>>   lock synchronization that is not allowed at safepoints.
>>>>>>>>>>>
>>>>>>>>>>>   As a result I've changed JvmtiEnv::NotifyFramePop to use a
>>>>>>>>>>>   VM operation for safety.
>>>>>>>>>>>   Also, I've removed the lock synchronization from the 3 impacted
>>>>>>>>>>>   JvmtiEventController:: functions: set_frame_pop(),
>>>>>>>>>>>   clear_frame_pop() and clear_to_frame_pop().
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2/25/14 12:43 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Please, review the fix for:
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6471769
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Open webrev:
>>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/6471769-JVMTI-DEPTH.1 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Summary:
>>>>>>>>>>>>
>>>>>>>>>>>>   This is another Test Stabilization issue.
>>>>>>>>>>>>   The fix is very similar to other JVMTI stabilization fixes.
>>>>>>>>>>>>   It is to use safepoints for updating the PopFrame data
>>>>>>>>>>>>   instead of relying on the suspend equivalent condition
>>>>>>>>>>>>   mechanism (JvmtiEnv::is_thread_fully_suspended())
>>>>>>>>>>>>   which is not adequate from the reliability point of view.
>>>>>>>>>>>>
>>>>>>>>>>>> Testing:
>>>>>>>>>>>>   In progress: nsk.jvmti, nsk.jdi, nsk.jdwp, JTreg com/sun/jdi
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>


From daniel.daugherty at oracle.com  Fri Feb 28 13:53:43 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 28 Feb 2014 14:53:43 -0700
Subject: RFR round 0 JDK8u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <53081C1F.2070008@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
Message-ID: <53110567.6020600@oracle.com>

Greetings,

This is a code review request for the JDK7u-hs-dev backport of the
following ObjectMonitor-JVM/TI hang fix:

     8028073 race condition in ObjectMonitor implementation causing deadlocks
     https://bugs.openjdk.java.net/browse/JDK-8028073

Here is the JDK7u-hs-dev webrev URL:

     http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk7u-hs-dev/

This is a straightforward backport of the JDK8u-hs-dev version of the
fix modulo line numbers and context diff anchors. The diff of the two
'hg diff --nodates' files is below.

Same testing as for the JDK9 and JDK8u versions with the same results.

I need one sanity check review from someone.

Dan

$ diff 8028073-diffs-jdk[78]u-hs.txt.00
1c1
< diff -r bdb3798edd11 src/share/vm/prims/jvm.cpp
---
 > diff -r 231051bff068 src/share/vm/prims/jvm.cpp
11c11
< @@ -527,6 +527,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
---
 > @@ -518,6 +518,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
22c22
<    ObjectSynchronizer::wait(obj, ms, THREAD);
---
 >    ObjectSynchronizer::wait(obj, ms, CHECK);
24c24
< diff -r bdb3798edd11 src/share/vm/runtime/objectMonitor.cpp
---
 > diff -r 231051bff068 src/share/vm/runtime/objectMonitor.cpp
34c34
< @@ -385,6 +385,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
---
 > @@ -382,6 +382,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
47c47
< @@ -442,6 +448,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
---
 > @@ -439,6 +445,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
60c60
< @@ -1459,6 +1471,14 @@ void ObjectMonitor::wait(jlong millis, b
---
 > @@ -1456,6 +1468,14 @@ void ObjectMonitor::wait(jlong millis, b
75c75
< @@ -1501,21 +1521,6 @@ void ObjectMonitor::wait(jlong millis, b
---
 > @@ -1498,21 +1518,6 @@ void ObjectMonitor::wait(jlong millis, b
97c97
< @@ -1598,6 +1603,33 @@ void ObjectMonitor::wait(jlong millis, b
---
 > @@ -1595,6 +1600,33 @@ void ObjectMonitor::wait(jlong millis, b


On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> This is a code review request for the JDK8u-hs-dev backport of the
> following ObjectMonitor-JVM/TI hang fix:
>
>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>     https://bugs.openjdk.java.net/browse/JDK-8028073
>
> Here is the JDK8u-hs-dev webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>
> This is _almost_ a straightforward backport of the JDK9 fix. The only
> difference to the fix was discussed at the end of the JDK9 review and
> was determined to only be needed in versions of HotSpot without the
> fix for 8028280:
>
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 
>
>
> 8028280 has not yet been backported to JDK8u-hs-dev.
>
> The easiest way to review the backport is to download the two patch
> files from the webrevs and compare them with something like:
>
>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>
> The same testing has been performed on the JDK8u-hs-dev version as
> with the JDK9-hs-runtime version.
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
> > Greetings,
> >
> > I have a fix ready for the following bug:
> >
> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
> >     https://bugs.openjdk.java.net/browse/JDK-8028073
> >
> > On the surface, this is a very simple fix that relocates a few lines of
> > code, relocates and rewrites the comments associated with that code and
> > adds several new comments.
> >
> > Of course, in reality, the issue is much more complicated, but I'm
> > hoping to make it easy for anyone not acquainted with this issue to
> > understand what's going on.
> >
> > Here are the JDK9 webrev URLs:
> >
> > OpenJDK:
> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
> >
> > Oracle internal:
> > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
> >
> > The simple summary:
> >
> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
> >   it is possible for a JVM/TI monitor event handler to accidentally
> >   consume a ParkEvent.unpark() call meant for the Java Monitor layer
> > - the original code fix was made on 2005.07.04 using this bug ID:
> >   https://bugs.openjdk.java.net/browse/JDK-5030359
> > - it's the right fix, but it's in the wrong place
> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
> >   event handler is called because it is that event handler
> >   that can cause the hang
> >
> >
> > Testing
> > -------
> >
> > - a new StressMonitorWait test has been created that reliably
> >   reproduces the hang in JDK[6789]; see the bug's gory details
> >   for the specific versions where the hang has been reproduced
> >   - the test reliably reproduces the hang in 5 seconds on my
> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
> >     the hang reliably on other machines
> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
> >     bits did not reproduce the hang
> > - JPRT test job
> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
> >   and Windows-X64:
> >   - vm.quick
> >   - Kitchensink (bigapps)
> >   - Weblogic+medrec (bigapps)
> >   - runThese (bigapps)
> >
> >
> > The Gory Details Start Here
> > ---------------------------
> >
> > This is the old location of the block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1499    exit (true, Self) ;                    // exit the monitor
> > <snip>
> > 1513    if (node._notified != 0 && _succ == Self) {
> > 1514       node._event->unpark();
> > 1515    }
> >
> >
> > This is the new location of the block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
> > <snip>
> > 1604        if (node._notified != 0 && _succ == Self) {
> > <snip>
> > 1620          node._event->unpark();
> > 1621        }
> >
> >
> > The Risks
> > ---------
> >
> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
> >   is enabled:
> >   - previously it was always executed
> >   - while the old code was not effective for the hang that is being
> >     fixed with this bug, it is possible that the old code prevented
> >     a different bug in the successor protocol from manifesting
> >   - thorough analysis of the successor protocol did not reveal a
> >     case where the old code was needed in the old location
> > - Thorough analysis indicates that the other JVM/TI monitor events
> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
> >   - the successor protocol is complicated and the analysis could
> >     be wrong when certain options are used
> >   - comments were added to each location where a JVM/TI monitor
> >     event handler is called documenting why a fix like this one
> >     is not needed there
> >   - if the analysis is wrong, the new comments show where a new
> >     code change would be needed
> >
> >
> > The Scenario
> > ------------
> >
> > I've created a scenario that reproduces this hang:
> >
> > T1 - enters monitor and calls monitor.wait()
> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
> > T3 - enters and exits the monitor
> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
> >
> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
> >
> > Here are the six events necessary to make this hang happen:
> >
> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
> >
> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
> >
> > // KEY-EVENT-3a: T3 made T1 the successor
> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
> >
> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
> >
> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
> >
> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
> >
> >
> > This bug is intertwined with:
> >
> > - The ObjectMonitor successor protocol
> > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
> >
> > There is a very long successor.notes attachment to JDK-8028073 that
> > attempts to describe the ObjectMonitor successor protocol. It's good
> > for putting pretty much anyone to sleep.
> >
> > Since this hang reproduces back to JDK6, this bug is taking the easily
> > backported solution of moving the original fix to the right location.
> > The following new bug has been filed for possible future work in this
> > area by the Serviceability Team:
> >
> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
> >     https://bugs.openjdk.java.net/browse/JDK-8033399
> >
> >
> > The Symptoms
> > ------------
> >
> > With intermittent hangs like this, it is useful to know what to look
> > for in order to determine if you are running into this issue:
> >
> > - if you aren't using a debugger or a profiler or some other
> >   JVM/TI agent, then this hang is not the same as yours
> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
> >   event handler, then this hang is not the same as yours
> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
> >   JVM/TI RawMonitors, then this hang is not the same as yours
> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
> >   back into Java code, then you might just be insane and this
> >   hang might be similar to yours. However, using a Java callback
> >   in an event handler is an even bigger problem/risk so fix that
> >   first.
> > - if you have one or more threads blocked like this and making no
> >   progress, then this hang might be the same as yours:
> >
> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at java.lang.Object.wait(Native Method)
> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >         at java.lang.Object.wait(Object.java:502)
> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > Key symptoms in thread T1:
> >
> > - had the object locked:
> >
> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > - did an Object.wait():
> >
> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > - is blocked on reentry:
> >
> >   waiting for monitor entry [0xfffffd7fc0231000]
> >
> > Key symptoms in thread T2:
> >
> > - is blocked waiting to lock the object:
> >
> >   waiting for monitor entry [0xfffffd7fc0130000]
> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > Key symptoms in thread T3:
> >
> > - is blocked waiting to lock the object:
> >
> >   waiting for monitor entry [0xfffffd7fc002f000]
> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
>
>
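
The shared-ParkEvent problem described in the quoted summary can be
illustrated with a toy model. This is a single-threaded sketch that
assumes a ParkEvent behaves like a one-shot permit; PermitModel,
try_park() and t1_wakes_up() are hypothetical names, and a false return
from try_park() stands for "this thread would block". It is not the
HotSpot implementation, only a replay of the ordering in the summary.

```cpp
#include <cassert>

// Hypothetical single-threaded model of a ParkEvent as a one-shot permit.
struct PermitModel {
  bool permit = false;

  // Make the permit available (repeated calls do not accumulate).
  void unpark() { permit = true; }

  // Consume the permit if present; false means the caller would block.
  bool try_park() {
    bool had_permit = permit;
    permit = false;
    return had_permit;
  }
};

// Replays the ordering from the summary. Returns true if T1's final park
// would return, false if T1 would block with nobody left to wake it.
inline bool t1_wakes_up(bool reissue_unpark_after_handler) {
  PermitModel event;            // shared by the Java monitor and raw monitor layers
  bool notified = true;         // T1 was notify()'d while waiting
  bool t1_is_successor = true;  // an exiting thread made T1 the successor...
  event.unpark();               // ...and unparked it

  // The MONITOR_WAITED handler does a timed raw monitor wait on the same
  // event, which consumes the permit meant for the Java monitor layer.
  event.try_park();

  // The relocated fix: after the handler has run, restore the permit.
  if (reissue_unpark_after_handler && notified && t1_is_successor) {
    event.unpark();
  }

  // T1 now parks while trying to reenter the monitor.
  return event.try_park();
}
```

In this model t1_wakes_up(false) reports that T1 would block, and
t1_wakes_up(true) reports that T1 gets its permit back, which is why the
unpark has to come after the event handler rather than before it.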


From daniel.daugherty at oracle.com  Fri Feb 28 13:55:08 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 28 Feb 2014 14:55:08 -0700
Subject: RFR round 0 JDK7u backport of ObjectMonitor-JVM/TI hang fix (8028073)
In-Reply-To: <53081C1F.2070008@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
Message-ID: <531105BC.4050509@oracle.com>

Resend with the corrected subject line... sigh...

Greetings,

This is a code review request for the JDK7u-hs-dev backport of the
following ObjectMonitor-JVM/TI hang fix:

     8028073 race condition in ObjectMonitor implementation causing deadlocks
     https://bugs.openjdk.java.net/browse/JDK-8028073

Here is the JDK7u-hs-dev webrev URL:

     http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk7u-hs-dev/

This is a straightforward backport of the JDK8u-hs-dev version of the
fix modulo line numbers and context diff anchors. The diff of the two
'hg diff --nodates' files is below.

Same testing as for the JDK9 and JDK8u versions with the same results.

I need one sanity check review from someone.

Dan

$ diff 8028073-diffs-jdk[78]u-hs.txt.00
1c1
< diff -r bdb3798edd11 src/share/vm/prims/jvm.cpp
---
> diff -r 231051bff068 src/share/vm/prims/jvm.cpp
11c11
< @@ -527,6 +527,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
---
> @@ -518,6 +518,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
22c22
<    ObjectSynchronizer::wait(obj, ms, THREAD);
---
>    ObjectSynchronizer::wait(obj, ms, CHECK);
24c24
< diff -r bdb3798edd11 src/share/vm/runtime/objectMonitor.cpp
---
> diff -r 231051bff068 src/share/vm/runtime/objectMonitor.cpp
34c34
< @@ -385,6 +385,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
---
> @@ -382,6 +382,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
47c47
< @@ -442,6 +448,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
---
> @@ -439,6 +445,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
60c60
< @@ -1459,6 +1471,14 @@ void ObjectMonitor::wait(jlong millis, b
---
> @@ -1456,6 +1468,14 @@ void ObjectMonitor::wait(jlong millis, b
75c75
< @@ -1501,21 +1521,6 @@ void ObjectMonitor::wait(jlong millis, b
---
> @@ -1498,21 +1518,6 @@ void ObjectMonitor::wait(jlong millis, b
97c97
< @@ -1598,6 +1603,33 @@ void ObjectMonitor::wait(jlong millis, b
---
> @@ -1595,6 +1600,33 @@ void ObjectMonitor::wait(jlong millis, b


On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> This is a code review request for the JDK8u-hs-dev backport of the
> following ObjectMonitor-JVM/TI hang fix:
>
>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>     https://bugs.openjdk.java.net/browse/JDK-8028073
>
> Here is the JDK8u-hs-dev webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>
> This is _almost_ a straightforward backport of the JDK9 fix. The only
> difference to the fix was discussed at the end of the JDK9 review and
> was determined to only be needed in versions of HotSpot without the
> fix for 8028280:
>
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html
>
>
> 8028280 has not yet been backported to JDK8u-hs-dev.
>
> The easiest way to review the backport is to download the two patch
> files from the webrevs and compare them with something like:
>
>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>
> The same testing has been performed on the JDK8u-hs-dev version as
> with the JDK9-hs-runtime version.
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
> > Greetings,
> >
> > I have a fix ready for the following bug:
> >
> >     8028073 race condition in ObjectMonitor implementation causing deadlocks
> >     https://bugs.openjdk.java.net/browse/JDK-8028073
> >
> > On the surface, this is a very simple fix that relocates a few lines of
> > code, relocates and rewrites the comments associated with that code and
> > adds several new comments.
> >
> > Of course, in reality, the issue is much more complicated, but I'm
> > hoping to make it easy for anyone not acquainted with this issue to
> > understand what's going on.
> >
> > Here are the JDK9 webrev URLs:
> >
> > OpenJDK:
> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
> >
> > Oracle internal:
> > http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
> >
> > The simple summary:
> >
> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
> >   it is possible for a JVM/TI monitor event handler to accidentally
> >   consume a ParkEvent.unpark() call meant for the Java Monitor layer
> > - the original code fix was made on 2005.07.04 using this bug ID:
> >   https://bugs.openjdk.java.net/browse/JDK-5030359
> > - it's the right fix, but it's in the wrong place
> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
> >   event handler is called because it is that event handler
> >   that can cause the hang
> >
> >
> > Testing
> > -------
> >
> > - a new StressMonitorWait test has been created that reliably
> >   reproduces the hang in JDK[6789]; see the bug's gory details
> >   for the specific versions where the hang has been reproduced
> >   - the test reliably reproduces the hang in 5 seconds on my
> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
> >     the hang reliably on other machines
> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
> >     bits did not reproduce the hang
> > - JPRT test job
> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
> >   and Windows-X64:
> >   - vm.quick
> >   - Kitchensink (bigapps)
> >   - Weblogic+medrec (bigapps)
> >   - runThese (bigapps)
> >
> >
> > The Gory Details Start Here
> > ---------------------------
> >
> > This is the old location of the block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1499    exit (true, Self) ;                    // exit the monitor
> > <snip>
> > 1513    if (node._notified != 0 && _succ == Self) {
> > 1514       node._event->unpark();
> > 1515    }
> >
> >
> > This is the new location of the block of code that's being moved:
> >
> > src/share/vm/runtime/objectMonitor.cpp:
> >
> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
> > <snip>
> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
> > <snip>
> > 1604        if (node._notified != 0 && _succ == Self) {
> > <snip>
> > 1620          node._event->unpark();
> > 1621        }
> >
> >
> > The Risks
> > ---------
> >
> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
> >   is enabled:
> >   - previously it was always executed
> >   - while the old code was not effective for the hang that is being
> >     fixed with this bug, it is possible that the old code prevented
> >     a different bug in the successor protocol from manifesting
> >   - thorough analysis of the successor protocol did not reveal a
> >     case where the old code was needed in the old location
> > - Thorough analysis indicates that the other JVM/TI monitor events
> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
> >   - the successor protocol is complicated and the analysis could
> >     be wrong when certain options are used
> >   - comments were added to each location where a JVM/TI monitor
> >     event handler is called documenting why a fix like this one
> >     is not needed there
> >   - if the analysis is wrong, the new comments show where a new
> >     code change would be needed
> >
> >
> > The Scenario
> > ------------
> >
> > I've created a scenario that reproduces this hang:
> >
> > T1 - enters monitor and calls monitor.wait()
> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
> > T3 - enters and exits the monitor
> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
> >
> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
> >
> > Here are the six events necessary to make this hang happen:
> >
> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
> >
> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
> >
> > // KEY-EVENT-3a: T3 made T1 the successor
> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
> >
> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
> >
> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
> >
> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
> >
> >
> > This bug is intertwined with:
> >
> > - The ObjectMonitor successor protocol
> > - the sharing of a ParkEvent between Java Monitors and JVM/TI RawMonitors
> >
> > There is a very long successor.notes attachment to JDK-8028073 that
> > attempts to describe the ObjectMonitor successor protocol. It's good
> > for putting pretty much anyone to sleep.
> >
> > Since this hang reproduces back to JDK6, this bug is taking the easily
> > backported solution of moving the original fix to the right location.
> > The following new bug has been filed for possible future work in this
> > area by the Serviceability Team:
> >
> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
> >     https://bugs.openjdk.java.net/browse/JDK-8033399
> >
> >
> > The Symptoms
> > ------------
> >
> > With intermittent hangs like this, it is useful to know what to look
> > for in order to determine if you are running into this issue:
> >
> > - if you aren't using a debugger or a profiler or some other
> >   JVM/TI agent, then this hang is not the same as yours
> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
> >   event handler, then this hang is not the same as yours
> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
> >   JVM/TI RawMonitors, then this hang is not the same as yours
> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
> >   back into Java code, then you might just be insane and this
> >   hang might be similar to yours. However, using a Java callback
> >   in an event handler is an even bigger problem/risk so fix that
> >   first.
> > - if you have one or more threads blocked like this and making no
> >   progress, then this hang might be the same as yours:
> >
> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at java.lang.Object.wait(Native Method)
> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >         at java.lang.Object.wait(Object.java:502)
> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >    JavaThread state: _thread_blocked
> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
> >    JavaThread state: _thread_blocked
> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > Key symptoms in thread T1:
> >
> > - had the object locked:
> >
> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > - did an Object.wait():
> >
> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > - is blocked on reentry:
> >
> >   waiting for monitor entry [0xfffffd7fc0231000]
> >
> > Key symptoms in thread T2:
> >
> > - is blocked waiting to lock the object:
> >
> >   waiting for monitor entry [0xfffffd7fc0130000]
> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
> >
> > Key symptoms in thread T3:
> >
> > - is blocked waiting to lock the object:
> >
> >   waiting for monitor entry [0xfffffd7fc002f000]
> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>
>
>
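
KEY-EVENT-5 in the quoted scenario depends on the exit path: a thread
that leaves the monitor while a successor is already recorded skips the
wakeup step. Here is a rough single-threaded model of just that decision.
ExitModel, exit_monitor() and the integer thread ids are hypothetical,
and everything else the real exit code does is ignored, so treat it as a
reading aid rather than a description of objectMonitor.cpp.

```cpp
#include <cassert>

// Hypothetical model of which thread, if any, an exiting owner wakes up.
struct ExitModel {
  int succ = 0;            // id of the recorded successor, 0 if none
  int next_waiter = 0;     // id of a queued waiter, 0 if none
  int unparks_issued = 0;  // how many wakeups this model has sent

  // Returns the id of the thread that was unparked, or 0 if none was.
  int exit_monitor() {
    if (succ != 0) {
      return 0;            // quick exit: a successor is recorded, wake nobody
    }
    if (next_waiter == 0) {
      return 0;            // nobody is waiting, nothing to do
    }
    succ = next_waiter;    // slow path: appoint a successor and unpark it
    ++unparks_issued;
    return succ;
  }
};
```

In terms of the scenario, T3's exit takes the slow path and unparks T1,
while T4's exit finds T1 still recorded as the successor and unparks
nobody. If T1's permit was consumed elsewhere in between, nothing in this
path replaces it.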


From serguei.spitsyn at oracle.com  Fri Feb 28 14:09:45 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 28 Feb 2014 14:09:45 -0800
Subject: RFR round 0 JDK7u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <531105BC.4050509@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<531105BC.4050509@oracle.com>
Message-ID: <53110929.6020507@oracle.com>

On 2/28/14 1:55 PM, Daniel D. Daugherty wrote:
> Resend with the corrected subject line... sigh...
>
> Greetings,
>
> This is a code review request for the JDK7u-hs-dev backport of the
> following ObjectMonitor-JVM/TI hang fix:
>
>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>     https://bugs.openjdk.java.net/browse/JDK-8028073
>
> Here is the JDK7u-hs-dev webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk7u-hs-dev/
>
> This is a straightforward backport of the JDK8u-hs-dev version of the
> fix modulo line numbers and context diff anchors. The diff of the two
> 'hg diff --nodates' files is below.
>
> Same testing as for the JDK9 and JDK8u versions with the same results.
>
> I need one sanity check review from someone.

Looks good.

Thanks,
Serguei


>
> Dan
>
> $ diff 8028073-diffs-jdk[78]u-hs.txt.00
> 1c1
> < diff -r bdb3798edd11 src/share/vm/prims/jvm.cpp
> ---
>> diff -r 231051bff068 src/share/vm/prims/jvm.cpp
> 11c11
> < @@ -527,6 +527,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
> ---
>> @@ -518,6 +518,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
> 22c22
> <    ObjectSynchronizer::wait(obj, ms, THREAD);
> ---
>>    ObjectSynchronizer::wait(obj, ms, CHECK);
> 24c24
> < diff -r bdb3798edd11 src/share/vm/runtime/objectMonitor.cpp
> ---
>> diff -r 231051bff068 src/share/vm/runtime/objectMonitor.cpp
> 34c34
> < @@ -385,6 +385,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
> ---
>> @@ -382,6 +382,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
> 47c47
> < @@ -442,6 +448,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
> ---
>> @@ -439,6 +445,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
> 60c60
> < @@ -1459,6 +1471,14 @@ void ObjectMonitor::wait(jlong millis, b
> ---
>> @@ -1456,6 +1468,14 @@ void ObjectMonitor::wait(jlong millis, b
> 75c75
> < @@ -1501,21 +1521,6 @@ void ObjectMonitor::wait(jlong millis, b
> ---
>> @@ -1498,21 +1518,6 @@ void ObjectMonitor::wait(jlong millis, b
> 97c97
> < @@ -1598,6 +1603,33 @@ void ObjectMonitor::wait(jlong millis, b
> ---
>> @@ -1595,6 +1600,33 @@ void ObjectMonitor::wait(jlong millis, b
>
>
> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> This is a code review request for the JDK8u-hs-dev backport of the
>> following ObjectMonitor-JVM/TI hang fix:
>>
>>     8028073 race condition in ObjectMonitor implementation causing deadlocks
>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>
>> Here is the JDK8u-hs-dev webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>>
>> This is _almost_ a straightforward backport of the JDK9 fix. The only
>> difference to the fix was discussed at the end of the JDK9 review and
>> was determined to only be needed in versions of HotSpot without the
>> fix for 8028280:
>>
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 
>>
>>
>>
>> 8028280 has not yet been backported to JDK8u-hs-dev.
>>
>> The easiest way to review the backport is to download the two patch
>> files from the webrevs and compare them with something like:
>>
>>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>>
>> The same testing has been performed on the JDK8u-hs-dev version as
>> with the JDK9-hs-runtime version.
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
>> > Greetings,
>> >
>> > I have a fix ready for the following bug:
>> >
>> >     8028073 race condition in ObjectMonitor implementation causing
>> deadlocks
>> >     https://bugs.openjdk.java.net/browse/JDK-8028073
>> >
>> > On the surface, this is a very simple fix that relocates a few lines of
>> > code, relocates and rewrites the comments associated with that code and
>> > adds several new comments.
>> >
>> > Of course, in reality, the issue is much more complicated, but I'm
>> > hoping to make it easy for anyone not acquainted with this issue to
>> > understand what's going on.
>> >
>> > Here are the JDK9 webrev URLs:
>> >
>> > OpenJDK:
>> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>> >
>> > Oracle internal:
>> >
>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/
>> >
>> > The simple summary:
>> >
>> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>> >   it is possible for a JVM/TI monitor event handler to accidentally
>> >   consume a ParkEvent.unpark() call meant for Java Monitor layer
>> > - the original code fix was made on 2005.07.04 using this bug ID:
>> >   https://bugs.openjdk.java.net/browse/JDK-5030359
>> > - it's the right fix, but it's in the wrong place
>> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>> >   event handler is called because it is that event handler
>> >   that can cause the hang
>> >
>> >
>> > Testing
>> > -------
>> >
>> > - a new StressMonitorWait test has been created that reliably
>> >   reproduces the hang in JDK[6789]; see the bug's gory details
>> >   for the specific versions where the hang has been reproduced
>> >   - the test reliably reproduces the hang in 5 seconds on my
>> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>> >     the hang reliably on other machines
>> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>> >     bits did not reproduce the hang
>> > - JPRT test job
>> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>> >   and Windows-X64:
>> >   - vm.quick
>> >   - Kitchensink (bigapps)
>> >   - Weblogic+medrec (bigapps)
>> >   - runThese (bigapps)
>> >
>> >
>> > The Gory Details Start Here
>> > ---------------------------
>> >
>> > This is the old location of the block of code that's being moved:
>> >
>> > src/share/vm/runtime/objectMonitor.cpp:
>> >
>> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> > <snip>
>> > 1499    exit (true, Self) ;                    // exit the monitor
>> > <snip>
>> > 1513    if (node._notified != 0 && _succ == Self) {
>> > 1514       node._event->unpark();
>> > 1515    }
>> >
>> >
>> > This is the new location of the block of code that's being moved:
>> >
>> > src/share/vm/runtime/objectMonitor.cpp:
>> >
>> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>> > <snip>
>> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
>> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>> > <snip>
>> > 1604        if (node._notified != 0 && _succ == Self) {
>> > <snip>
>> > 1620          node._event->unpark();
>> > 1621        }
>> >
>> >
>> > The Risks
>> > ---------
>> >
>> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED event
>> >   is enabled:
>> >   - previously it was always executed
>> >   - while the old code was not effective for the hang that is being
>> >     fixed with this bug, it is possible that the old code prevented
>> >     a different bug in the successor protocol from manifesting
>> >   - thorough analysis of the successor protocol did not reveal a
>> >     case where the old code was needed in the old location
>> > - Thorough analysis indicates that the other JVM/TI monitor events
>> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>> >   - the successor protocol is complicated and the analysis could
>> >     be wrong when certain options are used
>> >   - comments were added to each location where a JVM/TI monitor
>> >     event handler is called documenting why a fix like this one
>> >     is not needed there
>> >   - if the analysis is wrong, the new comments show where a new
>> >     code change would be needed
>> >
>> >
>> > The Scenario
>> > ------------
>> >
>> > I've created a scenario that reproduces this hang:
>> >
>> > T1 - enters monitor and calls monitor.wait()
>> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
>> > T3 - enters and exits the monitor
>> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
>> >
>> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>> >
>> > Here are the six events necessary to make this hang happen:
>> >
>> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>> >
>> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>> >
>> > // KEY-EVENT-3a: T3 made T1 the successor
>> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>> >
>> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>> >
>> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>> >
>> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>> >
>> >
>> > This bug is intertwined with:
>> >
>> > - The ObjectMonitor successor protocol
>> > - the sharing of a ParkEvent between Java Monitors and JVM/TI
>> RawMonitors
>> >
>> > There is a very long successor.notes attachment to JDK-8028073 that
>> > attempts to describe the ObjectMonitor successor protocol. It's good
>> > for putting pretty much anyone to sleep.
>> >
>> > Since this hang reproduces back to JDK6, this bug is taking the easily
>> > backported solution of moving the original fix to the right location.
>> > The following new bug has been filed for possible future work in this
>> > area by the Serviceability Team:
>> >
>> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>> >     https://bugs.openjdk.java.net/browse/JDK-8033399
>> >
>> >
>> > The Symptoms
>> > ------------
>> >
>> > With intermittent hangs like this, it is useful to know what to look
>> > for in order to determine if you are running into this issue:
>> >
>> > - if you aren't using a debugger or a profiler or some other
>> >   JVM/TI agent, then this hang is not the same as yours
>> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>> >   event handler, then this hang is not the same as yours
>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>> >   JVM/TI RawMonitors, then this hang is not the same as yours
>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>> >   back into Java code, then you might just be insane and this
>> >   hang might be similar to yours. However, using a Java callback
>> >   in an event handler is an even bigger problem/risk so fix that
>> >   first.
>> > - if you have one or more threads blocked like this and making no
>> >   progress, then this hang might be the same as yours:
>> >
>> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at java.lang.Object.wait(Native Method)
>> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >         at java.lang.Object.wait(Object.java:502)
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>> >    JavaThread state: _thread_blocked
>> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>> >    JavaThread state: _thread_blocked
>> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > Key symptoms in thread T1:
>> >
>> > - had the object locked:
>> >
>> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > - did an Object.wait():
>> >
>> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > - is blocked on reentry:
>> >
>> >   waiting for monitor entry [0xfffffd7fc0231000]
>> >
>> > Key symptoms in thread T2:
>> >
>> > - is blocked waiting to lock the object:
>> >
>> >   waiting for monitor entry [0xfffffd7fc0130000]
>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>> >
>> > Key symptoms in thread T3:
>> >
>> > - is blocked waiting to lock the object:
>> >
>> >   waiting for monitor entry [0xfffffd7fc002f000]
>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>
>>
>>
>
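
The six KEY-EVENT steps quoted above boil down to a lost wakeup on a
shared single-permit event. The following is a minimal, single-threaded
sketch of that idea, not the HotSpot code: ParkEventModel, try_park()
and t1_would_hang() are hypothetical names invented for illustration,
and the real ParkEvent blocks instead of returning a flag. The sketch
only assumes what the quoted summary states: the Java monitor layer and
the JVM/TI RawMonitor layer park on the same event, and the fix
re-issues the unpark() after the MONITOR_WAITED handler has run.

```cpp
#include <cassert>

// Hypothetical single-permit stand-in for a ParkEvent (illustration only).
struct ParkEventModel {
  bool permit = false;

  void unpark() { permit = true; }

  // Consumes the permit if one is present. A real park() would block when
  // no permit is available; here that case is reported as 'false'.
  bool try_park() {
    bool had_permit = permit;
    permit = false;
    return had_permit;
  }
};

// Returns true when T1 would park with no wakeup pending (the hang).
bool t1_would_hang(bool reissue_unpark_after_handler) {
  ParkEventModel event;         // shared by Java monitor and raw monitor code
  bool notified = true;         // T2 has called notify() on T1
  bool t1_is_successor = true;  // KEY-EVENT-1b: T3 made T1 the successor

  event.unpark();               // KEY-EVENT-2a: T3 unparks T1

  // KEY-EVENT-2b: the event handler's timed raw monitor wait parks on the
  // same event and eats the permit that was meant for monitor reentry.
  event.try_park();
  assert(!event.permit);

  // Sketch of the relocated check: restore the wakeup after the handler.
  if (reissue_unpark_after_handler && notified && t1_is_successor) {
    event.unpark();
  }

  // KEY-EVENT-6: T1 parks for reentry; T4 took the quick exit path, so no
  // further unpark() will arrive.
  return !event.try_park();
}
```

Calling t1_would_hang(false) models the old placement, where the check
ran before the handler and nothing restores the permit afterwards;
t1_would_hang(true) models the relocated check.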


From daniel.daugherty at oracle.com  Fri Feb 28 14:15:55 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 28 Feb 2014 15:15:55 -0700
Subject: RFR round 0 JDK7u backport of ObjectMonitor-JVM/TI hang fix
	(8028073)
In-Reply-To: <53110929.6020507@oracle.com>
References: <52ED3F28.5070806@oracle.com> <53081C1F.2070008@oracle.com>
	<531105BC.4050509@oracle.com> <53110929.6020507@oracle.com>
Message-ID: <53110A9B.1030508@oracle.com>

 > Looks good.

Thanks! Last backport (by me) for this one...

Dan


On 2/28/14 3:09 PM, serguei.spitsyn at oracle.com wrote:
> On 2/28/14 1:55 PM, Daniel D. Daugherty wrote:
>> Resend with the corrected subject line... sigh...
>>
>> Greetings,
>>
>> This is a code review request for the JDK7u-hs-dev backport of the
>> following ObjectMonitor-JVM/TI hang fix:
>>
>>     8028073 race condition in ObjectMonitor implementation causing 
>> deadlocks
>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>
>> Here is the JDK7u-hs-dev webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk7u-hs-dev/
>>
>> This is a straightforward backport of the JDK8u-hs-dev version of the
>> fix modulo line numbers and context diff anchors. The diff of the two
>> 'hg diff --nodates' files is below.
>>
>> Same testing as for the JDK9 and JDK8u versions with the same results.
>>
>> I need one sanity check review from someone.
>
> Looks good.
>
> Thanks,
> Serguei
>
>
>>
>> Dan
>>
>> $ diff 8028073-diffs-jdk[78]u-hs.txt.00
>> 1c1
>> < diff -r bdb3798edd11 src/share/vm/prims/jvm.cpp
>> ---
>>> diff -r 231051bff068 src/share/vm/prims/jvm.cpp
>> 11c11
>> < @@ -527,6 +527,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
>> ---
>>> @@ -518,6 +518,12 @@ JVM_ENTRY(void, JVM_MonitorWait(JNIEnv*
>> 22c22
>> <    ObjectSynchronizer::wait(obj, ms, THREAD);
>> ---
>>>    ObjectSynchronizer::wait(obj, ms, CHECK);
>> 24c24
>> < diff -r bdb3798edd11 src/share/vm/runtime/objectMonitor.cpp
>> ---
>>> diff -r 231051bff068 src/share/vm/runtime/objectMonitor.cpp
>> 34c34
>> < @@ -385,6 +385,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
>> ---
>>> @@ -382,6 +382,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
>> 47c47
>> < @@ -442,6 +448,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
>> ---
>>> @@ -439,6 +445,12 @@ void ATTR ObjectMonitor::enter(TRAPS) {
>> 60c60
>> < @@ -1459,6 +1471,14 @@ void ObjectMonitor::wait(jlong millis, b
>> ---
>>> @@ -1456,6 +1468,14 @@ void ObjectMonitor::wait(jlong millis, b
>> 75c75
>> < @@ -1501,21 +1521,6 @@ void ObjectMonitor::wait(jlong millis, b
>> ---
>>> @@ -1498,21 +1518,6 @@ void ObjectMonitor::wait(jlong millis, b
>> 97c97
>> < @@ -1598,6 +1603,33 @@ void ObjectMonitor::wait(jlong millis, b
>> ---
>>> @@ -1595,6 +1600,33 @@ void ObjectMonitor::wait(jlong millis, b
>>
>>
>> On 2/21/14 8:40 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> This is a code review request for the JDK8u-hs-dev backport of the
>>> following ObjectMonitor-JVM/TI hang fix:
>>>
>>>     8028073 race condition in ObjectMonitor implementation causing
>>> deadlocks
>>>     https://bugs.openjdk.java.net/browse/JDK-8028073
>>>
>>> Here is the JDK8u-hs-dev webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk8u-hs-dev/
>>>
>>> This is _almost_ a straightforward backport of the JDK9 fix. The only
>>> difference to the fix was discussed at the end of the JDK9 review and
>>> was determined to only be needed in versions of HotSpot without the
>>> fix for 8028280:
>>>
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/010745.html 
>>>
>>>
>>>
>>> 8028280 has not yet been backported to JDK8u-hs-dev.
>>>
>>> The easiest way to review the backport is to download the two patch
>>> files from the webrevs and compare them with something like:
>>>
>>>     jfilemerge -r -w 8028073_exp.patch 8028073_exp_for_jdk8u_hs.patch
>>>
>>> The same testing has been performed on the JDK8u-hs-dev version as
>>> with the JDK9-hs-runtime version.
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 2/1/14 11:38 AM, Daniel D. Daugherty wrote:
>>> > Greetings,
>>> >
>>> > I have a fix ready for the following bug:
>>> >
>>> >     8028073 race condition in ObjectMonitor implementation causing
>>> deadlocks
>>> >     https://bugs.openjdk.java.net/browse/JDK-8028073
>>> >
>>> > On the surface, this is a very simple fix that relocates a few lines of
>>> > code, relocates and rewrites the comments associated with that code and
>>> > adds several new comments.
>>> >
>>> > Of course, in reality, the issue is much more complicated, but I'm
>>> > hoping to make it easy for anyone not acquainted with this issue to
>>> > understand what's going on.
>>> >
>>> > Here are the JDK9 webrev URLs:
>>> >
>>> > OpenJDK:
>>> > http://cr.openjdk.java.net/~dcubed/8028073-webrev/0-jdk9-hs-runtime/
>>> >
>>> > Oracle internal:
>>> >
>>> http://javaweb.us.oracle.com/~ddaugher/8028073-webrev/0-jdk9-hs-runtime/ 
>>>
>>> >
>>> > The simple summary:
>>> >
>>> > - since Java Monitors and JVM/TI RawMonitors share a ParkEvent,
>>> >   it is possible for a JVM/TI monitor event handler to accidentally
>>> >   consume a ParkEvent.unpark() call meant for Java Monitor layer
>>> > - the original code fix was made on 2005.07.04 using this bug ID:
>>> >   https://bugs.openjdk.java.net/browse/JDK-5030359
>>> > - it's the right fix, but it's in the wrong place
>>> > - the fix needs to be after the JVMTI_EVENT_MONITOR_WAITED
>>> >   event handler is called because it is that event handler
>>> >   that can cause the hang
>>> >
>>> >
>>> > Testing
>>> > -------
>>> >
>>> > - a new StressMonitorWait test has been created that reliably
>>> >   reproduces the hang in JDK[6789]; see the bug's gory details
>>> >   for the specific versions where the hang has been reproduced
>>> >   - the test reliably reproduces the hang in 5 seconds on my
>>> >     T7600 running Solaris 10u11 X86; 1 minute runs reproduce
>>> >     the hang reliably on other machines
>>> >   - 12 hour stress run of the new test on Linux-X64, MacOS X-X64,
>>> >     Solaris-SPARCV9, Solaris-X64, and Win7-X86 with the JPRT
>>> >     bits did not reproduce the hang
>>> > - JPRT test job
>>> > - VM/SQE Adhoc test job on Server VM, fastdebug bits on Linux-X86,
>>> >   Linux-X64, MacOS X-X64, Solaris-SPARCV9, Solaris-X64, Windows-X86,
>>> >   and Windows-X64:
>>> >   - vm.quick
>>> >   - Kitchensink (bigapps)
>>> >   - Weblogic+medrec (bigapps)
>>> >   - runThese (bigapps)
>>> >
>>> >
>>> > The Gory Details Start Here
>>> > ---------------------------
>>> >
>>> > This is the old location of the block of code that's being moved:
>>> >
>>> > src/share/vm/runtime/objectMonitor.cpp:
>>> >
>>> > 1440 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> > <snip>
>>> > 1499    exit (true, Self) ;                    // exit the monitor
>>> > <snip>
>>> > 1513    if (node._notified != 0 && _succ == Self) {
>>> > 1514       node._event->unpark();
>>> > 1515    }
>>> >
>>> >
>>> > This is the new location of the block of code that's being moved:
>>> >
>>> > src/share/vm/runtime/objectMonitor.cpp:
>>> >
>>> > 1452 void ObjectMonitor::wait(jlong millis, bool interruptible, TRAPS) {
>>> > <snip>
>>> > 1601      if (JvmtiExport::should_post_monitor_waited()) {
>>> > 1602        JvmtiExport::post_monitor_waited(jt, this, ret == OS_TIMEOUT);
>>> > <snip>
>>> > 1604        if (node._notified != 0 && _succ == Self) {
>>> > <snip>
>>> > 1620          node._event->unpark();
>>> > 1621        }
>>> >
>>> >
>>> > The Risks
>>> > ---------
>>> >
>>> > - The code now executes only when the JVMTI_EVENT_MONITOR_WAITED 
>>> event
>>> >   is enabled:
>>> >   - previously it was always executed
>>> >   - while the old code was not effective for the hang that is being
>>> >     fixed with this bug, it is possible that the old code prevented
>>> >     a different bug in the successor protocol from manifesting
>>> >   - thorough analysis of the successor protocol did not reveal a
>>> >     case where the old code was needed in the old location
>>> > - Thorough analysis indicates that the other JVM/TI monitor events
>>> >   do not need a fix like the one for JVMTI_EVENT_MONITOR_WAITED:
>>> >   - the successor protocol is complicated and the analysis could
>>> >     be wrong when certain options are used
>>> >   - comments were added to each location where a JVM/TI monitor
>>> >     event handler is called documenting why a fix like this one
>>> >     is not needed there
>>> >   - if the analysis is wrong, the new comments show where a new
>>> >     code change would be needed
>>> >
>>> >
>>> > The Scenario
>>> > ------------
>>> >
>>> > I've created a scenario that reproduces this hang:
>>> >
>>> > T1 - enters monitor and calls monitor.wait()
>>> > T2 - enters the monitor, calls monitor.notify() and exits the monitor
>>> > T3 - enters and exits the monitor
>>> > T4 - enters the monitor, delays for 5 seconds, exits the monitor
>>> >
>>> > A JVM/TI agent that enables JVMTI_EVENT_MONITOR_WAITED and has a
>>> > handler that: enters a raw monitor, waits for 1ms, exits a raw monitor.
>>> >
>>> > Here are the six events necessary to make this hang happen:
>>> >
>>> > // KEY-EVENT-1a: After being unparked(), T1 has cleared the _succ field, but
>>> > // KEY-EVENT-1b: T3 is exiting the monitor and makes T1 the successor again.
>>> >
>>> > // KEY-EVENT-2a: The unpark() done by T3 when it made T1 the successor
>>> > // KEY-EVENT-2b: is consumed by the JVM/TI event handler.
>>> >
>>> > // KEY-EVENT-3a: T3 made T1 the successor
>>> > // KEY-EVENT-3b: but before T1 could reenter the monitor T4 grabbed it.
>>> >
>>> > // KEY-EVENT-4a: T1's TrySpin() call sees T4 as NotRunnable so
>>> > // KEY-EVENT-4b: T1 bails from TrySpin without touching _succ.
>>> >
>>> > // KEY-EVENT-5a: T4 sees that T1 is still the successor so
>>> > // KEY-EVENT-5b: T4 takes the quick exit path (no ExitEpilog)
>>> >
>>> > // KEY-EVENT-6a: T1 is about to park and it is the successor, but
>>> > // KEY-EVENT-6b: T3's unpark has been eaten by the JVM/TI event handler
>>> > // KEY-EVENT-6c: and T4 took the quick exit path. T1 is about to be stuck.
>>> >
>>> >
>>> > This bug is intertwined with:
>>> >
>>> > - The ObjectMonitor successor protocol
>>> > - the sharing of a ParkEvent between Java Monitors and JVM/TI
>>> RawMonitors
>>> >
>>> > There is a very long successor.notes attachment to JDK-8028073 that
>>> > attempts to describe the ObjectMonitor successor protocol. It's good
>>> > for putting pretty much anyone to sleep.
>>> >
>>> > Since this hang reproduces back to JDK6, this bug is taking the 
>>> easily
>>> > backported solution of moving the original fix to the right location.
>>> > The following new bug has been filed for possible future work in this
>>> > area by the Serviceability Team:
>>> >
>>> >     8033399 add a separate ParkEvent for JVM/TI RawMonitor use
>>> >     https://bugs.openjdk.java.net/browse/JDK-8033399
>>> >
>>> >
>>> > The Symptoms
>>> > ------------
>>> >
>>> > With intermittent hangs like this, it is useful to know what to look
>>> > for in order to determine if you are running into this issue:
>>> >
>>> > - if you aren't using a debugger or a profiler or some other
>>> >   JVM/TI agent, then this hang is not the same as yours
>>> > - if your JVM/TI agent isn't using a JVMTI_EVENT_MONITOR_WAITED
>>> >   event handler, then this hang is not the same as yours
>>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is not using
>>> >   JVM/TI RawMonitors, then this hang is not the same as yours
>>> > - if your JVMTI_EVENT_MONITOR_WAITED event handler is calling
>>> >   back into Java code, then you might just be insane and this
>>> >   hang might be similar to yours. However, using a Java callback
>>> >   in an event handler is an even bigger problem/risk so fix that
>>> >   first.
>>> > - if you have one or more threads blocked like this and making no
>>> >   progress, then this hang might be the same as yours:
>>> >
>>> > "T1" #22 prio=5 os_prio=64 tid=0x00000000009ca800 nid=0x2f waiting for monitor entry [0xfffffd7fc0231000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009ca800  [0x2f] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at java.lang.Object.wait(Native Method)
>>> >         - waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >         at java.lang.Object.wait(Object.java:502)
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:103)
>>> >         - locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > "T2" #23 prio=5 os_prio=64 tid=0x00000000009cc000 nid=0x30 waiting for monitor entry [0xfffffd7fc0130000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009cc000  [0x30] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:120)
>>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > "T3" #24 prio=5 os_prio=64 tid=0x00000000009ce000 nid=0x31 waiting for monitor entry [0xfffffd7fc002f000]
>>> >    java.lang.Thread.State: BLOCKED (on object monitor)
>>> >    JavaThread state: _thread_blocked
>>> > Thread: 0x00000000009ce000  [0x31] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> >    JavaThread state: _thread_blocked
>>> >         at SMW_WorkerThread.run(StressMonitorWait.java:139)
>>> >         - waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > Key symptoms in thread T1:
>>> >
>>> > - had the object locked:
>>> >
>>> >   locked <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > - did an Object.wait():
>>> >
>>> >   waiting on <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > - is blocked on reentry:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc0231000]
>>> >
>>> > Key symptoms in thread T2:
>>> >
>>> > - is blocked waiting to lock the object:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc0130000]
>>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>> >
>>> > Key symptoms in thread T3:
>>> >
>>> > - is blocked waiting to lock the object:
>>> >
>>> >   waiting for monitor entry [0xfffffd7fc002f000]
>>> >   waiting to lock <0xfffffd7e6a2b6ff0> (a java.lang.String)
>>>
>>>
>>>
>>
>


From chris.hegarty at oracle.com  Fri Feb 28 14:27:08 2014
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Fri, 28 Feb 2014 22:27:08 +0000
Subject: RFR(S): JDK-8035952 Remove use of JVM_Open,
	JVM_Read and JVM_Close functions from serviceability code
In-Reply-To: <5310FD6E.7050305@oracle.com>
References: <5310FD6E.7050305@oracle.com>
Message-ID: <711DB132-BE41-436F-91BD-3AB2CCB9524A@oracle.com>

Looks good to me Frederic.

JVM_Read is restartable, but that may not be an issue here.

-Chris.

> On 28 Feb 2014, at 21:19, frederic parain <frederic.parain at oracle.com> wrote:
> 
> Greetings,
> 
> Please review this small changeset for CR JDK-8035952
> "Remove use of JVM_Open, JVM_Read and JVM_Close functions from serviceability code"
> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8035952
> 
> Webrev:
> http://cr.openjdk.java.net/~fparain/8035952/webrev.00/
> 
> Tested with jdk_management test suite + manual tests
> to verify consistency of values returned by new code.
> 
> Thank you,
> 
> Fred
> 
> -- 
> Frederic Parain - Oracle
> Grenoble Engineering Center - France
> Phone: +33 4 76 18 81 17
> Email: Frederic.Parain at oracle.com
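
For readers wondering what "restartable" could mean when plain
open/read/close calls replace JVM_Open, JVM_Read and JVM_Close: one
common reading is that the read is retried when a signal interrupts it.
The sketch below shows that pattern on POSIX under that assumption. It
is not taken from the webrev; restartable_read() and read_file() are
hypothetical helpers named here for illustration only.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: retry read(2) while it fails with EINTR.
ssize_t restartable_read(int fd, void* buf, size_t nbytes) {
  assert(buf != nullptr);
  ssize_t n;
  do {
    n = ::read(fd, buf, nbytes);
  } while (n == -1 && errno == EINTR);
  return n;
}

// Hypothetical helper: read a whole file with open/read/close.
// Returns false if the file cannot be opened or a read fails.
bool read_file(const char* path, std::string& out) {
  int fd = ::open(path, O_RDONLY);
  if (fd == -1) {
    return false;
  }
  char buf[4096];
  ssize_t n;
  while ((n = restartable_read(fd, buf, sizeof(buf))) > 0) {
    out.append(buf, static_cast<size_t>(n));
  }
  ::close(fd);
  return n == 0;  // 0 means end of file; -1 means a real read error
}
```

Whether the retry loop matters depends on whether the calling thread
can actually be interrupted by a signal while reading, which is the
open question in the comment above.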

From martinrb at google.com  Fri Feb 28 14:31:29 2014
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 28 Feb 2014 14:31:29 -0800
Subject: RFR(S): JDK-8035952 Remove use of JVM_Open, JVM_Read and
	JVM_Close functions from serviceability code
In-Reply-To: <711DB132-BE41-436F-91BD-3AB2CCB9524A@oracle.com>
References: <5310FD6E.7050305@oracle.com>
	<711DB132-BE41-436F-91BD-3AB2CCB9524A@oracle.com>
Message-ID: <CA+kOe0_Gjb4kgpAf8DmHmCDsVKS0wFArmAR8E3mxURX9F_oCGg@mail.gmail.com>

I'm still bummed out that no one ever explained what JVM_Read's purpose in
life was.

https://bugs.openjdk.java.net/browse/JDK-4904617
JVM_Read needs better documentation


On Fri, Feb 28, 2014 at 2:27 PM, Chris Hegarty <chris.hegarty at oracle.com> wrote:

> Looks good to me Frederic.
>
> JVM_Read is restartable, but that may not be an issue here.
>
> -Chris.
>
> > On 28 Feb 2014, at 21:19, frederic parain <frederic.parain at oracle.com>
> wrote:
> >
> > Greetings,
> >
> > Please review this small changeset for CR JDK-8035952
> > "Remove use of JVM_Open, JVM_Read and JVM_Close functions from
> serviceability code"
> >
> > Bug:
> > https://bugs.openjdk.java.net/browse/JDK-8035952
> >
> > Webrev:
> > http://cr.openjdk.java.net/~fparain/8035952/webrev.00/
> >
> > Tested with jdk_management test suite + manual tests
> > to verify consistency of values returned by new code.
> >
> > Thank you,
> >
> > Fred
> >
> > --
> > Frederic Parain - Oracle
> > Grenoble Engineering Center - France
> > Phone: +33 4 76 18 81 17
> > Email: Frederic.Parain at oracle.com
>

From mandy.chung at oracle.com  Fri Feb 28 15:10:08 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 28 Feb 2014 15:10:08 -0800
Subject: RFR(S): JDK-8035952 Remove use of JVM_Open, JVM_Read and JVM_Close
	functions from serviceability code
In-Reply-To: <5310FD6E.7050305@oracle.com>
References: <5310FD6E.7050305@oracle.com>
Message-ID: <53111750.3070106@oracle.com>

On 2/28/14 1:19 PM, frederic parain wrote:
> Greetings,
>
> Please review this small changeset for CR JDK-8035952
> "Remove use of JVM_Open, JVM_Read and JVM_Close functions from 
> serviceability code"
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8035952
>
> Webrev:
> http://cr.openjdk.java.net/~fparain/8035952/webrev.00/

Looks good.  It should have used open/read/close functions from the 
beginning.

thanks
Mandy