RFR: JDK-6546236 Thread interrupt() of Thread.sleep() can be lost on Solaris due to race with signal handler

frederic parain frederic.parain at oracle.com
Tue Feb 18 07:32:00 PST 2014


Hi Dan,

Thank you for this second review. I applied all
your suggestions to the changeset.

I rolled back the warning/assert code to its initial
state: a single warning conditioned by the expectation
of a monotonic clock.

The new webrev is available here:

http://cr.openjdk.java.net/~fparain/6546236/webrev.02/

Regards,

Fred

On 17/02/2014 19:52, Daniel D. Daugherty wrote:
>  > http://cr.openjdk.java.net/~fparain/6546236/webrev.01/
>
> src/share/vm/runtime/os.hpp
>      No comments.
>
> src/os/bsd/vm/os_bsd.inline.hpp
>      If you move the Bsd::_clock_gettime check here, you should be
>      able to delete Bsd::supports_monotonic_clock from
>      src/os/bsd/vm/os_bsd.hpp.
>
> src/os/bsd/vm/os_bsd.cpp
>      No comments.
>
> src/os/linux/vm/os_linux.hpp
>      No comments.
>
> src/os/linux/vm/os_linux.inline.hpp
>      No comments.
>
> src/os/linux/vm/os_linux.cpp
>      No comments.
>
> src/os/solaris/vm/os_solaris.inline.hpp
>      No comments.
>
> src/os/solaris/vm/os_solaris.cpp
>      No comments.
>
> src/os/windows/vm/os_windows.hpp
>      No comments.
>
> src/os/windows/vm/os_windows.inline.hpp
>      No comments.
>
> src/os/windows/vm/os_windows.cpp
>      No comments.
>
> src/os/posix/vm/os_posix.cpp
>      line 333: assert(true, "unexpected time moving backwards detected
> in os:sleep()");
>      line 375: assert(true, "unexpected time moving backwards detected
> on os::sleep()");
>          These asserts will not fire. The 'true' needs to be 'false'.
>
>      You might want to distinguish between the two different pairs of
>      messages:
>
>   332           warning("time moving backwards detected in
> os:sleep(interruptible)");
>   333           assert(false, "unexpected time moving backwards detected
> in os:sleep(interruptible)");
>
>      and:
>
>   374           warning("time moving backwards detected on
> os::sleep(!interruptible)");
>   375           assert(false, "unexpected time moving backwards detected
> on os::sleep(!interruptible)");
>
> Dan
>
>
> On 2/17/14 9:17 AM, Frederic Parain wrote:
>> Hi Dan,
>>
>> Thank you very much for your review, I knew it was a painful review
>> with all this code moving from one directory to another. I've
>> in-lined my answers below.
>>
>> The new webrev is here:
>> http://cr.openjdk.java.net/~fparain/6546236/webrev.01/
>>
>> On 02/13/2014 07:14 PM, Daniel D. Daugherty wrote:
>>>  > http://cr.openjdk.java.net/~fparain/6546236/webrev.00/
>>>
>>> Sorry for the delay. This was a difficult review to perform due to
>>> the code motion between files. My technique was to use the 'udiff'
>>> view for a side-by-side visual compare. Of course, this means I
>>> could easily have missed critical differences... :-)
>>>
>>> src/os/bsd/vm/os_bsd.cpp
>>>      Deletes os::sleep(), os::naked_sleep(), os::interrupt() and
>>>      os::is_interrupted().
>>>
>>>      bsd << os::sleep >> posix
>>>        << assert(!Bsd::supports_monotonic_clock(), "time moving
>>> backwards");
>>>        >> warning("time moving backwards");
>>>            The above diff happens twice. Without at least an assert here
>>>            we will be less likely to see this issue pop-up on specific
>>>            test machines.
>>
>> I've re-written this warning. Now the warning will be printed only
>> if the JVM was expecting a monotonic clock and its time warps backward.
>> Note that warnings can be disabled on product builds with
>> -XX:-PrintWarnings if the warning is causing trouble on VM output
>> stream.
>> The warning is also associated with an assert for our debug builds,
>> as we discussed privately.
>>
>>>      bsd << os::is_interrupted >> posix
>>>        picked up a long comment from the Solaris version
>>>
>>> src/os/linux/vm/os_linux.cpp
>>>      Deletes os::sleep(), os::naked_sleep(), os::interrupt() and
>>>      os::is_interrupted().
>>>
>>>      linux << os::sleep >> posix
>>>        << assert(!Linux::supports_monotonic_clock(), "time moving
>>> backwards");
>>>        >> warning("time moving backwards");
>>>            The above diff happens twice. Without at least an assert here
>>>            we will be less likely to see this issue pop-up on specific
>>>            test machines.
>>
>> Same comment as above.
>>
>>>      linux << os::is_interrupted >> posix
>>>        picked up a long comment from the Solaris version
>>>
>>> src/os/solaris/vm/os_solaris.cpp
>>>      Deletes os::sleep(), os::naked_sleep(), os::interrupt() and
>>>      os::is_interrupted(). Also deletes static os_sleep() and
>>>      os::Solaris::setup_interruptible_already_blocked()
>>>
>>>      solaris << os::interrupt >> posix
>>>        << // When events are used everywhere for os::sleep, then this
>>> thr_kill
>>>        << // will only be needed if UseVMInterruptibleIO is true.
>>>        << if (!isInterrupted) {
>>>        <<   int status = thr_kill(osthread->thread_id(),
>>> os::Solaris::SIGinterrupt());
>>>        <<   assert_status(status == 0, status, "thr_kill");
>>>            Deleting the above code presumes that you're also going to
>>>            remove the remaining uses of UseVMInterruptibleIO. Should
>>>            that be done with this changeset so things are not out of
>>>            sync.
>>
>> Initially I had the fix for 6546236 and the removal of
>> UseVMInterruptibleIO in the same workspace, but I split it into
>> two different changesets to ease tracking in our bug database.
>> The changeset to remove UseVMInterruptibleIO should be ready very
>> soon after fix for 6546236 has been pushed.
>>
>>>        << // Bump thread interruption counter
>>>        << RuntimeService::record_thread_interrupt_signaled_count();
>>>            This call was only ever made on Solaris. Is this just a
>>>            counter for when a SIGinterrupt has to be thrown because
>>>            of the UseVMInterruptibleIO stuff?
>>>
>>>      Doesn't this deletion make INTERRUPTIBLE_NORESTART_VM_ALWAYS
>>>      unused so it can be deleted from
>>>      src/os/solaris/vm/os_solaris.inline.hpp
>>>
>>>      Were you also planning on removing the last vestiges of
>>>      UseVMInterruptibleIO? Or will that be done with another bug?
>>
>> The changeset to remove UseVMInterruptibleIO includes all those cleanup:
>> clean deprecation of the flag, re-work of all the macros around
>> interruptible calls, cleanup of RuntimeServices methods and the
>> associated performance counters.
>>
>>> src/os/posix/vm/os_posix.cpp
>>>      Adds os::sleep(), os::naked_sleep(), os::interrupt() and
>>>      os::is_interrupted().
>>>
>>>      > // the interrupted flag into the os::Solaris::Event structure,
>>>          Should this be more generic instead of "Solaris" specific?
>>
>> I replaced this comment by the new version provided by David Holmes,
>> so no more Solaric specific content.
>>
>> Thank you,
>>
>> Fred
>>
>>> On 2/10/14 8:58 AM, Frederic Parain wrote:
>>>> Greetings,
>>>>
>>>> Please review this fix for JDK-6546236.
>>>>
>>>> A race condition exists in the implementation of
>>>> Thread.sleep() and Thread.interrupt() on the
>>>> Solaris platform. For historical reasons, the
>>>> Solaris implementation of sleep() and interrupt()
>>>> differs from the other Unices (Linux, BSD).
>>>> It uses poll() calls while Linux and BSD are
>>>> using ParkEvents.
>>>>
>>>> This changeset reconciles the Solaris code with
>>>> Linux and BSD code. And instead of having three
>>>> copies of this code, sleep() and interrupt()
>>>> code are moved to the os_posix.cpp from
>>>> os_[solaris|linux|bsd].cpp files.
>>>>
>>>> Note that with this changeset, the Solaris
>>>> implementation of sleep() and interrupt()
>>>> is loosing the support for UseVMInterruptibleIO
>>>> but this is not an issue because this VM
>>>> flag has been deprecated in JDK8 and will be
>>>> removed in JDK9.
>>>>
>>>> The code has been tested with JPRT, vm.quick.testlist
>>>> as well as jdk_lang, jdk_util, jdk_nio and jdk_awt
>>>> (not a random list, all those jdk_* test suites are
>>>> using Thread.sleep() and thread.interrupt()).
>>>>
>>>> The bug description:
>>>> https://bugs.openjdk.java.net/browse/JDK-6546236
>>>>
>>>> The webrev:
>>>> http://cr.openjdk.java.net/~fparain/6546236/webrev.00/
>>>>
>>>> Thank you,
>>>>
>>>> Fred
>>>>
>>>
>>
>

-- 
Frederic Parain - Oracle
Grenoble Engineering Center - France
Phone: +33 4 76 18 81 17
Email: Frederic.Parain at oracle.com


More information about the hotspot-runtime-dev mailing list