RFR: JDK-6546236 Thread interrupt() of Thread.sleep() can be lost on Solaris due to race with signal handler

Frederic Parain frederic.parain at oracle.com
Wed Feb 19 08:05:46 PST 2014


Dan,

Thank you for your reviews,

Fred

On 02/19/2014 04:01 PM, Daniel D. Daugherty wrote:
>  > http://cr.openjdk.java.net/~fparain/6546236/webrev.02/
>
> Thumbs up!
>
> src/share/vm/runtime/os.hpp
>      No comments.
>
> src/os/posix/vm/os_posix.cpp
>      No comments.
>
> src/os/bsd/vm/os_bsd.hpp
>      No comments.
>
> src/os/bsd/vm/os_bsd.inline.hpp
>      No comments.
>
> src/os/bsd/vm/os_bsd.cpp
>      No comments.
>
> src/os/linux/vm/os_linux.hpp
>      No comments.
>
> src/os/linux/vm/os_linux.inline.hpp
>      line 331 is very long
>      line 370 is very long
>
>      If you fix the too long lines, no re-review is necessary.
>
> src/os/linux/vm/os_linux.cpp
>      No comments.
>
> src/os/solaris/vm/os_solaris.inline.hpp
>      No comments.
>
> src/os/solaris/vm/os_solaris.cpp
>      No comments.
>
> src/os/windows/vm/os_windows.hpp
>      No comments.
>
> src/os/windows/vm/os_windows.inline.hpp
>      No comments.
>
> src/os/windows/vm/os_windows.cpp
>      No comments.
>
> Dan
>
>
> On 2/18/14 8:32 AM, frederic parain wrote:
>> Hi Dan,
>>
>> Thank you for this second review. I applied all
>> your suggestions to the changeset.
>>
>> I rolled back the warning/assert code to its initial
>> state: a single warning conditioned by the expectation
>> of a monotonic clock.
>>
>> The new webrev is available here:
>>
>> http://cr.openjdk.java.net/~fparain/6546236/webrev.02/
>>
>> Regards,
>>
>> Fred
>>
>> On 17/02/2014 19:52, Daniel D. Daugherty wrote:
>>>  > http://cr.openjdk.java.net/~fparain/6546236/webrev.01/
>>>
>>> src/share/vm/runtime/os.hpp
>>>      No comments.
>>>
>>> src/os/bsd/vm/os_bsd.inline.hpp
>>>      If you move the Bsd::_clock_gettime check here, you should be
>>>      able to delete Bsd::supports_monotonic_clock from
>>>      src/os/bsd/vm/os_bsd.hpp.
>>>
>>> src/os/bsd/vm/os_bsd.cpp
>>>      No comments.
>>>
>>> src/os/linux/vm/os_linux.hpp
>>>      No comments.
>>>
>>> src/os/linux/vm/os_linux.inline.hpp
>>>      No comments.
>>>
>>> src/os/linux/vm/os_linux.cpp
>>>      No comments.
>>>
>>> src/os/solaris/vm/os_solaris.inline.hpp
>>>      No comments.
>>>
>>> src/os/solaris/vm/os_solaris.cpp
>>>      No comments.
>>>
>>> src/os/windows/vm/os_windows.hpp
>>>      No comments.
>>>
>>> src/os/windows/vm/os_windows.inline.hpp
>>>      No comments.
>>>
>>> src/os/windows/vm/os_windows.cpp
>>>      No comments.
>>>
>>> src/os/posix/vm/os_posix.cpp
>>>      line 333: assert(true, "unexpected time moving backwards detected
>>> in os:sleep()");
>>>      line 375: assert(true, "unexpected time moving backwards detected
>>> on os::sleep()");
>>>          These asserts will not fire. The 'true' needs to be 'false'.
>>>
>>>      You might want to distinguish between the two different pairs of
>>>      messages:
>>>
>>>   332           warning("time moving backwards detected in
>>> os:sleep(interruptible)");
>>>   333           assert(false, "unexpected time moving backwards detected
>>> in os:sleep(interruptible)");
>>>
>>>      and:
>>>
>>>   374           warning("time moving backwards detected on
>>> os::sleep(!interruptible)");
>>>   375           assert(false, "unexpected time moving backwards detected
>>> on os::sleep(!interruptible)");
>>>
>>> Dan
>>>
>>>
>>> On 2/17/14 9:17 AM, Frederic Parain wrote:
>>>> Hi Dan,
>>>>
>>>> Thank you very much for your review, I knew it was a painful review
>>>> with all this code moving from one directory to another. I've
>>>> in-lined my answers below.
>>>>
>>>> The new webrev is here:
>>>> http://cr.openjdk.java.net/~fparain/6546236/webrev.01/
>>>>
>>>> On 02/13/2014 07:14 PM, Daniel D. Daugherty wrote:
>>>>>  > http://cr.openjdk.java.net/~fparain/6546236/webrev.00/
>>>>>
>>>>> Sorry for the delay. This was a difficult review to perform due to
>>>>> the code motion between files. My technique was to use the 'udiff'
>>>>> view for a side-by-side visual compare. Of course, this means I
>>>>> could easily have missed critical differences... :-)
>>>>>
>>>>> src/os/bsd/vm/os_bsd.cpp
>>>>>      Deletes os::sleep(), os::naked_sleep(), os::interrupt() and
>>>>>      os::is_interrupted().
>>>>>
>>>>>      bsd << os::sleep >> posix
>>>>>        << assert(!Bsd::supports_monotonic_clock(), "time moving
>>>>> backwards");
>>>>>        >> warning("time moving backwards");
>>>>>            The above diff happens twice. Without at least an assert
>>>>> here
>>>>>            we will be less likely to see this issue pop-up on specific
>>>>>            test machines.
>>>>
>>>> I've re-written this warning. Now the warning will be printed only
>>>> if the JVM was expecting a monotonic clock and its time warps backward.
>>>> Note that warnings can be disabled on product builds with
>>>> -XX:-PrintWarnings if the warning is causing trouble on VM output
>>>> stream.
>>>> The warning is also associated with an assert for our debug builds,
>>>> as we discussed privately.
>>>>
>>>>>      bsd << os::is_interrupted >> posix
>>>>>        picked up a long comment from the Solaris version
>>>>>
>>>>> src/os/linux/vm/os_linux.cpp
>>>>>      Deletes os::sleep(), os::naked_sleep(), os::interrupt() and
>>>>>      os::is_interrupted().
>>>>>
>>>>>      linux << os::sleep >> posix
>>>>>        << assert(!Linux::supports_monotonic_clock(), "time moving
>>>>> backwards");
>>>>>        >> warning("time moving backwards");
>>>>>            The above diff happens twice. Without at least an assert
>>>>> here
>>>>>            we will be less likely to see this issue pop-up on specific
>>>>>            test machines.
>>>>
>>>> Same comment as above.
>>>>
>>>>>      linux << os::is_interrupted >> posix
>>>>>        picked up a long comment from the Solaris version
>>>>>
>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>      Deletes os::sleep(), os::naked_sleep(), os::interrupt() and
>>>>>      os::is_interrupted(). Also deletes static os_sleep() and
>>>>>      os::Solaris::setup_interruptible_already_blocked()
>>>>>
>>>>>      solaris << os::interrupt >> posix
>>>>>        << // When events are used everywhere for os::sleep, then this
>>>>> thr_kill
>>>>>        << // will only be needed if UseVMInterruptibleIO is true.
>>>>>        << if (!isInterrupted) {
>>>>>        <<   int status = thr_kill(osthread->thread_id(),
>>>>> os::Solaris::SIGinterrupt());
>>>>>        <<   assert_status(status == 0, status, "thr_kill");
>>>>>            Deleting the above code presumes that you're also going to
>>>>>            remove the remaining uses of UseVMInterruptibleIO. Should
>>>>>            that be done with this changeset so things are not out of
>>>>>            sync.
>>>>
>>>> Initially I had the fix for 6546236 and the removal of
>>>> UseVMInterruptibleIO in the same workspace, but I split it into
>>>> two different changesets to ease tracking in our bug database.
>>>> The changeset to remove UseVMInterruptibleIO should be ready very
>>>> soon after fix for 6546236 has been pushed.
>>>>
>>>>>        << // Bump thread interruption counter
>>>>>        << RuntimeService::record_thread_interrupt_signaled_count();
>>>>>            This call was only ever made on Solaris. Is this just a
>>>>>            counter for when a SIGinterrupt has to be thrown because
>>>>>            of the UseVMInterruptibleIO stuff?
>>>>>
>>>>>      Doesn't this deletion make INTERRUPTIBLE_NORESTART_VM_ALWAYS
>>>>>      unused so it can be deleted from
>>>>>      src/os/solaris/vm/os_solaris.inline.hpp
>>>>>
>>>>>      Were you also planning on removing the last vestiges of
>>>>>      UseVMInterruptibleIO? Or will that be done with another bug?
>>>>
>>>> The changeset to remove UseVMInterruptibleIO includes all those
>>>> cleanup:
>>>> clean deprecation of the flag, re-work of all the macros around
>>>> interruptible calls, cleanup of RuntimeServices methods and the
>>>> associated performance counters.
>>>>
>>>>> src/os/posix/vm/os_posix.cpp
>>>>>      Adds os::sleep(), os::naked_sleep(), os::interrupt() and
>>>>>      os::is_interrupted().
>>>>>
>>>>>      > // the interrupted flag into the os::Solaris::Event structure,
>>>>>          Should this be more generic instead of "Solaris" specific?
>>>>
>>>> I replaced this comment by the new version provided by David Holmes,
>>>> so no more Solaric specific content.
>>>>
>>>> Thank you,
>>>>
>>>> Fred
>>>>
>>>>> On 2/10/14 8:58 AM, Frederic Parain wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> Please review this fix for JDK-6546236.
>>>>>>
>>>>>> A race condition exists in the implementation of
>>>>>> Thread.sleep() and Thread.interrupt() on the
>>>>>> Solaris platform. For historical reasons, the
>>>>>> Solaris implementation of sleep() and interrupt()
>>>>>> differs from the other Unices (Linux, BSD).
>>>>>> It uses poll() calls while Linux and BSD are
>>>>>> using ParkEvents.
>>>>>>
>>>>>> This changeset reconciles the Solaris code with
>>>>>> Linux and BSD code. And instead of having three
>>>>>> copies of this code, sleep() and interrupt()
>>>>>> code are moved to the os_posix.cpp from
>>>>>> os_[solaris|linux|bsd].cpp files.
>>>>>>
>>>>>> Note that with this changeset, the Solaris
>>>>>> implementation of sleep() and interrupt()
>>>>>> is loosing the support for UseVMInterruptibleIO
>>>>>> but this is not an issue because this VM
>>>>>> flag has been deprecated in JDK8 and will be
>>>>>> removed in JDK9.
>>>>>>
>>>>>> The code has been tested with JPRT, vm.quick.testlist
>>>>>> as well as jdk_lang, jdk_util, jdk_nio and jdk_awt
>>>>>> (not a random list, all those jdk_* test suites are
>>>>>> using Thread.sleep() and thread.interrupt()).
>>>>>>
>>>>>> The bug description:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-6546236
>>>>>>
>>>>>> The webrev:
>>>>>> http://cr.openjdk.java.net/~fparain/6546236/webrev.00/
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Fred
>>>>>>
>>>>>
>>>>
>>>
>>
>

-- 
Frederic Parain - Oracle
Grenoble Engineering Center - France
Phone: +33 4 76 18 81 17
Email: Frederic.Parain at oracle.com


More information about the hotspot-runtime-dev mailing list