RFR: 6900441 PlatformEvent.park(millis) on Linux could still be affected by changes to the time-of-day clock

David Holmes david.holmes at oracle.com
Wed Sep 11 16:14:50 PDT 2013


updated webrev:

http://cr.openjdk.java.net/~dholmes/6900441/webrev.v2/

On 12/09/2013 7:10 AM, Daniel D. Daugherty wrote:
> On 9/11/13 5:06 AM, David Holmes wrote:
>> webrev:
>>
>> http://cr.openjdk.java.net/~dholmes/6900441/webrev/
>
> Looks like Ron attached the two TimeJump... tests that were developed
> back in July. Please let Ron, Jerry and I know how these tests work
> with your fix in place. (We're curious...)

Of course these tests pass (and many thanks to everyone involved in 
providing them) - I should have mentioned that. I also adapted them to 
test parkNanos so that we cover the three main user-level API's (there 
are only two VM-level APIs). I also adapted into a separate (manual 
observation) test of parkUntil to ensure that absolute waits do get 
affected by the clock change (this passes with old and new code of course).

> Short version: Thumbs up! Nothing critical to change

Thanks.

> Long version:
>
> src/os/linux/vm/os_linux.hpp
>      No comments.
>
> src/os/linux/vm/os_linux.cpp
>      line 4712: warning("Unable to associate monotonic clock with ...
>          is way too long and should be split into at least two strings.

Fixed.

>      line 4713:       }
>      line 4714 :      else {
>          HotSpot style is "} else {" (i.e., on the same line)

Fixed. Along with a fix pre-existing ones in the same code.

>      line 4715: fatal(err_msg("pthread_condattr_setclock: %s",
> strerror(status)));
>          line is indented two spaces too many

Fixed.

>      line 5541: assert_status(status == 0, status, "clock_gettime");
>      line 5553: assert(status == 0, "gettimeofday");
>          So why is one an assert_status() call and the other is a
>          plain old assert() call?

Different API's. The old unix API's, like gettimeofday return -1 on 
error and set errno. The "modern" posix APIs, eg pthread APIs and 
clock_gettime etc, return the actual error code on error - hence 
assert_status can be used to show the actual error in that case.

>      line 5808: //  tty->print_cr("abstime = (%lld, %lld)",
> (jlong)absTime->tv_sec,
>                 (jlong)absTime->tv_nsec);
>
>          Leftover debugging/testing code?

Yep -fixed.

Thanks again!

David
-----

> Dan
>
>
>
>> Short version: use CLOCK_MONOTONIC with pthread_cond_t objects for
>> relative timed-waits.
>>
>> Long version - see below :)
>>
>> Thanks,
>> David
>> -----
>>
>> Background: relative timed-waits (Thread.sleep, Object.wait,
>> LockSupport.parkNanos) should not be affected by changes to the
>> time-of-day clock. The linux implementation uses
>> pthread_cond_timedwait on default initialized pthread_cond_t objects.
>> pthread_cond_timedwait actually takes an absolute time and the default
>> clock is CLOCK_REALTIME (which is a time-of-day clock). You would
>> expect then that changing the time-of-day would impact these
>> supposedly relative timed waits: time going forward would cause early
>> returns; time going backwards would cause extended delays. But for
>> many years it did not - the reason being that the glibc/pthreads
>> implementors had not implemented it to work that way. Originally the
>> monotonic clock didn't even exist so there was only one way this could
>> be implemented (and we had LinuxThreads vs NPTL etc etc - lots of
>> history).
>>
>> Skip forward to 2009 and glibc was modified to correct this behaviour
>> - but only on 64-bit linux (no idea why - the 32-bit variant was added
>> earlier this year but I don't know what glibc version that corresponds
>> to). For 64-bit this was, as I understand it, glibc 2.12. By
>> coincidence in 2009 I filed 6900441 as I knew we would have to change
>> the implementation when glibc was fixed - unfortunately I wasn't aware
>> that it actually had been fixed at that time. The bug was slated for
>> future work, I went on to other things, blah blah blah ...
>>
>> We didn't get a flurry of bug reports in 2009, nor 2010, 2011 ... why
>> not? Partly because people tend to avoid large jumps in system time
>> and this bug only becomes noticeable when "large" backward jumps occur
>> and threads 'hang'. (Forward jumps cause early returns that are either
>> filtered out or permitted as spurious-wakeups). And partly it seems
>> because it took time for glibc 2.12 to get into some Linux
>> distributions and for people to take up the new version.
>>
>> Skip forward to today and we are starting to see a number of reports
>> about this problem as people are now using the new glibc (and may have
>> been for a while) and also seeing time changes have unexpected impact
>> on their programs. So this needs to be fixed ASAP before it becomes a
>> major problem and will be backported to OpenJDK 7 and 6.
>>
>> The fix is relatively straight-forward: the pthread_cond_t objects
>> used for the relative timed-waits have to be associated with
>> CLOCK_MONOTONIC so that they are not affected by changes to the
>> time-of-day clock. There is a slight complication in that the
>> LockSupport.park API supports both a relative and absolute timed-wait
>> so we need two different pthread_cond_t each associated with different
>> clocks.
>>
>> Notes:
>>
>> 1. This is a linux fix only. I don't know if we also have the problem
>> on other OS but it hasn't been flagged and while I will check, it is
>> more important to get this out for Linux ASAP.
>>
>> 2. Given the late stage of JDK 8 release cycle (to minimize risk), and
>> to ease backporting to 6 and 7, I made no attempt to do any kind of
>> code clean up here. This code is full of historical anachronisms and
>> for Java 9 I hope to see it all cleaned up, but for now all the
>> baggage and duplication must remain as-is.
>>
>> 3. We can obviously only fix this if we have a monotonic clock hence
>> that has to be used to guard the new code. These days it would be
>> extremely rare to not have the monotonic clock but I still use the guard.
>>
>> 4. CLOCK_MONOTONIC is not in fact completely immune to changes in the
>> time of day clock but it won't jump backwards. The new clock on the
>> block is CLOCK_MONOTONIC_RAW which should always advance at a constant
>> rate with no jumps. We have a RFE to start using CLOCK_MONOTONIC_RAW
>> for System.nanoTime(), and we would use it for the pthread_cond_t too,
>> but we can't use that until the JDK 8 build platform is updated to a
>> linux version that actually has that clock at build time. That update
>> is very near but not yet here so we stay with CLOCK_MONOTONIC.
>


More information about the hotspot-dev mailing list