JVM64bit running on Linux 64bit: when system time changes the JVM may hang (bug_id=6900441)

David Holmes david.holmes at oracle.com
Mon Sep 2 15:04:55 PDT 2013


There's a lot of history here. In my view of relative and absolute times,
the way the kernel worked in the past was buggy. I know some shared that
view. But there was also a lot of contention about how changes to the
system clock should, or should not, affect existing timer queues. And
this has been true on Solaris as well as Linux. Hence, in my view of the
theory, this should never have worked well, but the kernel implementors
had a different view, and so it did in fact work until recently (and I
don't know exactly what has changed).

More inline ...

On 3/09/2013 5:24 AM, Andrew Haley wrote:
> On 09/02/2013 04:44 PM, bruno bossola wrote:
>
>> Thanks for your answer.  That's probably correct in the sense this is what
>> the code is doing :) I won't discuss that at all.
>>
>> However pthread_cond_timedwait uses by default, as clock id, CLOCK_REALTIME
>> which, unfortunately is affected by settime()/settimeofday() calls (on
>> Linux): for that reason it cannot be used to measure nanoseconds delays,
>> which is what the specification requires. CLOCK_REALTIME is not guaranteed
>> to monotonically count as this is the actual "system time": each time my
>> system syncs time using a NTP server on the net, the time might jump
>> forward or backward. The correct call (again on Linux) would be to
>> use CLOCK_MONOTONIC as the clock id.
>>
>> Your point would be that, based on Posix specs, setting the value of the
>> CLOCK_REALTIME clock using clock_settime() should have no effect on threads
>> that are waiting. Is that correct?
>
> The first thing I had to do was understand what was going on.
>
> Secondly, what could be done to fix it?  pthread_cond_timedwait() uses
> the real-time clock.  That's not just by default: as far as I can see
> there's no way to make it use any other clock.  So, fixing this
> problem requires some reengineering, not just replacing CLOCK_REALTIME
> with CLOCK_MONOTONIC.

You can associate any available clock with a pthread_cond_t by
initializing it with a pthread_condattr_t on which
pthread_condattr_setclock has been called.
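
To make that concrete, here is a minimal sketch in C of a condition
variable whose timed waits are measured against CLOCK_MONOTONIC. The
helper names monotonic_cond_init and monotonic_cond_wait_ms are
hypothetical, chosen for illustration only; this is not the HotSpot code,
and it assumes a platform where pthread_condattr_setclock accepts
CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: initialise a condition variable whose timed waits
 * are measured against CLOCK_MONOTONIC rather than CLOCK_REALTIME.
 * Returns 0 on success or a pthread error number. */
int monotonic_cond_init(pthread_cond_t *cond)
{
    pthread_condattr_t attr;
    int rc = pthread_condattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    if (rc == 0)
        rc = pthread_cond_init(cond, &attr);

    pthread_condattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: wait for at most timeout_ms milliseconds.
 * The caller must hold the mutex. Returns 0 if woken (possibly
 * spuriously) or ETIMEDOUT if the deadline passed. */
int monotonic_cond_wait_ms(pthread_cond_t *cond, pthread_mutex_t *mutex,
                           long timeout_ms)
{
    struct timespec deadline;

    assert(timeout_ms >= 0);

    /* The deadline must be read from the same clock the condvar uses. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    return pthread_cond_timedwait(cond, mutex, &deadline);
}
```

pthread_cond_timedwait still takes an absolute deadline, but that
deadline is now on a clock that clock_settime() cannot move, so a jump
in the system time does not lengthen or shorten the wait.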

> Finally, I had to ask "Is this an important bug?"  I don't think it
> is, and David Holmes allowed been much more than I would have done.
> If you make the time go backwards in a big jump, all manner of things
> in a Unix system will go wrong.  In particular, timestamps on files
> will be affected, and anything that relies on such things will be
> affected too.  Anyone with root privileges should know this.
>
> The right answer in the short term is to use NTP at boot, so that
> systems are not provoked in this way.

I strongly believe in two distinct notions of time: relative and 
absolute. Any API involving relative time should never be affected by 
changes to the absolute value of a clock; and anything involving 
absolute times obviously should. Unfortunately not everyone agrees with
this simple model, and the confusion surrounding absolute vs relative
time APIs has existed pretty much from day one; it is a mess on every
OS I've worked on. As a result the practical advice is as you say -
don't mess with the system time without expecting problems.
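
As an illustration of the relative/absolute distinction, here is a small
sketch in C using clock_nanosleep, where the flags argument selects
between the two kinds of request. The helper names sleep_relative_ms and
elapsed_ns are hypothetical, and the choice of CLOCK_MONOTONIC is an
assumption made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper: nanoseconds elapsed between two readings
 * taken from the same clock. */
long long elapsed_ns(const struct timespec *start,
                     const struct timespec *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}

/* Hypothetical helper: sleep for a relative interval of ms milliseconds,
 * measured against CLOCK_MONOTONIC. Returns 0 or an error number. */
int sleep_relative_ms(long ms)
{
    struct timespec req, rem;
    int rc;

    assert(ms >= 0);
    req.tv_sec  = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    /* flags == 0 asks for a relative sleep; passing TIMER_ABSTIME
     * instead would make req an absolute deadline on the given clock.
     * If a signal interrupts the sleep, resume with the time left. */
    while ((rc = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &rem)) == EINTR)
        req = rem;

    return rc;
}
```

A caller can take two CLOCK_MONOTONIC readings around sleep_relative_ms
and pass them to elapsed_ns; the measured interval does not depend on
the wall-clock time at all.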

That said, if we can make things play nicely together then we should. 
The timer/clock management code in the VM has gone very stale over the 
years and there are lots of known issues (Windows has its own set of 
problems!). When I worked in runtime back in 2007 this big cleanup was 
on my plate. But I moved to real-time and this work languished - as it
was working okay, it was never a high-priority fix.

David
-----

> Andrew.
>
>


More information about the hotspot-runtime-dev mailing list