RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5"

David Holmes david.holmes at oracle.com
Wed Apr 3 10:38:41 UTC 2019


On 3/04/2019 5:37 pm, Thomas Stüfe wrote:
> Hi David,
> 
> On Wed, Apr 3, 2019 at 9:20 AM David Holmes <david.holmes at oracle.com 
> <mailto:david.holmes at oracle.com>> wrote:
> 
>     Hi Thomas,
> 
>     On 3/04/2019 4:37 pm, Thomas Stüfe wrote:
>      > Hi David,
>      >
>      >
>      > On Tue, Apr 2, 2019 at 10:57 PM David Holmes
>     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>      > <mailto:david.holmes at oracle.com
>     <mailto:david.holmes at oracle.com>>> wrote:
>      >
>      >     Hi Thomas,
>      >
>      >     Thanks for taking a look at this.
>      >
>      >     On 3/04/2019 5:41 am, Thomas Stüfe wrote:
>      >      > Hi David,
>      >      >
>      >      > first thanks for the good analysis!
>      >      >
>      >      > Is this not a problem with the usage of setDaemon():
>      >      >
>      >      >
>      >
>     https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean)
>      >      >
>      >      > "This method must be invoked before the thread is started."
>      >
>      >     Not the usage as such, but there is a problem with setDaemon
>     - as per:
>      >
>      > https://bugs.openjdk.java.net/browse/JDK-8221657
>      >
>      >     The test that causes the crash in the VM deliberately tests a
>     case
>      >     where
>      >     it expects to get the IllegalThreadStateException.
>      >
>      >      > I think the real solution would be for setDaemon to
>     distinguish
>      >     between
>      >      > not-yet-started, running and finished. It should not use
>      >     isAlive(). It
>      >      > should throw an exception if it has been started,
>     regardless of
>      >     whether
>      >      > it finished already or not.
>      >
>      >     Yes that fix is needed at the Java level. The use of isAlive()
>      >     pre-dates
>      >     the existence of Thread.State.
>      >
>      >     But a change at the Java level may be some time coming given
>     this is a
>      >     day one bug in the spec and implementation of
>     Thread.setDaemon, so I
>      >     wanted to address this quickly in the VM as we are seeing
>     these crashes
>      >     in testing.
>      >
>      >
>      > I think a simple patch could be very simply using
>      >
>      > if (threadStatus != 0)
>      >
>      > instead of
>      >
>      > isAlive()
>      >
>      > in Thread.setDaemon?
> 
>     Sure the fix is trivial (plus the method needs to be synchronized), but
>     that assumes that this spec inconsistency:
> 
>            * <p> This method must be invoked before the thread is started.
>            *
>            * @throws  IllegalThreadStateException
>            *          if this thread is {@linkplain #isAlive alive}
> 
>     is resolved in favour of the first statement. They may decide that
>     after
>     25 years it's better to maintain the "not alive" semantics and permit
>     you to modify a terminated thread.
> 
> 
> Okay, I get it now. You are worried about backward compatibility. 
> Someone calling setDaemon() in this way would now get an exception where 
> beforehand he would not.
> 
> But how about this then:
> 
> public final void setDaemon(boolean on) {
>      checkAccess();
>      if (isAlive()) {
>          throw new IllegalThreadStateException();
>      } else if (threadStatus != 0) {
>        // Not alive but not NEW - terminated?
>        // do not change daemon state. Do not throw to not break backward 
> compatibility.
>      }  else {
>        daemon = on;
>      }
> }
> 
> Of course that would be observable from the outside (Thread::isDaemon()).
> 
> At the expense of some complexity (e.g. two variables, one "real", one 
> outward facing as source for isDaemon), this could be fixed.
> 
> --
> 
> But I do not want to stop your change. I think it is fine, I cannot see 
> anything wrong with it.

Okay thanks for clarifying.

> For a moment I wondered whether we are exposed to a similar thing here:
> 
> thread.cpp:1996  ThreadService::current_thread_exiting(this, 
> is_daemon(threadObj()));
> 
> But at this point isAlive() would still return true, yes? Since it seems 
> it only gets reset in ensure_join().

Correct. The window for this bug is a call to setDaemon between the 
ensure_join and the Threads::remove.

Thanks,
David

> --
> 
> Cheers, Thomas
> 
> 
> 
>      > We do this in other places in Thread.java too.
>      >
>      > --
>      >
>      > Also I think it makes sense to scan for similar errors in the
>     code base
>      > (isAlive being used as "has-been-started") and fix those too.
>      >
>      > For example:
>      >
>      > ApplicationShutdownHook.java:
>      >
>      > static synchronized void add(Thread hook) {
>      >      if(hooks == null)
>      >          throw new IllegalStateException("Shutdown in progress");
>      >
>      >      if (hook.isAlive())
>      >          throw new IllegalArgumentException("Hook already running");
>      >
>      >      if (hooks.containsKey(hook))
>      >          throw new IllegalArgumentException("Hook previously
>     registered");
>      >
>      >      hooks.put(hook, hook);
>      > }
>      >
>      > would register a terminated thread as shutdown hook. I found similar
>      > looking code in ThreadPoolExecutor.
> 
>     Yeah that's a nasty bug - you can register a shutdown hook that will
>     result in other shutdown hooks not getting started!
> 
>      > I really think the jdk would be really the right place to fix this.
> 
>     And it may get fixed there eventually. Meanwhile I just want to stop
>     these fairly new assertions from triggering.
> 
>     Thanks,
>     David
> 
>      >
>      >     Thanks,
>      >     David
>      >
>      >      > Not sure. Its late, I may not be thinking straight.
>      >      >
>      >      > Cheers, Thomas
>      >      >
>      >      >
>      >      >
>      >      > On Tue, Apr 2, 2019 at 12:33 AM David Holmes
>      >     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>     <mailto:david.holmes at oracle.com <mailto:david.holmes at oracle.com>>
>      >      > <mailto:david.holmes at oracle.com
>     <mailto:david.holmes at oracle.com>
>      >     <mailto:david.holmes at oracle.com
>     <mailto:david.holmes at oracle.com>>>> wrote:
>      >      >
>      >      >     Bug: https://bugs.openjdk.java.net/browse/JDK-8218483
>      >      >     webrev:
>     http://cr.openjdk.java.net/~dholmes/8218483/webrev/
>      >      >
>      >      >     A bug in Thread.setDaemon (JDK-8221657) means that the
>     daemon
>      >     state
>      >      >     of a
>      >      >     thread can change after the thread is !isAlive() at
>     the Java
>      >     level. If
>      >      >     this happens before the VM call to
>      >     ThreadService::remove_thread then we
>      >      >     have a situation where we incremented the thread counters
>      >     when the
>      >      >     thread was not a daemon, and we decrement the thread
>     counters
>      >     when the
>      >      >     thread is a daemon - and so the counters are out of
>     sync and the
>      >      >     assertion fires.
>      >      >
>      >      >     The simple fix is to capture the daemon state of the
>     thread
>      >     while it is
>      >      >     still alive and to pass that through to
>     Threads::remove and thus
>      >      >     ThreadService::remove_thread.
>      >      >
>      >      >     Testing:
>      >      >         - manual test with modified VM (to delay
>     Threads::remove
>      >     call)
>      >      >     as per
>      >      >     the bug report
>      >      >         - mach 5 tiers 1-3
>      >      >
>      >      >     Thanks,
>      >      >     David
>      >      >
>      >
> 


More information about the hotspot-runtime-dev mailing list