RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5"
David Holmes
david.holmes at oracle.com
Wed Apr 3 10:38:41 UTC 2019
On 3/04/2019 5:37 pm, Thomas Stüfe wrote:
> Hi David,
>
> On Wed, Apr 3, 2019 at 9:20 AM David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
> Hi Thomas,
>
> On 3/04/2019 4:37 pm, Thomas Stüfe wrote:
> > Hi David,
> >
> >
> > On Tue, Apr 2, 2019 at 10:57 PM David Holmes
> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> > <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>> wrote:
> >
> > Hi Thomas,
> >
> > Thanks for taking a look at this.
> >
> > On 3/04/2019 5:41 am, Thomas Stüfe wrote:
> > > Hi David,
> > >
> > > first thanks for the good analysis!
> > >
> > > Is this not a problem with the usage of setDaemon():
> > >
> > >
> >
> https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean)
> > >
> > > "This method must be invoked before the thread is started."
> >
> > Not the usage as such, but there is a problem with setDaemon
> - as per:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8221657
> >
> > The test that causes the crash in the VM deliberately tests a
> case
> > where
> > it expects to get the IllegalThreadStateException.
> >
> > > I think the real solution would be for setDaemon to
> distinguish
> > between
> > > not-yet-started, running and finished. It should not use
> > isAlive(). It
> > > should throw an exception if it has been started,
> regardless of
> > whether
> > > it finished already or not.
> >
> > Yes that fix is needed at the Java level. The use of isAlive()
> > pre-dates
> > the existence of Thread.State.
> >
> > But a change at the Java level may be some time coming given
> this is a
> > day one bug in the spec and implementation of
> Thread.setDaemon, so I
> > wanted to address this quickly in the VM as we are seeing
> these crashes
> > in testing.
> >
> >
> > I think a simple patch could be very simply using
> >
> > if (threadStatus != 0)
> >
> > instead of
> >
> > isAlive()
> >
> > in Thread.setDaemon?
>
> Sure the fix is trivial (plus the method needs to be synchronized), but
> that assumes that this spec inconsistency:
>
> * <p> This method must be invoked before the thread is started.
> *
> * @throws IllegalThreadStateException
> * if this thread is {@linkplain #isAlive alive}
>
> is resolved in favour of the first statement. They may decide that
> after
> 25 years it's better to maintain the "not alive" semantics and permit
> you to modify a terminated thread.
>
>
> Okay, I get it now. You are worried about backward compatibility.
> Someone calling setDaemon() in this way would now get an exception where
> beforehand he would not.
>
> But how about this then:
>
> public final void setDaemon(boolean on) {
> checkAccess();
> if (isAlive()) {
> throw new IllegalThreadStateException();
> } else if (threadStatus != 0) {
> // Not alive but not NEW - terminated?
> // do not change daemon state. Do not throw to not break backward
> compatibility.
> } else {
> daemon = on;
> }
> }
>
> Of course that would be observable from the outside (Thread::isDaemon()).
>
> At the expense of some complexity (e.g. two variables, one "real", one
> outward facing as source for isDaemon), this could be fixed.
>
> --
>
> But I do not want to stop your change. I think it is fine, I cannot see
> anything wrong with it.
Okay thanks for clarifying.
> For a moment I wondered whether we are exposed to a similar thing here:
>
> thread.cpp:1996 ThreadService::current_thread_exiting(this,
> is_daemon(threadObj()));
>
> But at this point isAlive() would still return true, yes? Since it seems
> it only gets reset in ensure_join().
Correct. The window for this bug is a call to setDaemon between the
ensure_join and the Threads::remove.
Thanks,
David
> --
>
> Cheers, Thomas
>
>
>
> > We do this in other places in Thread.java too.
> >
> > --
> >
> > Also I think it makes sense to scan for similar errors in the
> code base
> > (isAlive being used as "has-been-started") and fix those too.
> >
> > For example:
> >
> > ApplicationShutdownHook.java:
> >
> > static synchronized void add(Thread hook) {
> > if(hooks == null)
> > throw new IllegalStateException("Shutdown in progress");
> >
> > if (hook.isAlive())
> > throw new IllegalArgumentException("Hook already running");
> >
> > if (hooks.containsKey(hook))
> > throw new IllegalArgumentException("Hook previously
> registered");
> >
> > hooks.put(hook, hook);
> > }
> >
> > would register a terminated thread as shutdown hook. I found similar
> > looking code in ThreadPoolExecutor.
>
> Yeah that's a nasty bug - you can register a shutdown hook that will
> result in other shutdown hooks not getting started!
>
> > I really think the jdk would be really the right place to fix this.
>
> And it may get fixed there eventually. Meanwhile I just want to stop
> these fairly new assertions from triggering.
>
> Thanks,
> David
>
> >
> > Thanks,
> > David
> >
> > > Not sure. Its late, I may not be thinking straight.
> > >
> > > Cheers, Thomas
> > >
> > >
> > >
> > > On Tue, Apr 2, 2019 at 12:33 AM David Holmes
> > <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com <mailto:david.holmes at oracle.com>>
> > > <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>
> > <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>>> wrote:
> > >
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8218483
> > > webrev:
> http://cr.openjdk.java.net/~dholmes/8218483/webrev/
> > >
> > > A bug in Thread.setDaemon (JDK-8221657) means that the
> daemon
> > state
> > > of a
> > > thread can change after the thread is !isAlive() at
> the Java
> > level. If
> > > this happens before the VM call to
> > ThreadService::remove_thread then we
> > > have a situation where we incremented the thread counters
> > when the
> > > thread was not a daemon, and we decrement the thread
> counters
> > when the
> > > thread is a daemon - and so the counters are out of
> sync and the
> > > assertion fires.
> > >
> > > The simple fix is to capture the daemon state of the
> thread
> > while it is
> > > still alive and to pass that through to
> Threads::remove and thus
> > > ThreadService::remove_thread.
> > >
> > > Testing:
> > > - manual test with modified VM (to delay
> Threads::remove
> > call)
> > > as per
> > > the bug report
> > > - mach 5 tiers 1-3
> > >
> > > Thanks,
> > > David
> > >
> >
>
More information about the hotspot-runtime-dev
mailing list