Fast graceful shutdown of ThreadPerTaskExecutor (when expected WAITING Threads)
Rob Bygrave
robin.bygrave at gmail.com
Mon Dec 19 21:51:29 UTC 2022
I have been looking at the shutdown process of Helidon Nima web server
which makes use of:
    Executors.newThreadPerTaskExecutor(Thread.ofVirtual()
        .allowSetThreadLocals(true)
        .inheritInheritableThreadLocals(false)
        .factory());
Firstly, there is no issue with Nima web server shutdown when using HTTP
1.0 without keepalive. In that case each virtual thread lives only for the
length of a single request/response.
Where I am hitting a question/issue is with Nima web server shutdown when
using HTTP 1.1 with keepalive enabled (and at least 1 connection being
kept alive). With HTTP 1.1 keepalive there is 1 virtual thread per
connection (slight simplification). After a request/response has been
processed, the virtual thread goes on to read the next request. When no
more requests are coming into the web server, this thread sits in WAITING
state, blocked reading the first part of the next request.
The thread is WAITING "while reading the prologue" here:
https://github.com/helidon-io/helidon/blob/main/nima/webserver/webserver/src/main/java/io/helidon/nima/webserver/http1/Http1Connection.java#L115
Conceptually when the Nima web server is "idle" we expect to be able to
shutdown the web server gracefully (allowing for active requests to
complete) and quickly. In this "idle" state with HTTP 1.1 keepalive
connections the ThreadPerTaskExecutor contains alive threads that are in
WAITING state while "reading the prologue of the next request".
Currently the webserver.stop() ends up as the usual:
    executorService.shutdown();
    if (!executorService.awaitTermination(...)) {
        executorService.shutdownNow();
    }
Where the executorService in question is ThreadPerTaskExecutor. The above
shutdown does not complete in a timely manner: the idle keepalive threads
never finish on their own, so shutdown takes the full timeout passed to
executorService.awaitTermination(...). For example, if this timeout is 10
seconds we get a pretty slow shutdown of a conceptually idle web server.
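To make the slow path concrete, here is a minimal sketch of the behaviour described above. It is not Helidon code. It assumes a cached thread pool as a stand-in for ThreadPerTaskExecutor (so it also runs on JDKs without virtual threads; on JDK 21+ Executors.newVirtualThreadPerTaskExecutor() could be swapped in) and a LinkedBlockingQueue.take() as a stand-in for the blocking prologue read. The class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class IdleShutdownDemo {

    /** The usual shutdown sequence; returns the elapsed time in milliseconds. */
    static long stopAndTime(ExecutorService executor, long timeoutMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        executor.shutdown(); // rejects new tasks, does not interrupt running ones
        if (!executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
            executor.shutdownNow(); // only now are the blocked tasks interrupted
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Starts one idle "connection" task, then times the shutdown. */
    static long demo(long timeoutMillis) {
        // Assumption: a cached pool stands in for ThreadPerTaskExecutor.
        ExecutorService executor = Executors.newCachedThreadPool();
        LinkedBlockingQueue<String> requests = new LinkedBlockingQueue<>();
        executor.execute(() -> {
            try {
                requests.take(); // stand-in for the blocking prologue read
            } catch (InterruptedException e) {
                // interrupted by shutdownNow(): let the task end
            }
        });
        try {
            return stopAndTime(executor, timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shutdown took " + demo(500) + " ms");
    }
}
```

Because nothing ever arrives on the queue, the task never completes by itself and the elapsed time tracks the awaitTermination timeout rather than the amount of real work outstanding.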
*Some thoughts*
An approach would be for [ThreadPerTaskExecutor].shutdown() to first
interrupt threads where state == WAITING &&
some-application-specific-logic-that-says-the-task-is-interruptible (which
for Nima is that the task is in readPrologue() at Http1Connection.java#L115).
e.g. perhaps have an interface InterruptableTask extends Runnable ... and on
shutdown() first iterate the active threads that are WAITING, and if a
thread's task is an InterruptableTask and its
application-specific-logic-that-says-the-task-is-interruptible returns true,
then interrupt that thread.
I note there is some non-public API like Stream<Thread> threads() which
applications can't use (at least not yet). My current hacking around with
this shutdown in Helidon hasn't given me any elegant solution in
application logic, in that the application now needs to additionally hold
the Runnable tasks (which is not ideal in the HTTP 1.0 no keepalive case).
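For discussion, here is a rough sketch of what the application-side variant might look like, where the application holds the tasks itself. Everything in it is hypothetical: InterruptableTask and isInterruptible() follow the idea above, TaskTracker and KeepAliveConnection are made-up names, a cached thread pool stands in for ThreadPerTaskExecutor, and a LinkedBlockingQueue.take() stands in for the prologue read. Note the check-then-interrupt is racy: a task could start handling a request between the check and the interrupt, so a real implementation would need to deal with that.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class InterruptibleShutdownSketch {

    /** Hypothetical interface, named as in the proposal above. */
    interface InterruptableTask extends Runnable {
        boolean isInterruptible();
    }

    /** Hypothetical application-side registry of running tasks and their threads. */
    static final class TaskTracker {
        private final Map<InterruptableTask, Thread> running = new ConcurrentHashMap<>();

        /** Wraps a task so it is registered while it runs. */
        Runnable track(InterruptableTask task) {
            return () -> {
                running.put(task, Thread.currentThread());
                try {
                    task.run();
                } finally {
                    running.remove(task);
                }
            };
        }

        /** Interrupts WAITING threads whose task reports it is interruptible. */
        int interruptIdle() {
            int interrupted = 0;
            for (Map.Entry<InterruptableTask, Thread> e : running.entrySet()) {
                if (e.getValue().getState() == Thread.State.WAITING
                        && e.getKey().isInterruptible()) {
                    e.getValue().interrupt();
                    interrupted++;
                }
            }
            return interrupted;
        }
    }

    /** Stand-in for a keepalive connection: loops reading the next "request". */
    static final class KeepAliveConnection implements InterruptableTask {
        private final LinkedBlockingQueue<String> requests = new LinkedBlockingQueue<>();
        private volatile boolean readingPrologue;

        @Override public boolean isInterruptible() { return readingPrologue; }

        @Override public void run() {
            try {
                while (true) {
                    readingPrologue = true;
                    requests.take(); // stand-in for the blocking prologue read
                    readingPrologue = false;
                    // ... handle the request and write the response ...
                }
            } catch (InterruptedException e) {
                // interrupted while idle: end the task, closing the connection
            }
        }
    }

    /** Shutdown that nudges idle tasks; returns the elapsed milliseconds. */
    static long stop(ExecutorService executor, TaskTracker tracker, long timeoutMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        long deadline = start + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        executor.shutdown();
        while (true) {
            tracker.interruptIdle(); // re-scan: a task may not have parked yet
            if (executor.awaitTermination(10, TimeUnit.MILLISECONDS)) break;
            if (System.nanoTime() - deadline >= 0) { executor.shutdownNow(); break; }
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    static long demo(long timeoutMillis) {
        ExecutorService executor = Executors.newCachedThreadPool(); // assumption: stand-in
        TaskTracker tracker = new TaskTracker();
        executor.execute(tracker.track(new KeepAliveConnection()));
        try {
            return stop(executor, tracker, timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shutdown took " + demo(10_000) + " ms");
    }
}
```

This shows the cost mentioned above: every task has to be wrapped and registered, including the short-lived ones that never sit idle.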
Does anyone have some thoughts or suggestions on this?
I have raised this as an issue in Helidon as:
https://github.com/helidon-io/helidon/issues/5717
with a related PR with failing tests. I'm raising this here because my
current thought is that an elegant solution might lie with
ThreadPerTaskExecutor on shutdown, or with some exposure of the waiting
threads to application code.
Thanks, Rob.