Fast graceful shutdown of ThreadPerTaskExecutor (when expected WAITING Threads)

Eric Kolotyluk eric at kolotyluk.net
Tue Dec 20 21:09:40 UTC 2022


As another curious casual observer, who has been playing with Nima...

Rob, why are you not using StructuredTaskScope instead of 
Executors.newThreadPerTaskExecutor?

Basically, is there something you know that I don't know?

Cheers, Eric

On 2022-12-20 1:01 p.m., Rob Bygrave wrote:
> /> The way to interrupt sockets has always been to close them/
>
> We are looking to achieve "Graceful shutdown" which means it allows 
> in-flight requests to complete before shutting down.  We don't want to 
> just nuke em or said differently, it is the job of 
> executorService.awaitTermination(...) executorService.shutdownNow() to 
> nuke anything still active after the timeout.
>
> ... yes we want to interrupt Virtual Threads that are WAITING and 
> known to be /*not in-flight*/ /*BEFORE*/ 
> executorService.awaitTermination(...)
>
>
> /> anything to do with Loom and virtual threads./
>
> Maybe. There are a few things that make me ask though and I'm in 
> particular looking at ThreadPerTaskExecutor. One is that all the 
> existing Java http servers that I know of use NIO (as you would 
> expect) and specifically that means "getting the next request" for 
> http 1.1 keepalive true is an /*event*/ where as with Virtual Threads 
> and Nima "getting the next request" is a Virtual Thread blocking on IO 
> waiting for those next bytes to arrive (which to me is the "new 
> normal" approach that Virtual Threads brings).
>
> That is, with Virtual Threads I am guessing that it's potentially a 
> "new normal" pattern where we are looking to shutdown gracefully a 
> ThreadPerTaskExecutor which is holding active Virtual Threads that are 
> blocking waiting on IO (but not yet deemed in-flight).  I'm suggesting 
> that ThreadPerTaskExecutor currently does not make it nice / easy / 
> friendly to shutdown gracefully allowing for in-flight requests.
>
> I see ThreadPerTaskExecutor.shutdownNow() always returns an empty list 
> of Runnable (this is too late in the shutdown process anyway so not 
> useful per say but we are trying to access the Stream<Runnable>). I 
> see the internal ThreadContainer and the internal Stream<Thread> 
> threads() method but I don't see anything available to application 
> code to assist the shutdown of a ThreadPerTaskExecutor for this case 
> which I feel might become a common pattern going forward.
>
>
>
> Cheers, Rob.
>
>
> On Wed, 21 Dec 2022 at 05:38, Jens Lideström 
> <jens.lidestrom at fripost.org> wrote:
>
>     Hello,
>
>     Random passer-by here.
>
>     I don't think this has anything to do with Loom and virtual threads.
>
>     The way to interrupt sockets has always been to close them. Then
>     blocking methods will throw an exception.
>
>     See this, for example:
>     https://stackoverflow.com/questions/4425350/how-to-terminate-a-thread-blocking-on-socket-io-operation-instantly
>
>     Nima seems to do that for the server socket, but it FIRST waits
>     for the ExecutorService to shutdown:
>
>     https://github.com/helidon-io/helidon/blob/main/nima/webserver/webserver/src/main/java/io/helidon/nima/webserver/ServerListener.java#L137
>
>     I think Nima should close the socket FIRST, then shutdown the
>     ExecutorService.
>
>     (I have not checked how Nima handles the client sockets. They
>     should be treated in the same way.)
>
>     Having written the above it strikes me that maybe you aware of
>     this already? Maybe you are suggesting the addition of some new
>     kind of application independent interruption mechanism that are
>     honoured by InputStreams from Sockets? I don't think this is
>     related to Loop, either.
>
>     Best regards,
>     Jens Lideström
>
>
>     On 2022-12-19 22:51, Rob Bygrave wrote:
>     > I have been looking at the shutdown process of Helidon Nima web
>     server which makes use of:
>     >
>     > Executors.newThreadPerTaskExecutor(Thread.ofVirtual()
>     >    .allowSetThreadLocals(true)
>     >    .inheritInheritableThreadLocals(false)
>     >    .factory());
>     >
>     > Firstly, there is no issue with Nima web server shutdown when
>     using HTTP 1.0 no keepalive. The virtual threads for this case
>     live for the length of a single request/response only.
>     >
>     > Where I am hitting a question/issue is with Nima web server
>     shutdown when using HTTP 1.1 keepalive true (and there is at least
>     1 connection being kept alive). What happens with HTTP 1.1 with
>     keepalive true is that there is 1 virtual thread per connection
>     (slight simplification). After the request/response has been
>     processed the virtual thread then looks to read the next request.
>     When there are no more requests coming into the web server we see
>     this thread is WAITING looking to read the first part of the next
>     request.
>     >
>     > The thread is WAITING "while reading the prologue" here:
>     >
>     https://github.com/helidon-io/helidon/blob/main/nima/webserver/webserver/src/main/java/io/helidon/nima/webserver/http1/Http1Connection.java#L115
>     <https://github.com/helidon-io/helidon/blob/main/nima/webserver/webserver/src/main/java/io/helidon/nima/webserver/http1/Http1Connection.java#L115>
>     >
>     > Conceptually when the Nima web server is "idle" we expect to be
>     able to shutdown the web server gracefully (allowing for active
>     requests to complete) and quickly. In this "idle" state with HTTP
>     1.1 keepalive connections the ThreadPerTaskExecutor contains alive
>     threads that are in WAITING state while "reading the prologue of
>     the next request".
>     >
>     > Currently the webserver.stop() ends up as the usual:
>     >
>     > executorService.shutdown();
>     > if (!executorService.awaitTermination(...)) {
>     >    executorService.shutdownNow()
>     > }
>     >
>     > Where the executorService in question is ThreadPerTaskExecutor.
>     The above shutdown does not execute in a timely manner based on
>     the timeout used for executorService.awaitTermination(...) - for
>     example if this timeout is 10 seconds we get a pretty slow
>     shutdown on a conceptually idle web server.
>     >
>     >
>     > *Some thoughts*
>     > An approach would be as part of
>     [ThreadPerTaskExecutor].shutdown() to firstly look to interrupt
>     threads that are state == WAITING &&
>     some-application-specific-logic-that-says-the-task-is-interruptible
>     (which for Nima is that the task is readPrologue() at
>     Http1Connection.java#L115).
>     >
>     > e.g. Perhaps have a interface InterruptableTask extends Runnable
>     ... and on shutdown() firstly iterate the active threads that are
>     WAITING and if their tasks are InterruptableTask and
>     application-specific-logic-that-says-the-task-is-interruptible
>     true then interrupt that thread.
>     >
>     > I note there is some non-public API like Stream<Thread>
>     threads() which applications can't use (at least yet). My current
>     hacking around with this shutdown in Helidon hasn't given me any
>     elegant solutions in application logic in that we need to now have
>     the application additionally hold the Runnable tasks (which is not
>     ideal in the http 1.0 no keepalive case).
>     >
>     >
>     > Does anyone have some thoughts or suggestions on this?
>     >
>     >
>     > I have raised this as an issue in Helidon as:
>     https://github.com/helidon-io/helidon/issues/5717
>     <https://github.com/helidon-io/helidon/issues/5717>
>     > with a related PR with failing tests. I'm raising this here
>     because my current thought is that an elegant solution might be
>     with ThreadPerTaskExecutor on shutdown or some exposure of the
>     waiting threads accessible to application code.
>     >
>     >
>     > Thanks, Rob.
>     >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/loom-dev/attachments/20221220/020c7806/attachment.htm>


More information about the loom-dev mailing list