Fast graceful shutdown of ThreadPerTaskExecutor (when expected WAITING Threads)
Eric Kolotyluk
eric at kolotyluk.net
Tue Dec 20 21:09:40 UTC 2022
As another curious casual observer, who has been playing with Nima...
Rob, why are you not using StructuredTaskScope instead of
Executors.newThreadPerTaskExecutor?
Basically, is there something you know that I don't know?
Cheers, Eric
On 2022-12-20 1:01 p.m., Rob Bygrave wrote:
> /> The way to interrupt sockets has always been to close them/
>
> We are looking to achieve "Graceful shutdown" which means it allows
> in-flight requests to complete before shutting down. We don't want to
> just nuke em or said differently, it is the job of
> executorService.awaitTermination(...) executorService.shutdownNow() to
> nuke anything still active after the timeout.
>
> ... yes we want to interrupt Virtual Threads that are WAITING and
> known to be /*not in-flight*/ /*BEFORE*/
> executorService.awaitTermination(...)
>
>
> /> anything to do with Loom and virtual threads./
>
> Maybe. There are a few things that make me ask though and I'm in
> particular looking at ThreadPerTaskExecutor. One is that all the
> existing Java http servers that I know of use NIO (as you would
> expect) and specifically that means "getting the next request" for
> http 1.1 keepalive true is an /*event*/ where as with Virtual Threads
> and Nima "getting the next request" is a Virtual Thread blocking on IO
> waiting for those next bytes to arrive (which to me is the "new
> normal" approach that Virtual Threads brings).
>
> That is, with Virtual Threads I am guessing that it's potentially a
> "new normal" pattern where we are looking to shutdown gracefully a
> ThreadPerTaskExecutor which is holding active Virtual Threads that are
> blocking waiting on IO (but not yet deemed in-flight). I'm suggesting
> that ThreadPerTaskExecutor currently does not make it nice / easy /
> friendly to shutdown gracefully allowing for in-flight requests.
>
> I see ThreadPerTaskExecutor.shutdownNow() always returns an empty list
> of Runnable (this is too late in the shutdown process anyway so not
> useful per say but we are trying to access the Stream<Runnable>). I
> see the internal ThreadContainer and the internal Stream<Thread>
> threads() method but I don't see anything available to application
> code to assist the shutdown of a ThreadPerTaskExecutor for this case
> which I feel might become a common pattern going forward.
>
>
>
> Cheers, Rob.
>
>
> On Wed, 21 Dec 2022 at 05:38, Jens Lideström
> <jens.lidestrom at fripost.org> wrote:
>
> Hello,
>
> Random passer-by here.
>
> I don't think this has anything to do with Loom and virtual threads.
>
> The way to interrupt sockets has always been to close them. Then
> blocking methods will throw an exception.
>
> See this, for example:
> https://stackoverflow.com/questions/4425350/how-to-terminate-a-thread-blocking-on-socket-io-operation-instantly
>
> Nima seems to do that for the server socket, but it FIRST waits
> for the ExecutorService to shutdown:
>
> https://github.com/helidon-io/helidon/blob/main/nima/webserver/webserver/src/main/java/io/helidon/nima/webserver/ServerListener.java#L137
>
> I think Nima should close the socket FIRST, then shutdown the
> ExecutorService.
>
> (I have not checked how Nima handles the client sockets. They
> should be treated in the same way.)
>
> Having written the above it strikes me that maybe you aware of
> this already? Maybe you are suggesting the addition of some new
> kind of application independent interruption mechanism that are
> honoured by InputStreams from Sockets? I don't think this is
> related to Loop, either.
>
> Best regards,
> Jens Lideström
>
>
> On 2022-12-19 22:51, Rob Bygrave wrote:
> > I have been looking at the shutdown process of Helidon Nima web
> server which makes use of:
> >
> > Executors.newThreadPerTaskExecutor(Thread.ofVirtual()
> > .allowSetThreadLocals(true)
> > .inheritInheritableThreadLocals(false)
> > .factory());
> >
> > Firstly, there is no issue with Nima web server shutdown when
> using HTTP 1.0 no keepalive. The virtual threads for this case
> live for the length of a single request/response only.
> >
> > Where I am hitting a question/issue is with Nima web server
> shutdown when using HTTP 1.1 keepalive true (and there is at least
> 1 connection being kept alive). What happens with HTTP 1.1 with
> keepalive true is that there is 1 virtual thread per connection
> (slight simplification). After the request/response has been
> processed the virtual thread then looks to read the next request.
> When there are no more requests coming into the web server we see
> this thread is WAITING looking to read the first part of the next
> request.
> >
> > The thread is WAITING "while reading the prologue" here:
> >
> https://github.com/helidon-io/helidon/blob/main/nima/webserver/webserver/src/main/java/io/helidon/nima/webserver/http1/Http1Connection.java#L115
> <https://github.com/helidon-io/helidon/blob/main/nima/webserver/webserver/src/main/java/io/helidon/nima/webserver/http1/Http1Connection.java#L115>
> >
> > Conceptually when the Nima web server is "idle" we expect to be
> able to shutdown the web server gracefully (allowing for active
> requests to complete) and quickly. In this "idle" state with HTTP
> 1.1 keepalive connections the ThreadPerTaskExecutor contains alive
> threads that are in WAITING state while "reading the prologue of
> the next request".
> >
> > Currently the webserver.stop() ends up as the usual:
> >
> > executorService.shutdown();
> > if (!executorService.awaitTermination(...)) {
> > executorService.shutdownNow()
> > }
> >
> > Where the executorService in question is ThreadPerTaskExecutor.
> The above shutdown does not execute in a timely manner based on
> the timeout used for executorService.awaitTermination(...) - for
> example if this timeout is 10 seconds we get a pretty slow
> shutdown on a conceptually idle web server.
> >
> >
> > *Some thoughts*
> > An approach would be as part of
> [ThreadPerTaskExecutor].shutdown() to firstly look to interrupt
> threads that are state == WAITING &&
> some-application-specific-logic-that-says-the-task-is-interruptible
> (which for Nima is that the task is readPrologue() at
> Http1Connection.java#L115).
> >
> > e.g. Perhaps have a interface InterruptableTask extends Runnable
> ... and on shutdown() firstly iterate the active threads that are
> WAITING and if their tasks are InterruptableTask and
> application-specific-logic-that-says-the-task-is-interruptible
> true then interrupt that thread.
> >
> > I note there is some non-public API like Stream<Thread>
> threads() which applications can't use (at least yet). My current
> hacking around with this shutdown in Helidon hasn't given me any
> elegant solutions in application logic in that we need to now have
> the application additionally hold the Runnable tasks (which is not
> ideal in the http 1.0 no keepalive case).
> >
> >
> > Does anyone have some thoughts or suggestions on this?
> >
> >
> > I have raised this as an issue in Helidon as:
> https://github.com/helidon-io/helidon/issues/5717
> <https://github.com/helidon-io/helidon/issues/5717>
> > with a related PR with failing tests. I'm raising this here
> because my current thought is that an elegant solution might be
> with ThreadPerTaskExecutor on shutdown or some exposure of the
> waiting threads accessible to application code.
> >
> >
> > Thanks, Rob.
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/loom-dev/attachments/20221220/020c7806/attachment.htm>
More information about the loom-dev
mailing list