Can continuation support finally solve the "How do I stop this thread" problem?

Fri Sep 2 04:42:10 UTC 2022

On 2/09/2022 3:15 am, Archie Cobbs wrote:
> Thanks to Alan and Ron for your replies.
> 
> There are sort of two separate issues here, one narrow (how to reliably 
> stop a thread) and one larger (what is sandboxing and how best to do it).
> 
> The discussion about sandboxing is very interesting but for now I'm 
> really just using it as a motivating example for the more narrow 
> question, which is about whether and how it should be possible to stop a 
> thread (no matter what it's doing).

It can never be "no matter what it is doing" because that is simply 
unsafe in general. There have to be constraints on what kind of 
operations can be abandoned mid-stream with zero adverse consequences. 
Thread.stop was just an attractive nuisance here because it gave the 
illusion of being a simple mechanism to do this, whilst in reality 
allowing chaos to rule. (And yes, StackOverflowError is nearly as bad!) 
You cannot write non-trivial async-exception-proof code if any bytecode 
could "throw".

> There are other scenarios where one 
> might want to ensure a thread stops regardless of what it's doing (e.g., 
> JShell).
> 
> So my question is simply: what's the implementation distance between X 
> and Y, where X = [where we are now, after implementing virtual threads] 
> and Y = [the ability to always unblock a blocked thread]?
> 
> For the sake of clarity let's assume whoever wants to do this is able to 
> rewrite bytecode (e.g., to add checks for a "stop" flag on backward 
> branches) to avoid infinite loops, etc., so the only real barrier is the 
> thread being stuck in a blocking method call.
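
[Editorial sketch] The "stop flag on backward branches" idea can be illustrated by writing by hand the check that a bytecode rewriter might inject. This is only a minimal sketch under stated assumptions: StopFlagSketch, StopRequested, checkStop and runAndStop are hypothetical names that do not come from the thread or from any JDK API, and a real rewriter would more likely use a per-thread or per-sandbox flag than one global flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical names throughout: this mimics by hand what a rewriter might inject.
final class StopFlagSketch {
    /** Unchecked exception used to unwind the stack once a stop has been requested. */
    static final class StopRequested extends RuntimeException {}

    // Assumption: a single global flag, for brevity only.
    private static final AtomicBoolean STOP = new AtomicBoolean(false);

    static void requestStop() { STOP.set(true); }

    static void reset() { STOP.set(false); }

    /** The check a rewriter would insert on every backward branch. */
    static void checkStop() {
        if (STOP.get()) {
            throw new StopRequested();
        }
    }

    /** An otherwise infinite loop; it can only exit by throwing StopRequested. */
    static void spin() {
        while (true) {
            checkStop(); // injected at the loop's backward branch
        }
    }

    /** Starts spin() in a thread, requests a stop, and reports whether the thread unwound and died. */
    static boolean runAndStop(long timeoutMillis) {
        reset();
        AtomicBoolean unwound = new AtomicBoolean(false);
        Thread t = new Thread(() -> {
            try {
                spin();
            } catch (StopRequested e) {
                unwound.set(true);
            }
        });
        t.start();
        requestStop();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return unwound.get() && !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + runAndStop(5000));
    }
}
```

The check only runs when the loop comes back around, so it does nothing for a thread parked inside a blocking call, which is exactly the remaining gap the question is about.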

To which the only general solution is "only block using mechanisms you 
control and which are interruptible". There is no general way to unblock 
a blocking OS call unless the OS provides a mechanism to do so. But 
unblocking the thread is only one side of the problem - the hard part is 
deciding what state to leave the thing you were blocked upon in once you 
unblock. There is no general answer there. This is why interruptible I/O 
(which seemed like a great idea at the time) was only ever partially 
implemented and then phased out, to be replaced with 
InterruptibleChannel, which explicitly states that if you get 
interrupted then the channel gets closed - so it is very clear on what 
state things are left in. That is good, but it is still hard to actually 
program against (thread independence and isolation are your friends here).
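
[Editorial sketch] The interrupt-closes-the-channel contract can be seen with a java.nio Pipe, whose source channel is an InterruptibleChannel. This is a minimal sketch, not anything from the original discussion; the class and method names (InterruptibleReadSketch, interruptClosesChannel) are made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical class and method names; only the java.nio calls are real API.
final class InterruptibleReadSketch {
    /**
     * Returns true if interrupting a thread blocked in read() raised
     * ClosedByInterruptException and left the channel closed.
     */
    static boolean interruptClosesChannel() {
        try {
            Pipe pipe = Pipe.open();
            Pipe.SourceChannel source = pipe.source();
            AtomicBoolean sawClosedByInterrupt = new AtomicBoolean(false);

            Thread reader = new Thread(() -> {
                try {
                    // Blocks: nothing is ever written to the sink side.
                    source.read(ByteBuffer.allocate(16));
                } catch (ClosedByInterruptException e) {
                    sawClosedByInterrupt.set(true);
                } catch (IOException e) {
                    // Some other I/O failure: leave the flag false.
                }
            });
            reader.start();

            // If the interrupt lands before read() is entered, read() still
            // closes the channel and throws immediately, so there is no race.
            reader.interrupt();
            reader.join(5000);

            boolean closed = !source.isOpen();
            pipe.sink().close();
            return sawClosedByInterrupt.get() && closed;
        } catch (IOException | InterruptedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("interrupt closed channel: " + interruptClosesChannel());
    }
}
```

The state after the interrupt is well defined (the channel is closed), but the cost is that the channel cannot be reused, so any other thread sharing it is affected too.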

One thing new that virtual threads potentially bring to this space would 
be a way to say "never schedule this thread again". That would at least 
allow more defensive programming if you know what the scheduling points 
are - though time-preemptive scheduling would defeat that.

Cheers,
David

> On Thu, Sep 1, 2022 at 4:04 AM Ron Pressler <ron.pressler at oracle.com> wrote:
> 
>     See Alan’s response re the details of interruption. 
> 
> 
> Thread interrupts definitely make sense as the right way to unblock a 
> thread.
> 
> But will that ALWAYS work? No, currently, it does not (see previous 
> example).
> 
> Question - what are the blocking operations that can't be interrupted, 
> and would it be possible to include them as well somehow? This would be 
> one solution.
> 
>     When it comes to the question of forcibly killing threads, for it to
>     be generally useful, there must be limitations imposed on what the
>     threads can do to data that is accessed by other threads, as an
>     errant thread could otherwise harm other threads.
> 
> 
> If you're talking about releasing object monitors, then I agree with you 
> (were you thinking of others?). I don't think a "kill -9" style solution 
> works. Instead, you would need to do something like this:
> 
>    1. Ensure the thread wakes up if blocked (this is what I'm asking 
> about currently)
>    2. Trigger the throwing of ThreadDeath or similar exception to unwind 
> the stack
> 
> Step #2 could be via Thread.stop() or via bytecode rewriting (checks at 
> backward branches). Basically #1 is the only part that can't reliably be 
> solved today.
> 
>     The Java platform (and language) currently does not impose such an
>     “isolation” limitation, but some language targeting the Java
>     platform could, and so it could also emit interruption checks when
>     compiling to bytecode. A language that behaves in this way is Erlang
>     (although I don’t know if implementations targeting Java are
>     currently maintained), but even there there are pitfalls.
> 
> 
> You've got me curious... What are the additional pitfalls?
> 
>     A similar path is available to jshell as well, as it can emit
>     interruption checks in the bytecode.
> 
> 
> Yep. Currently it just resorts to ThreadGroup.stop()... obviously, it's 
> in the same boat here.
> 
> ---snip---
> 
> Transitioning now from the "how to stop a thread" discussion to the 
> larger "sandbox" discussion...
> 
>     Even if all “dangerous” APIs are blocked, any user code could
>     allocate as much memory as it likes, exhausting the memory available
>     for the entire process.
> 
> 
> Not necessarily. Bytecode rewriting could be used to track memory 
> allocations with weak references used to track deallocations. This would 
> impose a performance penalty of course, and the accounting wouldn't be 
> perfectly accurate, but you could make that inaccuracy conservative so 
> it would work for the purposes of containment.
> 
>     If it is also allowed to spawn platform threads at will, it can also
>     exhaust the CPU allocation for the entire process.
> 
> 
> Not sure I understand... with bytecode rewriting (e.g., adding checks at 
> backward branches), you should be able to limit the total CPU 
> utilization of a thread or group of threads. Kind of clunky for sure but 
> it should work, right?
> 
>     The Java runtime could be changed to support isolated heaps in
>     imitation of the isolation and memory restrictions offered by the
>     OS, but if the goal is to share some internal runtime data
>     structures for efficiency, there are ways to do that for multiple
>     processes. Given that the kernel has more arrows in its quiver to
>     support such isolation, including at the native code level, that is
>     probably the most appropriate level to provide it, I think, so much
>     so that trying to implement proper and secure isolation in
>     multi-user server programs in user mode is a fool’s errand. A
>     multi-user Java runtime running in *kernel* mode is a different
>     matter altogether, but it is currently beyond the scope of the
>     OpenJDK JDK. 
> 
> 
> OK, so let me throw out a not-so-hypothetical example. Oracle supports 
> Java stored procedures 
> <https://docs.oracle.com/database/121/JJDEV/chfive.htm#JJDEV13247>. I'm 
> not familiar with it, but is this not implemented as server-side 
> sandboxed code? Isn't that a valid motivating example?
> 
> The Java code may be prevented (via Java permissions) from doing any 
> file or network I/O, in which case the stuck thread problem gets a lot 
> easier (probably all blocking system calls require interaction with some 
> file, process, or network socket).
> 
> But I'm still curious how they stop runaway Java code or runaway memory 
> allocation?
> 
> -Archie
> 
> -- 
> Archie L. Cobbs