Custom scheduler: Using loom for deterministic simulation testing

Sat Jul 5 20:25:44 UTC 2025

Hi,
I have also been looking into deterministic simulation on the side and
belatedly saw some interest in it on the loom-dev mailing list in the
"Custom scheduler: Customize current time and timed waits" thread. I wanted
to briefly expand on the topic of using Loom to make deterministic
simulation testing more easily available to the Java community.

BLUF: Deterministic simulation testing makes multithreaded logic
single-threaded and fuzz tests the execution order, greatly simplifying
discovery of race conditions and deadlocks. See this blog post by Ao Li
<https://aoli.al/blogs/jdk-bug/> for an example of doing this in Java, and
the paper behind it, Fray: An Efficient General-Purpose Concurrency Testing
Platform for the JVM <https://arxiv.org/abs/2501.12618>.
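
To make the idea concrete, here is a minimal sketch of "single-threaded
plus fuzzed execution order" in plain Java. It is not how Fray or Loom
implement it; SeededExecutor and SeededExecutorDemo are hypothetical names
made up for illustration. Tasks are only queued on submission, and a seeded
Random picks which one runs next, so the same seed replays the same order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Executor;

/** Hypothetical single-threaded executor; run order is fixed by the seed. */
final class SeededExecutor implements Executor {
    private final List<Runnable> ready = new ArrayList<>();
    private final Random random;

    SeededExecutor(long seed) {
        this.random = new Random(seed);
    }

    @Override
    public void execute(Runnable task) {
        ready.add(task); // queue only; nothing runs until drain()
    }

    /** Runs tasks one at a time on the calling thread until none remain. */
    void drain() {
        while (!ready.isEmpty()) {
            // The seeded Random is the only source of ordering decisions.
            Runnable next = ready.remove(random.nextInt(ready.size()));
            next.run(); // may call execute() again and enqueue more work
        }
    }
}

final class SeededExecutorDemo {
    /** Returns the order in which five tasks ran for the given seed. */
    static List<Integer> runOrder(long seed) {
        SeededExecutor executor = new SeededExecutor(seed);
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            int id = i;
            executor.execute(() -> order.add(id));
        }
        executor.drain();
        return order;
    }

    public static void main(String[] args) {
        System.out.println(runOrder(42L));
        System.out.println(runOrder(42L)); // same seed, same order as above
        System.out.println(runOrder(7L));  // another seed explores another order
    }
}
```

A failing seed can be logged and replayed, which is what makes a race found
this way reproducible.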

The above explanation omits how hard it is to actually make Java, or any
language, deterministic. This is where James Baker had a clever idea
<https://jbaker.io/2022/05/09/project-loom-for-distributed-systems/> of
using Loom: if we replace the virtual thread scheduler with a deterministic
one, Loom does most of the work for us without any extra frameworks. This
works brilliantly; however, there are three main caveats:

   1. It's not easy to replace the virtual thread scheduler.
   2. The aforementioned scheduler has no understanding of delays so any
   Thread.sleep or Object.wait introduces continuations back into the
   scheduled execution pool non-deterministically. This affects the execution
   order so bugs can't always be reproduced.
   3. IO also introduces continuations back to the scheduler execution pool
   non-deterministically.

The first issue can currently be solved by using reflection to make the
virtual thread constructor accessible
<https://github.com/ryeats/loom-dst/blob/main/src/main/java/org/example/SchedulableVirtualThreadFactory.java>.
The second issue can currently be solved by controlling system time:
an agent replaces the bytecode for all calls to System.nanoTime(),
System.currentTimeMillis() and Instant.now() at runtime
<https://github.com/cmu-pasta/fray/blob/main/instrumentation/base/src/main/kotlin/org/pastalab/fray/instrumentation/base/visitors/TimeInstrumenter.kt>.
The third likely has no general solution, but I am interested in hearing
ideas.
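
As a rough illustration of what "controlling system time" buys, below is a
small sketch of a simulated clock with a delay queue. It is not Fray's
instrumentation and not a Loom API; SimulatedClock, its methods and
SimulatedClockDemo are hypothetical names. The assumption is that
instrumented time calls and timed waits would be redirected to something
like this, so time only advances when the test decides it does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical simulated clock: time moves only when runNext() is called. */
final class SimulatedClock {
    private record Timer(long deadlineNanos, long sequence, Runnable task) {}

    // Order by deadline, then by submission order, so ties are deterministic.
    private final PriorityQueue<Timer> timers = new PriorityQueue<>(
            Comparator.comparingLong(Timer::deadlineNanos)
                    .thenComparingLong(Timer::sequence));
    private long nowNanos;
    private long nextSequence;

    /** Stand-in for what an instrumented System.nanoTime() could return. */
    long nanoTime() {
        return nowNanos;
    }

    /** Registers a task to run once simulated time reaches now + delayNanos. */
    void schedule(long delayNanos, Runnable task) {
        timers.add(new Timer(nowNanos + delayNanos, nextSequence++, task));
    }

    /** Jumps to the earliest pending deadline and runs it; false if none. */
    boolean runNext() {
        Timer timer = timers.poll();
        if (timer == null) {
            return false;
        }
        nowNanos = Math.max(nowNanos, timer.deadlineNanos());
        timer.task().run();
        return true;
    }
}

final class SimulatedClockDemo {
    /** Schedules three delayed tasks and returns the order they fired in. */
    static List<String> run() {
        SimulatedClock clock = new SimulatedClock();
        List<String> log = new ArrayList<>();
        clock.schedule(2_000_000L, () -> log.add("B@" + clock.nanoTime()));
        clock.schedule(1_000_000L, () -> log.add("A@" + clock.nanoTime()));
        clock.schedule(2_000_000L, () -> log.add("C@" + clock.nanoTime()));
        while (clock.runNext()) {
            // keep firing timers until the queue is empty
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [A@1000000, B@2000000, C@2000000]
    }
}
```

No real time passes, and the firing order depends only on the deadlines and
the submission order, never on wall-clock jitter.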

Even if there is no solution to IO non-determinism and developers have to
stub out all IO, deterministic simulation is still an incredibly promising
tool for making concurrent and distributed systems programming much simpler
and safer. Because of this use case, I think there would be a lot of benefit
if the Loom API allowed instrumenting the scheduler, and even more if we
could access the DELAYED_TASK_SCHEDULERS, which understand delays.

Thank you for your time. I have been super excited to use virtual threads
and am amazed by the ingenuity that brought them to Java.

Ryan