<div dir="ltr"><br id="gmail-docs-internal-guid-88db2ac3-7fff-f03b-e5f4-b681fbb0be20"><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Controlled concurrency testing (CCT) is not only about testing but also about debugging. While the JDK issue mentioned in the blog post isn't technically a bug, deterministic testing helps answer crucial questions like: Is my application buggy? Is there an issue with my library? Am I using the wrong library? The key advantage is that once you identify a bug, you can replay it deterministically every time.</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">CCT benefits both single-process concurrent systems and distributed systems by systematically exploring different thread interleavings, which accelerates race condition discovery. The Go race detector only finds data races, while many deterministic testing frameworks are designed to find a broader range of race conditions. </span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Coincidentally, I'm currently interning at <a href="https://antithesis.com/">Antithesis</a>, where we're exploring the integration of Fray with the deterministic hypervisor to combine the strengths of both approaches. 
Fray excels at exploring thread interleavings but doesn't handle network traffic, file I/O, or other sources of non-determinism. Meanwhile, a deterministic hypervisor provides a deterministic environment but relies on the kernel schedulers to run concurrent programs (which could be less efficient).</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">I really like the idea of using a customized user-space scheduler for CCT. I know </span><a href="https://github.com/awslabs/shuttle" style="text-decoration:none"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(17,85,204);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap">shuttle</span></a><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"> and </span><a href="https://github.com/tokio-rs/loom" style="text-decoration:none"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(17,85,204);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap">loom-rs</span></a><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"> are doing this. One challenge I’m thinking about is that applications may mix green threads with physical threads. This could get messy because the CCT framework itself relies on concurrency primitives. 
</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">From my experience, handling all non-determinism within the JVM is particularly challenging. When I attempted this approach in Fray, it made the framework both cumbersome and fragile. In contrast, I'm impressed by how elegantly Antithesis's deterministic hypervisor solves these issues for Fray. For network and I/O operations, Fray can simply mark threads as blocked when applications perform network operations and unblock them upon completion while running inside the hypervisor.</span><a href="https://github.com/cmu-pasta/fray/blob/main/core/src/main/kotlin/org/pastalab/fray/core/controllers/ReactiveNetworkController.kt" style="text-decoration:none"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"> </span><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(17,85,204);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap">https://github.com/cmu-pasta/fray/blob/main/core/src/main/kotlin/org/pastalab/fray/core/controllers/ReactiveNetworkController.kt</span></a><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"> You may compare it against a version where Fray tries to manage the network IO itself: </span><a 
href="https://github.com/cmu-pasta/fray/blob/main/core/src/main/kotlin/org/pastalab/fray/core/controllers/ProactiveNetworkController.kt" style="text-decoration:none"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(17,85,204);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap">https://github.com/cmu-pasta/fray/blob/main/core/src/main/kotlin/org/pastalab/fray/core/controllers/ProactiveNetworkController.kt</span></a></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">For an open-source solution, <a href="https://github.com/rr-debugger/rr">RR</a>+Fray sounds promising, but I haven’t tried it myself. Of course, this cannot test distributed systems. 
</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"><br></span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial,sans-serif;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Ao</span></p><br></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Sat, Jul 5, 2025 at 4:26 PM Ryan Yeats &lt;<a href="mailto:ryeats@gmail.com">ryeats@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hi, </div>I have also been looking into deterministic simulation on the side and belatedly saw some interest in it on the loom email list with the "Custom scheduler: Customize current time and timed waits" thread. I wanted to briefly expand on the topic of using Loom to make deterministic simulation testing more easily available to the Java community.<div><br></div><div><div>BLUF:
Deterministic simulation testing makes multi-threaded logic single-threaded and fuzz-tests the execution order, greatly simplifying the discovery of race conditions and deadlocks. See this blog post by <a href="https://aoli.al/blogs/jdk-bug/" target="_blank">Ao Li</a> for an example of doing this in Java, and the paper behind it, <a href="https://arxiv.org/abs/2501.12618" target="_blank">Fray: An Efficient General-Purpose Concurrency Testing Platform for the JVM</a>.</div><div><br></div><div>The above explanation omits how hard it is to actually make Java, or any language, deterministic. This is where <a href="https://jbaker.io/2022/05/09/project-loom-for-distributed-systems/" target="_blank">James Baker had a clever idea</a> of using Loom: if we replace the virtual thread scheduler with a deterministic one, Loom does most of the work for us without any extra frameworks. This works brilliantly; however, there are three main caveats:</div><div><ol><li>It's not easy to replace the virtual thread scheduler.</li><li>The aforementioned scheduler has no understanding of delays, so any Thread.sleep or Object.wait reintroduces continuations into the scheduled execution pool non-deterministically. This affects the execution order, so bugs can't always be reproduced.</li><li>IO also reintroduces continuations into the scheduler's execution pool non-deterministically.</li></ol><div>The first issue can currently be solved by using reflection to make the <a href="https://github.com/ryeats/loom-dst/blob/main/src/main/java/org/example/SchedulableVirtualThreadFactory.java" target="_blank">virtual thread constructor public</a>. 
The second issue can currently be solved by controlling system time, rewriting the bytecode of all calls to System.nanoTime(), System.currentTimeMillis(), and Instant.now() <a href="https://github.com/cmu-pasta/fray/blob/main/instrumentation/base/src/main/kotlin/org/pastalab/fray/instrumentation/base/visitors/TimeInstrumenter.kt" target="_blank">using an agent at runtime</a>. The third likely has no general solution, but I am interested in hearing ideas.</div></div><div><br></div><div>Even if there is no solution to IO non-determinism and developers have to stub out all IO, deterministic simulation is still an incredibly promising tool for making concurrent and distributed systems programming much simpler and safer. Given this use case, I think there would be a lot of benefit if the Loom API allowed instrumenting the scheduler, and even more if we could access the DELAYED_TASK_SCHEDULERS, which understand delays. </div><div><br></div><div>Thank you for your time. I have been super excited to use virtual threads and amazed by the ingenuity that brought them to Java.</div><div><br></div><div>Ryan</div><div><br></div><div><br></div></div></div>
</blockquote></div>
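<div dir="ltr"><p dir="ltr">As a rough illustration of the idea discussed in this thread (run multi-threaded logic on a single thread, let a seeded scheduler choose the execution order, and replay a failing order from its seed), here is a minimal toy sketch. It is not how Fray, Loom, or the Antithesis hypervisor work internally: the class <code>SeededInterleaving</code>, its <code>Result</code> type, and the two-step "read then write" workers are all hypothetical names and assumptions made up for this example.</p>

```java
import java.util.Random;

/** Toy seeded scheduler; every name here is hypothetical, not a Fray or Loom API. */
final class SeededInterleaving {

    /** Outcome of one run: the order in which workers were stepped, and the final counter. */
    static final class Result {
        final String trace;
        final int value;

        Result(String trace, int value) {
            this.trace = trace;
            this.value = value;
        }
    }

    /**
     * Steps `workers` logical threads on the calling thread. Each worker performs a
     * non-atomic increment split into two steps: read the counter, then write read + 1.
     * A Random seeded with `seed` is the only source of scheduling decisions.
     */
    static Result run(long seed, int workers) {
        Random rng = new Random(seed);
        int counter = 0;
        int[] read = new int[workers];     // value each worker saw in its read step
        int[] pc = new int[workers];       // 0 = read next, 1 = write next, 2 = finished
        int[] runnable = new int[workers]; // scratch buffer of workers with steps left
        StringBuilder trace = new StringBuilder();

        while (true) {
            // Collect the workers that still have a step left.
            int n = 0;
            for (int i = 0; i < workers; i++) {
                if (pc[i] != 2) {
                    runnable[n++] = i;
                }
            }
            if (n == 0) {
                break;
            }
            // The seeded choice is what makes the interleaving replayable.
            int w = runnable[rng.nextInt(n)];
            if (pc[w] == 0) {
                read[w] = counter;
            } else {
                counter = read[w] + 1;
            }
            pc[w]++;
            trace.append(w);
        }
        return new Result(trace.toString(), counter);
    }

    public static void main(String[] args) {
        // Search a few seeds for a lost update, then replay the failing seed.
        for (long seed = 0; seed < 100; seed++) {
            Result r = run(seed, 2);
            if (r.value != 2) {
                Result replay = run(seed, 2);
                System.out.println("seed " + seed + " schedule " + r.trace
                        + " lost an update; replay matches: " + r.trace.equals(replay.trace));
                return;
            }
        }
        System.out.println("no lost update found in the seeds tried");
    }
}
```

<p dir="ltr">Because the seed is the only input to the schedule, calling <code>run</code> twice with the same seed steps the workers in the same order. That is the replay property the thread is about. Sleeps, waits, and IO are deliberately left out of the sketch, and those are exactly the sources of non-determinism the quoted message lists as open problems.</p></div>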