RFR: 8323807: Async UL: Add a stalling mode to async UL
Johan Sjölen
jsjolen at openjdk.org
Mon Dec 16 17:16:48 UTC 2024
Hi,
In January of this year I took a stab at implementing a stalling mode for UL, see link: https://github.com/openjdk/jdk/pull/17757 . I also talked about this feature in the mailing lists and seemed to receive positive feedback. With that PR, I also implemented a circular buffer. This PR didn't go through because 1. The stalling mode was broken 2. The complexity was a bit too large imho.
This PR does a much smaller change by only focusing on implementing the actual stalling.
The addition in terms of command line changes are the same as before, you can now specify the mode of your async logging:
$ java -Xlog:async:drop # Dropping mode, same as today
$ java -Xlog:async:stall # Stalling mode!
$ java -Xlog:async # Dropping mode by default still
The change in protocol is quite simple. If a producer thread `P` cannot fit a message into the buffer, it `malloc`s a message and exposes it via a shared pointer. It blocks all other producer threads from writing into the buffer. At the same time, the consumer thread (`AsyncLogWriter`) will perform all writing. When the consumer thread has emptied the write buffer, it writes the stalled message, notifies `P` and releases all locks. `P` then let's all other producer threads continue.
We do this by having two locks: `Outer` and `Inner`. In our example above, `P` prevents any other producers from progressing by holding the outer lock, but allows the consumer thread to progress by releasing the inner lock.
In pseudo-code we have something like this in the stalling case.
void produce() {
OuterLock olock;
InnerLock ilock;
bool out_of_memory = attempt_produce(shared_buffer);
if (out_of_memory) {
pmsg = new Message();
shared_message = pmsg;
while (shared_message != nullptr) ilock.wait();
free(pmsg);
}
}
void consume() {
InnerLock ilock;
consume(shared_buffer);
if (shared_message != nullptr) {
consume(shared_message);
ilock.notify();
}
}
*Note!* It is very important that the consumer prints all output found in the buffer before printing the stalled message. This is because logging is output in Program Order. In other words: `print(m0); print(m1);` means that `m0` must appear before `m1` in the log file.
*Note!* Yes, we do force *all* threads to stall before the original stalled message has been printed. This isn't optimal, but I still have hope that we can switch to a faster circular buffer in the future (actually, I've got that code too). If logs are written to a SSD with a ~700MiB/s sequential write speed, then this will be negligible if stalls are not very common.
# Some benchmarks
I did some basic benchmarking with Spring Boot Hello World and performing some HTTP requests while doing a JFR recording.
I ran the following command for drop, similar for stall and sync:
$ /mvnw spring-boot:run -Dspring-boot.run.jvmArguments='-XX:StartFlightRecording=filename=/home/johan/dropping.jfr -Xlog:async:drop -Xlog:all=debug:file=droppinglog::filecount=0'
And I stressed the server by performing 100 000 requests over 100 concurrent connections with 4 threads participating.
The result speed wise is:
# Stalling
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
seconds: 6.476
requests/sec: 15441.263
# Dropping
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
seconds: 4.937
requests/sec: 20255.733
# Sync
requests: 100000
good requests: 100000 [100%]
bad requests: 0 [0%]
seconds: 27.899
requests/sec: 3584.328
The size of the `stallinglog` is `501MiB` and the size of the `droppinglog` is `292MiB`, so the amount written is quite different. Note: We're logging 501MiB over a period of <7 seconds, this is a huge amount of logging.
JFR measurements show the following results in JMC.
## Stalling

## Dropping

## Sync

I believe that this form of stalling is sufficiently performant to serve as a middle-of-the-road choice for those who do not want to lose any log messages yet have fairly small safepoints.
Best regards,
Johan
-------------
Commit messages:
- Implement stalling mode for ping-pong buffers
Changes: https://git.openjdk.org/jdk/pull/22770/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22770&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8323807
Stats: 133 lines in 6 files changed: 94 ins; 5 del; 34 mod
Patch: https://git.openjdk.org/jdk/pull/22770.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/22770/head:pull/22770
PR: https://git.openjdk.org/jdk/pull/22770
More information about the hotspot-runtime-dev
mailing list