RFR: 8372584: [Linux]: Replace reading proc to get thread user CPU time with clock_gettime [v3]

David Holmes dholmes at openjdk.org
Mon Dec 1 04:13:50 UTC 2025


On Fri, 28 Nov 2025 15:54:24 GMT, Jonas Norlinder <jnorlinder at openjdk.org> wrote:

>> Since kernel v2.6.12 the Linux ABI has supported encoding the clock type in the last three bits of a clockid. Setting these bits to 001 (CPUCLOCK_VIRT) makes the kernel return only user time. POSIX-compliant implementations of pthread_getcpuclockid for the Linux kernel default to constructing a clockid with 010 (CPUCLOCK_SCHED) set, which returns system+user time, as the POSIX standard mandates; see POSIX.1-2024/IEEE Std 1003.1-2024 §3.90. This patch joins the family of glibc, musl etc. that utilize this bit pattern.
>> 
>> This PR also results in improved performance and thus a reduced observer effect, especially for the 100th percentile (max).
>> 
>> Before patch:
>> 
>> Benchmark                  Mode      Cnt  Score    Error  Units
>> CPUTime.execute          sample  7506555  0.008 ±  0.001  ms/op
>> CPUTime.execute:p0.00    sample           0.008           ms/op
>> CPUTime.execute:p0.50    sample           0.008           ms/op
>> CPUTime.execute:p0.90    sample           0.008           ms/op
>> CPUTime.execute:p0.95    sample           0.008           ms/op
>> CPUTime.execute:p0.99    sample           0.012           ms/op
>> CPUTime.execute:p0.999   sample           0.015           ms/op
>> CPUTime.execute:p0.9999  sample           0.021           ms/op
>> CPUTime.execute:p1.00    sample           1.030           ms/op
>> 
>> 
>> After patch:
>> 
>> Benchmark                  Mode      Cnt   Score    Error  Units
>> CPUTime.execute          sample  8984189  ≈ 10⁻³           ms/op
>> CPUTime.execute:p0.00    sample           ≈ 10⁻³           ms/op
>> CPUTime.execute:p0.50    sample           ≈ 10⁻³           ms/op
>> CPUTime.execute:p0.90    sample           ≈ 10⁻³           ms/op
>> CPUTime.execute:p0.95    sample           ≈ 10⁻³           ms/op
>> CPUTime.execute:p0.99    sample            0.001           ms/op
>> CPUTime.execute:p0.999   sample            0.001           ms/op
>> CPUTime.execute:p0.9999  sample            0.006           ms/op
>> CPUTime.execute:p1.00    sample            0.054           ms/op
>> 
>> 
>> Testing: `java/lang/management/ThreadMXBean/ThreadUserTime.java` and the added microbenchmark.
>
> Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove unused imports

Overall looks good. I'd forgotten that I found out about this in 2018.

A few minor nits.

Can't really comment on the benchmark.

src/hotspot/os/linux/os_linux.cpp line 4961:

> 4959: }
> 4960: 
> 4961: // Since kernel v2.6.12 the Linux ABI have had support for encoding the clock types

Suggestion:

// Since kernel v2.6.12 the Linux ABI has had support for encoding the clock types

src/hotspot/os/linux/os_linux.cpp line 4968:

> 4966: // POSIX.1-2024/IEEE Std 1003.1-2024 §3.90.
> 4967: static clockid_t get_thread_clockid(Thread* thread, bool total, bool* success) {
> 4968:   constexpr clockid_t CLOCK_TYPE_MASK = 3;

Shouldn't the mask be covering 3-bits?
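
For context on the question above, here is a minimal sketch of the bit layout. It assumes the constants mirror the kernel's CPUCLOCK_* definitions: the two low bits hold the clock type and the third bit is the per-thread flag. The helper names and the use of int32_t in place of clockid_t are hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Values assumed to mirror the Linux kernel's CPUCLOCK_* definitions.
constexpr int32_t kClockTypeMask = 3;  // low two bits: clock type
constexpr int32_t kPerThreadFlag = 4;  // third bit: per-thread (vs. per-process) clock
constexpr int32_t kCpuClockVirt  = 1;  // user time only
constexpr int32_t kCpuClockSched = 2;  // user + system time

// Hypothetical helper: build a per-thread CPU clockid from a kernel thread id.
// The id is stored bitwise-inverted in the bits above the low three.
inline int32_t make_thread_clockid(int32_t tid, int32_t type) {
  const uint32_t bits = (~static_cast<uint32_t>(tid) << 3) |
                        static_cast<uint32_t>(kPerThreadFlag | type);
  return static_cast<int32_t>(bits);
}

// Hypothetical helper: swap the clock type while keeping every other bit,
// including the per-thread flag and the encoded thread id.
constexpr int32_t with_clock_type(int32_t clockid, int32_t type) {
  return (clockid & ~kClockTypeMask) | type;
}
```

Under that layout a mask of 3 selects only the clock type, while a mask of 7 would also cover the per-thread flag.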

src/hotspot/os/linux/os_linux.cpp line 4979:

> 4977:     // to detach itself from the VM - which should result in ESRCH.
> 4978:     assert_status(rc == ESRCH, rc, "pthread_getcpuclockid failed");
> 4979:     *success = false;

The normal way I've seen this pattern used is to set it to true explicitly on the success path rather than assuming it was true to begin with.
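
A minimal sketch of that out-parameter convention, using a hypothetical stand-in function rather than the patch's `get_thread_clockid`: the callee writes the flag on every path, so the caller's initial value never matters.

```cpp
#include <cassert>

// Hypothetical lookup that can fail; illustrates the pattern only.
static int lookup_value(int key, bool* success) {
  if (key < 0) {
    *success = false;  // failure path reports failure
    return 0;
  }
  *success = true;     // success path sets the flag explicitly
  return key * 2;
}
```

With this shape a caller can pass an uninitialized or stale `bool` and still read a reliable result afterwards.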

-------------

PR Review: https://git.openjdk.org/jdk/pull/28556#pullrequestreview-3523063061
PR Review Comment: https://git.openjdk.org/jdk/pull/28556#discussion_r2575538639
PR Review Comment: https://git.openjdk.org/jdk/pull/28556#discussion_r2575548230
PR Review Comment: https://git.openjdk.org/jdk/pull/28556#discussion_r2575553017

