RFR: 8361716 : GCTraceCPUTime may report incorrect times during high load from JNI code [v2]

Stefan Johansson sjohanss at openjdk.org
Tue Dec 16 14:56:54 UTC 2025


On Tue, 16 Dec 2025 13:38:44 GMT, Jonas Norlinder <jnorlinder at openjdk.org> wrote:

>> GCTraceCPUTime is used by Serial, Parallel and G1 to estimate GC CPU time spent during safepoints. It estimates CPU time by sampling the CPU time for entire process by using `os::getTimesSecs`.
>> 
>> This will not be correct if native code from e.g. JNI is used and is incurring a high load on the machine as native code is not paused during safepoints.
>> 
>> A minimal example:
>> 
>>     public static void main(String[] args) {
>>         // Initial GC that clean up start-up garbage etc.
>>         System.gc();
>> 
>>         for (int i = 1; i <= 128; i++) {
>>             addLoad(); // Creates 128 threads, that spins in a loop
>>             try {
>>                 Thread.sleep(2000);
>>             } catch (InterruptedException e) {}
>>             System.gc();
>>         }
>>         System.exit(0);
>>     }
>> 
>> where addLoad() is a native call to a C++ program that create 128 pthreads that busy-spin forever with the largest nice value allowed.
>> 
>> In such scenario we can observe using G1
>> 
>> 
>> [0.026s][info][gc,cpu] GC(0) User=0.00s Sys=0.00s Real=0.01s
>> [2.103s][info][gc,cpu] GC(1) User=0.89s Sys=0.00s Real=0.03s
>> [4.129s][info][gc,cpu] GC(2) User=0.51s Sys=0.00s Real=0.01s
>> [6.195s][info][gc,cpu] GC(3) User=1.72s Sys=0.00s Real=0.06s
>> [8.244s][info][gc,cpu] GC(4) User=1.46s Sys=0.00s Real=0.05s
>> ...
>> [201.191s][info][gc,cpu] GC(76) User=37.91s Sys=0.00s Real=1.19s
>> [205.064s][info][gc,cpu] GC(77) User=52.44s Sys=0.00s Real=1.64s
>> [208.068s][info][gc,cpu] GC(78) User=23.49s Sys=0.00s Real=0.74s
>> [211.795s][info][gc,cpu] GC(79) User=46.96s Sys=0.00s Real=1.47s
>> 
>> 
>> Since the Java code does not do anything the time should be close to 0.
>> 
>> # Implementation Comment
>> 
>> Total CPU time for GC operations on the VM thread is obtained by cumulatively adding to a counter. Since system time is calculated (on Linux) with `total - user` is value contains a small error delta. If we continuously increment a value this error would grow to an unacceptable level. Therefore, the implementation will rely on the fact that no other VM operation may run during GC safepoint, making it safe to query the VM thread for it's detailed CPU time at the start and end of GC pause.
>
> Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix review comments

Looks good in general, some more comments.

src/hotspot/share/gc/shared/gcTraceTime.cpp line 86:

> 84:     cpu_time_t cpu_time_stringdedup = CPUTimeUsage::GC::detailed_stringdedup();
> 85:     _starting_user_time = cpu_time_vm.user + cpu_time_gc.user + cpu_time_stringdedup.user;
> 86:     _starting_system_time = cpu_time_vm.system + cpu_time_gc.system + cpu_time_stringdedup.system;

As I mentioned offline, maybe extract this to a helper that can also be used in the destructor. If you like the idea to use `cpu_time_t` to store the start measurement the helper could return a `cpu_time_t` as well.

src/hotspot/share/gc/shared/gcTraceTime.cpp line 101:

> 99:     cpu_time_t cpu_time_stringdedup = CPUTimeUsage::GC::detailed_stringdedup();
> 100: 
> 101:     double real_time = os::elapsedTime() - _starting_real_time;

You are subtracting the start time here and also in the if-statement below, I guess one of them should be removed.

src/hotspot/share/gc/shared/gcTraceTime.hpp line 40:

> 38:   bool _active;                 // True if times will be measured and printed
> 39:   jlong _starting_user_time;    // User time at start of measurement
> 40:   jlong _starting_system_time;  // System time at start of measurement

What do you think about just using `cpu_time_t` here?

src/hotspot/share/runtime/os.hpp line 175:

> 173: };
> 174: typedef struct cpu_time cpu_time_t;
> 175: 

I looked a bit closer at this now, I could make sense to use a more common Hotspot type here since this is nothing that we expect the JDK will use. I think that is the main reason we use `jlong` in `os::thread_cpu_time()`.

The new `detailed_thread_cpu_time()` is a Hotspot API and could then use Hotspot types, like `uint64_t`. But then we likely want to change this in the users of this class as well, for example `DetailedCPUTimeThreadClosure`, and I'm not sure how big the fan-out would be. Maybe we can change the type as a follow-up if more people agree with this.

-------------

Changes requested by sjohanss (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28824#pullrequestreview-3583442358
PR Review Comment: https://git.openjdk.org/jdk/pull/28824#discussion_r2623595381
PR Review Comment: https://git.openjdk.org/jdk/pull/28824#discussion_r2623604120
PR Review Comment: https://git.openjdk.org/jdk/pull/28824#discussion_r2623570439
PR Review Comment: https://git.openjdk.org/jdk/pull/28824#discussion_r2623553450


More information about the hotspot-gc-dev mailing list