Re: RFR: 8326446: The User and System of jdk.CPULoad on Apple M1 are inaccurate [v2]
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision: update copyright header ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17976/files - new: https://git.openjdk.org/jdk/pull/17976/files/20a3e526..4cf1dba3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17976&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17976&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17976.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17976/head:pull/17976 PR: https://git.openjdk.org/jdk/pull/17976
On Fri, 23 Feb 2024 08:21:11 GMT, Long Yang <duke@openjdk.org> wrote:
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time. it is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
Seems reasonable. Is there a risk of introducing a regression for existing platforms? Did the result on x64 turn out equivalent to pre-change? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1968812005
On Wed, 28 Feb 2024 11:46:35 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Seems reasonable.
Is there a risk of introducing a regression for existing platforms? Did the result on x64 turn out equivalent to pre-change?
I ran the pre- and post-change versions on MacOS X64, and I can see that the results are consistent. OS Information Time OS Version -------- -------------------------------------------------------------------------------------------------------------- 22:28:13 uname: Darwin 21.6.0 Darwin Kernel Version 21.6.0: Mon Apr 24 21:10:53 PDT 2023; root:xnu-8020.240.18.701.5... CPU Information Time Type Description Sockets Cores Hardware Threads -------- --------------------------------------- --------------------------------------- ------- ----- ---------------- 22:28:13 Intel Haswell (HT) SSE SSE2 SSE3 SSS... Brand: Intel(R) Core(TM) i7-4770HQ C... 1 4 8 CPU Load (pre fix, I ran the changed version first, so the time is newer) Time JVM User JVM System Machine Total ------------------ ------------------ -------------------- ----------------------- 22:28:15 10.41% 39.19% 76.21% 22:28:16 10.31% 39.46% 70.88% 22:28:17 10.30% 39.46% 68.58% 22:28:18 10.30% 39.47% 67.96% CPU Load (post fix) Time JVM User JVM System Machine Total ------------------ ------------------ -------------------- ----------------------- 22:25:58 10.40% 38.99% 82.34% 22:25:59 10.37% 39.50% 82.15% 22:26:00 10.27% 39.23% 68.65% 22:26:01 10.37% 39.63% 68.25% ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1969134580
On Fri, 23 Feb 2024 08:21:11 GMT, Long Yang <duke@openjdk.org> wrote:
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time. it is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
The units are changed from nanos to ticks, but this is opaque because doubles are used to represent only the ratios? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1968838102
On Wed, 28 Feb 2024 12:03:02 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
The units are changed from nanos to ticks, but this is opaque because doubles are used to represent only the ratios?
Yes ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1969125193
On Fri, 23 Feb 2024 08:21:11 GMT, Long Yang <duke@openjdk.org> wrote:
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time. it is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
Marked as reviewed by mgronlun (Reviewer). Thank you for confirming. ------------- PR Review: https://git.openjdk.org/jdk/pull/17976#pullrequestreview-1908615501 PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1970922323
On Thu, 29 Feb 2024 11:21:56 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
Thank you for confirming.
@mgronlun Thanks for your review. Does this patch need a second review ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1970960139
On Fri, 23 Feb 2024 08:21:11 GMT, Long Yang <duke@openjdk.org> wrote:
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time. it is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
I think it needs a second review. Also, I am not very versed in Mac internals. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1971076962
On Thu, 29 Feb 2024 12:52:31 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
I think it needs a second review. Also, I am not very versed in Mac internals.
Thanks. Would you help me to invite the second reviewer? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1971127435
On Fri, 23 Feb 2024 08:21:11 GMT, Long Yang <duke@openjdk.org> wrote:
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time. it is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
@egahlin Could I have a second review of this backport. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1975867030
On Mon, 4 Mar 2024 07:09:06 GMT, Long Yang <duke@openjdk.org> wrote:
@egahlin Could I have a second review of this patch.
Looks reasonable, but why was this removed: `if(!now_in_nanos(&total_cpu_nanos)) { return OS_ERR; }` ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1982079941
On Fri, 23 Feb 2024 08:21:11 GMT, Long Yang <duke@openjdk.org> wrote:
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time. it is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
Because the [times](https://man7.org/linux/man-pages/man2/times.2.html) libc api returns the real time of the process. So we don't need to calculate the `total_cpu_nanos`, just use the return value of `times`. struct tms buf; clock_t jvm_real = times(&buf); if (jvm_real == (clock_t) (-1)) { return OS_ERR; } ... uint64_t delta = active_processor_count * (jvm_real - _jvm_real); ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1982198946
On Thu, 7 Mar 2024 01:57:24 GMT, Long Yang <duke@openjdk.org> wrote:
Because the [times](https://man7.org/linux/man-pages/man2/times.2.html) libc api returns the real time of the process.
OK, I had a vague memory about some other issue, but I don't remember and it probably doesn't apply. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1983635869
On Fri, 23 Feb 2024 08:21:11 GMT, Long Yang <duke@openjdk.org> wrote:
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time. it is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
Marked as reviewed by egahlin (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17976#pullrequestreview-1922606074
On Fri, 23 Feb 2024 08:21:11 GMT, Long Yang <duke@openjdk.org> wrote:
hi
I would like to fix this.
As the description in [JDK-8326446](https://bugs.openjdk.org/browse/JDK-8326446). JFR uses task_info() with flavor TASK_ABSOLUTETIME_INFO to read User and System time. it is not reliable on Apple m1.
Libc provides the [times](https://man7.org/linux/man-pages/man2/times.2.html) function, which uses TASK_BASIC_INFO_COUNT and TASK_THREAD_TIMES_INFO_COUNT. It will also return the real time of the process, and the time unit is the same, which is suitable for solving this problem.
I ran test/jdk/jdk/jfr/event/os/TestCPULoad.java and passed.
I would appreciate it if you could review this.
Long Yang has updated the pull request incrementally with one additional commit since the last revision:
update copyright header
Thank you all. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17976#issuecomment-1984915907
participants (3)
-
Erik Gahlin
-
Long Yang
-
Markus Grönlund