From swen at openjdk.org  Thu May  1 17:35:56 2025
From: swen at openjdk.org (Shaojin Wen)
Date: Thu, 1 May 2025 17:35:56 GMT
Subject: RFR: 8356044: Use Double::hashCode and Long::hashCode in
 java.vm.ci.meta
Message-ID: <8SlBOjUBPGyZbR9GxEBZlLzOiNPbdws1GTZ4gGY8v9c=.fdefa26b-52ee-48f9-b814-3981b79f6012@github.com>

Similar to #24959, #24971, and #24987, AbstractProfiledItem and PrimitiveConstant in java.vm.ci.meta can be simplified in the same way.

Replace the manual bitwise operations in the hashCode implementations of java.vm.ci.meta.AbstractProfiledItem and java.vm.ci.meta.PrimitiveConstant with Long::hashCode and Double::hashCode.
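
For illustration, here is a minimal sketch of the kind of hand-written folding that `Long.hashCode(long)` and `Double.hashCode(double)` replace. The class and method names are hypothetical, and the message does not quote the actual JVMCI code, so the "manual" variants below are only the commonly used idiom, not necessarily what the patch removes.

```java
// Hypothetical demo class; not part of the JDK or of the patch.
class HashCodeSketch {
    // Hand-written folding of a 64-bit value into 32 bits.
    static int manualLongHash(long value) {
        return (int) (value ^ (value >>> 32));
    }

    // Same folding applied to the IEEE 754 bit pattern of a double.
    static int manualDoubleHash(double value) {
        long bits = Double.doubleToLongBits(value);
        return (int) (bits ^ (bits >>> 32));
    }

    public static void main(String[] args) {
        long[] longs = {0L, 1L, -1L, Long.MIN_VALUE, 0x123456789ABCDEF0L};
        for (long v : longs) {
            // The library method is specified to compute the same fold.
            System.out.println(manualLongHash(v) == Long.hashCode(v));
        }
        double[] doubles = {0.0, -0.0, 1.5, Double.NaN, Double.POSITIVE_INFINITY};
        for (double d : doubles) {
            System.out.println(manualDoubleHash(d) == Double.hashCode(d));
        }
    }
}
```

Both static methods have been available since Java 8 and are documented to return the same value as the fold above, so a replacement of this kind should not change any hash values.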

-------------

Commit messages:
 - Use Double::hashCode & Long::hashCode

Changes: https://git.openjdk.org/jdk/pull/24988/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24988&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356044
  Stats: 8 lines in 2 files changed: 0 ins; 5 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/24988.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24988/head:pull/24988

PR: https://git.openjdk.org/jdk/pull/24988

From jbhateja at openjdk.org  Fri May  2 07:50:27 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 2 May 2025 07:50:27 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v7]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <tHgrkrZ7PBwN8vH4utPb66B46LuB2DhPYiX7boPkRME=.5e3bccb5-829f-4e9e-822c-44110e5b1889@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits:

 - Addressing code refactoring comments
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352675
 - Fix windows build
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352675
 - Add dynamic sized feature vectors
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352675
 - dropping unneeded feature enabling/checks
 - 8352675: Support Intel AVX10 converged vector ISA feature detection

-------------

Changes: https://git.openjdk.org/jdk/pull/24329/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=06
  Stats: 545 lines in 27 files changed: 315 ins; 14 del; 216 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From mchevalier at openjdk.org  Fri May  2 08:07:57 2025
From: mchevalier at openjdk.org (Marc Chevalier)
Date: Fri, 2 May 2025 08:07:57 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
Message-ID: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>

A first step toward better support for pure functions.

## Pure Functions

Pure functions (as considered here) are functions that have no side effects, have no effect on control flow (no exceptions or the like), cannot deopt, etc. A pure function is one you can execute anywhere, with any arguments, with no effect other than wasting time. Integer division is not pure, because dividing by zero throws. But many floating-point functions just return `NaN` or `+/-infinity` in problematic cases.
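
To make the distinction concrete, here is a small Java-level sketch (the class and helper names are made up for illustration). It shows that integer division can alter control flow by throwing, while floating-point operations signal problematic inputs through their return value. This only illustrates the language-level behavior; it says nothing about how C2 models these operations internally.

```java
// Hypothetical demo class for illustration only.
class PuritySketch {
    // Returns true if a / b throws, i.e. the operation affects control flow.
    static boolean intDivisionThrows(int a, int b) {
        try {
            int unused = a / b; // may throw ArithmeticException when b == 0
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Integer division by zero throws, so it cannot be dropped or moved freely.
        System.out.println(intDivisionThrows(1, 0));

        // Floating-point functions report problematic inputs in the result instead.
        System.out.println(Double.isNaN(Math.sqrt(-1.0)));               // NaN
        System.out.println(Math.log(0.0) == Double.NEGATIVE_INFINITY);   // -infinity
        System.out.println(1.0 / 0.0 == Double.POSITIVE_INFINITY);       // +infinity
    }
}
```

If the result of such a floating-point call is never used, skipping the call is unobservable, which is the property the PR relies on.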

## Scope

We are not aiming for anything all-powerful for now! This PR is mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are deliberately not part of this PR. In particular, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes that are later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.

## Implementation Overview

We introduce a new kind of node for pure calls, which is expanded into a regular call during macro expansion. This also allows the removal of the `ModD` and `ModF` nodes, which now have pure equivalents. They are surprisingly hard to unify with other floating-point functions from an implementation point of view!

The IR framework and IGV also needed a few small fixes.

Thanks,
Marc

-------------

Commit messages:
 - Clean up IRNode
 - cleanup
 - hash and cmp
 - get_early_ctrl_for_expensive
 - depends_only_on_test
 - depends_only_on_test
 - First try

Changes: https://git.openjdk.org/jdk/pull/24966/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24966&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8347901
  Stats: 694 lines in 15 files changed: 449 ins; 226 del; 19 mod
  Patch: https://git.openjdk.org/jdk/pull/24966.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24966/head:pull/24966

PR: https://git.openjdk.org/jdk/pull/24966

From jbhateja at openjdk.org  Fri May  2 08:08:27 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 2 May 2025 08:08:27 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v8]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <w09msTYMH4mTht3ExiDf6AmrYeLEi-R0fgfLPpwaCB4=.ad2554fa-2ad7-49bd-ade9-43191ea07dc6@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Updating comment

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/04de0289..4a614be8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=06-07

  Stats: 5 lines in 1 file changed: 2 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From jbhateja at openjdk.org  Fri May  2 11:31:01 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 2 May 2025 11:31:01 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v9]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <w98GDb6NDmuCaqLQa2J4K9O4BtCiepKybqylTIFqxUs=.d90f5996-05c2-4279-8b27-38ab92cd40d3@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Refactoring code to create a separate VM_Features class

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/4a614be8..a9258174

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=08
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=07-08

  Stats: 63 lines in 3 files changed: 32 ins; 22 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From sviswanathan at openjdk.org  Fri May  2 20:54:47 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 2 May 2025 20:54:47 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v9]
In-Reply-To: <w98GDb6NDmuCaqLQa2J4K9O4BtCiepKybqylTIFqxUs=.d90f5996-05c2-4279-8b27-38ab92cd40d3@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <w98GDb6NDmuCaqLQa2J4K9O4BtCiepKybqylTIFqxUs=.d90f5996-05c2-4279-8b27-38ab92cd40d3@github.com>
Message-ID: <lKiwlSZEbVtX8pcXoGLs4-u3kJJsdtH_MW0g3eXFois=.6f8889e0-9e2d-4dd3-b413-cbaa2e121709@github.com>

On Fri, 2 May 2025 11:31:01 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   Refactoring code to create a separate VM_Features class

src/hotspot/cpu/x86/vm_version_x86.cpp line 464:

> 462:     __ movl(rcx, 0x18000000); // cpuid1 bits osxsave | avx
> 463:     __ andl(rcx, Address(rsi, 8)); // cpuid1 bits osxsave | avx
> 464:     __ jccb(Assembler::equal, done); // jump if AVX is not supported

This does not have the same effect as before. Consider an input of 0x10000000: with this code the andl result is non-zero, so the jump to done is not taken. Prior to this change, the cmpl with 0x18000000 fails the equality test, so the jump to done is taken. The same applies everywhere we check more than one set bit.
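
The difference can be sketched outside of assembly. The Java snippet below (hypothetical class and method names, for illustration only) contrasts an "any bit set" test, which is what a bare AND followed by a zero-flag check computes, with an "all bits set" test, which is what AND followed by a compare against the mask computes.

```java
// Hypothetical demo class; models the two checks at the Java level only.
class MaskCheckSketch {
    // cpuid1 bits osxsave | avx, as in the quoted code.
    static final int OSXSAVE_AVX_MASK = 0x18000000;

    // AND + test for non-zero: true if at least one masked bit is set.
    static boolean anyBitSet(int value, int mask) {
        return (value & mask) != 0;
    }

    // AND + compare with the mask: true only if every masked bit is set.
    static boolean allBitsSet(int value, int mask) {
        return (value & mask) == mask;
    }

    public static void main(String[] args) {
        int onlyAvx = 0x10000000; // one of the two required bits
        System.out.println(anyBitSet(onlyAvx, OSXSAVE_AVX_MASK));  // true
        System.out.println(allBitsSet(onlyAvx, OSXSAVE_AVX_MASK)); // false

        // For a single-bit mask the two checks agree, so the compare is redundant there.
        int singleBit = 0x08000000;
        System.out.println(anyBitSet(onlyAvx, singleBit) == allBitsSet(onlyAvx, singleBit)); // true
    }
}
```

With a two-bit mask the checks disagree whenever exactly one bit is set, which is the case raised above. With a single-bit mask they are equivalent.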

src/hotspot/cpu/x86/vm_version_x86.cpp line 468:

> 466:     __ movl(rax, 0x6);
> 467:     __ andl(rax, Address(rbp, in_bytes(VM_Version::xem_xcr0_offset()))); // xcr0 bits sse | ymm
> 468:     __ jccb(Assembler::notEqual, start_simd_check); // return if AVX is not supported

See the prior comment; the cmpl and jmp are needed here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072134109
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072136639

From vlivanov at openjdk.org  Fri May  2 20:59:46 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Fri, 2 May 2025 20:59:46 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v9]
In-Reply-To: <w98GDb6NDmuCaqLQa2J4K9O4BtCiepKybqylTIFqxUs=.d90f5996-05c2-4279-8b27-38ab92cd40d3@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <w98GDb6NDmuCaqLQa2J4K9O4BtCiepKybqylTIFqxUs=.d90f5996-05c2-4279-8b27-38ab92cd40d3@github.com>
Message-ID: <Ndhp-8O5cTRhYcahTk4L5Fqd0WHd01gmMMsbdJY_YEQ=.c277d7e3-a1ae-4350-aad0-3e85af29cbf3@github.com>

On Fri, 2 May 2025 11:31:01 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   Refactoring code to create a separate VM_Features class

Jatin, are you done with the refactorings?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24329#issuecomment-2848107604

From duke at openjdk.org  Fri May  2 22:31:00 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Fri, 2 May 2025 22:31:00 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v2]
In-Reply-To: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
Message-ID: <kP52md4jtegpdRjEhoCH39icyT6550HpKOiZb47lAlM=.cabb7671-3343-4dfd-aa0f-9143a2714c0e@github.com>

> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm.
> 
> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b15](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B15) as the baseline version.
> 
> For performance data collected with the built-in **cbrt** micro-benchmark, see the table below. Each result is the mean of 8 individual runs. Overall, the intrinsic provides a performance uplift of 41%.
> 
> | Benchmark        | Throughput with baseline (op/s) | Throughput with intrinsic (op/s) | Speedup |
> | :----------------: | :----------------------------------: | :----------------------------------: | :---------: |
> | MathBench.cbrt | 148242                                        | 209122                                        | 1.41x       |
> 
> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.

Mohamed Issa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits:

 - Merge branch 'openjdk:master' into user/missa-prime/cbrt
 - Change coeff_table alignment from 4 bytes to 16 bytes to conform with movapd instruction
 - Merge branch 'master' into user/missa-prime/cbrt
 - x86_64 intrinsic for cbrt using libm

-------------

Changes: https://git.openjdk.org/jdk/pull/24470/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=01
  Stats: 466 lines in 26 files changed: 453 ins; 1 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/24470.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24470/head:pull/24470

PR: https://git.openjdk.org/jdk/pull/24470

From jbhateja at openjdk.org  Sat May  3 07:26:29 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sat, 3 May 2025 07:26:29 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Review comments resolution

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/a9258174..051c416c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=09
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=08-09

  Stats: 8 lines in 1 file changed: 4 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From jbhateja at openjdk.org  Sat May  3 07:32:46 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sat, 3 May 2025 07:32:46 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v9]
In-Reply-To: <Ndhp-8O5cTRhYcahTk4L5Fqd0WHd01gmMMsbdJY_YEQ=.c277d7e3-a1ae-4350-aad0-3e85af29cbf3@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <w98GDb6NDmuCaqLQa2J4K9O4BtCiepKybqylTIFqxUs=.d90f5996-05c2-4279-8b27-38ab92cd40d3@github.com>
 <Ndhp-8O5cTRhYcahTk4L5Fqd0WHd01gmMMsbdJY_YEQ=.c277d7e3-a1ae-4350-aad0-3e85af29cbf3@github.com>
Message-ID: <Zrp8SQ85Xcdoq8QADyQlNEwIGRTSrsQ8cmISNAtKlhc=.e6dd22f8-9ef4-4258-9e7c-4f2365ac57b3@github.com>

On Fri, 2 May 2025 20:57:17 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> Jatin, are you done with the refactorings?

@iwanowww, I have addressed your comments. Let me know if you have further comments / feedback.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24329#issuecomment-2848484313

From jbhateja at openjdk.org  Sat May  3 07:32:47 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sat, 3 May 2025 07:32:47 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v9]
In-Reply-To: <lKiwlSZEbVtX8pcXoGLs4-u3kJJsdtH_MW0g3eXFois=.6f8889e0-9e2d-4dd3-b413-cbaa2e121709@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <w98GDb6NDmuCaqLQa2J4K9O4BtCiepKybqylTIFqxUs=.d90f5996-05c2-4279-8b27-38ab92cd40d3@github.com>
 <lKiwlSZEbVtX8pcXoGLs4-u3kJJsdtH_MW0g3eXFois=.6f8889e0-9e2d-4dd3-b413-cbaa2e121709@github.com>
Message-ID: <c_cWPj1DB-YnZuxE1qWVaBd3OPtR1cAtFrjj5-kHIvw=.8358e6c4-fd58-4ff6-8b01-29546cc80b0a@github.com>

On Fri, 2 May 2025 20:47:01 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>> 
>>   Refactoring code to create a separate VM_Features class
>
> src/hotspot/cpu/x86/vm_version_x86.cpp line 464:
> 
>> 462:     __ movl(rcx, 0x18000000); // cpuid1 bits osxsave | avx
>> 463:     __ andl(rcx, Address(rsi, 8)); // cpuid1 bits osxsave | avx
>> 464:     __ jccb(Assembler::equal, done); // jump if AVX is not supported
> 
> This does not have the same effect as before. Consider an input of 0x10000000: with this code the andl result is non-zero, so the jump to done is not taken. Prior to this change, the cmpl with 0x18000000 fails the equality test, so the jump to done is taken. The same applies everywhere we check more than one set bit.

Thanks @sviswa7. The sub-optimality was mainly around single-bit comparisons, where we can drop the redundant CMP after the AND by flipping the predicate of the subsequent flag-consuming JMP. Multi-bit compares should remain unaltered.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072341101

From vlivanov at openjdk.org  Sat May  3 07:44:48 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Sat, 3 May 2025 07:44:48 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
Message-ID: <V7WbvFgHqGmEcatKP9sIZApYAIdVjED16hqGNrUM1vM=.cccbbe34-5b13-4b38-9421-444050847951@github.com>

On Sat, 3 May 2025 07:26:29 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Review comments resolution

Ok, thanks! I wasn't sure you finished the pass.

I'm still seeing dynamic memory allocation, which IMO unnecessarily complicates the implementation. The bitmap size is fixed and well-known at compile time. That lets the `VM_Feature` class embed an array of the proper size inline, and it eliminates all the problems related to undesired sharing of the backing array. (Also, `pre_initialize()` is not needed either.)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24329#issuecomment-2848488960

From jbhateja at openjdk.org  Sat May  3 07:54:46 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sat, 3 May 2025 07:54:46 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <V7WbvFgHqGmEcatKP9sIZApYAIdVjED16hqGNrUM1vM=.cccbbe34-5b13-4b38-9421-444050847951@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <V7WbvFgHqGmEcatKP9sIZApYAIdVjED16hqGNrUM1vM=.cccbbe34-5b13-4b38-9421-444050847951@github.com>
Message-ID: <pMCizZGRrUCno2iOhfBp2dpdf4Epk92ZnPt4iv2ZYKw=.9d5ab0d7-414f-495e-ade0-780666f3aef6@github.com>

On Sat, 3 May 2025 07:41:43 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> Ok, thanks! I wasn't sure you finished the pass.
> 
> I'm still seeing dynamic memory allocation, which IMO unnecessarily complicates the implementation. The bitmap size is fixed and well-known at compile time. That lets the `VM_Feature` class embed an array of the proper size inline, and it eliminates all the problems related to undesired sharing of the backing array. (Also, `pre_initialize()` is not needed either.)

pre_initialize was put in place because codeCache_init() precedes VM_Version_init() and calls some assembler routines that check for the existence of certain target features. It's an ordering issue; pre_initialize simply allocates the feature vector up front to prevent a crash.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24329#issuecomment-2848492777

From vlivanov at openjdk.org  Sat May  3 07:54:47 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Sat, 3 May 2025 07:54:47 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
Message-ID: <JaCbcvlFUh6OYW6IMo-E4HCYKaZTee-51AzITXR5bbk=.4a017138-896d-439f-9431-092f70511b71@github.com>

On Sat, 3 May 2025 07:26:29 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Review comments resolution

src/hotspot/cpu/x86/vm_version_x86.cpp line 2867:

> 2865: 
> 2866: uint64_t VM_Version::CpuidInfo::feature_flags() const {
> 2867:   uint64_t result = 0;

It's unfortunate you migrated away from operating on a local copy. Why don't you declare a local copy (`VM_Version result`) and migrate bit manipulation to bit field accessors on it? `VM_Version::CpuidInfo::feature_flags()` can still return it by value (once you get rid of heap memory allocation, copying becomes trivial).

src/hotspot/share/runtime/abstract_vm_version.hpp line 88:

> 86:   static VM_Features _dynamic_cpu_features;
> 87: 
> 88: #define SET_CPU_FEATURE(feature) \

Why don't you supersede macros with instance methods on `VM_Version` instead?
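
To make the two suggestions above concrete, here is a minimal sketch, assuming a much simplified feature set. `Feature`, `FeatureSet` and `detect_features` are hypothetical names chosen for illustration and are not the declarations in the patch. It builds the result on a local object through explicit accessors and returns it by value, instead of updating a static field through macros:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not the HotSpot declarations.
enum Feature : unsigned { FEAT_SSE = 0, FEAT_AVX = 1, FEAT_AVX10_1 = 2, FEAT_COUNT };

class FeatureSet {
  uint64_t _bits = 0;  // one word is enough for this sketch
 public:
  // Instance methods take the place of SET/CLEAR/SUPPORTS style macros.
  void set(Feature f)            { _bits |= (uint64_t(1) << f); }
  void clear(Feature f)          { _bits &= ~(uint64_t(1) << f); }
  bool supports(Feature f) const { return (_bits & (uint64_t(1) << f)) != 0; }
};

// Operates on a local copy and returns it by value. With no heap
// allocation behind FeatureSet, the copy is a trivial word copy.
FeatureSet detect_features(bool has_avx, bool has_avx10_1) {
  FeatureSet result;
  result.set(FEAT_SSE);
  if (has_avx)     result.set(FEAT_AVX);
  if (has_avx10_1) result.set(FEAT_AVX10_1);
  return result;
}
```

The caller decides where the returned value is stored, so the detection routine itself has no hidden side effects on shared state.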

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072344671
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072343204

From jbhateja at openjdk.org  Sat May  3 07:57:45 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sat, 3 May 2025 07:57:45 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <pMCizZGRrUCno2iOhfBp2dpdf4Epk92ZnPt4iv2ZYKw=.9d5ab0d7-414f-495e-ade0-780666f3aef6@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <V7WbvFgHqGmEcatKP9sIZApYAIdVjED16hqGNrUM1vM=.cccbbe34-5b13-4b38-9421-444050847951@github.com>
 <pMCizZGRrUCno2iOhfBp2dpdf4Epk92ZnPt4iv2ZYKw=.9d5ab0d7-414f-495e-ade0-780666f3aef6@github.com>
Message-ID: <oSgjp-pN7JPy1pRA-vivck-P5sv8vMLiK9T88YTzmLU=.30323e8c-6e7a-41b6-b082-32a1485008f5@github.com>

On Sat, 3 May 2025 07:52:45 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

> Ok, thanks! I wasn't sure you finished the pass.
> 
> I'm still seeing dynamic memory allocation, which IMO unnecessarily complicates the implementation. The bitmap size is fixed and well-known at compile time. That enables the `VM_Feature` class to embed an array of the proper size inline, and it eliminates all the problems related to undesired sharing of the backing array. (Also, `pre_initialize()` is no longer needed.)

I made it dynamic to keep it flexible, since the bitmap size depends on the maximum feature enum value.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24329#issuecomment-2848493614

From jbhateja at openjdk.org  Sat May  3 08:08:46 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sat, 3 May 2025 08:08:46 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <JaCbcvlFUh6OYW6IMo-E4HCYKaZTee-51AzITXR5bbk=.4a017138-896d-439f-9431-092f70511b71@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <JaCbcvlFUh6OYW6IMo-E4HCYKaZTee-51AzITXR5bbk=.4a017138-896d-439f-9431-092f70511b71@github.com>
Message-ID: <SfZKEmLMA2qhDuZDwe-6JjXBE-GyL1LidXU6-hF9jlE=.68211b98-5ff0-4cfa-aa7a-cada0ca6e9b5@github.com>

On Sat, 3 May 2025 07:52:21 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Review comments resolution
>
> src/hotspot/cpu/x86/vm_version_x86.cpp line 2867:
> 
>> 2865: 
>> 2866: uint64_t VM_Version::CpuidInfo::feature_flags() const {
>> 2867:   uint64_t result = 0;
> 
> It's unfortunate you migrated away from operating on a local copy. Why don't you declare a local copy (`VM_Version result`) and migrate bit manipulation to bit field accessors on it? `VM_Version::CpuidInfo::feature_flags()` can still return it by value (once you get rid of heap memory allocation, copying becomes trivial).

The new implementation directly modifies the feature vector bits through macros.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072346669

From vlivanov at openjdk.org  Sat May  3 08:17:49 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Sat, 3 May 2025 08:17:49 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <oSgjp-pN7JPy1pRA-vivck-P5sv8vMLiK9T88YTzmLU=.30323e8c-6e7a-41b6-b082-32a1485008f5@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <V7WbvFgHqGmEcatKP9sIZApYAIdVjED16hqGNrUM1vM=.cccbbe34-5b13-4b38-9421-444050847951@github.com>
 <pMCizZGRrUCno2iOhfBp2dpdf4Epk92ZnPt4iv2ZYKw=.9d5ab0d7-414f-495e-ade0-780666f3aef6@github.com>
 <oSgjp-pN7JPy1pRA-vivck-P5sv8vMLiK9T88YTzmLU=.30323e8c-6e7a-41b6-b082-32a1485008f5@github.com>
Message-ID: <3t1R35B9bafRtfvqfE7D2dAeLrjaDukXlDUGb-3VtaA=.46d64318-e9fb-4bf3-8a68-8dba2c2b7b26@github.com>

On Sat, 3 May 2025 07:55:10 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

> Bitmap size depends on the maximum feature enum value, I made it dynamic to keep it flexible. Do you want the feature vector size to be made constant and manually bump it when we exhaust the limit?

Yes, please. (The limit may be precise, i.e. the number of elements in the Feature_Flag enum, but the logic which computes the size of the backing array can automatically round it up and bump the size once the actual limit is reached.)

> pre_initialize was put in place because codeCache_init() precedes VM_Version_init()

I wanted to say that the sole purpose of `pre_initialize` is to allocate memory. Once it goes away, there's no reason to keep it.
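
A minimal sketch of that sizing logic, assuming a hypothetical `Feature_Flag` enum with 70 entries and a hypothetical `FeatureVector` class (neither is the code in the patch). The word count is derived from the enum at compile time and rounded up, so adding features grows the inline array automatically:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical enum: only the element count matters for this sketch.
enum Feature_Flag : unsigned { CPU_FIRST = 0, CPU_LAST = 69, NUM_FEATURES = 70 };

class FeatureVector {
  static constexpr size_t bits_per_word = 64;
  // Round up so a partially used last word still gets storage.
  static constexpr size_t word_count =
      (NUM_FEATURES + bits_per_word - 1) / bits_per_word;
  uint64_t _words[word_count] = {};  // embedded inline, no heap allocation
 public:
  static constexpr size_t words() { return word_count; }
  void set(unsigned f) {
    assert(f < NUM_FEATURES);
    _words[f / bits_per_word] |= uint64_t(1) << (f % bits_per_word);
  }
  bool test(unsigned f) const {
    return (_words[f / bits_per_word] >> (f % bits_per_word)) & 1;
  }
};

static_assert(FeatureVector::words() == 2,
              "70 features round up to two 64-bit words");
```

Because the storage is part of the object, a statically allocated instance is usable as soon as it is zero-initialized, which is why a separate allocation step is not needed in this design, and copies never share a backing array.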

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24329#issuecomment-2848507499

From vlivanov at openjdk.org  Sat May  3 08:28:47 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Sat, 3 May 2025 08:28:47 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <SfZKEmLMA2qhDuZDwe-6JjXBE-GyL1LidXU6-hF9jlE=.68211b98-5ff0-4cfa-aa7a-cada0ca6e9b5@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <JaCbcvlFUh6OYW6IMo-E4HCYKaZTee-51AzITXR5bbk=.4a017138-896d-439f-9431-092f70511b71@github.com>
 <SfZKEmLMA2qhDuZDwe-6JjXBE-GyL1LidXU6-hF9jlE=.68211b98-5ff0-4cfa-aa7a-cada0ca6e9b5@github.com>
Message-ID: <nh9_Fa7Mr0YGR6yrjs1zQ4kzy6kxZVj_zZHai20HTZg=.cd51f544-af5c-4656-abc7-a96c9abfc084@github.com>

On Sat, 3 May 2025 08:06:10 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/cpu/x86/vm_version_x86.cpp line 2867:
>> 
>>> 2865: 
>>> 2866: uint64_t VM_Version::CpuidInfo::feature_flags() const {
>>> 2867:   uint64_t result = 0;
>> 
>> It's unfortunate you migrated away from operating on a local copy. Why don't you declare a local copy (`VM_Version result`) and migrate bit manipulation to bit field accessors on it? `VM_Version::CpuidInfo::feature_flags()` can still return it by value (once you get rid of heap memory allocation, copying becomes trivial).
>
> The new implementation directly modifies the feature vector bits through macros.

I prefer explicit accessor calls on corresponding instance fields. 

It's confusing to see `VM_Version::CpuidInfo::feature_flags()` implicitly modifying `_dynamic_features_vector` through macros.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072349610

From jbhateja at openjdk.org  Sat May  3 08:33:46 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sat, 3 May 2025 08:33:46 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <nh9_Fa7Mr0YGR6yrjs1zQ4kzy6kxZVj_zZHai20HTZg=.cd51f544-af5c-4656-abc7-a96c9abfc084@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <JaCbcvlFUh6OYW6IMo-E4HCYKaZTee-51AzITXR5bbk=.4a017138-896d-439f-9431-092f70511b71@github.com>
 <SfZKEmLMA2qhDuZDwe-6JjXBE-GyL1LidXU6-hF9jlE=.68211b98-5ff0-4cfa-aa7a-cada0ca6e9b5@github.com>
 <nh9_Fa7Mr0YGR6yrjs1zQ4kzy6kxZVj_zZHai20HTZg=.cd51f544-af5c-4656-abc7-a96c9abfc084@github.com>
Message-ID: <tdzoYi3JimM2SMdHsghX0m41YEFH0iTAKx5Ba_tWV00=.b2e2f15f-4421-4c6a-a0fb-2880531771f6@github.com>

On Sat, 3 May 2025 08:26:19 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> The new implementation directly modifies the feature vector bits through macros.
>
> I prefer explicit accessor calls on corresponding instance fields. 
> 
> It's confusing to see `VM_Version::CpuidInfo::feature_flags()` implicitly modifying `_dynamic_features_vector` through macros.

VM_Version::CpuidInfo::feature_flags() is local to x86 targets. How about renaming it to VM_Version::CpuidInfo::install_feature_flags() and continuing to use macros?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072350359

From kvn at openjdk.org  Sat May  3 22:47:55 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Sat, 3 May 2025 22:47:55 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
Message-ID: <j74g8VaqsFia7OTBDquTOxyQb9dl5Ve1Y7ePt8XSbCY=.fd3f06cc-7c1c-4c15-8117-d2d36d17d366@github.com>

On Wed, 30 Apr 2025 13:18:33 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> A first part toward a better support of pure functions.
> 
> ## Pure Functions
> 
> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exceptions or the like), cannot deopt, etc. Such a function can be executed anywhere, with any arguments, with no effect other than wasting time. Integer division is not pure, since dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
> 
> ## Scope
> 
> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
> 
> ## Implementation Overview
> 
> We created here some new node kind for pure calls that are expanded into regular calls during macro expansion. This also allows the removal of `ModD` and `ModF` nodes that have their pure equivalent now. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
> 
> IR framework and IGV needed a little bit of fixing.
> 
> Thanks,
> Marc

Hi @marc-chevalier 

> doesn't propose a way to move pure calls around

I agree that we should not do that in these changes.

But did you consider moving/cloning such a call (the new macro node) **down** to its "users" in case the result is not used on some paths? The call would then be executed only where it is needed. And I think it is safe, since the current control dominates the paths where the result is used.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2848841268

From jbhateja at openjdk.org  Mon May  5 03:57:22 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Mon, 5 May 2025 03:57:22 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v11]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <YsI-OTYC4ZVjQhgjR5qiMFkBBDHDN7ZUu2burgHuk6g=.bdfd9ac3-167b-4da1-95b8-601a69086ae6@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Reveiw comments resolutions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/051c416c..b314ed0e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=10
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=09-10

  Stats: 376 lines in 22 files changed: 25 ins; 68 del; 283 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From jbhateja at openjdk.org  Mon May  5 03:57:22 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Mon, 5 May 2025 03:57:22 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v11]
In-Reply-To: <nh9_Fa7Mr0YGR6yrjs1zQ4kzy6kxZVj_zZHai20HTZg=.cd51f544-af5c-4656-abc7-a96c9abfc084@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <JaCbcvlFUh6OYW6IMo-E4HCYKaZTee-51AzITXR5bbk=.4a017138-896d-439f-9431-092f70511b71@github.com>
 <SfZKEmLMA2qhDuZDwe-6JjXBE-GyL1LidXU6-hF9jlE=.68211b98-5ff0-4cfa-aa7a-cada0ca6e9b5@github.com>
 <nh9_Fa7Mr0YGR6yrjs1zQ4kzy6kxZVj_zZHai20HTZg=.cd51f544-af5c-4656-abc7-a96c9abfc084@github.com>
Message-ID: <9d9DVuqRAeb_8kiEwkPQH6g2eBU5Jc_5ZSBAi1in9X0=.1d955598-f466-46ff-8b1f-71c87abd6313@github.com>

On Sat, 3 May 2025 08:26:19 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> The new implementation directly modifies the feature vector bits through macros.
>
> I prefer explicit accessor calls on corresponding instance fields. 
> 
> It's confusing to see `VM_Version::CpuidInfo::feature_flags()` implicitly modifying `_dynamic_features_vector` through macros.

I have renamed this local routine to install_feature_flags to conform to its semantics.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2072818174

From jbhateja at openjdk.org  Mon May  5 04:06:02 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Mon, 5 May 2025 04:06:02 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v12]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <2ioSQVtfXhnqvAXqiadwR1HuJsz3t9nytY0wRps-x68=.35220ade-0e70-41c6-9ebd-a271e7dcb2bb@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains two new commits since the last revision:

 - Updating comment
 - Review comments resolutions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/b314ed0e..7b414b8c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=11
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=10-11

  Stats: 13 lines in 4 files changed: 0 ins; 8 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From mchevalier at openjdk.org  Mon May  5 06:44:44 2025
From: mchevalier at openjdk.org (Marc Chevalier)
Date: Mon, 5 May 2025 06:44:44 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
Message-ID: <vEIc8mJ4UgPbZ4lY9ixhWgWFmQRqXqt-M79Pi-8IKVg=.1f305964-56e2-43c4-99d6-9e4aec5c30c2@github.com>

On Wed, 30 Apr 2025 13:18:33 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> A first part toward a better support of pure functions.
> 
> ## Pure Functions
> 
> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exceptions or the like), cannot deopt, etc. Such a function can be executed anywhere, with any arguments, with no effect other than wasting time. Integer division is not pure, since dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
> 
> ## Scope
> 
> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
> 
> ## Implementation Overview
> 
> We created here some new node kind for pure calls that are expanded into regular calls during macro expansion. This also allows the removal of `ModD` and `ModF` nodes that have their pure equivalent now. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
> 
> IR framework and IGV needed a little bit of fixing.
> 
> Thanks,
> Marc

I've considered it, but rather for a follow-up. My thought was to first introduce the node types, removal mechanics and such, but keep the call pinned by control and not touch that in this change. In the follow-up, I was hoping I would have "just" the control-pinning problem to address.

Moving the calls down may be beneficial in case the result is not used in a branch (we then save the call when executing the branch that does not use it), but if the usage is in a loop, we would rather want the call to stay (or be hoisted) before the loop. The heuristic "out of as many loops as possible, and as late as possible" seems to also apply here.
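
The trade-off can be illustrated at source level with a small sketch. This is plain C++ standing in for what the compiler would do on its IR, and all function names are hypothetical. `pure_mod` wraps `std::fmod`, and the `calls` counter is test instrumentation only, added to make the number of executions observable:

```cpp
#include <cassert>
#include <cmath>

static int calls = 0;  // instrumentation only; the modelled call is pure

double pure_mod(double a, double b) { ++calls; return std::fmod(a, b); }

// Call pinned before the branch: executed even when the result is unused.
double eager(bool use, double a, double b) {
  double r = pure_mod(a, b);
  return use ? r : 0.0;
}

// Call sunk down to its only user: skipped on the path that ignores it.
double sunk(bool use, double a, double b) {
  if (use) return pure_mod(a, b);
  return 0.0;
}

// User inside a loop: keeping the call before the loop executes it once.
double hoisted(int n, double a, double b) {
  double r = pure_mod(a, b);
  double sum = 0.0;
  for (int i = 0; i < n; ++i) sum += r;
  return sum;
}

// Naively sinking to the user inside the loop repeats the call n times.
double sunk_into_loop(int n, double a, double b) {
  double sum = 0.0;
  for (int i = 0; i < n; ++i) sum += pure_mod(a, b);
  return sum;
}
```

Sinking pays off for `sunk(false, ...)`, which performs no call at all, while `sunk_into_loop` shows why "as late as possible" has to be balanced against "out of as many loops as possible".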

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2850052986

From rkennke at openjdk.org  Mon May  5 13:43:23 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Mon, 5 May 2025 13:43:23 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI
Message-ID: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>

In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.

Testing:
 - [x] extensive testing with https://github.com/oracle/graal/pull/10904

-------------

Commit messages:
 - Fix ordering of includes
 - Remove unnecessary stuff
 - Revert unrelated changes
 - Revert unrelated changes
 - Merge branch 'master' into graal-shenandoah-support
 - Support for Shenandoah card-table barriers in JVMCI
 - Revert "8321373: Build should use LC_ALL=C.UTF-8"
 - Graal Shenandoah support

Changes: https://git.openjdk.org/jdk/pull/25001/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356075
  Stats: 59 lines in 6 files changed: 58 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25001.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25001/head:pull/25001

PR: https://git.openjdk.org/jdk/pull/25001

From dnsimon at openjdk.org  Mon May  5 13:53:50 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 5 May 2025 13:53:50 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI
In-Reply-To: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
Message-ID: <KZvgnkiTE4GW44_x8SX8hsGYAkS0orwQ6lVtf6jlWnI=.3549331c-1de2-4db2-b3d1-b52beb5edf5f@github.com>

On Fri, 2 May 2025 10:35:03 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
> 
> Testing:
>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904

LGTM

-------------

Marked as reviewed by dnsimon (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25001#pullrequestreview-2814890860

From shade at openjdk.org  Mon May  5 15:38:46 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 5 May 2025 15:38:46 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI
In-Reply-To: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
Message-ID: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com>

On Fri, 2 May 2025 10:35:03 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
> 
> Testing:
>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904

A few questions:

src/hotspot/share/gc/shenandoah/shenandoahRuntime.hpp line 42:

> 40:   static void pre_barrier(JavaThread* thread, oopDesc* orig) {
> 41:     write_ref_field_pre(orig, thread);
> 42:   }

So, why not export `write_ref_field_pre`, instead of introducing this new method? Style/cleanliness, or something else? I am asking, because every time we add a new stub here, we would need to record it in `AOTCache` tables for Leyden benefit.

src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 240:

> 238:     cardtable_shift = CardTable::card_shift();
> 239:   } else if (bs->is_a(BarrierSet::ShenandoahBarrierSet)) {
> 240:     cardtable_shift = CardTable::card_shift();

I understand the barrier code does not use `cardtable_start_address`, but should we still initialize it here to `nullptr`?

-------------

PR Review: https://git.openjdk.org/jdk/pull/25001#pullrequestreview-2815217376
PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073674847
PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073678010

From rkennke at openjdk.org  Mon May  5 15:54:29 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Mon, 5 May 2025 15:54:29 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2]
In-Reply-To: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
Message-ID: <zirO8JqJ6A-rSFurqv31UD33CqIUSGjL6sgiztKyupw=.5a4f29db-f0bd-4870-aebd-a62c2a35db70@github.com>

> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
> 
> Testing:
>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Initialize cardtable_start_address to nullptr

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25001/files
  - new: https://git.openjdk.org/jdk/pull/25001/files/6487a9f7..c95313a9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=00-01

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/25001.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25001/head:pull/25001

PR: https://git.openjdk.org/jdk/pull/25001

From rkennke at openjdk.org  Mon May  5 15:54:29 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Mon, 5 May 2025 15:54:29 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2]
In-Reply-To: <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com>
Message-ID: <HqLJu_x8t7TPsusCAVeHrPH1NByjNuqUK0ipkEEDDFw=.cec49cc5-d6d0-4a15-b8c1-19e1b303da9b@github.com>

On Mon, 5 May 2025 15:31:59 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Initialize cardtable_start_address to nullptr
>
> src/hotspot/share/gc/shenandoah/shenandoahRuntime.hpp line 42:
> 
>> 40:   static void pre_barrier(JavaThread* thread, oopDesc* orig) {
>> 41:     write_ref_field_pre(orig, thread);
>> 42:   }
> 
> So, why not export `write_ref_field_pre`, instead of introducing this new method? Style/cleanliness, or something else? I am asking, because every time we add a new stub here, we would need to record it in `AOTCache` tables for Leyden benefit.

It's about the argument ordering. Graal expects the Thread* to be prepended, while other JITs call it with the Thread* appended. I guess we could change other JIT calls to also prepend the thread, or change the interface to not pass the Thread* at all. I chose to follow G1 and export both variants.
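
As a self-contained illustration of exporting both variants, here is a sketch with stand-in types. `ThreadStub`, `OopStub`, `RuntimeSketch` and the `last` record are hypothetical and exist only to make the forwarding observable; the two method names mirror the ones quoted above:

```cpp
#include <cassert>

// Hypothetical stand-ins for the VM's thread and object types.
struct ThreadStub { int id; };
struct OopStub { int payload; };

// Records the arguments the underlying entry point received (for the sketch).
struct Recorded { ThreadStub* thread; OopStub* obj; };
static Recorded last = { nullptr, nullptr };

struct RuntimeSketch {
  // Existing entry point: thread passed last.
  static void write_ref_field_pre(OopStub* orig, ThreadStub* thread) {
    last.thread = thread;
    last.obj = orig;
  }
  // Thin adapter for callers that pass the thread first.
  static void pre_barrier(ThreadStub* thread, OopStub* orig) {
    write_ref_field_pre(orig, thread);
  }
};
```

The adapter only reorders arguments, so both call conventions reach the same implementation.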

> src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 240:
> 
>> 238:     cardtable_shift = CardTable::card_shift();
>> 239:   } else if (bs->is_a(BarrierSet::ShenandoahBarrierSet)) {
>> 240:     cardtable_shift = CardTable::card_shift();
> 
> I understand the barrier code does not use `cardtable_start_address`, but should we still initialize it here to `nullptr`?

Good point, did that.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073702873
PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073705091

From cslucas at openjdk.org  Mon May  5 16:26:49 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Mon, 5 May 2025 16:26:49 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2]
In-Reply-To: <zirO8JqJ6A-rSFurqv31UD33CqIUSGjL6sgiztKyupw=.5a4f29db-f0bd-4870-aebd-a62c2a35db70@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <zirO8JqJ6A-rSFurqv31UD33CqIUSGjL6sgiztKyupw=.5a4f29db-f0bd-4870-aebd-a62c2a35db70@github.com>
Message-ID: <SUV0kGv2SY2Xe3q7ufsUto-fNnpro6JNjG6aT2HcbRU=.d399da30-8193-41da-90ce-91365090be96@github.com>

On Mon, 5 May 2025 15:54:29 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>> 
>> Testing:
>>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Initialize cardtable_start_address to nullptr

LGTM. Thanks.

-------------

Marked as reviewed by cslucas (Committer).

PR Review: https://git.openjdk.org/jdk/pull/25001#pullrequestreview-2815365611

From shade at openjdk.org  Mon May  5 16:50:46 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 5 May 2025 16:50:46 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2]
In-Reply-To: <HqLJu_x8t7TPsusCAVeHrPH1NByjNuqUK0ipkEEDDFw=.cec49cc5-d6d0-4a15-b8c1-19e1b303da9b@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com>
 <HqLJu_x8t7TPsusCAVeHrPH1NByjNuqUK0ipkEEDDFw=.cec49cc5-d6d0-4a15-b8c1-19e1b303da9b@github.com>
Message-ID: <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com>

On Mon, 5 May 2025 15:49:32 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> src/hotspot/share/gc/shenandoah/shenandoahRuntime.hpp line 42:
>> 
>>> 40:   static void pre_barrier(JavaThread* thread, oopDesc* orig) {
>>> 41:     write_ref_field_pre(orig, thread);
>>> 42:   }
>> 
>> So, why not export `write_ref_field_pre`, instead of introducing this new method? Style/cleanliness, or something else? I am asking, because every time we add a new stub here, we would need to record it in `AOTCache` tables for Leyden benefit.
>
> It's about the argument ordering. Graal expects the Thread* to be prepended, while other JITs call it with the Thread* appended. I guess we could change other JIT calls to also prepend the thread, or change the interface to not pass the Thread* at all. I chose to follow G1 and export both variants.

Oh, so this matches `JVMCIRuntime::write_barrier_pre` for G1 (weird place to have it, but oh well). 

Does Graal need the `Thread*` argument?

I think this method is only called when the SATB buffer is full. So the performance of this method is likely not affected by getting the current thread down in caller. So I think it would be more straightforward to sharpen `ShenandoahRuntime::write_ref_field_pre` by dropping `Thread*` and then exporting that. Maybe also under the `SR::write_barrier_pre` name to be even more consistent with everything else.

Maybe @JohnTortugo wants to clean up more mess in C2 related to this :)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073800305

From shade at openjdk.org  Mon May  5 16:50:48 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 5 May 2025 16:50:48 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v2]
In-Reply-To: <zirO8JqJ6A-rSFurqv31UD33CqIUSGjL6sgiztKyupw=.5a4f29db-f0bd-4870-aebd-a62c2a35db70@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <zirO8JqJ6A-rSFurqv31UD33CqIUSGjL6sgiztKyupw=.5a4f29db-f0bd-4870-aebd-a62c2a35db70@github.com>
Message-ID: <tght4dG6QGhaZk2e0JAjkGGfUe1_UdC1EkTiaONhWZQ=.80b5bbcd-8d66-42a2-967c-5c779c76ac23@github.com>

On Mon, 5 May 2025 15:54:29 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>> 
>> Testing:
>>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Initialize cardtable_start_address to nullptr

src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 137:

> 135:   ZGC_ONLY(static_field(CompilerToVM::Data,    sizeof_ZStoreBarrierEntry,              int))                                         \
> 136:   SHENANDOAHGC_ONLY(static_field(CompilerToVM::Data, shenandoah_in_cset_fast_test_addr, address))                                      \
> 137:   SHENANDOAHGC_ONLY(static_field(CompilerToVM::Data, shenandoah_region_size_bytes_shift,int))                                        \

Also indent trailing backslashes.

src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 909:

> 907:   SHENANDOAHGC_ONLY(declare_function(ShenandoahRuntime::load_reference_barrier_weak_narrow))    \
> 908:   SHENANDOAHGC_ONLY(declare_function(ShenandoahRuntime::load_reference_barrier_phantom))           \
> 909:   SHENANDOAHGC_ONLY(declare_function(ShenandoahRuntime::load_reference_barrier_phantom_narrow))    \

Also indent trailing backslashes.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073801311
PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073801126

From rkennke at openjdk.org  Mon May  5 16:58:01 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Mon, 5 May 2025 16:58:01 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v3]
In-Reply-To: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
Message-ID: <gWhERSP3RH5uwLO6UMo5Xrp_XPrYrF2jWi96tCpeCU0=.25ce4174-e11f-43ca-ac68-34bf20920f99@github.com>

> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
> 
> Testing:
>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Align backslashes

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25001/files
  - new: https://git.openjdk.org/jdk/pull/25001/files/c95313a9..44344585

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=01-02

  Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/25001.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25001/head:pull/25001

PR: https://git.openjdk.org/jdk/pull/25001

From rkennke at openjdk.org  Mon May  5 16:58:01 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Mon, 5 May 2025 16:58:01 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v3]
In-Reply-To: <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com>
 <HqLJu_x8t7TPsusCAVeHrPH1NByjNuqUK0ipkEEDDFw=.cec49cc5-d6d0-4a15-b8c1-19e1b303da9b@github.com>
 <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com>
Message-ID: <qmJ8YI0naIrRqDY2_LCRbXW1Dj5DZs_JkkoFXkGqYLU=.f41fbf1b-3f1e-4638-b0a1-7a16564f47ac@github.com>

On Mon, 5 May 2025 16:46:46 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> It's about the argument ordering. Graal expects the Thread* to be prepended, while other JITs call it with the Thread* appended. I guess we could change other JIT calls to also prepend the thread, or change the interface to not pass the Thread* at all. I chose to follow G1 and export both variants.
>
> Oh, so this matches `JVMCIRuntime::write_barrier_pre` for G1 (weird place to have it, but oh well). 
> 
> Does Graal need the `Thread*` argument?
> 
> I think this method is only called when the SATB buffer is full. So the performance of this method is likely not affected by getting the current thread down in caller. So I think it would be more straightforward to sharpen `ShenandoahRuntime::write_ref_field_pre` by dropping `Thread*` and then exporting that. Maybe also under the `SR::write_barrier_pre` name to be even more consistent with everything else.
> 
> Maybe @JohnTortugo wants to clean up more mess in C2 related to this :)

Graal does not need the Thread* argument, but the runtime code behind write_ref_field_pre() currently uses it. I agree, it does not look performance critical to pass it through. However, getting rid of it seems to go beyond the scope of this PR. I'd rather do this as a follow-up.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073807949

From rkennke at openjdk.org  Mon May  5 16:58:02 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Mon, 5 May 2025 16:58:02 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v3]
In-Reply-To: <qmJ8YI0naIrRqDY2_LCRbXW1Dj5DZs_JkkoFXkGqYLU=.f41fbf1b-3f1e-4638-b0a1-7a16564f47ac@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com>
 <HqLJu_x8t7TPsusCAVeHrPH1NByjNuqUK0ipkEEDDFw=.cec49cc5-d6d0-4a15-b8c1-19e1b303da9b@github.com>
 <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com>
 <qmJ8YI0naIrRqDY2_LCRbXW1Dj5DZs_JkkoFXkGqYLU=.f41fbf1b-3f1e-4638-b0a1-7a16564f47ac@github.com>
Message-ID: <_4ebbEF2UTOwozaLgA8ibDcCi1YF5I4rKA1NN3tBozs=.7ce738b0-7e79-4c93-bc4d-63f534ccc536@github.com>

On Mon, 5 May 2025 16:51:39 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> Oh, so this matches `JVMCIRuntime::write_barrier_pre` for G1 (weird place to have it, but oh well). 
>> 
>> Does Graal need the `Thread*` argument?
>> 
>> I think this method is only called when the SATB buffer is full. So the performance of this method is likely not affected by getting the current thread down in caller. So I think it would be more straightforward to sharpen `ShenandoahRuntime::write_ref_field_pre` by dropping `Thread*` and then exporting that. Maybe also under the `SR::write_barrier_pre` name to be even more consistent with everything else.
>> 
>> Maybe @JohnTortugo wants to clean up more mess in C2 related to this :)
>
> Graal does not need the Thread* argument, but the runtime code behind write_ref_field_pre() currently uses it. I agree, it does not look performance critical to pass it through. However, getting rid of it seems to go beyond the scope of this PR. I'd rather do this as a follow-up.

Actually, I'd probably add the new entry for Graal without the Thread* argument now, and fix the others in a follow-up. Otherwise we need to deal with it on the Graal side again later once we change the entry points.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073813072

From kvn at openjdk.org  Mon May  5 17:02:44 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Mon, 5 May 2025 17:02:44 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
Message-ID: <sObRDe6q5dj-TSEFA2Aytf85sbm2HgQkskT81wgSN4g=.22a020b7-be9d-4056-a4db-860430c2735a@github.com>

On Wed, 30 Apr 2025 13:18:33 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> A first part toward a better support of pure functions.
> 
> ## Pure Functions
> 
> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exceptions or the like), cannot deopt, etc. It's really a function that you can execute anywhere, with whichever arguments, without any effect other than wasting time. Integer division is not pure, as dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
> 
> ## Scope
> 
> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
> 
> ## Implementation Overview
> 
> We created here some new node kind for pure calls that are expanded into regular calls during macro expansion. This also allows the removal of `ModD` and `ModF` nodes that have their pure equivalent now. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
> 
> IR framework and IGV needed a little bit of fixing.
> 
> Thanks,
> Marc

Nice work.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24966#pullrequestreview-2815464620

From shade at openjdk.org  Mon May  5 17:03:47 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 5 May 2025 17:03:47 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v3]
In-Reply-To: <_4ebbEF2UTOwozaLgA8ibDcCi1YF5I4rKA1NN3tBozs=.7ce738b0-7e79-4c93-bc4d-63f534ccc536@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <26EKhVnyWuLQxzRjvvLzzLcY2iW6fmgqs7qHWOdZQvA=.99efcb44-5788-403a-8ad1-83766184aa17@github.com>
 <HqLJu_x8t7TPsusCAVeHrPH1NByjNuqUK0ipkEEDDFw=.cec49cc5-d6d0-4a15-b8c1-19e1b303da9b@github.com>
 <92qsu_6Qj0xyyGYK8dDm97enzXvKNgj7obDEHnhVcds=.d4671e44-1ef0-4c79-a286-fa578a6eb25f@github.com>
 <qmJ8YI0naIrRqDY2_LCRbXW1Dj5DZs_JkkoFXkGqYLU=.f41fbf1b-3f1e-4638-b0a1-7a16564f47ac@github.com>
 <_4ebbEF2UTOwozaLgA8ibDcCi1YF5I4rKA1NN3tBozs=.7ce738b0-7e79-4c93-bc4d-63f534ccc536@github.com>
Message-ID: <o4_ZemkCW10hN2elN_3KW74SPCYV_rOjSznqmKVnCpc=.d21bb661-b52d-44d2-94c3-25321bb317ac@github.com>

On Mon, 5 May 2025 16:55:36 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> Graal does not need the Thread* argument, but the runtime code behind write_ref_field_pre() currently uses it. I agree, it does not look performance critical to pass it through. However, getting rid of it seems to go beyond the scope of this PR. I'd rather do this as a follow-up.
>
> Actually, I'd probably add the new entry for Graal without the Thread* argument now, and fix the others in a follow-up. Otherwise we need to deal with it on the Graal side again later once we change the entry points.

OK, but that follow-up risks changing the JVMCI interface _again_? How about we introduce:


static void write_barrier_pre(oopDesc* pre_val) {
  write_ref_field_pre(pre_val, JavaThread::current());
}


...and then the follow-up purges the old `write_ref_field_pre`? The implementation might need to be in `.cpp`.
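
For illustration, here is a standalone sketch of that adapter shape. It is not HotSpot code: `FakeThread`, `current_thread()` and the `satb_log` vector are hypothetical stand-ins for `JavaThread`, `JavaThread::current()` and the SATB queue. Only the two entry-point names come from this thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for JavaThread; not the HotSpot class.
struct FakeThread {
  std::vector<const void*> satb_log;  // values recorded by the pre-barrier
};

// Hypothetical stand-in for JavaThread::current().
inline FakeThread* current_thread() {
  static thread_local FakeThread t;
  return &t;
}

// Existing-style entry: value first, thread appended.
inline void write_ref_field_pre(const void* pre_val, FakeThread* thread) {
  assert(thread != nullptr);
  thread->satb_log.push_back(pre_val);
}

// New-style entry: no thread argument, the callee looks the thread up itself.
inline void write_barrier_pre(const void* pre_val) {
  write_ref_field_pre(pre_val, current_thread());
}
```

With this shape the exported signature no longer depends on where the thread argument sits, so a later cleanup of the two-argument variant would not have to touch the exported entry again.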

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2073820137

From vlivanov at openjdk.org  Mon May  5 19:08:48 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Mon, 5 May 2025 19:08:48 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
Message-ID: <DEspIWOBy4L2u5G4nt9nbH64kDfC3FHPTgGOhMJEbNc=.fed902be-eb79-4a6b-9acd-94828ba2ae20@github.com>

On Wed, 30 Apr 2025 13:18:33 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> A first part toward a better support of pure functions.
> 
> ## Pure Functions
> 
> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exceptions or the like), cannot deopt, etc. It's really a function that you can execute anywhere, with whichever arguments, without any effect other than wasting time. Integer division is not pure, as dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
> 
> ## Scope
> 
> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
> 
> ## Implementation Overview
> 
> We created here some new node kind for pure calls that are expanded into regular calls during macro expansion. This also allows the removal of `ModD` and `ModF` nodes that have their pure equivalent now. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
> 
> IR framework and IGV needed a little bit of fixing.
> 
> Thanks,
> Marc

Good work, Marc.

High-level comment: I don't know what the future plans are, but as the patch stands now, it feels like it complicates both the design and the implementation.

The original implementation relies on macro nodes which are later expanded into leaf runtime calls. What you propose introduces a new concept of "pure calls" which: (1) is not a CallNode anymore; and (2) relies on subclassing (which makes it hard to mix with other node properties). Moreover, I don't see much benefit in committing to a runtime call representation from the very beginning (early in high-level IR).

Going forward, IMO the sweet spot is to support arbitrary nodes to be lowered into leaf runtime calls. You make a big step in that direction by relaxing the requirements on `PureCall` to be just a CFG node (and not a full-blown `CallLeaf` node). The next step would be to relax the CFG node requirement and let the compiler pick the right place to insert it. (Existing expensive node support in C2 addresses some similar challenges.)

And, as a complementary option, in some cases it may be just enough to mark individual call nodes as pure, so they can be pruned later if nobody consumes the result of their computation anymore.
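
To make the last option concrete, here is a toy sketch of pruning pure nodes whose result nobody consumes. It is not C2 code: `ToyNode`, its fields and `mark_live` are hypothetical, and a real implementation would work on C2's graph and worklists rather than index vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy IR node; hypothetical, not C2's Node class.
struct ToyNode {
  bool is_pure;             // no side effects, no exceptions, no deopt
  bool is_root;             // result is observable (e.g. returned)
  std::vector<int> inputs;  // indices of nodes whose results this node uses
};

// Returns live[i] == true when node i must be kept.
// Impure nodes and roots are always live; a pure node is live only if
// some live node consumes its result.
inline std::vector<bool> mark_live(const std::vector<ToyNode>& nodes) {
  std::vector<bool> live(nodes.size(), false);
  std::vector<int> worklist;
  for (std::size_t i = 0; i < nodes.size(); i++) {
    if (!nodes[i].is_pure || nodes[i].is_root) {
      live[i] = true;
      worklist.push_back(static_cast<int>(i));
    }
  }
  while (!worklist.empty()) {
    int n = worklist.back();
    worklist.pop_back();
    for (int in : nodes[n].inputs) {
      assert(in >= 0 && static_cast<std::size_t>(in) < nodes.size());
      if (!live[in]) {
        live[in] = true;
        worklist.push_back(in);
      }
    }
  }
  return live;
}
```

Any node left unmarked is a pure node with no live consumer and can be dropped. The purity flag is a plain node property here, which is the point of the suggestion: it needs no dedicated subclass.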

-------------

PR Review: https://git.openjdk.org/jdk/pull/24966#pullrequestreview-2815810010

From rkennke at openjdk.org  Mon May  5 20:25:27 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Mon, 5 May 2025 20:25:27 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v4]
In-Reply-To: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
Message-ID: <dsYp242p-1jCg-9vmI8fKobTJqQlT8WACQ1e2McNLtg=.973251bd-e79e-4835-8a22-f24a6a60289a@github.com>

> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
> 
> Testing:
>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Simplify pre-barrier runtime entry

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25001/files
  - new: https://git.openjdk.org/jdk/pull/25001/files/44344585..41084f3e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25001&range=02-03

  Stats: 8 lines in 3 files changed: 4 ins; 2 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/25001.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25001/head:pull/25001

PR: https://git.openjdk.org/jdk/pull/25001

From vlivanov at openjdk.org  Tue May  6 01:33:20 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Tue, 6 May 2025 01:33:20 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v12]
In-Reply-To: <2ioSQVtfXhnqvAXqiadwR1HuJsz3t9nytY0wRps-x68=.35220ade-0e70-41c6-9ebd-a271e7dcb2bb@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <2ioSQVtfXhnqvAXqiadwR1HuJsz3t9nytY0wRps-x68=.35220ade-0e70-41c6-9ebd-a271e7dcb2bb@github.com>
Message-ID: <FqDRgTOj-VT5-FgH_RvBr_gmK2U6W0UKFws8iihIY5s=.c8c93edf-e447-4c33-903d-035623f38ce2@github.com>

On Mon, 5 May 2025 04:06:02 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains two new commits since the last revision:
> 
>  - Updating comment
>  - Review comments resolutions

It does look much better now. Thanks!

Some comments/suggestions follow.

src/hotspot/cpu/x86/vm_version_x86.cpp line 853:

> 851: 
> 852:   if (cpu_family() > 4) { // it supports CPUID
> 853:     _features = _cpuid_info.feature_flags(); // These can be changed by VM settings

You don't need to change this code if you equip `VM_Features` with a copy constructor.

src/hotspot/cpu/x86/vm_version_x86.cpp line 1102:

> 1100:   size_t buf_iter = cpu_info_size;
> 1101:   for (uint64_t i = 0; i < features_vector_size(); i++) {
> 1102:     insert_features_names(features_vector_elem(i), buf + buf_iter, sizeof(buf) - buf_iter, _features_names, 64 * i);

`Abstract_VM_Version::insert_features_names` is used only on x86. You can move it to `vm_version_x86.cpp/.hpp` and adjust to new layout.

src/hotspot/cpu/x86/vm_version_x86.hpp line 707:

> 705:   //
> 706:   static bool supports_cpuid()        { return _features  != 0; }
> 707:   static bool supports_cmov()         { return (_features & CPU_CMOV) != 0; }

Since you touch this code anyway, I suggest using this opportunity to derive this code automatically from the `CPU_FEATURE_FLAGS` macro. (As an example [1].)

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp#L147
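
A minimal sketch of what such derivation can look like, assuming an X-macro list with (id, accessor suffix, bit) entries. `TOY_FEATURE_FLAGS`, `ToyVersion` and the three features listed are hypothetical, chosen only to keep the example small. The real x86 list and its storage differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature list in X-macro form: decl(id, accessor suffix, bit).
#define TOY_FEATURE_FLAGS(decl) \
  decl(CMOV, cmov, 0)           \
  decl(SSE,  sse,  1)           \
  decl(AVX,  avx,  2)

struct ToyVersion {
  // One enum constant per list entry: CPU_CMOV, CPU_SSE, CPU_AVX.
  enum Feature : std::uint64_t {
#define DECLARE_FEATURE(id, name, bit) CPU_##id = (std::uint64_t(1) << bit),
    TOY_FEATURE_FLAGS(DECLARE_FEATURE)
#undef DECLARE_FEATURE
  };

  // Single-word storage, kept in a function-local static for brevity.
  static std::uint64_t& features() {
    static std::uint64_t bits = 0;
    return bits;
  }

  static void set_feature(Feature f) {
    assert(f != 0);
    features() |= f;
  }

  // One accessor per list entry: supports_cmov(), supports_sse(), supports_avx().
#define DECLARE_SUPPORTS(id, name, bit) \
  static bool supports_##name() { return (features() & CPU_##id) != 0; }
  TOY_FEATURE_FLAGS(DECLARE_SUPPORTS)
#undef DECLARE_SUPPORTS
};
```

Adding a feature then means adding one line to the list, and the enum constant and the `supports_*` accessor cannot drift apart.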

src/hotspot/cpu/x86/vm_version_x86.hpp line 753:

> 751:   // Feature identification which can be affected by VM settings
> 752:   //
> 753:   static bool supports_cpuid()        { return Abstract_VM_Version::vm_features_exist(); }

Is `VM_Features::_features_vector_size > 0` equivalent to `_features != 0`?

I believe you can simply drop `supports_cpuid()`. x86-32 bit port is gone and even there `cpuid` support was mandatory.

src/hotspot/share/runtime/abstract_vm_version.hpp line 51:

> 49: class VM_Features {
> 50:  public:
> 51:   using FeatureVector = uint64_t [MAX_FEATURE_VEC_SIZE];

Why did you decide to declare new type name for fixed size array type? I see you use `FeatureVector` in `vmStructs*` and JVMCI code. Does it make things simpler there?

src/hotspot/share/runtime/abstract_vm_version.hpp line 91:

> 89: 
> 90:   // CPU feature flags vector, can be affected by VM settings.
> 91:   static VM_Features _vm_target_features;

Unless we plan to migrate all platforms at once, I suggest moving this code into `VM_Version` and keeping the same names (`_features` and `_cpu_features`). Ideally, the `_features` field can be moved from `Abstract_VM_Version` to the platform-specific `VM_Version`s across all platforms. But leaving it as is for now is also fine with me.

There's a precedent: `VM_Version` already overrides `_features` field on s390 [1]. 

`VM_Features` class can start as x86-specific, but for advertisement purposes it makes sense to keep it in `abstract_vm_version.hpp`.

Alternatively, `Abstract_VM_Version::_features` can be converted from `uint64_t` to `VM_Features` and non-x86 platforms can be covered by providing overloads for currently used operators (it's mostly `|=`, `&=`, and `&`, plus conversions).

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/s390/vm_version_s390.hpp#L130

src/hotspot/share/runtime/abstract_vm_version.hpp line 97:

> 95: 
> 96:   static void sync_cpu_features() {
> 97:     memcpy(_cpu_target_features._features_vector, _vm_target_features._features_vector,

Any particular reason to use `memcpy`/`memset` and not a loop over `_features_vector` array?

I believe once you define default and copy constructors for `VM_Features`, `sync_cpu_features()` and `clear_cpu_features()` won't be needed anymore.
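
A rough sketch of the idea, under stated assumptions: `ToyFeatures`, `kWords` and the member names are hypothetical and only mirror the shape discussed here. They are not the `VM_Features` class from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical multi-word feature set; the layout is an assumption.
class ToyFeatures {
 public:
  static constexpr int kWords = 4;  // assumed size: room for 256 feature bits

  // Default constructor clears every word, replacing an explicit memset.
  ToyFeatures() {
    for (int i = 0; i < kWords; i++) _words[i] = 0;
  }

  // Copy constructor copies word by word, replacing an explicit memcpy.
  ToyFeatures(const ToyFeatures& other) {
    for (int i = 0; i < kWords; i++) _words[i] = other._words[i];
  }

  ToyFeatures& operator=(const ToyFeatures& other) {
    for (int i = 0; i < kWords; i++) _words[i] = other._words[i];
    return *this;
  }

  void set_feature(int bit) {
    assert(bit >= 0 && bit < kWords * 64);
    _words[bit / 64] |= (std::uint64_t(1) << (bit % 64));
  }

  bool supports_feature(int bit) const {
    assert(bit >= 0 && bit < kWords * 64);
    return (_words[bit / 64] & (std::uint64_t(1) << (bit % 64))) != 0;
  }

  // True if at least one feature bit is set.
  bool any() const {
    for (int i = 0; i < kWords; i++) {
      if (_words[i] != 0) return true;
    }
    return false;
  }

 private:
  std::uint64_t _words[kWords];
};
```

With value semantics in place, "sync" becomes a plain assignment (`cpu = vm;`) and "clear" becomes assigning a default-constructed object (`vm = ToyFeatures();`).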

src/hotspot/share/runtime/abstract_vm_version.hpp line 183:

> 181:   static const char* printable_jdk_debug_level();
> 182: 
> 183:   static uint64_t features() {

Not used. Drop it.

src/hotspot/share/runtime/init.cpp line 68:

> 66: void codeCache_init();
> 67: void VM_Version_init();
> 68: void VM_Version_pre_init();

Redundant declaration.

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/amd64/AMD64HotSpotVMConfig.java line 94:

> 92:     final long amd64CET_IBT = getConstant("VM_Version::CPU_CET_IBT", Long.class);
> 93:     final long amd64CET_SS = getConstant("VM_Version::CPU_CET_SS", Long.class);
> 94:     final long avx10_1 = getConstant("VM_Version::CPU_AVX10_1", Long.class);

Leave them as is. @mur47x111 plans to remove them [1].

[1] https://github.com/openjdk/jdk/pull/24329#issuecomment-2838223030

-------------

PR Review: https://git.openjdk.org/jdk/pull/24329#pullrequestreview-2815634822
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074470895
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074469800
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074484317
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074481382
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074502713
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074479165
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074496719
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074480203
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2073919224
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074519224

From vlivanov at openjdk.org  Tue May  6 01:33:21 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Tue, 6 May 2025 01:33:21 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v12]
In-Reply-To: <9d9DVuqRAeb_8kiEwkPQH6g2eBU5Jc_5ZSBAi1in9X0=.1d955598-f466-46ff-8b1f-71c87abd6313@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <JaCbcvlFUh6OYW6IMo-E4HCYKaZTee-51AzITXR5bbk=.4a017138-896d-439f-9431-092f70511b71@github.com>
 <SfZKEmLMA2qhDuZDwe-6JjXBE-GyL1LidXU6-hF9jlE=.68211b98-5ff0-4cfa-aa7a-cada0ca6e9b5@github.com>
 <nh9_Fa7Mr0YGR6yrjs1zQ4kzy6kxZVj_zZHai20HTZg=.cd51f544-af5c-4656-abc7-a96c9abfc084@github.com>
 <9d9DVuqRAeb_8kiEwkPQH6g2eBU5Jc_5ZSBAi1in9X0=.1d955598-f466-46ff-8b1f-71c87abd6313@github.com>
Message-ID: <Q3e3aNm-CFlL3rN__fDMvw-Pw32IeZ4fh9M7B9Pn-0Q=.16aad700-b806-44e9-a0d0-038a6f45f4be@github.com>

On Mon, 5 May 2025 03:54:24 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> I prefer explicit accessor calls on corresponding instance fields. 
>> 
>> It's confusing to see `VM_Version::CpuidInfo::feature_flags()` implicitly modifying `_dynamic_features_vector` through macros.
>
> I have changed this local routine name to install_feature_flags to conform to its semantics

It's still counter-intuitive to see `VM_Version::CpuidInfo` implicitly initializes a field in `Abstract_VM_Version` class. I prefer original code shape. Any problems with the following code shape?

VM_Features VM_Version::CpuidInfo::feature_flags() const {
  VM_Features result;
  if (std_cpuid1_edx.bits.cmpxchg8 != 0) {
    result.set_feature(CPU_CX8);
  }
  ...
  return result;
}

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2074474099

From mchevalier at openjdk.org  Tue May  6 07:46:14 2025
From: mchevalier at openjdk.org (Marc Chevalier)
Date: Tue, 6 May 2025 07:46:14 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
Message-ID: <XrHBrp81T81JlX15Yc3cTb-fPwGqo5uX-tj4HizOzko=.9d6f5dd0-3239-47f1-a593-87a208ad5c99@github.com>

On Wed, 30 Apr 2025 13:18:33 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> A first part toward a better support of pure functions.
> 
> ## Pure Functions
> 
> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exceptions or the like), cannot deopt, etc. It's really a function that you can execute anywhere, with whichever arguments, without any effect other than wasting time. Integer division is not pure, as dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
> 
> ## Scope
> 
> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
> 
> ## Implementation Overview
> 
> We created here some new node kind for pure calls that are expanded into regular calls during macro expansion. This also allows the removal of `ModD` and `ModF` nodes that have their pure equivalent now. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
> 
> IR framework and IGV needed a little bit of fixing.
> 
> Thanks,
> Marc

Thanks for the comment. I'll think deeper about it.

I started by trying to make PureCall a subclass of Call (or a property of LeafCall), but that broke a lot of things that relied on invariants of CallNode that no longer held. After some time tracking bugs and trying to fix them, I thought it would be simpler to have a new kind of node, and that it would have less impact on existing code. Another reason I changed it to a direct subclass of Node is that I felt it made little sense for it to be a Call (or a subclass of one), since Calls are Safepoints but pure calls don't need to be (and similar "conceptual" problems). It seemed like a hack to me.

About
> support arbitrary nodes to be lowered into leaf runtime calls.

I don't think I understand what you mean. Overall, I see the weaknesses of my design, but I'm not sure which direction to take instead.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2853576338

From shade at openjdk.org  Tue May  6 08:15:16 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 6 May 2025 08:15:16 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v4]
In-Reply-To: <dsYp242p-1jCg-9vmI8fKobTJqQlT8WACQ1e2McNLtg=.973251bd-e79e-4835-8a22-f24a6a60289a@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <dsYp242p-1jCg-9vmI8fKobTJqQlT8WACQ1e2McNLtg=.973251bd-e79e-4835-8a22-f24a6a60289a@github.com>
Message-ID: <yCtKBT4wEUERpf_F77nWQjEaPlc4TypD2S3xRrG7Qzc=.b9e110c1-2e64-442d-8ed3-b9f9aa39053f@github.com>

On Mon, 5 May 2025 20:25:27 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>> 
>> Testing:
>>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Simplify pre-barrier runtime entry

All right, this works, thanks!

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25001#pullrequestreview-2817308091

From jbhateja at openjdk.org  Tue May  6 08:49:57 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 6 May 2025 08:49:57 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v13]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <lrXV4-uHKQ8BqYP_NdiVgpwaTK9B8uil7yQOznJPAh4=.9448f543-c8ed-4b46-84bf-955e7c3ee260@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Review comments resolutions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/7b414b8c..b25cc776

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=11-12

  Stats: 441 lines in 9 files changed: 106 ins; 107 del; 228 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From jbhateja at openjdk.org  Tue May  6 08:49:58 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 6 May 2025 08:49:58 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v12]
In-Reply-To: <FqDRgTOj-VT5-FgH_RvBr_gmK2U6W0UKFws8iihIY5s=.c8c93edf-e447-4c33-903d-035623f38ce2@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <2ioSQVtfXhnqvAXqiadwR1HuJsz3t9nytY0wRps-x68=.35220ade-0e70-41c6-9ebd-a271e7dcb2bb@github.com>
 <FqDRgTOj-VT5-FgH_RvBr_gmK2U6W0UKFws8iihIY5s=.c8c93edf-e447-4c33-903d-035623f38ce2@github.com>
Message-ID: <kX-yWgMUVIstwjYZWPqa8OBpQG3VicnsQ4UvlllvEzY=.9621f732-43f5-4998-a518-66d574b68224@github.com>

On Tue, 6 May 2025 00:30:23 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains two new commits since the last revision:
>> 
>>  - Updating comment
>>  - Review comments resolutions
>
> src/hotspot/cpu/x86/vm_version_x86.cpp line 1102:
> 
>> 1100:   size_t buf_iter = cpu_info_size;
>> 1101:   for (uint64_t i = 0; i < features_vector_size(); i++) {
>> 1102:     insert_features_names(features_vector_elem(i), buf + buf_iter, sizeof(buf) - buf_iter, _features_names, 64 * i);
> 
> `Abstract_VM_Version::insert_features_names` is used only on x86. You can move it to `vm_version_x86.cpp/.hpp` and adjust to new layout.

DONE

> src/hotspot/share/runtime/abstract_vm_version.hpp line 51:
> 
>> 49: class VM_Features {
>> 50:  public:
>> 51:   using FeatureVector = uint64_t [MAX_FEATURE_VEC_SIZE];
> 
> Why did you decide to declare new type name for fixed size array type? I see you use `FeatureVector` in `vmStructs*` and JVMCI code. Does it make things simpler there?

Yes. I was facing compilation issues with raw array types.

> src/hotspot/share/runtime/abstract_vm_version.hpp line 91:
> 
>> 89: 
>> 90:   // CPU feature flags vector, can be affected by VM settings.
>> 91:   static VM_Features _vm_target_features;
> 
> Unless we plan to migrate all platforms at once, I suggest moving this code into `VM_Version` and keeping the same names (`_features` and `_cpu_features`). Ideally, the `_features` field can be moved from `Abstract_VM_Version` to the platform-specific `VM_Version`s across all platforms. But leaving it as is for now is also fine with me.
> 
> There's a precedent: `VM_Version` already overrides `_features` field on s390 [1]. 
> 
> `VM_Features` class can start as x86-specific, but for advertisement purposes it makes sense to keep it in `abstract_vm_version.hpp`.
> 
> Alternatively, `Abstract_VM_Version::_features` can be converted from `uint64_t` to `VM_Features` and non-x86 platforms can be covered by providing overloads for currently used operators (it's mostly `|=`, `&=`, and `&`, plus conversions).
> 
> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/s390/vm_version_s390.hpp#L130

Moved VM_Features to VM_Version.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2075014479
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2075012045
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2075015126

From jbhateja at openjdk.org  Tue May  6 08:49:58 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 6 May 2025 08:49:58 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v13]
In-Reply-To: <FqDRgTOj-VT5-FgH_RvBr_gmK2U6W0UKFws8iihIY5s=.c8c93edf-e447-4c33-903d-035623f38ce2@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <2ioSQVtfXhnqvAXqiadwR1HuJsz3t9nytY0wRps-x68=.35220ade-0e70-41c6-9ebd-a271e7dcb2bb@github.com>
 <FqDRgTOj-VT5-FgH_RvBr_gmK2U6W0UKFws8iihIY5s=.c8c93edf-e447-4c33-903d-035623f38ce2@github.com>
Message-ID: <RVhjRRZE1AFOaHCMnU6NGElhid3utnbz8olv3Yqwi1o=.6e9f23ed-1db4-4e28-84e2-99ad436d70fe@github.com>

On Tue, 6 May 2025 00:57:29 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Review comments resolutions
>
> src/hotspot/cpu/x86/vm_version_x86.hpp line 707:
> 
>> 705:   //
>> 706:   static bool supports_cpuid()        { return _features  != 0; }
>> 707:   static bool supports_cmov()         { return (_features & CPU_CMOV) != 0; }
> 
> Since you touch this code anyway, I suggest to use this opportunity to automatically derive this code using  `CPU_FEATURE_FLAGS` macro. (As an example [1].)
> 
> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp#L147

Unlike AARCH64, there is no 1:1 mapping between CPU_* features and the corresponding support checkers; some AVX512 checkers use multiple features. Skipping this for now for consistency.
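
To illustrate the point about the mapping, here is a minimal sketch of macro-derived checkers next to a hand-written one. Everything in it is hypothetical: `SKETCH_CPU_FEATURE_FLAGS`, `VersionSketch`, the shortened flag list and the bit positions are made up for illustration and are not the HotSpot definitions.

```cpp
#include <cstdint>

// Hypothetical, shortened feature list in X-macro form: (id, name, bit).
#define SKETCH_CPU_FEATURE_FLAGS(decl) \
  decl(CMOV,     cmov,     0)          \
  decl(AVX512F,  avx512f,  1)          \
  decl(AVX512VL, avx512vl, 2)          \
  decl(AVX512BW, avx512bw, 3)

struct VersionSketch {
  // One enum constant per list entry.
  enum Feature_Flag : uint64_t {
#define DECLARE_FLAG(id, name, bit) CPU_##id = (uint64_t(1) << bit),
    SKETCH_CPU_FEATURE_FLAGS(DECLARE_FLAG)
#undef DECLARE_FLAG
  };

  uint64_t features;

  // One generated checker per list entry: this is the 1:1 case.
#define DECLARE_CHECKER(id, name, bit) \
  constexpr bool supports_##name() const { return (features & CPU_##id) != 0; }
  SKETCH_CPU_FEATURE_FLAGS(DECLARE_CHECKER)
#undef DECLARE_CHECKER

  // A checker that combines several flags cannot be generated from the
  // list alone and still has to be written by hand.
  constexpr bool supports_avx512vlbw() const {
    return supports_avx512f() && supports_avx512vl() && supports_avx512bw();
  }
};
```

With a layout like this the single-flag checkers could be generated while the combined ones stay manual, which would leave the two styles mixed in one header.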

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2075012161

From jbhateja at openjdk.org  Tue May  6 08:57:18 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 6 May 2025 08:57:18 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <3t1R35B9bafRtfvqfE7D2dAeLrjaDukXlDUGb-3VtaA=.46d64318-e9fb-4bf3-8a68-8dba2c2b7b26@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <V7WbvFgHqGmEcatKP9sIZApYAIdVjED16hqGNrUM1vM=.cccbbe34-5b13-4b38-9421-444050847951@github.com>
 <pMCizZGRrUCno2iOhfBp2dpdf4Epk92ZnPt4iv2ZYKw=.9d5ab0d7-414f-495e-ade0-780666f3aef6@github.com>
 <oSgjp-pN7JPy1pRA-vivck-P5sv8vMLiK9T88YTzmLU=.30323e8c-6e7a-41b6-b082-32a1485008f5@github.com>
 <3t1R35B9bafRtfvqfE7D2dAeLrjaDukXlDUGb-3VtaA=.46d64318-e9fb-4bf3-8a68-8dba2c2b7b26@github.com>
Message-ID: <s8GDuF_-bhg9VEXr5EAM-f1Z_wHZDGjGAD3WgpwJMTg=.91a6bb41-efee-4f41-994c-dfef142fc2ad@github.com>

On Sat, 3 May 2025 08:13:11 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>>> Ok, thanks! I wasn't sure you finished the pass.
>>> 
>>> I'm still seeing dynamic memory allocation which IMO unnecessarily complicates the implementation. Bitmap size is fixed and well-known at compile time. It enables the `VM_Features` class to embed the array of proper size inline. And it eliminates all the problems related to undesired sharing of the backing array. (Also, `pre_initialize()` is not needed either.)
>> 
>> Bitmap size depends on the maximum feature enum value, I made it dynamic to keep it flexible. Do you want the feature vector size to be made constant and manually bump it when we exhaust the limit?
>
>> Bitmap size depends on the maximum feature enum value, I made it dynamic to keep it flexible. Do you want the feature vector size to be made constant and manually bump it when we exhaust the limit?
> 
> Yes, please. (The limit may be precise - number of elements in Feature_Flag enum - but the logic which computes the size of backing array can automatically round it and bump the size once the actual limit is reached.)
> 
>> pre_initialize was put in place because codeCache_init() precedes VM_Version_init()
> 
> I wanted to say that the sole purpose of `pre_initialize` is to allocate memory. Once it goes away, there's no reason to keep it.

Hi @iwanowww , your comments have been addressed.
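
For readers following the thread, the compile-time sizing discussed above can be sketched roughly as follows. This is only an illustration under assumptions: the enum is a shortened, hypothetical feature list, and `feature_vector_size` and `InlineFeatures` are names invented here, not the ones in the patch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, shortened feature list; the real enum has many more entries.
enum Feature_Flag : uint32_t {
  CPU_CX8,
  CPU_CMOV,
  CPU_AVX10_1,
  CPU_AVX10_2,
  MAX_CPU_FEATURES  // count of features, must stay last
};

constexpr size_t kBitsPerElement = 64;

// Round the feature count up to whole 64-bit elements, so the backing
// array grows by itself once a new enum entry crosses a 64-bit boundary.
constexpr size_t feature_vector_size(size_t num_features) {
  return (num_features + kBitsPerElement - 1) / kBitsPerElement;
}

struct InlineFeatures {
  // Sized at compile time and embedded inline: no allocation, no sharing.
  uint64_t bits[feature_vector_size(MAX_CPU_FEATURES)];
};

static_assert(feature_vector_size(64) == 1, "64 features fit in one element");
static_assert(feature_vector_size(65) == 2, "the 65th feature bumps the size");
static_assert(sizeof(InlineFeatures) == sizeof(uint64_t), "four features need one element");
```

Because the array lives inside the object, nothing has to be allocated before `VM_Version` initialization, which is why a separate pre-initialization step is no longer required in such a design.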

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24329#issuecomment-2853772762

From shade at openjdk.org  Tue May  6 09:57:47 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 6 May 2025 09:57:47 GMT
Subject: RFR: 8356259: Lift basic -Xlog:jit* logging to "info" level
Message-ID: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>

We have unified logging for JIT activity: -Xlog:jit+compilation, -Xlog:jit+inlining, etc. These serve as convenient replacements for -XX:+PrintCompilation, -XX:+PrintInlining, etc. And these replacements are useful, because UL can be forwarded to file, their format can be adjusted, and they can be handled asynchronously.

However, all useful messages are on "debug" level, which is inconvenient and surprising. It is reasonable to expect some level of basic logging when supplying -Xlog:jit+compilation, e.g. "info" level. I believe we should lift at least some of the logging to "info" level for these.

Additional testing:
 - [x] Eyeballing `-Xlog:jit*` logs after the patch
 - [ ] Linux x86_64 server fastdebug, `all`

-------------

Commit messages:
 - Fix

Changes: https://git.openjdk.org/jdk/pull/25061/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25061&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356259
  Stats: 6 lines in 3 files changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/25061.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25061/head:pull/25061

PR: https://git.openjdk.org/jdk/pull/25061

From rkennke at openjdk.org  Tue May  6 11:11:19 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 6 May 2025 11:11:19 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v4]
In-Reply-To: <dsYp242p-1jCg-9vmI8fKobTJqQlT8WACQ1e2McNLtg=.973251bd-e79e-4835-8a22-f24a6a60289a@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <dsYp242p-1jCg-9vmI8fKobTJqQlT8WACQ1e2McNLtg=.973251bd-e79e-4835-8a22-f24a6a60289a@github.com>
Message-ID: <nP9LzqrUbUBvmgbSu0HB4Z0woXzxEk-K6vTNOatdQ1M=.69b52422-ae8d-412d-9030-c75728f1664b@github.com>

On Mon, 5 May 2025 20:25:27 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>> 
>> Testing:
>>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Simplify pre-barrier runtime entry

Thanks!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25001#issuecomment-2854170217

From rkennke at openjdk.org  Tue May  6 11:11:19 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 6 May 2025 11:11:19 GMT
Subject: Integrated: 8356075: Support Shenandoah GC in JVMCI
In-Reply-To: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
Message-ID: <tmdA6heOmPzZfEGa_IO7ocNooDFyEeSZQlkEbnXwAug=.14f43427-d0f7-410c-9d33-07f6a7532e7f@github.com>

On Fri, 2 May 2025 10:35:03 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
> 
> Testing:
>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904

This pull request has now been integrated.

Changeset: 614ba9fc
Author:    Roman Kennke <rkennke at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/614ba9fc41a0274a31f0e8eff8a598a7c5afe164
Stats:     62 lines in 7 files changed: 61 ins; 0 del; 1 mod

8356075: Support Shenandoah GC in JVMCI

Reviewed-by: shade, dnsimon, cslucas

-------------

PR: https://git.openjdk.org/jdk/pull/25001

From jbhateja at openjdk.org  Tue May  6 11:19:54 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 6 May 2025 11:19:54 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v14]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <htFy0mXL1E_ZcrhKRGOyVplMlpAkJHGAbDI4CPgMeeU=.625353a5-b1b2-4ecb-988c-c0b5a80d8d37@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  build fixes for non-x86 targets

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/b25cc776..650e3d61

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=13
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=12-13

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From qamai at openjdk.org  Tue May  6 11:50:19 2025
From: qamai at openjdk.org (Quan Anh Mai)
Date: Tue, 6 May 2025 11:50:19 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v14]
In-Reply-To: <htFy0mXL1E_ZcrhKRGOyVplMlpAkJHGAbDI4CPgMeeU=.625353a5-b1b2-4ecb-988c-c0b5a80d8d37@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <htFy0mXL1E_ZcrhKRGOyVplMlpAkJHGAbDI4CPgMeeU=.625353a5-b1b2-4ecb-988c-c0b5a80d8d37@github.com>
Message-ID: <zkBUcdFsiSwgD9h_Xr91R3x1NRyEh7clqITrjvVPJEM=.851763a5-2e9e-4cdd-a518-0c4cc791081c@github.com>

On Tue, 6 May 2025 11:19:54 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   build fixes for non-x86 targets

src/hotspot/cpu/x86/vm_version_x86.hpp line 37:

> 35: class VM_Features {
> 36:  public:
> 37:   using FeatureVector = uint64_t [MAX_FEATURE_VEC_SIZE];

Do you think it would be better to refactor this into a separate class analogous to `std::bitset`? You can start with only implementing `test`, `set`, `reset`. This would help in other use cases, too.

https://en.cppreference.com/w/cpp/utility/bitset
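
A minimal sketch of what such a class could look like, limited to the three operations named above. `SmallBitset` and `bitset_roundtrip` are hypothetical names, not existing HotSpot code; the sketch does no bounds checking and assumes C++14 or later for the `constexpr` mutators.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size bitset with test/set/reset, loosely modelled on
// the std::bitset interface. Positions are not range-checked.
template <size_t NBits>
class SmallBitset {
  static_assert(NBits > 0, "bitset must hold at least one bit");
  static constexpr size_t kWordBits = 64;
  static constexpr size_t kWords = (NBits + kWordBits - 1) / kWordBits;

  uint64_t _words[kWords] = {};  // zero-initialized inline storage

  static constexpr uint64_t mask(size_t pos) {
    return uint64_t(1) << (pos % kWordBits);
  }

 public:
  constexpr bool test(size_t pos) const {
    return (_words[pos / kWordBits] & mask(pos)) != 0;
  }
  constexpr void set(size_t pos)   { _words[pos / kWordBits] |= mask(pos); }
  constexpr void reset(size_t pos) { _words[pos / kWordBits] &= ~mask(pos); }
};

// Usage example: set a bit in the third word, check it, then clear it.
constexpr bool bitset_roundtrip() {
  SmallBitset<130> b{};
  b.set(129);
  const bool was_set = b.test(129) && !b.test(128);
  b.reset(129);
  return was_set && !b.test(129);
}
static_assert(bitset_roundtrip(), "set/test/reset should round-trip");
```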

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2075295556

From qamai at openjdk.org  Tue May  6 11:54:22 2025
From: qamai at openjdk.org (Quan Anh Mai)
Date: Tue, 6 May 2025 11:54:22 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v14]
In-Reply-To: <htFy0mXL1E_ZcrhKRGOyVplMlpAkJHGAbDI4CPgMeeU=.625353a5-b1b2-4ecb-988c-c0b5a80d8d37@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <htFy0mXL1E_ZcrhKRGOyVplMlpAkJHGAbDI4CPgMeeU=.625353a5-b1b2-4ecb-988c-c0b5a80d8d37@github.com>
Message-ID: <wJ00X2kc5wnlUhLY_LW9jYhv0WEAYoAC3fCVXTwjpAw=.d28d4eea-8fe7-4c76-8517-344b547e65f5@github.com>

On Tue, 6 May 2025 11:19:54 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   build fixes for non-x86 targets

src/hotspot/cpu/x86/vm_version_x86.hpp line 44:

> 42:   // log2 of feature vector element size in bits, used by JVMCI to check enabled feature bits.
> 43:   // Refer HotSpotJVMCIBackendFactory::convertFeaturesVector.
> 44:   static uint32_t _features_vector_element_shift_count;

Making this `static constexpr` helps constant folding, too.
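
As a rough illustration of the constant-folding point, assuming 64-bit elements (so a shift count of 6): `ShiftSketch`, `element_index` and `bit_mask` are hypothetical names, and whether a `constexpr` member can still be exported to JVMCI the way the patch needs is an open question this sketch does not address.

```cpp
#include <cstdint>

// Hypothetical stand-in for the class under review.
struct ShiftSketch {
  // log2 of the element width in bits: 64-bit elements give a shift of 6.
  static constexpr uint32_t features_vector_element_shift_count = 6;

  // Index of the 64-bit element that holds a given feature bit.
  static constexpr uint32_t element_index(uint32_t feature) {
    return feature >> features_vector_element_shift_count;
  }

  // Mask selecting the feature's bit inside its element.
  static constexpr uint64_t bit_mask(uint32_t feature) {
    return uint64_t(1)
        << (feature & ((1u << features_vector_element_shift_count) - 1));
  }
};

// With a constexpr shift the compiler can evaluate these at compile time.
static_assert(ShiftSketch::element_index(70) == 1, "feature 70 is in element 1");
static_assert(ShiftSketch::bit_mask(70) == (uint64_t(1) << 6), "bit 6 of that element");
```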

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2075301116

From dnsimon at openjdk.org  Tue May  6 11:56:23 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 6 May 2025 11:56:23 GMT
Subject: RFR: 8356075: Support Shenandoah GC in JVMCI [v4]
In-Reply-To: <dsYp242p-1jCg-9vmI8fKobTJqQlT8WACQ1e2McNLtg=.973251bd-e79e-4835-8a22-f24a6a60289a@github.com>
References: <pGN_wuIoTkDaMGwotFCH_1PiohTPyFP05-7DNXmsWU4=.ce3fff95-7bd5-4e9b-aa29-cd93c0016cc2@github.com>
 <dsYp242p-1jCg-9vmI8fKobTJqQlT8WACQ1e2McNLtg=.973251bd-e79e-4835-8a22-f24a6a60289a@github.com>
Message-ID: <DtPx69gE3XTj--bYZNZqvlVhgx6DGptRASZDeXuvF2E=.e0d179c7-55b1-466a-93d6-160d7185b451@github.com>

On Mon, 5 May 2025 20:25:27 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> In order to support Shenandoah GC in Graal, some changes are required in JVMCI, namely, export Shenandoah relevant symbols.
>> 
>> Testing:
>>  - [x] extensive testing with https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Simplify pre-barrier runtime entry

src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 239:

> 237:     cardtable_start_address = base;
> 238:     cardtable_shift = CardTable::card_shift();
> 239:   } else if (bs->is_a(BarrierSet::ShenandoahBarrierSet)) {

This change is causing a failure in mach5 tier 1:

[2025-05-06T11:34:44,742Z] /workspace/open/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp:239:35: error: no member named 'ShenandoahBarrierSet' in 'BarrierSet'
[2025-05-06T11:34:44,742Z]   } else if (bs->is_a(BarrierSet::ShenandoahBarrierSet)) {
[2025-05-06T11:34:44,742Z]                       ~~~~~~~~~~~~^
[2025-05-06T11:34:45,729Z] 1 error generated.

 I assume it's missing `#if INCLUDE_SHENANDOAHGC`.
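
A minimal sketch of the kind of guard meant here, assuming the enum entry only exists when the GC is compiled in. `BarrierSetSketch` and `barrier_code` are hypothetical stand-ins, and the macro is defaulted to 0 to mimic a build without Shenandoah; this is not the actual jvmciCompilerToVMInit.cpp code.

```cpp
// Stand-in for HotSpot's build-time macro; defaulted to 0 here to mimic a
// build configured without Shenandoah.
#ifndef INCLUDE_SHENANDOAHGC
#define INCLUDE_SHENANDOAHGC 0
#endif

// Hypothetical, simplified barrier-set type; not the real BarrierSet class.
struct BarrierSetSketch {
  enum Name {
    CardTableBarrierSet,
#if INCLUDE_SHENANDOAHGC
    ShenandoahBarrierSet,  // only exists when the GC is compiled in
#endif
    UnknownBarrierSet
  };
  Name kind;
  constexpr bool is_a(Name n) const { return kind == n; }
};

// Returns 1 for card table, 2 for Shenandoah, 0 otherwise.
constexpr int barrier_code(const BarrierSetSketch& bs) {
  if (bs.is_a(BarrierSetSketch::CardTableBarrierSet)) {
    return 1;
#if INCLUDE_SHENANDOAHGC
  } else if (bs.is_a(BarrierSetSketch::ShenandoahBarrierSet)) {
    // Without the guard, this branch names an enum entry that does not
    // exist in non-Shenandoah builds and fails to compile.
    return 2;
#endif
  }
  return 0;
}
```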

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25001#discussion_r2075304100

From jbhateja at openjdk.org  Tue May  6 12:09:21 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 6 May 2025 12:09:21 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v14]
In-Reply-To: <zkBUcdFsiSwgD9h_Xr91R3x1NRyEh7clqITrjvVPJEM=.851763a5-2e9e-4cdd-a518-0c4cc791081c@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <htFy0mXL1E_ZcrhKRGOyVplMlpAkJHGAbDI4CPgMeeU=.625353a5-b1b2-4ecb-988c-c0b5a80d8d37@github.com>
 <zkBUcdFsiSwgD9h_Xr91R3x1NRyEh7clqITrjvVPJEM=.851763a5-2e9e-4cdd-a518-0c4cc791081c@github.com>
Message-ID: <zQ_fVzpIZ9X_k7WF66cUZHdq2MQUqD78QTGxUBNMBRY=.0664e179-6321-43e2-956d-74a91f8b9187@github.com>

On Tue, 6 May 2025 11:47:47 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   build fixes for non-x86 targets
>
> src/hotspot/cpu/x86/vm_version_x86.hpp line 37:
> 
>> 35: class VM_Features {
>> 36:  public:
>> 37:   using FeatureVector = uint64_t [MAX_FEATURE_VEC_SIZE];
> 
> Do you think it would be better to refactor this into a separate class analogous to `std::bitset`? You can start with only implementing `test`, `set`, `reset`. This would help in other use cases, too.
> 
> https://en.cppreference.com/w/cpp/utility/bitset

In essence, what we have currently is a bitmap implementation, but its utility is limited to VM_Version for now. The current approach simplifies the JVMCI side of handling. We also have an existing bitset utility in src/hotspot/share/utilities/bitMap.hpp.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2075325468

From rkennke at openjdk.org  Tue May  6 12:22:53 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 6 May 2025 12:22:53 GMT
Subject: RFR: 8356266: Fix non-Shenandoah build after JDK-8356075
Message-ID: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>

[JDK-8356075](https://bugs.openjdk.org/browse/JDK-8356075) (see PR #25001) causes builds without Shenandoah GC to fail. It's missing an `#if INCLUDE_SHENANDOAHGC`.

Testing:
 - [x] Build without Shenandoah GC

-------------

Commit messages:
 - 8356266: Fix non-Shenandoah build after JDK-8356075

Changes: https://git.openjdk.org/jdk/pull/25064/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25064&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356266
  Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/25064.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25064/head:pull/25064

PR: https://git.openjdk.org/jdk/pull/25064

From dnsimon at openjdk.org  Tue May  6 12:46:16 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 6 May 2025 12:46:16 GMT
Subject: RFR: 8356266: Fix non-Shenandoah build after JDK-8356075
In-Reply-To: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
References: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
Message-ID: <108F8BKi1AuttNCA6a1RxJYTIVnP0phMzeaUNoHMq9Q=.44c15e9a-a8d7-42c7-97c2-f1eb0b6b5e04@github.com>

On Tue, 6 May 2025 12:17:44 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> [JDK-8356075](https://bugs.openjdk.org/browse/JDK-8356075) (see PR #25001) causes builds without Shenandoah GC to fail. It's missing an `#if INCLUDE_SHENANDOAHGC`.
> 
> Testing:
>  - [x] Build without Shenandoah GC

Marked as reviewed by dnsimon (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/25064#pullrequestreview-2818123867

From rkennke at openjdk.org  Tue May  6 13:18:23 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 6 May 2025 13:18:23 GMT
Subject: RFR: 8356266: Fix non-Shenandoah build after JDK-8356075
In-Reply-To: <108F8BKi1AuttNCA6a1RxJYTIVnP0phMzeaUNoHMq9Q=.44c15e9a-a8d7-42c7-97c2-f1eb0b6b5e04@github.com>
References: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
 <108F8BKi1AuttNCA6a1RxJYTIVnP0phMzeaUNoHMq9Q=.44c15e9a-a8d7-42c7-97c2-f1eb0b6b5e04@github.com>
Message-ID: <_dnHV1rf65FfgcxrigE2RMCBOBu_YUq58SAdmB2as2k=.605e1266-5e2f-44da-8889-3658545d6c1b@github.com>

On Tue, 6 May 2025 12:43:07 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> [JDK-8356075](https://bugs.openjdk.org/browse/JDK-8356075) (see PR #25001) causes builds without Shenandoah GC to fail. It's missing an `#if INCLUDE_SHENANDOAHGC`.
>> 
>> Testing:
>>  - [x] Build without Shenandoah GC
>
> Marked as reviewed by dnsimon (Reviewer).

Thanks, @dougxc! Is this trivial? Can I push this right away to fix the build?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25064#issuecomment-2854547042

From shade at openjdk.org  Tue May  6 13:28:25 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 6 May 2025 13:28:25 GMT
Subject: RFR: 8356266: Fix non-Shenandoah build after JDK-8356075
In-Reply-To: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
References: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
Message-ID: <y3rkCJ7JRsF1boSDeTQxSyYr8cPf9MVqdg7pRFkGRGI=.aad1583a-59da-4d39-9b86-f731f97c0b33@github.com>

On Tue, 6 May 2025 12:17:44 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> [JDK-8356075](https://bugs.openjdk.org/browse/JDK-8356075) (see PR #25001) causes builds without Shenandoah GC to fail. It's missing an `#if INCLUDE_SHENANDOAHGC`.
> 
> Testing:
>  - [x] Build without Shenandoah GC

Ah yes. Trivial.

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25064#pullrequestreview-2818264886

From rkennke at openjdk.org  Tue May  6 13:28:26 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 6 May 2025 13:28:26 GMT
Subject: RFR: 8356266: Fix non-Shenandoah build after JDK-8356075
In-Reply-To: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
References: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
Message-ID: <SHISxmE67neyEHiL2Limgt8RXGEbABVlOb-etRymezc=.21e17ebb-ab51-41b1-90ba-6aa7db64a1dc@github.com>

On Tue, 6 May 2025 12:17:44 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> [JDK-8356075](https://bugs.openjdk.org/browse/JDK-8356075) (see PR #25001) causes builds without Shenandoah GC to fail. It's missing an `#if INCLUDE_SHENANDOAHGC`.
> 
> Testing:
>  - [x] Build without Shenandoah GC

Some GHA failures - they look unrelated.

Thanks!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25064#issuecomment-2854558817
PR Comment: https://git.openjdk.org/jdk/pull/25064#issuecomment-2854572905

From rkennke at openjdk.org  Tue May  6 13:28:26 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 6 May 2025 13:28:26 GMT
Subject: Integrated: 8356266: Fix non-Shenandoah build after JDK-8356075
In-Reply-To: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
References: <9t9PKKEIz5lyztUpQjzlbAi218B71LKv2w-UvMikrF8=.987114a6-8e92-4193-910c-2688a8ecddcf@github.com>
Message-ID: <5xNQWiQmV33cfOTCB2_pb5B66d7L7IK2MXWEN-Gnqy4=.181a933e-4a2a-4210-8610-f03d62828c8c@github.com>

On Tue, 6 May 2025 12:17:44 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> [JDK-8356075](https://bugs.openjdk.org/browse/JDK-8356075) (see PR #25001) causes builds without Shenandoah GC to fail. It's missing an `#if INCLUDE_SHENANDOAHGC`.
> 
> Testing:
>  - [x] Build without Shenandoah GC

This pull request has now been integrated.

Changeset: bfdafb76
Author:    Roman Kennke <rkennke at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/bfdafb762661fad5746607aaf5b21d6d11c72ffc
Stats:     2 lines in 1 file changed: 2 ins; 0 del; 0 mod

8356266: Fix non-Shenandoah build after JDK-8356075

Reviewed-by: dnsimon, shade

-------------

PR: https://git.openjdk.org/jdk/pull/25064

From vlivanov at openjdk.org  Tue May  6 18:21:13 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Tue, 6 May 2025 18:21:13 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <XrHBrp81T81JlX15Yc3cTb-fPwGqo5uX-tj4HizOzko=.9d6f5dd0-3239-47f1-a593-87a208ad5c99@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
 <XrHBrp81T81JlX15Yc3cTb-fPwGqo5uX-tj4HizOzko=.9d6f5dd0-3239-47f1-a593-87a208ad5c99@github.com>
Message-ID: <MPLoUV2KuCkLI8ZnaiH6W8hSz1kLlTvJgyLRYXnLnt0=.a39867b2-e2f6-4527-b027-363d93400950@github.com>

On Tue, 6 May 2025 07:43:57 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

>> support arbitrary nodes to be lowered into leaf runtime calls.

A leaf runtime call which doesn't depend on or change memory state can be inserted at arbitrary points in the graph. So, an arbitrary data node can be lowered into a runtime call once the place to insert it is known/chosen.


> Overall, I see the weaknesses of my design, but I'm not sure which direction to take instead.

I suggest experimenting with untangling `ModF`/`ModD` from `CallLeaf`, making them expensive nodes (to avoid commoning during GVN), while still lowering them into `CallLeaf`.
(It doesn't have to be part of existing macro expansion. Depending on implementation considerations, earlier or later may be more appropriate. But it should be expanded before RA kicks in.)

The hard part is probably picking a point in the CFG to insert the call: the control the node has may not be suitable for that (e.g., if inputs don't dominate control anymore). In that case, updating the control input during loop opts may be an option.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2855510094

From kvn at openjdk.org  Tue May  6 18:42:13 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Tue, 6 May 2025 18:42:13 GMT
Subject: RFR: 8356259: Lift basic -Xlog:jit* logging to "info" level
In-Reply-To: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
References: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
Message-ID: <6hO6sv_xTTfD8CuETfeCFvN0oURmjfX9PDIwzd4EnG4=.35d3674c-6e9e-444f-af5f-bc47586530b9@github.com>

On Tue, 6 May 2025 09:52:24 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> We have unified logging for JIT activity: -Xlog:jit+compilation, -Xlog:jit+inlining, etc. These serve as convenient replacements for -XX:+PrintCompilation, -XX:+PrintInlining, etc. And these replacements are useful, because UL can be forwarded to file, their format can be adjusted, and they can be handled asynchronously.
> 
> However, all useful messages are on "debug" level, which is inconvenient and surprising. It is reasonable to expect some level of basic logging when supplying -Xlog:jit+compilation, e.g. "info" level. I believe we should lift at least some of the logging to "info" level for these.
> 
> Additional testing:
>  - [x] Eyeballing `-Xlog:jit*` logs after the patch
>  - [ ] Linux x86_64 server fastdebug, `all`

PrintInlining and PrintIntrinsics are diagnostic flags (while PrintCompilation is product).
So mapping UL `Info` to product flag and `Debug` to diagnostic seems valid.

Based on this, I agree with changes to `CT::print_ul()` but not others.

-------------

PR Review: https://git.openjdk.org/jdk/pull/25061#pullrequestreview-2819277571

From shade at openjdk.org  Tue May  6 19:18:54 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 6 May 2025 19:18:54 GMT
Subject: RFR: 8356259: Lift basic -Xlog:jit* logging to "info" level [v2]
In-Reply-To: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
References: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
Message-ID: <JYbrnjaGPL5OptRHJNoreqBBf-soFtzFVLbZ8JktixI=.167f394d-4c01-4033-bdef-6fe7c3a659fd@github.com>

> We have unified logging for JIT activity: -Xlog:jit+compilation, -Xlog:jit+inlining, etc. These serve as convenient replacements for -XX:+PrintCompilation, -XX:+PrintInlining, etc. And these replacements are useful, because UL can be forwarded to file, their format can be adjusted, and they can be handled asynchronously.
> 
> However, all useful messages are on "debug" level, which is inconvenient and surprising. It is reasonable to expect some level of basic logging when supplying -Xlog:jit+compilation, e.g. "info" level. I believe we should lift at least some of the logging to "info" level for these.
> 
> Additional testing:
>  - [x] Eyeballing `-Xlog:jit*` logs after the patch
>  - [x] Linux x86_64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:

  Only do jit+compilation

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25061/files
  - new: https://git.openjdk.org/jdk/pull/25061/files/2e1b9e64..2b8c9576

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25061&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25061&range=00-01

  Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/25061.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25061/head:pull/25061

PR: https://git.openjdk.org/jdk/pull/25061

From shade at openjdk.org  Tue May  6 19:18:55 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 6 May 2025 19:18:55 GMT
Subject: RFR: 8356259: Lift basic -Xlog:jit* logging to "info" level [v2]
In-Reply-To: <6hO6sv_xTTfD8CuETfeCFvN0oURmjfX9PDIwzd4EnG4=.35d3674c-6e9e-444f-af5f-bc47586530b9@github.com>
References: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
 <6hO6sv_xTTfD8CuETfeCFvN0oURmjfX9PDIwzd4EnG4=.35d3674c-6e9e-444f-af5f-bc47586530b9@github.com>
Message-ID: <ONzfTr0NJiO5_ouNrA9zDxcZD2FYXipzMl8rjIUd00A=.22823ad9-d53d-4485-92cd-e47d7f3cff34@github.com>

On Tue, 6 May 2025 18:39:43 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

> PrintInlining and PrintIntrinsics are diagnostic flags (while PrintCompilation is product). So mapping UL `Info` to product flag and `Debug` to diagnostic seems valid. Based on this, I agree with changes to `CT::print_ul()` but not others.

I am mostly interested in `PrintCompilation` myself, so that would be an acceptable compromise.

However, I do believe that `PrintInlining` along with `TraceTypeProfile` are very useful to figure out performance anomalies in the field. Those really should not be diagnostic, and UL should really be "info" for them :) But we can have that discussion at some point later.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25061#issuecomment-2855647445

From duke at openjdk.org  Tue May  6 21:45:34 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Tue, 6 May 2025 21:45:34 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]
In-Reply-To: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
Message-ID: <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>

> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm.
> 
> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
> 
> For performance data collected with the built in **cbrt** micro-benchmark, see the table below. Each result is the mean of 8 individual runs. Overall, the intrinsic provides a performance uplift of 37%.
> 
> | Benchmark        | Throughput with baseline (op/s) | Throughput with intrinsic (op/s) | Speedup |
> | :----------------: | :----------------------------------: | :----------------------------------: | :---------: |
> | MathBench.cbrt | 152465                                        | 208537                                        | 1.37x       |
> 
> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.

Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:

  Add new set of cbrt micro-benchmarks

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24470/files
  - new: https://git.openjdk.org/jdk/pull/24470/files/3212c669..57412f0d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=01-02

  Stats: 148 lines in 1 file changed: 148 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/24470.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24470/head:pull/24470

PR: https://git.openjdk.org/jdk/pull/24470

From kvn at openjdk.org  Tue May  6 22:58:14 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Tue, 6 May 2025 22:58:14 GMT
Subject: RFR: 8356259: Lift basic -Xlog:jit* logging to "info" level [v2]
In-Reply-To: <JYbrnjaGPL5OptRHJNoreqBBf-soFtzFVLbZ8JktixI=.167f394d-4c01-4033-bdef-6fe7c3a659fd@github.com>
References: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
 <JYbrnjaGPL5OptRHJNoreqBBf-soFtzFVLbZ8JktixI=.167f394d-4c01-4033-bdef-6fe7c3a659fd@github.com>
Message-ID: <ooHUhSigK5JgiF_aDhFp4YXJ2S98hvFykiphB2BH-oY=.e5485e88-72a8-48a4-b2db-15abbeb82846@github.com>

On Tue, 6 May 2025 19:18:54 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> We have unified logging for JIT activity: -Xlog:jit+compilation, -Xlog:jit+inlining, etc. These serve as convenient replacements for -XX:+PrintCompilation, -XX:+PrintInlining, etc. And these replacements are useful, because UL can be forwarded to file, their format can be adjusted, and they can be handled asynchronously.
>> 
>> However, all useful messages are on "debug" level, which is inconvenient and surprising. It is reasonable to expect some level of basic logging when supplying -Xlog:jit+compilation, e.g. "info" level. I believe we should lift at least some of the logging to "info" level for these.
>> 
>> Additional testing:
>>  - [x] Eyeballing `-Xlog:jit*` logs after the patch
>>  - [x] Linux x86_64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Only do jit+compilation

Trivial.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25061#pullrequestreview-2819895653

From vlivanov at openjdk.org  Tue May  6 23:21:18 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Tue, 6 May 2025 23:21:18 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v14]
In-Reply-To: <htFy0mXL1E_ZcrhKRGOyVplMlpAkJHGAbDI4CPgMeeU=.625353a5-b1b2-4ecb-988c-c0b5a80d8d37@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <htFy0mXL1E_ZcrhKRGOyVplMlpAkJHGAbDI4CPgMeeU=.625353a5-b1b2-4ecb-988c-c0b5a80d8d37@github.com>
Message-ID: <NUwO63ZFtGGA3wuuYvgS89lZwEuug967jmkULxCWf6Q=.0d4aa20f-7686-41f8-aac2-620232e41453@github.com>

On Tue, 6 May 2025 11:19:54 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   build fixes for non-x86 targets

Very nice!

I made a cleanup pass over the code [1]. Feel free to incorporate it or let me know if you have any questions/concerns.

Meanwhile, submitted it for testing.

[1] https://github.com/iwanowww/jdk/commit/35aeb88d0d5667c9e4f699bb9b3b7169af96446a

-------------

PR Review: https://git.openjdk.org/jdk/pull/24329#pullrequestreview-2819173067

From vlivanov at openjdk.org  Tue May  6 23:21:19 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Tue, 6 May 2025 23:21:19 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v14]
In-Reply-To: <RVhjRRZE1AFOaHCMnU6NGElhid3utnbz8olv3Yqwi1o=.6e9f23ed-1db4-4e28-84e2-99ad436d70fe@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <2ioSQVtfXhnqvAXqiadwR1HuJsz3t9nytY0wRps-x68=.35220ade-0e70-41c6-9ebd-a271e7dcb2bb@github.com>
 <FqDRgTOj-VT5-FgH_RvBr_gmK2U6W0UKFws8iihIY5s=.c8c93edf-e447-4c33-903d-035623f38ce2@github.com>
 <RVhjRRZE1AFOaHCMnU6NGElhid3utnbz8olv3Yqwi1o=.6e9f23ed-1db4-4e28-84e2-99ad436d70fe@github.com>
Message-ID: <vU0bq-IJogcbNPVZz1qtNwnRnyz4LRF8O5IwvedVyZ8=.d529293a-7420-4e15-9c89-609eeaf17f87@github.com>

On Tue, 6 May 2025 08:45:15 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/cpu/x86/vm_version_x86.hpp line 707:
>> 
>>> 705:   //
>>> 706:   static bool supports_cpuid()        { return _features  != 0; }
>>> 707:   static bool supports_cmov()         { return (_features & CPU_CMOV) != 0; }
>> 
>> Since you touch this code anyway, I suggest using this opportunity to derive this code automatically with the `CPU_FEATURE_FLAGS` macro (see [1] for an example).
>> 
>> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp#L147
>
> Unlike AARCH64, there is no 1:1 mapping between CPU_* features and the corresponding support checkers; some AVX512 checkers use multiple features. Skipping this for now for consistency.

Sure, I'm fine with addressing it separately.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2075993391

From shade at openjdk.org  Wed May  7 07:07:18 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 7 May 2025 07:07:18 GMT
Subject: RFR: 8356259: Lift basic -Xlog:jit* logging to "info" level [v2]
In-Reply-To: <JYbrnjaGPL5OptRHJNoreqBBf-soFtzFVLbZ8JktixI=.167f394d-4c01-4033-bdef-6fe7c3a659fd@github.com>
References: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
 <JYbrnjaGPL5OptRHJNoreqBBf-soFtzFVLbZ8JktixI=.167f394d-4c01-4033-bdef-6fe7c3a659fd@github.com>
Message-ID: <I_rZ2uEz-M8fA8ITkH-jY2aLFvV_GdbioIfPMZx6v_Y=.31fcc58a-a50b-4f79-aea6-ef8a9020c955@github.com>

On Tue, 6 May 2025 19:18:54 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> We have unified logging for JIT activity: -Xlog:jit+compilation, -Xlog:jit+inlining, etc. These serve as convenient replacements for -XX:+PrintCompilation, -XX:+PrintInlining, etc. And these replacements are useful, because UL can be forwarded to file, their format can be adjusted, and they can be handled asynchronously.
>> 
>> However, all useful messages are on "debug" level, which is inconvenient and surprising. It is reasonable to expect some level of basic logging when supplying -Xlog:jit+compilation, e.g. "info" level. I believe we should lift at least some of the logging to "info" level for these.
>> 
>> Additional testing:
>>  - [x] Eyeballing `-Xlog:jit*` logs after the patch
>>  - [x] Linux x86_64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Only do jit+compilation

OK, thanks!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25061#issuecomment-2857360250

From shade at openjdk.org  Wed May  7 07:47:22 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 7 May 2025 07:47:22 GMT
Subject: Integrated: 8356259: Lift basic -Xlog:jit* logging to "info" level
In-Reply-To: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
References: <2fpJJXAU-vYZkTcjJtTiy5gie8wiw836gMv3kbcidXs=.47732a59-c5ce-4d66-9f40-8d78c657374f@github.com>
Message-ID: <761jqrKse3Lh7FxmHrUMnDPws8xEXOMB-o-Ry1HT6QI=.4c6bae97-8e59-4aff-aaa3-56dfac751eaa@github.com>

On Tue, 6 May 2025 09:52:24 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> We have unified logging for JIT activity: -Xlog:jit+compilation, -Xlog:jit+inlining, etc. These serve as convenient replacements for -XX:+PrintCompilation, -XX:+PrintInlining, etc. And these replacements are useful, because UL can be forwarded to file, their format can be adjusted, and they can be handled asynchronously.
> 
> However, all useful messages are on "debug" level, which is inconvenient and surprising. It is reasonable to expect some level of basic logging when supplying -Xlog:jit+compilation, e.g. "info" level. I believe we should lift at least some of the logging to "info" level for these.
> 
> Additional testing:
>  - [x] Eyeballing `-Xlog:jit*` logs after the patch
>  - [x] Linux x86_64 server fastdebug, `all`

This pull request has now been integrated.

Changeset: 50895835
Author:    Aleksey Shipilev <shade at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/50895835e0c78f54a0b33db7f42f3769e2a1e652
Stats:     2 lines in 1 file changed: 0 ins; 0 del; 2 mod

8356259: Lift basic -Xlog:jit* logging to "info" level

Reviewed-by: kvn

-------------

PR: https://git.openjdk.org/jdk/pull/25061

From aph at openjdk.org  Wed May  7 09:28:19 2025
From: aph at openjdk.org (Andrew Haley)
Date: Wed, 7 May 2025 09:28:19 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]
In-Reply-To: <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>
Message-ID: <m-sqt9KlVF_WJ5Plh4_nJaia7LTBpzHkO8svt7mcisw=.2ffba1c6-a433-41e3-8a7f-b33ff874b472@github.com>

On Tue, 6 May 2025 21:45:34 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Throughput with baseline (op/s) | Throughput with intrinsic (op/s) | Speedup |
>> | :-------------------------------------: | :----------------------------------: | :----------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                             | 17678                                          | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                         | 200897                                        | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add new set of cbrt micro-benchmarks

src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 62:

> 60: {
> 61:     0, 3220193280
> 62: };

What is this constant?

Its value is 0xbff0400000000000, which is -ve bit set, bias (top bit of exponent) clear, but one of the bits in the fraction is set. So its value is -0x1.04p+0. As well as the exponent it also sets the 1 bit, just below the 5 most significant bits of the fraction. I guess this in effect rounds up the value that is added in the final rounding.

Is that right?
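
The bit-level reading above can be checked with a few lines of Java. This is only a sketch: it assumes the two 32-bit words in the table are stored low word first, so that `3220193280` (`0xBFF04000`) forms the upper half of the double. The class and method names are made up for illustration and are not part of the PR.

```java
class CbrtConstantCheck {
    // Joins two 32-bit words into one IEEE 754 double.
    // Assumption: the table stores the low word first, then the high word.
    static double fromWords(long low, long high) {
        return Double.longBitsToDouble((high << 32) | (low & 0xFFFFFFFFL));
    }

    public static void main(String[] args) {
        double d = fromWords(0L, 3220193280L);
        // Raw bits: sign bit set, exponent 0x3ff, one fraction bit set (2^-6).
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(d)));
        // Hexadecimal floating-point form of the same value.
        System.out.println(Double.toHexString(d));
        // Decimal form: -(1 + 1/64).
        System.out.println(d);
    }
}
```

Under that word-order assumption the value comes out as -(1 + 2^-6), which agrees with the -0x1.04p+0 reading; whether it acts as a rounding adjustment is still a question for the author.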

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2077214995

From gbarany at openjdk.org  Wed May  7 11:27:36 2025
From: gbarany at openjdk.org (=?UTF-8?B?R2VyZ8O2?= Barany)
Date: Wed, 7 May 2025 11:27:36 GMT
Subject: RFR: 8354443: [Graal] crash after deopt in
 TestG1BarrierGeneration.java
Message-ID: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>

Remove special cases in `nmethod::is_deopt_entry` and `nmethod::is_deopt_mh_entry`. Graal used to generate a different code pattern from C2 for deopt handlers. This was changed in https://github.com/oracle/graal/commit/099f57b58edb23ed2184c11badea24edf36f30d2 to align Graal's code generation with C2. The special cases are no longer needed.

-------------

Commit messages:
 - 8354443: [Graal] crash after deopt in TestG1BarrierGeneration.java

Changes: https://git.openjdk.org/jdk/pull/25088/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25088&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8354443
  Stats: 11 lines in 1 file changed: 0 ins; 9 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/25088.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25088/head:pull/25088

PR: https://git.openjdk.org/jdk/pull/25088

From jbhateja at openjdk.org  Wed May  7 11:40:05 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Wed, 7 May 2025 11:40:05 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v15]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <K_d4lJb9g6XcCjqO1R1zwOWeYbDsORdfPtjjsy1WpgQ=.4ef67f70-096e-4285-9fe8-6caf1d634997@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision:

 - Making _features_bitmap size configurable
 - cleanups & refactorings

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/650e3d61..cfc09d05

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=14
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=13-14

  Stats: 192 lines in 9 files changed: 58 ins; 87 del; 47 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From dnsimon at openjdk.org  Wed May  7 13:25:15 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 7 May 2025 13:25:15 GMT
Subject: RFR: 8354443: [Graal] crash after deopt in
 TestG1BarrierGeneration.java
In-Reply-To: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
References: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
Message-ID: <AJjgvarmpGfjy0BvZ-DrSqJfY81snatM4GRPN9QkvWk=.fb33243a-58d6-4650-9e12-ddfb2d728b79@github.com>

On Wed, 7 May 2025 11:17:52 GMT, Gergö Barany <gbarany at openjdk.org> wrote:

> Remove special cases in `nmethod::is_deopt_entry` and `nmethod::is_deopt_mh_entry`. Graal used to generate a different code pattern from C2 for deopt handlers. This was changed in https://github.com/oracle/graal/commit/099f57b58edb23ed2184c11badea24edf36f30d2 to align Graal's code generation with C2. The special cases are no longer needed.

LGTM

-------------

Marked as reviewed by dnsimon (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25088#pullrequestreview-2821736057

From yzheng at openjdk.org  Wed May  7 13:36:16 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 7 May 2025 13:36:16 GMT
Subject: RFR: 8354443: [Graal] crash after deopt in
 TestG1BarrierGeneration.java
In-Reply-To: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
References: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
Message-ID: <l8_5UblzSofYOH5DqY2P4FNbiC95i7pSfYx2IbRDWzA=.68bfaf10-4e4b-4571-b3f2-5812ae6c4adb@github.com>

On Wed, 7 May 2025 11:17:52 GMT, Gergö Barany <gbarany at openjdk.org> wrote:

> Remove special cases in `nmethod::is_deopt_entry` and `nmethod::is_deopt_mh_entry`. Graal used to generate a different code pattern from C2 for deopt handlers. This was changed in https://github.com/oracle/graal/commit/099f57b58edb23ed2184c11badea24edf36f30d2 to align Graal's code generation with C2. The special cases are no longer needed.

LGTM

-------------

Marked as reviewed by yzheng (Committer).

PR Review: https://git.openjdk.org/jdk/pull/25088#pullrequestreview-2821790193

From gbarany at openjdk.org  Wed May  7 14:45:13 2025
From: gbarany at openjdk.org (=?UTF-8?B?R2VyZ8O2?= Barany)
Date: Wed, 7 May 2025 14:45:13 GMT
Subject: RFR: 8354443: [Graal] crash after deopt in
 TestG1BarrierGeneration.java
In-Reply-To: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
References: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
Message-ID: <FdNIvPwBvOqsgl5KBSxh48XkhKTyn4P7AFMl3gQLRhM=.bcd564f2-6acd-494f-96f7-cc29d0b71179@github.com>

On Wed, 7 May 2025 11:17:52 GMT, Gergö Barany <gbarany at openjdk.org> wrote:

> Remove special cases in `nmethod::is_deopt_entry` and `nmethod::is_deopt_mh_entry`. Graal used to generate a different code pattern from C2 for deopt handlers. This was changed in https://github.com/oracle/graal/commit/099f57b58edb23ed2184c11badea24edf36f30d2 to align Graal's code generation with C2. The special cases are no longer needed.

Thanks for your reviews!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25088#issuecomment-2858866796

From duke at openjdk.org  Wed May  7 14:45:13 2025
From: duke at openjdk.org (duke)
Date: Wed, 7 May 2025 14:45:13 GMT
Subject: RFR: 8354443: [Graal] crash after deopt in
 TestG1BarrierGeneration.java
In-Reply-To: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
References: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
Message-ID: <yUfpgaoXa_agbEpW7hdZjRu0NpAUzROk8JYGQVjuPbs=.261cb695-0176-4944-8f45-7f65e29a0e7f@github.com>

On Wed, 7 May 2025 11:17:52 GMT, Gergö Barany <gbarany at openjdk.org> wrote:

> Remove special cases in `nmethod::is_deopt_entry` and `nmethod::is_deopt_mh_entry`. Graal used to generate a different code pattern from C2 for deopt handlers. This was changed in https://github.com/oracle/graal/commit/099f57b58edb23ed2184c11badea24edf36f30d2 to align Graal's code generation with C2. The special cases are no longer needed.

@gergo- 
Your change (at version 8028476c2e28e2c168676209260fa68194f74cf1) is now ready to be sponsored by a Committer.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25088#issuecomment-2858870106

From gbarany at openjdk.org  Wed May  7 14:52:20 2025
From: gbarany at openjdk.org (=?UTF-8?B?R2VyZ8O2?= Barany)
Date: Wed, 7 May 2025 14:52:20 GMT
Subject: Integrated: 8354443: [Graal] crash after deopt in
 TestG1BarrierGeneration.java
In-Reply-To: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
References: <CNaSfj0qxtBvdfWDXD1zqaxQTdaIjKQSTz88hngKK7c=.7a22bc43-1f3e-421b-b19d-aca663f5b771@github.com>
Message-ID: <FBJz9djFYFUm2_j9Z5QmJmk-L_Cvv5NPoig_4sQmCaI=.2f630eec-bd4f-4504-bea7-bf212daa3bca@github.com>

On Wed, 7 May 2025 11:17:52 GMT, Gergö Barany <gbarany at openjdk.org> wrote:

> Remove special cases in `nmethod::is_deopt_entry` and `nmethod::is_deopt_mh_entry`. Graal used to generate a different code pattern from C2 for deopt handlers. This was changed in https://github.com/oracle/graal/commit/099f57b58edb23ed2184c11badea24edf36f30d2 to align Graal's code generation with C2. The special cases are no longer needed.

This pull request has now been integrated.

Changeset: 90f0f1b8
Author:    Gergö Barany <gbarany at openjdk.org>
Committer: Yudi Zheng <yzheng at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/90f0f1b88badbf1f72d7b9434621457aa47cde30
Stats:     11 lines in 1 file changed: 0 ins; 9 del; 2 mod

8354443: [Graal] crash after deopt in TestG1BarrierGeneration.java

Reviewed-by: dnsimon, yzheng

-------------

PR: https://git.openjdk.org/jdk/pull/25088

From yzheng at openjdk.org  Wed May  7 15:39:20 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 7 May 2025 15:39:20 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v15]
In-Reply-To: <K_d4lJb9g6XcCjqO1R1zwOWeYbDsORdfPtjjsy1WpgQ=.4ef67f70-096e-4285-9fe8-6caf1d634997@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <K_d4lJb9g6XcCjqO1R1zwOWeYbDsORdfPtjjsy1WpgQ=.4ef67f70-096e-4285-9fe8-6caf1d634997@github.com>
Message-ID: <4qUlnS5IhZxUDg2w5C3aAo_saQ1IXSnbkmSNwpgzpes=.092d9c5d-836d-41d9-aa9b-e94c4520fea7@github.com>

On Wed, 7 May 2025 11:40:05 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Making _features_bitmap size configurable
>  - cleanups & refactorings

JVMCI changes look good. Will run some Graal tests on this PR

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotJVMCIBackendFactory.java line 121:

> 119:                     long featureIndex = bitIndex >>> featuresElementShiftCount;
> 120:                     long featureBitMask = 1L << (bitIndex & featuresElementMask);
> 121:                     assert featureIndex < featuresBitMapSize;

`featuresBitMapSize` is size in bytes while `featureIndex` is index to long array
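
The unit mismatch can be illustrated with a minimal Java sketch. The class, method, and parameter names below are hypothetical and are not taken from HotSpotJVMCIBackendFactory; the sketch only assumes a bitmap stored as 64-bit words whose size is reported in bytes.

```java
// Hypothetical sketch: names are illustrative, not the JVMCI sources.
final class FeatureBitmapSketch {
    static final int ELEMENT_SHIFT = 6;            // log2(Long.SIZE)
    static final int ELEMENT_MASK = Long.SIZE - 1; // 63

    // Index of the 64-bit word that holds the given bit.
    static int wordIndex(int bitIndex) {
        return bitIndex >>> ELEMENT_SHIFT;
    }

    // Mask selecting the bit inside its word.
    static long bitMask(int bitIndex) {
        return 1L << (bitIndex & ELEMENT_MASK);
    }

    // Bounds check in matching units: convert the byte size to a word count first.
    static boolean inBounds(int bitIndex, int bitmapSizeInBytes) {
        int wordCount = bitmapSizeInBytes / Long.BYTES;
        return wordIndex(bitIndex) < wordCount;
    }

    // The mismatched check: a word index compared against a byte count.
    static boolean inBoundsMixedUnits(int bitIndex, int bitmapSizeInBytes) {
        return wordIndex(bitIndex) < bitmapSizeInBytes;
    }

    public static void main(String[] args) {
        int sizeInBytes = 16; // two 64-bit words, valid bit indices 0..127
        System.out.println(inBounds(128, sizeInBytes));           // false
        System.out.println(inBoundsMixedUnits(128, sizeInBytes)); // true, check is too weak
    }
}
```

With a 16-byte bitmap, bit 128 maps to word index 2, which is out of range for two words but still passes a comparison against the byte count.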

-------------

PR Review: https://git.openjdk.org/jdk/pull/24329#pullrequestreview-2822266780
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2077922290

From vlivanov at openjdk.org  Wed May  7 21:53:58 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Wed, 7 May 2025 21:53:58 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v15]
In-Reply-To: <K_d4lJb9g6XcCjqO1R1zwOWeYbDsORdfPtjjsy1WpgQ=.4ef67f70-096e-4285-9fe8-6caf1d634997@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <K_d4lJb9g6XcCjqO1R1zwOWeYbDsORdfPtjjsy1WpgQ=.4ef67f70-096e-4285-9fe8-6caf1d634997@github.com>
Message-ID: <xc4m_wi-kTWx-BFvaIA7f_F71unUJUdNcPPaqwX9zAs=.1eeb7eda-63fb-42fe-8739-6aa98b6a3671@github.com>

On Wed, 7 May 2025 11:40:05 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Making _features_bitmap size configurable
>  - cleanups & refactorings

There are some SA-related failures. Fixed by [1]. Otherwise, testing results are good.

[1] https://github.com/iwanowww/jdk/commit/9100ef190befbb1967f477532a0776c135a9b728

src/hotspot/cpu/x86/vm_version_x86.hpp line 458:

> 456: 
> 457:    private:
> 458:     uint64_t _features_bitmap[(MAX_CPU_FEATURES >> 6) + 1];

Suggestion:

    uint64_t _features_bitmap[(MAX_CPU_FEATURES / BitsPerLong) + 1];

src/hotspot/cpu/x86/vm_version_x86.hpp line 460:

> 458:     uint64_t _features_bitmap[(MAX_CPU_FEATURES >> 6) + 1];
> 459: 
> 460:     STATIC_ASSERT(sizeof(_features_bitmap) * BitsPerByte > MAX_CPU_FEATURES);

Suggestion:

    STATIC_ASSERT(sizeof(_features_bitmap) * BitsPerByte >= MAX_CPU_FEATURES);
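
As a rough illustration of the sizing arithmetic behind the two suggestions above, here is a small Java sketch. It is not the HotSpot code; the class and method names are hypothetical, and the ceiling-division variant is shown only for comparison and is not proposed in this review.

```java
// Hypothetical sketch of bitmap sizing arithmetic; not the HotSpot sources.
final class FeatureBitmapSizing {
    // Sizing as in the reviewed declaration: (max / bitsPerWord) + 1 words.
    static int wordsPlusOne(int maxFeatures) {
        return maxFeatures / Long.SIZE + 1;
    }

    // Ceiling division, shown only for comparison (an assumption of this sketch).
    static int wordsCeil(int maxFeatures) {
        return (maxFeatures + Long.SIZE - 1) / Long.SIZE;
    }

    // Number of bits that the given number of 64-bit words can hold.
    static int capacityBits(int words) {
        return words * Long.SIZE;
    }

    public static void main(String[] args) {
        int max = 64;
        // Plus-one sizing always leaves spare capacity: 128 bits for 64 features.
        System.out.println(capacityBits(wordsPlusOne(max)));
        // Ceiling sizing is exact here: 64 bits for 64 features, so a strict
        // "capacity > max" check would fail while "capacity >= max" holds.
        System.out.println(capacityBits(wordsCeil(max)));
    }
}
```

Under the plus-one sizing both forms of the assertion hold for any non-negative feature count; the `>=` form simply states the actual requirement, that every feature bit has a slot.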

-------------

PR Review: https://git.openjdk.org/jdk/pull/24329#pullrequestreview-2822970103
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2078346536
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2078354983

From vlivanov at openjdk.org  Wed May  7 21:53:59 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Wed, 7 May 2025 21:53:59 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v15]
In-Reply-To: <4qUlnS5IhZxUDg2w5C3aAo_saQ1IXSnbkmSNwpgzpes=.092d9c5d-836d-41d9-aa9b-e94c4520fea7@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <K_d4lJb9g6XcCjqO1R1zwOWeYbDsORdfPtjjsy1WpgQ=.4ef67f70-096e-4285-9fe8-6caf1d634997@github.com>
 <4qUlnS5IhZxUDg2w5C3aAo_saQ1IXSnbkmSNwpgzpes=.092d9c5d-836d-41d9-aa9b-e94c4520fea7@github.com>
Message-ID: <fN3sSmAsm7IyCtuI3wLNmMxJ_FvJwAgZ6Lg9Mm6O-uQ=.b064b9e2-a54c-43da-96b5-3fc496c82cb5@github.com>

On Wed, 7 May 2025 15:28:09 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - Making _features_bitmap size configurable
>>  - cleanups & refactorings
>
> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotJVMCIBackendFactory.java line 121:
> 
>> 119:                     long featureIndex = bitIndex >>> featuresElementShiftCount;
>> 120:                     long featureBitMask = 1L << (bitIndex & featuresElementMask);
>> 121:                     assert featureIndex < featuresBitMapSize;
> 
> `featuresBitMapSize` is size in bytes while `featureIndex` is index to long array

Good catch, Yudi.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2078544595

From jbhateja at openjdk.org  Thu May  8 13:49:22 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Thu, 8 May 2025 13:49:22 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v16]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <9Luwvte-huLN0cjCqBAdvitAE6ZwqPjmiLJSOEpFt04=.b9d7f325-0e85-44a9-ae18-2f770260c4f6@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Review suggestions incorporated

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/cfc09d05..8acbd7a6

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=15
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=14-15

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From jbhateja at openjdk.org  Thu May  8 14:44:43 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Thu, 8 May 2025 14:44:43 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v17]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <8tz0nbg5nt0WR_9Y_Zd_G2I26Dl8D4a5wBd0wBbrRQY=.2c71f9e8-8aa7-4a04-88df-d2ef018d73a8@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Code re-factoring from Vladimir

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/8acbd7a6..1a3bce93

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=16
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=15-16

  Stats: 21 lines in 3 files changed: 7 ins; 7 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From yzheng at openjdk.org  Thu May  8 14:49:42 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Thu, 8 May 2025 14:49:42 GMT
Subject: RFR: 8353735: [JVMCI] Allow specifying storage kind of the callee
 save register [v2]
In-Reply-To: <taWIY9EOZM8y920O4_MFI_-XpScHPSWbgpZqDrxOVdw=.20087dd3-85c4-444d-b4fc-4baae7acc2e2@github.com>
References: <taWIY9EOZM8y920O4_MFI_-XpScHPSWbgpZqDrxOVdw=.20087dd3-85c4-444d-b4fc-4baae7acc2e2@github.com>
Message-ID: <8jZWccxTMyrcHsQEiyaf6_TmGLBXIGdfW2bJWcVHMaU=.98eb7ab1-dc5c-4611-a2a9-4ca04d606836@github.com>

> Windows x64 ABI considers the upper portions of YMM0-YMM15 and ZMM0-ZMM15 volatile, that is, destroyed on function calls. This PR allows `RegisterConfig` implementations to refine the storage kind of callee save register, such that JVMCI compiler can exploit this information to avoid saving full width of these registers.

Yudi Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Update javadoc
 - Merge remote-tracking branch 'upstream/master' into JDK-8353735
 - [JVMCI] Allow specifying storage kind of the callee save register

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24451/files
  - new: https://git.openjdk.org/jdk/pull/24451/files/339b72ef..fcdfd10d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24451&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24451&range=00-01

  Stats: 315273 lines in 3080 files changed: 101272 ins; 201200 del; 12801 mod
  Patch: https://git.openjdk.org/jdk/pull/24451.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24451/head:pull/24451

PR: https://git.openjdk.org/jdk/pull/24451

From yzheng at openjdk.org  Thu May  8 14:57:10 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Thu, 8 May 2025 14:57:10 GMT
Subject: RFR: 8353735: [JVMCI] Allow specifying storage kind of the callee
 save register [v3]
In-Reply-To: <taWIY9EOZM8y920O4_MFI_-XpScHPSWbgpZqDrxOVdw=.20087dd3-85c4-444d-b4fc-4baae7acc2e2@github.com>
References: <taWIY9EOZM8y920O4_MFI_-XpScHPSWbgpZqDrxOVdw=.20087dd3-85c4-444d-b4fc-4baae7acc2e2@github.com>
Message-ID: <_8_bdUwiZc5xZqStJm2XfneFUTdCEx4c_uDsKJcMkTc=.1df612b0-30c8-4ae3-8706-bd634dd9fbc4@github.com>

> Windows x64 ABI considers the upper portions of YMM0-YMM15 and ZMM0-ZMM15 volatile, that is, destroyed on function calls. This PR allows `RegisterConfig` implementations to refine the storage kind of callee save register, such that JVMCI compiler can exploit this information to avoid saving full width of these registers.

Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:

  Update javadoc

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24451/files
  - new: https://git.openjdk.org/jdk/pull/24451/files/fcdfd10d..bc900518

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24451&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24451&range=01-02

  Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/24451.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24451/head:pull/24451

PR: https://git.openjdk.org/jdk/pull/24451

From dnsimon at openjdk.org  Thu May  8 14:57:11 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 8 May 2025 14:57:11 GMT
Subject: RFR: 8353735: [JVMCI] Allow specifying storage kind of the callee
 save register [v3]
In-Reply-To: <_8_bdUwiZc5xZqStJm2XfneFUTdCEx4c_uDsKJcMkTc=.1df612b0-30c8-4ae3-8706-bd634dd9fbc4@github.com>
References: <taWIY9EOZM8y920O4_MFI_-XpScHPSWbgpZqDrxOVdw=.20087dd3-85c4-444d-b4fc-4baae7acc2e2@github.com>
 <_8_bdUwiZc5xZqStJm2XfneFUTdCEx4c_uDsKJcMkTc=.1df612b0-30c8-4ae3-8706-bd634dd9fbc4@github.com>
Message-ID: <VkAc6YiBv1M6yH_79cl5zpHTi64nMyDBvgus_G0fUZc=.cae5dc4a-f547-4423-b30a-04c9072d031a@github.com>

On Thu, 8 May 2025 14:54:36 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> Windows x64 ABI considers the upper portions of YMM0-YMM15 and ZMM0-ZMM15 volatile, that is, destroyed on function calls. This PR allows `RegisterConfig` implementations to refine the storage kind of callee save register, such that JVMCI compiler can exploit this information to avoid saving full width of these registers.
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update javadoc

Still good.

-------------

Marked as reviewed by dnsimon (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24451#pullrequestreview-2825424244

From jbhateja at openjdk.org  Thu May  8 19:21:31 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Thu, 8 May 2025 19:21:31 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v18]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <0t720cpyX-RwVGVlm0b9gNbSjeMHWy5cnF-o4xSWRgU=.130e6474-3aa2-48a8-90d1-6f3a69c135ee@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Addressing Yudi's comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/1a3bce93..c65f0777

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=17
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=16-17

  Stats: 7 lines in 5 files changed: 2 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From vlivanov at openjdk.org  Thu May  8 19:23:59 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Thu, 8 May 2025 19:23:59 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v18]
In-Reply-To: <0t720cpyX-RwVGVlm0b9gNbSjeMHWy5cnF-o4xSWRgU=.130e6474-3aa2-48a8-90d1-6f3a69c135ee@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <0t720cpyX-RwVGVlm0b9gNbSjeMHWy5cnF-o4xSWRgU=.130e6474-3aa2-48a8-90d1-6f3a69c135ee@github.com>
Message-ID: <lP0APowdGdr4PVU_74fS55G0eOcdOuFFpEstsD36Utk=.8a150fd4-9d14-4c2e-b6e6-16c66ecade87@github.com>

On Thu, 8 May 2025 19:21:31 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Addressing Yudi's comments

Testing results (hs-tier1 - hs-tier4) are clean.

-------------

Marked as reviewed by vlivanov (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24329#pullrequestreview-2826156052

From yzheng at openjdk.org  Thu May  8 19:40:00 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Thu, 8 May 2025 19:40:00 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v18]
In-Reply-To: <0t720cpyX-RwVGVlm0b9gNbSjeMHWy5cnF-o4xSWRgU=.130e6474-3aa2-48a8-90d1-6f3a69c135ee@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <0t720cpyX-RwVGVlm0b9gNbSjeMHWy5cnF-o4xSWRgU=.130e6474-3aa2-48a8-90d1-6f3a69c135ee@github.com>
Message-ID: <CJkmmsKAQmjLEQXbZE1lU3bLXexWney70mtLK4XMjmU=.b34fa4de-7fa3-4576-b186-dbf986e8a03f@github.com>

On Thu, 8 May 2025 19:21:31 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Addressing Yudi's comments

CPU features in Graal remain the same after this PR. Passed all Graal compiler unit tests.

-------------

Marked as reviewed by yzheng (Committer).

PR Review: https://git.openjdk.org/jdk/pull/24329#pullrequestreview-2826187636

From sviswanathan at openjdk.org  Fri May  9 00:03:56 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 9 May 2025 00:03:56 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v9]
In-Reply-To: <c_cWPj1DB-YnZuxE1qWVaBd3OPtR1cAtFrjj5-kHIvw=.8358e6c4-fd58-4ff6-8b01-29546cc80b0a@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <w98GDb6NDmuCaqLQa2J4K9O4BtCiepKybqylTIFqxUs=.d90f5996-05c2-4279-8b27-38ab92cd40d3@github.com>
 <lKiwlSZEbVtX8pcXoGLs4-u3kJJsdtH_MW0g3eXFois=.6f8889e0-9e2d-4dd3-b413-cbaa2e121709@github.com>
 <c_cWPj1DB-YnZuxE1qWVaBd3OPtR1cAtFrjj5-kHIvw=.8358e6c4-fd58-4ff6-8b01-29546cc80b0a@github.com>
Message-ID: <o2anKlS_PQOPSTvAg9maw6h7Oddma-o0MSMDz6Dphec=.2823500d-df08-490b-a802-0394ddd90317@github.com>

On Sat, 3 May 2025 07:28:04 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/cpu/x86/vm_version_x86.cpp line 464:
>> 
>>> 462:     __ movl(rcx, 0x18000000); // cpuid1 bits osxsave | avx
>>> 463:     __ andl(rcx, Address(rsi, 8)); // cpuid1 bits osxsave | avx
>>> 464:     __ jccb(Assembler::equal, done); // jump if AVX is not supported
>> 
>> This does not have the same effect as before. Consider an input of 0x10000000: the andl result will not be zero with this code, so the jump to done will not happen. Prior to this change, the cmpl with 0x18000000 would fail for equality, so the jump to done would happen. This is the case for all the places where we are checking more than one set bit.
>
> Thanks @sviswa7 , sub-optimality was mainly around single-bit comparisons, where we could save redundant CMP after AND, and by flipping the predicate of subsequent flag-consuming JMP,  multibits compares should remain unaltered.

This and all the following places with multi-bit check still need to be fixed. If you walk through stock and new code in this PR when Address(rsi, 8) on line 468 has 0x10000000, you will observe that stock code will jump to done and new code will not jump to done. Let me know if I am missing something.
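
The difference between the two check styles can be sketched in Java. This is a simplified model of the flag logic, not the generated stub code; the class and method names are hypothetical, and the only values taken from the review are the mask 0x18000000 and the input 0x10000000.

```java
// Hypothetical model of the two check styles discussed above.
final class MaskCheckSketch {
    // and + cmp with the mask: true only when every bit of the mask is set.
    static boolean allBitsSet(int value, int mask) {
        return (value & mask) == mask;
    }

    // and + test for zero: true when at least one bit of the mask is set.
    static boolean anyBitSet(int value, int mask) {
        return (value & mask) != 0;
    }

    public static void main(String[] args) {
        int mask = 0x18000000;  // two bits: osxsave | avx
        int value = 0x10000000; // only one of the two bits is set
        System.out.println(allBitsSet(value, mask)); // false
        System.out.println(anyBitSet(value, mask));  // true
    }
}
```

For a single-bit mask the two predicates agree, which is why dropping the compare is safe there. For a multi-bit mask they differ whenever only some of the bits are set, as in the 0x10000000 case above.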

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2080592979

From sviswanathan at openjdk.org  Fri May  9 00:03:58 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 9 May 2025 00:03:58 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v18]
In-Reply-To: <0t720cpyX-RwVGVlm0b9gNbSjeMHWy5cnF-o4xSWRgU=.130e6474-3aa2-48a8-90d1-6f3a69c135ee@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <0t720cpyX-RwVGVlm0b9gNbSjeMHWy5cnF-o4xSWRgU=.130e6474-3aa2-48a8-90d1-6f3a69c135ee@github.com>
Message-ID: <bihplArzEOCI9FBx3y1tKSG-4jMKjkJSjLaATTvsxfE=.35b68104-a209-4978-9c54-42b37cb23e06@github.com>

On Thu, 8 May 2025 19:21:31 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Addressing Yudi's comments

test/hotspot/jtreg/serviceability/sa/ClhsdbLongConstant.java line 108:

> 106:             checkLongValue("VM_Version::CPU_SHA ",
> 107:                            longConstantOutput,
> 108:                            34L);

Need to change the comment on line 94 as well.

test/lib-test/jdk/test/whitebox/CPUInfoTest.java line 69:

> 67:                     "f16c",         "pku",              "ospke",             "cet_ibt",
> 68:                     "cet_ss",       "avx512_ifma",      "serialize",         "avx_ifma",
> 69:                     "apx_f", "avx10_1", "avx10_2"

A minor nit, in between spacing could match previous statement.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2080650055
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2080648091

From yzheng at openjdk.org  Fri May  9 08:42:06 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Fri, 9 May 2025 08:42:06 GMT
Subject: RFR: 8353735: [JVMCI] Allow specifying storage kind of the callee
 save register [v3]
In-Reply-To: <_8_bdUwiZc5xZqStJm2XfneFUTdCEx4c_uDsKJcMkTc=.1df612b0-30c8-4ae3-8706-bd634dd9fbc4@github.com>
References: <taWIY9EOZM8y920O4_MFI_-XpScHPSWbgpZqDrxOVdw=.20087dd3-85c4-444d-b4fc-4baae7acc2e2@github.com>
 <_8_bdUwiZc5xZqStJm2XfneFUTdCEx4c_uDsKJcMkTc=.1df612b0-30c8-4ae3-8706-bd634dd9fbc4@github.com>
Message-ID: <r3GbPjiD9QLVAVzGNdEES5A38mmvfpD7Ai952bQ6UGk=.00b08553-fa4a-4ea2-bc00-0cef7618f4da@github.com>

On Thu, 8 May 2025 14:57:10 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> Windows x64 ABI considers the upper portions of YMM0-YMM15 and ZMM0-ZMM15 volatile, that is, destroyed on function calls. This PR allows `RegisterConfig` implementations to refine the storage kind of callee save register, such that JVMCI compiler can exploit this information to avoid saving full width of these registers.
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update javadoc

Tier1-3 passed. Thanks for the review!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24451#issuecomment-2865675256

From yzheng at openjdk.org  Fri May  9 08:42:07 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Fri, 9 May 2025 08:42:07 GMT
Subject: Integrated: 8353735: [JVMCI] Allow specifying storage kind of the
 callee save register
In-Reply-To: <taWIY9EOZM8y920O4_MFI_-XpScHPSWbgpZqDrxOVdw=.20087dd3-85c4-444d-b4fc-4baae7acc2e2@github.com>
References: <taWIY9EOZM8y920O4_MFI_-XpScHPSWbgpZqDrxOVdw=.20087dd3-85c4-444d-b4fc-4baae7acc2e2@github.com>
Message-ID: <acMPIYT7rvB4p3Ofi5Bwu1QIjXJO1A_kEQtgXFTVpx4=.58c4a8cb-7f83-4410-a80c-d971ec2694df@github.com>

On Fri, 4 Apr 2025 14:47:39 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

> Windows x64 ABI considers the upper portions of YMM0-YMM15 and ZMM0-ZMM15 volatile, that is, destroyed on function calls. This PR allows `RegisterConfig` implementations to refine the storage kind of callee save register, such that JVMCI compiler can exploit this information to avoid saving full width of these registers.

This pull request has now been integrated.

Changeset: 74e981e8
Author:    Yudi Zheng <yzheng at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/74e981e85509ca072b2a45d529dab3a9883613a2
Stats:     11 lines in 1 file changed: 10 ins; 0 del; 1 mod

8353735: [JVMCI] Allow specifying storage kind of the callee save register

Reviewed-by: dnsimon, cslucas

-------------

PR: https://git.openjdk.org/jdk/pull/24451

From jbhateja at openjdk.org  Fri May  9 15:17:17 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 9 May 2025 15:17:17 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v19]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <MshoSsDvuoMdKeqF0Uiufmw4q-kUKx15Hv7BphwE_cg=.05022a41-dfbc-43e5-ba07-f0d738926af3@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits:

 - Sandhya's review comments resoultion
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352675
 - Addressing Yudi's comments
 - Code re-factoring from Vladimir
 - Reveiw suggestions incorporated
 - Making _features_bitmap size configurable
 - cleanups & refactorings
 - build fixes for non-x86 targets
 - Review comments resolutions
 - Updating comment
 - ... and 9 more: https://git.openjdk.org/jdk/compare/411a63ea...f583a521

-------------

Changes: https://git.openjdk.org/jdk/pull/24329/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=18
  Stats: 520 lines in 15 files changed: 271 ins; 29 del; 220 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From mchevalier at openjdk.org  Fri May  9 16:08:55 2025
From: mchevalier at openjdk.org (Marc Chevalier)
Date: Fri, 9 May 2025 16:08:55 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <MPLoUV2KuCkLI8ZnaiH6W8hSz1kLlTvJgyLRYXnLnt0=.a39867b2-e2f6-4527-b027-363d93400950@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
 <XrHBrp81T81JlX15Yc3cTb-fPwGqo5uX-tj4HizOzko=.9d6f5dd0-3239-47f1-a593-87a208ad5c99@github.com>
 <MPLoUV2KuCkLI8ZnaiH6W8hSz1kLlTvJgyLRYXnLnt0=.a39867b2-e2f6-4527-b027-363d93400950@github.com>
Message-ID: <Sbtz8X_o-f8g20gsLGQBEMJCtHQ3yMgUqGMuBWjHLFA=.181d2b69-dcde-4e9f-9c65-0e39bc3ee980@github.com>

On Tue, 6 May 2025 18:18:08 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>  making them expensive nodes (to avoid commoning during GVN)

Good point!

I still think I don't get everything. Let me try to sum up what I think I should do.

For now, I don't want to mess with control, but I should prepare the ground. Using general Call nodes for pure calls was pretty difficult: Call nodes carry too many opinions and assumptions to work with easily for pure calls. But eventually, I want to change the nodes I'm using into a Call node, and more precisely a CallLeaf (I suspect this can happen once I'm done doing all I can do with pure calls, so in macro expansion, which is fine). To be able to do this transformation, I need to know control at this point. My goal is to start with control-less nodes, but find the late control during loop optimization, control-pin them at this point (because that's when the information is available) with both control input and output (needed for the expansion in CallLeaf), and continuing with control-pinned nodes. For now, I'm happy with the control I get from parsing.

So, under my nodes, I need 2 outputs: control and data (everywhere now, and at least after control-pinning in the follow-up). I should then make ModFloating/ModD/ModF sub-classes of `MultiNode` (I guess I can make ModFloating a direct sub-class of `MultiNode`). And I can introduce new node types for native math calls that would behave similarly with respect to elimination (and pinning in the future), and would also expand into `CallLeaf`. A weirdness of these nodes is that whether they are CFG nodes would depend on whether they are already pinned, not on their type, but I'm not aware of a fundamental issue with that, as long as the change doesn't happen in the middle of a phase where it's relevant.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2867105355

From sviswanathan at openjdk.org  Fri May  9 22:55:58 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 9 May 2025 22:55:58 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v19]
In-Reply-To: <MshoSsDvuoMdKeqF0Uiufmw4q-kUKx15Hv7BphwE_cg=.05022a41-dfbc-43e5-ba07-f0d738926af3@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <MshoSsDvuoMdKeqF0Uiufmw4q-kUKx15Hv7BphwE_cg=.05022a41-dfbc-43e5-ba07-f0d738926af3@github.com>
Message-ID: <8y5JLR_7BMUJXmNNzPRusDpRWnJHtIPxZodVqQHrmmI=.ca53a482-8572-499f-af9f-6c255cf02896@github.com>

On Fri, 9 May 2025 15:17:17 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits:
> 
>  - Sandhya's review comments resoultion
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352675
>  - Addressing Yudi's comments
>  - Code re-factoring from Vladimir
>  - Reveiw suggestions incorporated
>  - Making _features_bitmap size configurable
>  - cleanups & refactorings
>  - build fixes for non-x86 targets
>  - Review comments resolutions
>  - Updating comment
>  - ... and 9 more: https://git.openjdk.org/jdk/compare/411a63ea...f583a521

Rest of the PR looks good to me.

src/hotspot/cpu/x86/vm_version_x86.cpp line 494:

> 492:     if (use_evex) {
> 493:       // check _cpuid_info.sef_cpuid7_ebx.bits.avx512f
> 494:       // OR check _cpuid_info.std_cpuid24_ebx.bits.avx10

This comment needs to be corrected:
// OR check _cpuid_info.sefsl1_cpuid7_edx.bits.avx10
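
For readers following along, the enumeration that the corrected comment refers to can be sketched roughly as below. This is only an illustration in Java, not the HotSpot code: the class and method names are hypothetical, and the bit positions (AVX10 flag in CPUID leaf 7, sub-leaf 1, EDX bit 19; version number in CPUID leaf 0x24, sub-leaf 0, EBX bits 7:0) are assumptions taken from the AVX10 specification linked as [1] and should be checked against it. The inputs are raw register values that native code would have to supply.

```java
/** Hypothetical helper, not HotSpot code: decodes raw CPUID register values for AVX10. */
final class Avx10Cpuid {
    // Assumed layout (verify against the AVX10 architecture specification):
    //   CPUID.(EAX=07H, ECX=01H):EDX[19]  -> AVX10 supported, leaf 24H present
    //   CPUID.(EAX=24H, ECX=00H):EBX[7:0] -> converged vector ISA version
    private static final int AVX10_EDX_BIT = 19;
    private static final int VERSION_MASK = 0xFF;

    /** Returns the AVX10 version, or 0 when the AVX10 flag is clear. */
    static int version(int leaf7Sub1Edx, int leaf24Ebx) {
        boolean avx10 = ((leaf7Sub1Edx >>> AVX10_EDX_BIT) & 1) != 0;
        return avx10 ? (leaf24Ebx & VERSION_MASK) : 0;
    }

    /** Versions are cumulative: a CPU reporting version N also covers versions 1..N. */
    static boolean supports(int detectedVersion, int requiredVersion) {
        return requiredVersion >= 1 && detectedVersion >= requiredVersion;
    }

    public static void main(String[] args) {
        int edx = 1 << AVX10_EDX_BIT; // made-up sample value with only the AVX10 flag set
        int ebx = 0x00070002;         // made-up sample value whose low byte is 2
        int v = version(edx, ebx);
        System.out.println("AVX10 version: " + v);
        System.out.println("AVX10.1 usable: " + supports(v, 1));
    }
}
```

The `supports` check reflects the versioning rule from the PR description: a single comparison replaces a per-feature bit test.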

src/hotspot/cpu/x86/vm_version_x86.cpp line 1052:

> 1050:   if (is_intel()) { // Intel cpus specific settings
> 1051:     if (is_knights_family()) {
> 1052:       _features.clear_feature(CPU_VZEROUPPER);

Should we be also clearing the CPU_AVX10_1 and CPU_AVX10_2 here?

-------------

PR Review: https://git.openjdk.org/jdk/pull/24329#pullrequestreview-2829142420
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2082148591
PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2082570611

From jbhateja at openjdk.org  Fri May  9 23:36:16 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 9 May 2025 23:36:16 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v20]
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <ed8rvpoILO2WPF8oO85YgCV1yJPJ6WY8MSdNl7JBXfw=.902106ed-c894-4e51-9da6-6998a145a52b@github.com>

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  Review comments resolutions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24329/files
  - new: https://git.openjdk.org/jdk/pull/24329/files/f583a521..b4654fa4

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=19
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24329&range=18-19

  Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/24329.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24329/head:pull/24329

PR: https://git.openjdk.org/jdk/pull/24329

From sviswanathan at openjdk.org  Fri May  9 23:36:16 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 9 May 2025 23:36:16 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v20]
In-Reply-To: <ed8rvpoILO2WPF8oO85YgCV1yJPJ6WY8MSdNl7JBXfw=.902106ed-c894-4e51-9da6-6998a145a52b@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <ed8rvpoILO2WPF8oO85YgCV1yJPJ6WY8MSdNl7JBXfw=.902106ed-c894-4e51-9da6-6998a145a52b@github.com>
Message-ID: <S_GkaEOI5fl_FSJDEmVMiZaCzKBysXqJhjBTLVs4gf4=.3ef80a7b-5a69-4540-b718-3ec2c1e7d030@github.com>

On Fri, 9 May 2025 23:33:42 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
>> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
>> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
>> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
>> 
>> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
>> 
>> The patch has been regressed through tier1 and jvmci tests 
>> 
>> Please review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Review comments resolutions

Looks good to me.

-------------

Marked as reviewed by sviswanathan (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24329#pullrequestreview-2829900271

From jbhateja at openjdk.org  Fri May  9 23:36:16 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 9 May 2025 23:36:16 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v10]
In-Reply-To: <3t1R35B9bafRtfvqfE7D2dAeLrjaDukXlDUGb-3VtaA=.46d64318-e9fb-4bf3-8a68-8dba2c2b7b26@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <PtAKp6Jg5aTx0OjxRBGw81ycHXSKAbSVgZ6KeIQ0R1o=.5baf1f13-7bb4-46e1-9d43-194bbd2412d9@github.com>
 <V7WbvFgHqGmEcatKP9sIZApYAIdVjED16hqGNrUM1vM=.cccbbe34-5b13-4b38-9421-444050847951@github.com>
 <pMCizZGRrUCno2iOhfBp2dpdf4Epk92ZnPt4iv2ZYKw=.9d5ab0d7-414f-495e-ade0-780666f3aef6@github.com>
 <oSgjp-pN7JPy1pRA-vivck-P5sv8vMLiK9T88YTzmLU=.30323e8c-6e7a-41b6-b082-32a1485008f5@github.com>
 <3t1R35B9bafRtfvqfE7D2dAeLrjaDukXlDUGb-3VtaA=.46d64318-e9fb-4bf3-8a68-8dba2c2b7b26@github.com>
Message-ID: <nNexGk2hUddfsTprwUif3TELrtXApOrYJhoqUSuErNk=.dc1ce47f-2bf8-4ee7-804c-88f69f396be3@github.com>

On Sat, 3 May 2025 08:13:11 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>>> Ok, thanks! I wasn't sure you finished the pass.
>>> 
>>> I'm still seeing dynamic memory allocation which IMO unnecessarily complicates the implementation. Bitmap size is fixed and well-known at compile time. It enables `VM_Feature` class to embed the array of proper size inline. And it eliminates all the problems related to undesired sharing of backed array. (Also, `pre_initialize()` is not needed as well.)
>> 
>> Bitmap size depends on the maximum feature enum value, I made it dynamic to keep it flexible. Do you want the feature vector size to be made constant and manually bump it when we exhaust the limit?
>
>> Bitmap size depends on the maximum feature enum value, I made it dynamic to keep it flexible. Do you want the feature vector size to be made constant and manually bump it when we exhaust the limit?
> 
> Yes, please. (The limit may be precise - number of  elements in Feature_Flag enum - but the logic which computes the size of backing array can automatically round it and bump the size once the actual limit is reached.) 
> 
>> pre_initialize was put in place because codeCache_init() precedes VM_Version_init()
> 
> I wanted to say that the sole purpose of `pre_initialize` is to allocate memory. Once it goes away, there's no reason to keep it.

Thanks @iwanowww , @sviswa7 , @mur47x111 , @merykitty  for your reviews.
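
The fixed-size bitmap suggested in the quoted review can be sketched as follows. This is a minimal illustration in Java with hypothetical names and a made-up feature list, not the `VM_Features` code from the PR; in the C++ implementation the array would be embedded inline in the object, whereas a Java array is always a separate heap object. The point it shows is only the sizing rule: the word count is derived from the number of enum constants and rounded up, so it grows automatically when the limit is reached.

```java
/** Hypothetical sketch of a feature bitmap whose size follows from the enum. */
final class FeatureBitmap {
    /** Made-up feature list for illustration; the real enum is much longer. */
    enum Feature { SSE, AVX, AVX512F, AVX10_1, AVX10_2 }

    // Number of 64-bit words needed, rounded up from the number of features.
    static final int WORDS = (Feature.values().length + Long.SIZE - 1) / Long.SIZE;

    private final long[] words = new long[WORDS];

    void set(Feature f) {
        words[f.ordinal() >>> 6] |= 1L << (f.ordinal() & 63);
    }

    void clear(Feature f) {
        words[f.ordinal() >>> 6] &= ~(1L << (f.ordinal() & 63));
    }

    boolean supports(Feature f) {
        return (words[f.ordinal() >>> 6] & (1L << (f.ordinal() & 63))) != 0;
    }

    public static void main(String[] args) {
        FeatureBitmap features = new FeatureBitmap();
        features.set(Feature.AVX10_1);
        features.set(Feature.AVX10_2);
        // Mirrors the review question about clearing both AVX10 bits on one CPU family.
        features.clear(Feature.AVX10_1);
        features.clear(Feature.AVX10_2);
        System.out.println("AVX10.2 supported: " + features.supports(Feature.AVX10_2));
    }
}
```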

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24329#issuecomment-2868092244

From jbhateja at openjdk.org  Fri May  9 23:36:17 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 9 May 2025 23:36:17 GMT
Subject: RFR: 8352675: Support Intel AVX10 converged vector ISA feature
 detection [v19]
In-Reply-To: <8y5JLR_7BMUJXmNNzPRusDpRWnJHtIPxZodVqQHrmmI=.ca53a482-8572-499f-af9f-6c255cf02896@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
 <MshoSsDvuoMdKeqF0Uiufmw4q-kUKx15Hv7BphwE_cg=.05022a41-dfbc-43e5-ba07-f0d738926af3@github.com>
 <8y5JLR_7BMUJXmNNzPRusDpRWnJHtIPxZodVqQHrmmI=.ca53a482-8572-499f-af9f-6c255cf02896@github.com>
Message-ID: <VolTCKMBUPrelCiH6dqBeMKsPO7pgCoEQ6Kwi6fo6Qg=.a9d794d2-e321-4933-987e-28e9a68815ce@github.com>

On Fri, 9 May 2025 22:23:41 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits:
>> 
>>  - Sandhya's review comments resoultion
>>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352675
>>  - Addressing Yudi's comments
>>  - Code re-factoring from Vladimir
>>  - Reveiw suggestions incorporated
>>  - Making _features_bitmap size configurable
>>  - cleanups & refactorings
>>  - build fixes for non-x86 targets
>>  - Review comments resolutions
>>  - Updating comment
>>  - ... and 9 more: https://git.openjdk.org/jdk/compare/411a63ea...f583a521
>
> src/hotspot/cpu/x86/vm_version_x86.cpp line 1052:
> 
>> 1050:   if (is_intel()) { // Intel cpus specific settings
>> 1051:     if (is_knights_family()) {
>> 1052:       _features.clear_feature(CPU_VZEROUPPER);
> 
> Should we be also clearing the CPU_AVX10_1 and CPU_AVX10_2 here?

I agree; it may help validate KNL on Diamond Rapids :-)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24329#discussion_r2082628062

From jbhateja at openjdk.org  Fri May  9 23:36:17 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 9 May 2025 23:36:17 GMT
Subject: Integrated: 8352675: Support Intel AVX10 converged vector ISA feature
 detection
In-Reply-To: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
References: <OrjQDBEJjPrCWLpEPj4DmEpaWEFcHY3R8PiZ6ccxMxU=.2862c36d-88d0-45b8-ad28-b50730570da4@github.com>
Message-ID: <If2J1Ut6A2nPO1DbC2H-QLxQLXcnnh9IDulKYw7_k4I=.f0c82970-96a4-444c-b013-f9414ad30133@github.com>

On Mon, 31 Mar 2025 13:57:22 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

> - Intel AVX10[1] extends and enhances the capabilities of Intel AVX-512 to benefit all Intel® products and will be the vector ISA of choice moving into the future.
> - It supports a new ISA versioning scheme which simplifies the existing AVX512 feature enumeration scheme. Feature set supported by an AVX10 ISA version will be supported by all the versions above it.
> - The initial, fully-featured version of Intel® AVX10 will be enumerated as Version 2 (denoted as Intel® AVX10.2). This will include the new ISA extension over the existing AVX512 instructions.
> - An early version of Intel® AVX10 (Version 1, or Intel® AVX10.1) that only enumerates the Intel® AVX-512 instruction set at 128, 256, and 512 bits will be enabled on the Granite Rapids Server for software pre-enabling.
> 
> This patch adds the necessary CPUID feature detection for AVX10 ISA version 1 and 2.  In terms of architectural state save restoration, AVX10 is isomorphic to AVX512 support up till Granite Rapids. State components affected by AVX10 extension include SSE, AVX, Opmask, ZMM_Hi256, and Hi16_ZMM registers. 
> 
> The patch has been regressed through tier1 and jvmci tests 
> 
> Please review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> [1] https://www.intel.com/content/www/us/en/content-details/844829/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html

This pull request has now been integrated.

Changeset: 3b336a9d
Author:    Jatin Bhateja <jbhateja at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/3b336a9da091c4df4373d2b845b60d2a7a4e3b1d
Stats:     522 lines in 15 files changed: 273 ins; 29 del; 220 mod

8352675: Support Intel AVX10 converged vector ISA feature detection

Reviewed-by: sviswanathan, vlivanov, yzheng

-------------

PR: https://git.openjdk.org/jdk/pull/24329

From vlivanov at openjdk.org  Sat May 10 03:18:03 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Sat, 10 May 2025 03:18:03 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <Sbtz8X_o-f8g20gsLGQBEMJCtHQ3yMgUqGMuBWjHLFA=.181d2b69-dcde-4e9f-9c65-0e39bc3ee980@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
 <XrHBrp81T81JlX15Yc3cTb-fPwGqo5uX-tj4HizOzko=.9d6f5dd0-3239-47f1-a593-87a208ad5c99@github.com>
 <MPLoUV2KuCkLI8ZnaiH6W8hSz1kLlTvJgyLRYXnLnt0=.a39867b2-e2f6-4527-b027-363d93400950@github.com>
 <Sbtz8X_o-f8g20gsLGQBEMJCtHQ3yMgUqGMuBWjHLFA=.181d2b69-dcde-4e9f-9c65-0e39bc3ee980@github.com>
Message-ID: <DIIwcamCS3yY-NpbSC4Fg-3hUqVlcFYe7PIFuM-Piuw=.fbb2d924-bcf6-4b49-96e4-84e09bb0540c@github.com>

On Fri, 9 May 2025 16:06:13 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> My goal is to start with control-less nodes, but find the late control during loop optimization, control-pin them at this point (because that's when the information is available) with both control input and output (needed for the expansion in CallLeaf), and continuing with control-pinned nodes.

If you combine lowering with pinning, you could replace a data node with a CFG node (CallLeaf in your case) at the point in CFG you choose. A single CFG node is enough to insert a CFG-only node, but you need to ensure the graph stays schedulable after the insertion.

If you want to start with pinned node, the simplest way would be to make `CallPure` a subclass of `CallLeaf`, require it to be CFG-only (no memory in/out, no IO, etc) and populate only control in/out when inserting it into the graph during parsing.

> For now, I'm happy with the control I get from parsing.

Keep in mind that it assumes the node is pinned in CFG from the very beginning. Once the node starts in data-only mode, the control input it gained during parsing may end up too early for the node's inputs to be schedulable.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2868277578

From qamai at openjdk.org  Sat May 10 05:26:52 2025
From: qamai at openjdk.org (Quan Anh Mai)
Date: Sat, 10 May 2025 05:26:52 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
Message-ID: <xjYabzypKCqhMAG6qnWOB09F4F5YDoGEhNpZsa2ve5k=.f26d3919-27ba-4481-b1c0-1ca438f5ba22@github.com>

On Wed, 30 Apr 2025 13:18:33 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> A first part toward a better support of pure functions.
> 
> ## Pure Functions
> 
> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exception or such), cannot deopt, etc. It's really a function that you can execute anywhere, with whichever arguments, without any effect other than wasting time. Integer division is not pure, as dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
> 
> ## Scope
> 
> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
> 
> ## Implementation Overview
> 
> We created a new node kind here for pure calls, which are expanded into regular calls during macro expansion. This also allows the removal of the `ModD` and `ModF` nodes, which now have pure equivalents. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
> 
> IR framework and IGV needed a little bit of fixing.
> 
> Thanks,
> Marc
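
To make the quoted definition concrete, here is a small Java-level sketch of the distinction it draws. The class and method names are hypothetical and chosen only for illustration; the example shows language-level behavior, not how C2 represents these operations internally.

```java
/** Hypothetical demo of the "pure" distinction at the Java level. */
final class PureCallDemo {
    /** Floating-point remainder never throws; a zero divisor yields NaN. */
    static double fmod(double a, double b) {
        return a % b;
    }

    /** Integer remainder throws on a zero divisor, so it is not pure in the sense above. */
    static boolean intRemThrows(int a, int b) {
        try {
            int unused = a % b;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Double.isNaN(fmod(1.0, 0.0))); // prints "true"
        System.out.println(intRemThrows(1, 0));           // prints "true"
        // The result is unused here, so dropping the computation changes nothing observable.
        fmod(5.5, 2.0);
    }
}
```

The last call in `main` is the kind of case the PR targets: a floating-point remainder whose result is never used, and whose removal cannot change program behavior.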

I think a very simple approach you can take is having `CallPureNode` as a pure data node. It does not have to have anything to do with `CallNode` (no lowering into a `CallNode`, no subclass from `CallNode`) and it can have its mach implementation like this:

    instruct pureCall1F(xmm0 dst, xmm0 src) %{
        match(Set dst (CallPure src));
        effect(CALL);
        format %{
            __ call(/*something*/);
        %}
    %}

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2868400653

From duke at openjdk.org  Mon May 12 08:57:36 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Mon, 12 May 2025 08:57:36 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v3]
In-Reply-To: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
Message-ID: <xlKXlqWoR9Mm6dagQHk8hwKOQtEsfp3vXTDpkJ8x_rc=.44cceb16-8671-42c9-81e1-c201334bf60c@github.com>

> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.

Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:

  Eliminating some instructions from generate_kyber12To16_avx512() + fixing a comment.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24953/files
  - new: https://git.openjdk.org/jdk/pull/24953/files/c5c6449f..43455de2

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=01-02

  Stats: 75 lines in 1 file changed: 31 ins; 32 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/24953.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24953/head:pull/24953

PR: https://git.openjdk.org/jdk/pull/24953

From duke at openjdk.org  Mon May 12 09:05:10 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Mon, 12 May 2025 09:05:10 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v4]
In-Reply-To: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
Message-ID: <Ws9R1LJ6lpKlnXJ_2sVAhRAEvd6b9P3DBE6viW8r_4M=.f984d862-4c0f-4a04-bcfc-a6df781b485c@github.com>

> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.

Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:

  Restoring copyright notice on ML_KEM.java

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24953/files
  - new: https://git.openjdk.org/jdk/pull/24953/files/43455de2..215b346f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/24953.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24953/head:pull/24953

PR: https://git.openjdk.org/jdk/pull/24953

From shade at openjdk.org  Mon May 12 14:07:02 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 12 May 2025 14:07:02 GMT
Subject: RFR: 8356783: CompilerTask hot_method is redundant
Message-ID: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>

This gave me some grief when implementing [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269). From the initializations, it looks to me that `CompilerTask::hot_method()` is either `method()` or `nullptr`. In both cases, we do nothing special. So tracking `hot_method` is redundant, and can be purged. This improves performance a little, since it avoids extra handle-izing across compiler code, and of course it simplifies coding as well.

Additional testing:
 - [x] Linux x86_64 server fastdebug, `compiler/`
 - [ ]  Linux x86_64 server fastdebug, `all`

-------------

Commit messages:
 - Fix

Changes: https://git.openjdk.org/jdk/pull/25185/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25185&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356783
  Stats: 62 lines in 8 files changed: 0 ins; 47 del; 15 mod
  Patch: https://git.openjdk.org/jdk/pull/25185.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25185/head:pull/25185

PR: https://git.openjdk.org/jdk/pull/25185

From kvn at openjdk.org  Mon May 12 17:04:51 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Mon, 12 May 2025 17:04:51 GMT
Subject: RFR: 8356783: CompilerTask hot_method is redundant
In-Reply-To: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
References: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
Message-ID: <Mng6WmpaaIBnnUH3EknJ1WG13Jrxngx8zDR2IiiqFr4=.89115463-deaf-42fe-abf1-76e86cc0ede9@github.com>

On Mon, 12 May 2025 14:02:43 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This gave me some grief when implementing [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269). From the initializations, it looks to me that `CompilerTask::hot_method()` is either `method()` or `nullptr`. In both cases, we do nothing special. So tracking `hot_method` is redundant, and can be purged. This improves performance a little, since it avoids extra handle-izing across compiler code, and of course it simplifies coding as well.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `compiler/`
>  - [ ]  Linux x86_64 server fastdebug, `all`

There was a time when we compiled the caller instead of the method that triggered compilation (`StackWalkCompPolicy`). It was removed in JDK 13: [JDK-8216360](https://bugs.openjdk.org/browse/JDK-8216360).

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25185#pullrequestreview-2833911401

From cslucas at openjdk.org  Mon May 12 19:27:53 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Mon, 12 May 2025 19:27:53 GMT
Subject: RFR: 8356783: CompilerTask hot_method is redundant
In-Reply-To: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
References: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
Message-ID: <xQR80rsJ313g7H7FR4Sn6dQKpM6VJvMLvaNiFyF0YdI=.6cdd20ae-bda1-4137-825a-5b82ae46d08b@github.com>

On Mon, 12 May 2025 14:02:43 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This gave me some grief when implementing [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269). From the initializations, it looks to me that `CompilerTask::hot_method()` is either `method()` or `nullptr`. In both cases, we do nothing special. So tracking `hot_method` is redundant, and can be purged. This improves performance a little, since it avoids extra handle-izing across compiler code, and of course it simplifies coding as well.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `compiler/`
>  - [ ]  Linux x86_64 server fastdebug, `all`

LGTM

-------------

Marked as reviewed by cslucas (Committer).

PR Review: https://git.openjdk.org/jdk/pull/25185#pullrequestreview-2834258711

From dnsimon at openjdk.org  Mon May 12 20:17:02 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 12 May 2025 20:17:02 GMT
Subject: RFR: 8356447: Change default for EagerJVMCI to true
Message-ID: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>

By default, JVMCI and Graal initialization only occurs on the first top-tier (i.e. tier 4) JIT compilation request. This made sense prior to libgraal where the initialization was interpreted and so noticeably contributed to VM startup. However, with libgraal the initialization is sufficiently fast to not impact startup noticeably.

The motivation for JVMCI and Graal eager initialization by default is to make Graal command line option processing happen in the same VM phase as handling of all other VM command line flags. That is, errors in Graal options should:
1. Happen deterministically, not just for apps that run long enough to trigger a top tier JIT compilation. For example: `java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT --version`. In a JDK build that does not include Graal, this may succeed (and print out the version info) or result in a VM error ("Cannot use JVMCI compiler: No JVMCI compiler found").
2. Stop the VM before any application code can be executed. This is just good hygiene.

This PR makes JVMCI initialization eager by default if `UseJVMCICompiler` is true.
This is done for both libgraal and jargraal so that the behavior is uniform. Since jargraal is now a development configuration, VM startup costs are not critical.

-------------

Commit messages:
 - only fail-fast for a missing JVMCI compiler on a HotSpot JIT thread
 - default EagerJVMCI to true if UseJVMCICompiler is true

Changes: https://git.openjdk.org/jdk/pull/25121/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25121&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356447
  Stats: 34 lines in 6 files changed: 31 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/25121.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25121/head:pull/25121

PR: https://git.openjdk.org/jdk/pull/25121

From kvn at openjdk.org  Mon May 12 20:39:51 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Mon, 12 May 2025 20:39:51 GMT
Subject: RFR: 8356447: Change default for EagerJVMCI to true
In-Reply-To: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
References: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
Message-ID: <U6bszSSHwjhLTGuK71y6pyD_wqP5vJrw3Cq9tYtnzXc=.8da84f7f-a012-47b5-9812-7f1d988fc803@github.com>

On Thu, 8 May 2025 14:44:55 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> By default, JVMCI and Graal initialization only occurs on the first top-tier (i.e. tier 4) JIT compilation request. This made sense prior to libgraal where the initialization was interpreted and so noticeably contributed to VM startup. However, with libgraal the initialization is sufficiently fast to not impact startup noticeably.
> 
> The motivation for JVMCI and Graal eager initialization by default is to make Graal command line option processing happen in the same VM phase as handling of all other VM command line flags. That is, errors in Graal options should:
> 1. Happen deterministically, not just for apps that run long enough to trigger a top tier JIT compilation. For example: `java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT --version`. In a JDK build that does not include Graal, this may succeed (and print out the version info) or result in a VM error ("Cannot use JVMCI compiler: No JVMCI compiler found").
> 2. Stop the VM before any application code can be executed. This is just good hygiene.
> 
> This PR makes JVMCI initialization eager by default if `UseJVMCICompiler` is true.
> This is done for both libgraal and jargraal so that the behavior is uniform. Since jargraal is now a development configuration, VM startup costs are not critical.

src/hotspot/share/jvmci/jvmci_globals.cpp line 91:

> 89:     if (FLAG_IS_DEFAULT(EagerJVMCI) && !EagerJVMCI) {
> 90:       FLAG_SET_DEFAULT(EagerJVMCI, true);
> 91:     }

The default value is `false`, so I don't think you need to check it.
You can use `FLAG_SET_ERGO_IF_DEFAULT(EagerJVMCI, true);`
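
As a rough illustration of the difference between the two forms, here is a toy model of flag origins. It is not HotSpot code: `BoolFlag`, `Origin`, and the helper functions are hypothetical stand-ins for `FLAG_IS_DEFAULT`, `FLAG_SET_DEFAULT`, and `FLAG_SET_ERGO_IF_DEFAULT`, and the sketch assumes that the ergonomic setter records a non-default origin while the default setter leaves the origin untouched.

```c++
#include <cassert>

// Hypothetical model of where a flag value came from.
enum class Origin { DEFAULT, ERGONOMIC, COMMAND_LINE };

struct BoolFlag {
  bool value;
  Origin origin;
};

inline bool is_default(const BoolFlag& f) {
  return f.origin == Origin::DEFAULT;
}

// Models the form in the patch: the extra `!value` test is redundant
// when the built-in default is already false.
inline void set_default_checked(BoolFlag& f, bool v) {
  if (is_default(f) && !f.value) {
    f.value = v;  // origin stays DEFAULT in this model
  }
}

// Models the suggested form: only touch flags the user did not set,
// and record that the VM chose the value.
inline void set_ergo_if_default(BoolFlag& f, bool v) {
  if (is_default(f)) {
    f.value = v;
    f.origin = Origin::ERGONOMIC;
  }
}

// Small usage example for the model above.
inline void demo() {
  BoolFlag untouched{false, Origin::DEFAULT};
  set_ergo_if_default(untouched, true);
  assert(untouched.value && untouched.origin == Origin::ERGONOMIC);

  BoolFlag user_disabled{false, Origin::COMMAND_LINE};
  set_ergo_if_default(user_disabled, true);
  assert(!user_disabled.value);  // an explicit user setting still wins
}
```

In this model both forms end up with the same value for an untouched flag; the suggested form is shorter and additionally records that the value was picked ergonomically.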

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25121#discussion_r2085425314

From vlivanov at openjdk.org  Mon May 12 21:04:58 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Mon, 12 May 2025 21:04:58 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <xjYabzypKCqhMAG6qnWOB09F4F5YDoGEhNpZsa2ve5k=.f26d3919-27ba-4481-b1c0-1ca438f5ba22@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
 <xjYabzypKCqhMAG6qnWOB09F4F5YDoGEhNpZsa2ve5k=.f26d3919-27ba-4481-b1c0-1ca438f5ba22@github.com>
Message-ID: <uFUCzJjLfRqudMg3P1NPagroCaxEersEu7Sv8813cQU=.4656742a-2a7d-454e-8e90-4b7b07e6d2a7@github.com>

On Sat, 10 May 2025 05:24:02 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

> I think a very simple approach you can take is having CallPureNode as a pure data node

It's not as simple as it seems. In order to work reliably it requires full control of the code being called, so without extra work it is appropriate for generated stubs only. If you want to call some native code VM doesn't control, then either all caller-saved registers should be preserved across the call (which may be prohibitively expensive) or it should be made explicit there's a call taking place so all ABI effects are taken into account.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2874057369

From qamai at openjdk.org  Tue May 13 03:14:55 2025
From: qamai at openjdk.org (Quan Anh Mai)
Date: Tue, 13 May 2025 03:14:55 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <uFUCzJjLfRqudMg3P1NPagroCaxEersEu7Sv8813cQU=.4656742a-2a7d-454e-8e90-4b7b07e6d2a7@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
 <xjYabzypKCqhMAG6qnWOB09F4F5YDoGEhNpZsa2ve5k=.f26d3919-27ba-4481-b1c0-1ca438f5ba22@github.com>
 <uFUCzJjLfRqudMg3P1NPagroCaxEersEu7Sv8813cQU=.4656742a-2a7d-454e-8e90-4b7b07e6d2a7@github.com>
Message-ID: <7e0IhYYv_1dDlLgmUM8rKj5bjDx3lIhY2PRt-fC-rTs=.35437a80-80c7-4332-9339-a6f047b73289@github.com>

On Mon, 12 May 2025 21:01:34 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> I think a very simple approach you can take is having `CallPureNode` as a pure data node. It does not have to have anything to do with `CallNode` (no lowering into a `CallNode`, no subclass from `CallNode`) and it can have its mach implementation like this:
>> 
>>     instruct pureCall1F(xmm0 dst, xmm0 src) %{
>>         match(Set dst (CallPure src));
>>         effect(CALL);
>>         format %{
>>             __ call(/*something*/);
>>         %}
>>     %}
>
>> I think a very simple approach you can take is having CallPureNode as a pure data node
> 
> It's not as simple as it seems. In order to work reliably it requires full control of the code being called, so without extra work it is appropriate for generated stubs only. If you want to call some native code VM doesn't control, then either all caller-saved registers should be preserved across the call (which may be prohibitively expensive) or it should be made explicit there's a call taking place so all ABI effects are taken into account.

@iwanowww I believe `effect(CALL)` marks that a call is taking place and the register allocator will know how to save the registers accordingly. Note that on arm, long division is implemented as a call:

https://github.com/openjdk/jdk/blob/adebfa7ffda6383f5793278ced14a193066c5f6a/src/hotspot/cpu/arm/arm.ad#L5962

And `SharedRuntime::ldiv` is implemented in C++:

https://github.com/openjdk/jdk/blob/adebfa7ffda6383f5793278ced14a193066c5f6a/src/hotspot/share/runtime/sharedRuntime.cpp#L272
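
To make "pure" concrete, here is a minimal sketch of the kind of helper being discussed: a plain C++ function whose result depends only on its arguments and that touches no VM state. It is a simplified model written for illustration, not the HotSpot source; the name `java_ldiv` is hypothetical, and the only behavior it assumes is the Java rule that `Long.MIN_VALUE / -1` yields `Long.MIN_VALUE`.

```c++
#include <cassert>
#include <cstdint>
#include <limits>

// Java-style signed 64-bit division. The divisor must be non-zero;
// in Java the zero case throws before any such helper would run.
int64_t java_ldiv(int64_t dividend, int64_t divisor) {
  assert(divisor != 0);
  // MIN / -1 overflows in C++ (undefined behavior), but Java defines
  // the result as MIN, so handle it explicitly.
  if (dividend == std::numeric_limits<int64_t>::min() && divisor == -1) {
    return dividend;
  }
  return dividend / divisor;  // truncates toward zero, as in Java
}
```

Because the function has no side effects, a call whose result is unused could be dropped without changing program behavior, which is the property the proposed optimization relies on.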

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2874936879

From dnsimon at openjdk.org  Tue May 13 06:52:27 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 13 May 2025 06:52:27 GMT
Subject: RFR: 8356447: Change default for EagerJVMCI to true [v2]
In-Reply-To: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
References: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
Message-ID: <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>

> By default, JVMCI and Graal initialization only occurs on the first top-tier (i.e. tier 4) JIT compilation request. This made sense prior to libgraal where the initialization was interpreted and so noticeably contributed to VM startup. However, with libgraal the initialization is sufficiently fast to not impact startup noticeably.
> 
> The motivation for JVMCI and Graal eager initialization by default is to make Graal command line option processing happen in the same VM phase as handling of all other VM command line flags. That is, errors in Graal options should:
> 1. Happen deterministically, not just for apps that run long enough to trigger a top tier JIT compilation. For example: `java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT --version`. In a JDK build that does not include Graal, this may succeed (and print out the version info) or result in a VM error ("Cannot use JVMCI compiler: No JVMCI compiler found").
> 2. Stop the VM before any application code can be executed. This is just good hygiene.
> 
> This PR makes JVMCI initialization eager by default if `UseJVMCICompiler` is true.
> This is done for both libgraal and jargraal so that the behavior is uniform. Since jargraal is now a development configuration, VM startup costs are not critical.

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  use FLAG_SET_ERGO_IF_DEFAULT

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25121/files
  - new: https://git.openjdk.org/jdk/pull/25121/files/42c351b5..ad4be5dc

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25121&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25121&range=00-01

  Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25121.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25121/head:pull/25121

PR: https://git.openjdk.org/jdk/pull/25121

From shade at openjdk.org  Tue May 13 08:33:52 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 13 May 2025 08:33:52 GMT
Subject: RFR: 8356783: CompilerTask hot_method is redundant
In-Reply-To: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
References: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
Message-ID: <WBoCHWicyaX6SPv6TKGN9JIw6J4N29aC0zJSVbgLcDQ=.05cc0c07-ed7f-41fb-85e4-f0285a4bf8cb@github.com>

On Mon, 12 May 2025 14:02:43 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This gave me some grief when implementing [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269). From the initializations, it looks to me that `CompilerTask::hot_method()` is either `method()` or `nullptr`. In both cases, we do nothing special. So tracking `hot_method` is redundant, and can be purged. This improves performance a little, since it avoids extra handle-izing across compiler code, and of course it simplifies coding as well.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `compiler/`
>  - [x]  Linux x86_64 server fastdebug, `all`

Thanks! Testing is green here. I'll wait a bit more if anyone else wants to review, and then I'll integrate to continue with [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25185#issuecomment-2875537533

From yzheng at openjdk.org  Tue May 13 12:39:55 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Tue, 13 May 2025 12:39:55 GMT
Subject: RFR: 8356447: Change default for EagerJVMCI to true [v2]
In-Reply-To: <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
References: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
 <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
Message-ID: <BFqmPGNtcBvpwBwAqEFGFyyP5Yv29BnSrLZ04SCcRyY=.2a2e33ac-0b03-4759-8d5e-9bdfc26c3262@github.com>

On Tue, 13 May 2025 06:52:27 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> By default, JVMCI and Graal initialization only occurs on the first top-tier (i.e. tier 4) JIT compilation request. This made sense prior to libgraal where the initialization was interpreted and so noticeably contributed to VM startup. However, with libgraal the initialization is sufficiently fast to not impact startup noticeably.
>> 
>> The motivation for JVMCI and Graal eager initialization by default is to make Graal command line option processing happen in the same VM phase as handling of all other VM command line flags. That is, errors in Graal options should:
>> 1. Happen deterministically, not just for apps that run long enough to trigger a top tier JIT compilation. For example: `java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT --version`. In a JDK build that does not include Graal, this may succeed (and print out the version info) or result in a VM error ("Cannot use JVMCI compiler: No JVMCI compiler found").
>> 2. Stop the VM before any application code can be executed. This is just good hygiene.
>> 
>> This PR makes JVMCI initialization eager by default if `UseJVMCICompiler` is true.
>> This is done for both libgraal and jargraal so that the behavior is uniform. Since jargraal is now a development configuration, VM startup costs are not critical.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use FLAG_SET_ERGO_IF_DEFAULT

LGTM

-------------

Marked as reviewed by yzheng (Committer).

PR Review: https://git.openjdk.org/jdk/pull/25121#pullrequestreview-2836595615

From shade at openjdk.org  Tue May 13 13:20:07 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 13 May 2025 13:20:07 GMT
Subject: RFR: 8356783: CompilerTask hot_method is redundant
In-Reply-To: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
References: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
Message-ID: <gFJ4p_k1Skh0WCcEBQkfrO84J4dLldsOaCHv4b4fVC0=.6bf7d3d7-80e9-4ce5-adf9-ce254329d32c@github.com>

On Mon, 12 May 2025 14:02:43 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This gave me some grief when implementing [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269). From the initializations, it looks to me that `CompilerTask::hot_method()` is either `method()` or `nullptr`. In both cases, we do nothing special. So tracking `hot_method` is redundant, and can be purged. This improves performance a little, since it avoids extra handle-izing across compiler code, and of course it simplifies coding as well.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `compiler/`
>  - [x]  Linux x86_64 server fastdebug, `all`

Here goes.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25185#issuecomment-2876483328

From shade at openjdk.org  Tue May 13 13:20:07 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 13 May 2025 13:20:07 GMT
Subject: Integrated: 8356783: CompilerTask hot_method is redundant
In-Reply-To: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
References: <XNK-wNdedFHMGBi32f59QNXlVrB3jFzUMPx8Cy5TsDA=.016b7060-1f7d-405d-8f68-c70758de3a4a@github.com>
Message-ID: <7GteLkEIZDn7y_TejXwlsYTUiVcwJNkQ8ul61fQgZaM=.10db59b8-485c-43b2-8846-eca08355e70a@github.com>

On Mon, 12 May 2025 14:02:43 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This gave me some grief when implementing [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269). From the initializations, it looks to me that `CompilerTask::hot_method()` is either `method()` or `nullptr`. In both cases, we do nothing special. So tracking `hot_method` is redundant, and can be purged. This improves performance a little, since it avoids extra handle-izing across compiler code, and of course it simplifies coding as well.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `compiler/`
>  - [x]  Linux x86_64 server fastdebug, `all`

This pull request has now been integrated.

Changeset: 48d2acb3
Author:    Aleksey Shipilev <shade at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/48d2acb3860f742eb1c06b89f8a7208d0d7a01e7
Stats:     62 lines in 8 files changed: 0 ins; 47 del; 15 mod

8356783: CompilerTask hot_method is redundant

Reviewed-by: kvn, cslucas

-------------

PR: https://git.openjdk.org/jdk/pull/25185

From kvn at openjdk.org  Tue May 13 15:34:52 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Tue, 13 May 2025 15:34:52 GMT
Subject: RFR: 8356447: Change default for EagerJVMCI to true [v2]
In-Reply-To: <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
References: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
 <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
Message-ID: <YwKNk67jcU5oLyDIRzm6dBXm6LZ1XWrVhI_J_bvKFsQ=.7b05c0f8-cb7a-4e4c-aea4-368d07435b2e@github.com>

On Tue, 13 May 2025 06:52:27 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> By default, JVMCI and Graal initialization only occurs on the first top-tier (i.e. tier 4) JIT compilation request. This made sense prior to libgraal where the initialization was interpreted and so noticeably contributed to VM startup. However, with libgraal, the initialization is sufficiently fast to not impact startup.
>> 
>> The motivation for JVMCI and Graal eager initialization by default is to make Graal command line option processing happen in the same VM phase as handling of all other VM command line flags. That is, errors in Graal options should:
>> 1. Happen deterministically, not just for apps that run long enough to trigger a top tier JIT compilation. For example: `java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT --version`. In a JDK build that does not include Graal, this may succeed (and print out the version info) or result in a VM error ("Cannot use JVMCI compiler: No JVMCI compiler found").
>> 2. Stop the VM before any application code can be executed. This is just good hygiene.
>> 
>> This PR makes JVMCI initialization eager by default if `UseJVMCICompiler` is true.
>> This is done for both libgraal and jargraal so that the behavior is uniform. Since jargraal is now a development configuration, VM startup costs are not critical.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use FLAG_SET_ERGO_IF_DEFAULT

Marked as reviewed by kvn (Reviewer).

@dougxc please remind me. Is it true that with current libgraal no Java code is executed when it is initialized? Or do you still have calls into the core library?

-------------

PR Review: https://git.openjdk.org/jdk/pull/25121#pullrequestreview-2837264646
PR Comment: https://git.openjdk.org/jdk/pull/25121#issuecomment-2876984838

From never at openjdk.org  Tue May 13 15:39:51 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Tue, 13 May 2025 15:39:51 GMT
Subject: RFR: 8356447: Change default for EagerJVMCI to true [v2]
In-Reply-To: <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
References: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
 <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
Message-ID: <xeQX-3KwZOaeNHsPpsG5cZWq3UE6RFzj9baCcf4pFzc=.a97baea5-cf36-4173-900f-ac2843403a92@github.com>

On Tue, 13 May 2025 06:52:27 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> By default, JVMCI and Graal initialization only occurs on the first top-tier (i.e. tier 4) JIT compilation request. This made sense prior to libgraal where the initialization was interpreted and so noticeably contributed to VM startup. However, with libgraal, the initialization is sufficiently fast to not impact startup.
>> 
>> The motivation for JVMCI and Graal eager initialization by default is to make Graal command line option processing happen in the same VM phase as handling of all other VM command line flags. That is, errors in Graal options should:
>> 1. Happen deterministically, not just for apps that run long enough to trigger a top tier JIT compilation. For example: `java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT --version`. In a JDK build that does not include Graal, this may succeed (and print out the version info) or result in a VM error ("Cannot use JVMCI compiler: No JVMCI compiler found").
>> 2. Stop the VM before any application code can be executed. This is just good hygiene.
>> 
>> This PR makes JVMCI initialization eager by default if `UseJVMCICompiler` is true.
>> This is done for both libgraal and jargraal so that the behavior is uniform. Since jargraal is now a development configuration, VM startup costs are not critical.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use FLAG_SET_ERGO_IF_DEFAULT

Marked as reviewed by never (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/25121#pullrequestreview-2837280275

From dnsimon at openjdk.org  Tue May 13 15:53:53 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 13 May 2025 15:53:53 GMT
Subject: RFR: 8356447: Change default for EagerJVMCI to true [v2]
In-Reply-To: <YwKNk67jcU5oLyDIRzm6dBXm6LZ1XWrVhI_J_bvKFsQ=.7b05c0f8-cb7a-4e4c-aea4-368d07435b2e@github.com>
References: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
 <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
 <YwKNk67jcU5oLyDIRzm6dBXm6LZ1XWrVhI_J_bvKFsQ=.7b05c0f8-cb7a-4e4c-aea4-368d07435b2e@github.com>
Message-ID: <5ryTduYlJ4b6MFzxFmjZaXl8Y7LhX5fG2TIPWXKs2dk=.c6840419-1b67-47b5-953c-437e36cf1cc0@github.com>

On Tue, 13 May 2025 15:30:03 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

> @dougxc please remind me. Is it true that with current libgraal no Java code is executed when it is initialized? Or do you still have calls into the core library?

There are still some calls to `CompilerToVM.lookupType` during libgraal initialization but I think all the types it looks up will already be resolved so will not require Java code execution.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25121#issuecomment-2877051478

From dnsimon at openjdk.org  Tue May 13 16:02:00 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 13 May 2025 16:02:00 GMT
Subject: Integrated: 8356447: Change default for EagerJVMCI to true
In-Reply-To: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
References: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
Message-ID: <VfVh-E6eyDcAlwEOCoHanEXcQ4Azsbj9XTtiH7qbyjc=.ceecf363-d442-446a-8f90-b751ec2bc8d1@github.com>

On Thu, 8 May 2025 14:44:55 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> By default, JVMCI and Graal initialization only occurs on the first top-tier (i.e. tier 4) JIT compilation request. This made sense prior to libgraal where the initialization was interpreted and so noticeably contributed to VM startup. However, with libgraal, the initialization is sufficiently fast to not impact startup.
> 
> The motivation for JVMCI and Graal eager initialization by default is to make Graal command line option processing happen in the same VM phase as handling of all other VM command line flags. That is, errors in Graal options should:
> 1. Happen deterministically, not just for apps that run long enough to trigger a top tier JIT compilation. For example: `java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT --version`. In a JDK build that does not include Graal, this may succeed (and print out the version info) or result in a VM error ("Cannot use JVMCI compiler: No JVMCI compiler found").
> 2. Stop the VM before any application code can be executed. This is just good hygiene.
> 
> This PR makes JVMCI initialization eager by default if `UseJVMCICompiler` is true.
> This is done for both libgraal and jargraal so that the behavior is uniform. Since jargraal is now a development configuration, VM startup costs are not critical.

This pull request has now been integrated.

Changeset: 08b2df80
Author:    Doug Simon <dnsimon at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/08b2df80c68e182fbf6b1fc94e991c02b23040ec
Stats:     32 lines in 6 files changed: 29 ins; 0 del; 3 mod

8356447: Change default for EagerJVMCI to true

Reviewed-by: yzheng, kvn, never

-------------

PR: https://git.openjdk.org/jdk/pull/25121

From dnsimon at openjdk.org  Tue May 13 16:01:59 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 13 May 2025 16:01:59 GMT
Subject: RFR: 8356447: Change default for EagerJVMCI to true [v2]
In-Reply-To: <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
References: <hEyQpd1Q-XPhT_UYIqFn6Z2QlqxHHsT5_y4EMY5a-Sc=.32595747-b83f-401d-bad8-d871f73e1cd6@github.com>
 <4rohFHtNW1xFl9DQ47qqySsYnYxtfrO7-UZ--L3CRmA=.06aa514d-1846-47ae-b7bd-7535bed88fcb@github.com>
Message-ID: <z76uXolkQ9n1_ypRvwUPGCh0sIhWN5WPJedBjteLffE=.25eb1bf2-644b-4ff9-8d92-c0d813b1011e@github.com>

On Tue, 13 May 2025 06:52:27 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> By default, JVMCI and Graal initialization only occurs on the first top-tier (i.e. tier 4) JIT compilation request. This made sense prior to libgraal where the initialization was interpreted and so noticeably contributed to VM startup. However, with libgraal, the initialization is sufficiently fast to not impact startup.
>> 
>> The motivation for JVMCI and Graal eager initialization by default is to make Graal command line option processing happen in the same VM phase as handling of all other VM command line flags. That is, errors in Graal options should:
>> 1. Happen deterministically, not just for apps that run long enough to trigger a top tier JIT compilation. For example: `java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT --version`. In a JDK build that does not include Graal, this may succeed (and print out the version info) or result in a VM error ("Cannot use JVMCI compiler: No JVMCI compiler found").
>> 2. Stop the VM before any application code can be executed. This is just good hygiene.
>> 
>> This PR makes JVMCI initialization eager by default if `UseJVMCICompiler` is true.
>> This is done for both libgraal and jargraal so that the behavior is uniform. Since jargraal is now a development configuration, VM startup costs are not critical.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use FLAG_SET_ERGO_IF_DEFAULT

Thanks for the reviews!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25121#issuecomment-2877108155

From duke at openjdk.org  Tue May 13 22:38:53 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Tue, 13 May 2025 22:38:53 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]
In-Reply-To: <m-sqt9KlVF_WJ5Plh4_nJaia7LTBpzHkO8svt7mcisw=.2ffba1c6-a433-41e3-8a7f-b33ff874b472@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>
 <m-sqt9KlVF_WJ5Plh4_nJaia7LTBpzHkO8svt7mcisw=.2ffba1c6-a433-41e3-8a7f-b33ff874b472@github.com>
Message-ID: <wSpmNko5KJYZII3Bck2H9E-H2QV04RRM85uRnxLdMuY=.815bb777-05ed-4581-b55a-8273ec1393c8@github.com>

On Wed, 7 May 2025 09:25:30 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Add new set of cbrt micro-benchmarks
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 62:
> 
>> 60: {
>> 61:     0, 3220193280
>> 62: };
> 
> What is this constant?
> 
> Its value is 0xbff0400000000000, which is -ve bit set, bias (top bit of exponent) clear, but one of the bits in the fraction is set. So its value is -0x1.04p+0. As well as the exponent it also sets the 1 bit, just below the 5 most significant bits of the fraction. I guess this in effect rounds up the value that is added in the final rounding.
> 
> Is that right?

The idea is mainly that the _EXP_MSK2_ constant operates on the input to match it up with its corresponding entries in the lookup tables: _rcp_table_, _cbrt_table_, and _D_table_. The key part starts with computing the difference (_r = x - x'_) shown in line 260 below.
```c++
__ subsd(xmm1, xmm3);
```

Here _x_ is essentially the input fraction with all bits while _x'_ is the input fraction with _EXP_MSK2_ applied. This is then multiplied (_r = (x - x') * rcp_table(x')_) with the corresponding lookup table entry (_-1 / 1.b1 b2 b3 b4 b5 b6_ where _b6=1_) as shown in line 264 below.
```c++
__ mulsd(xmm1, xmm4);
```

This value then gets used by subsequent steps that involve entries from _cbrt_table_ and _D_table_. It won't necessarily round the final result up though as those effects will depend on what the input is. However, the polynomial coefficients will have a bigger impact on rounding. For a summary of the approximations, please refer to the algorithm description comment block near the beginning of the source file.
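
The reduction step described above can be sketched in scalar C++ as follows. This is only an illustration under stated assumptions, not the stub's actual instruction sequence: it assumes the input has already been scaled to a positive value in [1, 2), it computes the reciprocal directly instead of reading the _rcp_table_, and it uses `+1/x'` where the table stores `-1/x'`. The helper names (`table_point`, `reduced_arg`) and the two bit masks are hypothetical; the masks simply keep the top 5 fraction bits and set the 6th, matching the _1.b1 b2 b3 b4 b5 b6_ with _b6=1_ form.

```c++
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static uint64_t to_bits(double d) {
  uint64_t b;
  std::memcpy(&b, &d, sizeof b);
  return b;
}

static double from_bits(uint64_t b) {
  double d;
  std::memcpy(&d, &b, sizeof d);
  return d;
}

// x' = 1.b1 b2 b3 b4 b5 1: keep the top 5 fraction bits of x and set
// the 6th, so x' sits in the middle of the 1/32-wide interval holding x.
double table_point(double x) {
  assert(x >= 1.0 && x < 2.0);
  const uint64_t top5_fraction = 0x000F800000000000ULL;  // fraction bits 51..47
  const uint64_t one_plus_b6   = 0x3FF0400000000000ULL;  // exponent of 1.0, bit 46 set
  return from_bits((to_bits(x) & top5_fraction) | one_plus_b6);
}

// r = (x - x') * (1 / x'): a small relative offset from the table point.
double reduced_arg(double x) {
  const double xp = table_point(x);
  return (x - xp) * (1.0 / xp);
}
```

For example, `table_point(1.5)` is `1.515625`, and `|reduced_arg(x)|` stays below `1/64` for any `x` in [1, 2), which is what lets a short polynomial in `r` correct the table lookup.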

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2087726049

From sviswanathan at openjdk.org  Wed May 14 00:41:56 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Wed, 14 May 2025 00:41:56 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v4]
In-Reply-To: <Ws9R1LJ6lpKlnXJ_2sVAhRAEvd6b9P3DBE6viW8r_4M=.f984d862-4c0f-4a04-bcfc-a6df781b485c@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <Ws9R1LJ6lpKlnXJ_2sVAhRAEvd6b9P3DBE6viW8r_4M=.f984d862-4c0f-4a04-bcfc-a6df781b485c@github.com>
Message-ID: <-L1FHPpbVOvHTxMFUPMGIY9g8UFAFmJDgNRkoFONKnI=.ddef5354-e00f-4c2a-80c3-b48325fe51d2@github.com>

On Mon, 12 May 2025 09:05:10 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Restoring copyright notice on ML_KEM.java

Only reviewed three intrinsics so far, more review to do.

src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 693:

> 691: // a (short[256]) = c_rarg1
> 692: // b (short[256]) = c_rarg2
> 693: // kyberConsts (short[40]) = c_rarg3

kyberConsts is not one of the arguments passed in.

src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 696:

> 694: address generate_kyberAddPoly_2_avx512(StubGenerator *stubgen,
> 695:                                        MacroAssembler *_masm) {
> 696: 

The Java code for "implKyberAddPoly(short[] result, short[] a, short[] b)" does BarrettReduction but the intrinsic code here does not. Is that intentional and how is the reduction handled?

src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 742:

> 740: // b (short[256]) = c_rarg2
> 741: // c (short[256]) = c_rarg3
> 742: // kyberConsts (short[40]) = c_rarg4

kyberConsts is not one of the arguments passed in.

src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 799:

> 797: // parsedLength (int) = c_rarg3
> 798: address generate_kyber12To16_avx512(StubGenerator *stubgen,
> 799:                                     MacroAssembler *_masm) {

If AVX512_VBMI and AVX512_VBMI2 are available, it looks to me that the loop body of this algorithm can be implemented using more efficient instructions in 5 simple steps:

Step 1:
Load 0-47, 48-95, 96-143, 144-191 condensed bytes into xmm0, xmm1, xmm2, xmm3 respectively using masked load.

Step 2:
Use vpermb to arrange xmm0 such that bytes 1, 4, 7, ... are duplicated
xmm0 before  b47, b46, ..., b0 where each b is a byte
xmm0 after b47 b46 b46 b45, ......., b5 b4 b4 b3 b2 b1 b1 b0  
Repeat this for xmm1, xmm2, xmm3

Step 3:
Use vpshldvw to shift every word (16 bits) in the xmm0 appropriately with variable shift
Shift word 31 by 4, word 30 by 0, ... word 3 by 4, word 2 by 0,  word 1 by 4, word 0 by 0
Repeat this for xmm1, xmm2, xmm3

Step 4:
Use vpand to "and" each word element in xmm0 by 0xfff.
Repeat this for xmm1, xmm2, xmm3

Step 5:
Store xmm0 into parsed
Store xmm1 into parsed + 64
Store xmm2 into parsed + 128
Store xmm3 into parsed + 192

If you think there is not sufficient time, we could look into it after the merge of this PR as well.
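
For readers following the five steps above, here is a scalar Java model of what they compute for each group of three condensed bytes. It is only a sketch: the class and method names are hypothetical, it processes one byte triple at a time instead of 48 bytes per register, and the odd words are moved into place with a plain right shift by 4, which is this model's interpretation of the per-word variable shift rather than a statement about the exact vector instruction operands.

```java
final class TwelveToSixteenSketch {
    // Expands packed 12-bit values (3 bytes -> 2 values) into 16-bit shorts.
    static short[] unpack12(byte[] condensed) {
        if (condensed.length % 3 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 3");
        }
        short[] parsed = new short[condensed.length / 3 * 2];
        for (int j = 0, i = 0; j < condensed.length; j += 3, i += 2) {
            int b0 = condensed[j] & 0xff;
            int b1 = condensed[j + 1] & 0xff;
            int b2 = condensed[j + 2] & 0xff;
            // Step 2: duplicate the middle byte so each 16-bit word holds one value.
            int evenWord = b0 | (b1 << 8); // bytes (b1, b0)
            int oddWord = b1 | (b2 << 8);  // bytes (b2, b1)
            // Step 3: shift even words by 0 and odd words by 4 bits.
            // Step 4: keep the low 12 bits of every word.
            parsed[i] = (short) (evenWord & 0xfff);
            parsed[i + 1] = (short) ((oddWord >>> 4) & 0xfff);
        }
        return parsed;
    }

    public static void main(String[] args) {
        byte[] condensed = {0x34, 0x12, (byte) 0xAB};
        short[] parsed = unpack12(condensed);
        System.out.println(Integer.toHexString(parsed[0]) + " " + Integer.toHexString(parsed[1]));
    }
}
```

In this model, 192 condensed bytes expand to 128 shorts (256 bytes), which matches the four 64-byte stores in Step 5.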

-------------

PR Review: https://git.openjdk.org/jdk/pull/24953#pullrequestreview-2837616051
PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2087361991
PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2087377640
PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2087331798
PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2087834072

From duke at openjdk.org  Wed May 14 11:43:58 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Wed, 14 May 2025 11:43:58 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v4]
In-Reply-To: <-L1FHPpbVOvHTxMFUPMGIY9g8UFAFmJDgNRkoFONKnI=.ddef5354-e00f-4c2a-80c3-b48325fe51d2@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <Ws9R1LJ6lpKlnXJ_2sVAhRAEvd6b9P3DBE6viW8r_4M=.f984d862-4c0f-4a04-bcfc-a6df781b485c@github.com>
 <-L1FHPpbVOvHTxMFUPMGIY9g8UFAFmJDgNRkoFONKnI=.ddef5354-e00f-4c2a-80c3-b48325fe51d2@github.com>
Message-ID: <VtfT2pEZw3xrqQv_ix91805LPYYj4FOm8wbizMM1-Ak=.686cde76-6175-40f9-ba4a-14104c8d879d@github.com>

On Tue, 13 May 2025 17:53:50 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Restoring copyright notice on ML_KEM.java
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 693:
> 
>> 691: // a (short[256]) = c_rarg1
>> 692: // b (short[256]) = c_rarg2
>> 693: // kyberConsts (short[40]) = c_rarg3
> 
> kyberConsts is not one of the arguments passed in.

Fixed.

> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 696:
> 
>> 694: address generate_kyberAddPoly_2_avx512(StubGenerator *stubgen,
>> 695:                                        MacroAssembler *_masm) {
>> 696: 
> 
> The Java code for "implKyberAddPoly(short[] result, short[] a, short[] b)" does BarrettReduction but the intrinsic code here does not. Is that intentional and how is the reduction handled?

Actually, the Java version is the one that is too cautious. There is Barrett reduction after at most 4 consecutive uses of mlKemAddPoly(), so doing the reduction in implKyberAddPoly() is not necessary. Thanks for discovering this!
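
To make the lazy-reduction argument concrete, here is a small Java sketch. It is not the code from ML_KEM.java: the class and method names, the Barrett constants (multiplier 20159 with a 26-bit shift), and the assumption that every input coefficient lies in [0, q) are choices made for this example. Under that assumption, four consecutive two-input additions produce coefficients of at most 5 * 3328 = 16640, which still fits in a short, so a single reduction afterwards is enough.

```java
final class LazyBarrettSketch {
    static final int Q = 3329;                   // ML-KEM modulus
    static final int BARRETT_MULTIPLIER = 20159; // round(2^26 / Q), assumed constant
    static final int BARRETT_SHIFT = 26;

    // Returns a value congruent to a modulo Q, in the range [0, Q].
    static short barrettReduce(short a) {
        int t = (a * BARRETT_MULTIPLIER) >> BARRETT_SHIFT;
        return (short) (a - t * Q);
    }

    // Coefficient-wise addition without any reduction, as in the intrinsic.
    static void addPoly(short[] result, short[] a, short[] b) {
        for (int i = 0; i < result.length; i++) {
            result[i] = (short) (a[i] + b[i]);
        }
    }

    // One reduction pass over all coefficients.
    static void reducePoly(short[] coeffs) {
        for (int i = 0; i < coeffs.length; i++) {
            coeffs[i] = barrettReduce(coeffs[i]);
        }
    }

    public static void main(String[] args) {
        short[] acc = new short[256];
        short[] term = new short[256];
        java.util.Arrays.fill(acc, (short) 3000);
        java.util.Arrays.fill(term, (short) 3000);
        for (int k = 0; k < 4; k++) {
            addPoly(acc, acc, term); // no reduction between additions
        }
        reducePoly(acc);             // single reduction at the end
        System.out.println(acc[0]);
    }
}
```

The design point is that the reduction cost is paid once per chain of additions instead of once per addition, as long as the intermediate sums cannot overflow 16 bits.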

> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 742:
> 
>> 740: // b (short[256]) = c_rarg2
>> 741: // c (short[256]) = c_rarg3
>> 742: // kyberConsts (short[40]) = c_rarg4
> 
> kyberConsts is not one of the arguments passed in.

Fixed.

> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 799:
> 
>> 797: // parsedLength (int) = c_rarg3
>> 798: address generate_kyber12To16_avx512(StubGenerator *stubgen,
>> 799:                                     MacroAssembler *_masm) {
> 
> If AVX512_VBMI and AVX512_VBMI2 are available, it looks to me that the loop body of this algorithm can be implemented using more efficient instructions in 5 simple steps:
> 
> Step 1:
> Load 0-47, 48-95, 96-143, 144-191 condensed bytes into xmm0, xmm1, xmm2, xmm3 respectively using masked load.
> 
> Step 2:
> Use vpermb to arrange xmm0 such that bytes 1, 4, 7, ... are duplicated
> xmm0 before  b47, b46, ..., b0 where each b is a byte
> xmm0 after b47 b46 b46 b45, ......., b5 b4 b4 b3 b2 b1 b1 b0  
> Repeat this for xmm1, xmm2, xmm3
> 
> Step 3:
> Use vpshldvw to shift every word (16 bits) in the xmm0 appropriately with variable shift
> Shift word 31 by 4, word 30 by 0, ... word 3 by 4, word 2 by 0,  word 1 by 4, word 0 by 0
> Repeat this for xmm1, xmm2, xmm3
> 
> Step 4:
> Use vpand to "and" each word element in xmm0 by 0xfff.
> Repeat this for xmm1, xmm2, xmm3
> 
> Step 5:
> Store xmm0 into parsed
> Store xmm1 into parsed + 64
> Store xmm2 into parsed + 128
> Store xmm3 into parsed + 192
> 
> If you think there is not sufficient time, we could look into it after the merge of this PR as well.

Yes, that way we could speed this up a little (in itself the gain might be significant). However, with the current intrinsics this function contributes only about 1.5% of the overall running time, so it would not matter much, and not all AVX-512 capable processors support VBMI.
So I would rather not do it in this PR.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2088738946
PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2088738841
PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2088738704
PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2088738615

From duke at openjdk.org  Wed May 14 11:49:11 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Wed, 14 May 2025 11:49:11 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v5]
In-Reply-To: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
Message-ID: <VZ7MfXaW0GQwLeqj7_HVeNyrgXlMuIFnQIA5treq7r8=.99e1921a-6169-4835-8bcd-a64bc6cd250f@github.com>

> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.

Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:

  Responding to comments by Sandhya.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24953/files
  - new: https://git.openjdk.org/jdk/pull/24953/files/215b346f..32571f39

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=03-04

  Stats: 4 lines in 2 files changed: 0 ins; 3 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/24953.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24953/head:pull/24953

PR: https://git.openjdk.org/jdk/pull/24953

From yzheng at openjdk.org  Wed May 14 13:24:02 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 14 May 2025 13:24:02 GMT
Subject: RFR: 8356971: [JVMCI] Export VM_Version::supports_avx512_simd_sort to
 JVMCI compiler
Message-ID: <woxDsR4SGEemo9J_mhThicoHYRXlkTIhHb7ojAdt_RA=.7779a615-b702-45d3-ab5b-4a7b83af3c18@github.com>

HotSpot selects between AVX512 and AVX2 implementations of array sort/partition stubs based on the return value of VM_Version::supports_avx512_simd_sort. The AVX2 version supports fewer element types than the AVX512 version and may fail at runtime if unsupported types are encountered. This capability information should be exposed to the JVMCI compiler to properly guard against incorrect intrinsification. This is especially important because VM_Version::supports_avx512_simd_sort includes a special exclusion rule for AMD Zen4, due to performance considerations.

-------------

Commit messages:
 - 8356971: [JVMCI] Export VM_Version::supports_avx512_simd_sort to JVMCI compiler.

Changes: https://git.openjdk.org/jdk/pull/25225/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25225&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8356971
  Stats: 4 lines in 3 files changed: 4 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/25225.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25225/head:pull/25225

PR: https://git.openjdk.org/jdk/pull/25225

From dnsimon at openjdk.org  Wed May 14 13:42:52 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 14 May 2025 13:42:52 GMT
Subject: RFR: 8356971: [JVMCI] Export VM_Version::supports_avx512_simd_sort
 to JVMCI compiler
In-Reply-To: <woxDsR4SGEemo9J_mhThicoHYRXlkTIhHb7ojAdt_RA=.7779a615-b702-45d3-ab5b-4a7b83af3c18@github.com>
References: <woxDsR4SGEemo9J_mhThicoHYRXlkTIhHb7ojAdt_RA=.7779a615-b702-45d3-ab5b-4a7b83af3c18@github.com>
Message-ID: <pMi5MzBnAro1N5S9ZRaMY_pSVNL6UFcBedU2-y6HJHA=.09e95fac-a9a9-4891-a1ea-8296f081a553@github.com>

On Wed, 14 May 2025 13:16:26 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

> HotSpot selects between AVX512 and AVX2 implementations of array sort/partition stubs based on the return value of VM_Version::supports_avx512_simd_sort. The AVX2 version supports fewer element types than the AVX512 version and may fail at runtime if unsupported types are encountered. This capability information should be exposed to the JVMCI compiler to properly guard against incorrect intrinsification. This is especially important because VM_Version::supports_avx512_simd_sort includes a special exclusion rule for AMD Zen4, due to performance considerations.

Marked as reviewed by dnsimon (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/25225#pullrequestreview-2840252727

From sviswanathan at openjdk.org  Wed May 14 16:03:52 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Wed, 14 May 2025 16:03:52 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v4]
In-Reply-To: <VtfT2pEZw3xrqQv_ix91805LPYYj4FOm8wbizMM1-Ak=.686cde76-6175-40f9-ba4a-14104c8d879d@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <Ws9R1LJ6lpKlnXJ_2sVAhRAEvd6b9P3DBE6viW8r_4M=.f984d862-4c0f-4a04-bcfc-a6df781b485c@github.com>
 <-L1FHPpbVOvHTxMFUPMGIY9g8UFAFmJDgNRkoFONKnI=.ddef5354-e00f-4c2a-80c3-b48325fe51d2@github.com>
 <VtfT2pEZw3xrqQv_ix91805LPYYj4FOm8wbizMM1-Ak=.686cde76-6175-40f9-ba4a-14104c8d879d@github.com>
Message-ID: <xVBni3yO-PigaVYkX9ar3FOEQfN8Qb3Wg_HQvS-ky6Q=.2104253b-646a-4a9b-bff5-89f1feed9434@github.com>

On Wed, 14 May 2025 11:41:30 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 696:
>> 
>>> 694: address generate_kyberAddPoly_2_avx512(StubGenerator *stubgen,
>>> 695:                                        MacroAssembler *_masm) {
>>> 696: 
>> 
>> The Java code for "implKyberAddPoly(short[] result, short[] a, short[] b)" does BarrettReduction but the intrinsic code here does not. Is that intentional and how is the reduction handled?
>
> Actually, the Java version is the one that is too cautious. There is Barrett reduction after at most 4 consecutive uses of mlKemAddPoly(), so doing the reduction in implKyberAddPoly() is not necessary. Thanks for discovering this!

Thanks. I have another question: is there a reason that the Java versions of AddPoly (for both 2 and 3 inputs) return 1, whereas the corresponding intrinsics return 0?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2089278218

From duke at openjdk.org  Wed May 14 16:30:54 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Wed, 14 May 2025 16:30:54 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v4]
In-Reply-To: <xVBni3yO-PigaVYkX9ar3FOEQfN8Qb3Wg_HQvS-ky6Q=.2104253b-646a-4a9b-bff5-89f1feed9434@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <Ws9R1LJ6lpKlnXJ_2sVAhRAEvd6b9P3DBE6viW8r_4M=.f984d862-4c0f-4a04-bcfc-a6df781b485c@github.com>
 <-L1FHPpbVOvHTxMFUPMGIY9g8UFAFmJDgNRkoFONKnI=.ddef5354-e00f-4c2a-80c3-b48325fe51d2@github.com>
 <VtfT2pEZw3xrqQv_ix91805LPYYj4FOm8wbizMM1-Ak=.686cde76-6175-40f9-ba4a-14104c8d879d@github.com>
 <xVBni3yO-PigaVYkX9ar3FOEQfN8Qb3Wg_HQvS-ky6Q=.2104253b-646a-4a9b-bff5-89f1feed9434@github.com>
Message-ID: <HdAbWarRAmvwl2oK_cuwA9Qw1gRRQ_F6Ln0BbVhndZI=.cb689809-4d62-4188-9496-2c303605d8c2@github.com>

On Wed, 14 May 2025 16:00:55 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Actually, the Java version is the one that is too cautious. There is Barrett reduction after at most 4 consecutive uses of mlKemAddPoly(), so doing the reduction in implKyberAddPoly() is not necessary. Thanks for discovering this!
>
> Thanks. I have another question: is there a reason that the Java versions of AddPoly (for both 2 and 3 inputs) return 1, whereas the corresponding intrinsics return 0?

I use that for debugging. E.g. it is fairly easy to change the Java code to call both the intrinsic and Java version and compare the results. I don't see any harm in leaving that in the production version, since it is always ignored.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2089322079

From yzheng at openjdk.org  Wed May 14 19:50:55 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 14 May 2025 19:50:55 GMT
Subject: RFR: 8356971: [JVMCI] Export VM_Version::supports_avx512_simd_sort
 to JVMCI compiler
In-Reply-To: <woxDsR4SGEemo9J_mhThicoHYRXlkTIhHb7ojAdt_RA=.7779a615-b702-45d3-ab5b-4a7b83af3c18@github.com>
References: <woxDsR4SGEemo9J_mhThicoHYRXlkTIhHb7ojAdt_RA=.7779a615-b702-45d3-ab5b-4a7b83af3c18@github.com>
Message-ID: <to_uI6X1FjsaPfj1-nj_PAvf7Ev0ShPsVTUbIuUPb44=.cdc65459-0ef0-477c-ad65-c442cf3f0668@github.com>

On Wed, 14 May 2025 13:16:26 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

> HotSpot selects between AVX512 and AVX2 implementations of array sort/partition stubs based on the return value of VM_Version::supports_avx512_simd_sort. The AVX2 version supports fewer element types than the AVX512 version and may fail at runtime if unsupported types are encountered. This capability information should be exposed to the JVMCI compiler to properly guard against incorrect intrinsification. This is especially important because VM_Version::supports_avx512_simd_sort includes a special exclusion rule for AMD Zen4, due to performance considerations.

Thanks for the review! Passed tier1-3.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25225#issuecomment-2881367102

From yzheng at openjdk.org  Wed May 14 19:50:55 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 14 May 2025 19:50:55 GMT
Subject: Integrated: 8356971: [JVMCI] Export
 VM_Version::supports_avx512_simd_sort to JVMCI compiler
In-Reply-To: <woxDsR4SGEemo9J_mhThicoHYRXlkTIhHb7ojAdt_RA=.7779a615-b702-45d3-ab5b-4a7b83af3c18@github.com>
References: <woxDsR4SGEemo9J_mhThicoHYRXlkTIhHb7ojAdt_RA=.7779a615-b702-45d3-ab5b-4a7b83af3c18@github.com>
Message-ID: <C4-IrGJYMmQlutVKJ2o9qeZOMVtX2ZSNrbngdCfU0Co=.41fa7a64-efc7-4a52-8477-9b91ad1b7fc5@github.com>

On Wed, 14 May 2025 13:16:26 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

> HotSpot selects between AVX512 and AVX2 implementations of array sort/partition stubs based on the return value of VM_Version::supports_avx512_simd_sort. The AVX2 version supports fewer element types than the AVX512 version and may fail at runtime if unsupported types are encountered. This capability information should be exposed to the JVMCI compiler to properly guard against incorrect intrinsification. This is especially important because VM_Version::supports_avx512_simd_sort includes a special exclusion rule for AMD Zen4, due to performance considerations.

This pull request has now been integrated.

Changeset: 948ade8e
Author:    Yudi Zheng <yzheng at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/948ade8e7003a41683600428c8e3155c7ed798db
Stats:     4 lines in 3 files changed: 4 ins; 0 del; 0 mod

8356971: [JVMCI] Export VM_Version::supports_avx512_simd_sort to JVMCI compiler

Reviewed-by: dnsimon

-------------

PR: https://git.openjdk.org/jdk/pull/25225

From sviswanathan at openjdk.org  Thu May 15 00:38:54 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Thu, 15 May 2025 00:38:54 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v5]
In-Reply-To: <VZ7MfXaW0GQwLeqj7_HVeNyrgXlMuIFnQIA5treq7r8=.99e1921a-6169-4835-8bcd-a64bc6cd250f@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <VZ7MfXaW0GQwLeqj7_HVeNyrgXlMuIFnQIA5treq7r8=.99e1921a-6169-4835-8bcd-a64bc6cd250f@github.com>
Message-ID: <q4DX289iZdnay70YIWdtRpE5G2f_5HlvBtFUOdlQOvk=.1f6876ac-f016-4223-b6b2-e664d855d56f@github.com>

On Wed, 14 May 2025 11:49:11 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Responding to comments by Sandhya.

Another minor comment. Rest of the PR looks good to me.

src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 893:

> 891: //
> 892: // coeffs (short[256]) = c_rarg0
> 893: // kyberConsts (short[40]) = c_rarg1

kyberConsts is not an input parameter to implKyberBarrettReduce.

-------------

PR Review: https://git.openjdk.org/jdk/pull/24953#pullrequestreview-2840763895
PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2089284332

From tschatzl at openjdk.org  Thu May 15 08:18:47 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Thu, 15 May 2025 08:18:47 GMT
Subject: RFR: 8342382: Implementation of JEP G1: Improve Application
 Throughput with a More Efficient Write-Barrier [v38]
In-Reply-To: <tt7smwtk5Lj4CC0R41IyWe6aLXA2JZPrJT3Bq1ILHr0=.abd86704-19f0-4350-b218-184ee0917f1f@github.com>
References: <tt7smwtk5Lj4CC0R41IyWe6aLXA2JZPrJT3Bq1ILHr0=.abd86704-19f0-4350-b218-184ee0917f1f@github.com>
Message-ID: <oGs-LCsoAXaXs1T2ikN-Oaf8FM9M7mnBI3BYb-P2vE0=.427e8a5a-c47d-4956-a423-8fd03d6778bd@github.com>

> Hi all,
> 
>   please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier.
> 
> The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25.
> 
> ### Current situation
> 
> With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier.
> 
> The main reason for the current barrier is how g1 implements concurrent refinement:
> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations.
> * For correctness, dirty card updates require fine-grained synchronization between mutator and refinement threads.
> * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible.
> 
> These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code:
> 
> 
> // Filtering
> if (region(@x.a) == region(y)) goto done; // same region check
> if (y == null) goto done;     // null value check
> if (card(@x.a) == young_card) goto done;  // write to young gen check
> StoreLoad;                // synchronize
> if (card(@x.a) == dirty_card) goto done;
> 
> *card(@x.a) = dirty
> 
> // Card tracking
> enqueue(card-address(@x.a)) into thread-local-dcq;
> if (thread-local-dcq is not full) goto done;
> 
> call runtime to move thread-local-dcq into dcqs
> 
> done:
> 
> 
> Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc.
> 
> The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining.
> 
> There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links).
> 
> The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se...

Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 54 commits:

 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - * ayang review: remove sweep_epoch
 - Merge branch 'master' into card-table-as-dcq-merge
 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - * ayang review (part 2 - yield duration changes)
 - * ayang review (part 1)
 - * indentation fix
 - * remove support for 32 bit x86 in the barrier generation code, following latest changes from @shade
 - Merge branch 'master' into 8342382-card-table-instead-of-dcq
 - * fixes after merge related to 32 bit x86 removal
 - ... and 44 more: https://git.openjdk.org/jdk/compare/5e50a584...1def83af

-------------

Changes: https://git.openjdk.org/jdk/pull/23739/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=37
  Stats: 7088 lines in 111 files changed: 2568 ins; 3599 del; 921 mod
  Patch: https://git.openjdk.org/jdk/pull/23739.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739

PR: https://git.openjdk.org/jdk/pull/23739

From duke at openjdk.org  Thu May 15 13:33:42 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Thu, 15 May 2025 13:33:42 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v6]
In-Reply-To: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
Message-ID: <XiK_5EerkQ60z0GRdkPkYuJcVxqbyZqO0ofx1Zd1JNM=.51a52b01-6cbd-44a8-ad1d-99a083e037e9@github.com>

> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.

Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:

  Response to review comment + loading constants with broadcast op.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24953/files
  - new: https://git.openjdk.org/jdk/pull/24953/files/32571f39..e4f3264e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=04-05

  Stats: 107 lines in 1 file changed: 39 ins; 39 del; 29 mod
  Patch: https://git.openjdk.org/jdk/pull/24953.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24953/head:pull/24953

PR: https://git.openjdk.org/jdk/pull/24953

From duke at openjdk.org  Thu May 15 13:48:56 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Thu, 15 May 2025 13:48:56 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v5]
In-Reply-To: <q4DX289iZdnay70YIWdtRpE5G2f_5HlvBtFUOdlQOvk=.1f6876ac-f016-4223-b6b2-e664d855d56f@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <VZ7MfXaW0GQwLeqj7_HVeNyrgXlMuIFnQIA5treq7r8=.99e1921a-6169-4835-8bcd-a64bc6cd250f@github.com>
 <q4DX289iZdnay70YIWdtRpE5G2f_5HlvBtFUOdlQOvk=.1f6876ac-f016-4223-b6b2-e664d855d56f@github.com>
Message-ID: <UOxLJEYO4jG8lJ4KmvQplSE3k_68SgzdMeN8QVXSd6c=.77f08243-b7e0-4807-b3a3-c4d019427c81@github.com>

On Wed, 14 May 2025 16:04:31 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Responding to comments by Sandhya.
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 893:
> 
>> 891: //
>> 892: // coeffs (short[256]) = c_rarg0
>> 893: // kyberConsts (short[40]) = c_rarg1
> 
> kyberConsts is not an input parameter to implKyberBarrettReduce.

Removed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2091216578

From duke at openjdk.org  Thu May 15 14:06:53 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Thu, 15 May 2025 14:06:53 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v5]
In-Reply-To: <q4DX289iZdnay70YIWdtRpE5G2f_5HlvBtFUOdlQOvk=.1f6876ac-f016-4223-b6b2-e664d855d56f@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <VZ7MfXaW0GQwLeqj7_HVeNyrgXlMuIFnQIA5treq7r8=.99e1921a-6169-4835-8bcd-a64bc6cd250f@github.com>
 <q4DX289iZdnay70YIWdtRpE5G2f_5HlvBtFUOdlQOvk=.1f6876ac-f016-4223-b6b2-e664d855d56f@github.com>
Message-ID: <g-1EkQ9fSXD-wqBrQdxT1pBOk5b-fALwxM19HskpY5k=.deee174d-e7f8-41cc-854d-d2f84e21680a@github.com>

On Thu, 15 May 2025 00:36:26 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Responding to comments by Sandhya.
>
> Another minor comment. Rest of the PR looks good to me.

@sviswa7, thanks a lot for the review!  If you agree with my changes to load the constants using broadcasting instructions instead of full AVX register loads, would you be so kind as to approve the PR and sponsor my integration?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24953#issuecomment-2883937966

From dnsimon at openjdk.org  Thu May 15 21:54:17 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 15 May 2025 21:54:17 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
Message-ID: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>

The `EnableJVMCI` flag currently serves 2 purposes:
* Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
* [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.

This PR changes nothing about the first point.

On the second point, to use the `jdk.internal.vm.ci` module, it must now be explicitly added with `--add-modules=jdk.internal.vm.ci`, which will also set `EnableJVMCI` as a side-effect.

The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on an archive of the root module set created in a separate JVM execution. If the root module set is different than what's in the archive at runtime, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved when creating the archive, it must not be resolved in the runtime using the archive. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci` for libgraal to have the startup advantages of AOTClassLinking.

-------------

Commit messages:
 - added comment in check_vm_args_consistency
 - --add-modules=jdk.internal.vm.ci implies -XX:+EnableJVMCI

Changes: https://git.openjdk.org/jdk/pull/25240/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8345826
  Stats: 63 lines in 10 files changed: 45 ins; 5 del; 13 mod
  Patch: https://git.openjdk.org/jdk/pull/25240.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25240/head:pull/25240

PR: https://git.openjdk.org/jdk/pull/25240

From never at openjdk.org  Thu May 15 21:54:18 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Thu, 15 May 2025 21:54:18 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <uSnI8tVARc_qS2VF4UfcUlhkTIEEf2TVlgckZJMBV1M=.742bbdcd-279b-43b1-9504-ac279bda0cca@github.com>

On Wed, 14 May 2025 22:00:30 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module, it must now be explicitly added with `--add-modules=jdk.internal.vm.ci`, which will also set `EnableJVMCI` as a side-effect.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on an archive of the root module set created in a separate JVM execution. If the root module set is different than what's in the archive at runtime, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved when creating the archive, it must not be resolved in the runtime using the archive. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci` for libgraal to have the startup advantages of AOTClassLinking.

I found your explanation quite confusing, but the bug title is actually the most clear description of the fix.  Basically libjvmci doesn't require the existence of jdk.internal.vm.ci on the HotSpot side since it has effectively compiled that into itself.  So we are decoupling the ability to use JVMCI from the presence of the JVMCI module.  A short comment along these lines in at least your changes in check_vm_args_consistency would be helpful I think.

I do find it confusing that we are explicitly passing `--add-modules=jdk.internal.vm.ci` in a bunch of the tests.  Is that now necessary or are you just exercising the alternate ways of enabling JVMCI?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25240#issuecomment-2884370108

From dnsimon at openjdk.org  Thu May 15 21:54:18 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 15 May 2025 21:54:18 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <uSnI8tVARc_qS2VF4UfcUlhkTIEEf2TVlgckZJMBV1M=.742bbdcd-279b-43b1-9504-ac279bda0cca@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <uSnI8tVARc_qS2VF4UfcUlhkTIEEf2TVlgckZJMBV1M=.742bbdcd-279b-43b1-9504-ac279bda0cca@github.com>
Message-ID: <tH31lJu0I4vpqBEfpXixe6dX-Nktm7lfVLauqr1IjT4=.07c27891-1d94-4c74-ba54-0463894c63c0@github.com>

On Thu, 15 May 2025 16:10:12 GMT, Tom Rodriguez <never at openjdk.org> wrote:

> I found your explanation quite confusing, but the bug title is actually the most clear description of the fix. Basically libjvmci doesn't require the existence of jdk.internal.vm.ci on the HotSpot side since it has effectively compiled that into itself. So we are decoupling the ability to use JVMCI from the presence of the JVMCI module. A short comment along these lines in at least your changes in check_vm_args_consistency would be helpful I think.

I added the requested comment and tried to clarify the PR description. Let me know if clarification is needed.

> I do find it confusing that we are explicitly passing `--add-modules=jdk.internal.vm.ci` in a bunch of the tests. Is that now necessary or are you just exercising the alternate ways of enabling JVMCI?

Without that option, the module will be missing, and without the fail-fast check in `check_vm_args_consistency` you would get an error such as:

Uncaught exception at src/hotspot/share/jvmci/jvmciRuntime.cpp:1433
java.lang.NoClassDefFoundError: jdk/vm/ci/code/Architecture
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (jvmciRuntime.cpp:1636), pid=1979, tid=9731
#  fatal error: Fatal JVMCI exception (see JVMCI Events for stack trace): Uncaught exception at src/hotspot/share/jvmci/jvmciRuntime.cpp:1433
#

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25240#issuecomment-2885119528

From vlivanov at openjdk.org  Thu May 15 21:58:54 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Thu, 15 May 2025 21:58:54 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
Message-ID: <4vbXpgvmXv6Ba1fEkMKIRpUnXZ-QVdAZ7rgicqxVhpM=.7dda802c-9b8a-459d-9bd7-7a83d9fc1744@github.com>

On Wed, 30 Apr 2025 13:18:33 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> A first part toward a better support of pure functions.
> 
> ## Pure Functions
> 
> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exceptions or the like), cannot deopt, etc. Such a function can be executed anywhere, with any arguments, with no effect other than wasting time. Integer division is not pure because dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
> 
> ## Scope
> 
> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
> 
> ## Implementation Overview
> 
> We introduce a new node kind for pure calls that are expanded into regular calls during macro expansion. This also allows the removal of `ModD` and `ModF` nodes, which now have pure equivalents. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
> 
> IR framework and IGV needed a little bit of fixing.
> 
> Thanks,
> Marc
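
For illustration only (not part of the PR): a minimal Java sketch of the purity distinction drawn in the description above. The class and method names are hypothetical. Integer division can throw, so it affects control flow, while the floating-point remainder and `Math.log` simply return `NaN` or an infinity for problematic inputs.

```java
class PurityDemo {
    // Integer division is not pure in the sense above: a zero divisor throws.
    static boolean intDivThrows(int a, int b) {
        try {
            int unused = a / b;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Floating-point remainder never throws; a zero divisor yields NaN.
    static double fmod(double a, double b) {
        return a % b;
    }

    public static void main(String[] args) {
        System.out.println(intDivThrows(1, 0));
        System.out.println(fmod(1.0, 0.0));
        System.out.println(Math.log(0.0));
    }
}
```

If the result of `fmod` is unused, dropping the call changes nothing observable, which is the property the PR relies on; the same is not true of the integer division.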

Interesting! I wasn't aware ADLC already features such support. Thanks for the pointers. 

It does look attractive, especially for platform-specific use cases. But there are some pitfalls that make it hard to use on its own. In particular, data nodes are aggressively commoned and freely flow in the graph. Unless this is taken into account during GVN and code motion, the final schedule may end up far from optimal. (In other words, it's highly beneficial to match only expensive nodes in such a way.) Moreover, some optimizations are highly sensitive to the presence of calls. (Think of the consequences of a call scheduled inside a heavily vectorized loop.)

Macro-expansion also suffers from some of those issues, but still IMO an explicit `Call` node is a more appropriate solution to the problem.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2885142373

From kvn at openjdk.org  Thu May 15 22:19:52 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Thu, 15 May 2025 22:19:52 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <tH31lJu0I4vpqBEfpXixe6dX-Nktm7lfVLauqr1IjT4=.07c27891-1d94-4c74-ba54-0463894c63c0@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <uSnI8tVARc_qS2VF4UfcUlhkTIEEf2TVlgckZJMBV1M=.742bbdcd-279b-43b1-9504-ac279bda0cca@github.com>
 <tH31lJu0I4vpqBEfpXixe6dX-Nktm7lfVLauqr1IjT4=.07c27891-1d94-4c74-ba54-0463894c63c0@github.com>
Message-ID: <aEc5OigeU6qs0QlpZo2dRvhKd1OPnmUvVmEfQXmwzfU=.3fb5f87e-4416-423e-a67f-be945aac4ba1@github.com>

On Thu, 15 May 2025 21:42:06 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> Basically libjvmci doesn't require the existence of jdk.internal.vm.ci on the HotSpot side since it has effectively compiled that into itself. So we are decoupling the ability to use JVMCI from the presence of the JVMCI module.

That should be in the PR and RFE (JBS) descriptions! This was my main question about the filed RFE.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25240#issuecomment-2885172971

From kvn at openjdk.org  Thu May 15 22:26:55 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Thu, 15 May 2025 22:26:55 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <5b9XLWfFY9pJ-y1fQ7FkuLSrvpFbe4hOyGOxdjPMxKw=.2c969cbd-e488-4db8-af81-ed2053d00b5d@github.com>

On Wed, 14 May 2025 22:00:30 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module, it must now be explicitly added with `--add-modules=jdk.internal.vm.ci`, which will also set `EnableJVMCI` as a side-effect.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on an archive of the root module set created in a separate JVM execution. If the root module set is different than what's in the archive at runtime, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved when creating the archive, it must not be resolved in the runtime using the archive. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci` for libgraal to have the startup advantages of AOTClassLinking.

src/hotspot/share/runtime/arguments.cpp line 1808:

> 1806:     // is no other representation of the jdk.internal.vm.ci module
> 1807:     // so it needs to be added to the root module set.
> 1808:     if (ClassLoader::is_module_observable("jdk.internal.vm.ci") && !UseJVMCINativeLibrary && !_jvmci_module_added) {

In which case is `ClassLoader::is_module_observable("jdk.internal.vm.ci")` == `true` while `_jvmci_module_added` == `false`?
I assume this code is executed after the command line has been checked for `--add-modules=jdk.internal.vm.ci`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2092040772

From sviswanathan at openjdk.org  Fri May 16 00:32:51 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 16 May 2025 00:32:51 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v5]
In-Reply-To: <q4DX289iZdnay70YIWdtRpE5G2f_5HlvBtFUOdlQOvk=.1f6876ac-f016-4223-b6b2-e664d855d56f@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <VZ7MfXaW0GQwLeqj7_HVeNyrgXlMuIFnQIA5treq7r8=.99e1921a-6169-4835-8bcd-a64bc6cd250f@github.com>
 <q4DX289iZdnay70YIWdtRpE5G2f_5HlvBtFUOdlQOvk=.1f6876ac-f016-4223-b6b2-e664d855d56f@github.com>
Message-ID: <D2qL9934RcsWhAeexaaOAblL6fCeaz5_HF9JjqfhjEU=.0e6196b7-62c0-47e7-8f24-bb55d4da2cc4@github.com>

On Thu, 15 May 2025 00:36:26 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Responding to comments by Sandhya.
>
> Another minor comment. Rest of the PR looks good to me.

> @sviswa7, thanks a lot for the review! If you agree with my changes to load the constants using broadcasting instructions instead of full AVX register loads, would you be so kind as to approve the PR and sponsor my integration?

The broadcast instructions look good. I only have one query on montMul above that I have been wondering about.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24953#issuecomment-2885339535

From sviswanathan at openjdk.org  Fri May 16 00:32:53 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 16 May 2025 00:32:53 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v6]
In-Reply-To: <XiK_5EerkQ60z0GRdkPkYuJcVxqbyZqO0ofx1Zd1JNM=.51a52b01-6cbd-44a8-ad1d-99a083e037e9@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <XiK_5EerkQ60z0GRdkPkYuJcVxqbyZqO0ofx1Zd1JNM=.51a52b01-6cbd-44a8-ad1d-99a083e037e9@github.com>
Message-ID: <G8kShEh81Q8ydS6WxsuVf5tbS1VcwUB9SH1o1rpxDtQ=.86b09653-2bb5-45d3-912f-63db29ec5553@github.com>

On Thu, 15 May 2025 13:33:42 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Response to review comment + loading constants with broadcast op.

src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 250:

> 248: static void montmul(int outputRegs[], int inputRegs1[], int inputRegs2[],
> 249:              int scratchRegs1[], int scratchRegs2[], MacroAssembler *_masm) {
> 250:    for (int i = 0; i < 4; i++) {

In the intrinsic for montMul we are treating it as if MONT_R_BITS is 16 and MONT_Q_INV_MOD_R is 0xF301, whereas in the Java code MONT_R_BITS is 20 and MONT_Q_INV_MOD_R is 0x8F301. Are these equivalent?
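
One part of this question can be checked with plain arithmetic. The sketch below assumes the ML-KEM modulus q = 3329 (from FIPS 203; it is not stated in this thread) and uses hypothetical class and method names. It only tests whether each constant is the inverse of q modulo the corresponding R = 2^MONT_R_BITS; it does not by itself show that the intrinsic and the Java code compute the same results.

```java
class MontConstantsCheck {
    // Assumption: ML-KEM modulus per FIPS 203; not quoted anywhere in this thread.
    static final int Q = 3329;

    // True if inv is the multiplicative inverse of Q modulo 2^rBits.
    static boolean isInverseOfQ(long inv, int rBits) {
        long mask = (1L << rBits) - 1;
        return ((Q * inv) & mask) == 1L;
    }

    public static void main(String[] args) {
        System.out.println(isInverseOfQ(0xF301L, 16));      // constant used by the intrinsic
        System.out.println(isInverseOfQ(0x8F301L, 20));     // constant used by the Java code
        System.out.println((0x8F301 & 0xFFFF) == 0xF301);   // do the low 16 bits agree?
    }
}
```

Under that assumption both checks hold, and 0xF301 is the low 16 bits of 0x8F301, so the two constants are consistent with each other. Whether a Montgomery product computed with R = 2^16 can stand in for one computed with R = 2^20 also depends on how the other operands are scaled, which this check does not cover.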

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2092137164

From dnsimon at openjdk.org  Fri May 16 06:55:52 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 06:55:52 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <5b9XLWfFY9pJ-y1fQ7FkuLSrvpFbe4hOyGOxdjPMxKw=.2c969cbd-e488-4db8-af81-ed2053d00b5d@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <5b9XLWfFY9pJ-y1fQ7FkuLSrvpFbe4hOyGOxdjPMxKw=.2c969cbd-e488-4db8-af81-ed2053d00b5d@github.com>
Message-ID: <GsmTtMAWgVcze3FEqXclc0LiGwCAiOj18BlOlm56Ce0=.0691e221-b248-4ae3-9dee-22cbccc4ce19@github.com>

On Thu, 15 May 2025 22:24:19 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module, it must now be explicitly added with `--add-modules=jdk.internal.vm.ci`, which will also set `EnableJVMCI` as a side-effect.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on an archive of the root module set created in a separate JVM execution. If the root module set is different than what's in the archive at runtime, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved when creating the archive, it must not be resolved in the runtime using the archive. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci` for libgraal to have the startup advantages of AOTClassLinking.
>
> src/hotspot/share/runtime/arguments.cpp line 1808:
> 
>> 1806:     // is no other representation of the jdk.internal.vm.ci module
>> 1807:     // so it needs to be added to the root module set.
>> 1808:     if (ClassLoader::is_module_observable("jdk.internal.vm.ci") && !UseJVMCINativeLibrary && !_jvmci_module_added) {
> 
> In which case is `ClassLoader::is_module_observable("jdk.internal.vm.ci")` == `true` while `_jvmci_module_added` == `false`?
> I assume this code is executed after the command line has been checked for `--add-modules=jdk.internal.vm.ci`.

The documentation for `is_module_observable` is:

  // Determines if the named module is present in the
  // modules jimage file or in the exploded modules directory.

That is, is the module present on disk. On the other hand, `_jvmci_module_added` is a test of whether an `--add-modules` option has been seen whose value included `jdk.internal.vm.ci`.
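
As a rough model of that distinction (plain Java, not HotSpot code; the class, method, and parameter names are hypothetical), the condition on line 1808 of the quoted snippet holds only when the module is on disk, libgraal is not in use, and no `--add-modules` option named the module:

```java
class JvmciModuleCheck {
    // Mirrors the boolean condition quoted from arguments.cpp line 1808.
    // moduleOnDisk   ~ ClassLoader::is_module_observable("jdk.internal.vm.ci")
    // addModulesSeen ~ _jvmci_module_added
    static boolean conditionHolds(boolean moduleOnDisk,
                                  boolean useJVMCINativeLibrary,
                                  boolean addModulesSeen) {
        return moduleOnDisk && !useJVMCINativeLibrary && !addModulesSeen;
    }

    public static void main(String[] args) {
        // Module present in the image, no libgraal, no --add-modules: the case asked about.
        System.out.println(conditionHolds(true, false, false));
        // Same setup, but --add-modules=jdk.internal.vm.ci was given.
        System.out.println(conditionHolds(true, false, true));
    }
}
```

The two inputs are independent: a standard JDK image normally contains the module, so the first is usually true regardless of what was passed on the command line.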

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2092445492

From dnsimon at openjdk.org  Fri May 16 07:03:57 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 07:03:57 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <aEc5OigeU6qs0QlpZo2dRvhKd1OPnmUvVmEfQXmwzfU=.3fb5f87e-4416-423e-a67f-be945aac4ba1@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <uSnI8tVARc_qS2VF4UfcUlhkTIEEf2TVlgckZJMBV1M=.742bbdcd-279b-43b1-9504-ac279bda0cca@github.com>
 <tH31lJu0I4vpqBEfpXixe6dX-Nktm7lfVLauqr1IjT4=.07c27891-1d94-4c74-ba54-0463894c63c0@github.com>
 <aEc5OigeU6qs0QlpZo2dRvhKd1OPnmUvVmEfQXmwzfU=.3fb5f87e-4416-423e-a67f-be945aac4ba1@github.com>
Message-ID: <ieNoCeTaPEnOnFsUq23PW90b9DfX1t0dSdXWNHAQBL8=.efb0f5b4-86ce-4f83-aa79-cb76181ca3f0@github.com>

On Thu, 15 May 2025 22:17:42 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

> That should be in the PR and RFE (JBS) descriptions! This was my main question about the filed RFE.

I've updated both the PR and JBS issue descriptions. Let me know if either still need improvement.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25240#issuecomment-2885829282

From alanb at openjdk.org  Fri May 16 08:24:52 2025
From: alanb at openjdk.org (Alan Bateman)
Date: Fri, 16 May 2025 08:24:52 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>

On Wed, 14 May 2025 22:00:30 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module, it must now be explicitly added with `--add-modules=jdk.internal.vm.ci`, which will also set `EnableJVMCI` as a side-effect.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on an archive of the root module set created in a separate JVM execution. If the root module set is different than what's in the archive at runtime, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved when creating the archive, it must not be resolved in the runtime using the archive. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci` for libgraal to have the startup advantages of AOTClassLinking.

src/hotspot/share/runtime/arguments.cpp line 1811:

> 1809:       jio_fprintf(defaultStream::error_stream(),
> 1810:         "'+EnableJVMCI' requires '--add-modules=jdk.internal.vm.ci' when UseJVMCINativeLibrary is false\n");
> 1811:       return false;

There's something a bit uncomfortable about an error message naming a JDK internal module to specify to --add-modules.

If I understand correctly, +EnableJVMCI with libgraal is all good: the set of modules in the training run is the same as the production run. However, in the no-libgraal scenario there is a mismatch between the training and production runs (is that right?), and then AOT is disabled. Is it really terrible to disable the AOTClassLinking optimizations in that scenario?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2092577889

From dnsimon at openjdk.org  Fri May 16 08:36:55 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 08:36:55 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
Message-ID: <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>

On Fri, 16 May 2025 08:22:20 GMT, Alan Bateman <alanb at openjdk.org> wrote:

> the set of modules in the training run is the same as the production run

I don't think that's true. That is, I don't think +EnableJVMCI is used in the training run is it @iklam ?

> Is it really terrible to disable the AOTClassLinking optimizations in that scenario?

Depends if you care as much about VM startup when using libgraal as when using C2. Is there a reason why only one of these should have AOTClassLinking startup benefits?

There's also the issue of all the AOTClassLinking tests having to be disabled/ignored/problem-listed in the libgraal mach5 tiers. This is what initially motivated @iklam and me to come up with this solution.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2092598097

From alanb at openjdk.org  Fri May 16 09:01:52 2025
From: alanb at openjdk.org (Alan Bateman)
Date: Fri, 16 May 2025 09:01:52 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
 <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>
Message-ID: <EKx_UGTifyZxj7woy3nLSA3C0vRgNrc1SjiSIqdzFlM=.87418a05-9746-408b-bc0d-49ec683bdcd3@github.com>

On Fri, 16 May 2025 08:34:33 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> Depends if you care as much about VM startup when using libgraal as when using C2. Is there a reason why only one of these should have AOTClassLinking startup benefits?

I should have been clearer, my question/comment was about the no-libgraal case, not the libgraal case. With the proposed change, I think you are looking to print an error. I'm wondering why it can't continue to add jdk.internal.vm.ci to the set of root modules.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2092640951

From dnsimon at openjdk.org  Fri May 16 09:32:57 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 09:32:57 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used
In-Reply-To: <EKx_UGTifyZxj7woy3nLSA3C0vRgNrc1SjiSIqdzFlM=.87418a05-9746-408b-bc0d-49ec683bdcd3@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
 <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>
 <EKx_UGTifyZxj7woy3nLSA3C0vRgNrc1SjiSIqdzFlM=.87418a05-9746-408b-bc0d-49ec683bdcd3@github.com>
Message-ID: <uU9pC1QoIc2_MkcJkwwX56QZOtYaQbZPjEiUY9C4i-M=.4f7d08f8-b6c2-497d-a08d-477728f2efe1@github.com>

On Fri, 16 May 2025 08:59:12 GMT, Alan Bateman <alanb at openjdk.org> wrote:

>>> the set of modules in the training run is the same as the production run
>> 
>> I don't think that's true. That is, I don't think +EnableJVMCI is used in the training run is it @iklam ?
>> 
>>> Is it really terrible to disable the AOTClassLinking optimizations in that scenario?
>> 
>> Depends if you care as much about VM startup when using libgraal as when using C2. Is there a reason why only one of these should have AOTClassLinking startup benefits?
>> 
>> There's also the issue of all the AOTClassLinking tests having to be disabled/ignored/problem-listed in the libgraal mach5 tiers. This is what initially motivated @iklam and me to come up with this solution.
>
>> Depends if you care as much about VM startup when using libgraal as when using C2. Is there a reason why only one of these should have AOTClassLinking startup benefits?
> 
> I should have been clearer, my question/comment was about the no-libgraal case, not the libgraal case. With the proposed change, I think you are looking to print an error. I'm wondering why it can't continue to add jdk.internal.vm.ci to the set of root modules.

Ok, that's a good suggestion. I'll explore further.

And I now understand your first comment about "+EnableJVMCI and libgraal is all good" is in the context of having applied this PR. In that context, the set of modules in the training run is indeed the same as the production run.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2092699986

From dnsimon at openjdk.org  Fri May 16 13:16:30 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 13:16:30 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v2]
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <5f7JDzMyIpKD6FAvCN5kYPJYD1mPUcAHUa43Kh74h40=.9d9fcdbe-68f5-46e1-9609-e3320bba6f77@github.com>

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module, it must now be explicitly added with `--add-modules=jdk.internal.vm.ci`, which will also set `EnableJVMCI` as a side-effect.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on an archive of the root module set created in a separate JVM execution. If the root module set is different than what's in the archive at runtime, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved when creating the archive, it must not be resolved in the runtime using the archive. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci` for libgraal to have the startup advantages of AOTClassLinking.

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  resolve jdk.internal.vm.ci if +EnableJVMCI and -UseJVMCINativeLibrary
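
Read together with the revised PR description, this commit suggests the following rule for when `jdk.internal.vm.ci` ends up in the root module set. The sketch is a plain-Java model with hypothetical names; it assumes the module is observable and is not the actual HotSpot logic.

```java
class JvmciRootModules {
    // Hypothetical model of the v2 behaviour:
    // - an explicit --add-modules=jdk.internal.vm.ci always resolves the module;
    // - otherwise +EnableJVMCI resolves it only when libgraal is not in use.
    static boolean resolvesJvmciModule(boolean enableJVMCI,
                                       boolean useJVMCINativeLibrary,
                                       boolean addModulesSeen) {
        return addModulesSeen || (enableJVMCI && !useJVMCINativeLibrary);
    }

    public static void main(String[] args) {
        // +EnableJVMCI without libgraal: module is still added automatically.
        System.out.println(resolvesJvmciModule(true, false, false));
        // +EnableJVMCI with libgraal: root module set stays as in the training run.
        System.out.println(resolvesJvmciModule(true, true, false));
    }
}
```

The second case is the one that matters for `-XX:+AOTClassLinking`: with libgraal the root module set is left untouched unless the user asks for the module explicitly.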

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25240/files
  - new: https://git.openjdk.org/jdk/pull/25240/files/0e8773e1..34360331

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=00-01

  Stats: 38 lines in 9 files changed: 5 ins; 15 del; 18 mod
  Patch: https://git.openjdk.org/jdk/pull/25240.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25240/head:pull/25240

PR: https://git.openjdk.org/jdk/pull/25240

From dnsimon at openjdk.org  Fri May 16 13:16:30 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 13:16:30 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v2]
In-Reply-To: <uU9pC1QoIc2_MkcJkwwX56QZOtYaQbZPjEiUY9C4i-M=.4f7d08f8-b6c2-497d-a08d-477728f2efe1@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
 <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>
 <EKx_UGTifyZxj7woy3nLSA3C0vRgNrc1SjiSIqdzFlM=.87418a05-9746-408b-bc0d-49ec683bdcd3@github.com>
 <uU9pC1QoIc2_MkcJkwwX56QZOtYaQbZPjEiUY9C4i-M=.4f7d08f8-b6c2-497d-a08d-477728f2efe1@github.com>
Message-ID: <lEx8lee6aH9URCi8POzpfeTMHdGfsOkEzjbb-shqrQs=.8145c232-ff04-430a-bbc8-de1ed45c1552@github.com>

On Fri, 16 May 2025 09:30:42 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>>> Depends if you care as much about VM startup when using libgraal as when using C2. Is there a reason why only one of these should have AOTClassLinking startup benefits?
>> 
>> I should have been clearer, my question/comment was about the no-libgraal case, not the libgraal case. With the proposed change, I think you are looking to print an error. I'm wondering why it can't continue to add jdk.internal.vm.ci to the set of root modules.
>
> Ok, that's a good suggestion. I'll explore further.
> 
> And I now understand your first comment about "+EnableJVMCI and libgraal is all good" is in the context of having applied this PR. In that context, the set of modules in the training run is indeed the same as the production run.

I've pushed a commit that implements your suggestion and reduces the size of the overall change nicely. More importantly, I think it's a better design - thanks!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093025755

From iklam at openjdk.org  Fri May 16 13:52:03 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Fri, 16 May 2025 13:52:03 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v2]
In-Reply-To: <lEx8lee6aH9URCi8POzpfeTMHdGfsOkEzjbb-shqrQs=.8145c232-ff04-430a-bbc8-de1ed45c1552@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
 <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>
 <EKx_UGTifyZxj7woy3nLSA3C0vRgNrc1SjiSIqdzFlM=.87418a05-9746-408b-bc0d-49ec683bdcd3@github.com>
 <uU9pC1QoIc2_MkcJkwwX56QZOtYaQbZPjEiUY9C4i-M=.4f7d08f8-b6c2-497d-a08d-477728f2efe1@github.com>
 <lEx8lee6aH9URCi8POzpfeTMHdGfsOkEzjbb-shqrQs=.8145c232-ff04-430a-bbc8-de1ed45c1552@github.com>
Message-ID: <-uhFTQkTUeNHKS5yBLkapWVfcGwDBAgS8B_rS2DvWsg=.e0a7243c-9853-4855-a652-2558941bfd41@github.com>

On Fri, 16 May 2025 13:13:09 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> Ok, that's a good suggestion. I'll explore further.
>> 
>> And I now understand your first comment about "+EnableJVMCI and libgraal is all good" is in the context of having applied this PR. In that context, the set of modules in the training run is indeed the same as the production run.
>
> I've pushed a commit that implements your suggestion and reduces the size of the overall change nicely. More importantly, I think it's a better design - thanks!

I like this latest version!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093092231

From iklam at openjdk.org  Fri May 16 13:52:05 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Fri, 16 May 2025 13:52:05 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v2]
In-Reply-To: <5f7JDzMyIpKD6FAvCN5kYPJYD1mPUcAHUa43Kh74h40=.9d9fcdbe-68f5-46e1-9609-e3320bba6f77@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <5f7JDzMyIpKD6FAvCN5kYPJYD1mPUcAHUa43Kh74h40=.9d9fcdbe-68f5-46e1-9609-e3320bba6f77@github.com>
Message-ID: <9WobEXbqfiR1CrUzBbqo6G4sIBpMkDPduREkTuLLl3k=.59ee81b0-4d11-4bfd-8f50-d63d1f6ff35e@github.com>

On Fri, 16 May 2025 13:16:30 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in the production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`; otherwise libgraal will not have the startup advantages of AOTClassLinking.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   resolve jdk.internal.vm.ci if +EnableJVMCI and -UseJVMCINativeLibrary

src/hotspot/share/runtime/arguments.cpp line 1814:

> 1812:     }
> 1813:     PropertyList_unique_add(&_system_properties, "jdk.internal.vm.ci.enabled", "true",
> 1814:         AddProperty, UnwriteableProperty, InternalProperty);

What's the purpose of the `jdk.internal.vm.ci.enabled` property? Should it be enabled only if the `jdk.internal.vm.ci` module is added?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093091052

From dnsimon at openjdk.org  Fri May 16 14:30:55 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 14:30:55 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v2]
In-Reply-To: <9WobEXbqfiR1CrUzBbqo6G4sIBpMkDPduREkTuLLl3k=.59ee81b0-4d11-4bfd-8f50-d63d1f6ff35e@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <5f7JDzMyIpKD6FAvCN5kYPJYD1mPUcAHUa43Kh74h40=.9d9fcdbe-68f5-46e1-9609-e3320bba6f77@github.com>
 <9WobEXbqfiR1CrUzBbqo6G4sIBpMkDPduREkTuLLl3k=.59ee81b0-4d11-4bfd-8f50-d63d1f6ff35e@github.com>
Message-ID: <Y3aNfbtRbgQEt_BwTfJAeomuAJsbVAycN6fngb9OQ5I=.b2f903fc-70e4-4ddf-93c7-dea6d52ed38d@github.com>

On Fri, 16 May 2025 13:48:45 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   resolve jdk.internal.vm.ci if +EnableJVMCI and -UseJVMCINativeLibrary
>
> src/hotspot/share/runtime/arguments.cpp line 1814:
> 
>> 1812:     }
>> 1813:     PropertyList_unique_add(&_system_properties, "jdk.internal.vm.ci.enabled", "true",
>> 1814:         AddProperty, UnwriteableProperty, InternalProperty);
> 
> What's the purpose of the `jdk.internal.vm.ci.enabled` property? Should it be enabled only if the `jdk.internal.vm.ci` module is added?

It exists to [check](https://github.com/search?q=repo%3Aopenjdk%2Fjdk%20checkJVMCIEnabled&type=code) in various Java-level entry points that the JVMCI VM support has been enabled so a nicer error message can be thrown that provides a possible corrective action. However, since it's now impossible to load the JVMCI module without enabling the JVMCI VM support, this is no longer of any use. I'll remove it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093162767

From dnsimon at openjdk.org  Fri May 16 14:41:31 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 14:41:31 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v3]
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <_q4IPJYDVzhjP8W0KqTeFGzTB4vE-QmfAWAsLWd8m5M=.f0bd3074-ab48-425f-b6d4-e765f6d1f8f0@github.com>

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.

Doug Simon has updated the pull request incrementally with three additional commits since the last revision:

 - fixed comment
 - removed use of jdk.internal.vm.ci.enabled property
 - fix TestHotSpotJVMCIRuntime

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25240/files
  - new: https://git.openjdk.org/jdk/pull/25240/files/34360331..3cdef586

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=01-02

  Stats: 20 lines in 5 files changed: 2 ins; 18 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/25240.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25240/head:pull/25240

PR: https://git.openjdk.org/jdk/pull/25240

From iklam at openjdk.org  Fri May 16 15:18:51 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Fri, 16 May 2025 15:18:51 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v3]
In-Reply-To: <-uhFTQkTUeNHKS5yBLkapWVfcGwDBAgS8B_rS2DvWsg=.e0a7243c-9853-4855-a652-2558941bfd41@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
 <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>
 <EKx_UGTifyZxj7woy3nLSA3C0vRgNrc1SjiSIqdzFlM=.87418a05-9746-408b-bc0d-49ec683bdcd3@github.com>
 <uU9pC1QoIc2_MkcJkwwX56QZOtYaQbZPjEiUY9C4i-M=.4f7d08f8-b6c2-497d-a08d-477728f2efe1@github.com>
 <lEx8lee6aH9URCi8POzpfeTMHdGfsOkEzjbb-shqrQs=.8145c232-ff04-430a-bbc8-de1ed45c1552@github.com>
 <-uhFTQkTUeNHKS5yBLkapWVfcGwDBAgS8B_rS2DvWsg=.e0a7243c-9853-4855-a652-2558941bfd41@github.com>
Message-ID: <lDppbzMeN1ZKpgZt24_Jp2vgFfRpy6BFEi-pEmn-CFs=.ec07fb37-003a-4e59-a4c0-19cf85640f42@github.com>

On Fri, 16 May 2025 13:49:26 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> I've pushed a commit that implements your suggestion and reduces the size of the overall change nicely. More importantly, I think it's a better design - thanks!
>
> I like this latest version!

I ran a recent build of Oracle JDK 25 that has libjvmcicompiler.so (not including your changes):


$ ./bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT -XX:+PrintFlagsFinal --version | \
        egrep '(EnableJVMCI)|(UseJVMCICompiler)|(UseJVMCINativeLibrary)'
     bool EnableJVMCI           = true    {JVMCI product} {default}
     bool EnableJVMCIProduct    = true    {JVMCI product} {command line}
     bool UseJVMCICompiler      = true    {JVMCI product} {default}
     bool UseJVMCINativeLibrary = true    {JVMCI product} {default}
$ ./bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+PrintFlagsFinal --version | \
        egrep '(EnableJVMCI)|(UseJVMCICompiler)|(UseJVMCINativeLibrary)'
     bool EnableJVMCI           = true    {JVMCI experimental} {command line}
     bool EnableJVMCIProduct    = false   {JVMCI experimental} {default}
     bool UseJVMCICompiler      = false   {JVMCI experimental} {default}
     bool UseJVMCINativeLibrary = true    {JVMCI experimental} {default}


So if you specify only `-XX:+EnableJVMCI` on the command line, `UseJVMCINativeLibrary` will be true. As a result, with your latest version, the `jdk.internal.vm.ci` module is not added.

If you have an app that wants to use the jdk.internal.vm.ci API, you must specify both `-XX:+EnableJVMCI` and `--add-modules=jdk.internal.vm.ci`. Is this intentional?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093246382

From dnsimon at openjdk.org  Fri May 16 15:28:53 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 15:28:53 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v3]
In-Reply-To: <lDppbzMeN1ZKpgZt24_Jp2vgFfRpy6BFEi-pEmn-CFs=.ec07fb37-003a-4e59-a4c0-19cf85640f42@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
 <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>
 <EKx_UGTifyZxj7woy3nLSA3C0vRgNrc1SjiSIqdzFlM=.87418a05-9746-408b-bc0d-49ec683bdcd3@github.com>
 <uU9pC1QoIc2_MkcJkwwX56QZOtYaQbZPjEiUY9C4i-M=.4f7d08f8-b6c2-497d-a08d-477728f2efe1@github.com>
 <lEx8lee6aH9URCi8POzpfeTMHdGfsOkEzjbb-shqrQs=.8145c232-ff04-430a-bbc8-de1ed45c1552@github.com>
 <-uhFTQkTUeNHKS5yBLkapWVfcGwDBAgS8B_rS2DvWsg=.e0a7243c-9853-4855-a652-2558941bfd41@github.com>
 <lDppbzMeN1ZKpgZt24_Jp2vgFfRpy6BFEi-pEmn-CFs=.ec07fb37-003a-4e59-a4c0-19cf85640f42@github.com>
Message-ID: <mX5dWcg5mkEoqUj5OpdzbCTshLdX2U8ZR2DZeZzb_iw=.27f97877-faaa-47dd-8623-ceaf68971010@github.com>

On Fri, 16 May 2025 15:16:15 GMT, Ioi Lam <iklam at openjdk.org> wrote:

> If you have an app that wants to use the jdk.internal.vm.ci API, you must specify both -XX:+EnableJVMCI and  --add-modules=jdk.internal.vm.ci.

You should only have to specify `--add-modules=jdk.internal.vm.ci` and that now sets `+EnableJVMCI`. If you also want libgraal to be used as the JIT (instead of C2), then you need to add `-XX:+UseGraalJIT`.

For Truffle on Oracle JDK, this means:
* `--add-modules=jdk.internal.vm.ci`: Use C2 for JIT ("hosted") compilation and libgraal for Truffle ("guest") compilation
* `--add-modules=jdk.internal.vm.ci -XX:+UseGraalJIT`: Use libgraal for both JIT and Truffle compilation

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093262720

From iklam at openjdk.org  Fri May 16 15:42:54 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Fri, 16 May 2025 15:42:54 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v3]
In-Reply-To: <_q4IPJYDVzhjP8W0KqTeFGzTB4vE-QmfAWAsLWd8m5M=.f0bd3074-ab48-425f-b6d4-e765f6d1f8f0@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <_q4IPJYDVzhjP8W0KqTeFGzTB4vE-QmfAWAsLWd8m5M=.f0bd3074-ab48-425f-b6d4-e765f6d1f8f0@github.com>
Message-ID: <jNHmYerPxXnMsFITd2kCuUGNNCs3LXGtWvzKWQTqqYM=.ce8e19fa-f8d3-4285-8c23-c8ac529e2ec3@github.com>

On Fri, 16 May 2025 14:41:31 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>
> Doug Simon has updated the pull request incrementally with three additional commits since the last revision:
> 
>  - fixed comment
>  - removed use of jdk.internal.vm.ci.enabled property
>  - fix TestHotSpotJVMCIRuntime

Marked as reviewed by iklam (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/25240#pullrequestreview-2847006395

From iklam at openjdk.org  Fri May 16 15:42:56 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Fri, 16 May 2025 15:42:56 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v3]
In-Reply-To: <mX5dWcg5mkEoqUj5OpdzbCTshLdX2U8ZR2DZeZzb_iw=.27f97877-faaa-47dd-8623-ceaf68971010@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <n7c0F7zOM3Xin-AYHicDXJ1Qkxn_TNP7xVZJZSi1Jf4=.06f5c037-8abf-4cd9-9bca-de2aec9e3eab@github.com>
 <t9SBFMs83jidJi8mM1B2-kuAjoVXJmjutUhYwSIYJmA=.b1f03263-3216-4fe8-98b6-107376af23f5@github.com>
 <EKx_UGTifyZxj7woy3nLSA3C0vRgNrc1SjiSIqdzFlM=.87418a05-9746-408b-bc0d-49ec683bdcd3@github.com>
 <uU9pC1QoIc2_MkcJkwwX56QZOtYaQbZPjEiUY9C4i-M=.4f7d08f8-b6c2-497d-a08d-477728f2efe1@github.com>
 <lEx8lee6aH9URCi8POzpfeTMHdGfsOkEzjbb-shqrQs=.8145c232-ff04-430a-bbc8-de1ed45c1552@github.com>
 <-uhFTQkTUeNHKS5yBLkapWVfcGwDBAgS8B_rS2DvWsg=.e0a7243c-9853-4855-a652-2558941bfd41@github.com>
 <lDppbzMeN1ZKpgZt24_Jp2vgFfRpy6BFEi-pEmn-CFs=.ec07fb37-003a-4e59-a4c0-19cf85640f42@github.com>
 <mX5dWcg5mkEoqUj5OpdzbCTshLdX2U8ZR2DZeZzb_iw=.27f97877-faaa-47dd-8623-ceaf68971010@github.com>
Message-ID: <2I_zoUvs4eHXqSJAdRTmEwTcSr58O1eFo90vfacZuz8=.907e37b4-4c81-4dbc-884c-0e05b0ea3024@github.com>

On Fri, 16 May 2025 15:26:31 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> I ran a recent build of Oracle JDK 25 that has libjvmcicompiler.so (not including your changes):
>> 
>> 
>> $ ./bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseGraalJIT -XX:+PrintFlagsFinal --version | \
>>         egrep '(EnableJVMCI)|(UseJVMCICompiler)|(UseJVMCINativeLibrary)'
>>      bool EnableJVMCI           = true    {JVMCI product} {default}
>>      bool EnableJVMCIProduct    = true    {JVMCI product} {command line}
>>      bool UseJVMCICompiler      = true    {JVMCI product} {default}
>>      bool UseJVMCINativeLibrary = true    {JVMCI product} {default}
>> $ ./bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+PrintFlagsFinal --version | \
>>         egrep '(EnableJVMCI)|(UseJVMCICompiler)|(UseJVMCINativeLibrary)'
>>      bool EnableJVMCI           = true    {JVMCI experimental} {command line}
>>      bool EnableJVMCIProduct    = false   {JVMCI experimental} {default}
>>      bool UseJVMCICompiler      = false   {JVMCI experimental} {default}
>>      bool UseJVMCINativeLibrary = true    {JVMCI experimental} {default}
>> 
>> 
>> So if you specify only `-XX:+EnableJVMCI` on the command line, `UseJVMCINativeLibrary` will be true. As a result, with your latest version, the `jdk.internal.vm.ci` module is not added.
>> 
>> If you have an app that wants to use the jdk.internal.vm.ci API, you must specify both `-XX:+EnableJVMCI` and `--add-modules=jdk.internal.vm.ci`. Is this intentional?
>
>> If you have an app that wants to use the jdk.internal.vm.ci API, you must specify both -XX:+EnableJVMCI and  --add-modules=jdk.internal.vm.ci.
> 
> You should only have to specify `--add-modules=jdk.internal.vm.ci` and that now sets `+EnableJVMCI`. If you also want libgraal to be used as the JIT (instead of C2), then you need to add `-XX:+UseGraalJIT`.
> 
> For Truffle on Oracle JDK, this means:
> * `--add-modules=jdk.internal.vm.ci`: Use C2 for JIT ("hosted") compilation and libgraal for Truffle ("guest") compilation
> * `--add-modules=jdk.internal.vm.ci -XX:+UseGraalJIT`: Use libgraal for both JIT and Truffle compilation

Ah, I forgot that `--add-modules=jdk.internal.vm.ci` now sets `EnableJVMCI` to true. Thanks for the explanation.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093283370

From dnsimon at openjdk.org  Fri May 16 16:50:38 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 16:50:38 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v4]
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <vinOlo2a7qV0yPBkbeM9X5CF4Q61ZpSs7Qs0OGKodus=.7a8b3aa1-b31a-4128-8a75-26cb917c2a16@github.com>

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  improved error message

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25240/files
  - new: https://git.openjdk.org/jdk/pull/25240/files/3cdef586..1fe56b41

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=02-03

  Stats: 6 lines in 3 files changed: 2 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/25240.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25240/head:pull/25240

PR: https://git.openjdk.org/jdk/pull/25240

From dnsimon at openjdk.org  Fri May 16 16:50:38 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 16:50:38 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v3]
In-Reply-To: <_q4IPJYDVzhjP8W0KqTeFGzTB4vE-QmfAWAsLWd8m5M=.f0bd3074-ab48-425f-b6d4-e765f6d1f8f0@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <_q4IPJYDVzhjP8W0KqTeFGzTB4vE-QmfAWAsLWd8m5M=.f0bd3074-ab48-425f-b6d4-e765f6d1f8f0@github.com>
Message-ID: <sh1qlMNxkBBtOxzQPXGPESwqgnZvQQmMxPr8z8iC4eE=.19a0dfba-caeb-4f2a-b717-1c2ccc147a11@github.com>

On Fri, 16 May 2025 14:41:31 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>
> Doug Simon has updated the pull request incrementally with three additional commits since the last revision:
> 
>  - fixed comment
>  - removed use of jdk.internal.vm.ci.enabled property
>  - fix TestHotSpotJVMCIRuntime

While testing this out on Graal, I discovered an interesting corner case.


public class UseJVMCIModule {
    public static void main(String[] args) {
        jdk.vm.ci.runtime.JVMCI.getRuntime();
    }
}


If the JVMCI module is indirectly added to the root module set, it results in an error:

java --add-modules=jdk.graal.compiler --add-exports=jdk.internal.vm.ci/jdk.vm.ci.runtime=ALL-UNNAMED UseJVMCIModule.java
Exception in thread "main" java.lang.InternalError: JVMCI is not enabled
	at jdk.internal.vm.ci/jdk.vm.ci.runtime.JVMCI.initializeRuntime(Native Method)
	at jdk.internal.vm.ci/jdk.vm.ci.runtime.JVMCI.getRuntime(JVMCI.java:64)
	at UseJVMCIModule.main(UseJVMCIModule.java:3)


That is, if an app wants to use the JVMCI module, it needs to explicitly communicate this to the launcher. By the time the root module graph is being initialized in ModuleBootstrap, it's too late to set `EnableJVMCI`.

I improved the error message to make this clear:

Exception in thread "main" java.lang.InternalError: JVMCI is not enabled. Must specify '--add-modules=jdk.internal.vm.ci' or '-XX:+EnableJVMCI' to the java launcher.
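
The corner case above shows that the module being resolved and the JVMCI VM support being enabled are two separate things. As a minimal sketch (the class name `BootLayerCheck` and its helper are hypothetical, not part of this PR), an application could check the first condition from Java with the standard `ModuleLayer` API. A `true` result only says that `jdk.internal.vm.ci` is in the boot layer; it says nothing about whether `EnableJVMCI` was set.

```java
public final class BootLayerCheck {
    // Hypothetical helper: reports whether a module was resolved into the boot layer.
    // This does NOT tell you whether the VM was started with JVMCI support enabled.
    public static boolean isResolved(String moduleName) {
        return ModuleLayer.boot().findModule(moduleName).isPresent();
    }

    public static void main(String[] args) {
        // The result depends on the launcher options used to start this JVM.
        System.out.println("jdk.internal.vm.ci resolved: "
                + isResolved("jdk.internal.vm.ci"));
    }
}
```

Running this with and without `--add-modules=jdk.internal.vm.ci` should make the difference in the root module set visible without triggering the `InternalError` shown above.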

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25240#issuecomment-2887219662

From never at openjdk.org  Fri May 16 19:19:52 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Fri, 16 May 2025 19:19:52 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v4]
In-Reply-To: <vinOlo2a7qV0yPBkbeM9X5CF4Q61ZpSs7Qs0OGKodus=.7a8b3aa1-b31a-4128-8a75-26cb917c2a16@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <vinOlo2a7qV0yPBkbeM9X5CF4Q61ZpSs7Qs0OGKodus=.7a8b3aa1-b31a-4128-8a75-26cb917c2a16@github.com>
Message-ID: <djUaSveGjS-NfM-REP2oDaR5Dje7cAb-IDBpCV3FRps=.bbe1f846-bed1-44a5-b88a-325272ab1dd2@github.com>

On Fri, 16 May 2025 16:50:38 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   improved error message

src/hotspot/share/jvmci/jvmciRuntime.hpp line 38:

> 36: #endif // INCLUDE_G1GC
> 37: 
> 38: #define JAVA_NOT_ENABLED_ERROR_MESSAGE "JVMCI is not enabled. Must specify '--add-modules=jdk.internal.vm.ci' or '-XX:+EnableJVMCI' to the java launcher."

You meant `JVMCI_NOT_ENABLED_ERROR_MESSAGE` I assume?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093564087

From never at openjdk.org  Fri May 16 19:35:54 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Fri, 16 May 2025 19:35:54 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v4]
In-Reply-To: <vinOlo2a7qV0yPBkbeM9X5CF4Q61ZpSs7Qs0OGKodus=.7a8b3aa1-b31a-4128-8a75-26cb917c2a16@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <vinOlo2a7qV0yPBkbeM9X5CF4Q61ZpSs7Qs0OGKodus=.7a8b3aa1-b31a-4128-8a75-26cb917c2a16@github.com>
Message-ID: <Fd898_M7HyPNrUuMK8r1sQ2zhfbn7owwcP2jca7nWUk=.819811ef-df8d-4911-8035-14e480ec30ac@github.com>

On Fri, 16 May 2025 16:50:38 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   improved error message

test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.hotspot.test/src/jdk/vm/ci/hotspot/test/TestHotSpotJVMCIRuntime.java line 173:

> 171:                 "-XX:+UnlockExperimentalVMOptions",
> 172:                 "-XX:+EnableJVMCI",
> 173:                 "--add-modules=jdk.internal.vm.ci",

I stared at this for a while to understand why passing this option was required.  It's a bit confusing that explicitly passing `-XX:+EnableJVMCI` has different effects based on the value of UseJVMCINativeLibrary.  I think that if `EnableJVMCI` is passed on the command line then it should add the module even if libgraal is in use.  So something like:
`if ((!UseJVMCINativeLibrary || FLAG_IS_CMDLINE(EnableJVMCI)) && ClassLoader::is_module_observable`
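
To make the intended behaviour concrete, here is a rough Java model of that condition. It is only a sketch, not HotSpot code: the class and parameter names are made up for illustration, and it assumes `EnableJVMCI` is already true when the check runs.

```java
public final class JvmciRootModulePolicy {
    // Hypothetical model of the suggested condition. Assumes EnableJVMCI is
    // already true; 'enableJvmciOnCmdLine' stands in for FLAG_IS_CMDLINE(EnableJVMCI).
    public static boolean addsJvmciModule(boolean useJvmciNativeLibrary,
                                          boolean enableJvmciOnCmdLine,
                                          boolean moduleObservable) {
        return (!useJvmciNativeLibrary || enableJvmciOnCmdLine) && moduleObservable;
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        // Print the decision for every flag combination, with the module observable.
        for (boolean nativeLib : values) {
            for (boolean cmdLine : values) {
                System.out.printf("UseJVMCINativeLibrary=%b, EnableJVMCI on cmdline=%b -> add module: %b%n",
                        nativeLib, cmdLine, addsJvmciModule(nativeLib, cmdLine, true));
            }
        }
    }
}
```

The only combination that skips the module (when it is observable) is libgraal in use with `EnableJVMCI` not given explicitly, which is the case that matters for AOTClassLinking.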

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093580679

From dnsimon at openjdk.org  Fri May 16 19:46:51 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 19:46:51 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v4]
In-Reply-To: <djUaSveGjS-NfM-REP2oDaR5Dje7cAb-IDBpCV3FRps=.bbe1f846-bed1-44a5-b88a-325272ab1dd2@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <vinOlo2a7qV0yPBkbeM9X5CF4Q61ZpSs7Qs0OGKodus=.7a8b3aa1-b31a-4128-8a75-26cb917c2a16@github.com>
 <djUaSveGjS-NfM-REP2oDaR5Dje7cAb-IDBpCV3FRps=.bbe1f846-bed1-44a5-b88a-325272ab1dd2@github.com>
Message-ID: <V3R6giZJTKU2RpCAHf9JpOT7rFWiXp9b2dG_STkRoe0=.fd3ad04a-50f5-44cb-814c-5e2ca72f8b48@github.com>

On Fri, 16 May 2025 19:17:42 GMT, Tom Rodriguez <never at openjdk.org> wrote:

> You meant `JVMCI_NOT_ENABLED_ERROR_MESSAGE` I assume?

Ha! Nice catch ;-)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093593061

From dnsimon at openjdk.org  Fri May 16 19:56:52 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 19:56:52 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v4]
In-Reply-To: <Fd898_M7HyPNrUuMK8r1sQ2zhfbn7owwcP2jca7nWUk=.819811ef-df8d-4911-8035-14e480ec30ac@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <vinOlo2a7qV0yPBkbeM9X5CF4Q61ZpSs7Qs0OGKodus=.7a8b3aa1-b31a-4128-8a75-26cb917c2a16@github.com>
 <Fd898_M7HyPNrUuMK8r1sQ2zhfbn7owwcP2jca7nWUk=.819811ef-df8d-4911-8035-14e480ec30ac@github.com>
Message-ID: <tC7cy8qyR6FGdgG_Um93kMPkJAmsXQ6fO14JgXIN3uA=.f39b9855-1d2c-43d2-8aa3-8e2bbeebf85a@github.com>

On Fri, 16 May 2025 19:33:25 GMT, Tom Rodriguez <never at openjdk.org> wrote:

>> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   improved error message
>
> test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.hotspot.test/src/jdk/vm/ci/hotspot/test/TestHotSpotJVMCIRuntime.java line 173:
> 
>> 171:                 "-XX:+UnlockExperimentalVMOptions",
>> 172:                 "-XX:+EnableJVMCI",
>> 173:                 "--add-modules=jdk.internal.vm.ci",
> 
> I stared at this for a while to understand why passing this option was required.  It's a bit confusing that explicitly passing `-XX:+EnableJVMCI` has different effects based on the value of UseJVMCINativeLibrary.  I think that if `EnableJVMCI` is passed on the command line then it should add the module even if libgraal is in use.  So something like:
> `if ((!UseJVMCINativeLibrary || FLAG_IS_CMDLINE(EnableJVMCI)) && ClassLoader::is_module_observable`

I was not aware `FLAG_IS_CMDLINE` could be used to alter the semantics of a flag, but there seems to be at least one precedent for it with [UseCompactObjectHeaders](https://github.com/openjdk/jdk/blob/3dd34517000e4ce1a21619922c62c025f98aad44/src/hotspot/share/runtime/arguments.cpp#L3671). This is quite nice, as now nothing needs to change for Truffle users in terms of enabling the Truffle optimized runtime (cc @chumer).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2093603624

From dnsimon at openjdk.org  Fri May 16 21:06:39 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 16 May 2025 21:06:39 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v5]
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <lJd3YyL9bqFzPdlxHMhi8-N86b1BDpPc8vAtR3ip7F4=.f22e9f4e-20d3-414a-a920-499fe6b516e9@github.com>

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.

Doug Simon has updated the pull request incrementally with two additional commits since the last revision:

 - load the JVMCI module if +EnableJVMCI is set on the command line
 - JAVA_NOT_ENABLED_ERROR_MESSAGE -> JVMCI_NOT_ENABLED_ERROR_MESSAGE

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25240/files
  - new: https://git.openjdk.org/jdk/pull/25240/files/1fe56b41..d9223afb

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=03-04

  Stats: 15 lines in 6 files changed: 0 ins; 4 del; 11 mod
  Patch: https://git.openjdk.org/jdk/pull/25240.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25240/head:pull/25240

PR: https://git.openjdk.org/jdk/pull/25240

From never at openjdk.org  Fri May 16 21:20:52 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Fri, 16 May 2025 21:20:52 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v5]
In-Reply-To: <lJd3YyL9bqFzPdlxHMhi8-N86b1BDpPc8vAtR3ip7F4=.f22e9f4e-20d3-414a-a920-499fe6b516e9@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <lJd3YyL9bqFzPdlxHMhi8-N86b1BDpPc8vAtR3ip7F4=.f22e9f4e-20d3-414a-a920-499fe6b516e9@github.com>
Message-ID: <eP7A-NkaPFbWSz2p-Gno-DlgbFBjt-kusRyO9WSwrjE=.57121767-138a-4f4f-8a44-bee62bb55e79@github.com>

On Fri, 16 May 2025 21:06:39 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `--add-modules=jdk.internal.vm.ci` must be specified.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>
> Doug Simon has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - load the JVMCI module if +EnableJVMCI is set on the command line
>  - JAVA_NOT_ENABLED_ERROR_MESSAGE -> JVMCI_NOT_ENABLED_ERROR_MESSAGE

new version seems nice and clean.

-------------

Marked as reviewed by never (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25240#pullrequestreview-2847639000

From dnsimon at openjdk.org  Sat May 17 09:40:54 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Sat, 17 May 2025 09:40:54 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v3]
In-Reply-To: <jNHmYerPxXnMsFITd2kCuUGNNCs3LXGtWvzKWQTqqYM=.ce8e19fa-f8d3-4285-8c23-c8ac529e2ec3@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <_q4IPJYDVzhjP8W0KqTeFGzTB4vE-QmfAWAsLWd8m5M=.f0bd3074-ab48-425f-b6d4-e765f6d1f8f0@github.com>
 <jNHmYerPxXnMsFITd2kCuUGNNCs3LXGtWvzKWQTqqYM=.ce8e19fa-f8d3-4285-8c23-c8ac529e2ec3@github.com>
Message-ID: <ryp6CSHU_FDflhrs_5e8qV9gCFEmLdpWfKS2GHqumZs=.9aa20a8e-9baf-4ea1-bef5-a212c92f4ba9@github.com>

On Fri, 16 May 2025 15:40:12 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Doug Simon has updated the pull request incrementally with three additional commits since the last revision:
>> 
>>  - fixed comment
>>  - removed use of jdk.internal.vm.ci.enabled property
>>  - fix TestHotSpotJVMCIRuntime
>
> Marked as reviewed by iklam (Reviewer).

Any further feedback or concerns @iklam or @AlanBateman ?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25240#issuecomment-2888253695

From iklam at openjdk.org  Sun May 18 05:54:06 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Sun, 18 May 2025 05:54:06 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v5]
In-Reply-To: <lJd3YyL9bqFzPdlxHMhi8-N86b1BDpPc8vAtR3ip7F4=.f22e9f4e-20d3-414a-a920-499fe6b516e9@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <lJd3YyL9bqFzPdlxHMhi8-N86b1BDpPc8vAtR3ip7F4=.f22e9f4e-20d3-414a-a920-499fe6b516e9@github.com>
Message-ID: <41K4xsn27SKncEkQLqryRwgvoLwrlRTfobgQDGOO0Dg=.55866213-f3b4-44ab-9a1e-80a83733d06b@github.com>

On Fri, 16 May 2025 21:06:39 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified on the command line (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>
> Doug Simon has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - load the JVMCI module if +EnableJVMCI is set on the command line
>  - JAVA_NOT_ENABLED_ERROR_MESSAGE -> JVMCI_NOT_ENABLED_ERROR_MESSAGE

Latest version looks good to me.

-------------

Marked as reviewed by iklam (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25240#pullrequestreview-2848736472

From dnsimon at openjdk.org  Sun May 18 19:14:06 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Sun, 18 May 2025 19:14:06 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v6]
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified on the command line (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  load the JVMCI module if +EnableJVMCI is set in the jimage

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25240/files
  - new: https://git.openjdk.org/jdk/pull/25240/files/d9223afb..196425f9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=04-05

  Stats: 9 lines in 2 files changed: 4 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/25240.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25240/head:pull/25240

PR: https://git.openjdk.org/jdk/pull/25240

From dnsimon at openjdk.org  Sun May 18 19:19:52 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Sun, 18 May 2025 19:19:52 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v6]
In-Reply-To: <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
Message-ID: <p_DzYtulq7V30D-WywSFR82Zw0LBihoIff3EvGmHOyA=.9a5d8882-7362-4792-8a07-07dd33304896@github.com>

On Sun, 18 May 2025 19:14:06 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   load the JVMCI module if +EnableJVMCI is set in the jimage

In addition to loading the JVMCI module when `-XX:+EnableJVMCI` is on the command line, it should also be done when `-XX:+EnableJVMCI` is set by the jimage. The latter is how GraalVM sets some defaults and +EnableJVMCI is such a default. This ensures that the root module set is the same in training and production runs for AOTClassLinking on GraalVM.
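
One way to check from Java whether `jdk.internal.vm.ci` ended up resolved in a given run is to query the boot layer. The sketch below uses only the standard `ModuleLayer` API; the class and method names are made up for illustration. It assumes that comparing its output between a training run and a production run is a reasonable proxy for "same root module set" as far as this one module is concerned.

```java
public class JvmciModuleCheck {
    // Returns true if the named module was resolved into the boot layer of this JVM.
    public static boolean isInBootLayer(String moduleName) {
        return ModuleLayer.boot().findModule(moduleName).isPresent();
    }

    public static void main(String[] args) {
        // The result depends on the launcher flags and on any defaults stored in the jimage.
        System.out.println("jdk.internal.vm.ci resolved: " + isInBootLayer("jdk.internal.vm.ci"));
    }
}
```

Running it once with the training-run flags and once with the production-run flags should print the same answer if the two configurations agree on this module.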

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25240#issuecomment-2889161533

From mchevalier at openjdk.org  Mon May 19 07:01:52 2025
From: mchevalier at openjdk.org (Marc Chevalier)
Date: Mon, 19 May 2025 07:01:52 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <7e0IhYYv_1dDlLgmUM8rKj5bjDx3lIhY2PRt-fC-rTs=.35437a80-80c7-4332-9339-a6f047b73289@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
 <xjYabzypKCqhMAG6qnWOB09F4F5YDoGEhNpZsa2ve5k=.f26d3919-27ba-4481-b1c0-1ca438f5ba22@github.com>
 <uFUCzJjLfRqudMg3P1NPagroCaxEersEu7Sv8813cQU=.4656742a-2a7d-454e-8e90-4b7b07e6d2a7@github.com>
 <7e0IhYYv_1dDlLgmUM8rKj5bjDx3lIhY2PRt-fC-rTs=.35437a80-80c7-4332-9339-a6f047b73289@github.com>
Message-ID: <coqKb3zX50_88kOs13nn9LYuf1cEs4EtzubxseIt9Ko=.481b0ec6-dc4c-4d64-acd9-4550132547cd@github.com>

On Tue, 13 May 2025 03:12:29 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>>> I think a very simple approach you can take is having CallPureNode as a pure data node
>> 
>> It's not as simple as it seems. In order to work reliably it requires full control of the code being called, so without extra work it is appropriate for generated stubs only. If you want to call some native code VM doesn't control, then either all caller-saved registers should be preserved across the call (which may be prohibitively expensive) or it should be made explicit there's a call taking place so all ABI effects are taken into account.
>
> @iwanowww I believe `effect(CALL)` marks that a call is taking place and the register allocator will know how to save the registers accordingly. Note that on arm, long division is implemented as a call:
> 
> https://github.com/openjdk/jdk/blob/adebfa7ffda6383f5793278ced14a193066c5f6a/src/hotspot/cpu/arm/arm.ad#L5962
> 
> And `SharedRuntime::ldiv` is implemented in C++:
> 
> https://github.com/openjdk/jdk/blob/adebfa7ffda6383f5793278ced14a193066c5f6a/src/hotspot/share/runtime/sharedRuntime.cpp#L272

I like @merykitty's suggestion, but I don't understand how bad its disadvantages are. Commoning can be prevented as you mentioned above. As for scheduling, isn't it the same problem for many nodes? If we have something like

var x = anObject.aField;  // anObject known to be not null
if (flag) {  // flag independent of `anObject`
  // something with x
} else {
  // [...] nothing with x
}

I don't think there is any ordering between the if and the definition of `x`, and so we should push the latter under the if. Conversely, if the declaration is already in the branch in the original code, we should not let it float above. Or, in the case of a loop, we should rather put it outside as much as possible. But none of that seems to be enforced by edges: a memory node is not a CFG node, the nodes of the `if(flag)` might not use memory (so no memory edges)... The same would be true for an arithmetic node (like `AddI`, for instance), but we could argue those are cheap (even if, in a loop, cheap becomes expensive), while a memory access is not that cheap.
So, don't the problems we have with @merykitty's pure-call-as-pure-data-node suggestion already exist for other node kinds? And if we had trouble scheduling pure calls, shouldn't we already have this kind of issue?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2889840427

From alanb at openjdk.org  Mon May 19 07:21:53 2025
From: alanb at openjdk.org (Alan Bateman)
Date: Mon, 19 May 2025 07:21:53 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v6]
In-Reply-To: <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
Message-ID: <BbbaN6IuW8wHoLCfpnCRvz_fR40QR93ZzzVPnIh1C5k=.15bf4c2a-8679-49b7-b110-6d9d49889bee@github.com>

On Sun, 18 May 2025 19:14:06 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>> 
>> Graal adaption PR: https://github.com/oracle/graal/pull/11212
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   load the JVMCI module if +EnableJVMCI is set in the jimage

src/hotspot/share/runtime/arguments.cpp line 2264:

> 2262:           }
> 2263:         }
> 2264:       }

This only works if jdk.internal.vm.ci is specified to --add-modules; it won't set _jvmci_module_added if jdk.internal.vm.ci is resolved because some other module requires it. As the module is JDK internal and doesn't export an API, I assume it would be rare-to-never to require it, is that right?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2095007839

From qamai at openjdk.org  Mon May 19 07:24:01 2025
From: qamai at openjdk.org (Quan Anh Mai)
Date: Mon, 19 May 2025 07:24:01 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <4vbXpgvmXv6Ba1fEkMKIRpUnXZ-QVdAZ7rgicqxVhpM=.7dda802c-9b8a-459d-9bd7-7a83d9fc1744@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
 <4vbXpgvmXv6Ba1fEkMKIRpUnXZ-QVdAZ7rgicqxVhpM=.7dda802c-9b8a-459d-9bd7-7a83d9fc1744@github.com>
Message-ID: <_iOKkIEZDrhUNSnn4GshsjW79IzVkUyY31LozGq8fcI=.01ecf0ab-641a-427d-bb65-f657df4f49e4@github.com>

On Thu, 15 May 2025 21:56:32 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> A first part toward a better support of pure functions.
>> 
>> ## Pure Functions
>> 
>> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exceptions or the like), cannot deopt, etc. It's really a function that you can execute anywhere, with whichever arguments, without any effect other than wasting time. Integer division is not pure, as dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
>> 
>> ## Scope
>> 
>> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
>> 
>> ## Implementation Overview
>> 
>> We created here some new node kind for pure calls that are expanded into regular calls during macro expansion. This also allows the removal of `ModD` and `ModF` nodes that have their pure equivalent now. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
>> 
>> IR framework and IGV needed a little bit of fixing.
>> 
>> Thanks,
>> Marc
>
> Interesting! I wasn't aware ADLC already features such support. Thanks for the pointers. 
> 
> It does look attractive, especially for platform-specific use cases. But there are some pitfalls which make it hard to use on its own. In particular, data nodes are aggressively commoned and freely flow in the graph. Unless it is taken into account during GVN and code motion, the final schedule may end up far from optimal. (In other words, it's highly beneficial to match only expensive nodes in such a way.) Moreover, some optimizations are highly sensitive to the presence of calls. (Think of the consequences of a call scheduled inside a heavily vectorized loop.)
> 
> Macro-expansion also suffers from some of those issues, but still IMO an explicit `Call` node is a more appropriate solution to the problem.

Tbh I don't understand @iwanowww's arguments. We have expensive data nodes such as `SqrtD` that have control inputs to prevent them from floating too aggressively. Additionally, a `CallNode` is pinned AT its control input, while a data node is pinned UNDER its control input. This gives the scheduler much more freedom to schedule a data node at a better location than it has for a call node.

Ideally, what we want to do with expensive data nodes is to common them aggressively like any other data node. Then, during code motion, we can clone them if it is beneficial.
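
As a reminder of the purity distinction drawn in the PR description quoted above, here is a minimal Java-level sketch. It only illustrates source-level semantics (floating-point remainder never throws, integer division can), not how C2 represents these operations; the class and method names are hypothetical.

```java
public class PurityDemo {
    // Floating-point remainder: no exception for any input, NaN in the problematic
    // cases. An unused result could be dropped without changing program behaviour.
    public static double fmod(double a, double b) {
        return a % b;
    }

    // Integer division: throws ArithmeticException on a zero divisor, so it has an
    // effect on control flow and is not pure in the sense used in this PR.
    public static int idiv(int a, int b) {
        return a / b;
    }

    public static boolean idivThrowsOnZero() {
        try {
            idiv(1, 0);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Double.isNaN(fmod(1.0, 0.0)));
        System.out.println(idivThrowsOnZero());
    }
}
```

Both lines print `true`: the remainder quietly yields `NaN`, while the division has to be kept because it may throw.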

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2889891820

From alanb at openjdk.org  Mon May 19 07:29:01 2025
From: alanb at openjdk.org (Alan Bateman)
Date: Mon, 19 May 2025 07:29:01 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v6]
In-Reply-To: <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
Message-ID: <gmLU_0mQ3N7yrySJNy6-oANMXghnnTNjbnTmKkg9QM0=.0d091a10-4a1e-4151-b0dd-f74a27fcb83a@github.com>

On Sun, 18 May 2025 19:14:06 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>> 
>> Graal adaption PR: https://github.com/oracle/graal/pull/11212
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   load the JVMCI module if +EnableJVMCI is set in the jimage

src/hotspot/share/jvmci/jvmciRuntime.hpp line 38:

> 36: #endif // INCLUDE_G1GC
> 37: 
> 38: #define JVMCI_NOT_ENABLED_ERROR_MESSAGE "JVMCI is not enabled. Must specify '--add-modules=jdk.internal.vm.ci' or '-XX:+EnableJVMCI' to the java launcher."

It's the exception message for an InternalError so maybe this is okay but in general I don't think you want to be asking users to have to name a JDK internal module.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2095016026

From dnsimon at openjdk.org  Mon May 19 07:29:04 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 19 May 2025 07:29:04 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v6]
In-Reply-To: <gmLU_0mQ3N7yrySJNy6-oANMXghnnTNjbnTmKkg9QM0=.0d091a10-4a1e-4151-b0dd-f74a27fcb83a@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
 <gmLU_0mQ3N7yrySJNy6-oANMXghnnTNjbnTmKkg9QM0=.0d091a10-4a1e-4151-b0dd-f74a27fcb83a@github.com>
Message-ID: <1jsJPDBKXdqW9cWDKx8B_86qlqoo1QR1Bu93DC6KWGI=.ad77b0c1-9984-405a-ad2b-6391006748a9@github.com>

On Mon, 19 May 2025 07:23:54 GMT, Alan Bateman <alanb at openjdk.org> wrote:

>> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   load the JVMCI module if +EnableJVMCI is set in the jimage
>
> src/hotspot/share/jvmci/jvmciRuntime.hpp line 38:
> 
>> 36: #endif // INCLUDE_G1GC
>> 37: 
>> 38: #define JVMCI_NOT_ENABLED_ERROR_MESSAGE "JVMCI is not enabled. Must specify '--add-modules=jdk.internal.vm.ci' or '-XX:+EnableJVMCI' to the java launcher."
> 
> It's the exception message for an InternalError so maybe this is okay but in general I don't think you want to be asking users to have to name a JDK internal module.

I think anyone using a non-standard configuration for Graal would not be surprised about JVMCI and would actually appreciate the extra hint when they get things "wrong".

> src/hotspot/share/runtime/arguments.cpp line 2264:
> 
>> 2262:           }
>> 2263:         }
>> 2264:       }
> 
> This only works if jdk.internal.vm.ci is specified to --add-modules; it won't set _jvmci_module_added if jdk.internal.vm.ci is resolved because some other module requires it. As the module is JDK internal and doesn't export an API, I assume it would be rare-to-never to require it, is that right?

Correct. I realized this might be a bit confusing so improved the error message a little: https://github.com/openjdk/jdk/pull/25240#issuecomment-2887219662
But as you say, it should never be encountered in practice.
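
To check the "rare-to-never" assumption on a particular runtime image, one can list the boot-layer modules whose descriptors require a given module. This is a sketch using the standard `java.lang.module` API; the class and method names are hypothetical, and it only inspects modules that were actually resolved in the current run.

```java
import java.lang.module.ModuleDescriptor;
import java.util.List;
import java.util.stream.Collectors;

public class JvmciRequirers {
    // Names of boot-layer modules that declare `requires <target>`, sorted.
    public static List<String> modulesRequiring(String target) {
        return ModuleLayer.boot().modules().stream()
                .map(Module::getDescriptor)
                // getDescriptor() is null only for unnamed modules; guard anyway
                .filter(d -> d != null
                        && d.requires().stream().anyMatch(r -> r.name().equals(target)))
                .map(ModuleDescriptor::name)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(modulesRequiring("jdk.internal.vm.ci"));
    }
}
```

An empty list means `jdk.internal.vm.ci` was not pulled in transitively by any module resolved in this run, which is the situation the discussion above expects.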

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2095021472
PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2095012796

From dnsimon at openjdk.org  Mon May 19 07:46:23 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 19 May 2025 07:46:23 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v7]
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
> 
> Graal adaption PR: https://github.com/oracle/graal/pull/11212
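
One way to observe the second point from inside a running application is to ask the boot module layer whether `jdk.internal.vm.ci` was resolved at startup. This is only a minimal sketch: the class and method names are hypothetical, and which flag combinations make the module appear is taken from the description above rather than verified here.

```java
public class JvmciModuleCheck {
    static final String JVMCI_MODULE = "jdk.internal.vm.ci";

    // True if the module was resolved into the boot layer when the VM started.
    static boolean isJvmciResolved() {
        return ModuleLayer.boot().findModule(JVMCI_MODULE).isPresent();
    }

    public static void main(String[] args) {
        System.out.println(JVMCI_MODULE + " resolved: " + isJvmciResolved());
    }
}
```

Running it with and without `--add-modules=jdk.internal.vm.ci` should show the difference in the root module set that `-XX:+AOTClassLinking` is sensitive to.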

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  swapped order of recommended options in error message

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25240/files
  - new: https://git.openjdk.org/jdk/pull/25240/files/196425f9..b74077f1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25240&range=05-06

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25240.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25240/head:pull/25240

PR: https://git.openjdk.org/jdk/pull/25240

From fniephaus at openjdk.org  Mon May 19 07:46:23 2025
From: fniephaus at openjdk.org (Fabio Niephaus)
Date: Mon, 19 May 2025 07:46:23 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v6]
In-Reply-To: <1jsJPDBKXdqW9cWDKx8B_86qlqoo1QR1Bu93DC6KWGI=.ad77b0c1-9984-405a-ad2b-6391006748a9@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
 <gmLU_0mQ3N7yrySJNy6-oANMXghnnTNjbnTmKkg9QM0=.0d091a10-4a1e-4151-b0dd-f74a27fcb83a@github.com>
 <1jsJPDBKXdqW9cWDKx8B_86qlqoo1QR1Bu93DC6KWGI=.ad77b0c1-9984-405a-ad2b-6391006748a9@github.com>
Message-ID: <hk6aqTKWtsjLGsPNQXAHWIb68ByEb7VYue3DGsBmGaQ=.c66a5f83-5c23-4932-9a40-5f32181f4db7@github.com>

On Mon, 19 May 2025 07:26:01 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> src/hotspot/share/jvmci/jvmciRuntime.hpp line 38:
>> 
>>> 36: #endif // INCLUDE_G1GC
>>> 37: 
>>> 38: #define JVMCI_NOT_ENABLED_ERROR_MESSAGE "JVMCI is not enabled. Must specify '--add-modules=jdk.internal.vm.ci' or '-XX:+EnableJVMCI' to the java launcher."
>> 
>> It's the exception message for an InternalError so maybe this is okay but in general I don't think you want to be asking users to have to name a JDK internal module.
>
> I think anyone using a non-standard configuration for Graal would not be surprised about JVMCI and would actually appreciate the extra hint when they get things "wrong".

Assuming we want users to use `-XX:+EnableJVMCI` over `--add-modules=jdk.internal.vm.ci` for said reason, maybe flip the order?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2095053343

From dnsimon at openjdk.org  Mon May 19 07:46:23 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 19 May 2025 07:46:23 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v6]
In-Reply-To: <hk6aqTKWtsjLGsPNQXAHWIb68ByEb7VYue3DGsBmGaQ=.c66a5f83-5c23-4932-9a40-5f32181f4db7@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <cyU925v1Ts8NTfDXBQkFYkyupnYDEl3ROdswtq3Do5U=.77428b7b-9b58-4e50-887b-3d747fe8d1e6@github.com>
 <gmLU_0mQ3N7yrySJNy6-oANMXghnnTNjbnTmKkg9QM0=.0d091a10-4a1e-4151-b0dd-f74a27fcb83a@github.com>
 <1jsJPDBKXdqW9cWDKx8B_86qlqoo1QR1Bu93DC6KWGI=.ad77b0c1-9984-405a-ad2b-6391006748a9@github.com>
 <hk6aqTKWtsjLGsPNQXAHWIb68ByEb7VYue3DGsBmGaQ=.c66a5f83-5c23-4932-9a40-5f32181f4db7@github.com>
Message-ID: <N1btWjYAUVZKxqrrbn9TzUcW5ULFdWNhe9oDy45Ull8=.116ee55c-3553-488c-9c27-36624aadfbb6@github.com>

On Mon, 19 May 2025 07:40:45 GMT, Fabio Niephaus <fniephaus at openjdk.org> wrote:

>> I think anyone using a non-standard configuration for Graal would not be surprised about JVMCI and would actually appreciate the extra hint when they get things "wrong".
>
> Assuming we want users to use `-XX:+EnableJVMCI` over `--add-modules=jdk.internal.vm.ci` for said reason, maybe flip the order?

Done.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25240#discussion_r2095057877

From yzheng at openjdk.org  Mon May 19 15:19:15 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Mon, 19 May 2025 15:19:15 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs
Message-ID: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>

This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray

-------------

Commit messages:
 - address comments
 - Add JVMCI support for APX EGPRs

Changes: https://git.openjdk.org/jdk/pull/23159/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23159&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8334717
  Stats: 546 lines in 18 files changed: 41 ins; 334 del; 171 mod
  Patch: https://git.openjdk.org/jdk/pull/23159.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23159/head:pull/23159

PR: https://git.openjdk.org/jdk/pull/23159

From yzheng at openjdk.org  Mon May 19 15:19:15 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Mon, 19 May 2025 15:19:15 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs
In-Reply-To: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
Message-ID: <p3B2QtAwcWKdnHo68aBEktS9XR2HyDWDLwdpyeEdW_A=.68d87bd7-88b2-4072-b9a7-e5a03b8e7aa0@github.com>

On Thu, 16 Jan 2025 16:01:32 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

> This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray

keep alive

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23159#issuecomment-2724418056

From dnsimon at openjdk.org  Mon May 19 15:19:15 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 19 May 2025 15:19:15 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs
In-Reply-To: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
Message-ID: <oDmOdm7vsWMs3xO7U1D8G2coLzL_w2VnTLCKTVnnmRg=.650bf245-d7bd-4442-b840-830ccbedbf23@github.com>

On Thu, 16 Jan 2025 16:01:32 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

> This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/code/RegisterAttributes.java line 58:

> 56:      *         element at index i holds the attributes of the register whose number is i.
> 57:      */
> 58:     public static RegisterAttributes[] createMap(RegisterConfig registerConfig, List<Register> registers) {

We should remove raw arrays as much as possible in JVMCI and replace them with immutable Lists:

     * @return an immutable list whose length is the max register number in {@code registers} plus 1. An
     *         element at index i holds the attributes of the register whose number is i.
     */
    public static List<RegisterAttributes> createMap(RegisterConfig registerConfig, List<Register> registers) {
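
To illustrate the array-to-immutable-List suggestion, here is a minimal, self-contained sketch. `Register` and `Attributes` are hypothetical stand-ins for the JVMCI types, and the `allocatable` parameter replaces the real `RegisterConfig` argument; this is not the code in the PR.

```java
import java.util.Arrays;
import java.util.List;

public class RegisterMapSketch {
    // Hypothetical stand-ins for jdk.vm.ci.code.Register and RegisterAttributes.
    public record Register(int number, String name) {}

    public record Attributes(boolean allocatable) {
        public static final Attributes NONE = new Attributes(false);
    }

    // Returns an immutable list whose size is the max register number plus 1.
    // The element at index i describes the register whose number is i;
    // numbers with no register hold Attributes.NONE.
    public static List<Attributes> createMap(List<Register> registers, List<Register> allocatable) {
        int max = registers.stream().mapToInt(Register::number).max().orElse(-1);
        Attributes[] map = new Attributes[max + 1];
        Arrays.fill(map, Attributes.NONE);
        for (Register r : registers) {
            map[r.number()] = new Attributes(allocatable.contains(r));
        }
        // List.of copies the array, so callers cannot mutate the result
        // and no defensive .clone() is needed on the way out.
        return List.of(map);
    }
}
```

Because the returned list rejects `set` and `add`, it can be shared freely between callers, which is the property the raw array only had when cloned.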

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23159#discussion_r2033171362

From yzheng at openjdk.org  Mon May 19 15:19:15 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Mon, 19 May 2025 15:19:15 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs
In-Reply-To: <oDmOdm7vsWMs3xO7U1D8G2coLzL_w2VnTLCKTVnnmRg=.650bf245-d7bd-4442-b840-830ccbedbf23@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
 <oDmOdm7vsWMs3xO7U1D8G2coLzL_w2VnTLCKTVnnmRg=.650bf245-d7bd-4442-b840-830ccbedbf23@github.com>
Message-ID: <Huc_pu-QcFqtbV8KsSDYorHHrMkdIvPbSEQLNaJagtk=.5cd2c453-1699-42ea-a420-07e0c23cfedf@github.com>

On Tue, 8 Apr 2025 13:15:29 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray
>
> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/code/RegisterAttributes.java line 58:
> 
>> 56:      *         element at index i holds the attributes of the register whose number is i.
>> 57:      */
>> 58:     public static RegisterAttributes[] createMap(RegisterConfig registerConfig, List<Register> registers) {
> 
> We should remove raw arrays as much as possible in JVMCI and replace them with immutable Lists:
> 
>      * @return an immutable list whose length is the max register number in {@code registers} plus 1. An
>      *         element at index i holds the attributes of the register whose number is i.
>      */
>     public static List<RegisterAttributes> createMap(RegisterConfig registerConfig, List<Register> registers) {

I have audited all the .clone() on array objects and changed as much as possible. Let me know if there is still some opportunity

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23159#discussion_r2095952012

From dnsimon at openjdk.org  Mon May 19 15:26:53 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 19 May 2025 15:26:53 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs
In-Reply-To: <Huc_pu-QcFqtbV8KsSDYorHHrMkdIvPbSEQLNaJagtk=.5cd2c453-1699-42ea-a420-07e0c23cfedf@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
 <oDmOdm7vsWMs3xO7U1D8G2coLzL_w2VnTLCKTVnnmRg=.650bf245-d7bd-4442-b840-830ccbedbf23@github.com>
 <Huc_pu-QcFqtbV8KsSDYorHHrMkdIvPbSEQLNaJagtk=.5cd2c453-1699-42ea-a420-07e0c23cfedf@github.com>
Message-ID: <_MgCOg5EY1Sa1zo0ZwQ5Xr8rU3cU5CI5GHxCwIHGoSo=.d90889b8-f821-4695-85cb-1c8727638e7a@github.com>

On Mon, 19 May 2025 15:16:27 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/code/RegisterAttributes.java line 58:
>> 
>>> 56:      *         element at index i holds the attributes of the register whose number is i.
>>> 57:      */
>>> 58:     public static RegisterAttributes[] createMap(RegisterConfig registerConfig, List<Register> registers) {
>> 
>> We should remove raw arrays as much as possible in JVMCI and replace them with immutable Lists:
>> 
>>      * @return an immutable list whose length is the max register number in {@code registers} plus 1. An
>>      *         element at index i holds the attributes of the register whose number is i.
>>      */
>>     public static List<RegisterAttributes> createMap(RegisterConfig registerConfig, List<Register> registers) {
>
> I have audited all the .clone() on array objects and changed as much as possible. Let me know if there is still some opportunity

Looks good - thanks!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23159#discussion_r2095973654

From dnsimon at openjdk.org  Mon May 19 16:50:51 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 19 May 2025 16:50:51 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs
In-Reply-To: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
Message-ID: <6U3nTT2mClWCu8SHNL9JmMfwaKITOkvSmzI-3GAr-WY=.d51c748f-199c-4254-8555-15ab31ce78fd@github.com>

On Thu, 16 Jan 2025 16:01:32 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

> This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray

LGTM

-------------

Marked as reviewed by dnsimon (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23159#pullrequestreview-2851438732

From never at openjdk.org  Mon May 19 17:42:56 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Mon, 19 May 2025 17:42:56 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v7]
In-Reply-To: <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>
Message-ID: <s3RM2tUa6lYwIyMdA7CRaqp5BvoLk_-q8VS763PcS5c=.eafdfd48-a5f3-423b-b49e-1f8fe6b183d2@github.com>

On Mon, 19 May 2025 07:46:23 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>> 
>> Graal adaption PR: https://github.com/oracle/graal/pull/11212
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   swapped order of recommended options in error message

Marked as reviewed by never (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/25240#pullrequestreview-2851570977

From dnsimon at openjdk.org  Mon May 19 17:56:01 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 19 May 2025 17:56:01 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or attaching
 to libjvmci after JDK-8356447
Message-ID: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>

As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:


Error occurred during initialization of VM
java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)


This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
Instead of exiting the VM, the failure should be silent (unless `-XX:+PrintCompilation` is enabled) as the VM can continue without libgraal, albeit in a crippled state. This PR implements this solution.

Alternative solutions include:
1. Trying to adjust the values used with `ulimit -v` in the tests to accommodate the [virtual address reservations](https://github.com/oracle/graal/blob/69f10d3d658a6aeca3d5ce59c64af6a18336f14c/substratevm/src/com.oracle.svm.core.genscavenge/src/com/oracle/svm/core/genscavenge/AddressRangeCommittedMemoryProvider.java#L150) needed by libgraal. This is brittle as it assumes knowledge about how much address space is needed (which in turn depends on how many libgraal compiler threads are created).
2. Add a `@requires !vm.libgraal.jit` guard to the tests so they are not run when libgraal is in use.

I think the solution in this PR is the most robust for the long term.

-------------

Commit messages:
 - do not exit VM if libjvmci env creation fails

Changes: https://git.openjdk.org/jdk/pull/25307/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357135
  Stats: 29 lines in 3 files changed: 9 ins; 17 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/25307.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25307/head:pull/25307

PR: https://git.openjdk.org/jdk/pull/25307

From vlivanov at openjdk.org  Tue May 20 03:29:54 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Tue, 20 May 2025 03:29:54 GMT
Subject: RFR: 8347901: C2 should remove unused leaf / pure runtime calls
In-Reply-To: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
References: <FDC4kftsSAqf2FB3mxOABGMgWhr_qty0_BktGUiuTuE=.060b5943-74c3-461d-8806-b6da1722c207@github.com>
Message-ID: <VyhWEgB7ogn67Vk1EFEyT-FlFGtUJQbyiAbrL3lUsoI=.29192455-8cf7-450a-a136-d736e2333660@github.com>

On Wed, 30 Apr 2025 13:18:33 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

> A first part toward a better support of pure functions.
> 
> ## Pure Functions
> 
> Pure functions (considered here) are functions that have no side effects, no effect on the control flow (no exceptions or the like), cannot deopt, etc. It's really a function that you can execute anywhere, with whichever arguments, without any effect other than wasting time. Integer division is not pure, as dividing by zero throws. But many floating point functions will just return `NaN` or `+/-infinity` in problematic cases.
> 
> ## Scope
> 
> We are not going all powerful for now! It's mostly about identifying some pure functions and being able to remove them if the result is unused. Some other things are not part of this PR, on purpose. Especially, this PR doesn't propose a way to move pure calls around. The reason is that pure calls are macro nodes later expanded into other, regular calls, which require a control input. To be able to do the expansion, we just keep the control in the pure call as well.
> 
> ## Implementation Overview
> 
> We created here some new node kind for pure calls that are expanded into regular calls during macro expansion. This also allows the removal of `ModD` and `ModF` nodes that have their pure equivalent now. They are surprisingly hard to unify with other floating point functions from an implementation point of view!
> 
> IR framework and IGV needed a little bit of fixing.
> 
> Thanks,
> Marc
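
The purity distinction drawn in the quoted description can be seen at the Java level. The sketch below uses hypothetical method names and only shows the language semantics: floating-point remainder never throws, while integer division does. Whether C2 actually drops a particular unused call depends on inlining and other factors not shown here.

```java
public class PurityExample {
    // Floating-point remainder: problematic inputs yield NaN instead of throwing,
    // so an unused result could be dropped without changing program behavior.
    static double fmod(double x, double y) {
        return x % y;
    }

    // Integer division: a zero divisor throws ArithmeticException, which is an
    // observable control-flow effect, so the operation is not pure in this sense.
    static int idiv(int x, int y) {
        return x / y;
    }

    public static void main(String[] args) {
        fmod(7.5, 2.0); // result unused: nothing observable is lost if this is removed
        System.out.println(Double.isNaN(fmod(5.0, 0.0)));
        try {
            idiv(5, 0); // result unused, but the exception must still be thrown
        } catch (ArithmeticException e) {
            System.out.println("idiv threw: " + e.getMessage());
        }
    }
}
```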

I'm just pointing out that delaying lowering decision till matching phase neither makes scheduling easier nor makes implementation simpler.

For loop opts it is important to know when loops contain calls and act accordingly (by trying to hoist relevant nodes out of loops and disabling some optimizations when the calls are still there).

The difference between CFG nodes effectively pinned AT some point and non-CFG nodes with control dependency (effectively pushing them UNDER their control input) becomes insignificant once CFG nodes depend solely on control. In other words, once a call node doesn't consume/produce memory and I/O states, it becomes straightforward to move it around in CFG when desired (between its inputs and users).

Speaking of scheduling, would default scheduling heuristics do a good job? The case of expensive nodes exemplifies the need of custom scheduling heuristics for such nodes. 

Implementation-wise, lowering during matching becomes platform-specific and requires each platform to introduce `effect(CALL)`  AD instructions. Moreover, each call shape (determined by arity and argument kinds) has to be explicitly handled with a dedicated AD instruction. And it doesn't benefit from existing support of call nodes every platform already has.


> Ideally, what we want to do with expensive data nodes is to common them aggressively like any other data node. Then, during code motion, we can clone them if it is beneficial.

The current implementation of expensive nodes can definitely be improved, but the nice property it has is that it only decreases the number of nodes through careful commoning during loop opts. Once cloning is allowed, there's a new problem to care about: the case of too many clones. 

A simple incremental improvement would be to teach `PhaseIdealLoop::process_expensive_nodes()` to push expensive nodes closer to their users if they are on less frequent code paths. Then it can be taught (how and when) to clone expensive nodes between multiple users.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24966#issuecomment-2892797262

From yzheng at openjdk.org  Tue May 20 06:14:09 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Tue, 20 May 2025 06:14:09 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs [v2]
In-Reply-To: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
Message-ID: <ab5quB6HEK2zh2dlc8si-_yI7q0kzAO1foYxETSEbZM=.b773efb8-a644-4255-b2c5-1127fe0370d7@github.com>

> This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray

Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:

  fix tests

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23159/files
  - new: https://git.openjdk.org/jdk/pull/23159/files/aabb8996..37e4d2a4

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23159&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23159&range=00-01

  Stats: 15 lines in 3 files changed: 3 ins; 0 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/23159.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23159/head:pull/23159

PR: https://git.openjdk.org/jdk/pull/23159

From duke at openjdk.org  Tue May 20 11:56:52 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Tue, 20 May 2025 11:56:52 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v6]
In-Reply-To: <G8kShEh81Q8ydS6WxsuVf5tbS1VcwUB9SH1o1rpxDtQ=.86b09653-2bb5-45d3-912f-63db29ec5553@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <XiK_5EerkQ60z0GRdkPkYuJcVxqbyZqO0ofx1Zd1JNM=.51a52b01-6cbd-44a8-ad1d-99a083e037e9@github.com>
 <G8kShEh81Q8ydS6WxsuVf5tbS1VcwUB9SH1o1rpxDtQ=.86b09653-2bb5-45d3-912f-63db29ec5553@github.com>
Message-ID: <PLdSuq6LJJCr_X-mnccWuZdmEbOrBv_srZwPOs2v8EQ=.e756009f-660a-459b-83be-0f0d544fc32c@github.com>

On Fri, 16 May 2025 00:28:18 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Response to review comment + loading constants with broadcast op.
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 250:
> 
>> 248: static void montmul(int outputRegs[], int inputRegs1[], int inputRegs2[],
>> 249:              int scratchRegs1[], int scratchRegs2[], MacroAssembler *_masm) {
>> 250:    for (int i = 0; i < 4; i++) {
> 
> In the intrinsic for montMul we are treating as if MONT_R_BITS is 16 and MONT_Q_INV_MOD_R is 0xF301 whereas in the Java code MONT_R_BITS is 20 and MONT_Q_INV_MOD_R is 0x8F301. Are these equivalent?

As used in this case, they are equivalent. For z = montmul(a,b), z will be between -q and q and congruent to a * b * R^-1 mod q, provided that R > 2 * q, R is a power of 2, and -R/2 * q <= a * b < R/2 * q. For the Java code, we use R = 2^20 and for the intrinsic, R = 2^16. In our computations, b is always c * R mod q, so the montmul() really computes a * c mod q. In the Java code, we use 32-bit numbers for the computations, and we use R = 2^20 because that way the a * b numbers that occur during all computations stay in the required range (the inverse NTT computation is where they can grow the most), so we don't have to do Barrett reductions during that computation. For the intrinsics, we use R = 2^16, because this way we can do twice as much work in parallel, but we have to do Barrett reduction after levels 2 and 4 in the inverse NTT computation.
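
The equivalence can be checked with a small scalar sketch. The class and helper names below are hypothetical and this is not the ML-KEM code or the intrinsic; it only assumes q = 3329 and the two inverse constants quoted in the review comment, and shows that both choices of R give a result congruent to a * c mod q when b is c * R mod q.

```java
public class MontMulSketch {
    static final int Q = 3329;                 // ML-KEM modulus q
    static final int Q_INV_MOD_2_16 = 0xF301;  // q^-1 mod 2^16, as quoted for the intrinsic
    static final int Q_INV_MOD_2_20 = 0x8F301; // q^-1 mod 2^20, as quoted for the Java code

    // Signed Montgomery reduction with R = 2^rBits.
    // For -R/2 * q <= a < R/2 * q, returns t with -q < t < q and t congruent to a * R^-1 mod q.
    static int montReduce(long a, int rBits, int qInvModR) {
        long r = 1L << rBits;
        long m = (a * qInvModR) & (r - 1); // m = a * q^-1 mod R, in [0, R)
        if (m >= r / 2) {
            m -= r;                        // centre m into [-R/2, R/2)
        }
        return (int) ((a - m * Q) >> rBits); // a - m*q is a multiple of R, so the shift is exact
    }

    static int montMul(int a, int b, int rBits, int qInvModR) {
        return montReduce((long) a * b, rBits, qInvModR);
    }

    // Converts c to the form c * R mod q that the second operand is said to have.
    static int toMont(int c, int rBits) {
        return (int) Math.floorMod((long) c << rBits, (long) Q);
    }

    public static void main(String[] args) {
        int a = 1234, c = 567;
        int z20 = montMul(a, toMont(c, 20), 20, Q_INV_MOD_2_20);
        int z16 = montMul(a, toMont(c, 16), 16, Q_INV_MOD_2_16);
        // The raw results may differ by q, but both are congruent to a * c mod q.
        System.out.println(Math.floorMod(z20, Q) + " " + Math.floorMod(z16, Q) + " " + (a * c) % Q);
    }
}
```

The sketch does not model the range growth across NTT levels, which is what decides where Barrett reductions are needed for R = 2^16.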

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2097757145

From dnsimon at openjdk.org  Tue May 20 12:14:07 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 20 May 2025 12:14:07 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v2]
In-Reply-To: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
Message-ID: <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>

> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
> 
> 
> Error occurred during initialization of VM
> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
> 
> 
> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
> Instead of exiting the VM, the failure should be silent (unless `-XX:+PrintCompilation` is enabled) as the VM can continue without libgraal, albeit in a crippled state. This PR implements this solution.
> 
> Alternative solutions include:
> 1. Trying to adjust the values used with `ulimit -v` in the tests to accommodate the [virtual address reservations](https://github.com/oracle/graal/blob/69f10d3d658a6aeca3d5ce59c64af6a18336f14c/substratevm/src/com.oracle.svm.core.genscavenge/src/com/oracle/svm/core/genscavenge/AddressRangeCommittedMemoryProvider.java#L150) needed by libgraal. This is brittle as it assumes knowledge about how much address space is needed (which in turn depends on how many libgraal compiler threads are created).
> 2. Add a `@requires !vm.libgraal.jit` guard to the tests so they are not run when libgraal is in use.
> 
> I think the solution in this PR is the most robust for the long term.

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  consolidate JVMCI eager initialization

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25307/files
  - new: https://git.openjdk.org/jdk/pull/25307/files/7eb259b9..32986d1a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=00-01

  Stats: 41 lines in 5 files changed: 17 ins; 19 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/25307.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25307/head:pull/25307

PR: https://git.openjdk.org/jdk/pull/25307

From yzheng at openjdk.org  Tue May 20 12:27:53 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Tue, 20 May 2025 12:27:53 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v2]
In-Reply-To: <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
 <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>
Message-ID: <3aLK-TCHFl8-YyAX6Ppjm458pXwA5jGq6qssypzvTw0=.8ad6de3e-63f4-4c1b-bac4-01c84549a7d7@github.com>

On Tue, 20 May 2025 12:14:07 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
>> 
>> 
>> Error occurred during initialization of VM
>> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
>> 
>> 
>> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
>> Instead of exiting the VM, the failure should be silent (unless `-XX:+PrintCompilation` is enabled) as the VM can continue without libgraal, albeit in a crippled state. This PR implements this solution.
>> 
>> Alternative solutions include:
>> 1. Trying to adjust the values used with `ulimit -v` in the tests to accommodate the [virtual address reservations](https://github.com/oracle/graal/blob/69f10d3d658a6aeca3d5ce59c64af6a18336f14c/substratevm/src/com.oracle.svm.core.genscavenge/src/com/oracle/svm/core/genscavenge/AddressRangeCommittedMemoryProvider.java#L150) needed by libgraal. This is brittle as it assumes knowledge about how much address space is needed (which in turn depends on how many libgraal compiler threads are created).
>> 2. Add a `@requires !vm.libgraal.jit` guard to the tests so they are not run when libgraal is in use.
>> 
>> I think the solution in this PR is the most robust for the long term.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   consolidate JVMCI eager initialization

LGTM

-------------

Marked as reviewed by yzheng (Committer).

PR Review: https://git.openjdk.org/jdk/pull/25307#pullrequestreview-2853970394

From dnsimon at openjdk.org  Tue May 20 12:58:28 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 20 May 2025 12:58:28 GMT
Subject: RFR: 8357370: Export supported GCs in JVMCI
In-Reply-To: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
References: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
Message-ID: <vcSujBFCQpySjCaAvASXkVDOak7EuBBZWnfJU_hK3GA=.85b0f3a2-9dcb-490f-9096-32b29cf083f7@github.com>

On Tue, 20 May 2025 12:52:02 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> I need a way to detect in JVMCI if Shenandoah GC is supported (that is, built-in) by HotSpot. I need it for Shenandoah, because some vendors don't build it, but for cleanliness the relevant preprocessor constants should be exported for all GCs.
> 
> Testing:
>  - [x] build/test https://github.com/oracle/graal/pull/10904

src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 498:

> 496:   declare_preprocessor_constant("ASSERT", DEBUG_ONLY(1) NOT_DEBUG(0))     \
> 497:                                                                           \
> 498:   declare_preprocessor_constant("INCLUDE_SERIALGC",     INCLUDE_SERIALGC)     \

Probably best to make the formatting consistent with how it's done for the `JVM_ACC_*` constants below (i.e., no alignment of values).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25325#discussion_r2097893655

From rkennke at openjdk.org  Tue May 20 12:58:27 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 20 May 2025 12:58:27 GMT
Subject: RFR: 8357370: Export supported GCs in JVMCI
Message-ID: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>

I need a way to detect in JVMCI if Shenandoah GC is supported (that is, built-in) by HotSpot. I need it for Shenandoah, because some vendors don't build it, but for cleanliness the relevant preprocessor constants should be exported for all GCs.

Testing:
 - [x] build/test https://github.com/oracle/graal/pull/10904
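
For illustration only, a minimal sketch of how a JVMCI client might consume such exported constants. It assumes the constants are visible to the client as a name-to-value map; the map contents below are made up, and `GcSupportSketch`, `isBuiltIn`, and the `INCLUDE_SHENANDOAHGC`/`INCLUDE_ZGC` entries are hypothetical names chosen by analogy with `INCLUDE_SERIALGC`, not code taken from this PR.

```java
import java.util.Map;

class GcSupportSketch {
    // True when the named preprocessor constant is exported with a non-zero value.
    // A missing entry (for example on a JDK that predates the export) counts as "not built in".
    static boolean isBuiltIn(Map<String, Long> constants, String name) {
        Long value = constants.get(name);
        return value != null && value != 0L;
    }

    public static void main(String[] args) {
        // Hypothetical snapshot of exported constants; the values are invented for the example.
        Map<String, Long> constants = Map.of(
            "INCLUDE_SERIALGC", 1L,
            "INCLUDE_SHENANDOAHGC", 0L);

        System.out.println("serial:     " + isBuiltIn(constants, "INCLUDE_SERIALGC"));
        System.out.println("shenandoah: " + isBuiltIn(constants, "INCLUDE_SHENANDOAHGC"));
        System.out.println("zgc:        " + isBuiltIn(constants, "INCLUDE_ZGC"));
    }
}
```

Treating an absent constant as "not built in" is a conservative choice for the sketch; a real client may prefer to fail loudly instead.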

-------------

Commit messages:
 - 8357370: Export supported GCs in JVMCI

Changes: https://git.openjdk.org/jdk/pull/25325/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25325&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357370
  Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/25325.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25325/head:pull/25325

PR: https://git.openjdk.org/jdk/pull/25325

From rkennke at openjdk.org  Tue May 20 13:12:06 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 20 May 2025 13:12:06 GMT
Subject: RFR: 8357370: Export supported GCs in JVMCI [v2]
In-Reply-To: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
References: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
Message-ID: <nag7zFgIVgKjjbEq7HEny_S7iTlqpVbY_7O1iGcZb3c=.721d0c7b-97c3-4abf-81a8-e7761c808ce3@github.com>

> I need a way to detect in JVMCI if Shenandoah GC is supported (that is, built-in) by HotSpot. I need it for Shenandoah, because some vendors don't build it, but for cleanliness the relevant preprocessor constants should be exported for all GCs.
> 
> Testing:
>  - [x] build/test https://github.com/oracle/graal/pull/10904

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Don't align values

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25325/files
  - new: https://git.openjdk.org/jdk/pull/25325/files/7caef245..321a0940

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25325&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25325&range=00-01

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/25325.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25325/head:pull/25325

PR: https://git.openjdk.org/jdk/pull/25325

From rkennke at openjdk.org  Tue May 20 13:35:31 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 20 May 2025 13:35:31 GMT
Subject: RFR: 8357370: Export supported GCs in JVMCI [v3]
In-Reply-To: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
References: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
Message-ID: <OKlmr4OQnseMQmNcVvX9poYdB8XEUo3QBF-AFcG9HH4=.8ba1fe84-76af-4e89-a5dc-7867965e3821@github.com>

> I need a way to detect in JVMCI if Shenandoah GC is supported (that is, built-in) by HotSpot. I need it for Shenandoah, because some vendors don't build it, but for cleanliness the relevant preprocessor constants should be exported for all GCs.
> 
> Testing:
>  - [x] build/test https://github.com/oracle/graal/pull/10904

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Align most trailing \s

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25325/files
  - new: https://git.openjdk.org/jdk/pull/25325/files/321a0940..16d82e7b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25325&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25325&range=01-02

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/25325.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25325/head:pull/25325

PR: https://git.openjdk.org/jdk/pull/25325

From dnsimon at openjdk.org  Tue May 20 13:35:31 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Tue, 20 May 2025 13:35:31 GMT
Subject: RFR: 8357370: Export supported GCs in JVMCI [v3]
In-Reply-To: <OKlmr4OQnseMQmNcVvX9poYdB8XEUo3QBF-AFcG9HH4=.8ba1fe84-76af-4e89-a5dc-7867965e3821@github.com>
References: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
 <OKlmr4OQnseMQmNcVvX9poYdB8XEUo3QBF-AFcG9HH4=.8ba1fe84-76af-4e89-a5dc-7867965e3821@github.com>
Message-ID: <GYbTb0msR8yCGDeh6u77kpUnnC4tP_7w6LZKoR1D92M=.adcf3c24-ce0a-4970-807c-ebfd2a2786f8@github.com>

On Tue, 20 May 2025 13:32:08 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> I need a way to detect in JVMCI if Shenandoah GC is supported (that is, built-in) by HotSpot. I need it for Shenandoah, because some vendors don't build it, but for cleanliness the relevant preprocessor constants should be exported for all GCs.
>> 
>> Testing:
>>  - [x] build/test https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Align most trailing \s

LGTM and trivial.

-------------

Marked as reviewed by dnsimon (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25325#pullrequestreview-2854208149

From sviswanathan at openjdk.org  Tue May 20 17:17:58 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Tue, 20 May 2025 17:17:58 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v6]
In-Reply-To: <PLdSuq6LJJCr_X-mnccWuZdmEbOrBv_srZwPOs2v8EQ=.e756009f-660a-459b-83be-0f0d544fc32c@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <XiK_5EerkQ60z0GRdkPkYuJcVxqbyZqO0ofx1Zd1JNM=.51a52b01-6cbd-44a8-ad1d-99a083e037e9@github.com>
 <G8kShEh81Q8ydS6WxsuVf5tbS1VcwUB9SH1o1rpxDtQ=.86b09653-2bb5-45d3-912f-63db29ec5553@github.com>
 <PLdSuq6LJJCr_X-mnccWuZdmEbOrBv_srZwPOs2v8EQ=.e756009f-660a-459b-83be-0f0d544fc32c@github.com>
Message-ID: <QCcK72OLrwAGZrLDmUqiH_w4DkLGBfe6OINGEAU1AqY=.13453fa8-26bd-4617-ac35-a2f89cbaff69@github.com>

On Tue, 20 May 2025 11:51:49 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 250:
>> 
>>> 248: static void montmul(int outputRegs[], int inputRegs1[], int inputRegs2[],
>>> 249:              int scratchRegs1[], int scratchRegs2[], MacroAssembler *_masm) {
>>> 250:    for (int i = 0; i < 4; i++) {
>> 
>> The intrinsic for montMul treats MONT_R_BITS as 16 and MONT_Q_INV_MOD_R as 0xF301, whereas in the Java code MONT_R_BITS is 20 and MONT_Q_INV_MOD_R is 0x8F301. Are these equivalent?
>
> As used in this case, they are equivalent. For z = montmul(a, b), z will be between -q and q and congruent to a * b * R^-1 mod q, provided that R is a power of 2, R > 2 * q, and -R/2 * q <= a * b < R/2 * q. For the Java code, we use R = 2^20 and for the intrinsic, R = 2^16. In our computations, b is always c * R mod q, so montmul() really computes a * c mod q. In the Java code, we use 32-bit numbers for the computations, and we use R = 2^20 because that way the a * b values that occur during all computations stay in the required range (the inverse NTT computation is where they can grow the most), so we don't have to do Barrett reductions during that computation. For the intrinsics, we use R = 2^16 because this way we can do twice as much work in parallel, but we have to do Barrett reductions after levels 2 and 4 in the inverse NTT computation.

Thanks a lot for the explanation. It would be good to add it as a comment in stubGenerator_x86_64_kyber.cpp.
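
To make the equivalence concrete, here is a small standalone sketch (not the JDK code) of a signed Montgomery multiplication parameterized by R, run once with R = 2^16 and once with R = 2^20. The class and method names are hypothetical, q = 3329 is the ML-KEM modulus, and 0xF301 and 0x8F301 are the two q^-1 mod R values quoted above. It uses `long` arithmetic for simplicity rather than the 16-bit or 32-bit lanes of the real implementations.

```java
class MontmulSketch {
    static final long Q = 3329; // ML-KEM modulus

    // Returns z with z congruent to a * b * R^-1 (mod Q) and -Q < z < Q,
    // where R = 2^rBits, assuming -R/2 * Q <= a * b < R/2 * Q.
    static long montmul(long a, long b, int rBits, long qInvModR) {
        long r = 1L << rBits;
        long prod = a * b;
        long t = (prod * qInvModR) & (r - 1); // prod * q^-1 mod R
        if (t >= r / 2) {
            t -= r;                           // centered representative in [-R/2, R/2)
        }
        return (prod - t * Q) >> rBits;       // prod - t*Q is a multiple of R, so this is exact
    }

    // Converts c to Montgomery form, that is c * R mod Q.
    static long toMont(long c, int rBits) {
        return ((c % Q) << rBits) % Q;
    }

    public static void main(String[] args) {
        long a = 1234, c = 2345;
        long expected = (a * c) % Q;
        int[] rBits = {16, 20};
        long[] qInv = {0xF301, 0x8F301};
        for (int i = 0; i < rBits.length; i++) {
            long z = montmul(a, toMont(c, rBits[i]), rBits[i], qInv[i]);
            // z may be negative, so normalize before comparing with a * c mod Q.
            boolean match = Math.floorMod(z, Q) == expected;
            System.out.println("R = 2^" + rBits[i] + ": z = " + z + ", match = " + match);
        }
    }
}
```

Both choices of R yield a value congruent to a * c mod q because the second operand is pre-multiplied by the same R that the reduction divides out; only the admissible range of a * b differs.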

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2098491524

From sviswanathan at openjdk.org  Tue May 20 17:35:55 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Tue, 20 May 2025 17:35:55 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v6]
In-Reply-To: <XiK_5EerkQ60z0GRdkPkYuJcVxqbyZqO0ofx1Zd1JNM=.51a52b01-6cbd-44a8-ad1d-99a083e037e9@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <XiK_5EerkQ60z0GRdkPkYuJcVxqbyZqO0ofx1Zd1JNM=.51a52b01-6cbd-44a8-ad1d-99a083e037e9@github.com>
Message-ID: <TgTmS02LeS7ie9AQxKP8yR9lZQgtusyrXKTfY2xX5Cc=.87be806f-d108-4b07-a0db-4869381d6c09@github.com>

On Thu, 15 May 2025 13:33:42 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Response to review comment + loading constants with broadcast op.

Looks good to me.

-------------

Marked as reviewed by sviswanathan (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24953#pullrequestreview-2855056310

From duke at openjdk.org  Tue May 20 17:49:14 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Tue, 20 May 2025 17:49:14 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v7]
In-Reply-To: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
Message-ID: <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>

> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.

Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:

  Added some comments.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24953/files
  - new: https://git.openjdk.org/jdk/pull/24953/files/e4f3264e..ea2152da

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24953&range=05-06

  Stats: 14 lines in 1 file changed: 14 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/24953.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24953/head:pull/24953

PR: https://git.openjdk.org/jdk/pull/24953

From sviswanathan at openjdk.org  Tue May 20 17:52:55 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Tue, 20 May 2025 17:52:55 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v7]
In-Reply-To: <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
Message-ID: <LhasX17GlhjgzZ_V2hlJxGZp-KpYb9ZvbZBduPyzDHY=.23e1181f-9fa3-427c-b31f-70b5efc176e9@github.com>

On Tue, 20 May 2025 17:49:14 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added some comments.

Thanks for adding the comment.

-------------

Marked as reviewed by sviswanathan (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24953#pullrequestreview-2855099857

From duke at openjdk.org  Tue May 20 18:48:55 2025
From: duke at openjdk.org (duke)
Date: Tue, 20 May 2025 18:48:55 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v7]
In-Reply-To: <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
Message-ID: <3ev08acOQdRUvWRfhksWfQER7TRnpd7gY5mA-OUb8_k=.5b3fe078-dbe9-41ec-b810-f7485280eba8@github.com>

On Tue, 20 May 2025 17:49:14 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added some comments.

@ferakocz 
Your change (at version ea2152dab73080d2b4759526d220f19706d768b6) is now ready to be sponsored by a Committer.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24953#issuecomment-2895471350

From duke at openjdk.org  Tue May 20 19:08:59 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Tue, 20 May 2025 19:08:59 GMT
Subject: Integrated: 8351412: Add AVX-512 intrinsics for ML-KEM
In-Reply-To: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
Message-ID: <PMRu_hQglgYckwNzva2_vZtasbwLxEhd-CTgWQh-imM=.98ad8c30-0836-495d-8ed1-84d8bec9c632@github.com>

On Tue, 29 Apr 2025 18:49:52 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.

This pull request has now been integrated.

Changeset: 972f2ebe
Author:    Ferenc Rakoczi <ferenc.r.rakoczi at oracle.com>
Committer: Sandhya Viswanathan <sviswanathan at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/972f2ebe978280d22531a70116e79837632f6ebc
Stats:     988 lines in 10 files changed: 977 ins; 2 del; 9 mod

8351412: Add AVX-512 intrinsics for ML-KEM

Reviewed-by: sviswanathan

-------------

PR: https://git.openjdk.org/jdk/pull/24953

From mullan at openjdk.org  Tue May 20 19:13:58 2025
From: mullan at openjdk.org (Sean Mullan)
Date: Tue, 20 May 2025 19:13:58 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v7]
In-Reply-To: <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
Message-ID: <0ZhaH_07oxLDZxz8wVEgbsbYWB50sjuLZxYwyM4ftno=.2adb899d-7768-481d-975b-8e0ee3e6f2c2@github.com>

On Tue, 20 May 2025 17:49:14 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added some comments.

Please also write a release note as the performance improvement is significant. Thanks!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24953#issuecomment-2895525488

From lmesnik at openjdk.org  Tue May 20 23:53:59 2025
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Tue, 20 May 2025 23:53:59 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v7]
In-Reply-To: <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
Message-ID: <EjTlFPRs9t57TQgFhtbc_bmYRxMIDdq6COpXo_UT08w=.f23b4d46-359a-4d6e-bea3-5c522580d65c@github.com>

On Tue, 20 May 2025 17:49:14 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added some comments.

I haven't found an answer to my question about testing. How is this fix tested?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24953#issuecomment-2896080458

From duke at openjdk.org  Wed May 21 04:43:59 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Wed, 21 May 2025 04:43:59 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v7]
In-Reply-To: <EjTlFPRs9t57TQgFhtbc_bmYRxMIDdq6COpXo_UT08w=.f23b4d46-359a-4d6e-bea3-5c522580d65c@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
 <EjTlFPRs9t57TQgFhtbc_bmYRxMIDdq6COpXo_UT08w=.f23b4d46-359a-4d6e-bea3-5c522580d65c@github.com>
Message-ID: <sXPZ6BhnLB9fKd8NJ_SGCC4SrSpjYk5t_H7E1ugxe9o=.5304136f-2d7f-41be-877a-16b0692feddf@github.com>

On Tue, 20 May 2025 23:51:15 GMT, Leonid Mesnik <lmesnik at openjdk.org> wrote:

> I haven't found an answer to my question about testing. How is this fix tested?

The change in the file test/jdk/sun/security/provider/acvp/Launcher.java in PR https://github.com/openjdk/jdk/pull/23860/files covers this as well.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24953#issuecomment-2896548094

From lmesnik at openjdk.org  Wed May 21 05:02:58 2025
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Wed, 21 May 2025 05:02:58 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v7]
In-Reply-To: <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
Message-ID: <b4kTo1n0ERWwu_V4bz-keL0dek83MKIA90H3h-OX-s8=.4b81f4d7-b791-4423-a544-82df3638f586@github.com>

On Tue, 20 May 2025 17:49:14 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added some comments.

Thanks for pointing to the test.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24953#issuecomment-2896581694

From dnsimon at openjdk.org  Wed May 21 08:56:59 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 08:56:59 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs [v2]
In-Reply-To: <ab5quB6HEK2zh2dlc8si-_yI7q0kzAO1foYxETSEbZM=.b773efb8-a644-4255-b2c5-1127fe0370d7@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
 <ab5quB6HEK2zh2dlc8si-_yI7q0kzAO1foYxETSEbZM=.b773efb8-a644-4255-b2c5-1127fe0370d7@github.com>
Message-ID: <wP8C8y-g0Y7skZImEIkjw8fFVGnHfemEa8F1Skd39lY=.7e81a88b-544a-4a72-8742-b0ea23f180e7@github.com>

On Tue, 20 May 2025 06:14:09 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix tests

Marked as reviewed by dnsimon (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/23159#pullrequestreview-2856879701

From yzheng at openjdk.org  Wed May 21 08:56:59 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 21 May 2025 08:56:59 GMT
Subject: RFR: 8334717: Add JVMCI support for APX EGPRs [v2]
In-Reply-To: <ab5quB6HEK2zh2dlc8si-_yI7q0kzAO1foYxETSEbZM=.b773efb8-a644-4255-b2c5-1127fe0370d7@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
 <ab5quB6HEK2zh2dlc8si-_yI7q0kzAO1foYxETSEbZM=.b773efb8-a644-4255-b2c5-1127fe0370d7@github.com>
Message-ID: <8B0cGaejoT19Paf9ccpOje3O6DccoOuE2nm8G6o0gVY=.abd66730-d1c7-4819-9bae-ecdf78fa8e9b@github.com>

On Tue, 20 May 2025 06:14:09 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix tests

Thanks for the review!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23159#issuecomment-2897150185

From yzheng at openjdk.org  Wed May 21 08:56:59 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 21 May 2025 08:56:59 GMT
Subject: Integrated: 8334717: Add JVMCI support for APX EGPRs
In-Reply-To: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
References: <D9htbFH3sa2Ra2-_EYtf6o8SK7RYeMZoOJ78C2gImRI=.540520cc-aaf9-4da6-b44d-2b91128859e9@github.com>
Message-ID: <C0Khhknj5VMn-iJpubGGEJnq4STvITqS-fJFSdqzqk4=.99535778-ceaf-44d3-9f05-78f17c6b9f35@github.com>

On Thu, 16 Jan 2025 16:01:32 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

> This PR marks extra general purpose registers introduced by Intel APX as Graal allocatables. It also drops AMD64/AArch64/RISCV64.flags and RegisterArray

This pull request has now been integrated.

Changeset: 735c7899
Author:    Yudi Zheng <yzheng at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/735c7899d124a4e0c9579ea7802c9475eaedda10
Stats:     561 lines in 21 files changed: 44 ins; 334 del; 183 mod

8334717: Add JVMCI support for APX EGPRs

Reviewed-by: dnsimon

-------------

PR: https://git.openjdk.org/jdk/pull/23159

From rkennke at openjdk.org  Wed May 21 11:14:59 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 21 May 2025 11:14:59 GMT
Subject: RFR: 8357370: Export supported GCs in JVMCI [v3]
In-Reply-To: <OKlmr4OQnseMQmNcVvX9poYdB8XEUo3QBF-AFcG9HH4=.8ba1fe84-76af-4e89-a5dc-7867965e3821@github.com>
References: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
 <OKlmr4OQnseMQmNcVvX9poYdB8XEUo3QBF-AFcG9HH4=.8ba1fe84-76af-4e89-a5dc-7867965e3821@github.com>
Message-ID: <TJz9APVXe4yQ_kK1p24yecA_KQ8IfVbnv3KsKP0pkA0=.9e499938-0204-4112-a7e4-409bbef004af@github.com>

On Tue, 20 May 2025 13:35:31 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> I need a way to detect in JVMCI if Shenandoah GC is supported (that is, built-in) by HotSpot. I need it for Shenandoah, because some vendors don't build it, but for cleanliness the relevant preprocessor constants should be exported for all GCs.
>> 
>> Testing:
>>  - [x] build/test https://github.com/oracle/graal/pull/10904
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Align most trailing \s

Thanks!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25325#issuecomment-2897553017

From rkennke at openjdk.org  Wed May 21 11:15:00 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 21 May 2025 11:15:00 GMT
Subject: Integrated: 8357370: Export supported GCs in JVMCI
In-Reply-To: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
References: <EoQGC-wmL12DKTETN_TxonKLuX0awOPhMhQsXM3RJRw=.6ba8494f-4053-40eb-bb18-b2f022540c3e@github.com>
Message-ID: <3NC5H23jy_ZVGiP7FHXskbyotZZcJlEvU7O3idP0zk8=.b52e774d-fdf4-4ba9-86a6-dd158ffc9ead@github.com>

On Tue, 20 May 2025 12:52:02 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> I need a way to detect in JVMCI if Shenandoah GC is supported (that is, built-in) by HotSpot. I need it for Shenandoah, because some vendors don't build it, but for cleanliness the relevant preprocessor constants should be exported for all GCs.
> 
> Testing:
>  - [x] build/test https://github.com/oracle/graal/pull/10904

This pull request has now been integrated.

Changeset: 2c126f19
Author:    Roman Kennke <rkennke at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/2c126f1954435a5b4d6cdc367b7b5e8c91cfae63
Stats:     6 lines in 1 file changed: 6 ins; 0 del; 0 mod

8357370: Export supported GCs in JVMCI

Reviewed-by: dnsimon

-------------

PR: https://git.openjdk.org/jdk/pull/25325

From yzheng at openjdk.org  Wed May 21 15:06:16 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 21 May 2025 15:06:16 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod
Message-ID: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>

Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.
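
As a rough mental model only (this is not the HotSpot implementation), the effect can be sketched with a per-method counter. `DecompileCountSketch`, `onDeopt`, `canCompileAtTopTier`, and the tiny `CUTOFF` value are hypothetical stand-ins; the real threshold is the `PerMethodRecompilationCutoff` flag.

```java
class DecompileCountSketch {
    // Hypothetical small cutoff standing in for PerMethodRecompilationCutoff.
    static final int CUTOFF = 3;

    private int decompileCount;

    // Records one deoptimization of an nmethod belonging to this method.
    // When exemptHosted is true, hosted compiled nmethods do not count.
    void onDeopt(boolean hostedNmethod, boolean exemptHosted) {
        if (hostedNmethod && exemptHosted) {
            return;
        }
        decompileCount++;
    }

    // In this model, top-tier compilation is blocked once the count exceeds the cutoff.
    boolean canCompileAtTopTier() {
        return decompileCount <= CUTOFF;
    }

    public static void main(String[] args) {
        DecompileCountSketch before = new DecompileCountSketch();
        DecompileCountSketch after = new DecompileCountSketch();
        for (int i = 0; i < 10; i++) {
            before.onDeopt(true, false); // every hosted deopt counts
            after.onDeopt(true, true);   // hosted deopts are exempt
        }
        System.out.println("without exemption: " + before.canCompileAtTopTier());
        System.out.println("with exemption:    " + after.canCompileAtTopTier());
    }
}
```

In the model, deoptimizations of non-hosted nmethods still count in both cases, so the cutoff keeps its purpose for ordinary compilations.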

-------------

Commit messages:
 - [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod

Changes: https://git.openjdk.org/jdk/pull/25356/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357424
  Stats: 45 lines in 4 files changed: 37 ins; 0 del; 8 mod
  Patch: https://git.openjdk.org/jdk/pull/25356.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25356/head:pull/25356

PR: https://git.openjdk.org/jdk/pull/25356

From yzheng at openjdk.org  Wed May 21 15:10:30 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Wed, 21 May 2025 15:10:30 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod [v2]
In-Reply-To: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
References: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
Message-ID: <KhHLOeEBxRa3pPyA2GWDB_1QItvblLXMZ0TumGQas68=.d6af7e0c-7176-4713-b131-834f3b145947@github.com>

> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.

Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:

  update copyright

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25356/files
  - new: https://git.openjdk.org/jdk/pull/25356/files/8fcd7104..ef4a4c98

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25356.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25356/head:pull/25356

PR: https://git.openjdk.org/jdk/pull/25356

From iklam at openjdk.org  Wed May 21 15:18:00 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Wed, 21 May 2025 15:18:00 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v7]
In-Reply-To: <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>
Message-ID: <NZ_roWV0hKGF62cmaAuYdrTZ6htMegnOpZEI-gTQ3f4=.b5497d02-c5bd-4fe3-8ce1-8dcadd38413e@github.com>

On Mon, 19 May 2025 07:46:23 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
>> If libgraal is not enabled, +EnableJVMCI will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in the production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>> 
>> Graal adaptation PR: https://github.com/oracle/graal/pull/11212
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   swapped order of recommended options in error message

Marked as reviewed by iklam (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/25240#pullrequestreview-2858145831

From never at openjdk.org  Wed May 21 15:57:55 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Wed, 21 May 2025 15:57:55 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v2]
In-Reply-To: <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
 <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>
Message-ID: <i4LjYRoz5g7gLKOxKYqJJGWMao5MC3jX3qdEZH-3xBg=.b6524b02-1b80-47ad-8953-bf99e09f7e7a@github.com>

On Tue, 20 May 2025 12:14:07 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
>> 
>> 
>> Error occurred during initialization of VM
>> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
>> 
>> 
>> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
>> Instead of exiting the VM, the failure should be silent (unless `-XX:+PrintCompilation` is enabled) as the VM can continue without libgraal, albeit in a crippled state. This PR implements this solution.
>> 
>> Alternative solutions include:
>> 1. Trying to adjust the values used with `ulimit -v` in the tests to accommodate the [virtual address reservations](https://github.com/oracle/graal/blob/69f10d3d658a6aeca3d5ce59c64af6a18336f14c/substratevm/src/com.oracle.svm.core.genscavenge/src/com/oracle/svm/core/genscavenge/AddressRangeCommittedMemoryProvider.java#L150) needed by libgraal. This is brittle as it assumes knowledge about how much address space is needed (which in turn depends on how many libgraal compiler threads are created).
>> 2. Add a `@requires !vm.libgraal.jit` guard to the tests so they are not run when libgraal is in use.
>> 
>> I think the solution in this PR is the most robust for the long term.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   consolidate JVMCI eager initialization

Silently disabling the top level JIT seems like a bad default behaviour for customers.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25307#issuecomment-2898465575

From dnsimon at openjdk.org  Wed May 21 16:15:52 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 16:15:52 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v2]
In-Reply-To: <i4LjYRoz5g7gLKOxKYqJJGWMao5MC3jX3qdEZH-3xBg=.b6524b02-1b80-47ad-8953-bf99e09f7e7a@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
 <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>
 <i4LjYRoz5g7gLKOxKYqJJGWMao5MC3jX3qdEZH-3xBg=.b6524b02-1b80-47ad-8953-bf99e09f7e7a@github.com>
Message-ID: <BSgRRnJQnNBBzeeVb9ZO535VTJyUGk5knXAUC6kr4bA=.4fd4cad8-5a87-4a45-a144-0f9856d74dc2@github.com>

On Wed, 21 May 2025 15:54:58 GMT, Tom Rodriguez <never at openjdk.org> wrote:

> Silently disabling the top level JIT seems like a bad default behaviour for customers.

This does not disable the JIT, just suppresses a specific type of error (i.e., reserving virtual address space for the SVM heap) when trying to initialize libgraal at startup. Importantly, the error of badly specified libgraal options still causes a VM exit.

What alternative solution would you prefer? One of the other 2 proposals in the PR description? Or something else?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25307#issuecomment-2898519536

From kvn at openjdk.org  Wed May 21 17:19:57 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Wed, 21 May 2025 17:19:57 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v7]
In-Reply-To: <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>
Message-ID: <cn6xefyqtPciIhpMkKS_ZJ0onKkbXWGRB_gKAMb2Czw=.69ceb318-80d3-44f1-851d-9a6a2dad0921@github.com>

On Mon, 19 May 2025 07:46:23 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
>> If libgraal is not enabled, `-XX:+EnableJVMCI` will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>> 
>> Graal adaption PR: https://github.com/oracle/graal/pull/11212
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   swapped order of recommended options in error message

Looks good.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25240#pullrequestreview-2858568094

From never at openjdk.org  Wed May 21 17:55:55 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Wed, 21 May 2025 17:55:55 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v2]
In-Reply-To: <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
 <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>
Message-ID: <PdWyfY6AypgX5fJhMbM70SvMuqurNKnD2i13XCIF3i4=.5fe14740-8805-40c9-a895-4f449eb0cfc8@github.com>

On Tue, 20 May 2025 12:14:07 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
>> 
>> 
>> Error occurred during initialization of VM
>> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
>> 
>> 
>> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
>> Instead of exiting the VM, the failure should be silent (unless `-XX:+PrintCompilation` is enabled) as the VM can continue without libgraal, albeit in a crippled state. This PR implements this solution.
>> 
>> Alternative solutions include:
>> 1. Trying to adjust the values used with `ulimit -v` in the tests to accommodate the [virtual address reservations](https://github.com/oracle/graal/blob/69f10d3d658a6aeca3d5ce59c64af6a18336f14c/substratevm/src/com.oracle.svm.core.genscavenge/src/com/oracle/svm/core/genscavenge/AddressRangeCommittedMemoryProvider.java#L150) needed by libgraal. This is brittle as it assumes knowledge about how much address space is needed (which in turn depends on how many libgraal compiler threads are created).
>> 2. Add a `@requires !vm.libgraal.jit` guard to the tests so they are not run when libgraal is in use.
>> 
>> I think the solution in this PR is the most robust for the long term.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   consolidate JVMCI eager initialization

After this executes we have a running JVM without a working libgraal right?  It might be rare in a user environment but it's very confusing behaviour for an end user.  Might this not occur in a virtualized environment?

I agree it would be very hard to make libgraal robust in the face of such a limited virtual address space so I think disabling the tests for libgraal would be easiest.  Or both of those tests could probably just run with -Xint to avoid this completely.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25307#issuecomment-2898777546

From dnsimon at openjdk.org  Wed May 21 19:24:00 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 19:24:00 GMT
Subject: RFR: 8345826: Do not automatically resolve jdk.internal.vm.ci when
 libgraal is used [v7]
In-Reply-To: <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
 <MvEM8FIlKWT9HhUeFsrEWFoJf5BRVgZ08GxOKRdbLwY=.4ac67a7f-20b9-41e3-bb82-a960beec41c2@github.com>
Message-ID: <5HZC3_I8BfmE7cq4-2CvEkUiwayB2nMX3uF7EXV2Csw=.e4230abe-b732-444c-b391-935aac6b7891@github.com>

On Mon, 19 May 2025 07:46:23 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> The `EnableJVMCI` flag currently serves 2 purposes:
>> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
>> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
>> 
>> This PR changes nothing about the first point.
>> 
>> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
>> If libgraal is not enabled, `-XX:+EnableJVMCI` will continue to add `jdk.internal.vm.ci` to the root module set.
>> 
>> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
>> 
>> Graal adaption PR: https://github.com/oracle/graal/pull/11212
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   swapped order of recommended options in error message

Thanks for the reviews.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25240#issuecomment-2898984593

From dnsimon at openjdk.org  Wed May 21 19:24:01 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 19:24:01 GMT
Subject: Integrated: 8345826: Do not automatically resolve jdk.internal.vm.ci
 when libgraal is used
In-Reply-To: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
References: <PlJ8ZpJyyH6rzVIww3DxcQ_Yi8jelxrf9YidjoaXP0Y=.0701541e-38db-4bc5-b2d5-ae5ebf89b362@github.com>
Message-ID: <n9mFQC9jyXyL94dEfnWas-YHAsHDMnUayY9h8bdhYYM=.6121fa6f-9877-4172-9af8-0505fc64a889@github.com>

On Wed, 14 May 2025 22:00:30 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> The `EnableJVMCI` flag currently serves 2 purposes:
> * Guards VM code ([example](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/sharedRuntime.cpp#L652)).
> * [Adds](https://github.com/openjdk/jdk/blob/b1e778d9d2ad13ee5f1ed629a8805008580f86c0/src/hotspot/share/runtime/arguments.cpp#L1804) `jdk.internal.vm.ci` to the root module set.
> 
> This PR changes nothing about the first point.
> 
> On the second point, to use the `jdk.internal.vm.ci` module when libgraal is enabled, `-XX:+EnableJVMCI` must be explicitly specified to the launcher (as opposed to being true as a result of [`-XX:+UseJVMCICompiler`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L88) or [`-XX:+EnableJVMCIProduct`](https://github.com/openjdk/jdk/blob/76570c627db527f856f2394fb9ead02939eca621/src/hotspot/share/jvmci/jvmci_globals.cpp#L64)). Alternatively, `--add-modules=jdk.internal.vm.ci` can be specified - it has the same semantics as `-XX:+EnableJVMCI`.
> If libgraal is not enabled, `-XX:+EnableJVMCI` will continue to add `jdk.internal.vm.ci` to the root module set.
> 
> The primary motivation is to make use of libgraal compatible with `-XX:+AOTClassLinking`. This flag relies on the root module set archive created in a training run. If the root module set is different in the production run, the AOTClassLinking [optimizations](https://bugs.openjdk.org/browse/JDK-8342279) are disabled. As `jdk.internal.vm.ci` is not resolved in the training run, it must not be resolved in production run. As such, `-XX:+EnableJVMCI` must not cause resolution of `jdk.internal.vm.ci`, otherwise libgraal will not have the startup advantages of AOTClassLinking.
> 
> Graal adaption PR: https://github.com/oracle/graal/pull/11212

This pull request has now been integrated.

Changeset: 81536830
Author:    Doug Simon <dnsimon at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/81536830ed096005c4f09ab446238ce50989cea9
Stats:     54 lines in 8 files changed: 31 ins; 15 del; 8 mod

8345826: Do not automatically resolve jdk.internal.vm.ci when libgraal is used

Reviewed-by: iklam, never, kvn

-------------

PR: https://git.openjdk.org/jdk/pull/25240
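
One way to observe the effect of the 8345826 change described above is to ask the boot module layer whether `jdk.internal.vm.ci` was resolved. The following is only a minimal sketch using the standard `ModuleLayer` API; the class name `JvmciModuleCheck` and the helper `isResolved` are hypothetical and not part of the PR.

```java
// Illustrative sketch only: JvmciModuleCheck and isResolved are hypothetical names.
final class JvmciModuleCheck {
    static final String JVMCI_MODULE = "jdk.internal.vm.ci";

    // Returns true if the named module is present in the boot layer,
    // i.e. it was resolved at startup (as a root module or as a dependency).
    static boolean isResolved(String moduleName) {
        return ModuleLayer.boot().findModule(moduleName).isPresent();
    }

    public static void main(String[] args) {
        System.out.println(JVMCI_MODULE + " resolved in boot layer: " + isResolved(JVMCI_MODULE));
    }
}
```

Going by the PR description, a run with `--add-modules=jdk.internal.vm.ci` or an explicit `-XX:+EnableJVMCI` would be expected to report the module as resolved, while a libgraal run that only sets `-XX:+UseJVMCICompiler` would not. That expectation is an assumption drawn from the description and has not been verified here.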

From dnsimon at openjdk.org  Wed May 21 20:41:35 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 20:41:35 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v3]
In-Reply-To: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
Message-ID: <KXwcoz2jSbc6Egsw6naaGAK7j3zCYz642jqEqmXVb-4=.418359fd-e093-422a-a1b7-7d0002bf5448@github.com>

> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
> 
> 
> Error occurred during initialization of VM
> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
> 
> 
> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
> Since these tests were passing on libgraal prior to JDK-8356447, they obviously do not require JIT compilation. The simplest fix is to then use `-Xint` to disable the JIT.

Doug Simon has updated the pull request incrementally with three additional commits since the last revision:

 - tests that use 'ulimit -v' should run with -Xint
 - Revert "do not exit VM if libjvmci env creation fails"
   
   This reverts commit 7eb259b92553669065db57d230476cf465a67d02.
 - Revert "consolidate JVMCI eager initialization"
   
   This reverts commit 32986d1a2b741ee8c9090cefbecc148bb8fbd7e4.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25307/files
  - new: https://git.openjdk.org/jdk/pull/25307/files/32986d1a..1a79617e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=01-02

  Stats: 55 lines in 9 files changed: 30 ins; 18 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/25307.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25307/head:pull/25307

PR: https://git.openjdk.org/jdk/pull/25307
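
The approach adopted above for 8357135 (limit the address space with `ulimit -v`, then start the child JVM with `-Xint` so no JIT compiler has to be initialized) can be sketched roughly as follows. This is not the code of the affected tests; `UlimitXintLauncher`, `command`, and the 4 GB limit are hypothetical choices for illustration, and a POSIX `sh` is assumed.

```java
import java.util.List;

// Illustrative sketch only: names and the chosen limit are hypothetical.
final class UlimitXintLauncher {

    // Builds: sh -c 'ulimit -v <kb>; exec "$0" -Xint "$1"' <java> <javaArg>
    // $0 and $1 are filled in by sh from the arguments after the script.
    static List<String> command(String javaExecutable, long virtualMemoryKb, String javaArg) {
        String script = "ulimit -v " + virtualMemoryKb + "; exec \"$0\" -Xint \"$1\"";
        return List.of("sh", "-c", script, javaExecutable, javaArg);
    }

    public static void main(String[] args) throws Exception {
        String java = System.getProperty("java.home") + "/bin/java";
        // 4 GB of virtual address space, expressed in kilobytes as ulimit -v expects.
        Process p = new ProcessBuilder(command(java, 4L * 1024 * 1024, "-version"))
                .inheritIO()
                .start();
        System.out.println("exit code: " + p.waitFor());
    }
}
```

Whether the child JVM starts under a given limit depends on the platform and heap settings, so the sketch only prints the exit code rather than asserting on it.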

From dnsimon at openjdk.org  Wed May 21 20:41:35 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 20:41:35 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v2]
In-Reply-To: <PdWyfY6AypgX5fJhMbM70SvMuqurNKnD2i13XCIF3i4=.5fe14740-8805-40c9-a895-4f449eb0cfc8@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
 <PpMS7gcHWmKh7TdNWrLS7y9VnPsAQE3nSiKpTYh8nKI=.6e0e4e22-2a48-4bad-a150-c0dd9fd649f8@github.com>
 <PdWyfY6AypgX5fJhMbM70SvMuqurNKnD2i13XCIF3i4=.5fe14740-8805-40c9-a895-4f449eb0cfc8@github.com>
Message-ID: <x6YYLGIP4rL_S9spEvdzB-LPYbfFZUJN2gy9OUpT8Rg=.8f39e117-1b7c-48f4-8e16-abc1f75f78a2@github.com>

On Wed, 21 May 2025 17:53:13 GMT, Tom Rodriguez <never at openjdk.org> wrote:

> Or both of those tests could probably just run with -Xint to avoid this completely.

I've reverted to this solution - thanks for the suggestion.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25307#issuecomment-2899176436

From dnsimon at openjdk.org  Wed May 21 20:46:04 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 20:46:04 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v4]
In-Reply-To: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
Message-ID: <ggzDLe2OsLGq034gA580NjmSsAjxIQVY93R5OIOXZTA=.f7f5b6a3-e52d-4b6c-8762-4c2fefa93ca8@github.com>

> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
> 
> 
> Error occurred during initialization of VM
> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
> 
> 
> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
> Since these tests were passing on libgraal prior to JDK-8356447, they obviously do not require JIT compilation. The simplest fix is to then use `-Xint` to disable the JIT.

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  added comments justifying use of -Xint

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25307/files
  - new: https://git.openjdk.org/jdk/pull/25307/files/1a79617e..b0d45b1b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=02-03

  Stats: 7 lines in 2 files changed: 5 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/25307.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25307/head:pull/25307

PR: https://git.openjdk.org/jdk/pull/25307

From dnsimon at openjdk.org  Wed May 21 20:46:05 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 20:46:05 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v3]
In-Reply-To: <KXwcoz2jSbc6Egsw6naaGAK7j3zCYz642jqEqmXVb-4=.418359fd-e093-422a-a1b7-7d0002bf5448@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
 <KXwcoz2jSbc6Egsw6naaGAK7j3zCYz642jqEqmXVb-4=.418359fd-e093-422a-a1b7-7d0002bf5448@github.com>
Message-ID: <4TOJwaT4xDVYnzB1co2JKSILNBV5lwBUduMZHRtquSU=.754489ed-035f-427b-8903-f5edcd0309cd@github.com>

On Wed, 21 May 2025 20:41:35 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
>> 
>> 
>> Error occurred during initialization of VM
>> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
>> 
>> 
>> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
>> Since these tests were passing on libgraal prior to JDK-8356447, they obviously do not require JIT compilation. The simplest fix is to then use `-Xint` to disable the JIT.
>
> Doug Simon has updated the pull request incrementally with three additional commits since the last revision:
> 
>  - tests that use 'ulimit -v' should run with -Xint
>  - Revert "do not exit VM if libjvmci env creation fails"
>    
>    This reverts commit 7eb259b92553669065db57d230476cf465a67d02.
>  - Revert "consolidate JVMCI eager initialization"
>    
>    This reverts commit 32986d1a2b741ee8c9090cefbecc148bb8fbd7e4.

Tested locally with a build that includes libgraal.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25307#issuecomment-2899184608

From dnsimon at openjdk.org  Wed May 21 20:59:33 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 20:59:33 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v5]
In-Reply-To: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
Message-ID: <ZUzzarcaTuOdJ6S-gqlrdQ9LTCRz2L0aaF6Y6dEwHl8=.4d4a5b7c-f67d-4e29-ac80-9f7fb5a7e177@github.com>

> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
> 
> 
> Error occurred during initialization of VM
> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
> 
> 
> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
> Since these tests were passing on libgraal prior to JDK-8356447, they obviously do not require JIT compilation. The simplest fix is to then use `-Xint` to disable the JIT.

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  removed trailing space

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25307/files
  - new: https://git.openjdk.org/jdk/pull/25307/files/b0d45b1b..3201a5d6

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25307&range=03-04

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25307.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25307/head:pull/25307

PR: https://git.openjdk.org/jdk/pull/25307

From dnsimon at openjdk.org  Wed May 21 21:07:23 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 21:07:23 GMT
Subject: RFR: 8357506: [JVMCI] Consolidate eager JVMCI initialization code
Message-ID: <4rwSBU4yySv789oCHmEoFgVsOqj7ZJ65owldXBviF-s=.9b15ee7e-32a0-4a54-b39c-bf1a440b1dd1@github.com>

While working on [JDK-8357135](https://bugs.openjdk.org/browse/JDK-8357135), I was reminded that some of the code implementing eager JVMCI compiler initialization (i.e. `-XX:+EagerJVMCI`) is in helper methods such as `JVMCIRuntime::call_getCompiler` that sound general purpose but are only used for eager JVMCI compiler initialization. This PR inlines `JVMCIRuntime::call_getCompiler` and renames `JVMCI::initialize_compiler` to `initialize_compiler_in_create_vm` to make its single use case clearer.

-------------

Commit messages:
 - consolidate JVMCI eager initialization

Changes: https://git.openjdk.org/jdk/pull/25369/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25369&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357506
  Stats: 25 lines in 6 files changed: 5 ins; 14 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/25369.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25369/head:pull/25369

PR: https://git.openjdk.org/jdk/pull/25369

From dnsimon at openjdk.org  Wed May 21 21:15:52 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 21:15:52 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod [v2]
In-Reply-To: <KhHLOeEBxRa3pPyA2GWDB_1QItvblLXMZ0TumGQas68=.d6af7e0c-7176-4713-b131-834f3b145947@github.com>
References: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
 <KhHLOeEBxRa3pPyA2GWDB_1QItvblLXMZ0TumGQas68=.d6af7e0c-7176-4713-b131-834f3b145947@github.com>
Message-ID: <mM1k2bSpD4MT56Vj3z1ZOxBrLE01_GWqAT92Kpp8guc=.4915a1ee-a837-49e9-a241-cf6b3761a356@github.com>

On Wed, 21 May 2025 15:10:30 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update copyright

src/hotspot/share/jvmci/jvmciRuntime.hpp line 49:

> 47:   friend class JVMCIVMStructs;
> 48: 
> 49:   // Is HotSpotNmethod.name non-null? If so, the value is

This comment needs to be moved inside the bitfield struct above `_has_name`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25356#discussion_r2101179755

From dnsimon at openjdk.org  Wed May 21 21:22:56 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Wed, 21 May 2025 21:22:56 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod [v2]
In-Reply-To: <KhHLOeEBxRa3pPyA2GWDB_1QItvblLXMZ0TumGQas68=.d6af7e0c-7176-4713-b131-834f3b145947@github.com>
References: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
 <KhHLOeEBxRa3pPyA2GWDB_1QItvblLXMZ0TumGQas68=.d6af7e0c-7176-4713-b131-834f3b145947@github.com>
Message-ID: <I-nGwB8hQugEyzacLfrbAJQ-7Cah5Ce75Js7qx-qGkU=.ddef2670-e5aa-45d3-9bb0-6ed9efb7d056@github.com>

On Wed, 21 May 2025 15:10:30 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update copyright

src/hotspot/share/code/nmethod.cpp line 2061:

> 2059: #if INCLUDE_JVMCI
> 2060:       if (jvmci_nmethod_data() != nullptr && !jvmci_nmethod_data()->is_default()) {
> 2061:         // Hosted compilations are not subject to the recompilation cutoff

Suggestion:

        // Non-default (i.e., non-CompileBroker) compilations are not subject to the recompilation cutoff

"hosted compilations" can be confusing (even though I see we unfortunately already use it elsewhere in JVMCI)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25356#discussion_r2101189921
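
The behaviour under review in 8357424 can be pictured with a small toy model. It is not HotSpot code: `DecompileCutoffModel`, its methods, and the cutoff value of 3 are hypothetical, and the real counter and the `PerMethodRecompilationCutoff` check live in the VM's C++ sources. The sketch only shows the idea that deoptimizations of non-default (non-CompileBroker) nmethods no longer count toward the per-method cutoff.

```java
// Toy model only: all names and the cutoff value are hypothetical.
final class DecompileCutoffModel {
    // Stand-in for the PerMethodRecompilationCutoff flag value.
    static final int CUTOFF = 3;

    private int decompileCount;

    // Called when an nmethod for this method is deoptimized.
    // Only default (CompileBroker) compilations contribute to the count.
    void onDeoptimized(boolean isDefaultCompilation) {
        if (isDefaultCompilation) {
            decompileCount++;
        }
    }

    int decompileCount() {
        return decompileCount;
    }

    // Top-tier compilation stays allowed until the count exceeds the cutoff.
    boolean topTierAllowed() {
        return decompileCount <= CUTOFF;
    }

    public static void main(String[] args) {
        DecompileCutoffModel method = new DecompileCutoffModel();
        for (int i = 0; i < 10; i++) {
            method.onDeoptimized(false); // non-default nmethods: not counted
        }
        System.out.println("top tier allowed: " + method.topTierAllowed());
    }
}
```

In this model, any number of deoptimizations of non-default nmethods leaves the method eligible for top-tier compilation, which matches the intent stated in the PR description.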

From yzheng at openjdk.org  Thu May 22 07:50:05 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Thu, 22 May 2025 07:50:05 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod [v3]
In-Reply-To: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
References: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
Message-ID: <1ZsklUTLqTRFvGHmsHFj45Mh_rvdmoxecbJ-hpkFqms=.0ff8dc74-2fd8-4c6f-97fb-e24b33cbd6d0@github.com>

> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.

Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:

  address comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25356/files
  - new: https://git.openjdk.org/jdk/pull/25356/files/ef4a4c98..e66c16d1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=01-02

  Stats: 7 lines in 2 files changed: 4 ins; 2 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25356.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25356/head:pull/25356

PR: https://git.openjdk.org/jdk/pull/25356

From yzheng at openjdk.org  Thu May 22 08:04:35 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Thu, 22 May 2025 08:04:35 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod [v4]
In-Reply-To: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
References: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
Message-ID: <jV_EaLvJc6R7LOqAvuY4xjBuB9z7p0fL31Hk0A0bGp8=.7e480681-b385-4bbc-b0ab-33c77eee6e7a@github.com>

> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.

Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:

  address comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25356/files
  - new: https://git.openjdk.org/jdk/pull/25356/files/e66c16d1..b72213ae

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25356.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25356/head:pull/25356

PR: https://git.openjdk.org/jdk/pull/25356

From dnsimon at openjdk.org  Thu May 22 08:35:51 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 22 May 2025 08:35:51 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod [v4]
In-Reply-To: <jV_EaLvJc6R7LOqAvuY4xjBuB9z7p0fL31Hk0A0bGp8=.7e480681-b385-4bbc-b0ab-33c77eee6e7a@github.com>
References: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
 <jV_EaLvJc6R7LOqAvuY4xjBuB9z7p0fL31Hk0A0bGp8=.7e480681-b385-4bbc-b0ab-33c77eee6e7a@github.com>
Message-ID: <fdPTiwJHxeF9s6igXVu0rHMUvRKBFHlqLm6Yg9ZPTdM=.17333d9f-5bab-413f-b876-bb6c20d25a4f@github.com>

On Thu, 22 May 2025 08:04:35 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   address comments

Looks good.

-------------

Marked as reviewed by dnsimon (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25356#pullrequestreview-2860294240

From never at openjdk.org  Thu May 22 15:34:00 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Thu, 22 May 2025 15:34:00 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v5]
In-Reply-To: <ZUzzarcaTuOdJ6S-gqlrdQ9LTCRz2L0aaF6Y6dEwHl8=.4d4a5b7c-f67d-4e29-ac80-9f7fb5a7e177@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
 <ZUzzarcaTuOdJ6S-gqlrdQ9LTCRz2L0aaF6Y6dEwHl8=.4d4a5b7c-f67d-4e29-ac80-9f7fb5a7e177@github.com>
Message-ID: <T5-d2-vY5A8VSf9NMsIdQagThU2DOB5Ks8XMWAM9FKQ=.14228234-dc92-465c-a580-6208fc6e4015@github.com>

On Wed, 21 May 2025 20:59:33 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
>> 
>> 
>> Error occurred during initialization of VM
>> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
>> 
>> 
>> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
>> Since these tests were passing on libgraal prior to JDK-8356447, they obviously do not require JIT compilation. The simplest fix is to then use `-Xint` to disable the JIT.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   removed trailing space

This seems reasonable to me.

-------------

Marked as reviewed by never (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25307#pullrequestreview-2861697506

From dnsimon at openjdk.org  Thu May 22 17:03:59 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 22 May 2025 17:03:59 GMT
Subject: RFR: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447 [v5]
In-Reply-To: <ZUzzarcaTuOdJ6S-gqlrdQ9LTCRz2L0aaF6Y6dEwHl8=.4d4a5b7c-f67d-4e29-ac80-9f7fb5a7e177@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
 <ZUzzarcaTuOdJ6S-gqlrdQ9LTCRz2L0aaF6Y6dEwHl8=.4d4a5b7c-f67d-4e29-ac80-9f7fb5a7e177@github.com>
Message-ID: <ur3l29q105XUXEBvRj88AlrPSJ4bcXc0GLLcYf2VQR0=.7e280669-b5da-4d9e-9ed6-9df12a088b87@github.com>

On Wed, 21 May 2025 20:59:33 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
>> 
>> 
>> Error occurred during initialization of VM
>> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
>> 
>> 
>> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
>> Since these tests were passing on libgraal prior to JDK-8356447, they obviously do not require JIT compilation. The simplest fix is to then use `-Xint` to disable the JIT.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   removed trailing space

Thanks for the reviews.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25307#issuecomment-2901962121

From dnsimon at openjdk.org  Thu May 22 17:04:00 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 22 May 2025 17:04:00 GMT
Subject: Integrated: 8357135: java.lang.OutOfMemoryError: Error creating or
 attaching to libjvmci after JDK-8356447
In-Reply-To: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
References: <ae8M3ac5iIemGBew5gSj4PVgo7e_Cs5Nq1_x6dRfKJE=.4b9f3be1-ebc5-4171-b097-732c4d2f48af@github.com>
Message-ID: <wPSMk_sj96mlPP_1SqeupUeopIKpZquj6DnlHRwo8cM=.7cea2924-90a1-4e91-a7f6-b7547b343cea@github.com>

On Mon, 19 May 2025 17:50:21 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> As of [JDK-8356447](https://bugs.openjdk.org/browse/JDK-8356447), libgraal initialization happens during VM startup. If during this initialization, the libgraal heap cannot be created due to lack of virtual address space, the VM will exit with:
> 
> 
> Error occurred during initialization of VM
> java.lang.OutOfMemoryError: Error creating or attaching to libjvmci (err: -1000000801, description: Reserving address space for the new isolate failed.)
> 
> 
> This causes problems for tests that limit the virtual address space with `ulimit -v` such as `gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java` and `vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java`.
> Since these tests were passing on libgraal prior to JDK-8356447, they obviously do not require JIT compilation. The simplest fix is to then use `-Xint` to disable the JIT.

This pull request has now been integrated.

Changeset: 1258af42
Author:    Doug Simon <dnsimon at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/1258af42bec92a2797897cb6126b60b582a29d76
Stats:     7 lines in 2 files changed: 7 ins; 0 del; 0 mod

8357135: java.lang.OutOfMemoryError: Error creating or attaching to libjvmci after JDK-8356447

Reviewed-by: never, yzheng

-------------

PR: https://git.openjdk.org/jdk/pull/25307

From dnsimon at openjdk.org  Thu May 22 17:24:08 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 22 May 2025 17:24:08 GMT
Subject: RFR: 8357506: [JVMCI] Consolidate eager JVMCI initialization code
 [v2]
In-Reply-To: <4rwSBU4yySv789oCHmEoFgVsOqj7ZJ65owldXBviF-s=.9b15ee7e-32a0-4a54-b39c-bf1a440b1dd1@github.com>
References: <4rwSBU4yySv789oCHmEoFgVsOqj7ZJ65owldXBviF-s=.9b15ee7e-32a0-4a54-b39c-bf1a440b1dd1@github.com>
Message-ID: <ZF8xnP5iD7wSQ3dOnRLJ-aUtyM4_pvgsIGzPxfVNiFE=.1ae6abbf-3515-476d-b1e3-71dc6b2ed58d@github.com>

> While working on [JDK-8357135](https://bugs.openjdk.org/browse/JDK-8357135), I was reminded that some of the code implementing eager JVMCI compiler initialization (i.e. `-XX:+EagerJVMCI`) is in helper methods such as `JVMCIRuntime::call_getCompiler` that sound general purpose but are only used for eager JVMCI compiler initialization. This PR inlines `JVMCIRuntime::call_getCompiler` and renames `JVMCI::initialize_compiler` to `initialize_compiler_in_create_vm` to make its single use case clearer.

Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  consolidate JVMCI eager initialization

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25369/files
  - new: https://git.openjdk.org/jdk/pull/25369/files/c069487d..3b4bf20e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25369&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25369&range=00-01

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/25369.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25369/head:pull/25369

PR: https://git.openjdk.org/jdk/pull/25369

From dnsimon at openjdk.org  Thu May 22 18:03:31 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 22 May 2025 18:03:31 GMT
Subject: RFR: 8357581: [JVMCI] Add ProfilingInfo.getDecompileCount
Message-ID: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>

Graal is adding enhanced logic to detect deoptimization cycles and needs to be able to query a method's decompilation counter (i.e. `MethodData::_compiler_counters._nof_decompiles`).
This PR adds the `HotSpotProfilingInfo` interface so that such HotSpot-specific profiling info can be accessed.
The change looks bigger in the GitHub review UI than it really is. I have simply renamed the pre-existing `HotSpotProfilingInfo` private class as `HotSpotProfilingInfoImpl` and repurposed the `HotSpotProfilingInfo` name for the *new* public interface.

-------------

Commit messages:
 - added HotSpotProfilingInfo

Changes: https://git.openjdk.org/jdk/pull/25397/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25397&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357581
  Stats: 235 lines in 5 files changed: 17 ins; 194 del; 24 mod
  Patch: https://git.openjdk.org/jdk/pull/25397.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25397/head:pull/25397

PR: https://git.openjdk.org/jdk/pull/25397

From never at openjdk.org  Thu May 22 18:32:53 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Thu, 22 May 2025 18:32:53 GMT
Subject: RFR: 8357581: [JVMCI] Add ProfilingInfo.getDecompileCount
In-Reply-To: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
References: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
Message-ID: <dR1K8EAVTwMmUpKDR0kPaaTsQAOSjd04xTShZYSeNCs=.e0d084b2-52b9-4366-82ad-de6451cad817@github.com>

On Thu, 22 May 2025 17:12:34 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> Graal is adding enhanced logic to detect deoptimization cycles and needs to be able to query a method's decompilation counter (i.e. `MethodData::_compiler_counters._nof_decompiles`).
> This PR adds the `HotSpotProfilingInfo` interface so that such HotSpot-specific profiling info can be accessed.
> The change looks bigger in the GitHub review UI than it really is. I have simply renamed the pre-existing `HotSpotProfilingInfo` private class as `HotSpotProfilingInfoImpl` and repurposed the `HotSpotProfilingInfo` name for the *new* public interface.

Looks good.

-------------

Marked as reviewed by never (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25397#pullrequestreview-2862216006

From kvn at openjdk.org  Thu May 22 22:01:57 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Thu, 22 May 2025 22:01:57 GMT
Subject: RFR: 8357581: [JVMCI] Add HotSpotProfilingInfo
In-Reply-To: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
References: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
Message-ID: <T_xnCxUJ4sqES4bYQqpoVueDgntb2F3e30jCTIlCYGc=.bca14599-9dea-4586-86bd-dfa864883781@github.com>

On Thu, 22 May 2025 17:12:34 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> Graal is adding enhanced logic to detect deoptimization cycles and needs to be able to query a method's decompilation counter (i.e. `MethodData::_compiler_counters._nof_decompiles`).
> This PR adds the `HotSpotProfilingInfo` interface so that such HotSpot-specific profiling info can be accessed.
> The change looks bigger in the GitHub review UI than it really is. I have simply renamed the pre-existing `HotSpotProfilingInfo` private class as `HotSpotProfilingInfoImpl` and repurposed the `HotSpotProfilingInfo` name for the *new* public interface.

Just one cosmetic comment about copyright year.

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotProfilingInfo.java line 2:

> 1: /*
> 2:  * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.

Please keep both years: 2012, 2025. Even though you changed the content, the file itself still exists.

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotProfilingInfoImpl.java line 2:

> 1: /*
> 2:  * Copyright (c) 2012, 2025, Oracle and/or its affiliates. All rights reserved.

This one is fine since you copied it from another file.

-------------

PR Review: https://git.openjdk.org/jdk/pull/25397#pullrequestreview-2862651511
PR Review Comment: https://git.openjdk.org/jdk/pull/25397#discussion_r2103452147
PR Review Comment: https://git.openjdk.org/jdk/pull/25397#discussion_r2103452864

From kvn at openjdk.org  Thu May 22 22:05:01 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Thu, 22 May 2025 22:05:01 GMT
Subject: RFR: 8357506: [JVMCI] Consolidate eager JVMCI initialization code
 [v2]
In-Reply-To: <ZF8xnP5iD7wSQ3dOnRLJ-aUtyM4_pvgsIGzPxfVNiFE=.1ae6abbf-3515-476d-b1e3-71dc6b2ed58d@github.com>
References: <4rwSBU4yySv789oCHmEoFgVsOqj7ZJ65owldXBviF-s=.9b15ee7e-32a0-4a54-b39c-bf1a440b1dd1@github.com>
 <ZF8xnP5iD7wSQ3dOnRLJ-aUtyM4_pvgsIGzPxfVNiFE=.1ae6abbf-3515-476d-b1e3-71dc6b2ed58d@github.com>
Message-ID: <QSq9tQOvH3a-UM2PGIqwHF13K5bFThV06Y_h-Kc8k0k=.d8451e43-0610-45af-8147-975631f92fd6@github.com>

On Thu, 22 May 2025 17:24:08 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> While working on [JDK-8357135](https://bugs.openjdk.org/browse/JDK-8357135), I was reminded that some of the code implementing eager JVMCI compiler initialization (i.e. `-XX:+EagerJVMCI`) is in helper methods such as `JVMCIRuntime::call_getCompiler` that sound general purpose but are only used for eager JVMCI compiler initialization. This PR inlines `JVMCIRuntime::call_getCompiler` and renames `JVMCI::initialize_compiler` to `initialize_compiler_in_create_vm` to make its single use case clearer.
>
> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   consolidate JVMCI eager initialization

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25369#pullrequestreview-2862661991

From cslucas at openjdk.org  Thu May 22 22:38:12 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Thu, 22 May 2025 22:38:12 GMT
Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead
 of "const char*"
Message-ID: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>

Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values.

Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode.

-------------

Commit messages:
 - Refactor nmethod make_not_entrant reason

Changes: https://git.openjdk.org/jdk/pull/25338/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357396
  Stats: 60 lines in 15 files changed: 26 ins; 4 del; 30 mod
  Patch: https://git.openjdk.org/jdk/pull/25338.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25338/head:pull/25338

PR: https://git.openjdk.org/jdk/pull/25338

From dnsimon at openjdk.org  Fri May 23 06:19:42 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 23 May 2025 06:19:42 GMT
Subject: RFR: 8357581: [JVMCI] Add HotSpotProfilingInfo [v2]
In-Reply-To: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
References: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
Message-ID: <nBkLbiS6kpy2_gOztAT8HHgQb2A8ADRf2KV4L3DAqLA=.6faf0fc5-b8fb-4d93-ad73-eeca2bbe9fbd@github.com>

> Graal is adding enhanced logic to detect deoptimization cycles and needs to be able to query a method's decompilation counter (i.e. `MethodData::_compiler_counters._nof_decompiles`).
> This PR adds the `HotSpotProfilingInfo` interface so that such HotSpot-specific profiling info can be accessed.
> The change looks bigger in the GitHub review UI than it really is. I have simply renamed the pre-existing `HotSpotProfilingInfo` private class as `HotSpotProfilingInfoImpl` and repurposed the `HotSpotProfilingInfo` name for the *new* public interface.

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  fix copyright

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25397/files
  - new: https://git.openjdk.org/jdk/pull/25397/files/d95475b0..12a9a059

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25397&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25397&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25397.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25397/head:pull/25397

PR: https://git.openjdk.org/jdk/pull/25397

From dnsimon at openjdk.org  Fri May 23 06:19:42 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 23 May 2025 06:19:42 GMT
Subject: RFR: 8357581: [JVMCI] Add HotSpotProfilingInfo [v2]
In-Reply-To: <T_xnCxUJ4sqES4bYQqpoVueDgntb2F3e30jCTIlCYGc=.bca14599-9dea-4586-86bd-dfa864883781@github.com>
References: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
 <T_xnCxUJ4sqES4bYQqpoVueDgntb2F3e30jCTIlCYGc=.bca14599-9dea-4586-86bd-dfa864883781@github.com>
Message-ID: <THzQf5kjSa8X1V7J01trcvxBi79PoIZRpp_JIdBqzMc=.4fd81cef-9578-4e7f-be18-3438eb5de5bc@github.com>

On Thu, 22 May 2025 21:56:06 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   fix copyright
>
> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotProfilingInfo.java line 2:
> 
>> 1: /*
>> 2:  * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
> 
> Please keep both years: 2012, 2025. Even though you changed the content, the file itself still exists.

Fixed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25397#discussion_r2103880075

From yzheng at openjdk.org  Fri May 23 06:35:56 2025
From: yzheng at openjdk.org (Yudi Zheng)
Date: Fri, 23 May 2025 06:35:56 GMT
Subject: RFR: 8357506: [JVMCI] Consolidate eager JVMCI initialization code
 [v2]
In-Reply-To: <ZF8xnP5iD7wSQ3dOnRLJ-aUtyM4_pvgsIGzPxfVNiFE=.1ae6abbf-3515-476d-b1e3-71dc6b2ed58d@github.com>
References: <4rwSBU4yySv789oCHmEoFgVsOqj7ZJ65owldXBviF-s=.9b15ee7e-32a0-4a54-b39c-bf1a440b1dd1@github.com>
 <ZF8xnP5iD7wSQ3dOnRLJ-aUtyM4_pvgsIGzPxfVNiFE=.1ae6abbf-3515-476d-b1e3-71dc6b2ed58d@github.com>
Message-ID: <vG386490G_jrTP-cTJEIBrYXFc5_h4MVa8szWOaze44=.ffd36ede-865b-4ebf-82ee-efb7c4fd7a20@github.com>

On Thu, 22 May 2025 17:24:08 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> While working on [JDK-8357135](https://bugs.openjdk.org/browse/JDK-8357135), I was reminded that some of the code implementing eager JVMCI compiler initialization (i.e. `-XX:+EagerJVMCI`) is in helper methods such as `JVMCIRuntime::call_getCompiler` that sound general purpose but are only used for eager JVMCI compiler initialization. This PR inlines `JVMCIRuntime::call_getCompiler` and renames `JVMCI::initialize_compiler` to `initialize_compiler_in_create_vm` to make its single use case clearer.
>
> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   consolidate JVMCI eager initialization

LGTM

-------------

Marked as reviewed by yzheng (Committer).

PR Review: https://git.openjdk.org/jdk/pull/25369#pullrequestreview-2863339517

From dnsimon at openjdk.org  Fri May 23 06:35:57 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 23 May 2025 06:35:57 GMT
Subject: RFR: 8357506: [JVMCI] Consolidate eager JVMCI initialization code
 [v2]
In-Reply-To: <ZF8xnP5iD7wSQ3dOnRLJ-aUtyM4_pvgsIGzPxfVNiFE=.1ae6abbf-3515-476d-b1e3-71dc6b2ed58d@github.com>
References: <4rwSBU4yySv789oCHmEoFgVsOqj7ZJ65owldXBviF-s=.9b15ee7e-32a0-4a54-b39c-bf1a440b1dd1@github.com>
 <ZF8xnP5iD7wSQ3dOnRLJ-aUtyM4_pvgsIGzPxfVNiFE=.1ae6abbf-3515-476d-b1e3-71dc6b2ed58d@github.com>
Message-ID: <g4JLixwok1fd3Wz0q0NhePgn35SY35UpQjjGPHH9-LE=.e9a5523f-1ffb-4ec5-94f1-9f42a4ee94e5@github.com>

On Thu, 22 May 2025 17:24:08 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> While working on [JDK-8357135](https://bugs.openjdk.org/browse/JDK-8357135), I was reminded that some of the code implementing eager JVMCI compiler initialization (i.e. `-XX:+EagerJVMCI`) is in helper methods such as `JVMCIRuntime::call_getCompiler` that sound general purpose but are only used for eager JVMCI compiler initialization. This PR inlines `JVMCIRuntime::call_getCompiler` and renames `JVMCI::initialize_compiler` to `initialize_compiler_in_create_vm` to make its single use case clearer.
>
> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   consolidate JVMCI eager initialization

Thanks for the reviews.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25369#issuecomment-2903410300

From dnsimon at openjdk.org  Fri May 23 06:35:57 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 23 May 2025 06:35:57 GMT
Subject: Integrated: 8357506: [JVMCI] Consolidate eager JVMCI initialization
 code
In-Reply-To: <4rwSBU4yySv789oCHmEoFgVsOqj7ZJ65owldXBviF-s=.9b15ee7e-32a0-4a54-b39c-bf1a440b1dd1@github.com>
References: <4rwSBU4yySv789oCHmEoFgVsOqj7ZJ65owldXBviF-s=.9b15ee7e-32a0-4a54-b39c-bf1a440b1dd1@github.com>
Message-ID: <DxIhtuHWuYRocP1ciOA5Em7XcXnwg4MOo7RxZz4Gmp0=.7fcf2045-9234-46fd-a9ee-065a8bea707d@github.com>

On Wed, 21 May 2025 20:58:23 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> While working on [JDK-8357135](https://bugs.openjdk.org/browse/JDK-8357135), I was reminded that some of the code implementing eager JVMCI compiler initialization (i.e. `-XX:+EagerJVMCI`) is in helper methods such as `JVMCIRuntime::call_getCompiler` that sound general purpose but are only used for eager JVMCI compiler initialization. This PR inlines `JVMCIRuntime::call_getCompiler` and renames `JVMCI::initialize_compiler` to `initialize_compiler_in_create_vm` to make its single use case clearer.

This pull request has now been integrated.

Changeset: d6e4c5f6
Author:    Doug Simon <dnsimon at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/d6e4c5f65932114b5c6f455db6cfaa220607ce18
Stats:     25 lines in 6 files changed: 5 ins; 14 del; 6 mod

8357506: [JVMCI] Consolidate eager JVMCI initialization code

Reviewed-by: kvn, yzheng

-------------

PR: https://git.openjdk.org/jdk/pull/25369

From mhaessig at openjdk.org  Fri May 23 07:43:51 2025
From: mhaessig at openjdk.org (Manuel Hässig)
Date: Fri, 23 May 2025 07:43:51 GMT
Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum
 instead of "const char*"
In-Reply-To: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>
References: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>
Message-ID: <2INMtrpMVMQg0FbhXDN51Snx7cg9jweK4ym464FMHes=.e6362690-ed9c-4395-ae75-a8e754a20fc6@github.com>

On Tue, 20 May 2025 22:08:18 GMT, Cesar Soares Lucas <cslucas at openjdk.org> wrote:

> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values.
> 
> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode.

Thank you for working on this. I agree that an enum is the better option here.
However, a scoped enum might be more appropriate here. For one, that is the guidance in the [style guide](https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md#enum). Secondly, I would argue that this enum should have some `as_string`-like function so the reason is still printed in plain text instead of an int. That spares me going into the source and counting down the enum when I'm debugging a deopt. Once the code does not rely on implicit conversion to print an int, we do not care about the underlying type.

-------------

PR Review: https://git.openjdk.org/jdk/pull/25338#pullrequestreview-2863515112

From mchevalier at openjdk.org  Fri May 23 07:53:52 2025
From: mchevalier at openjdk.org (Marc Chevalier)
Date: Fri, 23 May 2025 07:53:52 GMT
Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum
 instead of "const char*"
In-Reply-To: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>
References: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>
Message-ID: <Japtl-MHfI7UBNE7x9KmwZ4HoR3y2Y2oC8FSnaZIQpI=.2fbd6005-745d-4417-8200-d2dccb8e3031@github.com>

On Tue, 20 May 2025 22:08:18 GMT, Cesar Soares Lucas <cslucas at openjdk.org> wrote:

> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values.
> 
> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode.

src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 1388:

> 1386: C2V_VMENTRY(void, invalidateHotSpotNmethod, (JNIEnv* env, jobject, jobject hs_nmethod, jboolean deoptimize))
> 1387:   JVMCIObject nmethod_mirror = JVMCIENV->wrap(hs_nmethod);
> 1388:   JVMCIENV->invalidate_nmethod_mirror(nmethod_mirror, deoptimize, nmethod::NMethodChangeReason::JVMCI_invalidate_nmethod, JVMCI_CHECK);

I agree with @mhaessig's comment. If further discussion ends up preferring the current `enum` over an `enum class`, then this `NMethodChangeReason::` qualifier is redundant.
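
To illustrate the difference, here is a small sketch with hypothetical names (`Holder`, `Unscoped`, `Scoped`), not the actual `nmethod` declarations. Enumerators of a plain nested `enum` are visible directly in the enclosing class, whereas an `enum class` requires the qualifier and an explicit cast to get an integer.

```cpp
// Hypothetical holder class standing in for a class with a nested enum.
struct Holder {
  enum Unscoped { plain_a, plain_b };   // enumerators leak into Holder
  enum class Scoped { a, b };           // enumerators stay inside Scoped
};

// For the plain enum, the extra qualifier is optional: both spellings
// name the same enumerator.
static_assert(Holder::plain_b == Holder::Unscoped::plain_b,
              "qualifier is redundant for an unscoped enum");

// A plain enum converts to int implicitly.
constexpr int unscoped_value() { return Holder::plain_b; }

// A scoped enum needs both the qualifier and an explicit cast.
constexpr int scoped_value() { return static_cast<int>(Holder::Scoped::b); }
```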

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2104027069

From shade at openjdk.org  Fri May 23 09:01:56 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 23 May 2025 09:01:56 GMT
Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum
 instead of "const char*"
In-Reply-To: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>
References: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>
Message-ID: <XmGDXtWQxBcj6-dvOUIzBCu5DRCjVrWnFEtlYLpaB3Y=.7532b987-3fa0-49bf-9570-993c9abd79cb@github.com>

On Tue, 20 May 2025 22:08:18 GMT, Cesar Soares Lucas <cslucas at openjdk.org> wrote:

> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values.
> 
> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode.

I added this argument to `make_not_entrant` recently in [JDK-8351640](https://bugs.openjdk.org/browse/JDK-8351640) -- mostly to print it in `PrintCompilation` logs. Using an enum might be fine, but it _has to_ maintain the same level of human readability. Do not just print `made not entrant: 42`.

-------------

Changes requested by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25338#pullrequestreview-2863731885

From kvn at openjdk.org  Fri May 23 15:57:53 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Fri, 23 May 2025 15:57:53 GMT
Subject: RFR: 8357581: [JVMCI] Add HotSpotProfilingInfo [v2]
In-Reply-To: <nBkLbiS6kpy2_gOztAT8HHgQb2A8ADRf2KV4L3DAqLA=.6faf0fc5-b8fb-4d93-ad73-eeca2bbe9fbd@github.com>
References: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
 <nBkLbiS6kpy2_gOztAT8HHgQb2A8ADRf2KV4L3DAqLA=.6faf0fc5-b8fb-4d93-ad73-eeca2bbe9fbd@github.com>
Message-ID: <He4NTXBt6lIj8gvvQx0Ea-r5JsWctNC2KUUoflF9ga8=.975bb7e8-fc4f-4639-b17d-33ad5823ad32@github.com>

On Fri, 23 May 2025 06:19:42 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> Graal is adding enhanced logic to detect deoptimization cycles and needs to be able to query a method's decompilation counter (i.e. `MethodData::_compiler_counters._nof_decompiles`).
>> This PR adds the `HotSpotProfilingInfo` interface so that such HotSpot-specific profiling info can be accessed.
>> The change looks bigger in the GitHub review UI than it really is. I have simply renamed the pre-existing `HotSpotProfilingInfo` private class as `HotSpotProfilingInfoImpl` and repurposed the `HotSpotProfilingInfo` name for the *new* public interface.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix copyright

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25397#pullrequestreview-2864916858

From dnsimon at openjdk.org  Fri May 23 16:33:04 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 23 May 2025 16:33:04 GMT
Subject: RFR: 8357581: [JVMCI] Add HotSpotProfilingInfo [v2]
In-Reply-To: <nBkLbiS6kpy2_gOztAT8HHgQb2A8ADRf2KV4L3DAqLA=.6faf0fc5-b8fb-4d93-ad73-eeca2bbe9fbd@github.com>
References: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
 <nBkLbiS6kpy2_gOztAT8HHgQb2A8ADRf2KV4L3DAqLA=.6faf0fc5-b8fb-4d93-ad73-eeca2bbe9fbd@github.com>
Message-ID: <aTCsrVT0fU0fH3bMkRDMTCXZiWNJVD3c5U1K0bSXFzY=.dc947bed-09d8-4bf8-a946-7e4dfd3b17b2@github.com>

On Fri, 23 May 2025 06:19:42 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> Graal is adding enhanced logic to detect deoptimization cycles and needs to be able to query a method's decompilation counter (i.e. `MethodData::_compiler_counters._nof_decompiles`).
>> This PR adds the `HotSpotProfilingInfo` interface so that such HotSpot-specific profiling info can be accessed.
>> The change looks bigger in the GitHub review UI than it really is. I have simply renamed the pre-existing `HotSpotProfilingInfo` private class as `HotSpotProfilingInfoImpl` and repurposed the `HotSpotProfilingInfo` name for the *new* public interface.
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix copyright

Thanks for the reviews.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25397#issuecomment-2905036890

From dnsimon at openjdk.org  Fri May 23 16:33:05 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 23 May 2025 16:33:05 GMT
Subject: Integrated: 8357581: [JVMCI] Add HotSpotProfilingInfo
In-Reply-To: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
References: <UzcQzrbW-H5V_53lRiAwBpppPyw2enbKS7bkJdbONSk=.a3d65cd3-0e53-4e77-b6af-0c5dab69f9af@github.com>
Message-ID: <mUHKeEFNwUda7AXyW7SoCSQgnRhEMmBXBDevmPgutYI=.2797f5e1-ad79-4650-a057-32ca03b257e4@github.com>

On Thu, 22 May 2025 17:12:34 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> Graal is adding enhanced logic to detect deoptimization cycles and needs to be able to query a method's decompilation counter (i.e. `MethodData::_compiler_counters._nof_decompiles`).
> This PR adds the `HotSpotProfilingInfo` interface so that such HotSpot-specific profiling info can be accessed.
> The change looks bigger in the GitHub review UI than it really is. I have simply renamed the pre-existing `HotSpotProfilingInfo` private class as `HotSpotProfilingInfoImpl` and repurposed the `HotSpotProfilingInfo` name for the *new* public interface.

This pull request has now been integrated.

Changeset: 2b6b7661
Author:    Doug Simon <dnsimon at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/2b6b7661b949971fe776714795d7dd46ed343cde
Stats:     235 lines in 5 files changed: 17 ins; 194 del; 24 mod

8357581: [JVMCI] Add HotSpotProfilingInfo

Reviewed-by: kvn, never

-------------

PR: https://git.openjdk.org/jdk/pull/25397

From cslucas at openjdk.org  Fri May 23 18:24:52 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Fri, 23 May 2025 18:24:52 GMT
Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum
 instead of "const char*"
In-Reply-To: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>
References: <OH8p3ETX34bE4gEHkW-TaQrxN9Lkp_blwjxZ9c9rgsw=.3e39bcd5-6499-4e4f-bddc-546bac5663a3@github.com>
Message-ID: <V0MQ0eYZTOlQsHpjlP66kjuLKJzLkt8nG4qmTa0MA7w=.876413ad-b335-499a-beb4-568c63a8e3ee@github.com>

On Tue, 20 May 2025 22:08:18 GMT, Cesar Soares Lucas <cslucas at openjdk.org> wrote:

> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values.
> 
> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode.

Thank you for the comments. I'll make the refactoring.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25338#issuecomment-2905409542

From duke at openjdk.org  Fri May 23 20:54:43 2025
From: duke at openjdk.org (Zihao Lin)
Date: Fri, 23 May 2025 20:54:43 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v7]
In-Reply-To: <Po0DIjZv6wmSdwNcL1BeN5s9xvih8YKDqaw7Io5wIl8=.82dc3319-de61-4afc-898c-a7550bf9c9ac@github.com>
References: <Po0DIjZv6wmSdwNcL1BeN5s9xvih8YKDqaw7Io5wIl8=.82dc3319-de61-4afc-898c-a7550bf9c9ac@github.com>
Message-ID: <tiztHtRuTMsrDXUPxefb6ckb1RUmtInrbQAdTTwd5Uk=.c98e7cfc-1811-47c0-938d-a684c022ea74@github.com>

> This patch removes the slice parameter from LoadNode::make.
> 
> Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805
> 
> Hi team, I am new here and would appreciate any guidance. Thanks a lot!

Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:

 - Merge branch 'openjdk:master' into 8344116
 - Merge branch 'openjdk:master' into 8344116
 - Fix build
 - Fix test failed
 - 8344116: C2: remove slice parameter from LoadNode::make

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24258/files
  - new: https://git.openjdk.org/jdk/pull/24258/files/3efb1c17..ea83736e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=05-06

  Stats: 393670 lines in 4531 files changed: 146248 ins; 225477 del; 21945 mod
  Patch: https://git.openjdk.org/jdk/pull/24258.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258

PR: https://git.openjdk.org/jdk/pull/24258

From duke at openjdk.org  Mon May 26 13:05:01 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Mon, 26 May 2025 13:05:01 GMT
Subject: RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v7]
In-Reply-To: <0ZhaH_07oxLDZxz8wVEgbsbYWB50sjuLZxYwyM4ftno=.2adb899d-7768-481d-975b-8e0ee3e6f2c2@github.com>
References: <EyhTUDIMxnzyPP14NYWuRlZXR9WTv2GaYApQO0nJ4do=.2ee863e3-fd15-41be-ac73-247b03144281@github.com>
 <DnmOKfbcTKPsR9G2Hq_6DLt5tAIuAZdy5eEzhxNSCcE=.6f869c95-3be6-42fa-90d4-5bf251272594@github.com>
 <0ZhaH_07oxLDZxz8wVEgbsbYWB50sjuLZxYwyM4ftno=.2adb899d-7768-481d-975b-8e0ee3e6f2c2@github.com>
Message-ID: <4Uc0-fOqIFIS5GFYXPTC6xp0WtcKrj9XNn_OEkl1N_I=.0ad95f85-4674-4ca3-a602-a965b97b699c@github.com>

On Tue, 20 May 2025 19:10:45 GMT, Sean Mullan <mullan at openjdk.org> wrote:

> Please also write a release note as the performance improvement is significant. Thanks!

Done. https://bugs.openjdk.org/browse/JDK-8357741 Release Note: ML-KEM Performance Improved

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24953#issuecomment-2909684771

From never at openjdk.org  Tue May 27 17:23:52 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Tue, 27 May 2025 17:23:52 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod [v4]
In-Reply-To: <jV_EaLvJc6R7LOqAvuY4xjBuB9z7p0fL31Hk0A0bGp8=.7e480681-b385-4bbc-b0ab-33c77eee6e7a@github.com>
References: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
 <jV_EaLvJc6R7LOqAvuY4xjBuB9z7p0fL31Hk0A0bGp8=.7e480681-b385-4bbc-b0ab-33c77eee6e7a@github.com>
Message-ID: <y_xRmf4a77WEDQYEXLQBbKB7-83wjkvcMyjYgYk7AcI=.4a09bb87-a100-4137-a505-6c516650a7b7@github.com>

On Thu, 22 May 2025 08:04:35 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   address comments

src/hotspot/share/code/nmethod.cpp line 2059:

> 2057: 
> 2058:     if (update_recompile_counts()) {
> 2059: #if INCLUDE_JVMCI

I think this logic should be in nmethod::inc_decompile_method itself.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25356#discussion_r2109762040

From never at openjdk.org  Tue May 27 17:32:57 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Tue, 27 May 2025 17:32:57 GMT
Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for
 hosted compiled nmethod [v4]
In-Reply-To: <jV_EaLvJc6R7LOqAvuY4xjBuB9z7p0fL31Hk0A0bGp8=.7e480681-b385-4bbc-b0ab-33c77eee6e7a@github.com>
References: <Tlf7gJojsuXSi0SBTjCvrgEwYfuc0kUB968V_jOZpTU=.38a4841e-8073-4aa2-a9f8-c2ea4395b2e1@github.com>
 <jV_EaLvJc6R7LOqAvuY4xjBuB9z7p0fL31Hk0A0bGp8=.7e480681-b385-4bbc-b0ab-33c77eee6e7a@github.com>
Message-ID: <ATJVh-zOQP0TKBATn8hw_kEtbfuLS81RuRiIik90rUk=.3f1221af-4825-4a09-b377-8abd03fc396e@github.com>

On Thu, 22 May 2025 08:04:35 GMT, Yudi Zheng <yzheng at openjdk.org> wrote:

>> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.
>
> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   address comments

I think there are two levels of counters that we might want to disable.  We definitely want to stop deopts and recompilations from marking the method not compilable, which the current change does.  Additionally, JVMCIRuntime::register_method will perform this logic if validate_compile_task_dependencies fails, and I don't think we want that.  I think the new `!is_default` guard idiom should be in a helper like `nmethod::is_jvmci_hosted`.  Do we use the hosted language elsewhere?

The second level is to stop all counter updates in hosted compiles, for similar reasons.  Those updates won't lead to disabling compilation, but they will quickly saturate all the counters, which is fairly pointless but probably benign.  This would be done by setting `update_trap_state` to false for hosted nmethods.  That also has the effect of keeping `inc_recompile_count` false.  I think that's the right thing to do, but I'd want to test Truffle workloads with it first to make sure there isn't some subtle problem.
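
The first level described above can be pictured with a toy model. This is only a sketch in plain Java, not HotSpot code: the class `DecompileCutoffModel`, its fields, and the `hostedNmethod` flag are hypothetical names, and the `cutoff` parameter merely stands in for `PerMethodRecompilationCutoff`. It assumes the behaviour stated in the PR description, namely that a method stops being eligible for top-tier compilation once its decompile count exceeds the cutoff.

```java
final class DecompileCutoffModel {
    private final int cutoff;        // hypothetical stand-in for PerMethodRecompilationCutoff
    private int decompileCount;      // hypothetical stand-in for the per-method decompile counter
    private boolean notCompilable;   // set once, never cleared (the "permanent" part)

    DecompileCutoffModel(int cutoff) {
        this.cutoff = cutoff;
    }

    /** Records one deoptimization; hosted nmethods are exempt, as the PR proposes. */
    void onDeoptimization(boolean hostedNmethod) {
        if (hostedNmethod) {
            return; // exempt: the counter is left untouched
        }
        decompileCount++;
        if (decompileCount > cutoff) {
            notCompilable = true;
        }
    }

    int decompileCount() {
        return decompileCount;
    }

    boolean isNotCompilable() {
        return notCompilable;
    }

    public static void main(String[] args) {
        DecompileCutoffModel hosted = new DecompileCutoffModel(3);
        DecompileCutoffModel regular = new DecompileCutoffModel(3);
        for (int i = 0; i < 10; i++) {
            hosted.onDeoptimization(true);
            regular.onDeoptimization(false);
        }
        System.out.println("hosted  not compilable: " + hosted.isNotCompilable());
        System.out.println("regular not compilable: " + regular.isNotCompilable());
    }
}
```

The second level (not updating any trap counters at all for hosted nmethods) is not modelled here.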

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25356#issuecomment-2913383620

From sparasa at openjdk.org  Tue May 27 19:59:54 2025
From: sparasa at openjdk.org (Srinivas Vamsi Parasa)
Date: Tue, 27 May 2025 19:59:54 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]
In-Reply-To: <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>
Message-ID: <Y2Ne_JUjmtV3LHSRxRigX9mkY_fsQBU5OQJn6eS5CXw=.bb72ec4c-40af-4aa0-ab43-29e05a3b37b2@github.com>

On Tue, 6 May 2025 21:45:34 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges and help prevent future regressions.
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Throughput with baseline (op/s) | Throughput with intrinsic (op/s) | Speedup |
>> | :-------------------------------------: | :----------------------------------: | :----------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                             | 17678                                          | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                         | 200897                                        | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add new set of cbrt micro-benchmarks

This PR looks good to me. I independently ran the correctness tests and performance benchmarks. 

Thanks,
Vamsi

-------------

Marked as reviewed by sparasa (Author).

PR Review: https://git.openjdk.org/jdk/pull/24470#pullrequestreview-2872425795

From sviswanathan at openjdk.org  Tue May 27 23:43:53 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Tue, 27 May 2025 23:43:53 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]
In-Reply-To: <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>
Message-ID: <62SQLH6KGe8_w0LxmPSPW7C34v9-KFYliTk0RzMgJTs=.80886efd-7f4f-408c-8bbd-07af12ab9701@github.com>

On Tue, 6 May 2025 21:45:34 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges and help prevent future regressions.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Throughput with baseline (op/s) | Throughput with intrinsic (op/s) | Speedup |
>> | :-------------------------------------: | :----------------------------------: | :----------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                             | 17678                                          | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                         | 200897                                        | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add new set of cbrt micro-benchmarks

src/hotspot/cpu/x86/macroAssembler_x86.hpp line 1251:

> 1249:   void movapd(XMMRegister dst, Address src) { Assembler::movapd(dst, src); }
> 1250:   void movapd(XMMRegister dst, AddressLiteral src, Register rscratch = noreg);
> 1251: 

You could write it as:
using Assembler::movapd;
void movapd(XMMRegister dst, AddressLiteral src, Register rscratch = noreg);

src/hotspot/cpu/x86/macroAssembler_x86.hpp line 1323:

> 1321:   void unpckhpd(XMMRegister dst, XMMRegister src) { Assembler::unpckhpd(dst, src); }
> 1322:   void unpcklpd(XMMRegister dst, XMMRegister src) { Assembler::unpcklpd(dst, src); }
> 1323: 

Do we need these declarations here?

src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 43:

> 41: //
> 42: // Special cases:
> 43: //  cbrt(NaN) = quiet NaN, and raise invalid exception

No exception is raised so the comment needs to be corrected.

src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 226:

> 224:   __ andl(rcx, 248);
> 225:   __ lea(r8, ExternalAddress(rcp_table));
> 226:   __ movsd(xmm4, Address(r8, rcx, Address::times_1));

This address, and other instructions using a similar address, could be written as Address(rcx, r8, Address::times_1).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2110406675
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2110426188
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2110536680
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2110535561

From duke at openjdk.org  Wed May 28 18:39:13 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Wed, 28 May 2025 18:39:13 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
Message-ID: <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>

> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges and help prevent future regressions.
> 
> The command to run all range-specific micro-benchmarks is posted below.
> 
> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
> 
> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
> 
> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
> 
> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
> 
> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
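
The speedup column and the uplift percentages quoted in the description are related by simple arithmetic: a speedup of S corresponds to an uplift of (S - 1) * 100 percent. The following is a minimal sketch of that calculation using the throughput figures from the table; the class and method names are made up for illustration and are not part of the PR.

```java
final class CbrtUplift {
    /** Ratio of intrinsic throughput to baseline throughput. */
    static double speedup(double baseline, double intrinsic) {
        return intrinsic / baseline;
    }

    /** Percentage uplift: a 2.0x speedup is a 100% uplift. */
    static double upliftPercent(double baseline, double intrinsic) {
        return (speedup(baseline, intrinsic) - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Throughput figures copied from the two table rows above.
        System.out.printf("very small inputs: %.2fx, %.0f%%%n",
                speedup(6568, 17678), upliftPercent(6568, 17678));
        System.out.printf("all other inputs:  %.2fx, %.0f%%%n",
                speedup(138932, 200897), upliftPercent(138932, 200897));
    }
}
```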

Mohamed Issa has updated the pull request incrementally with four additional commits since the last revision:

 - Remove comment mentioning invalid exception when NaN input is provided
 - Use rcx as base and r8 as index for address calculations in certain cbrt stub generator instructions
 - Remove unnecessary unpckhpd and unpcklpd definitions in macro-assembler header file
 - Remove unnecessary movapd definitions in macro-assembler header file

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24470/files
  - new: https://git.openjdk.org/jdk/pull/24470/files/57412f0d..ff4d4f22

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=02-03

  Stats: 10 lines in 2 files changed: 0 ins; 4 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/24470.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24470/head:pull/24470

PR: https://git.openjdk.org/jdk/pull/24470

From sviswanathan at openjdk.org  Wed May 28 18:39:13 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Wed, 28 May 2025 18:39:13 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
Message-ID: <eJStgCQHuB9TUbgyAPo_h4HNK22AJb70aV4XmbElftw=.f4caccc2-a766-4ed1-bbf2-f60a9447ff09@github.com>

On Wed, 28 May 2025 18:36:38 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges and help prevent future regressions.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with four additional commits since the last revision:
> 
>  - Remove comment mentioning invalid exception when NaN input is provided
>  - Use rcx as base and r8 as index for address calculations in certain cbrt stub generator instructions
>  - Remove unnecessary unpckhpd and unpcklpd definitions in macro-assembler header file
>  - Remove unnecessary movapd definitions in macro-assembler header file

Looks good to me.

-------------

Marked as reviewed by sviswanathan (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24470#pullrequestreview-2876071455

From duke at openjdk.org  Wed May 28 18:39:14 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Wed, 28 May 2025 18:39:14 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]
In-Reply-To: <62SQLH6KGe8_w0LxmPSPW7C34v9-KFYliTk0RzMgJTs=.80886efd-7f4f-408c-8bbd-07af12ab9701@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <H3vvJxVRsOzXpLIAnz2vc3wU_Umd9IoyI1cgqYT6mq0=.79b289da-60f2-4f62-99bd-227e03e9df2b@github.com>
 <62SQLH6KGe8_w0LxmPSPW7C34v9-KFYliTk0RzMgJTs=.80886efd-7f4f-408c-8bbd-07af12ab9701@github.com>
Message-ID: <8ArA3awbbtTvNZfaKvAdlv2oMcLP0cASBHr-VRK_-dc=.0fe37d63-7eb0-49d7-abf1-812e7be8dde8@github.com>

On Tue, 27 May 2025 22:30:17 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Add new set of cbrt micro-benchmarks
>
> src/hotspot/cpu/x86/macroAssembler_x86.hpp line 1251:
> 
>> 1249:   void movapd(XMMRegister dst, Address src) { Assembler::movapd(dst, src); }
>> 1250:   void movapd(XMMRegister dst, AddressLiteral src, Register rscratch = noreg);
>> 1251: 
> 
> You could write it as:
> using Assembler::movapd;
> void movapd(XMMRegister dst, AddressLiteral src, Register rscratch = noreg);

Ok, this is updated now.

> src/hotspot/cpu/x86/macroAssembler_x86.hpp line 1323:
> 
>> 1321:   void unpckhpd(XMMRegister dst, XMMRegister src) { Assembler::unpckhpd(dst, src); }
>> 1322:   void unpcklpd(XMMRegister dst, XMMRegister src) { Assembler::unpcklpd(dst, src); }
>> 1323: 
> 
> Do we need these declarations here?

No, they were superfluous, as the cbrt stub generator could already access them. I have removed them now.

> src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 43:
> 
>> 41: //
>> 42: // Special cases:
>> 43: //  cbrt(NaN) = quiet NaN, and raise invalid exception
> 
> No exception is raised so the comment needs to be corrected.

This is corrected now.

> src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 226:
> 
>> 224:   __ andl(rcx, 248);
>> 225:   __ lea(r8, ExternalAddress(rcp_table));
>> 226:   __ movsd(xmm4, Address(r8, rcx, Address::times_1));
> 
> This address, and other instructions using a similar address, could be written as Address(rcx, r8, Address::times_1).

OK, I have changed the order, so rcx is now treated as the base and r8 as the index.
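
Because the scale factor here is `times_1`, swapping base and index does not change the computed address: base + index * 1 is symmetric in the two registers. A small sketch of that arithmetic follows, assuming the usual x86 base-index-scale form with the displacement omitted; the class name and the numeric values are purely illustrative.

```java
final class EffectiveAddress {
    /** base + index * scale, the x86 base-index-scale form (displacement omitted). */
    static long effectiveAddress(long base, long index, int scale) {
        return base + index * scale;
    }

    public static void main(String[] args) {
        long table = 0x7f00_0000_1000L;  // hypothetical address of a lookup table (the r8 value)
        long offset = 0x1234 & 248;      // masked byte offset, as produced by andl(rcx, 248)

        // With scale 1 the two operand orders give the same address.
        System.out.println(effectiveAddress(table, offset, 1) == effectiveAddress(offset, table, 1));

        // With any larger scale the order would matter.
        System.out.println(effectiveAddress(table, offset, 8) == effectiveAddress(offset, table, 8));
    }
}
```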

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2112516974
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2112518997
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2112520793
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2112520486

From jbhateja at openjdk.org  Thu May 29 08:38:53 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Thu, 29 May 2025 08:38:53 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
Message-ID: <kPx9NqOgNJPStCsPd4PtRhfbQeEXi5xbefvSjCHoPSY=.aa26a490-8f90-4e17-8c9e-cde0c25a9fbb@github.com>

On Wed, 28 May 2025 18:39:13 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges and help prevent future regressions.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with four additional commits since the last revision:
> 
>  - Remove comment mentioning invalid exception when NaN input is provided
>  - Use rcx as base and r8 as index for address calculations in certain cbrt stub generator instructions
>  - Remove unnecessary unpckhpd and unpcklpd definitions in macro-assembler header file
>  - Remove unnecessary movapd definitions in macro-assembler header file

Patch looks good to me; some comments are included.

src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 185:

> 183: 
> 184: #define __ _masm->
> 185: 

The original Intel libm inline sequence uses hexadecimal constants; I would have preferred to use them as they are, to maintain a 1:1 mapping between the instruction sequences.

test/micro/org/openjdk/bench/java/lang/CbrtPerf.java line 56:

> 54:     public static class CbrtPerfRanges {
> 55:         public static int cbrtInputCount = 2048;
> 56: 

Please create a separate CbrtPerfSpecialValues class for +/- 0.0, +/- Infinity, and NaN values.
I understand that handling special cases in the intrinsic may impact general-case performance, but it's OK to have at least a micro-benchmark for it.
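
For reference, the inputs such a benchmark would cover are the ones for which `Math.cbrt` is specified to return its argument unchanged in kind: a zero of the same sign, an infinity of the same sign, or NaN. The sketch below is a plain Java check of those values, not a JMH benchmark; the class name `CbrtSpecialValues` and its members are hypothetical and do not come from the PR.

```java
final class CbrtSpecialValues {
    /** The special inputs named in the review comment. */
    static final double[] INPUTS = {
        0.0, -0.0, Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, Double.NaN
    };

    /** True when cbrt(x) is NaN for NaN input, or bit-identical to x otherwise. */
    static boolean preservesSpecialValue(double x) {
        double r = Math.cbrt(x);
        if (Double.isNaN(x)) {
            return Double.isNaN(r);
        }
        // Compare bit patterns so that 0.0 and -0.0 are told apart.
        return Double.doubleToLongBits(r) == Double.doubleToLongBits(x);
    }

    public static void main(String[] args) {
        for (double x : INPUTS) {
            System.out.println("cbrt(" + x + ") = " + Math.cbrt(x)
                    + " preserved=" + preservesSpecialValue(x));
        }
    }
}
```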

test/micro/org/openjdk/bench/java/lang/CbrtPerf.java line 114:

> 112:         public static final double constDouble512 = 512.0;
> 113: 
> 114:         @Benchmark

Baseline:-
Benchmark                                     (cbrtRangeIndex)   Mode  Cnt        Score   Error   Units
CbrtPerf.CbrtPerfConstant.cbrtConstDouble0                 N/A  thrpt    2  2673018.356          ops/ms
CbrtPerf.CbrtPerfConstant.cbrtConstDouble1                 N/A  thrpt    2  2684233.593          ops/ms
CbrtPerf.CbrtPerfConstant.cbrtConstDouble27                N/A  thrpt    2  2684250.835          ops/ms
CbrtPerf.CbrtPerfConstant.cbrtConstDouble512               N/A  thrpt    2  2683616.321          ops/ms
Withopt:-
Benchmark                                     (cbrtRangeIndex)   Mode  Cnt       Score   Error   Units
CbrtPerf.CbrtPerfConstant.cbrtConstDouble0                 N/A  thrpt    2   284575.292          ops/ms
CbrtPerf.CbrtPerfConstant.cbrtConstDouble1                 N/A  thrpt    2   162876.035          ops/ms
CbrtPerf.CbrtPerfConstant.cbrtConstDouble27                N/A  thrpt    2   163227.835          ops/ms
CbrtPerf.CbrtPerfConstant.cbrtConstDouble512               N/A  thrpt    2   162998.844          ops/ms


There is approximately a 10x performance improvement from disabling the intrinsic for compile-time constant inputs.
I have created a follow-up JBS issue to track it: https://bugs.openjdk.org/browse/JDK-8358039
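
The per-benchmark ratios behind that estimate can be recomputed from the two result listings above. This is a small sketch of the division only, using the scores exactly as posted; the class and method names are illustrative.

```java
final class ConstantInputRatio {
    /** How many times higher the baseline throughput is than the intrinsic throughput. */
    static double ratio(double baselineOpsPerMs, double intrinsicOpsPerMs) {
        return baselineOpsPerMs / intrinsicOpsPerMs;
    }

    public static void main(String[] args) {
        String[] names = { "cbrtConstDouble0", "cbrtConstDouble1",
                           "cbrtConstDouble27", "cbrtConstDouble512" };
        // Scores (ops/ms) copied from the "Baseline" and "Withopt" listings above.
        double[] baseline = { 2673018.356, 2684233.593, 2684250.835, 2683616.321 };
        double[] withOpt  = {  284575.292,  162876.035,  163227.835,  162998.844 };

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-20s %.1fx%n", names[i], ratio(baseline[i], withOpt[i]));
        }
    }
}
```

On these numbers the zero-input case comes out a little under 10x, and the other three constants come out higher than that.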

-------------

PR Review: https://git.openjdk.org/jdk/pull/24470#pullrequestreview-2877492755
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2113462482
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2113484695
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2113472992

From jwaters at openjdk.org  Thu May 29 09:03:55 2025
From: jwaters at openjdk.org (Julian Waters)
Date: Thu, 29 May 2025 09:03:55 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
Message-ID: <cTPAVD5xnYDItBLloQQRf1T1dF6rc9LmMLlcSXXGfj4=.ad19da34-6dd1-4f38-91f1-3f7dacdb5e10@github.com>

On Wed, 28 May 2025 18:39:13 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges and help prevent future regressions.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with four additional commits since the last revision:
> 
>  - Remove comment mentioning invalid exception when NaN input is provided
>  - Use rcx as base and r8 as index for address calculations in certain cbrt stub generator instructions
>  - Remove unnecessary unpckhpd and unpcklpd definitions in macro-assembler header file
>  - Remove unnecessary movapd definitions in macro-assembler header file

src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 49:

> 47: /******************************************************************************/
> 48: 
> 49: ATTRIBUTE_ALIGNED(4) static const juint _SIG_MASK[] =

ATTRIBUTE_ALIGNED expands to alignas, I suggest using that directly instead

src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp line 503:

> 501: 
> 502:   return entry_point;
> 503: }

Is the newline removal intentional?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2113530767
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2113529587

From dnsimon at openjdk.org  Thu May 29 13:04:40 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 29 May 2025 13:04:40 GMT
Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in
 JVMCINMethodData::get_nmethod_mirror
Message-ID: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>

The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called.
This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed.

-------------

Commit messages:
 - remove phantom_ref arg from JVMCINMethodData::get_nmethod_mirror

Changes: https://git.openjdk.org/jdk/pull/25488/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25488&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357619
  Stats: 14 lines in 3 files changed: 1 ins; 5 del; 8 mod
  Patch: https://git.openjdk.org/jdk/pull/25488.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25488/head:pull/25488

PR: https://git.openjdk.org/jdk/pull/25488

From eosterlund at openjdk.org  Thu May 29 13:04:41 2025
From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=)
Date: Thu, 29 May 2025 13:04:41 GMT
Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in
 JVMCINMethodData::get_nmethod_mirror
In-Reply-To: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>
References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>
Message-ID: <5xZ0aIZT-xJM_h06TD061mZ_3T1qAPkd1F75vipRJ_w=.7d0e9d10-3a6c-4d1f-b457-aa0e1dd61560@github.com>

On Wed, 28 May 2025 10:28:38 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called.
> This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed.

src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 2834:

> 2832:           // Only the mirror in the HotSpot heap is accessible
> 2833:           // through JVMCINMethodData
> 2834:           oop nmethod_mirror = data->get_nmethod_mirror(nm);

Is the nmethod guaranteed to be on-stack here? If not it gotta be phantom.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25488#discussion_r2112390139

From dnsimon at openjdk.org  Thu May 29 13:04:41 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 29 May 2025 13:04:41 GMT
Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in
 JVMCINMethodData::get_nmethod_mirror
In-Reply-To: <5xZ0aIZT-xJM_h06TD061mZ_3T1qAPkd1F75vipRJ_w=.7d0e9d10-3a6c-4d1f-b457-aa0e1dd61560@github.com>
References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>
 <5xZ0aIZT-xJM_h06TD061mZ_3T1qAPkd1F75vipRJ_w=.7d0e9d10-3a6c-4d1f-b457-aa0e1dd61560@github.com>
Message-ID: <FNom2mfupoibvGfpXP3fG5TwcsYQMc0sOrZ-f1x51N8=.17065dff-9bc5-4609-9354-5d1478afd003@github.com>

On Wed, 28 May 2025 17:15:48 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

>> The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called.
>> This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed.
>
> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 2834:
> 
>> 2832:           // Only the mirror in the HotSpot heap is accessible
>> 2833:           // through JVMCINMethodData
>> 2834:           oop nmethod_mirror = data->get_nmethod_mirror(nm);
> 
> Is the nmethod guaranteed to be on-stack here? If not it gotta be phantom.

Is the use of `JVMCINMethodHandle` equivalent to `nm` being on-stack?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25488#discussion_r2112482026

From eosterlund at openjdk.org  Thu May 29 13:04:41 2025
From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=)
Date: Thu, 29 May 2025 13:04:41 GMT
Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in
 JVMCINMethodData::get_nmethod_mirror
In-Reply-To: <FNom2mfupoibvGfpXP3fG5TwcsYQMc0sOrZ-f1x51N8=.17065dff-9bc5-4609-9354-5d1478afd003@github.com>
References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>
 <5xZ0aIZT-xJM_h06TD061mZ_3T1qAPkd1F75vipRJ_w=.7d0e9d10-3a6c-4d1f-b457-aa0e1dd61560@github.com>
 <FNom2mfupoibvGfpXP3fG5TwcsYQMc0sOrZ-f1x51N8=.17065dff-9bc5-4609-9354-5d1478afd003@github.com>
Message-ID: <HLBFcl4It_G3yYs-mAVwFA82AjT0NdK_Bw0l0xQUGnU=.6cda0629-517b-4ba5-8ba2-8aa62136c6ed@github.com>

On Wed, 28 May 2025 18:10:39 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 2834:
>> 
>>> 2832:           // Only the mirror in the HotSpot heap is accessible
>>> 2833:           // through JVMCINMethodData
>>> 2834:           oop nmethod_mirror = data->get_nmethod_mirror(nm);
>> 
>> Is the nmethod guaranteed to be on-stack here? If not it gotta be phantom.
>
> Is the use of `JVMCINMethodHandle` equivalent to `nm` being on-stack?

Yes. Great, so that should be fine then.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25488#discussion_r2112673894

From duke at openjdk.org  Thu May 29 18:56:11 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Thu, 29 May 2025 18:56:11 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v5]
In-Reply-To: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
Message-ID: <GE6s8raLeTlcMDb0coe89QbMYSPqWtjjkr_puB8hcx8=.d4911eb3-3201-441b-81a9-b476ea5b64c9@github.com>

> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
> 
> The command to run all range specific micro-benchmarks is posted below.
> 
> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
> 
> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
> 
> For performance data collected with the new built in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
> 
> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
> 
> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
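
For reference, the speedup and uplift figures in the table follow directly from the two throughput columns. The following is a minimal sketch of that arithmetic; the class and method names are illustrative only and are not part of the PR.

```java
// Illustrative helper, not part of the PR: derives the "Speedup" column
// and the quoted uplift percentages from the throughput numbers.
final class CbrtSpeedup {
    /** Ratio of intrinsic throughput to baseline throughput. */
    static double speedup(double baselineOpsPerMs, double intrinsicOpsPerMs) {
        return intrinsicOpsPerMs / baselineOpsPerMs;
    }

    /** Uplift in percent, i.e. (speedup - 1) * 100. */
    static double upliftPercent(double baselineOpsPerMs, double intrinsicOpsPerMs) {
        return (speedup(baselineOpsPerMs, intrinsicOpsPerMs) - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Throughput values copied from the table in the PR description.
        double[][] rows = { {6568, 17678}, {138932, 200897} };
        for (double[] row : rows) {
            System.out.println("speedup = " + speedup(row[0], row[1])
                    + ", uplift % = " + upliftPercent(row[0], row[1]));
        }
    }
}
```

Rounded to two decimals and to whole percent respectively, these values match the 2.69x / 169% and 1.45x / 45% figures quoted above.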

Mohamed Issa has updated the pull request incrementally with two additional commits since the last revision:

 - Add newline back to templateInterpreterGenerator_x86_64.cpp source file
 - Add special case values to cbrt micro-benchmark set

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24470/files
  - new: https://git.openjdk.org/jdk/pull/24470/files/ff4d4f22..233e0188

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=03-04

  Stats: 40 lines in 2 files changed: 39 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/24470.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24470/head:pull/24470

PR: https://git.openjdk.org/jdk/pull/24470

From duke at openjdk.org  Thu May 29 18:56:12 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Thu, 29 May 2025 18:56:12 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <cTPAVD5xnYDItBLloQQRf1T1dF6rc9LmMLlcSXXGfj4=.ad19da34-6dd1-4f38-91f1-3f7dacdb5e10@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
 <cTPAVD5xnYDItBLloQQRf1T1dF6rc9LmMLlcSXXGfj4=.ad19da34-6dd1-4f38-91f1-3f7dacdb5e10@github.com>
Message-ID: <CE7DC_l4IRgh8ia7JFugTpN7E55hce7YBJpxlVnOreE=.e35a487c-ddb8-4893-b80c-64aa7d25f455@github.com>

On Thu, 29 May 2025 09:01:05 GMT, Julian Waters <jwaters at openjdk.org> wrote:

>> Mohamed Issa has updated the pull request incrementally with four additional commits since the last revision:
>> 
>>  - Remove comment mentioning invalid exception when NaN input is provided
>>  - Use rcx as base and r8 as index for address calculations in certain cbrt stub generator instructions
>>  - Remove unnecessary unpckhpd and unpcklpd definitions in macro-assembler header file
>>  - Remove unnecessary movapd definitions in macro-assembler header file
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 49:
> 
>> 47: /******************************************************************************/
>> 48: 
>> 49: ATTRIBUTE_ALIGNED(4) static const juint _SIG_MASK[] =
> 
> ATTRIBUTE_ALIGNED expands to alignas, I suggest using that directly instead

The ATTRIBUTE_ALIGNED macro is used in other stub generator files. Should all of those be changed to alignas as well?

Is the suggestion to change just for code readability?

> src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp line 503:
> 
>> 501: 
>> 502:   return entry_point;
>> 503: }
> 
> Is the newline removal intentional?

It wasn't intentional. Thanks for spotting that. I added it back.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2114557661
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2114553562

From duke at openjdk.org  Thu May 29 18:56:12 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Thu, 29 May 2025 18:56:12 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <kPx9NqOgNJPStCsPd4PtRhfbQeEXi5xbefvSjCHoPSY=.aa26a490-8f90-4e17-8c9e-cde0c25a9fbb@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
 <kPx9NqOgNJPStCsPd4PtRhfbQeEXi5xbefvSjCHoPSY=.aa26a490-8f90-4e17-8c9e-cde0c25a9fbb@github.com>
Message-ID: <eJbb5hzhM27dVuc7MD6kqvlLReNHthJNiRXttkPwzQo=.bb1151a7-959a-4f56-8ebf-bde8126ea5d4@github.com>

On Thu, 29 May 2025 08:21:29 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Mohamed Issa has updated the pull request incrementally with four additional commits since the last revision:
>> 
>>  - Remove comment mentioning invalid exception when NaN input is provided
>>  - Use rcx as base and r8 as index for address calculations in certain cbrt stub generator instructions
>>  - Remove unnecessary unpckhpd and unpcklpd definitions in macro-assembler header file
>>  - Remove unnecessary movapd definitions in macro-assembler header file
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 185:
> 
>> 183: 
>> 184: #define __ _masm->
>> 185: 
> 
> Original Intel libm inline sequence uses hexadecimal constants, I would have preferred to use them as they are to maintain a 1:1 mapping between the instruction sequences.

The assembly reference code used for this implementation uses decimal constants.

> test/micro/org/openjdk/bench/java/lang/CbrtPerf.java line 56:
> 
>> 54:     public static class CbrtPerfRanges {
>> 55:         public static int cbrtInputCount = 2048;
>> 56: 
> 
> Please create separate CbrtPerfSpecialValues for +/- 0.0 and +/- Infinity and NaN values.
> I understand that handling special cases in the intrinsic may impact general-case performance, but it's OK to have at least a micro-benchmark for it.

Ok, I added this to the new set of micro-benchmarks. I kept them as variable values.
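
As a point of reference for the special values mentioned above, the `Math.cbrt` specification requires NaN to map to NaN, and infinities and zeros to map to themselves with the sign preserved. A minimal sketch that checks this is below; the class name and the `sameValue` helper are illustrative and are not the benchmark code added in the PR.

```java
// Illustrative check of the special cases documented for Math.cbrt;
// this is not the micro-benchmark from the PR.
final class CbrtSpecialValues {
    /** Compares bit patterns so that -0.0 differs from 0.0 and NaN equals NaN. */
    static boolean sameValue(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    public static void main(String[] args) {
        double[] inputs = {
            0.0, -0.0,
            Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY,
            Double.NaN
        };
        for (double x : inputs) {
            double r = Math.cbrt(x);
            // For each of these inputs the result should equal the input.
            System.out.println("cbrt(" + x + ") = " + r
                    + ", identical to input: " + sameValue(x, r));
        }
    }
}
```

Comparing via `Double.doubleToLongBits` rather than `==` is deliberate: `==` treats `0.0` and `-0.0` as equal and never reports NaN as equal to itself.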

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2114551009
PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2114552717

From duke at openjdk.org  Thu May 29 22:33:52 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Thu, 29 May 2025 22:33:52 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <kPx9NqOgNJPStCsPd4PtRhfbQeEXi5xbefvSjCHoPSY=.aa26a490-8f90-4e17-8c9e-cde0c25a9fbb@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
 <kPx9NqOgNJPStCsPd4PtRhfbQeEXi5xbefvSjCHoPSY=.aa26a490-8f90-4e17-8c9e-cde0c25a9fbb@github.com>
Message-ID: <TKma1AJWRWYjH4SZ4N2CTukn1yXtu_wfACG0dTbTRFg=.e2e27f2c-e4cc-4bf8-9a04-2985dc007002@github.com>

On Thu, 29 May 2025 08:36:31 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Mohamed Issa has updated the pull request incrementally with four additional commits since the last revision:
>> 
>>  - Remove comment mentioning invalid exception when NaN input is provided
>>  - Use rcx as base and r8 as index for address calculations in certain cbrt stub generator instructions
>>  - Remove unnecessary unpckhpd and unpcklpd definitions in macro-assembler header file
>>  - Remove unnecessary movapd definitions in macro-assembler header file
>
> Patch looks good to me,  some comment included.

@jatin-bhateja Please let me know if there's anything else to address.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2920726222

From jwaters at openjdk.org  Fri May 30 05:40:54 2025
From: jwaters at openjdk.org (Julian Waters)
Date: Fri, 30 May 2025 05:40:54 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <CE7DC_l4IRgh8ia7JFugTpN7E55hce7YBJpxlVnOreE=.e35a487c-ddb8-4893-b80c-64aa7d25f455@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
 <cTPAVD5xnYDItBLloQQRf1T1dF6rc9LmMLlcSXXGfj4=.ad19da34-6dd1-4f38-91f1-3f7dacdb5e10@github.com>
 <CE7DC_l4IRgh8ia7JFugTpN7E55hce7YBJpxlVnOreE=.e35a487c-ddb8-4893-b80c-64aa7d25f455@github.com>
Message-ID: <g4W5HTytRF9PwMzzPsB5JugHocIUWpD80VK6XG5az7s=.0df493bd-13d9-497a-a23a-2c8dd80b05a6@github.com>

On Thu, 29 May 2025 18:52:51 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 49:
>> 
>>> 47: /******************************************************************************/
>>> 48: 
>>> 49: ATTRIBUTE_ALIGNED(4) static const juint _SIG_MASK[] =
>> 
>> ATTRIBUTE_ALIGNED expands to alignas, I suggest using that directly instead
>
> The ATTRIBUTE_ALIGNED macro is used in other stub generator files. Should all of those be changed to alignas as well? If so, would it be best to make those changes in a separate PR?
> 
> Also, is the suggestion to change just for code readability?

There's no need to change the other files, that would be out of scope for this Pull Request. Yes, it's just a suggestion for readability, it can be ignored if you deem it as not necessary.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2115174843

From eosterlund at openjdk.org  Fri May 30 07:58:51 2025
From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=)
Date: Fri, 30 May 2025 07:58:51 GMT
Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in
 JVMCINMethodData::get_nmethod_mirror
In-Reply-To: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>
References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>
Message-ID: <JJP7tCjoJoSSMVBxZmbmquwkeBquRHyNGKKg7CWQ7ts=.00c5e2f9-4524-4ace-8971-43c02e191dda@github.com>

On Wed, 28 May 2025 10:28:38 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called.
> This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed.

Looks good.

-------------

Marked as reviewed by eosterlund (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25488#pullrequestreview-2880492044

From duke at openjdk.org  Fri May 30 15:06:04 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:04 GMT
Subject: RFR: 8357987: [JVMCI] Add Support for Retrieving All Non-Static
 Methods of a ResolvedJavaType.
Message-ID: <mRzKVfwoUWsIiQoMx2rVRinkLcV9w14P2rSMWz9_m2g=.adb59499-09c3-4085-8464-7e6f75bad624@github.com>

Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API.

To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed.

-------------

Commit messages:
 - implement getAllMethods
 - address reviewer feedback
 - Add Support for Retrieving All Non-Static Methods of a ResolvedJavaType.

Changes: https://git.openjdk.org/jdk/pull/25498/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25498&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357987
  Stats: 107 lines in 11 files changed: 106 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25498.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25498/head:pull/25498

PR: https://git.openjdk.org/jdk/pull/25498

From dnsimon at openjdk.org  Fri May 30 15:06:08 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 30 May 2025 15:06:08 GMT
Subject: RFR: 8357987: [JVMCI] Add Support for Retrieving All Non-Static
 Methods of a ResolvedJavaType.
In-Reply-To: <mRzKVfwoUWsIiQoMx2rVRinkLcV9w14P2rSMWz9_m2g=.adb59499-09c3-4085-8464-7e6f75bad624@github.com>
References: <mRzKVfwoUWsIiQoMx2rVRinkLcV9w14P2rSMWz9_m2g=.adb59499-09c3-4085-8464-7e6f75bad624@github.com>
Message-ID: <OclBEcfmMgopFnExwfDOoF4D7dX0v42DPYgJyPwhPGc=.40f0e084-955e-4e70-9cf9-cfd09c5c72fb@github.com>

On Wed, 28 May 2025 15:55:39 GMT, Tom Shull <duke at openjdk.org> wrote:

> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API.
> 
> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed.

I also updated the title of https://bugs.openjdk.org/browse/JDK-8357987 to Not Be All Capitalized so you'll need to fix the title of this PR.

src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 580:

> 578: C2V_END
> 579: 
> 580: C2V_VMENTRY_0(jboolean, isOverpass,(JNIEnv* env, jobject, ARGUMENT_PAIR(method)))

Delete this method - it's no longer used.

src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 3315:

> 3313:   {CC "setNotInlinableOrCompilable",                  CC "(" HS_METHOD2 ")V",                                                               FN_PTR(setNotInlinableOrCompilable)},
> 3314:   {CC "isCompilable",                                 CC "(" HS_METHOD2 ")Z",                                                               FN_PTR(isCompilable)},
> 3315:   {CC "isOverpass",                                   CC "(" HS_METHOD2 ")Z",                                                               FN_PTR(isOverpass)},

delete

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/CompilerToVM.java line 179:

> 177:     private native boolean isCompilable(HotSpotResolvedJavaMethodImpl method, long methodPointer);
> 178: 
> 179:     /**

Delete this method - it's no longer used.

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/CompilerToVM.java line 1162:

> 1160: 
> 1161:     /**
> 1162:      * Gets the {@link ResolvedJavaMethod}s for all non-overpass instance methods of {@code klass}.

all non-overpass and non-constructor

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/CompilerToVM.java line 1171:

> 1169: 
> 1170:     /**
> 1171:      * Gets the {@link ResolvedJavaMethod}s for all instance methods of {@code klass}.

instance -> non-static
Instance -> NonStatic

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java line 583:

> 581:     @Override
> 582:     public boolean isDeclared() {
> 583:         if (isConstructor() || isStatic()) {

`isStatic()` -> `isClassInitializer()`

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java line 586:

> 584:             return false;
> 585:         }
> 586:         return !compilerToVM().isOverpass(this);

I think you can do this with a direct flag check:

boolean isOverpass = (getConstMethodFlags() & config().constMethodIsOverpass) != 0;
return isOverpass;

See #20256 as an example of the other changes needed for this.
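
To make the shape of such a flag check concrete, here is a standalone sketch. It assumes a hypothetical bit position for the overpass flag and passes the flags in as a plain `int`; in the real code the mask would come from the VM configuration. Since the line being replaced returns the negation of `isOverpass`, the sketch keeps that negation in `isDeclared`.

```java
// Standalone sketch of a bit-flag check; IS_OVERPASS is a hypothetical
// value, the real mask would be read from the HotSpot VM configuration.
final class OverpassFlagSketch {
    static final int IS_OVERPASS = 1 << 3; // assumption, for illustration only

    /** True when the overpass bit is set in the given flags word. */
    static boolean isOverpass(int constMethodFlags) {
        return (constMethodFlags & IS_OVERPASS) != 0;
    }

    /** Mirrors the structure of isDeclared(): initializers and overpass methods are excluded. */
    static boolean isDeclared(boolean isConstructor, boolean isClassInitializer, int constMethodFlags) {
        if (isConstructor || isClassInitializer) {
            return false;
        }
        return !isOverpass(constMethodFlags);
    }

    public static void main(String[] args) {
        System.out.println(isDeclared(false, false, 0));           // ordinary method
        System.out.println(isDeclared(false, false, IS_OVERPASS)); // overpass method
        System.out.println(isDeclared(true, false, 0));            // constructor
    }
}
```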

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ResolvedJavaMethod.java line 118:

> 116: 
> 117:     /**
> 118:      * Returns {@code true} if this method would be contained in the array returned by

`would be` -> `is`

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ResolvedJavaType.java line 370:

> 368: 
> 369:     /**
> 370:      * Returns a list containing all the non-static methods present within this type.

Point out that the returned list is unmodifiable (like the API for `Stream.toList()` does).

test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.runtime.test/src/jdk/vm/ci/runtime/test/TestResolvedJavaType.java line 1027:

> 1025:             ResolvedJavaType type = metaAccess.lookupJavaType(c);
> 1026:             Set<ResolvedJavaMethod> allMethods = new HashSet<>(type.getAllMethods(true));
> 1027:             boolean included = Arrays.stream(type.getDeclaredMethods()).allMatch(m -> allMethods.contains(m));

You can produce a more helpful error message by collecting the entries from getDeclaredMethods, getDeclaredConstructors and the class initializer that are *not* in `allMethods`.
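
A rough sketch of that idea, using plain strings as stand-ins for `ResolvedJavaMethod` so that it runs without JVMCI (the class and method names here are hypothetical):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Sketch only: strings stand in for ResolvedJavaMethod instances.
final class MissingMethodsReport {
    /** Returns the expected entries that are absent from {@code all}, in iteration order. */
    static <T> List<T> missingFrom(Collection<T> expected, Set<T> all) {
        List<T> missing = new ArrayList<>();
        for (T e : expected) {
            if (!all.contains(e)) {
                missing.add(e);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Set<String> allMethods = Set.of("<init>()V", "run()V");
        List<String> expected = List.of("<init>()V", "<clinit>()V", "run()V", "stop()V");
        List<String> missing = missingFrom(expected, allMethods);
        if (!missing.isEmpty()) {
            // The failure message names the offending entries instead of a bare boolean.
            System.out.println("getAllMethods is missing: " + missing);
        }
    }
}
```

In a test, the list of missing entries can go straight into the assertion message, so a failure shows which methods were left out.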

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25498#issuecomment-2921656256
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2113593898
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2113594155
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2113593301
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2112455015
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2112455704
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2112434269
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2112449433
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2112420844
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2112451810
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2115430479

From duke at openjdk.org  Fri May 30 15:06:09 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:09 GMT
Subject: RFR: 8357987: [JVMCI] Add Support for Retrieving All Non-Static
 Methods of a ResolvedJavaType.
In-Reply-To: <OclBEcfmMgopFnExwfDOoF4D7dX0v42DPYgJyPwhPGc=.40f0e084-955e-4e70-9cf9-cfd09c5c72fb@github.com>
References: <mRzKVfwoUWsIiQoMx2rVRinkLcV9w14P2rSMWz9_m2g=.adb59499-09c3-4085-8464-7e6f75bad624@github.com>
 <OclBEcfmMgopFnExwfDOoF4D7dX0v42DPYgJyPwhPGc=.40f0e084-955e-4e70-9cf9-cfd09c5c72fb@github.com>
Message-ID: <621JpJVqtfhOtmuHd54KXE7kbOW_RzTQuudFesTADJ0=.d0985feb-fd1a-45eb-8246-261cb3127d2a@github.com>

On Wed, 28 May 2025 17:54:27 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API.
>> 
>> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed.
>
> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/CompilerToVM.java line 1171:
> 
>> 1169: 
>> 1170:     /**
>> 1171:      * Gets the {@link ResolvedJavaMethod}s for all instance methods of {@code klass}.
> 
> instance -> non-static
> Instance -> NonStatic

I realized NonStatic is not accurate - we return everything except `<init>s` and `<clinit>` - so I switched to `NonInitializerMethods` everywhere. Does that seem fair?

> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java line 586:
> 
>> 584:             return false;
>> 585:         }
>> 586:         return !compilerToVM().isOverpass(this);
> 
> I think you can do this with a direct flag check:
> 
> boolean isOverpass = (getConstMethodFlags() & config().constMethodIsOverpass) != 0;
> return isOverpass;
> 
> See #20256 as an example of the other changes needed for this.

good call. changed

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2112886022
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2112884946

From duke at openjdk.org  Fri May 30 15:06:09 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:09 GMT
Subject: RFR: 8357987: [JVMCI] Add Support for Retrieving All Non-Static
 Methods of a ResolvedJavaType.
In-Reply-To: <621JpJVqtfhOtmuHd54KXE7kbOW_RzTQuudFesTADJ0=.d0985feb-fd1a-45eb-8246-261cb3127d2a@github.com>
References: <mRzKVfwoUWsIiQoMx2rVRinkLcV9w14P2rSMWz9_m2g=.adb59499-09c3-4085-8464-7e6f75bad624@github.com>
 <OclBEcfmMgopFnExwfDOoF4D7dX0v42DPYgJyPwhPGc=.40f0e084-955e-4e70-9cf9-cfd09c5c72fb@github.com>
 <621JpJVqtfhOtmuHd54KXE7kbOW_RzTQuudFesTADJ0=.d0985feb-fd1a-45eb-8246-261cb3127d2a@github.com>
Message-ID: <ZBAoB_Si9hi0eZztvhGaptpSmOykOjEuPe3eQJo3hhg=.32b6fe89-48d5-466d-b3d1-ec281b3c1ea1@github.com>

On Wed, 28 May 2025 22:46:46 GMT, Tom Shull <duke at openjdk.org> wrote:

>> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/CompilerToVM.java line 1171:
>> 
>>> 1169: 
>>> 1170:     /**
>>> 1171:      * Gets the {@link ResolvedJavaMethod}s for all instance methods of {@code klass}.
>> 
>> instance -> non-static
>> Instance -> NonStatic
>
> I realized NonStatic is not accurate - we return everything except `<init>s` and `<clinit>` - so I switched to `NonInitializerMethods` everywhere. Does that seem fair?

thinking about it more, it's probably better if we do no filtering and return all methods in `InstanceKlass->_methods`. How about something like `getAllMethods`:

```
    /**
     * Returns a list containing all methods present within this type. This list can include
     * methods implicitly created and used by the VM.
     * The returned List is unmodifiable; calls to any mutator method
     * will always cause {@code UnsupportedOperationException} to be thrown.
     *
     * @param forceLink if {@code true}, forces this type to be {@link #link linked}
     */
    List<ResolvedJavaMethod> getAllMethods(boolean forceLink);
```

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2113338609

From dnsimon at openjdk.org  Fri May 30 15:06:09 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 30 May 2025 15:06:09 GMT
Subject: RFR: 8357987: [JVMCI] Add Support for Retrieving All Non-Static
 Methods of a ResolvedJavaType.
In-Reply-To: <ZBAoB_Si9hi0eZztvhGaptpSmOykOjEuPe3eQJo3hhg=.32b6fe89-48d5-466d-b3d1-ec281b3c1ea1@github.com>
References: <mRzKVfwoUWsIiQoMx2rVRinkLcV9w14P2rSMWz9_m2g=.adb59499-09c3-4085-8464-7e6f75bad624@github.com>
 <OclBEcfmMgopFnExwfDOoF4D7dX0v42DPYgJyPwhPGc=.40f0e084-955e-4e70-9cf9-cfd09c5c72fb@github.com>
 <621JpJVqtfhOtmuHd54KXE7kbOW_RzTQuudFesTADJ0=.d0985feb-fd1a-45eb-8246-261cb3127d2a@github.com>
 <ZBAoB_Si9hi0eZztvhGaptpSmOykOjEuPe3eQJo3hhg=.32b6fe89-48d5-466d-b3d1-ec281b3c1ea1@github.com>
Message-ID: <5TwuZTOvXugCHTiNOQpfYWtfwgV9b0HTyzoPdRMSB3U=.8e616674-895a-4c16-9d4e-2655b7b410f7@github.com>

On Thu, 29 May 2025 06:56:18 GMT, Tom Shull <duke at openjdk.org> wrote:

>> I realized NonStatic is not accurate - we return everything except `<init>s` and `<clinit>` - so I switched to `NonInitializerMethods` everywhere. Does that seem fair?
>
> thinking about it more, it's probably better if we do no filtering and return all methods in `InstanceKlass->_methods`. How about something like `getAllMethods`:
> 
> ```
>     /**
>      * Returns a list containing all methods present within this type. This list can include
>      * methods implicitly created and used by the VM.
>      * The returned List is unmodifiable; calls to any mutator method
>      * will always cause {@code UnsupportedOperationException} to be thrown.
>      *
>      * @param forceLink if {@code true}, forces this type to be {@link #link linked}
>      */
>     List<ResolvedJavaMethod> getAllMethods(boolean forceLink);
> ```

Yes, that's a good idea - it's more future proof and lets the caller do the filtering.

`This list can include methods implicitly created and used by the VM that are not present in {@link #getDeclaredMethods}.`
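
To illustrate the two properties discussed here (an unmodifiable result, with filtering left to the caller), a self-contained sketch follows. `MethodInfo` is a hypothetical stand-in for `ResolvedJavaMethod`, and the sample data is invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: MethodInfo is a hypothetical stand-in for ResolvedJavaMethod.
final class AllMethodsSketch {
    static final class MethodInfo {
        final String name;
        final boolean overpass;
        MethodInfo(String name, boolean overpass) {
            this.name = name;
            this.overpass = overpass;
        }
    }

    /** Returns every method unfiltered, wrapped so that callers cannot mutate the list. */
    static List<MethodInfo> getAllMethods() {
        List<MethodInfo> methods = new ArrayList<>();
        methods.add(new MethodInfo("<init>", false));
        methods.add(new MethodInfo("<clinit>", false));
        methods.add(new MethodInfo("run", false));
        methods.add(new MethodInfo("toString", true)); // pretend VM-created overpass
        return Collections.unmodifiableList(methods);
    }

    /** Caller-side filtering: drop initializers and overpass methods. */
    static List<String> declaredLikeNames(List<MethodInfo> all) {
        return all.stream()
                .filter(m -> !m.overpass)
                .map(m -> m.name)
                .filter(n -> !n.equals("<init>") && !n.equals("<clinit>"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<MethodInfo> all = getAllMethods();
        System.out.println(declaredLikeNames(all));
        try {
            all.add(new MethodInfo("extra", false));
        } catch (UnsupportedOperationException e) {
            System.out.println("list is unmodifiable");
        }
    }
}
```

Returning everything and letting each caller apply its own predicate means a new category of VM-internal method does not require a new API method.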

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2113592457

From duke at openjdk.org  Fri May 30 15:06:29 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:29 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
Message-ID: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>

This PR adds support for directly retrieving all invokedynamic BootstrapMethodInvocations from a ConstantPool.

In addition, two methods are added to the BootstrapMethodInvocations:
1. `void resolveInvokeDynamic()`
2. `JavaConstant lookupInvokeDynamicAppendix()`

The combination of these two features allows one to directly interact with all invokedynamic information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes.

-------------

Commit messages:
 - complete changes
 - commit review suggestion
 - commit review suggestion
 - change to allow both indys and condys to be looked up all at once
 - address reviewer feedback
 - style fixes and add testing to TestDynamicConstants.
 - Add support for retrieving all Indy BootstrapMethodInvocations from Constant Pool.

Changes: https://git.openjdk.org/jdk/pull/25420/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8357660
  Stats: 142 lines in 5 files changed: 130 ins; 0 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/25420.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25420/head:pull/25420

PR: https://git.openjdk.org/jdk/pull/25420

From dnsimon at openjdk.org  Fri May 30 15:06:30 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 30 May 2025 15:06:30 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
In-Reply-To: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
References: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
Message-ID: <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>

On Fri, 23 May 2025 17:37:14 GMT, Tom Shull <duke at openjdk.org> wrote:

> This PR adds support for directly retrieving all invokedynamic BootstrapMethodInvocations from a ConstantPool.
> 
> In addition, two methods are added to the BootstrapMethodInvocations:
> 1. `void resolveInvokeDynamic()`
> 2. `JavaConstant lookupInvokeDynamicAppendix()`
> 
> The combination of these two features allows one to directly interact with all invokedynamic information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes

Please add some tests for the new methods to `test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.hotspot.test/src/jdk/vm/ci/hotspot/test/TestDynamicConstant.java`.

I also updated the title of https://bugs.openjdk.org/browse/JDK-8357660 to Not Be All Capitalized so you'll need to fix the title of this PR.
Also, please update both titles and descriptions further to reflect the final changes (i.e. lookupBootstrapMethodInvocations instead of lookupIndyBootstrapMethodInvocations).

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/CompilerToVM.java line 476:

> 474: 
> 475:     /**
> 476:      * Returns the number of {@code ResolvedIndyEntry} present within this constant

`{@code ResolvedIndyEntry}` -> `{@code ResolvedIndyEntry}s`

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java line 540:

> 538:         private final JavaConstant type;
> 539:         private final List<JavaConstant> staticArguments;
> 540:         private final int index;

index -> cpiOrIndyIndex

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java line 651:

> 649:                 return List.of();
> 650:             }
> 651:             return IntStream.range(0, numIndys).mapToObj(i -> lookupBootstrapMethodInvocation(i, Bytecodes.INVOKEDYNAMIC))

Suggestion:

            return IntStream.range(0, numIndys)
                            .mapToObj(i -> lookupBootstrapMethodInvocation(i, Bytecodes.INVOKEDYNAMIC))
                            .toList();

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java line 654:

> 652:                     .toList();
> 653:         } else {
> 654:             return IntStream.range(1, length()).filter(cpi -> {

Suggestion:

            return IntStream.range(1, length())
                            .filter(this::isDynamicEntry)
                            .mapToObj(...);


and:

    private boolean isDynamicEntry(int cpi) {
        JvmConstant tagAt = getTagAt(cpi);
        return tagAt != null && tagAt.name.equals("Dynamic");
    }
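
The suggested pipeline can be tried outside HotSpot with a small model. This is only a sketch under stated assumptions: the `tags` array is a hypothetical stand-in for the constant pool (plain strings instead of `JvmConstant`), and `dynamicEntryIndexes` is an illustrative name that returns the matching indexes rather than `BootstrapMethodInvocation` objects.

```java
import java.util.List;
import java.util.stream.IntStream;

final class DynamicEntryFilterSketch {
    // Hypothetical constant pool tags; slot 0 is unused, as in a class file.
    private final String[] tags;

    DynamicEntryFilterSketch(String... tags) {
        this.tags = tags;
    }

    int length() {
        return tags.length;
    }

    // Mirrors the shape of the suggested helper, using plain strings as tags.
    private boolean isDynamicEntry(int cpi) {
        String tag = tags[cpi];
        return tag != null && tag.equals("Dynamic");
    }

    /** Indexes of all entries tagged "Dynamic", skipping slot 0. */
    List<Integer> dynamicEntryIndexes() {
        return IntStream.range(1, length())
                        .filter(this::isDynamicEntry)
                        .boxed()
                        .toList();
    }

    public static void main(String[] args) {
        DynamicEntryFilterSketch cp =
                new DynamicEntryFilterSketch(null, "Utf8", "Dynamic", "InvokeDynamic", "Dynamic");
        System.out.println(cp.dynamicEntryIndexes());
    }
}
```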

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java line 657:

> 655:         } else {
> 656:             return IntStream.range(1, length())
> 657:                             .filter(this::isDynamicEntry)

Looks like you forgot to add the definition of `isDynamicEntry` that I suggested:

    private boolean isDynamicEntry(int cpi) {
        JvmConstant tagAt = getTagAt(cpi);
        return tagAt != null && tagAt.name.equals("Dynamic");
    }

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java line 198:

> 196:          * If this bootstrap method invocation is for a {@code
> 197:          * CONSTANT_InvokeDynamic_info} pool entry, then this method ensures the
> 198:          * invoke dynamic is resolved. This can be used to compile time resolve the

What exactly does resolving an invoke dynamic mean?
Also I would leave out the sentence about "compile time" unless you clarify exactly what that means.

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java line 233:

> 231: 
> 232:     /**
> 233:      * Returns the BootstrapMethodInvocation instances for all invokedynamic

Point out that the returned list is unmodifiable (like the API for `Stream.toList()` does).

src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java line 237:

> 235:      * is returned.
> 236:      */
> 237:     BootstrapMethodInvocation[] lookupAllIndyBootstrapMethodInvocations();

Why not make this return all `BootstrapMethodInvocation`s? The caller can then filter out the indy ones with `isInvokeDynamic`. Also, please return a `List<BootstrapMethodInvocation>` instead of an array - we should never return arrays from JVMCI (see #23159 as an example of addressing existing API). Lastly, return `List.of()` instead of null.
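
A minimal sketch of the requested return-type convention, assuming a hypothetical `lookupAll` that produces string labels in place of real `BootstrapMethodInvocation` objects: it returns `List.of()` for the empty case instead of `null`, and otherwise relies on `Stream.toList()`, which yields an unmodifiable list.

```java
import java.util.List;
import java.util.stream.IntStream;

final class UnmodifiableResultSketch {
    /** Hypothetical lookup: one label per entry; never returns null. */
    static List<String> lookupAll(int count) {
        if (count == 0) {
            return List.of();
        }
        return IntStream.range(0, count)
                        .mapToObj(i -> "bsmi#" + i)
                        .toList();
    }

    /** Returns true if the list refuses mutation, as an unmodifiable list must. */
    static boolean rejectsMutation(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupAll(0).isEmpty());
        System.out.println(rejectsMutation(lookupAll(2)));
    }
}
```

Callers can then iterate the result without a null check, and the implementation does not need a defensive copy, since the list cannot be modified.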

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25420#issuecomment-2906643446
PR Comment: https://git.openjdk.org/jdk/pull/25420#issuecomment-2921667337
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2107428322
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2115447272
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2114177826
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2114187417
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2114737379
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2107430633
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2112429562
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2107441215

From duke at openjdk.org  Fri May 30 15:06:09 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:09 GMT
Subject: RFR: 8357987: [JVMCI] Add Support for Retrieving All Non-Static
 Methods of a ResolvedJavaType.
In-Reply-To: <5TwuZTOvXugCHTiNOQpfYWtfwgV9b0HTyzoPdRMSB3U=.8e616674-895a-4c16-9d4e-2655b7b410f7@github.com>
References: <mRzKVfwoUWsIiQoMx2rVRinkLcV9w14P2rSMWz9_m2g=.adb59499-09c3-4085-8464-7e6f75bad624@github.com>
 <OclBEcfmMgopFnExwfDOoF4D7dX0v42DPYgJyPwhPGc=.40f0e084-955e-4e70-9cf9-cfd09c5c72fb@github.com>
 <621JpJVqtfhOtmuHd54KXE7kbOW_RzTQuudFesTADJ0=.d0985feb-fd1a-45eb-8246-261cb3127d2a@github.com>
 <ZBAoB_Si9hi0eZztvhGaptpSmOykOjEuPe3eQJo3hhg=.32b6fe89-48d5-466d-b3d1-ec281b3c1ea1@github.com>
 <5TwuZTOvXugCHTiNOQpfYWtfwgV9b0HTyzoPdRMSB3U=.8e616674-895a-4c16-9d4e-2655b7b410f7@github.com>
Message-ID: <3kzvHswjZ98huXibmqouApRGInSf3rwkIwQReBOCANc=.c51d8496-6032-4387-8b5b-fba8b1d7adf4@github.com>

On Thu, 29 May 2025 09:40:37 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> thinking about it more, it's probably better if we do no filtering and return all methods in `InstanceKlass->_methods`. How about something like `getAllMethods`:
>> 
>> ```
>>     /**
>>      * Returns a list containing all methods present within this type. This list can include
>>      * methods implicitly created and used by the VM.
>>      * The returned List is unmodifiable; calls to any mutator method
>>      * will always cause {@code UnsupportedOperationException} to be thrown.
>>      *
>>      * @param forceLink if {@code true}, forces this type to be {@link #link linked}
>>      */
>>     List<ResolvedJavaMethod> getAllMethods(boolean forceLink);
>> ```
>
> Yes, that's a good idea - it's more future proof and lets the caller do the filtering.
> 
> `This list can include methods implicitly created and used by the VM that are not present in {@link #getDeclaredMethods}.`

I changed it now to be `getAllMethods`

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2114796740

From dnsimon at openjdk.org  Fri May 30 15:06:09 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 30 May 2025 15:06:09 GMT
Subject: RFR: 8357987: [JVMCI] Add Support for Retrieving All Non-Static
 Methods of a ResolvedJavaType.
In-Reply-To: <OclBEcfmMgopFnExwfDOoF4D7dX0v42DPYgJyPwhPGc=.40f0e084-955e-4e70-9cf9-cfd09c5c72fb@github.com>
References: <mRzKVfwoUWsIiQoMx2rVRinkLcV9w14P2rSMWz9_m2g=.adb59499-09c3-4085-8464-7e6f75bad624@github.com>
 <OclBEcfmMgopFnExwfDOoF4D7dX0v42DPYgJyPwhPGc=.40f0e084-955e-4e70-9cf9-cfd09c5c72fb@github.com>
Message-ID: <cVGtYGAiNjtYA0fCqDps1jC66MO3inCqs4jcsX2r4LU=.67706502-08e4-4c26-882d-04518d18b719@github.com>

On Wed, 28 May 2025 17:41:15 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API.
>> 
>> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed.
>
> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java line 583:
> 
>> 581:     @Override
>> 582:     public boolean isDeclared() {
>> 583:         if (isConstructor() || isStatic()) {
> 
> `isStatic()` -> `isClassInitializer()`

Looks like you did not yet make the `isClassInitializer()` fix. This also implies some missing test coverage in TestResolvedJavaType. Can you please address both of these issues?

> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ResolvedJavaMethod.java line 118:
> 
>> 116: 
>> 117:     /**
>> 118:      * Returns {@code true} if this method would be contained in the array returned by
> 
> `would be` -> `is`

not yet fixed (or pushed?)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2113583058
PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2113587598

From duke at openjdk.org  Fri May 30 15:06:30 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:30 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
In-Reply-To: <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>
References: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
 <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>
Message-ID: <MZE-CV0HtDOdoEfFaS8hRFELYOFDGqMFZCIiqRoFiHE=.92953d72-9354-4b60-9ae8-4922c63ddcd7@github.com>

On Sat, 24 May 2025 08:49:54 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> This PR adds support for directly retrieving all invokedynamic BootstrapMethodInvocations from a ConstantPool.
>> 
>> In addition, two methods are added to the BootstrapMethodInvocations:
>> 1. `void resolveInvokeDynamic()`
>> 2. `JavaConstant lookupInvokeDynamicAppendix()`
>> 
>> The combination of these two features allows one to directly interact with all invokedynamic information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes
>
> Please add some tests for the new methods to `test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.hotspot.test/src/jdk/vm/ci/hotspot/test/TestDynamicConstant.java`.

@dougxc I integrated testing for the new methods into `TestDynamicConstant.java` now

@dougxc I cleaned up the PR to now have the symmetric lookup option and updated the tests

> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java line 657:
> 
>> 655:         } else {
>> 656:             return IntStream.range(1, length())
>> 657:                             .filter(this::isDynamicEntry)
> 
> Looks like you forgot to add the definition of `isDynamicEntry` that I suggested:
> 
>     private boolean isDynamicEntry(int cpi) {
>         JvmConstant tagAt = getTagAt(cpi);
>         return tagAt != null && tagAt.name.equals("Dynamic");
>     }

Yes, I applied the suggested change via GitHub and am just now validating that it works (which of course it doesn't). I'll fix it.

> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java line 198:
> 
>> 196:          * If this bootstrap method invocation is for a {@code
>> 197:          * CONSTANT_InvokeDynamic_info} pool entry, then this method ensures the
>> 198:          * invoke dynamic is resolved. This can be used to compile time resolve the
> 
> What exactly does resolving an invoke dynamic mean?
> Also I would leave out the sentence about "compile time" unless you clarify exactly what that means.

Would you want me to add a reference to https://docs.oracle.com/javase/specs/jvms/se24/html/jvms-5.html#jvms-5.4.3.6?

I removed the compile time sentence; I had included it for consistency with `loadReferencedType`.

> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java line 237:
> 
>> 235:      * is returned.
>> 236:      */
>> 237:     BootstrapMethodInvocation[] lookupAllIndyBootstrapMethodInvocations();
> 
> Why not make this return all `BootstrapMethodInvocation`s? The caller can then filter out the indy ones with `isInvokeDynamic`. Also, please return a `List<BootstrapMethodInvocation>` instead of an array - we should never return arrays from JVMCI (see #23159 as an example of addressing existing API). Lastly, return `List.of()` instead of null.

Changed to return a list.

> Why not make this return all BootstrapMethodInvocations
1. Within HotSpot it is very easy to pick off all indy BootstrapMethodInvocations via [the ConstantPoolCache](https://github.com/openjdk/jdk/blob/72a3022dc6a1521d8e3f08fe5d592f760fc462d2/src/hotspot/share/oops/cpCache.hpp#L74)
2. Each invokedynamic bytecode location has a unique BootstrapMethodInvocation instance, but they may share the same constant pool entry, so it's not trivial to find all BootstrapMethodInvocations. One would have to iterate over both all method bytecodes and all constant pool slots, and do some additional filtering.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25420#issuecomment-2909796813
PR Comment: https://git.openjdk.org/jdk/pull/25420#issuecomment-2918426821
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2114780251
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2109301347
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2109317539

From dnsimon at openjdk.org  Fri May 30 15:06:30 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 30 May 2025 15:06:30 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
In-Reply-To: <MZE-CV0HtDOdoEfFaS8hRFELYOFDGqMFZCIiqRoFiHE=.92953d72-9354-4b60-9ae8-4922c63ddcd7@github.com>
References: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
 <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>
 <MZE-CV0HtDOdoEfFaS8hRFELYOFDGqMFZCIiqRoFiHE=.92953d72-9354-4b60-9ae8-4922c63ddcd7@github.com>
Message-ID: <1AMsWwdYheV0CZ9z_VWbiEPphQwkJz-HO6h-wYNCAfw=.8259a98a-89aa-40e6-98da-81c43d2a45e0@github.com>

On Tue, 27 May 2025 14:07:21 GMT, Tom Shull <duke at openjdk.org> wrote:

> Would you want me to add a reference

The main point is that resolving can execute Java code (as far as I recall), so it cannot be called from a CompileBroker thread, as these threads must not call Java code. However, I see that this constraint is not currently documented, so it is OK to leave it out for now.

>> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java line 237:
>> 
>>> 235:      * is returned.
>>> 236:      */
>>> 237:     BootstrapMethodInvocation[] lookupAllIndyBootstrapMethodInvocations();
>> 
>> Why not make this return all `BootstrapMethodInvocation`s? The caller can then filter out the indy ones with `isInvokeDynamic`. Also, please return a `List<BootstrapMethodInvocation>` instead of an array - we should never return arrays from JVMCI (see #23159 as an example of addressing existing API). Lastly, return `List.of()` instead of null.
>
> Changed to return a list.
> 
>> Why not make this return all BootstrapMethodInvocations
> 1. Within HotSpot it is very easy to pick off all indy BootstrapMethodInvocations via [the ConstantPoolCache](https://github.com/openjdk/jdk/blob/72a3022dc6a1521d8e3f08fe5d592f760fc462d2/src/hotspot/share/oops/cpCache.hpp#L74)
> 2. Each invokedynamic bytecode location has a unique BootstrapMethodInvocation instance, but they may share the same constant pool entry, so it's not trivial to find all BootstrapMethodInvocations. One would have to iterate both all method bytecodes and constant pool slots, and do some additional filtering.

How about `List<BootstrapMethodInvocation> lookupBootstrapMethodInvocations(boolean indy)`? That is, it either gets the indy *or* the condy BSM invocations. I can imagine SVM wanting the latter at some point right?

BTW, I noticed that the javadoc for `ConstantPool.lookupBootstrapMethodInvocation` is somewhat incorrect. Please check and apply these corrections in this PR:

diff --git a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
index 2273b256f03..3519af4bcbb 100644
--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
+++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
@@ -199,12 +199,12 @@ interface BootstrapMethodInvocation {
      * in the constant pool.
      *
      * @param index if {@code opcode} is -1,  {@code index} is a constant pool index. Otherwise {@code opcode}
-     *              must be {@code Bytecodes.INVOKEDYNAMIC}, and {@code index} must be the operand of that
-     *              opcode in the bytecode stream (i.e., a {@code rawIndex}).
-     * @param opcode must be {@code Bytecodes.INVOKEDYNAMIC}, or -1 if
+     *              must be {@code Bytecodes.INVOKEDYNAMIC} or {@code CONSTANT_Dynamic_info}, and {@code index}
+     *              must be the operand of that opcode in the bytecode stream (i.e., a {@code rawIndex}).
+     * @param opcode must be {@code Bytecodes.INVOKEDYNAMIC}, {@code CONSTANT_Dynamic_info}, or -1 if
      *            {@code index} was not decoded from a bytecode stream
      * @return the bootstrap method invocation details or {@code null} if the entry specified by {@code index}
-     *         is not a {@code CONSTANT_Dynamic_info} or @{code CONSTANT_InvokeDynamic_info}
+     *         is not a {@code CONSTANT_Dynamic_info} or {@code CONSTANT_InvokeDynamic_info}
      * @jvms 4.7.23 The {@code BootstrapMethods} Attribute
      */
     default BootstrapMethodInvocation lookupBootstrapMethodInvocation(int index, int opcode) {

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2109436288
PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2109450651

From duke at openjdk.org  Fri May 30 15:06:30 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:30 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
In-Reply-To: <1AMsWwdYheV0CZ9z_VWbiEPphQwkJz-HO6h-wYNCAfw=.8259a98a-89aa-40e6-98da-81c43d2a45e0@github.com>
References: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
 <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>
 <MZE-CV0HtDOdoEfFaS8hRFELYOFDGqMFZCIiqRoFiHE=.92953d72-9354-4b60-9ae8-4922c63ddcd7@github.com>
 <1AMsWwdYheV0CZ9z_VWbiEPphQwkJz-HO6h-wYNCAfw=.8259a98a-89aa-40e6-98da-81c43d2a45e0@github.com>
Message-ID: <FPaXWqgbUNzAg2St_xy25-rygTB5y1iioDDTlgSqbhg=.109705fe-fb62-4007-aa51-c75a5e07f3b1@github.com>

On Tue, 27 May 2025 15:03:02 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> Changed to return a list.
>> 
>>> Why not make this return all BootstrapMethodInvocations
>> 1. Within HotSpot it is very easy to pick off all indy BootstrapMethodInvocations via [the ConstantPoolCache](https://github.com/openjdk/jdk/blob/72a3022dc6a1521d8e3f08fe5d592f760fc462d2/src/hotspot/share/oops/cpCache.hpp#L74)
>> 2. Each invokedynamic bytecode location has a unique BootstrapMethodInvocation instance, but they may share the same constant pool entry, so it's not trivial to find all BootstrapMethodInvocations. One would have to iterate both all method bytecodes and constant pool slots, and do some additional filtering.
>
> How about `List<BootstrapMethodInvocation> lookupBootstrapMethodInvocations(boolean indy)`? That is, it either gets the indy *or* the condy BSM invocations. I can imagine SVM wanting the latter at some point right?
> 
> BTW, I noticed that the javadoc for `ConstantPool.lookupBootstrapMethodInvocation` is somewhat incorrect. Please check and apply these corrections in this PR:
> 
> diff --git a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
> index 2273b256f03..3519af4bcbb 100644
> --- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
> +++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
> @@ -199,12 +199,12 @@ interface BootstrapMethodInvocation {
>       * in the constant pool.
>       *
>       * @param index if {@code opcode} is -1,  {@code index} is a constant pool index. Otherwise {@code opcode}
> -     *              must be {@code Bytecodes.INVOKEDYNAMIC}, and {@code index} must be the operand of that
> -     *              opcode in the bytecode stream (i.e., a {@code rawIndex}).
> -     * @param opcode must be {@code Bytecodes.INVOKEDYNAMIC}, or -1 if
> +     *              must be {@code Bytecodes.INVOKEDYNAMIC} or {@code CONSTANT_Dynamic_info}, and {@code index}
> +     *              must be the operand of that opcode in the bytecode stream (i.e., a {@code rawIndex}).
> +     * @param opcode must be {@code Bytecodes.INVOKEDYNAMIC}, {@code CONSTANT_Dynamic_info}, or -1 if
>       *            {@code index} was not decoded from a bytecode stream
>       * @return the bootstrap method invocation details or {@code null} if the entry specified by {@code index}
> -     *         is not a {@code CONSTANT_Dynamic_info} or @{code CONSTANT_InvokeDynamic_info}
> +     *         is not a {@code CONSTANT_Dynamic_info} or {@code CONSTANT_InvokeDynamic_info}
>       * @jvms 4.7.23 The {@code BootstrapMethods} Attribute
>       */
>      default BootstrapMethodInvocation lookupBootstrapMethodInvocation(int index, int opcode) {

I prototyped the option `List<BootstrapMethodInvocation> lookupBootstrapMethodInvocations(boolean indy)` here: https://github.com/openjdk/jdk/compare/master...teshull:jdk:jvmci_bootstrap_alternative

As part of this I also prototyped generic BSM resolution / lookup logic

From the SVM perspective, retrieving condys via this new support isn't a big win. It's easy enough already to walk the ConstantPool. However, for symmetry purposes, it is reasonable to have this method (along with the resolve / lookup). What's your preference: this new version or the original?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2110104069

From dnsimon at openjdk.org  Fri May 30 15:06:30 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 30 May 2025 15:06:30 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
In-Reply-To: <FPaXWqgbUNzAg2St_xy25-rygTB5y1iioDDTlgSqbhg=.109705fe-fb62-4007-aa51-c75a5e07f3b1@github.com>
References: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
 <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>
 <MZE-CV0HtDOdoEfFaS8hRFELYOFDGqMFZCIiqRoFiHE=.92953d72-9354-4b60-9ae8-4922c63ddcd7@github.com>
 <1AMsWwdYheV0CZ9z_VWbiEPphQwkJz-HO6h-wYNCAfw=.8259a98a-89aa-40e6-98da-81c43d2a45e0@github.com>
 <FPaXWqgbUNzAg2St_xy25-rygTB5y1iioDDTlgSqbhg=.109705fe-fb62-4007-aa51-c75a5e07f3b1@github.com>
Message-ID: <3Lyb5MHjplhxqRmlkR6y-GpgQWe90ij_jClRdipKMQE=.cf4fcf50-4b4f-4930-abdb-75f9d0be9942@github.com>

On Tue, 27 May 2025 20:10:50 GMT, Tom Shull <duke at openjdk.org> wrote:

>> How about `List<BootstrapMethodInvocation> lookupBootstrapMethodInvocations(boolean indy)`? That is, it either gets the indy *or* the condy BSM invocations. I can imagine SVM wanting the latter at some point right?
>> 
>> BTW, I noticed that the javadoc for `ConstantPool.lookupBootstrapMethodInvocation` is somewhat incorrect. Please check and apply these corrections in this PR:
>> 
>> diff --git a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
>> index 2273b256f03..3519af4bcbb 100644
>> --- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
>> +++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/meta/ConstantPool.java
>> @@ -199,12 +199,12 @@ interface BootstrapMethodInvocation {
>>       * in the constant pool.
>>       *
>>       * @param index if {@code opcode} is -1,  {@code index} is a constant pool index. Otherwise {@code opcode}
>> -     *              must be {@code Bytecodes.INVOKEDYNAMIC}, and {@code index} must be the operand of that
>> -     *              opcode in the bytecode stream (i.e., a {@code rawIndex}).
>> -     * @param opcode must be {@code Bytecodes.INVOKEDYNAMIC}, or -1 if
>> +     *              must be {@code Bytecodes.INVOKEDYNAMIC} or {@code CONSTANT_Dynamic_info}, and {@code index}
>> +     *              must be the operand of that opcode in the bytecode stream (i.e., a {@code rawIndex}).
>> +     * @param opcode must be {@code Bytecodes.INVOKEDYNAMIC}, {@code CONSTANT_Dynamic_info}, or -1 if
>>       *            {@code index} was not decoded from a bytecode stream
>>       * @return the bootstrap method invocation details or {@code null} if the entry specified by {@code index}
>> -     *         is not a {@code CONSTANT_Dynamic_info} or @{code CONSTANT_InvokeDynamic_info}
>> +     *         is not a {@code CONSTANT_Dynamic_info} or {@code CONSTANT_InvokeDynamic_info}
>>       * @jvms 4.7.23 The {@code BootstrapMethods} Attribute
>>       */
>>      default BootstrapMethodInvocation lookupBootstrapMethodInvocation(int index, int opcode) {
>
> I prototyped the option `List<BootstrapMethodInvocation> lookupBootstrapMethodInvocations(boolean indy)` here: https://github.com/openjdk/jdk/compare/master...teshull:jdk:jvmci_bootstrap_alternative
> 
> As part of this I also prototyped generic BSM resolution / lookup logic
> 
> From the SVM perspective, retrieving condys via this new support isn't a big win. It's easy enough already to walk the ConstantPool. However, for symmetry purposes, it is reasonable to have this method (along with the resolve / lookup). What's your preference: this new version or the original?

I like the symmetry of the new version. Also, I think you can simplify things by replacing use of `flatMap` [here](https://github.com/openjdk/jdk/compare/master...teshull:jdk:jvmci_bootstrap_alternative#diff-b782878562668748c5c59acc2e937f7c24de4529b8a74bd3a4eae83fa0e07846R679) with `filter`.
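
As a general illustration of the `flatMap` to `filter` simplification (not the actual code in the linked branch, which is not reproduced here): when a `flatMap` lambda only ever returns either a one-element stream or an empty stream, it selects elements exactly as `filter` does with the same predicate. The names below (`isEven`, `viaFlatMap`, `viaFilter`) are hypothetical and chosen only for the demonstration.

```java
import java.util.List;
import java.util.stream.IntStream;

final class FlatMapVersusFilterSketch {
    /** Illustrative predicate standing in for "is this entry of interest?". */
    static boolean isEven(int i) {
        return i % 2 == 0;
    }

    /** Selection written with flatMap: each element maps to 0 or 1 elements. */
    static List<Integer> viaFlatMap(int n) {
        return IntStream.range(0, n)
                        .flatMap(i -> isEven(i) ? IntStream.of(i) : IntStream.empty())
                        .boxed()
                        .toList();
    }

    /** The same selection written with filter. */
    static List<Integer> viaFilter(int n) {
        return IntStream.range(0, n)
                        .filter(FlatMapVersusFilterSketch::isEven)
                        .boxed()
                        .toList();
    }

    public static void main(String[] args) {
        System.out.println(viaFlatMap(6).equals(viaFilter(6)));
    }
}
```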

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2111539245

From duke at openjdk.org  Fri May 30 15:06:30 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:30 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
In-Reply-To: <3Lyb5MHjplhxqRmlkR6y-GpgQWe90ij_jClRdipKMQE=.cf4fcf50-4b4f-4930-abdb-75f9d0be9942@github.com>
References: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
 <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>
 <MZE-CV0HtDOdoEfFaS8hRFELYOFDGqMFZCIiqRoFiHE=.92953d72-9354-4b60-9ae8-4922c63ddcd7@github.com>
 <1AMsWwdYheV0CZ9z_VWbiEPphQwkJz-HO6h-wYNCAfw=.8259a98a-89aa-40e6-98da-81c43d2a45e0@github.com>
 <FPaXWqgbUNzAg2St_xy25-rygTB5y1iioDDTlgSqbhg=.109705fe-fb62-4007-aa51-c75a5e07f3b1@github.com>
 <3Lyb5MHjplhxqRmlkR6y-GpgQWe90ij_jClRdipKMQE=.cf4fcf50-4b4f-4930-abdb-75f9d0be9942@github.com>
Message-ID: <xSx-S1W27v2FdwYMkC1wSz-_TyHLnvrZ1oCUizzyXmk=.4aa4f771-8ae8-4a9f-a37b-c5e63899c00d@github.com>

On Wed, 28 May 2025 10:45:07 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> I prototyped the option `List<BootstrapMethodInvocation> lookupBootstrapMethodInvocations(boolean indy)` here: https://github.com/openjdk/jdk/compare/master...teshull:jdk:jvmci_bootstrap_alternative
>> 
>> As part of this I also prototyped generic BSM resolution / lookup logic
>> 
>> From the SVM perspective, retrieving condys via this new support isn't a big win. It's easy enough already to walk the ConstantPool. However, for symmetry purposes, it is reasonable to have this method (along with the resolve / lookup). What's your preference: this new version or the original?
>
> I like the symmetry of the new version. Also, I think you can simplify things by replacing use of `flatMap` [here](https://github.com/openjdk/jdk/compare/master...teshull:jdk:jvmci_bootstrap_alternative#diff-b782878562668748c5c59acc2e937f7c24de4529b8a74bd3a4eae83fa0e07846R679) with `filter`.

I fixed the misplaced `@` in the javadoc `{@code}` tag. However, the `opcode` doc changes look wrong to me; the opcode must be -1 or INVOKEDYNAMIC (https://github.com/openjdk/jdk/blob/04e0fe00abcf1d7919a50e0c9dd44ce2856984ea/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java#L592)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2113271157

From dnsimon at openjdk.org  Fri May 30 15:06:30 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 30 May 2025 15:06:30 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
In-Reply-To: <xSx-S1W27v2FdwYMkC1wSz-_TyHLnvrZ1oCUizzyXmk=.4aa4f771-8ae8-4a9f-a37b-c5e63899c00d@github.com>
References: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
 <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>
 <MZE-CV0HtDOdoEfFaS8hRFELYOFDGqMFZCIiqRoFiHE=.92953d72-9354-4b60-9ae8-4922c63ddcd7@github.com>
 <1AMsWwdYheV0CZ9z_VWbiEPphQwkJz-HO6h-wYNCAfw=.8259a98a-89aa-40e6-98da-81c43d2a45e0@github.com>
 <FPaXWqgbUNzAg2St_xy25-rygTB5y1iioDDTlgSqbhg=.109705fe-fb62-4007-aa51-c75a5e07f3b1@github.com>
 <3Lyb5MHjplhxqRmlkR6y-GpgQWe90ij_jClRdipKMQE=.cf4fcf50-4b4f-4930-abdb-75f9d0be9942@github.com>
 <xSx-S1W27v2FdwYMkC1wSz-_TyHLnvrZ1oCUizzyXmk=.4aa4f771-8ae8-4a9f-a37b-c5e63899c00d@github.com>
Message-ID: <aM9X3BVegS2YAE2FR5ywOYn5s9OeUL52fDiVYgfLOHE=.2e1d394b-1645-431a-b478-432b12431d7d@github.com>

On Thu, 29 May 2025 06:04:24 GMT, Tom Shull <duke at openjdk.org> wrote:

>> I like the symmetry of the new version. Also, I think you can simplify things by replacing use of `flatMap` [here](https://github.com/openjdk/jdk/compare/master...teshull:jdk:jvmci_bootstrap_alternative#diff-b782878562668748c5c59acc2e937f7c24de4529b8a74bd3a4eae83fa0e07846R679) with `filter`.
>
> I updated the javadoc misplaced `@` in `{@code}`. However, the `opcode` doc changes look wrong to me; the opcode must be -1 or INVOKEDYNAMIC (https://github.com/openjdk/jdk/blob/04e0fe00abcf1d7919a50e0c9dd44ce2856984ea/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java#L592)

yeah, looks like you're right. I was basing my assumption on `case "Dynamic"` in:

    @Override
    public BootstrapMethodInvocation lookupBootstrapMethodInvocation(int index, int opcode) {
        int cpi = opcode == -1 ? index : indyIndexConstantPoolIndex(index, opcode);
        final JvmConstant tag = getTagAt(cpi);
        switch (tag.name) {
            case "InvokeDynamic":
            case "Dynamic":

I guess it's possible for an INVOKEDYNAMIC to resolve its cpi to a CONSTANT_Dynamic entry.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2113973088

From duke at openjdk.org  Fri May 30 15:06:30 2025
From: duke at openjdk.org (Tom Shull)
Date: Fri, 30 May 2025 15:06:30 GMT
Subject: RFR: 8357660: [JVMCI] Add Support for Retrieving All Indy
 BootstrapMethodInvocations directly from the ConstantPool
In-Reply-To: <aM9X3BVegS2YAE2FR5ywOYn5s9OeUL52fDiVYgfLOHE=.2e1d394b-1645-431a-b478-432b12431d7d@github.com>
References: <Lc6lP7En8OrhiQT7Y90xz7av5HpuzXPOLk57MPNYSZU=.cdb8db7e-e3cc-4a6d-9efd-b1c1f902d8a2@github.com>
 <mcfWvAY_AciOhNDzR_KU_GFkI7ngK7T7esQt8QaQ7qo=.e77f736e-ccc1-4690-9248-7a2589c665a9@github.com>
 <MZE-CV0HtDOdoEfFaS8hRFELYOFDGqMFZCIiqRoFiHE=.92953d72-9354-4b60-9ae8-4922c63ddcd7@github.com>
 <1AMsWwdYheV0CZ9z_VWbiEPphQwkJz-HO6h-wYNCAfw=.8259a98a-89aa-40e6-98da-81c43d2a45e0@github.com>
 <FPaXWqgbUNzAg2St_xy25-rygTB5y1iioDDTlgSqbhg=.109705fe-fb62-4007-aa51-c75a5e07f3b1@github.com>
 <3Lyb5MHjplhxqRmlkR6y-GpgQWe90ij_jClRdipKMQE=.cf4fcf50-4b4f-4930-abdb-75f9d0be9942@github.com>
 <xSx-S1W27v2FdwYMkC1wSz-_TyHLnvrZ1oCUizzyXmk=.4aa4f771-8ae8-4a9f-a37b-c5e63899c00d@github.com>
 <aM9X3BVegS2YAE2FR5ywOYn5s9OeUL52fDiVYgfLOHE=.2e1d394b-1645-431a-b478-432b12431d7d@github.com>
Message-ID: <Jngc03iCLmunkaNBcDnfmvF_MKc5OQnlBtNuvPVaPm4=.bb49f1a6-6217-4860-9d65-e47ce2f5cf08@github.com>

On Thu, 29 May 2025 13:40:55 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> I updated the javadoc misplaced `@` in `{@code}`. However, the `opcode` doc changes look wrong to me; the opcode must be -1 or INVOKEDYNAMIC (https://github.com/openjdk/jdk/blob/04e0fe00abcf1d7919a50e0c9dd44ce2856984ea/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java#L592)
>
> yeah, looks like you're right. I was basing my assumption on `case "Dynamic"` in:
> 
>     @Override
>     public BootstrapMethodInvocation lookupBootstrapMethodInvocation(int index, int opcode) {
>         int cpi = opcode == -1 ? index : indyIndexConstantPoolIndex(index, opcode);
>         final JvmConstant tag = getTagAt(cpi);
>         switch (tag.name) {
>             case "InvokeDynamic":
>             case "Dynamic":
> 
> I guess it's possible for an INVOKEDYNAMIC to resolve its cpi to a CONSTANT_Dynamic entry.

I think INVOKEDYNAMIC should always point to a CONSTANT_InvokeDynamic entry
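
As a rough illustration of the point above, the sketch below encodes which bytecodes may refer to which bootstrap-method-carrying constant pool tags according to the JVM specification: `invokedynamic` refers to a `CONSTANT_InvokeDynamic` entry, while `CONSTANT_Dynamic` entries are loaded by the `ldc` family. The class and method names are hypothetical and are not part of JVMCI; only the tag and opcode values come from the specification.

```java
final class BootstrapTags {
    // Constant pool tags (JVMS 4.4).
    static final int CONSTANT_DYNAMIC = 17;
    static final int CONSTANT_INVOKE_DYNAMIC = 18;

    // Opcodes (JVMS 6.5).
    static final int LDC = 18;
    static final int LDC_W = 19;
    static final int LDC2_W = 20;
    static final int INVOKEDYNAMIC = 186;

    // Hypothetical helper: may `opcode` refer to a bootstrap-method entry with this tag?
    static boolean mayReference(int opcode, int tag) {
        switch (opcode) {
            case INVOKEDYNAMIC:
                return tag == CONSTANT_INVOKE_DYNAMIC;
            case LDC:
            case LDC_W:
            case LDC2_W:
                return tag == CONSTANT_DYNAMIC;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(mayReference(INVOKEDYNAMIC, CONSTANT_INVOKE_DYNAMIC)); // prints true
        System.out.println(mayReference(INVOKEDYNAMIC, CONSTANT_DYNAMIC));        // prints false
        System.out.println(mayReference(LDC, CONSTANT_DYNAMIC));                  // prints true
    }
}
```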

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25420#discussion_r2114794800

From never at openjdk.org  Fri May 30 16:07:52 2025
From: never at openjdk.org (Tom Rodriguez)
Date: Fri, 30 May 2025 16:07:52 GMT
Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in
 JVMCINMethodData::get_nmethod_mirror
In-Reply-To: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>
References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com>
Message-ID: <OHFtgLyztrGVE9Q0p_HFrta3a7pK_uLVnVJU6g-76vA=.a7bd8695-9b7e-4149-a462-52b0429dffd8@github.com>

On Wed, 28 May 2025 10:28:38 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

> The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called.
> This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed.

src/hotspot/share/jvmci/jvmciRuntime.cpp line 801:

> 799: 
> 800: void JVMCINMethodData::invalidate_nmethod_mirror(nmethod* nm) {
> 801:   if (_nmethod_mirror_index == -1) {

This part is actually wrong as that's the first part of `get_nmethod_mirror` and we must always check that `get_nmethod_mirror` doesn't return nullptr.  I'd assumed that the mirror was always non-null if `_nmethod_mirror_index != -1` but that's not true.  The slot is reserved for all non-default nmethods and must stay around so that `translate` can work.
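
The distinction above (a reserved slot does not imply a non-null mirror) can be shown with a toy model. This is only a Java analogy under stated assumptions: `MirrorSlotModel`, `Mirror`, and `invalidateMirror` are hypothetical names, and the real logic is C++ in `jvmciRuntime.cpp`, which this sketch does not reproduce.

```java
// Toy model only: hypothetical Java stand-ins for the C++ structures discussed above.
final class MirrorSlotModel {
    static final int NO_SLOT = -1;

    // Hypothetical stand-in for the Java-side mirror object.
    static final class Mirror {
        boolean valid = true;
    }

    private final Mirror[] slots;   // stand-in for the nmethod's object slots
    private final int mirrorIndex;  // NO_SLOT when no slot was reserved

    MirrorSlotModel(Mirror[] slots, int mirrorIndex) {
        this.slots = slots;
        this.mirrorIndex = mirrorIndex;
    }

    // Returns null both when no slot exists and when the reserved slot is empty.
    Mirror getMirror() {
        if (mirrorIndex == NO_SLOT) {
            return null;
        }
        return slots[mirrorIndex];
    }

    // Returns true only if a mirror was present and has been marked invalid.
    boolean invalidateMirror() {
        Mirror mirror = getMirror();
        if (mirror == null) {
            return false; // checking mirrorIndex alone would miss the empty-slot case
        }
        mirror.valid = false;
        return true;
    }

    public static void main(String[] args) {
        MirrorSlotModel emptySlot = new MirrorSlotModel(new Mirror[] { null }, 0);
        System.out.println(emptySlot.invalidateMirror()); // prints false
    }
}
```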

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25488#discussion_r2116193278

From rkennke at openjdk.org  Fri May 30 16:13:25 2025
From: rkennke at openjdk.org (Roman Kennke)
Date: Fri, 30 May 2025 16:13:25 GMT
Subject: RFR: 8358169: Shenandoah/JVMCI: Export GC state constants
Message-ID: <AW1S5EQJ3RBD_SuBNPbncl9eCsbYcw7jG9ovqryyijo=.b15a6da5-39c8-45dd-8824-6064f237da11@github.com>

We need the GC state enum constants available in JVMCI.

-------------

Commit messages:
 - 8358169: Shenandoah/JVMCI: Export GC state constants

Changes: https://git.openjdk.org/jdk/pull/25552/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25552&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8358169
  Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/25552.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25552/head:pull/25552

PR: https://git.openjdk.org/jdk/pull/25552

From dnsimon at openjdk.org  Fri May 30 16:39:51 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 30 May 2025 16:39:51 GMT
Subject: RFR: 8358169: Shenandoah/JVMCI: Export GC state constants
In-Reply-To: <AW1S5EQJ3RBD_SuBNPbncl9eCsbYcw7jG9ovqryyijo=.b15a6da5-39c8-45dd-8824-6064f237da11@github.com>
References: <AW1S5EQJ3RBD_SuBNPbncl9eCsbYcw7jG9ovqryyijo=.b15a6da5-39c8-45dd-8824-6064f237da11@github.com>
Message-ID: <plQeXw912hfk3HvrTvLLzM6GXoSIeVRus5Tmy0C2N3o=.03188c16-28b6-44d6-a4cb-473df4f6c189@github.com>

On Fri, 30 May 2025 16:09:03 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> We need the GC state enum constants available in JVMCI.

Looks good.

-------------

Marked as reviewed by dnsimon (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25552#pullrequestreview-2881865876

From jbhateja at openjdk.org  Fri May 30 17:22:53 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 30 May 2025 17:22:53 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <eJbb5hzhM27dVuc7MD6kqvlLReNHthJNiRXttkPwzQo=.bb1151a7-959a-4f56-8ebf-bde8126ea5d4@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
 <kPx9NqOgNJPStCsPd4PtRhfbQeEXi5xbefvSjCHoPSY=.aa26a490-8f90-4e17-8c9e-cde0c25a9fbb@github.com>
 <eJbb5hzhM27dVuc7MD6kqvlLReNHthJNiRXttkPwzQo=.bb1151a7-959a-4f56-8ebf-bde8126ea5d4@github.com>
Message-ID: <v7r6P6Vbx5tR4sIwy5mwub6ev0OnG1n3HipHuJe-qVc=.f1ce37ed-1218-4f0a-89ab-f4b5c7fdbbf8@github.com>

On Thu, 29 May 2025 18:49:28 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> test/micro/org/openjdk/bench/java/lang/CbrtPerf.java line 56:
>> 
>>> 54:     public static class CbrtPerfRanges {
>>> 55:         public static int cbrtInputCount = 2048;
>>> 56: 
>> 
>> Please create separate CbrtPerfSpecialValues for +/- 0.0 and +/- Infinity and NaN values.
>> I understand that handling special cases in the intrinsic may impact general-case performance, but it's OK to have at least a micro-benchmark for them.
>
> Ok, I added this to the new set of micro-benchmarks. I kept them as variable values.

With intrinsic disabled:

Benchmark                                              Mode  Cnt        Score   Error   Units
CbrtPerf.CbrtPerfSpecialValues.cbrtDouble0            thrpt    2  1343559.770          ops/ms
CbrtPerf.CbrtPerfSpecialValues.cbrtDoubleInf          thrpt    2   881930.283          ops/ms
CbrtPerf.CbrtPerfSpecialValues.cbrtDoubleNaN          thrpt    2   973307.409          ops/ms
CbrtPerf.CbrtPerfSpecialValues.cbrtDoubleNegative0    thrpt    2  1342454.046          ops/ms
CbrtPerf.CbrtPerfSpecialValues.cbrtDoubleNegativeInf  thrpt    2   880169.071          ops/ms

With intrinsic enabled:

Benchmark                                              Mode  Cnt       Score   Error   Units
CbrtPerf.CbrtPerfSpecialValues.cbrtDouble0            thrpt    2  293228.991          ops/ms
CbrtPerf.CbrtPerfSpecialValues.cbrtDoubleInf          thrpt    2  329190.573          ops/ms
CbrtPerf.CbrtPerfSpecialValues.cbrtDoubleNaN          thrpt    2  334625.414          ops/ms
CbrtPerf.CbrtPerfSpecialValues.cbrtDoubleNegative0    thrpt    2  270939.709          ops/ms
CbrtPerf.CbrtPerfSpecialValues.cbrtDoubleNegativeInf  thrpt    2  328087.618          ops/ms


As expected, the optimized intrinsic penalizes special-case performance in order to optimize the generic-case control paths. Have you tried adding these special-case checks and measuring the impact on performance? Alternatively, we can create a follow-up JBS issue to address it later.
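
For reference, the special inputs benchmarked above all map to themselves under the `Math.cbrt` contract (NaN stays NaN, and zeros and infinities keep their sign). The small check below is a sketch, not the JMH benchmark from the PR; the class and method names are hypothetical.

```java
final class CbrtSpecialValues {
    // The five special inputs named in the benchmark results above.
    static double[] specialInputs() {
        return new double[] {
            0.0, -0.0, Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, Double.NaN
        };
    }

    // Double.compare distinguishes 0.0 from -0.0 and treats NaN as equal to NaN.
    static boolean mapsToItself(double x) {
        return Double.compare(Math.cbrt(x), x) == 0;
    }

    public static void main(String[] args) {
        for (double x : specialInputs()) {
            System.out.println("cbrt(" + x + ") = " + Math.cbrt(x)
                    + ", maps to itself: " + mapsToItself(x));
        }
    }
}
```

Any special-case fast path added to the intrinsic would need to preserve these results, including the sign of zero.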

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2116295936

From jbhateja at openjdk.org  Fri May 30 17:58:55 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 30 May 2025 17:58:55 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v5]
In-Reply-To: <GE6s8raLeTlcMDb0coe89QbMYSPqWtjjkr_puB8hcx8=.d4911eb3-3201-441b-81a9-b476ea5b64c9@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <GE6s8raLeTlcMDb0coe89QbMYSPqWtjjkr_puB8hcx8=.d4911eb3-3201-441b-81a9-b476ea5b64c9@github.com>
Message-ID: <k6su39c38xIXp-JnEl5EZaGUeUGWQidxR2QlL6NeV5M=.6ee048ca-9aa9-4b4a-bd86-77485e6b4dcc@github.com>

On Thu, 29 May 2025 18:56:11 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Add newline back to templateInterpreterGenerator_x86_64.cpp source file
>  - Add special case values to cbrt micro-benchmark set

LGTM. We have already created follow-up JBS issues for the known limitations.

-------------

Marked as reviewed by jbhateja (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24470#pullrequestreview-2882036242

From duke at openjdk.org  Fri May 30 18:43:53 2025
From: duke at openjdk.org (duke)
Date: Fri, 30 May 2025 18:43:53 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v5]
In-Reply-To: <GE6s8raLeTlcMDb0coe89QbMYSPqWtjjkr_puB8hcx8=.d4911eb3-3201-441b-81a9-b476ea5b64c9@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <GE6s8raLeTlcMDb0coe89QbMYSPqWtjjkr_puB8hcx8=.d4911eb3-3201-441b-81a9-b476ea5b64c9@github.com>
Message-ID: <twUlp2jG-yahclPgfANuQfNC-oNmVXuygRaewNhVHRA=.7fb5cce1-4ab9-4ac2-a606-d5a30e35659a@github.com>

On Thu, 29 May 2025 18:56:11 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Add newline back to templateInterpreterGenerator_x86_64.cpp source file
>  - Add special case values to cbrt micro-benchmark set

@missa-prime 
Your change (at version 233e0188c7637cdc08bb4bebd8cb4721ccc352d1) is now ready to be sponsored by a Committer.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2923150761

From sviswanathan at openjdk.org  Fri May 30 19:05:55 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 30 May 2025 19:05:55 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v5]
In-Reply-To: <GE6s8raLeTlcMDb0coe89QbMYSPqWtjjkr_puB8hcx8=.d4911eb3-3201-441b-81a9-b476ea5b64c9@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <GE6s8raLeTlcMDb0coe89QbMYSPqWtjjkr_puB8hcx8=.d4911eb3-3201-441b-81a9-b476ea5b64c9@github.com>
Message-ID: <hwyI3EUmwW6G2T7CyqbBuW4hP9bcSiQ5sZHbQSc3sGQ=.e40be297-f036-4616-a7fb-cd0b8b6895c5@github.com>

On Thu, 29 May 2025 18:56:11 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Add newline back to templateInterpreterGenerator_x86_64.cpp source file
>  - Add special case values to cbrt micro-benchmark set

src/hotspot/cpu/x86/assembler_x86.cpp line 2879:

> 2877:   emit_operand(dst, src, 0);
> 2878: }
> 2879: 

One more change is needed. We need to set the address attributes here, as movapd takes an Address as one of its inputs:
`attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);`
This should be done before the call to `simd_prefix`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2116482247

From duke at openjdk.org  Fri May 30 19:34:16 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Fri, 30 May 2025 19:34:16 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6]
In-Reply-To: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
Message-ID: <sZVVm3lr1byp_brwwk80CWNovJIKZ4Mc_RvcRNIBBEI=.4f9e6ca0-108f-4e11-9e0c-f4b5bcf32482@github.com>

> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
> 
> The command to run all range-specific micro-benchmarks is posted below.
> 
> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
> 
> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
> 
> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
> 
> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
> 
> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
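
The speedup and uplift figures in the table above are two views of the same ratio: speedup is intrinsic throughput divided by baseline throughput, and uplift is that ratio minus one, expressed as a percentage. The sketch below recomputes both from the table's throughput numbers; the class and method names are hypothetical.

```java
import java.util.Locale;

final class Uplift {
    // Speedup is the ratio of intrinsic throughput to baseline throughput.
    static double speedup(double baselineOpsPerMs, double intrinsicOpsPerMs) {
        return intrinsicOpsPerMs / baselineOpsPerMs;
    }

    // Uplift is the relative gain over the baseline, rounded to a whole percent.
    static long upliftPercent(double baselineOpsPerMs, double intrinsicOpsPerMs) {
        return Math.round((speedup(baselineOpsPerMs, intrinsicOpsPerMs) - 1.0) * 100.0);
    }

    static String describe(double baselineOpsPerMs, double intrinsicOpsPerMs) {
        return String.format(Locale.ROOT, "%.2fx, %d%% uplift",
                speedup(baselineOpsPerMs, intrinsicOpsPerMs),
                upliftPercent(baselineOpsPerMs, intrinsicOpsPerMs));
    }

    public static void main(String[] args) {
        // Throughput values taken from the table in the PR description.
        System.out.println(describe(6568, 17678));     // prints 2.69x, 169% uplift
        System.out.println(describe(138932, 200897));  // prints 1.45x, 45% uplift
    }
}
```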

Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:

  Set address attributes in movapd assembly instruction function definition

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24470/files
  - new: https://git.openjdk.org/jdk/pull/24470/files/233e0188..c931222c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24470&range=04-05

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/24470.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24470/head:pull/24470

PR: https://git.openjdk.org/jdk/pull/24470

From duke at openjdk.org  Fri May 30 19:34:16 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Fri, 30 May 2025 19:34:16 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v5]
In-Reply-To: <hwyI3EUmwW6G2T7CyqbBuW4hP9bcSiQ5sZHbQSc3sGQ=.e40be297-f036-4616-a7fb-cd0b8b6895c5@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <GE6s8raLeTlcMDb0coe89QbMYSPqWtjjkr_puB8hcx8=.d4911eb3-3201-441b-81a9-b476ea5b64c9@github.com>
 <hwyI3EUmwW6G2T7CyqbBuW4hP9bcSiQ5sZHbQSc3sGQ=.e40be297-f036-4616-a7fb-cd0b8b6895c5@github.com>
Message-ID: <BEb4vhrs7UEZN4-31--4HokPd_5x4zwotTqOb0hu_B4=.b684bf8d-5a95-4899-8133-c09cad01ab7f@github.com>

On Fri, 30 May 2025 19:03:00 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Mohamed Issa has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - Add newline back to templateInterpreterGenerator_x86_64.cpp source file
>>  - Add special case values to cbrt micro-benchmark set
>
> src/hotspot/cpu/x86/assembler_x86.cpp line 2879:
> 
>> 2877:   emit_operand(dst, src, 0);
>> 2878: }
>> 2879: 
> 
> One more change is needed. We need to set the address attributes here, as movapd takes an Address as one of its inputs:
> `attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);`
> This should be done before the call to `simd_prefix`.

I added the change and re-ran the tests.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2116518295

From sviswanathan at openjdk.org  Fri May 30 21:22:54 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Fri, 30 May 2025 21:22:54 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6]
In-Reply-To: <sZVVm3lr1byp_brwwk80CWNovJIKZ4Mc_RvcRNIBBEI=.4f9e6ca0-108f-4e11-9e0c-f4b5bcf32482@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <sZVVm3lr1byp_brwwk80CWNovJIKZ4Mc_RvcRNIBBEI=.4f9e6ca0-108f-4e11-9e0c-f4b5bcf32482@github.com>
Message-ID: <tulSz8KsGI1NK8vsnhg53XpRIBZBBl33VI__BYsnC-4=.510a2ff6-a7d0-4f3f-a122-cbc1672ae52c@github.com>

On Fri, 30 May 2025 19:34:16 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Set address attributes in movapd assembly instruction function definition

Marked as reviewed by sviswanathan (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/24470#pullrequestreview-2882583269

From duke at openjdk.org  Fri May 30 21:27:59 2025
From: duke at openjdk.org (duke)
Date: Fri, 30 May 2025 21:27:59 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6]
In-Reply-To: <sZVVm3lr1byp_brwwk80CWNovJIKZ4Mc_RvcRNIBBEI=.4f9e6ca0-108f-4e11-9e0c-f4b5bcf32482@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <sZVVm3lr1byp_brwwk80CWNovJIKZ4Mc_RvcRNIBBEI=.4f9e6ca0-108f-4e11-9e0c-f4b5bcf32482@github.com>
Message-ID: <JYnyjMf8eQkmAfPaGvvGeKA7J1QUwoDoNSnU9y9T1IA=.8f58962a-793e-43c4-a5e3-e3310e914364@github.com>

On Fri, 30 May 2025 19:34:16 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
>> 
>> The command to run all range-specific micro-benchmarks is posted below.
>> 
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>> 
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>> 
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>> 
>> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
>> 
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Set address attributes in movapd assembly instruction function definition

@missa-prime 
Your change (at version c931222c7d40f296de14585d6c902552a1e66f5a) is now ready to be sponsored by a Committer.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2923531486

From duke at openjdk.org  Fri May 30 21:49:59 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Fri, 30 May 2025 21:49:59 GMT
Subject: Integrated: 8353686: Optimize Math.cbrt for x86 64 bit platforms
In-Reply-To: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
Message-ID: <vexMWrsY7erB5EEMYlBXXCZxtY5z98zurXgacL8oBlU=.e87fa776-9af5-4815-8643-576016565c83@github.com>

On Sun, 6 Apr 2025 03:48:22 GMT, Mohamed Issa <duke at openjdk.org> wrote:

> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
> 
> The command to run all range-specific micro-benchmarks is posted below.
> 
> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
> 
> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
> 
> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
> 
> | Input range(s)                                  | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
> | [-2^(-1022), 2^(-1022)]                   | 6568                                        | 17678                                      | 2.69x       |
> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932                                    | 200897                                    | 1.45x       |
> 
> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.

This pull request has now been integrated.

Changeset: 0df8c968
Author:    Mohamed Issa <mohamed.issa at intel.com>
Committer: Sandhya Viswanathan <sviswanathan at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/0df8c9684b8782ef830e2bd425217864c3f51784
Stats:     649 lines in 27 files changed: 637 ins; 1 del; 11 mod

8353686: Optimize Math.cbrt for x86 64 bit platforms

Reviewed-by: sviswanathan, sparasa, jbhateja

-------------

PR: https://git.openjdk.org/jdk/pull/24470

From epeter at openjdk.org  Sat May 31 11:02:00 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Sat, 31 May 2025 11:02:00 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]
In-Reply-To: <TKma1AJWRWYjH4SZ4N2CTukn1yXtu_wfACG0dTbTRFg=.e2e27f2c-e4cc-4bf8-9a04-2985dc007002@github.com>
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
 <72GCipLKeCWCG-4jsG5XhZKkTdVsWafEq_wA0oD-0mk=.c70af814-54fe-43d2-b7c9-72b845eb99d5@github.com>
 <kPx9NqOgNJPStCsPd4PtRhfbQeEXi5xbefvSjCHoPSY=.aa26a490-8f90-4e17-8c9e-cde0c25a9fbb@github.com>
 <TKma1AJWRWYjH4SZ4N2CTukn1yXtu_wfACG0dTbTRFg=.e2e27f2c-e4cc-4bf8-9a04-2985dc007002@github.com>
Message-ID: <jXKLVg4kNAire2Uydf62b7kCXKF9hLsRNfDtEQmvSdw=.09a866b9-57d2-423c-9d54-166da1ca1205@github.com>

On Thu, 29 May 2025 22:30:57 GMT, Mohamed Issa <duke at openjdk.org> wrote:

>> Patch looks good to me; some comments included.
>
> @jatin-bhateja Please let me know if there's anything else to address.

@missa-prime The patch looks reasonable. It would have been nice if we (from Oracle) could have tested it before integration, especially this close to RDP1 for JDK25. Just for next time. If there are issues with it now, you risk that it gets backed out, and you have to redo it, and it does not make it into JDK25.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2924944778