From xgong at openjdk.org Fri Sep 1 05:52:07 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Fri, 1 Sep 2023 05:52:07 GMT
Subject: [lworld+vector] RFR: Merge lworld [v2]
In-Reply-To: <_qHxzhU5ylvVi3fOGYTFKY0aIKF4i6tf0oxdrLPyL5E=.1cd2a4e5-255b-41e7-a27d-db3cdd91ed07@github.com>
References: <_qHxzhU5ylvVi3fOGYTFKY0aIKF4i6tf0oxdrLPyL5E=.1cd2a4e5-255b-41e7-a27d-db3cdd91ed07@github.com>
Message-ID:

On Thu, 31 Aug 2023 06:53:52 GMT, Xiaohong Gong wrote:

>> Xiaohong Gong has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>>
>> Merge lworld
>
>> > > Unfortunately, the regressions are still there even with the AArch64 changes. One of the logs is:
>
> test Int128VectorTests.GEInt128VectorTestsMasked(int[i], int[i], mask[i % 2]): failure
> java.lang.AssertionError: expected [true] but found [false]
> at org.testng.Assert.fail(Assert.java:99)
> at org.testng.Assert.failNotEquals(Assert.java:1037)
> at org.testng.Assert.assertEqualsImpl(Assert.java:140)
> at org.testng.Assert.assertEquals(Assert.java:122)
> at org.testng.Assert.assertEquals(Assert.java:819)
> at org.testng.Assert.assertEquals(Assert.java:829)
> at Int128VectorTests.GEInt128VectorTestsMasked(Int128VectorTests.java:4176)
>
> Anyway, this is not introduced by this merge, and I will revisit the regressions in future.
>
> > > Hi @XiaohongGong ,
> > > I verified that validation status is intact and other failures with additional -XX:+DeoptimizeALot option are also fixed with the merge.
> > > Best Regards, Jatin
> > >
> > > Thanks for the review and testing @jatin-bhateja ! Did you see any other regressions on Vector API tests with this option? I also ran the tests with `-XX:+DeoptimizeALot`, but some regressions that I met before are still there on NEON. I remembered that some x86 specific code was changed (i.e.
[d9f744f#diff-a47d36f3d2e8997bd3d2320ef348dd17b91f388d2cb6e91e9d99d66df5ccd897](https://github.com/openjdk/valhalla/commit/d9f744fe6e816a91aa7c8d5e84ce464e8a9d3921#diff-a47d36f3d2e8997bd3d2320ef348dd17b91f388d2cb6e91e9d99d66df5ccd897)), which is missing on AArch64 platform now. Seems this can also influence the result in interpreter, right? I will have a test by adding the similar code on AArch64 again.
> > Best Regards, Xiaohong
>
> Failures related to shuffle/mask in *LoadStoreTests.java are no longer seen with merge. Overall validation status improved when compared with pre-merge state. Do you see any new failures introduced with merge ?
>
> If not can we integrate this patch and then fix the outstanding regressions.
>
> Best Regards, Jatin

No new regressions are introduced by this merge. I will integrate this patch first. Thanks!

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/921#issuecomment-1702199013

From xgong at openjdk.org Fri Sep 1 05:52:08 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Fri, 1 Sep 2023 05:52:08 GMT
Subject: [lworld+vector] Integrated: Merge lworld
In-Reply-To:
References:
Message-ID:

On Wed, 30 Aug 2023 06:04:45 GMT, Xiaohong Gong wrote:

> This patch merges the latest `valhalla:lworld` to
> `valhalla:lworld+vector` branch, together with following
> main changes:
>
> 1. Resolve conflicts mainly caused by the BACKOUT of
> `VectorShuffle refactory` in jdk mainline [1].
> 2. Fix the class id issue for `SafePointScalarMergeNode` added by [JDK-8287061](https://bugs.openjdk.java.net/browse/JDK-8287061). This causes a JVM crash
> when building the jdk image. `lworld` branch also
> fixes it (see [2]), but it seems the id is not correctly
> ordered.
>
> [1] https://github.com/openjdk/jdk/pull/14629
>
> [2] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/opto/node.hpp#L738

This pull request has now been integrated.

Changeset: 0a40ef55
Author: Xiaohong Gong
URL: https://git.openjdk.org/valhalla/commit/0a40ef552f1c93f291b11f84d060daa28746d806
Stats: 238118 lines in 3830 files changed: 127006 ins; 91855 del; 19257 mod

Merge lworld

Reviewed-by: jbhateja

-------------

PR: https://git.openjdk.org/valhalla/pull/921

From jbhateja at openjdk.org Sun Sep 3 08:50:39 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sun, 3 Sep 2023 08:50:39 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v5]
In-Reply-To:
References:
Message-ID:

> Patch adds a new API _ciEnv::is_multifield_scalarized_, to scalarize multifield (ciField[s]) in case target vector cannot accommodate multifield bundle size, else it creates a hierarchical structure ciMultiField and expose entire multifield bundle as one field to C2 compiler.
>
> This cleans up special handling done in C2 compiler, ci field query APIs and object reconstruction handling at SafePoint.
>
> Please review and share your feedback.
>
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:

- Merge branch 'lworld+vector' of http://github.com/openjdk/valhalla into JDK-8314980
- Removing transiant flag.
- Restricting population of multifield bundle size and setting _is_multifield_base to only ciMultifield objects.
- Remove unused function declaration.
- 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation.

-------------

Changes: https://git.openjdk.org/valhalla/pull/918/files
Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=918&range=04
Stats: 147 lines in 9 files changed: 31 ins; 78 del; 38 mod
Patch: https://git.openjdk.org/valhalla/pull/918.diff
Fetch: git fetch https://git.openjdk.org/valhalla.git pull/918/head:pull/918

PR: https://git.openjdk.org/valhalla/pull/918

From jbhateja at openjdk.org Mon Sep 4 03:33:01 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Mon, 4 Sep 2023 03:33:01 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v2]
In-Reply-To: <5zDNhbIOYlAa6rP9FXFL4jWABiOF_LMAW0tlazJU614=.863f5b38-0ca2-41db-bf70-24ebc8dc690b@github.com>
References: <5zDNhbIOYlAa6rP9FXFL4jWABiOF_LMAW0tlazJU614=.863f5b38-0ca2-41db-bf70-24ebc8dc690b@github.com>
Message-ID: <4KDglG59YehIIgjFBYfviuvHSXuYpZAcGWNYULW9_P8=.77333d8d-b710-4ec9-8771-f850dc141b9c@github.com>

On Wed, 30 Aug 2023 01:24:47 GMT, Xiaohong Gong wrote:

>>> > Hi @jatin-bhateja, thanks for this refactoring! Code is much cleaner to me. So does it also need to clean the special handling for multifields in c1/interpreter (e.g. `deoptimize.cpp`) ? Thanks!
>>> > Besides, the `secondary_fields_count()` which is broadcasted to the `bundle_size()` of the `ciField` is calculated from `multifield_info` (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/fieldDescriptor.inline.hpp#L81), which I think may not be synced with each other. And it is used widely in C2 and deoptimization. Will it have any issues?
>>
>>
>> I do not think it should be a problem for de-optimization, we have separate handling for re-assignment from vector and scalars locations.
>>
>>> > An alternative is checking the vector size supported when calculating the `secondary_fields_count()` in `fieldDescriptor.inline.hpp`. Return the multifield count if the vector size is supported, and return `1` if not.
And then, we can use this information both at the ci stage when calculating the `nonstatic_fields` in `ciInstanceKlass.cpp` and C2 compiler. WDYT?
>>>
>>
>> I am not in favor of making changes in oop population, unless there is a pressing need, all we need is to furnish appropriate data to compilers through ci layer.
>>
>>> Hi @XiaohongGong , Thanks for reporting this, I was planning to revisit this in next cleanup patch, but it's good to club it with this one. Will update the PR with needed modifications.
>>
>> I revisited the implementation and captured the flow in following diagram
>>
>> ![image](https://github.com/openjdk/valhalla/assets/59989778/c12b7a7c-2f37-4eb3-8108-0ba61687273c)
>>
>> As discussed earlier, ClassFileParser creates separate FieldInfo structures for each synthetic multifield as it's needed for field layout computations. Interpreter directly operates over oop model structures, currently vectors are created by the C2 compiler only and it accesses oop model through compiler interface (ci).
>>
>> I think to remove any ambiguity, it's best to populate valid bundle_size and set is_multifield_base flag only if target vector size is able to accommodate multifield payload.
>
>> I think to remove any ambiguity, it's best to populate valid bundle_size and set is_multifield_base flag only if target vector size is able to accommodate multifield payload.
>
> Sounds reasonable to me. Thanks!

Hi @XiaohongGong , Let me know if there are any other comments.
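
The rule agreed on in this exchange (populate a valid `bundle_size` and set `is_multifield_base` only when the target vector can hold the multifield payload) can be sketched as below. This is a minimal illustration, not the patch's code: `FieldSummary`, `fits_in_vector`, `summarize_multifield` and the byte-size comparison are hypothetical stand-ins, and only the two field names come from the discussion.

```cpp
#include <cstddef>

// Hypothetical stand-in for the per-field data the ci layer hands to a compiler.
struct FieldSummary {
  int  bundle_size;         // lanes exposed as one field (1 means a plain scalar field)
  bool is_multifield_base;  // true only when the whole bundle is exposed as one vector field
};

// Assumed rule: a bundle fits when it has more than one lane and its payload
// does not exceed the widest supported vector, measured in bytes.
constexpr bool fits_in_vector(std::size_t elem_bytes, int lane_count,
                              std::size_t max_vector_bytes) {
  return lane_count > 1 &&
         elem_bytes * static_cast<std::size_t>(lane_count) <= max_vector_bytes;
}

// Populate bundle_size and the base flag only when the bundle fits;
// otherwise report an ordinary scalar field.
constexpr FieldSummary summarize_multifield(std::size_t elem_bytes, int lane_count,
                                            std::size_t max_vector_bytes) {
  return fits_in_vector(elem_bytes, lane_count, max_vector_bytes)
             ? FieldSummary{lane_count, true}
             : FieldSummary{1, false};
}
```

With a 16-byte vector limit, four 4-byte lanes are reported as one bundle of size 4, while four 8-byte lanes fall back to a scalar field with `bundle_size` 1.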

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/918#issuecomment-1704559333

From xgong at openjdk.org Mon Sep 4 04:10:59 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Mon, 4 Sep 2023 04:10:59 GMT
Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well
In-Reply-To:
References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com>
 <13Toexb0NM-WYizAXfLLZiAhvDNMLjeySOLID5JyH6k=.90081b8f-fc73-4f1d-9d27-a0c5200da8d9@github.com>
Message-ID: <9XJyUabboM3VawgevxzSiCJX5EkPwTbbw2VYutannLo=.1a98337d-6e2f-48c4-a8e7-527a46e8bd5a@github.com>

On Wed, 23 Aug 2023 07:39:24 GMT, Tobias Hartmann wrote:

>>> > > Are you inclined to re-instantiate the old behavior ?
>>> >
>>> >
>>> > May I ask what the "old behavior" do you mean? Thanks!
>>>
>>> Hi @XiaohongGong , Updated my previous comment, I am able to reproduce an intermittent crash in another test "compiler/valhalla/inlinetypes/TestGetfieldChains.java" with additional JVM options : "-Xbatch -XX:TieredStopAtLevel=1 -XX:CompileThresholdScaling=0.1 -XX:InlineFieldMaxFlatSize=0" [hs_err_pid1570004.txt](https://github.com/openjdk/valhalla/files/12385038/hs_err_pid1570004.txt) [replay_pid1570004.txt](https://github.com/openjdk/valhalla/files/12385039/replay_pid1570004.txt) [TestGetfieldChains.txt](https://github.com/openjdk/valhalla/files/12385040/TestGetfieldChains.txt)
>>>
>>> Please find attached relevant logs and replay file.
>>>
>>> Best Regards, Jatin
>>
>> Thanks for pointing out this! I can reproduce this issue after I rebase my patch to latest `lworld` branch. I will try my best to look at what is changed.
>>
>> Best Regards,
>> Xiaohong
>
> Hi @XiaohongGong,
>
>> I think there may be some potential issues exposed after this change, since almost all value objects cannot be flattened anymore with -XX:InlineFieldMaxFlatSize=0, which may need the buffer allocated.
>
> Yes, I think it's likely that this change reveals some existing issues. I can help with debugging but only later next month.
>
>> Could you please show more env/options information on this issue? I cannot reproduce it even with -XX:InlineFieldMaxFlatSize=0 -XX:TieredStopAtLevel=1 on our internal Arm NEON machine.
>
> It fails quite reliably in our testing on x86_64 with one of the following flag combinations:
> - `-Xcomp -XX:TieredStopAtLevel=1 -DIgnoreCompilerControls=true`
> - `-Xcomp -XX:-TieredCompilation -DIgnoreCompilerControls=true`
> - `-DWarmup=0 -DVerifyIR=false`
> - `-DWarmup=0 -XX:TieredStopAtLevel=1`
>
> But you can reproduce this now, right?
>
>> Actually I don't quite understand what this assertion means? Why the null_free value type must be allocated in this routine?
>
> The `is_allocated` assertion checks that the InlineTypeNode should always have a valid oop to a heap buffer now that we just loaded it from an oop value. That's not guaranteed if the field/argument can be NULL though, that's why the assert checks for `!null_free`. Does that make sense?

Hi @TobiHartmann,

# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/workspace/open/src/hotspot/share/c1/c1_LIRGenerator.cpp:1778), pid=336001, tid=336015
# assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be
#
# JRE version: Java(TM) SE Runtime Environment (22.0) (fastdebug build 22-lworld4ea-2023-08-16-1408411.tobias.hartmann.valhalla2)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 22-lworld4ea-2023-08-16-1408411.tobias.hartmann.valhalla2, compiled mode, emulated-client, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x7dc168] LIRGenerator::access_sub_element(LIRItem&, LIRItem&, LIR_Opr&, ciField*, int)+0x808

Current CompileTask:
C1: 7349 3404 b 1 compiler.valhalla.inlinetypes.TestC1::test7_verifier (47 bytes)

Recently I spent some time looking at this crash when the field is not flattened. The basic conclusion is that I did not find that the field's flattened status can influence the klass's initialization. On further investigation of this case, I found that the code path in the C1 compiler differs depending on whether the field is flattened.

Here is my understanding of the process:

The relevant ops are a flattened array element load followed by a field load from that array element. The array element and the field of the element are both of primitive class type. To optimize the whole process, C1 compiler merges the two access ops into one with a delayed field access. This saves the object re-materialization from the flattened form during array access.

When the field is not flattened, it goes to method `LIRGenerator::access_sub_element()`. And the assertion is added after the primitive field is loaded. C1 compiler will set the value to the primitive class's default value if the loaded field is null. And the default value requires the corresponding primitive klass to be initialized.

And when the field is flattened, it goes to method `LIRGenerator::access_flattened_array()`. Before it, an oop buffer is allocated, and in this method, it directly fills the field information into the allocated oop buffer. The whole process doesn't need the klass to be fully initialized since the default value is not used.

To prove my assumption that the field's flattened status will not influence the klass's initialization, I added the same assertion in this method (after https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/c1/c1_LIRGenerator.cpp#L1804). And it can fail sometimes as well.

As a summary, I think we'd better check the klass's initialization state before the whole process in C1 if it is necessary, just like:

diff --git a/src/hotspot/share/c1/c1_GraphBuilder.cpp b/src/hotspot/share/c1/c1_GraphBuilder.cpp
index 78cfd1796..35e408f34 100644
--- a/src/hotspot/share/c1/c1_GraphBuilder.cpp
+++ b/src/hotspot/share/c1/c1_GraphBuilder.cpp
@@ -1123,7 +1123,7 @@ void GraphBuilder::load_indexed(BasicType type) {
     if (s.cur_bc() == Bytecodes::_getfield) {
       bool will_link;
       ciField* next_field = s.get_field(will_link);
-      bool next_needs_patching = !next_field->holder()->is_loaded() ||
+      bool next_needs_patching = !next_field->holder()->is_initialized() ||
                                  !next_field->will_link(method(), Bytecodes::_getfield) ||
                                  PatchALot;
       can_delay_access = C1UseDelayedFlattenedFieldReads && !next_needs_patching;

Any better idea for fixing this issue? Please correct me if I have misunderstood anything! Thanks a lot!

Best Regards,
Xiaohong

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1704581212

From xgong at openjdk.org Mon Sep 4 06:12:03 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Mon, 4 Sep 2023 06:12:03 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation.
 [v5]
In-Reply-To:
References:
Message-ID:

On Mon, 28 Aug 2023 06:12:20 GMT, Jatin Bhateja wrote:

>> src/hotspot/share/ci/ciEnv.cpp line 1782:
>>
>>> 1780: return InlineTypeNode::is_multifield_scalarized(bt, vec_length);
>>> 1781: #else
>>> 1782: return false;
>>
>> 1. I think it's better to implement `is_multifield_scalarized()` here, and reference it in C2 compiler. It's opposite here.
>> 2. By default, it should return `true` instead of `false` if `COMPILER2` is not supported.
>
> Agree with 2, thanks, will correct it! Idea here is that ci should leverage compiler exposed interfaces to determine scalarizability.

What is the expected return value of `is_multifield_scalarized()` for C1 compiler?

-------------

PR Review Comment: https://git.openjdk.org/valhalla/pull/918#discussion_r1314473670

From xgong at openjdk.org Mon Sep 4 06:26:07 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Mon, 4 Sep 2023 06:26:07 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v2]
In-Reply-To: <4KDglG59YehIIgjFBYfviuvHSXuYpZAcGWNYULW9_P8=.77333d8d-b710-4ec9-8771-f850dc141b9c@github.com>
References: <5zDNhbIOYlAa6rP9FXFL4jWABiOF_LMAW0tlazJU614=.863f5b38-0ca2-41db-bf70-24ebc8dc690b@github.com>
 <4KDglG59YehIIgjFBYfviuvHSXuYpZAcGWNYULW9_P8=.77333d8d-b710-4ec9-8771-f850dc141b9c@github.com>
Message-ID: <-VeAqgQdV1us9VIRi47_WjIFaYofSfLxVdwnTcp9O78=.3a06fd97-2608-44b2-b28b-99d162964d67@github.com>

On Mon, 4 Sep 2023 03:29:48 GMT, Jatin Bhateja wrote:

>>> I think to remove any ambiguity, its best to populate valid bundle_size and set is_multifield_base flag only if target vector size is able to accommodate multifield payload.
>>
>> Sounds reasonable to me. Thanks!
>
> Hi @XiaohongGong , Let me know if there are any other comments.

> > > Hi @jatin-bhateja, thanks for this refactoring! Code is much cleaner to me. So does it also need to clean the special handling for multifields in c1/interpreter (e.g.
`deoptimize.cpp`) ? Thanks!
> > > Besides, the `secondary_fields_count()` which is broadcasted to the `bundle_size()` of the `ciField` is calculated from `multifield_info` (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/fieldDescriptor.inline.hpp#L81), which I think may not be synced with each other. And it is used widely in C2 and deoptimization. Will it have any issues?
>
> I do not think it should be a problem for de-optimization, we have separate handling for re-assignment from vector and scalars locations.

What I'm worried about is whether we can remove the `is_multifield` filter and `secondary_fields_count` check in deoptimization. Since I assumed the `secondary_fields_count` is meaningful only when the multifields are vectorized now, otherwise, the normal multifields have been added into the klass's nonstatic_fields, right? If so, checking only whether the field is a vector is enough for me.

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/918#issuecomment-1704680639

From jbhateja at openjdk.org Mon Sep 4 08:25:03 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Mon, 4 Sep 2023 08:25:03 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v2]
In-Reply-To: <4KDglG59YehIIgjFBYfviuvHSXuYpZAcGWNYULW9_P8=.77333d8d-b710-4ec9-8771-f850dc141b9c@github.com>
References: <5zDNhbIOYlAa6rP9FXFL4jWABiOF_LMAW0tlazJU614=.863f5b38-0ca2-41db-bf70-24ebc8dc690b@github.com>
 <4KDglG59YehIIgjFBYfviuvHSXuYpZAcGWNYULW9_P8=.77333d8d-b710-4ec9-8771-f850dc141b9c@github.com>
Message-ID: <4RohhdXd4jdG3j46h9KRDZ2sVCnaRjewyJKfpJ7wtjU=.cd0b5b57-98f6-4d6f-a273-64ee50d4caa2@github.com>

On Mon, 4 Sep 2023 03:29:48 GMT, Jatin Bhateja wrote:

>>> I think to remove any ambiguity, its best to populate valid bundle_size and set is_multifield_base flag only if target vector size is able to accommodate multifield payload.
>>
>> Sounds reasonable to me. Thanks!
>
> Hi @XiaohongGong , Let me know if there are any other comments.

> > > > Hi @jatin-bhateja, thanks for this refactoring! Code is much cleaner to me. So does it also need to clean the special handling for multifields in c1/interpreter (e.g. `deoptimize.cpp`) ? Thanks!
> > > > Besides, the `secondary_fields_count()` which is broadcasted to the `bundle_size()` of the `ciField` is calculated from `multifield_info` (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/fieldDescriptor.inline.hpp#L81), which I think may not be synced with each other. And it is used widely in C2 and deoptimization. Will it have any issues?
> > >
> > > I do not think it should be a problem for de-optimization, we have separate handling for re-assignment from vector and scalars locations.
>
> What I'm worried is whether we can remove the `is_multifield` filter and `secondary_fields_count` check in deoptimization. Since I assumed the `secondary_fields_count` is meaningful only when the multifields are vectorized now, otherwise, the normal multifields have been added into the klass's nonstatic_fields, right? If so, only check whether the field is vector is enough for me.

Your assumption still holds good, secondary_fields_count is usable only if multifield is held in a vector, but we need secondary_fields_count to estimate the vector length unless VectorSupport::klass2length is re-introduced.

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/918#issuecomment-1704824175

From jbhateja at openjdk.org Mon Sep 4 08:25:03 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Mon, 4 Sep 2023 08:25:03 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v5]
In-Reply-To:
References:
Message-ID:

On Mon, 4 Sep 2023 06:09:38 GMT, Xiaohong Gong wrote:

> What is the expected return value of `is_multifield_scalarized()` for C1 compiler?

C2 is the only runtime component which vectorizes multifield bundle and lower tier compiler should return a true value.

-------------

PR Review Comment: https://git.openjdk.org/valhalla/pull/918#discussion_r1314596658

From xgong at openjdk.org Mon Sep 4 08:48:11 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Mon, 4 Sep 2023 08:48:11 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v2]
In-Reply-To: <4RohhdXd4jdG3j46h9KRDZ2sVCnaRjewyJKfpJ7wtjU=.cd0b5b57-98f6-4d6f-a273-64ee50d4caa2@github.com>
References: <5zDNhbIOYlAa6rP9FXFL4jWABiOF_LMAW0tlazJU614=.863f5b38-0ca2-41db-bf70-24ebc8dc690b@github.com>
 <4KDglG59YehIIgjFBYfviuvHSXuYpZAcGWNYULW9_P8=.77333d8d-b710-4ec9-8771-f850dc141b9c@github.com>
 <4RohhdXd4jdG3j46h9KRDZ2sVCnaRjewyJKfpJ7wtjU=.cd0b5b57-98f6-4d6f-a273-64ee50d4caa2@github.com>
Message-ID:

On Mon, 4 Sep 2023 08:20:25 GMT, Jatin Bhateja wrote:

> > > > > Hi @jatin-bhateja, thanks for this refactoring! Code is much cleaner to me. So does it also need to clean the special handling for multifields in c1/interpreter (e.g. `deoptimize.cpp`) ? Thanks!
> > > > > Besides, the `secondary_fields_count()` which is broadcasted to the `bundle_size()` of the `ciField` is calculated from `multifield_info` (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/fieldDescriptor.inline.hpp#L81), which I think may not be synced with each other. And it is used widely in C2 and deoptimization. Will it have any issues?
> > > >
> > > >
> > > > I do not think it should be a problem for de-optimization, we have separate handling for re-assignment from vector and scalars locations.
> > >
> > > What I'm worried is whether we can remove the `is_multifield` filter and `secondary_fields_count` check in deoptimization.
Since I assumed the `secondary_fields_count` is meaningful only when the multifields are vectorized now, otherwise, the normal multifields have been added into the klass's nonstatic_fields, right? If so, only check whether the field is vector is enough for me.
>
> Your assumption still holds good, secondary_fields_count is usable only if multifield is held in a vector, but we need secondary_fields_count to estimate the vector length unless VectorSupport::klass2length is re-introduced. I agree we can do some more cleanup here.

Yes, I agree it is needed. I mean the filter for multifield (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/deoptimization.cpp#L1567) and the scalar field reassign loop with `secondary_fields_count` as the loop limit. Can these two parts be cleaned like before?

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/918#issuecomment-1704860245

From xgong at openjdk.org Mon Sep 4 08:58:09 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Mon, 4 Sep 2023 08:58:09 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v5]
In-Reply-To:
References:
Message-ID:

On Mon, 4 Sep 2023 08:22:32 GMT, Jatin Bhateja wrote:

>> What is the expected return value of `is_multifield_scalarized()` for C1 compiler?
>
>> What is the expected return value of `is_multifield_scalarized()` for C1 compiler?
>
> C2 is the only runtime component which vectorizes multifield bundle and lower tier compiler should return a true value.

Yes, correct. Per my understanding `#if COMPILER2` is a compile flag which is true if C2 is not disabled. Hence this code is just like directly `return InlineTypeNode::is_multifield_scalarized(bt, vec_length);`, right? And then it returns the same value for interpreter/c1/c2.
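
The difference between a build-time guard and a per-compiler answer, which is the concern raised here, can be shown with a minimal sketch. A build-time `#if` only records whether a compiler was built into the VM, so every caller gets the same result, while a runtime argument lets lower tiers always see the scalarized view. All names below (`HAS_TOP_TIER`, `Tier`, `vector_fits`, and both functions) are hypothetical and do not correspond to HotSpot's actual macros or APIs.

```cpp
#include <cstddef>

// Hypothetical build flag standing in for "the vectorizing compiler is built in".
#define HAS_TOP_TIER 1

// Hypothetical identifiers for the execution tiers discussed in the thread.
enum class Tier { Interpreter, C1, C2 };

// Assumed capacity check: the bundle payload must not exceed the vector width in bytes.
constexpr bool vector_fits(std::size_t elem_bytes, int lanes, std::size_t max_vector_bytes) {
  return elem_bytes * static_cast<std::size_t>(lanes) <= max_vector_bytes;
}

// Build-time variant: the answer cannot depend on who is asking.
constexpr bool is_scalarized_build_time(std::size_t elem_bytes, int lanes,
                                        std::size_t max_vector_bytes) {
#if HAS_TOP_TIER
  return !vector_fits(elem_bytes, lanes, max_vector_bytes);
#else
  return true;
#endif
}

// Runtime variant: lower tiers always scalarize; only the top tier
// consults the vector capacity.
constexpr bool is_scalarized_for(Tier tier, std::size_t elem_bytes, int lanes,
                                 std::size_t max_vector_bytes) {
  return tier != Tier::C2 || !vector_fits(elem_bytes, lanes, max_vector_bytes);
}
```

For four 4-byte lanes and a 16-byte vector, the build-time variant reports "not scalarized" to every caller, whereas the runtime variant reports "scalarized" for `Tier::C1` and "not scalarized" for `Tier::C2`.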

-------------

PR Review Comment: https://git.openjdk.org/valhalla/pull/918#discussion_r1314638990

From xgong at openjdk.org Mon Sep 4 09:06:03 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Mon, 4 Sep 2023 09:06:03 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v5]
In-Reply-To:
References:
Message-ID:

On Mon, 4 Sep 2023 08:55:28 GMT, Xiaohong Gong wrote:

>>> What is the expected return value of `is_multifield_scalarized()` for C1 compiler?
>>
>> C2 is the only runtime component which vectorizes multifield bundle and lower tier compiler should return a true value.
>
> Yes, correct. Per my understanding `#if COMPILER2` is a compile flag which is true if C2 is not disabled. Hence this code is just like directly `return InlineTypeNode::is_multifield_scalarized(bt, vec_length);`, right? And then it returns the same value for interpreter/c1/c2.

I think what we need is adding an interface here, and implement it in different compilers, or a runtime check like https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/compiler/compilerDefinitions.inline.hpp#L49. WDYT?

-------------

PR Review Comment: https://git.openjdk.org/valhalla/pull/918#discussion_r1314648307

From thartmann at openjdk.org Mon Sep 4 09:18:15 2023
From: thartmann at openjdk.org (Tobias Hartmann)
Date: Mon, 4 Sep 2023 09:18:15 GMT
Subject: [lworld] RFR: 8315412 [lworld] Preparing code for lw5 [v2]
In-Reply-To: <7_ummO2jCYXdyaPqO1REaEs1QUWRg71k-IDkbfOSi2w=.dc050187-062b-4725-a9a2-7d029ed7c855@github.com>
References: <7_ummO2jCYXdyaPqO1REaEs1QUWRg71k-IDkbfOSi2w=.dc050187-062b-4725-a9a2-7d029ed7c855@github.com>
Message-ID:

On Thu, 31 Aug 2023 19:46:01 GMT, Frederic Parain wrote:

>> Quite a big patch, but it is mostly made of renaming and mechanical replacement without really changing the logic of the code.
>>
>> The patch includes changes to make the VM code less dependent on Q-descriptors by encoding the presence of null-free inline types fields in FieldInfo.
>>
>> The patch also includes a lot of renaming, in an effort to have uniformed naming of flat fields and flat arrays across the VM. The renaming has not been applied to the heap dumper and C2. They will be addressed in follow up patches.
>>
>> Tested with Mach5, tiers 1 to 3.
>>
>> Thank you
>>
>> Fred
>
> Frederic Parain has updated the pull request incrementally with one additional commit since the last revision:
>
> Addresing Lois' comments

Nice cleanup! I only found some indentation issues.

src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 289:

> 287:
> 288: void InterpreterMacroAssembler::read_flat_field(Register holder_klass,
> 289: Register field_index, Register field_offset,

Indentation needs to be fixed.

src/hotspot/cpu/aarch64/interp_masm_aarch64.hpp line 158:

> 156: // - assumes holder_klass and valueKlass field klass have both been resolved
> 157: void read_flat_field(Register holder_klass,
> 158: Register field_index, Register field_offset,

Indentation needs to be fixed.

src/hotspot/cpu/aarch64/interp_masm_aarch64.hpp line 167:

> 165: void read_flat_element(Register array, Register index,
> 166: Register t1, Register t2,
> 167: Register obj = r0);

Indentation needs to be fixed.

src/hotspot/cpu/x86/interp_masm_x86.cpp line 1250:

> 1248:
> 1249: void InterpreterMacroAssembler::read_flat_field(Register holder_klass,
> 1250: Register field_index, Register field_offset,

Indentation needs to be fixed.

src/hotspot/cpu/x86/interp_masm_x86.cpp line 1297:

> 1295: void InterpreterMacroAssembler::read_flat_element(Register array, Register index,
> 1296: Register t1, Register t2,
> 1297: Register obj) {

Indentation needs to be fixed.

src/hotspot/cpu/x86/interp_masm_x86.hpp line 244:

> 242: // - 32 bits: kills rdi and rsi
> 243: void read_flat_field(Register holder_klass,
> 244: Register field_index, Register field_offset,

Indentation needs to be fixed.

src/hotspot/cpu/x86/interp_masm_x86.hpp line 253:

> 251: // - 32 bits: kills rdi and rsi
> 252: void read_flat_element(Register array, Register index,
> 253: Register t1, Register t2,

Indentation needs to be fixed.

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 2944:

> 2942:
> 2943: void MacroAssembler::test_flat_array_oop(Register oop, Register temp_reg,
> 2944: Label& is_flat_array) {

Indentation needs to be fixed.

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 2955:

> 2953:
> 2954: void MacroAssembler::test_non_flat_array_oop(Register oop, Register temp_reg,
> 2955: Label& is_non_flat_array) {

Indentation needs to be fixed.

src/hotspot/share/c1/c1_LIRGenerator.cpp line 2360:

> 2358:
> 2359: access_flat_array(true, array, index, obj_item,
> 2360: x->delayed() == nullptr ? 0 : x->delayed()->field(),

Indentation needs to be fixed.

src/hotspot/share/c1/c1_Runtime1.hpp line 58:

> 56: stub(new_multi_array) \
> 57: stub(load_flat_array) \
> 58: stub(store_flat_array) \

Indentation needs to be fixed.

src/hotspot/share/ci/ciInstanceKlass.cpp line 811:

> 809: InstanceKlass* holder = fd->field_holder();
> 810: InstanceKlass* k = SystemDictionary::find_instance_klass(THREAD, name,
> 811: Handle(THREAD, holder->class_loader()),

Indentation needs to be fixed.

src/hotspot/share/interpreter/interpreterRuntime.cpp line 366:

> 364: if (cpe->is_null_free_inline_type()) {
> 365: if (!cpe->is_flat()) {
> 366: if (ref_h() == nullptr) {

Indentation needs to be fixed.

-------------

Marked as reviewed by thartmann (Committer).

PR Review: https://git.openjdk.org/valhalla/pull/922#pullrequestreview-1609095614
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314651684
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314651801
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314651906
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314652538
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314652601
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314651263
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314651362
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314650915
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314651024
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314654712
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314655307
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314656771
PR Review Comment: https://git.openjdk.org/valhalla/pull/922#discussion_r1314658667

From thartmann at openjdk.org Mon Sep 4 09:27:02 2023
From: thartmann at openjdk.org (Tobias Hartmann)
Date: Mon, 4 Sep 2023 09:27:02 GMT
Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well
In-Reply-To: <9XJyUabboM3VawgevxzSiCJX5EkPwTbbw2VYutannLo=.1a98337d-6e2f-48c4-a8e7-527a46e8bd5a@github.com>
References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com>
 <13Toexb0NM-WYizAXfLLZiAhvDNMLjeySOLID5JyH6k=.90081b8f-fc73-4f1d-9d27-a0c5200da8d9@github.com>
 <9XJyUabboM3VawgevxzSiCJX5EkPwTbbw2VYutannLo=.1a98337d-6e2f-48c4-a8e7-527a46e8bd5a@github.com>
Message-ID:

On Mon, 4 Sep 2023 04:08:20 GMT, Xiaohong Gong wrote:

>> Hi @XiaohongGong,
>>
>>> I think there maybe some potential issues exposed after this change, since almost all value
objects cannot be flattened anymore with -XX:InlineFieldMaxFlatSize=0, which may need the buffer allocated.
>>
>> Yes, I think it's likely that this change reveals some existing issues. I can help with debugging but only later next month.
>>
>>> Could you please show more env/options information on this issue? I cannot reproduce it even with -XX:InlineFieldMaxFlatSize=0 -XX:TieredStopAtLevel=1 on our internal Arm NEON machine.
>>
>> It fails quite reliably in our testing on x86_64 with one of the following flag combinations:
>> - `-Xcomp -XX:TieredStopAtLevel=1 -DIgnoreCompilerControls=true`
>> - `-Xcomp -XX:-TieredCompilation -DIgnoreCompilerControls=true`
>> - `-DWarmup=0 -DVerifyIR=false`
>> - `-DWarmup=0 -XX:TieredStopAtLevel=1`
>>
>> But you can reproduce this now, right?
>>
>>> Actually, I don't quite understand what this assertion means. Why must the null_free value type be allocated in this routine?
>>
>> The `is_allocated` assertion checks that the InlineTypeNode always has a valid oop to a heap buffer, now that we just loaded it from an oop value. That's not guaranteed if the field/argument can be NULL though, which is why the assert checks for `!null_free`. Does that make sense?
>
> Hi @TobiHartmann,
>
>
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error (/workspace/open/src/hotspot/share/c1/c1_LIRGenerator.cpp:1778), pid=336001, tid=336015
> # assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be
> #
> # JRE version: Java(TM) SE Runtime Environment (22.0) (fastdebug build 22-lworld4ea-2023-08-16-1408411.tobias.hartmann.valhalla2)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 22-lworld4ea-2023-08-16-1408411.tobias.hartmann.valhalla2, compiled mode, emulated-client, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
> # Problematic frame:
> # V [libjvm.so+0x7dc168] LIRGenerator::access_sub_element(LIRItem&, LIRItem&, LIR_Opr&, ciField*, int)+0x808
>
> Current CompileTask:
> C1: 7349 3404 b 1 compiler.valhalla.inlinetypes.TestC1::test7_verifier (47 bytes)
>
>
> Recently I spent some time looking at this crash, which occurs when the field is not flattened. The basic conclusion is that I found no evidence that the field's flattening status influences the klass's initialization. On further investigation, I found that the C1 compiler takes a different code path depending on whether the field is flattened.
>
> Here is my understanding of the process:
>
> The relevant ops are a load of a flattened array element followed by a field load from that element. The array element and the field of the element are both of primitive class type. To optimize the whole process, the C1 compiler merges the two access ops into one with a delayed field access. This avoids re-materializing the object from its flattened form during the array access.
>
> When the field is not flattened, it goes to method `LIRGenerator::access_sub_element()`, and the assertion is placed after the primitive field is loaded. The C1 compiler sets the value to the primitive class's default value if the loaded field is null, and the default value requires that the corresponding primitive klass be initialized.
> > And when the field is flattened, it goes to method `LIRGenerator::access_flattened_array()`. Before it, an oop buffer is allocated, and in this method, it directly fills the fields information to the allocated oop buffer. The whole process doesn't need the klass's fully initialized since the default value is not used. To prove my assumption that field's flattened status will not influence the klass's initialization, I added the same assertion in this method (after https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/... Great analysis @XiaohongGong! The fix looks good to me. ------------- PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1704920613 From jbhateja at openjdk.org Mon Sep 4 09:40:03 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 4 Sep 2023 09:40:03 GMT Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v5] In-Reply-To: References: Message-ID: On Mon, 4 Sep 2023 09:03:37 GMT, Xiaohong Gong wrote: >> Yes, correct. Per my understanding `#if COMPILER2` is a compile flag which is true if C2 is not disabled. Hence this code is just like directly `return InlineTypeNode::is_multifield_scalarized(bt, vec_length);`, right? And then it returns the same value for interpreter/c1/c2. > > I think what we need is adding an interface here, and implement it in different compilers, or a runtime check like https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/compiler/compilerDefinitions.inline.hpp#L49. WDYT? > Yes, correct. Per my understanding `#if COMPILER2` is a compile flag which is true if C2 is not disabled. Hence this code is just like directly `return InlineTypeNode::is_multifield_scalarized(bt, vec_length);`, right? And then it returns the same value for interpreter/c1/c2. My bad, I completely overlooked it, you are correct. 
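The distinction discussed above, a build-time `#if COMPILER2` guard versus a runtime compiler check, can be sketched roughly as follows. This is only an illustration under stated assumptions: `Engine`, `kBuiltWithC2`, `fits_in_vector` and both `is_scalarized_*` functions are hypothetical names invented for this sketch, not HotSpot APIs, and the byte sizes are arbitrary.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; none of these are HotSpot names.
enum class Engine { Interpreter, C1, C2 };

// Models a build-time macro such as COMPILER2: fixed once the VM is built.
const bool kBuiltWithC2 = true;

// Hypothetical target query: does the multifield bundle fit in one vector?
bool fits_in_vector(int bundle_bytes, int max_vector_bytes) {
  return bundle_bytes <= max_vector_bytes;
}

// Build-time guard: the result depends only on how the VM was built,
// so the interpreter, C1 and C2 all observe the same answer.
bool is_scalarized_build_time(int bundle_bytes, int max_vector_bytes) {
  if (!kBuiltWithC2) {
    return true;
  }
  return !fits_in_vector(bundle_bytes, max_vector_bytes);
}

// Runtime guard: the caller states which engine is asking, so only C2
// can ever be handed a non-scalarized (vector-backed) bundle.
bool is_scalarized_runtime(Engine engine, int bundle_bytes, int max_vector_bytes) {
  if (engine != Engine::C2) {
    return true;
  }
  return !fits_in_vector(bundle_bytes, max_vector_bytes);
}

// Usage sketch: a 16-byte bundle on a target with 32-byte vectors.
void demo() {
  assert(!is_scalarized_build_time(16, 32));           // same for every engine
  assert(is_scalarized_runtime(Engine::C1, 16, 32));   // C1 always sees scalars
  assert(!is_scalarized_runtime(Engine::C2, 16, 32));  // C2 may use a vector
}
```

In this sketch the build-time variant cannot tell C1 apart from C2 in a tiered build, which seems to be the concern raised in the review.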
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/918#discussion_r1314692361 From thartmann at openjdk.org Mon Sep 4 11:05:07 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 4 Sep 2023 11:05:07 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v2] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Tue, 25 Jul 2023 09:56:20 GMT, Xiaohong Gong wrote: >> Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond >> the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field >> can be flattened or not. >> >> Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds >> the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. >> >> The original condition [1] matches the atomicity check but not the flat size limitation. Promoting the flat size check before all other checks matches the flattening policy and can make the VM option `InlineFieldMaxFlatSize` work for final fields as well. >> >> This patch also fixed the jtreg crashes involved after the field flattening condition is changed. Those tests fail with setting >> `-XX:+InlineFieldMaxFlatSize=0` by default. 
The main issue is the non-flattened inline type field is not buffered which is expected to be. The root cause is when parsing `withfield`, the compiler checks whether the field is primitive class type while not its flattening status. Changing to check the flattening status instead can fix the crash.
>>
>> [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759
>> [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html
>> [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html
>
> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:
>
> Not flatten L-descriptor value object field even it is final

I also had a look at the `inline type should be allocated` assert failure. The problem is that the type of a null-free, not-flat field of a primitive class that is passed in scalarized form is not marked as null-free. Here's the fix:

diff --git a/src/hotspot/share/opto/type.cpp b/src/hotspot/share/opto/type.cpp
index 2181a12be42..8f312be063a 100644
--- a/src/hotspot/share/opto/type.cpp
+++ b/src/hotspot/share/opto/type.cpp
@@ -2187,6 +2187,9 @@ static void collect_inline_fields(ciInlineKlass* vk, const Type** field_array, u
     ciField* field = vk->nonstatic_field_at(j);
     BasicType bt = field->type()->basic_type();
     const Type* ft = Type::get_const_type(field->type());
+    if (field->is_null_free()) {
+      ft = ft->join_speculative(TypePtr::NOTNULL);
+    }
     field_array[pos++] = ft;
     if (type2size[bt] == 2) {
       field_array[pos++] = Type::HALF;

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1705069839

From jbhateja at openjdk.org Mon Sep 4 11:42:26 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Mon, 4 Sep 2023 11:42:26 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation.
[v6] In-Reply-To: References: Message-ID: > Patch adds a new API _ciEnv::is_multifield_scalarized_, to scalarize multifield (ciField[s]) in case target vector cannot accommodate multifield bundle size, else it creates a hierarchical structure ciMultiField and expose entire multifield bundle as one field to C2 compiler. > > This cleans up special handling done in C2 compiler, ci field query APIs and object reconstruction handling at SafePoint. > > Please review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Rebasing ci::is_multifiled_scalarized() to use dynamic compiler type check. ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/918/files - new: https://git.openjdk.org/valhalla/pull/918/files/70b39321..bcb1c585 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=918&range=05 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=918&range=04-05 Stats: 6 lines in 1 file changed: 1 ins; 0 del; 5 mod Patch: https://git.openjdk.org/valhalla/pull/918.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/918/head:pull/918 PR: https://git.openjdk.org/valhalla/pull/918 From thartmann at openjdk.org Mon Sep 4 12:40:05 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 4 Sep 2023 12:40:05 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v2] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Tue, 25 Jul 2023 09:56:20 GMT, Xiaohong Gong wrote: >> Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond >> the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field >> can be flattened or not. 
>> >> Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds >> the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. >> >> The original condition [1] matches the atomicity check but not the flat size limitation. Promoting the flat size check before all other checks matches the flattening policy and can make the VM option `InlineFieldMaxFlatSize` work for final fields as well. >> >> This patch also fixed the jtreg crashes involved after the field flattening condition is changed. Those tests fail with setting >> `-XX:+InlineFieldMaxFlatSize=0` by default. The main issue is the non-flattened inline type field is not buffered which is expected to be. The root cause is when parsing `withfield`, the compiler checks whether the field is primitive class type while not its flattening status. Changing to check the flattening status instead can fix the crash. >> >> [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 >> [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html >> [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Not flatten L-descriptor value object field even it is final Sorry, above fix is not correct. I forgot that it's perfectly fine for non-flat but null-free fields to be initialized with null. 
We then need to replace null with the default value. Let's adjust the assert instead:

- assert(!null_free || vt->is_allocated(&gvn), "inline type should be allocated");
+ assert(!null_free || gvn.type(oop)->maybe_null() || vt->is_allocated(&gvn), "inline type should be allocated");

I'll revisit these asserts with JDK-8284443.

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1705199246

From jbhateja at openjdk.org Mon Sep 4 13:30:07 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Mon, 4 Sep 2023 13:30:07 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v2]
In-Reply-To: <4RohhdXd4jdG3j46h9KRDZ2sVCnaRjewyJKfpJ7wtjU=.cd0b5b57-98f6-4d6f-a273-64ee50d4caa2@github.com>
References: <5zDNhbIOYlAa6rP9FXFL4jWABiOF_LMAW0tlazJU614=.863f5b38-0ca2-41db-bf70-24ebc8dc690b@github.com>
 <4KDglG59YehIIgjFBYfviuvHSXuYpZAcGWNYULW9_P8=.77333d8d-b710-4ec9-8771-f850dc141b9c@github.com>
 <4RohhdXd4jdG3j46h9KRDZ2sVCnaRjewyJKfpJ7wtjU=.cd0b5b57-98f6-4d6f-a273-64ee50d4caa2@github.com>
Message-ID:

On Mon, 4 Sep 2023 08:20:25 GMT, Jatin Bhateja wrote:

>>> > Hi @jatin-bhateja, thanks for this refactoring! Code is much cleaner to me. So does it also need to clean the special handling for multifields in c1/interpreter (e.g. `deoptimize.cpp`) ? Thanks!
>>> > Besides, the `secondary_fields_count()` which is broadcasted to the `bundle_size()` of the `ciField` is calculated from `multifield_info` (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/fieldDescriptor.inline.hpp#L81), which I think may not be synced with each other. And it is used widely in C2 and deoptimization. Will it have any issues?
>>
>>
>> I do not think it should be a problem for de-optimization, we have separate handling for re-assignment from vector and scalars locations.
>> >>> > An alternative is checking the vector size supported when calculating the `secondary_fields_count()` in `fieldDescriptor.inline.hpp`. Return the multifield count if the vector size is supported, and return `1` if not. And then, we can use this information both at the ci stage when calculating the `nonstatic_fields` in `ciInstanceKlass.cpp` and C2 compiler. WDYT? >>> >> >> I am not in favor of making changes in oop population, unless there is a pressing need, all we need is to furnish appropriate data to compilers through ci layer. >> >>> Hi @XiaohongGong , Thanks for reporting this, I was planning to revisit this in next cleanup patch, but its good to club it with this one. Will update the PR with needed modifications. >> >> I revisited the implementation and captured the flow in following diagram >> >> ![image](https://github.com/openjdk/valhalla/assets/59989778/c12b7a7c-2f37-4eb3-8108-0ba61687273c) >> >> As discussed earlier, ClassFileParser creates separate FieldInfo structures for each synthetic multifield as its needed for field layout computations. Interpreter directly operates over oop model structures, currently vectors are created by c2 compile only and it accesses oop model through compiler interface (ci). >> >> I think to remove any ambiguity, its best to populate valid bundle_size and set is_multifield_base flag only if target vector size is able to accommodate multifield payload. > >> > > > Hi @jatin-bhateja, thanks for this refactoring! Code is much cleaner to me. So does it also need to clean the special handling for multifields in c1/interpreter (e.g. `deoptimize.cpp`) ? Thanks! >> > > > Besides, the `secondary_fields_count()` which is broadcasted to the `bundle_size()` of the `ciField` is calculated from `multifield_info` (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/fieldDescriptor.inline.hpp#L81), which I think may not be synced with each other. And it is used widely in C2 and deoptimization. 
Will it have any issues? >> > >> > >> > I do not think it should be a problem for de-optimization, we have separate handling for re-assignment from vector and scalars locations. >> >> What I'm worried is whether we can remove the `is_multifield` filter and `secondary_fields_count` check in deoptimization. Since I assumed the `secondary_fields_count` is meaningful only when the multifields are vectorized now, otherwise, the normal multifields have been added into the klass's nonstatic_fields, right? If so, only check whether the field is vector is enough for me. > > Your assumption still holds good, secondary_fields_count is usable only if multifiled is held in a vector, but we need secondary_fields_count to estimate the vector length unless VectorSupport::klass2length is re-introduced. I agree we can do some more cleanup here. > > > > > > Hi @jatin-bhateja, thanks for this refactoring! Code is much cleaner to me. So does it also need to clean the special handling for multifields in c1/interpreter (e.g. `deoptimize.cpp`) ? Thanks! > > > > > > Besides, the `secondary_fields_count()` which is broadcasted to the `bundle_size()` of the `ciField` is calculated from `multifield_info` (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/fieldDescriptor.inline.hpp#L81), which I think may not be synced with each other. And it is used widely in C2 and deoptimization. Will it have any issues? > > > > > > > > > > > > I do not think it should be a problem for de-optimization, we have separate handling for re-assignment from vector and scalars locations. > > > > > > > > > What I'm worried is whether we can remove the `is_multifield` filter and `secondary_fields_count` check in deoptimization. Since I assumed the `secondary_fields_count` is meaningful only when the multifields are vectorized now, otherwise, the normal multifields have been added into the klass's nonstatic_fields, right? 
If so, only check whether the field is vector is enough for me. > > > > > > You assumption still holds good, secondary_fields_count is usable only if multifiled is held in a vector, but we need secondary_fields_count to estimate the vector length unless VectorSupport::klass2length is re-introduced. I agree we can do some more cleanup here. > > Yes, I agree it is needed. I mean the filter for multifield (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/deoptimization.cpp#L1567) and the scalar field reassign loop with `secondary_fields_count` as the loop limit. Can these two part be cleaned like before? Hi @XiaohongGong , I am not in favor of modifying oops structures which are populated during raw bytecode parsing, while we can always prune ci model to present ciField/ciMultifield structures in desired format based on target supported vector size, but a change in oop model will also trigger changes in field layout computations. An iteration over raw fields of a klass should see a separate FieldInfo for each synthetic and base multifield. Also Oops model is common across different execution engines (Compilers and Interpreter), while C2 has ability to create vector IR other do not. Hence by construction Oops model should not be influenced by target vector size. We can revisit this at a later stage if needed. ------------- PR Comment: https://git.openjdk.org/valhalla/pull/918#issuecomment-1705273200 From jbhateja at openjdk.org Tue Sep 5 00:34:25 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 5 Sep 2023 00:34:25 GMT Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. 
[v7] In-Reply-To: References: Message-ID: > Patch adds a new API _ciEnv::is_multifield_scalarized_, to scalarize multifield (ciField[s]) in case target vector cannot accommodate multifield bundle size, else it creates a hierarchical structure ciMultiField and expose entire multifield bundle as one field to C2 compiler. > > This cleans up special handling done in C2 compiler, ci field query APIs and object reconstruction handling at SafePoint. > > Please review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision: - Minor cleanup - Check to ensure C1 compiler only build is successful. ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/918/files - new: https://git.openjdk.org/valhalla/pull/918/files/bcb1c585..a09ac532 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=918&range=06 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=918&range=05-06 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/valhalla/pull/918.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/918/head:pull/918 PR: https://git.openjdk.org/valhalla/pull/918 From xgong at openjdk.org Tue Sep 5 02:07:00 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 5 Sep 2023 02:07:00 GMT Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v7] In-Reply-To: References: Message-ID: <2jKkq_-FBFp_XbYj8jCMAx2RLfYGhyZjzELVcMviTCU=.a2497c66-6ccf-46c9-82d0-c4042a26b6e5@github.com> On Tue, 5 Sep 2023 00:34:25 GMT, Jatin Bhateja wrote: >> Patch adds a new API _ciEnv::is_multifield_scalarized_, to scalarize multifield (ciField[s]) in case target vector cannot accommodate multifield bundle size, else it creates a hierarchical structure ciMultiField and expose entire multifield bundle as one field to C2 compiler. 
>> >> This cleans up special handling done in C2 compiler, ci field query APIs and object reconstruction handling at SafePoint. >> >> Please review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision: > > - Minor cleanup > - Check to ensure C1 compiler only build is successful. Marked as reviewed by xgong (Committer). ------------- PR Review: https://git.openjdk.org/valhalla/pull/918#pullrequestreview-1610081306 From xgong at openjdk.org Tue Sep 5 02:07:01 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 5 Sep 2023 02:07:01 GMT Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v2] In-Reply-To: References: <5zDNhbIOYlAa6rP9FXFL4jWABiOF_LMAW0tlazJU614=.863f5b38-0ca2-41db-bf70-24ebc8dc690b@github.com> <4KDglG59YehIIgjFBYfviuvHSXuYpZAcGWNYULW9_P8=.77333d8d-b710-4ec9-8771-f850dc141b9c@github.com> <4RohhdXd4jdG3j46h9KRDZ2sVCnaRjewyJKfpJ7wtjU=.cd0b5b57-98f6-4d6f-a273-64ee50d4caa2@github.com> Message-ID: <9TSq6oIyt_3bnofGvf-J0QwU6D2ou1tmIb-IqH8_FH4=.33680c3e-119a-4dff-a7bc-9aa15eb121b0@github.com> On Mon, 4 Sep 2023 13:25:53 GMT, Jatin Bhateja wrote: > > > > > > > Hi @jatin-bhateja, thanks for this refactoring! Code is much cleaner to me. So does it also need to clean the special handling for multifields in c1/interpreter (e.g. `deoptimize.cpp`) ? Thanks! > > > > > > > Besides, the `secondary_fields_count()` which is broadcasted to the `bundle_size()` of the `ciField` is calculated from `multifield_info` (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/fieldDescriptor.inline.hpp#L81), which I think may not be synced with each other. And it is used widely in C2 and deoptimization. Will it have any issues? 
> > > > > > > > > > > > > > > I do not think it should be a problem for de-optimization, we have separate handling for re-assignment from vector and scalars locations. > > > > > > > > > > > > What I'm worried is whether we can remove the `is_multifield` filter and `secondary_fields_count` check in deoptimization. Since I assumed the `secondary_fields_count` is meaningful only when the multifields are vectorized now, otherwise, the normal multifields have been added into the klass's nonstatic_fields, right? If so, only check whether the field is vector is enough for me. > > > > > > > > > You assumption still holds good, secondary_fields_count is usable only if multifiled is held in a vector, but we need secondary_fields_count to estimate the vector length unless VectorSupport::klass2length is re-introduced. I agree we can do some more cleanup here. > > > > > > Yes, I agree it is needed. I mean the filter for multifield (see: https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/runtime/deoptimization.cpp#L1567) and the scalar field reassign loop with `secondary_fields_count` as the loop limit. Can these two part be cleaned like before? > > Hi @XiaohongGong , I am not in favor of modifying oops structures which are populated during raw bytecode parsing, while we can always prune ci model to present ciField/ciMultifield structures in desired format based on target supported vector size, but a change in oop model will also trigger changes in field layout computations. > > An iteration over raw fields of a klass should see a separate FieldInfo for each synthetic and base multifield. Also Oops model is common across different execution engines (Compilers and Interpreter), while C2 has ability to create vector IR other do not. Hence by construction Oops model should not be influenced by target vector size. > > We can revisit this at a later stage if needed. OK, make sense to me. 
I assumed that you'v verified the latest change with vector api jtreg tests, and no new regression is found, right? ------------- PR Comment: https://git.openjdk.org/valhalla/pull/918#issuecomment-1705844608 From xgong at openjdk.org Tue Sep 5 02:50:22 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 5 Sep 2023 02:50:22 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v3] In-Reply-To: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: > Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond > the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field > can be flattened or not. > > Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds > the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. > > The original condition [1] matches the atomicity check but not the flat size limitation. Promoting the flat size check before all other checks matches the flattening policy and can make the VM option `InlineFieldMaxFlatSize` work for final fields as well. > > This patch also fixed the jtreg crashes involved after the field flattening condition is changed. 
Those tests fail with setting > `-XX:+InlineFieldMaxFlatSize=0` by default. The main issue is the non-flattened inline type field is not buffered which is expected to be. The root cause is when parsing `withfield`, the compiler checks whether the field is primitive class type while not its flattening status. Changing to check the flattening status instead can fix the crash. > > [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 > [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html > [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Fix "assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be" ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/888/files - new: https://git.openjdk.org/valhalla/pull/888/files/5742e1e8..5a3d0673 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=888&range=02 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=888&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/valhalla/pull/888.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/888/head:pull/888 PR: https://git.openjdk.org/valhalla/pull/888 From xgong at openjdk.org Tue Sep 5 03:00:07 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 5 Sep 2023 03:00:07 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v2] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Mon, 4 Sep 2023 12:37:39 GMT, Tobias Hartmann wrote: > Sorry, above fix is not correct. I forgot that it's perfectly fine for non-flat but null-free fields to be initialized with null. We then need to replace null with the default value. 
Let's adjust the assert instead:
>
> ```
> - assert(!null_free || vt->is_allocated(&gvn), "inline type should be allocated");
> + assert(!null_free || gvn.type(oop)->maybe_null() || vt->is_allocated(&gvn), "inline type should be allocated");
> ```
>
> I'll revisit these asserts with JDK-8284443.

Thanks for looking at this failure! Do you think it's necessary that I adjust the assertion in this PR and revisit it in the future?

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1705878464

From jbhateja at openjdk.org Tue Sep 5 11:17:06 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 5 Sep 2023 11:17:06 GMT
Subject: [lworld+vector] RFR: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. [v7]
In-Reply-To:
References:
Message-ID:

On Tue, 5 Sep 2023 00:34:25 GMT, Jatin Bhateja wrote:

>> Patch adds a new API _ciEnv::is_multifield_scalarized_, to scalarize multifield (ciField[s]) in case target vector cannot accommodate multifield bundle size, else it creates a hierarchical structure ciMultiField and expose entire multifield bundle as one field to C2 compiler.
>>
>> This cleans up special handling done in C2 compiler, ci field query APIs and object reconstruction handling at SafePoint.
>>
>> Please review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision:
>
> - Minor cleanup
> - Check to ensure C1 compiler only build is successful.

There is one intermittent failure in the following test on KNL:

Double64VectorTests.unsliceBinaryDouble64VectorTestsBinary(double[-i * 5], double[cornerCaseValue(i)]): failure

But we have also seen similar assertion failures in other shuffle / mask tests. I plan to address these in a subsequent patch.
------------- PR Comment: https://git.openjdk.org/valhalla/pull/918#issuecomment-1706418952 From jbhateja at openjdk.org Tue Sep 5 11:17:06 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 5 Sep 2023 11:17:06 GMT Subject: [lworld+vector] Integrated: 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. In-Reply-To: References: Message-ID: On Fri, 25 Aug 2023 04:57:19 GMT, Jatin Bhateja wrote: > Patch adds a new API _ciEnv::is_multifield_scalarized_, to scalarize multifield (ciField[s]) in case target vector cannot accommodate multifield bundle size, else it creates a hierarchical structure ciMultiField and expose entire multifield bundle as one field to C2 compiler. > > This cleans up special handling done in C2 compiler, ci field query APIs and object reconstruction handling at SafePoint. > > Please review and share your feedback. > > Best Regards, > Jatin This pull request has now been integrated. Changeset: 4c885f6e Author: Jatin Bhateja URL: https://git.openjdk.org/valhalla/commit/4c885f6e8aec5016ce349c98461f5fa380ef8db4 Stats: 152 lines in 9 files changed: 36 ins; 78 del; 38 mod 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation. Reviewed-by: xgong ------------- PR: https://git.openjdk.org/valhalla/pull/918 From chagedorn at openjdk.org Tue Sep 5 14:14:34 2023 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 5 Sep 2023 14:14:34 GMT Subject: [lworld] RFR: 8313667: [lworld] XBarrierSetC2::clone_at_expansion() uses wrong array copy stub for cloning flat primitive type arrays Message-ID: <4jfPtTzxozsgcM5BeqUR6mXGTmx6gqvWIUCwCU8_HkQ=.83cc178b-8af5-462f-8f9b-9b1ce9b06d90@github.com> The singlegen ZGC version of `XBarrierSetC2::clone_at_expansion()` uses the wrong array copy stub for flat primitive type arrays when expanding an `ArrayCopyNode` for cloning. It wrongly treats a flat array of primitive types as flat array of oop pointers. 
This leads to intermittent wrong executions and crashes in ZGC because we are interpreting primitive values as oops. The fix is straight forward to special case flat arrays in `XBarrierSetC2::clone_at_expansion()` similar to what we already do in `ZBarrierSetC2::clone_at_expansion()`: https://github.com/openjdk/valhalla/blob/3b4cc5fdb038a7363e5ac8a704adacd70701c1ff/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp#L457-L465 Thanks, Christian ------------- Commit messages: - clean up test - 8313667: [lworld] XBarrierSetC2::clone_at_expansion uses wrong array copy stub for flat clone arrays Changes: https://git.openjdk.org/valhalla/pull/924/files Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=924&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313667 Stats: 97 lines in 2 files changed: 96 ins; 0 del; 1 mod Patch: https://git.openjdk.org/valhalla/pull/924.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/924/head:pull/924 PR: https://git.openjdk.org/valhalla/pull/924 From fparain at openjdk.org Tue Sep 5 14:17:51 2023 From: fparain at openjdk.org (Frederic Parain) Date: Tue, 5 Sep 2023 14:17:51 GMT Subject: [lworld] RFR: 8315412 [lworld] Preparing code for lw5 [v3] In-Reply-To: References: Message-ID: > Quite a big patch, but it is mostly made of renaming and mechanical replacement without really changing the logic of the code. > > The patch includes changes to make the VM code less dependent on Q-descriptors by encoding the presence of null-free inline types fields in FieldInfo. > > The patch also includes a lot of renaming, in an effort to have uniformed naming of flat fields and flat arrays across the VM. The renaming has not been applied to the heap dumper and C2. They will be addressed in follow up patches. > > Tested with Mach5, tiers 1 to 3. 
> > Thank you > > Fred Frederic Parain has updated the pull request incrementally with one additional commit since the last revision: Fixing indentations ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/922/files - new: https://git.openjdk.org/valhalla/pull/922/files/80e6f6c7..c220f753 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=922&range=02 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=922&range=01-02 Stats: 36 lines in 9 files changed: 6 ins; 6 del; 24 mod Patch: https://git.openjdk.org/valhalla/pull/922.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/922/head:pull/922 PR: https://git.openjdk.org/valhalla/pull/922 From fparain at openjdk.org Tue Sep 5 14:17:54 2023 From: fparain at openjdk.org (Frederic Parain) Date: Tue, 5 Sep 2023 14:17:54 GMT Subject: [lworld] RFR: 8315412 [lworld] Preparing code for lw5 [v2] In-Reply-To: <7_ummO2jCYXdyaPqO1REaEs1QUWRg71k-IDkbfOSi2w=.dc050187-062b-4725-a9a2-7d029ed7c855@github.com> References: <7_ummO2jCYXdyaPqO1REaEs1QUWRg71k-IDkbfOSi2w=.dc050187-062b-4725-a9a2-7d029ed7c855@github.com> Message-ID: On Thu, 31 Aug 2023 19:46:01 GMT, Frederic Parain wrote: >> Quite a big patch, but it is mostly made of renaming and mechanical replacement without really changing the logic of the code. >> >> The patch includes changes to make the VM code less dependent on Q-descriptors by encoding the presence of null-free inline types fields in FieldInfo. >> >> The patch also includes a lot of renaming, in an effort to have uniformed naming of flat fields and flat arrays across the VM. The renaming has not been applied to the heap dumper and C2. They will be addressed in follow up patches. >> >> Tested with Mach5, tiers 1 to 3. >> >> Thank you >> >> Fred > > Frederic Parain has updated the pull request incrementally with one additional commit since the last revision: > > Addresing Lois' comments Lois, Tobias, thank you for your reviews, comments and indentations have been fixed. 
Fred ------------- PR Comment: https://git.openjdk.org/valhalla/pull/922#issuecomment-1706702087 From fparain at openjdk.org Tue Sep 5 14:21:09 2023 From: fparain at openjdk.org (Frederic Parain) Date: Tue, 5 Sep 2023 14:21:09 GMT Subject: [lworld] Integrated: 8315412 [lworld] Preparing code for lw5 In-Reply-To: References: Message-ID: <66fP4opwIJNLsfiF57tDnM1Q_dZYGuJH2BSf6NA9nas=.2ac10e96-3c64-4002-af50-a447b35d5b20@github.com> On Wed, 30 Aug 2023 20:52:42 GMT, Frederic Parain wrote: > Quite a big patch, but it is mostly made of renaming and mechanical replacement without really changing the logic of the code. > > The patch includes changes to make the VM code less dependent on Q-descriptors by encoding the presence of null-free inline types fields in FieldInfo. > > The patch also includes a lot of renaming, in an effort to have uniformed naming of flat fields and flat arrays across the VM. The renaming has not been applied to the heap dumper and C2. They will be addressed in follow up patches. > > Tested with Mach5, tiers 1 to 3. > > Thank you > > Fred This pull request has now been integrated. 
Changeset: 202c5cc2 Author: Frederic Parain URL: https://git.openjdk.org/valhalla/commit/202c5cc2e0f5f0a4464c71ea02e42467974708a8 Stats: 718 lines in 79 files changed: 105 ins; 94 del; 519 mod 8315412: [lworld] Preparing code for lw5 Reviewed-by: lfoltan, thartmann ------------- PR: https://git.openjdk.org/valhalla/pull/922 From chagedorn at openjdk.org Tue Sep 5 14:40:41 2023 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 5 Sep 2023 14:40:41 GMT Subject: [lworld] RFR: 8313667: [lworld] XBarrierSetC2::clone_at_expansion() uses wrong array copy stub for cloning flat primitive type arrays [v2] In-Reply-To: <4jfPtTzxozsgcM5BeqUR6mXGTmx6gqvWIUCwCU8_HkQ=.83cc178b-8af5-462f-8f9b-9b1ce9b06d90@github.com> References: <4jfPtTzxozsgcM5BeqUR6mXGTmx6gqvWIUCwCU8_HkQ=.83cc178b-8af5-462f-8f9b-9b1ce9b06d90@github.com> Message-ID: > The singlegen ZGC version of `XBarrierSetC2::clone_at_expansion()` uses the wrong array copy stub for flat primitive type arrays when expanding an `ArrayCopyNode` for cloning. It wrongly treats a flat array of primitive types as flat array of oop pointers. This leads to intermittent wrong executions and crashes in ZGC because we are interpreting primitive values as oops. 
> > The fix is straight forward to special case flat arrays in `XBarrierSetC2::clone_at_expansion()` similar to what we already do in `ZBarrierSetC2::clone_at_expansion()`: > https://github.com/openjdk/valhalla/blob/3b4cc5fdb038a7363e5ac8a704adacd70701c1ff/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp#L457-L465 > > Thanks, > Christian Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: Remove braces ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/924/files - new: https://git.openjdk.org/valhalla/pull/924/files/acdbd737..41fa08dc Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=924&range=01 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=924&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/valhalla/pull/924.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/924/head:pull/924 PR: https://git.openjdk.org/valhalla/pull/924 From thartmann at openjdk.org Tue Sep 5 14:40:42 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 5 Sep 2023 14:40:42 GMT Subject: [lworld] RFR: 8313667: [lworld] XBarrierSetC2::clone_at_expansion() uses wrong array copy stub for cloning flat primitive type arrays [v2] In-Reply-To: References: <4jfPtTzxozsgcM5BeqUR6mXGTmx6gqvWIUCwCU8_HkQ=.83cc178b-8af5-462f-8f9b-9b1ce9b06d90@github.com> Message-ID: On Tue, 5 Sep 2023 14:35:57 GMT, Christian Hagedorn wrote: >> The singlegen ZGC version of `XBarrierSetC2::clone_at_expansion()` uses the wrong array copy stub for flat primitive type arrays when expanding an `ArrayCopyNode` for cloning. It wrongly treats a flat array of primitive types as flat array of oop pointers. This leads to intermittent wrong executions and crashes in ZGC because we are interpreting primitive values as oops. 
>> >> The fix is straight forward to special case flat arrays in `XBarrierSetC2::clone_at_expansion()` similar to what we already do in `ZBarrierSetC2::clone_at_expansion()`: >> https://github.com/openjdk/valhalla/blob/3b4cc5fdb038a7363e5ac8a704adacd70701c1ff/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp#L457-L465 >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > Remove braces Thanks for working on this, Christian! The fix looks good to me. ------------- Marked as reviewed by thartmann (Committer). PR Review: https://git.openjdk.org/valhalla/pull/924#pullrequestreview-1611213556 From chagedorn at openjdk.org Wed Sep 6 06:40:00 2023 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 6 Sep 2023 06:40:00 GMT Subject: [lworld] RFR: 8313667: [lworld] XBarrierSetC2::clone_at_expansion() uses wrong array copy stub for cloning flat primitive type arrays [v2] In-Reply-To: References: <4jfPtTzxozsgcM5BeqUR6mXGTmx6gqvWIUCwCU8_HkQ=.83cc178b-8af5-462f-8f9b-9b1ce9b06d90@github.com> Message-ID: On Tue, 5 Sep 2023 14:40:41 GMT, Christian Hagedorn wrote: >> The singlegen ZGC version of `XBarrierSetC2::clone_at_expansion()` uses the wrong array copy stub for flat primitive type arrays when expanding an `ArrayCopyNode` for cloning. It wrongly treats a flat array of primitive types as flat array of oop pointers. This leads to intermittent wrong executions and crashes in ZGC because we are interpreting primitive values as oops. 
>> >> The fix is straight forward to special case flat arrays in `XBarrierSetC2::clone_at_expansion()` similar to what we already do in `ZBarrierSetC2::clone_at_expansion()`: >> https://github.com/openjdk/valhalla/blob/3b4cc5fdb038a7363e5ac8a704adacd70701c1ff/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp#L457-L465 >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > Remove braces Thanks Tobias for your review! ------------- PR Comment: https://git.openjdk.org/valhalla/pull/924#issuecomment-1707755783 From chagedorn at openjdk.org Wed Sep 6 06:44:08 2023 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 6 Sep 2023 06:44:08 GMT Subject: [lworld] Integrated: 8313667: [lworld] XBarrierSetC2::clone_at_expansion() uses wrong array copy stub for cloning flat primitive type arrays In-Reply-To: <4jfPtTzxozsgcM5BeqUR6mXGTmx6gqvWIUCwCU8_HkQ=.83cc178b-8af5-462f-8f9b-9b1ce9b06d90@github.com> References: <4jfPtTzxozsgcM5BeqUR6mXGTmx6gqvWIUCwCU8_HkQ=.83cc178b-8af5-462f-8f9b-9b1ce9b06d90@github.com> Message-ID: On Tue, 5 Sep 2023 14:01:09 GMT, Christian Hagedorn wrote: > The singlegen ZGC version of `XBarrierSetC2::clone_at_expansion()` uses the wrong array copy stub for flat primitive type arrays when expanding an `ArrayCopyNode` for cloning. It wrongly treats a flat array of primitive types as flat array of oop pointers. This leads to intermittent wrong executions and crashes in ZGC because we are interpreting primitive values as oops. > > The fix is straight forward to special case flat arrays in `XBarrierSetC2::clone_at_expansion()` similar to what we already do in `ZBarrierSetC2::clone_at_expansion()`: > https://github.com/openjdk/valhalla/blob/3b4cc5fdb038a7363e5ac8a704adacd70701c1ff/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp#L457-L465 > > Thanks, > Christian This pull request has now been integrated. 
Changeset: 261a0457 Author: Christian Hagedorn Committer: Tobias Hartmann URL: https://git.openjdk.org/valhalla/commit/261a0457ec4dbb39427ad9b9e5e12138e4eb7e2b Stats: 98 lines in 3 files changed: 96 ins; 0 del; 2 mod 8313667: [lworld] XBarrierSetC2::clone_at_expansion() uses wrong array copy stub for cloning flat primitive type arrays Reviewed-by: thartmann ------------- PR: https://git.openjdk.org/valhalla/pull/924 From thartmann at openjdk.org Wed Sep 6 11:13:10 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 6 Sep 2023 11:13:10 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v3] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Tue, 5 Sep 2023 02:50:22 GMT, Xiaohong Gong wrote: >> Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond >> the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field >> can be flattened or not. >> >> Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds >> the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. >> >> The original condition [1] matches the atomicity check but not the flat size limitation. 
Promoting the flat size check before all other checks matches the flattening policy and can make the VM option `InlineFieldMaxFlatSize` work for final fields as well. >> >> This patch also fixed the jtreg crashes involved after the field flattening condition is changed. Those tests fail with setting >> `-XX:+InlineFieldMaxFlatSize=0` by default. The main issue is the non-flattened inline type field is not buffered which is expected to be. The root cause is when parsing `withfield`, the compiler checks whether the field is primitive class type while not its flattening status. Changing to check the flattening status instead can fix the crash. >> >> [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 >> [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html >> [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Fix "assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be" Yes, please include it in this PR but on second thought, the proper fix should look like this: - assert(!null_free || vt->is_allocated(&gvn), "inline type should be allocated"); + assert(vt->is_allocated(&gvn) || (null_free && !vk->is_initialized()), "inline type should be allocated"); All InlineTypeNodes returned by this method should be allocated except for null free with uninitialized value class (because we can't load the default oop for those). Testing found some additional issues triggered by your change. Please include below fixes as well. We hit the `"Should have been buffered"` assert because the verification code is too strong. InlineType users don't require buffering because they are removed as well. In addition, the `u->Opcode() != Op_Return || !tf()->returns_inline_type_as_fields()` condition is not required. 
--- a/src/hotspot/share/opto/compile.cpp +++ b/src/hotspot/share/opto/compile.cpp @@ -2045,16 +2045,10 @@ void Compile::process_inline_types(PhaseIterGVN &igvn, bool remove) { #ifdef ASSERT // Verify that inline type is buffered when replacing by oop else if (u->is_InlineType()) { - InlineTypeNode* vt2 = u->as_InlineType(); - for (uint i = 0; i < vt2->field_count(); ++i) { - if (vt2->field_value(i) == vt && !vt2->field_is_flattened(i)) { - // Use in non-flat field - must_be_buffered = true; - } - } + // InlineType uses don't need buffering because they are about to be replaced as well } else if (u->is_Phi()) { // TODO 8302217 Remove this once InlineTypeNodes are reliably pushed through - } else if (u->Opcode() != Op_Return || !tf()->returns_inline_type_as_fields()) { + } else { must_be_buffered = true; } if (must_be_buffered && !vt->is_allocated(&igvn)) { When flattening is disabled, we sometimes bail out from compiling `test51` due to `COMPILE SKIPPED: unsupported calling sequence`. That's expected because the call to the `test` method already occupies a lot of stack space. 
I adjusted the test: diff --git a/test/hotspot/jtreg/compiler/valhalla/inlinetypes/TestLWorld.java b/test/hotspot/jtreg/compiler/valhalla/inlinetypes/TestLWorld.java index e1ecca66bd0..d34feba1c48 100644 --- a/test/hotspot/jtreg/compiler/valhalla/inlinetypes/TestLWorld.java +++ b/test/hotspot/jtreg/compiler/valhalla/inlinetypes/TestLWorld.java @@ -1609,10 +1609,15 @@ public class TestLWorld { } } + // Pass arguments via fields to avoid exzessive spilling leading to compilation bailouts + static Test51Value test51_arg1; + static MyValue1 test51_arg2; + static Object test51_arg3; + // Same as test2 but with field holder being an inline type @Test - public long test51(Test51Value holder, MyValue1 vt1, Object vt2) { - return holder.test(holder, vt1, vt2); + public long test51() { + return test51_arg1.test(test51_arg1, test51_arg2, test51_arg3); } @Run(test = "test51") @@ -1622,7 +1627,10 @@ public class TestLWorld { Test51Value holder = new Test51Value(); Asserts.assertEQ(testValue1.hash(), vt.hash()); Asserts.assertEQ(holder.valueField1.hash(), vt.hash()); - long result = test51(holder, vt, vt); + test51_arg1 = holder; + test51_arg2 = vt; + test51_arg3 = vt; + long result = test51(); Asserts.assertEQ(result, 9*vt.hash() + def.hashPrimitive()); } ------------- PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1708138308 From jbhateja at openjdk.org Wed Sep 6 12:17:33 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 6 Sep 2023 12:17:33 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v6] In-Reply-To: References: Message-ID: > Starting with 4th Generation Xeon, Intel has made extensive extensions to existing ISA to support 16 bit scalar and vector floating point operations based on IEEE 754 binary16 format. > > We plan to support this in multiple stages spanning across Java side definition of Float16 type, scalar operation and finally SLP vectorization support. 
> > This patch adds minimal Java and Compiler side support for one API Float16.add. > > **Summary of changes :-** > - Minimal implementation of Float16 primitive class supporting one operation (Float16.add) > - X86 AVX512-FP16 feature detection at VM startup. > - C2 IR and Inline expander changes for Float16.add API. > - FP16 constant folding handling. > - Backend support : Instruction selection patterns and assembler support. > - New IR framework and functional tests. > > **Implementation details:-** > > 1/ Newly defined Float16 class encapsulates a short value holding an IEEE 754 binary16 encoded value. > > 2/ Float16 is a primitive class which in future will be aligned with other enhanced primitive wrapper classes proposed by [JEP-402](https://openjdk.org/jeps/402). > > 3/ Float16 to support all the operations supported by corresponding Float class. > > 4/ Java implementation of each API will internally perform floating point operation at FP32 granularity. > > 5/ API which can be directly mapped to an Intel AVX512FP16 instruction will be a candidate for intrinsification by the C2 compiler. > > 6/ With Valhalla, C2 compiler always creates an InlineType IR node for a value class instance. > Total number of inputs of an InlineType node matches the number of non-static fields. In this case node will have one input of short type TypeInt::SHORT. > > 7/ Since all the scalar AVX512FP16 instructions operate on floating point registers and Float16 backing storage is held in a general-purpose register, we need to introduce appropriate conversion IR which moves a 16-bit value from GPR to an XMM register and vice versa. > ![image](https://github.com/openjdk/valhalla/assets/59989778/192fca7e-6b7e-4e62-9b09-677e33eca48d) > > 8/ Current plan is to introduce a new IR node for each operation which is a subclass of its corresponding single precision IR node. This will allow leveraging idealization routines (Ideal/Identity/Value) of its parent operation.
> > 9/ All the single/double precision IR nodes carry a Type::FLOAT/DOUBLE ideal type. This represents entire FP32/64 value range and is different from integral types which explicitly record lower and upper bounds of value ranges. Value resolution ... Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Tuning backend to use new 16 bit moves b/w GPR and XMMs. ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/848/files - new: https://git.openjdk.org/valhalla/pull/848/files/53ac8929..9e2e330e Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=848&range=05 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=848&range=04-05 Stats: 35 lines in 5 files changed: 26 ins; 3 del; 6 mod Patch: https://git.openjdk.org/valhalla/pull/848.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/848/head:pull/848 PR: https://git.openjdk.org/valhalla/pull/848 From xgong at openjdk.org Thu Sep 7 03:59:08 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 7 Sep 2023 03:59:08 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v3] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Wed, 6 Sep 2023 11:09:46 GMT, Tobias Hartmann wrote: > Yes, please include it in this PR but on second thought, the proper fix should look like this: > > ``` > - assert(!null_free || vt->is_allocated(&gvn), "inline type should be allocated"); > + assert(vt->is_allocated(&gvn) || (null_free && !vk->is_initialized()), "inline type should be allocated"); > ``` > > All InlineTypeNodes returned by this method should be allocated except for null free with uninitialized value class (because we can't load the default oop for those). New assertion makes sense to me. I will update the change in this PR. Thanks so much for the fixing! 
> Testing found some additional issues triggered by your change. Please include below fixes as well. > > We hit the `"Should have been buffered"` assert because the verification code is too strong. InlineType users don't require buffering because they are removed as well. In addition, the `u->Opcode() != Op_Return || !tf()->returns_inline_type_as_fields()` condition is not required. > > ``` > --- a/src/hotspot/share/opto/compile.cpp > +++ b/src/hotspot/share/opto/compile.cpp > @@ -2045,16 +2045,10 @@ void Compile::process_inline_types(PhaseIterGVN &igvn, bool remove) { > #ifdef ASSERT > // Verify that inline type is buffered when replacing by oop > else if (u->is_InlineType()) { > - InlineTypeNode* vt2 = u->as_InlineType(); > - for (uint i = 0; i < vt2->field_count(); ++i) { > - if (vt2->field_value(i) == vt && !vt2->field_is_flattened(i)) { > - // Use in non-flat field > - must_be_buffered = true; > - } > - } > + // InlineType uses don't need buffering because they are about to be replaced as well > } else if (u->is_Phi()) { > // TODO 8302217 Remove this once InlineTypeNodes are reliably pushed through > - } else if (u->Opcode() != Op_Return || !tf()->returns_inline_type_as_fields()) { > + } else { > must_be_buffered = true; > } > if (must_be_buffered && !vt->is_allocated(&igvn)) { > ``` Also thanks for this fixing! Applying such patch can clean new involved tests failures with `-XX:InlineFieldMaxFlatSize=0`. But I don't quite understand it well. Following is my questions: 1. This change seems extended the `buffer check` for users besides `InlineType`, didn't it? So if the current `InlineTypeNode` is used in a method call which expects the `InlineType` is scalarized as arguments/return, is the buffer for the `InlineType` still needed? 2. Conversely, if an `InlineType` is a non-flattened field of another `InlineType`, why is the buffer not needed although the user `InlineType` will be replaced with its oop? 
Per my understanding, the oop is the class instance of the `InlineType`, which also contains the fields. Thinking about it another way, I assume that all `InlineType` nodes have been scalarized into their users, where possible, before this final replacement (i.e. replacement with the oop). So in almost all cases, the remaining `InlineTypeNode` must be buffered at this point, right? ------------- PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1709438654 From xgong at openjdk.org Thu Sep 7 04:03:42 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 7 Sep 2023 04:03:42 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: > Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond > the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field > can be flattened or not. > > Field flattening has two major side effects: atomicity and size. Fields with an atomic access limitation, or with a size that exceeds > the specified threshold value, cannot be flattened. Final fields are special in that they are immutable once initialized, so the atomicity check can be ignored for them. Hence, 1) for an atomicity-free type like a primitive class, both final and non-final fields of that type can be flattened, and 2) for a normal value class that requires atomicity, only final fields of that type can be flattened. In all cases, a flattened field must not exceed the specified max flat size. Please see more details from [1] [2]. > > The original condition [1] matches the atomicity check but not the flat size limitation.
Promoting the flat size check before all other checks matches the flattening policy and makes the VM option `InlineFieldMaxFlatSize` work for final fields as well. > > This patch also fixes the jtreg crashes introduced after the field flattening condition is changed. Those tests fail with setting > `-XX:InlineFieldMaxFlatSize=0` by default. The main issue is that the non-flattened inline type field is not buffered although it is expected to be. The root cause is that, when parsing `withfield`, the compiler checks whether the field has a primitive class type rather than its flattening status. Checking the flattening status instead fixes the crash. > > [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 > [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html > [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains four commits: - Merge 'valhalla:lworld' into JDK-8311219 - Fix "assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be" - Not flatten L-descriptor value object field even it is final - 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well ------------- Changes: https://git.openjdk.org/valhalla/pull/888/files Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=888&range=03 Stats: 7 lines in 4 files changed: 3 ins; 0 del; 4 mod Patch: https://git.openjdk.org/valhalla/pull/888.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/888/head:pull/888 PR: https://git.openjdk.org/valhalla/pull/888 From jbhateja at openjdk.org Thu Sep 7 06:28:12 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 7 Sep 2023 06:28:12 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 7 Sep 2023 04:03:42 GMT, Xiaohong Gong wrote: >> Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond >> the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field >> can be flattened or not. >> >> Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds >> the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. 
And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. >> >> The original condition [1] matches the atomicity check but not the flat size limitation. Promoting the flat size check before all other checks matches the flattening policy and makes the VM option `InlineFieldMaxFlatSize` work for final fields as well. >> >> This patch also fixes the jtreg crashes introduced after the field flattening condition is changed. Those tests fail with setting >> `-XX:InlineFieldMaxFlatSize=0` by default. The main issue is that the non-flattened inline type field is not buffered although it is expected to be. The root cause is that, when parsing `withfield`, the compiler checks whether the field has a primitive class type rather than its flattening status. Checking the flattening status instead fixes the crash. >> >> [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 >> [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html >> [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html > > Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge 'valhalla:lworld' into JDK-8311219 > - Fix "assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be" > - Not flatten L-descriptor value object field even it is final > - 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well src/hotspot/share/opto/inlinetypenode.cpp line 147: > 145: if (val1->is_InlineType()) { > 146: if (val2->is_Phi()) { > 147: val2 = gvn->transform(val2); Can you also add an assertion check here to ensure val2 is always an InlineTypeNode which was pushed forward through the PhiNode? Please add a test case exercising this control flow; if a test already exists, kindly mention it on this PR.
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1318136662 From jbhateja at openjdk.org Thu Sep 7 06:40:09 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 7 Sep 2023 06:40:09 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 7 Sep 2023 04:03:42 GMT, Xiaohong Gong wrote: >> Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond >> the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field >> can be flattened or not. >> >> Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds >> the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. >> >> The original condition [1] matches the atomicity check but not the flat size limitation. Promoting the flat size check before all other checks matches the flattening policy and makes the VM option `InlineFieldMaxFlatSize` work for final fields as well. >> >> This patch also fixes the jtreg crashes introduced after the field flattening condition is changed. Those tests fail with setting >> `-XX:InlineFieldMaxFlatSize=0` by default.
The main issue is the non-flattened inline type field is not buffered which is expected to be. The root cause is when parsing `withfield`, the compiler checks whether the field is primitive class type while not its flattening status. Changing to check the flattening status instead can fix the crash. >> >> [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 >> [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html >> [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html > > Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge 'valhalla:lworld' into JDK-8311219 > - Fix "assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be" > - Not flatten L-descriptor value object field even it is final > - 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well src/hotspot/share/classfile/fieldLayoutBuilder.cpp line 761: > 759: // volatile fields are currently never flatten, this could change in the future > 760: } > 761: if (!(too_big_to_flatten | too_atomic_to_flatten | too_volatile_to_flatten)) { Please also add an IR framework test to check primitive class final/non-final field flattening, given that InlineTypeNode are removed during Optimization hence such a check has to be done on post Parse IR. 
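The flattening policy described in the PR text above can be modelled as a small predicate. What follows is a minimal Java sketch, not HotSpot code: the class `FlattenPolicySketch`, the method `canFlatten`, and all parameter names are hypothetical, sizes are assumed to be in bytes, and the logic follows the PR description only (the integrated change may differ, for example for L-descriptor value object fields).

```java
// Hypothetical model of the flattening decision described in the PR text.
// None of these names exist in HotSpot; the real check lives in
// fieldLayoutBuilder.cpp and works on VM-internal data structures.
final class FlattenPolicySketch {

    /**
     * @param layoutSizeBytes       flat layout size of the inline type (assumed unit: bytes)
     * @param maxFlatSizeBytes      stand-in for the value of InlineFieldMaxFlatSize
     * @param typeNeedsAtomicAccess true for a value class with an atomicity requirement,
     *                              false for an atomicity-free type such as a primitive class
     * @param fieldIsFinal          whether the field is declared final
     * @param fieldIsVolatile       whether the field is declared volatile
     */
    static boolean canFlatten(int layoutSizeBytes, int maxFlatSizeBytes,
                              boolean typeNeedsAtomicAccess,
                              boolean fieldIsFinal, boolean fieldIsVolatile) {
        // The size limit applies to every field, final or not.
        boolean tooBigToFlatten = layoutSizeBytes > maxFlatSizeBytes;
        // Final fields are immutable after initialization, so the
        // atomicity check is skipped for them.
        boolean tooAtomicToFlatten = typeNeedsAtomicAccess && !fieldIsFinal;
        // Volatile fields are never flattened.
        boolean tooVolatileToFlatten = fieldIsVolatile;
        return !(tooBigToFlatten || tooAtomicToFlatten || tooVolatileToFlatten);
    }

    public static void main(String[] args) {
        // Atomicity-free type, non-final field, within the size limit.
        System.out.println(canFlatten(16, 128, false, false, false));
        // Same type in a final field, but the limit is set to 0.
        System.out.println(canFlatten(16, 0, false, true, false));
        // Atomic value class: only the final field qualifies.
        System.out.println(canFlatten(16, 128, true, false, false));
        System.out.println(canFlatten(16, 128, true, true, false));
    }
}
```

With a max flat size of 0 in this model, every field with a non-empty layout is rejected by the size check regardless of finality, which is the effect the option is meant to have for final fields as well.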
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1318146566 From xgong at openjdk.org Thu Sep 7 06:56:07 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 7 Sep 2023 06:56:07 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 7 Sep 2023 06:24:48 GMT, Jatin Bhateja wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Merge 'valhalla:lworld' into JDK-8311219 >> - Fix "assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be" >> - Not flatten L-descriptor value object field even it is final >> - 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well > > src/hotspot/share/opto/inlinetypenode.cpp line 147: > >> 145: if (val1->is_InlineType()) { >> 146: if (val2->is_Phi()) { >> 147: val2 = gvn->transform(val2); > > Can you also add an assertion check here to ensure val2 is always an InlineTypeNode which was pushed forward through PhiNode. Please add a test case exercising this control flow, if a test already exist then kindly mention it on this PR. `val2->as_InlineType()` has already contained the assertion. 
The original code makes the test `compiler.valhalla.inlinetypes.TestNullableArrays` crash with:

#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/mnt/local/code/valhalla/src/hotspot/share/opto/node.hpp:977), pid=302375, tid=302391
# assert(is_InlineType()) failed: invalid node class: Phi
#
# JRE version: OpenJDK Runtime Environment (22.0) (fastdebug build 22-internal-git-a477e90b5)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 22-internal-git-a477e90b5, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
# Problematic frame:
# V [libjvm.so+0xdbeaf8] Node::as_InlineType() const [clone .part.0]+0x18
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /mnt/local/code/jtreg/jtreg-git/build/images/jtreg/jtwork/hotspot/scratch/8/core.302375)
#
Unsupported internal testing APIs have been used.

So I think that test already exercises this control flow?

-------------

PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1318160548

From thartmann at openjdk.org Thu Sep 7 07:07:04 2023
From: thartmann at openjdk.org (Tobias Hartmann)
Date: Thu, 7 Sep 2023 07:07:04 GMT
Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v3]
In-Reply-To:
References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com>
Message-ID:

On Thu, 7 Sep 2023 03:56:37 GMT, Xiaohong Gong wrote:

> This change seems extended the buffer check for users besides InlineType, didn't it?

My change does two things:
(1) Remove the `u->Opcode() != Op_Return || !tf()->returns_inline_type_as_fields()` condition because if the user would be a ReturnNode and we would return as fields, the InlineTypeNode should be buffered.
(2) For InlineType users, never require buffering because the InlineType user will be replaced by its oop as well (detailed explanation below). > So if the current InlineTypeNode is used in a method call which expects the InlineType is scalarized as arguments/return, is the buffer for the InlineType still needed? No, it would not be needed but the InlineTypeNode would not be connected to the call anymore at this point. Scalarization of arguments/returns happens much earlier, we would already have wired the field values directly to the call. But note that the field value itself could be an InlineTypeNode and **that one** would need to be buffered. > Conversely, if an InlineType is a non-flattened field of another InlineType, why is the buffer not needed although the user InlineType will be replaced with its oop? Per my understanding, the oop is the class instance of the InlineType, which also contains the fields. It's not needed because we will replace the user separately and then assert that it's buffered (if required). It might well be that the user InlineType does not need to be allocated (it might itself be a flat field value of another InlineType node or something). Asserting that a non-flattened field value is always buffered "itself" is too strong because there can be cases where neither the holder nor the field value need to be buffered. It's sufficient to assert buffering for the InlineTypeNode that has a non-InlineType user. > With another thinking, I assumed maybe all the InlineType have been scalarized to their users if it can be before this final replacement (i.e. replacement with its oop). So for almost all cases, the remaining InlineTypeNode must be buffered at this point, right? Ideally yes but there will be chains of InlineTypeNodes and there is no need for the parent nodes to be buffered when their users are going to be replaced anyway. Also, there's an issue with PhiNodes not being properly replaced (see `TODO 8302217`). 
I run this through all our testing and the assert never triggered. ------------- PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1709587001 From xgong at openjdk.org Thu Sep 7 07:13:31 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 7 Sep 2023 07:13:31 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v5] In-Reply-To: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: <9_ZKkcP30bs7z6ZO36ObEfVLlbaicHrtMLkWskMOYCg=.48cb8cf2-bd14-4113-95a1-fe1a206392d3@github.com> > Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond > the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field > can be flattened or not. > > Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds > the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. > > The original condition [1] matches the atomicity check but not the flat size limitation. Promoting the flat size check before all other checks matches the flattening policy and can make the VM option `InlineFieldMaxFlatSize` work for final fields as well. 
> > This patch also fixed the jtreg crashes involved after the field flattening condition is changed. Those tests fail with setting > `-XX:+InlineFieldMaxFlatSize=0` by default. The main issue is the non-flattened inline type field is not buffered which is expected to be. The root cause is when parsing `withfield`, the compiler checks whether the field is primitive class type while not its flattening status. Changing to check the flattening status instead can fix the crash. > > [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 > [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html > [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Fix jtreg crashes ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/888/files - new: https://git.openjdk.org/valhalla/pull/888/files/5b6f1b36..73d47fc1 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=888&range=04 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=888&range=03-04 Stats: 21 lines in 3 files changed: 8 ins; 6 del; 7 mod Patch: https://git.openjdk.org/valhalla/pull/888.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/888/head:pull/888 PR: https://git.openjdk.org/valhalla/pull/888 From xgong at openjdk.org Thu Sep 7 07:20:00 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 7 Sep 2023 07:20:00 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v3] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 7 Sep 2023 07:02:07 GMT, Tobias Hartmann wrote: > I run this through all our testing and the assert never triggered. Thanks for all the explanation! Make sense! 
I didn't find other failures as well and have pushed these changes to this PR. ------------- PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1709605843 From thartmann at openjdk.org Thu Sep 7 07:29:04 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 7 Sep 2023 07:29:04 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v5] In-Reply-To: <9_ZKkcP30bs7z6ZO36ObEfVLlbaicHrtMLkWskMOYCg=.48cb8cf2-bd14-4113-95a1-fe1a206392d3@github.com> References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> <9_ZKkcP30bs7z6ZO36ObEfVLlbaicHrtMLkWskMOYCg=.48cb8cf2-bd14-4113-95a1-fe1a206392d3@github.com> Message-ID: On Thu, 7 Sep 2023 07:13:31 GMT, Xiaohong Gong wrote: >> Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond >> the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field >> can be flattened or not. >> >> Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds >> the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. >> >> The original condition [1] matches the atomicity check but not the flat size limitation. 
Promoting the flat size check before all other checks matches the flattening policy and can make the VM option `InlineFieldMaxFlatSize` work for final fields as well. >> >> This patch also fixed the jtreg crashes involved after the field flattening condition is changed. Those tests fail with setting >> `-XX:+InlineFieldMaxFlatSize=0` by default. The main issue is the non-flattened inline type field is not buffered which is expected to be. The root cause is when parsing `withfield`, the compiler checks whether the field is primitive class type while not its flattening status. Changing to check the flattening status instead can fix the crash. >> >> [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 >> [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html >> [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Fix jtreg crashes Thanks for making all these changes. Looks good to me! ------------- Marked as reviewed by thartmann (Committer). PR Review: https://git.openjdk.org/valhalla/pull/888#pullrequestreview-1614749233 From jbhateja at openjdk.org Thu Sep 7 07:29:05 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 7 Sep 2023 07:29:05 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 7 Sep 2023 06:53:15 GMT, Xiaohong Gong wrote: >> src/hotspot/share/opto/inlinetypenode.cpp line 147: >> >>> 145: if (val1->is_InlineType()) { >>> 146: if (val2->is_Phi()) { >>> 147: val2 = gvn->transform(val2); >> >> Can you also add an assertion check here to ensure val2 is always an InlineTypeNode which was pushed forward through PhiNode. 
Please add a test case exercising this control flow, if a test already exist then kindly mention it on this PR. > > `val2->as_InlineType()` has already contained the assertion. This original code makes test `compiler.valhalla.inlinetypes.TestNullableArrays` crashes with: > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/mnt/local/code/valhalla/src/hotspot/share/opto/node.hpp:977), pid=302375, tid=302391 > # assert(is_InlineType()) failed: invalid node class: Phi > # > # JRE version: OpenJDK Runtime Environment (22.0) (fastdebug build 22-internal-git-a477e90b5) > # Java VM: OpenJDK 64-Bit Server VM (fastdebug 22-internal-git-a477e90b5, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64) > # Problematic frame: > # V [libjvm.so+0xdbeaf8] Node::as_InlineType() const [clone .part.0]+0x18 > # > # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /mnt/local/code/jtreg/jtreg-git/build/images/jtreg/jtwork/hotspot/scratch/8/core.302375) > # > Unsupported internal testing APIs have been used. > > > So I think it can touch this control flow? Thanks, this should suffice. ------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1318195753 From xgong at openjdk.org Thu Sep 7 07:59:16 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 7 Sep 2023 07:59:16 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 7 Sep 2023 06:37:25 GMT, Jatin Bhateja wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains four commits: >> >> - Merge 'valhalla:lworld' into JDK-8311219 >> - Fix "assert(field->type()->as_inline_klass()->is_initialized()) failed: Must be" >> - Not flatten L-descriptor value object field even it is final >> - 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well > > src/hotspot/share/classfile/fieldLayoutBuilder.cpp line 761: > >> 759: // volatile fields are currently never flatten, this could change in the future >> 760: } >> 761: if (!(too_big_to_flatten | too_atomic_to_flatten | too_volatile_to_flatten)) { > > Please also add an IR framework test to check primitive class final/non-final field flattening, given that InlineTypeNode are removed during Optimization hence such a check has to be done on post Parse IR. The main problem for me is how to check the flatten status for an `InlineTypeNode`. Maybe we can check the decode operation generated by the field accessing (or loading) to an primitive class field. Any better idea for this? ------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1318229377 From thartmann at openjdk.org Thu Sep 7 08:20:11 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 7 Sep 2023 08:20:11 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 7 Sep 2023 07:56:11 GMT, Xiaohong Gong wrote: >> src/hotspot/share/classfile/fieldLayoutBuilder.cpp line 761: >> >>> 759: // volatile fields are currently never flatten, this could change in the future >>> 760: } >>> 761: if (!(too_big_to_flatten | too_atomic_to_flatten | too_volatile_to_flatten)) { >> >> Please also add an IR framework test to check primitive class final/non-final field flattening, given that InlineTypeNode are removed during Optimization hence such a check has to be done on post Parse IR. 
> > The main problem for me is how to check the flatten status for an `InlineTypeNode`. Maybe we can check the decode operation generated by the field accessing (or loading) to an primitive class field. Any better idea for this? Couldn't you just check that there is no oop load for that particular field, i.e., no indirection to load the fields from a heap buffer? ------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1318253311 From jbhateja at openjdk.org Thu Sep 7 09:03:09 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 7 Sep 2023 09:03:09 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 7 Sep 2023 08:16:43 GMT, Tobias Hartmann wrote: >> The main problem for me is how to check the flatten status for an `InlineTypeNode`. Maybe we can check the decode operation generated by the field accessing (or loading) to an primitive class field. Any better idea for this? > > Couldn't you just check that there is no oop load for that particular field, i.e., no indirection to load the fields from a heap buffer? If a primitive class field is flattened, then compiler will not be creating an InlineTypeNode for field access, sub-field will be part of container, isn't that sufficient to make the check, e.g. 
primitive class ABC {
    int f1;
    long f2;
    ABC (int f1, long f2) {
        this.f1 = f1;
        this.f2 = f2;
    }
}


public class flatten {
    public final ABC field;

    public flatten(int val) {
        this.field = new ABC(val, (long)val);
    }

    public static long micro(int val) {
        flatten obj = new flatten(val);
        return obj.field.f1 + obj.field.f2;
    }

    public static void main(String [] args) {
        long res = 0;
        for (int i = 0; i < 10000; i++) {
            res += micro(i);
        }
        System.out.println(res);
    }
}


We see no InlineTypeNode in post parse IR with following command line:-

-XX:CompileCommand=compileonly,flatten::micro -XX:+EnablePrimitiveClasses -cp . flatten

But, field access does generate InlineTypeNode if flatten size explicitly set as 0 with following options.

-XX:InlineFieldMaxFlatSize=0 -XX:CompileCommand=compileonly,flatten::micro -XX:+EnablePrimitiveClasses -cp . flatten

EA will anyways remove non-escapable allocation, but post parse IR verification should be good enough to test your changes ?

-------------

PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1318305754

From xgong at openjdk.org Fri Sep 8 02:36:52 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Fri, 8 Sep 2023 02:36:52 GMT
Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v6]
In-Reply-To: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com>
References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com>
Message-ID:

> Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond
> the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field
> can be flattened or not.
>
> Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds
> the specified threshold value cannot be flattened.
And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. > > The original condition [1] matches the atomicity check but not the flat size limitation. Promoting the flat size check before all other checks matches the flattening policy and can make the VM option `InlineFieldMaxFlatSize` work for final fields as well. > > This patch also fixed the jtreg crashes involved after the field flattening condition is changed. Those tests fail with setting > `-XX:+InlineFieldMaxFlatSize=0` by default. The main issue is the non-flattened inline type field is not buffered which is expected to be. The root cause is when parsing `withfield`, the compiler checks whether the field is primitive class type while not its flattening status. Changing to check the flattening status instead can fix the crash. 
> > [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 > [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html > [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Add IR test for non-flattened inline type fields ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/888/files - new: https://git.openjdk.org/valhalla/pull/888/files/73d47fc1..1df4245b Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=888&range=05 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=888&range=04-05 Stats: 99 lines in 1 file changed: 99 ins; 0 del; 0 mod Patch: https://git.openjdk.org/valhalla/pull/888.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/888/head:pull/888 PR: https://git.openjdk.org/valhalla/pull/888 From xgong at openjdk.org Fri Sep 8 02:36:52 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 8 Sep 2023 02:36:52 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: <3dhLXjVFVq3tf3KjUZhIH2eunRsvXf1_zjjvJEmiUTg=.74043b94-d470-4582-a270-d1fd52376f92@github.com> On Thu, 7 Sep 2023 08:58:05 GMT, Jatin Bhateja wrote: >> Couldn't you just check that there is no oop load for that particular field, i.e., no indirection to load the fields from a heap buffer? > > If a primitive class field is flattened, then compiler will not be creating an InlineTypeNode for field access, sub-field will be part of container, isn't that sufficient to make the check, e.g. 
>
>
> primitive class ABC {
>     int f1;
>     long f2;
>     ABC (int f1, long f2) {
>         this.f1 = f1;
>         this.f2 = f2;
>     }
> }
>
>
> public class flatten {
>     public final ABC field;
>
>     public flatten(int val) {
>         this.field = new ABC(val, (long)val);
>     }
>
>     public static long micro(int val) {
>         flatten obj = new flatten(val);
>         return obj.field.f1 + obj.field.f2;
>     }
>
>     public static void main(String [] args) {
>         long res = 0;
>         for (int i = 0; i < 10000; i++) {
>             res += micro(i);
>         }
>         System.out.println(res);
>     }
> }
>
>
>
> We see no InlineTypeNode in post parse IR with following command line:-
>
> -XX:CompileCommand=compileonly,flatten::micro -XX:+EnablePrimitiveClasses -cp . flatten
>
> But, field access does generate InlineTypeNode if flatten size explicitly set as 0 with following options.
>
> -XX:InlineFieldMaxFlatSize=0 -XX:CompileCommand=compileonly,flatten::micro -XX:+EnablePrimitiveClasses -cp . flatten
>
> EA will anyways remove non-escapable allocation, but post parse IR verification should be good enough to test your changes ?

Hi @jatin-bhateja @TobiHartmann , I've added the IR tests which check the number of oop loads for non-flattened inline type fields. Please take a look and let me know whether it's fine. Thanks in advance!
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1319284823 From jbhateja at openjdk.org Fri Sep 8 04:54:04 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 8 Sep 2023 04:54:04 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4] In-Reply-To: <3dhLXjVFVq3tf3KjUZhIH2eunRsvXf1_zjjvJEmiUTg=.74043b94-d470-4582-a270-d1fd52376f92@github.com> References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> <3dhLXjVFVq3tf3KjUZhIH2eunRsvXf1_zjjvJEmiUTg=.74043b94-d470-4582-a270-d1fd52376f92@github.com> Message-ID: On Fri, 8 Sep 2023 02:31:14 GMT, Xiaohong Gong wrote: >> If a primitive class field is flattened, then compiler will not be creating an InlineTypeNode for field access, sub-field will be part of container, isn't that sufficient to make the check, e.g. >> >> >> primitive class ABC { >> int f1; >> long f2; >> ABC (int f1, long f2) { >> this.f1 = f1; >> this.f2 = f2; >> } >> } >> >> >> public class flatten { >> public final ABC field; >> >> public flatten(int val) { >> this.field = new ABC(val, (long)val); >> } >> >> public static long micro(int val) { >> flatten obj = new flatten(val); >> return obj.field.f1 + obj.field.f2; >> } >> >> public static void main(String [] args) { >> long res = 0; >> for (int i = 0; i < 10000; i++) { >> res += micro(i); >> } >> System.out.println(res); >> } >> } >> >> >> >> We see no InlineTypeNode in post parse IR with following command line:- >> >> -XX:CompileCommand=compileonly,flatten::micro -XX:+EnablePrimitiveClasses -cp . flatten >> >> But, field access does generate InlineTypeNode if flatten size explicitly set as 0 with following options. >> >> -XX:InlineFieldMaxFlatSize=0 -XX:CompileCommand=compileonly,flatten::micro -XX:+EnablePrimitiveClasses -cp . flatten >> >> EA will anyways remove non-escapable allocation, but post parse IR verification should be good enough to test your changes ? 
>
> Hi @jatin-bhateja @TobiHartmann , I've added the IR tests which check the number of oop loads for non-flattened inline type fields. Please take a look and let me know whether it's fine. Thanks in advance!

Thanks @XiaohongGong, LGTM!

-------------

PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1319351799

From xgong at openjdk.org Fri Sep 8 07:10:07 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Fri, 8 Sep 2023 07:10:07 GMT
Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well [v4]
In-Reply-To:
References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> <3dhLXjVFVq3tf3KjUZhIH2eunRsvXf1_zjjvJEmiUTg=.74043b94-d470-4582-a270-d1fd52376f92@github.com>
Message-ID:

On Fri, 8 Sep 2023 04:50:55 GMT, Jatin Bhateja wrote:

>> Hi @jatin-bhateja @TobiHartmann , I've added the IR tests which check the number of oop loads for non-flattened inline type fields. Please take a look and let me know whether it's fine. Thanks in advance!
>
> Thanks @XiaohongGong, LGTM!
> An alternate way could be to just check existence of InlineTypeNode in post parse IR.
> `@IR(phase = {CompilePhase.AFTER_PARSING}, counts = {IRNode.INLINE_TYPE, " > 0 "})`

I also tried to check `InlineTypeNode`, but unfortunately `InlineTypeNode` is also optimized out after parsing because it has no uses in this case, which is the same as when the field is flattened.
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/888#discussion_r1319456944 From xgong at openjdk.org Fri Sep 8 09:10:13 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 8 Sep 2023 09:10:13 GMT Subject: [lworld] RFR: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well In-Reply-To: <13Toexb0NM-WYizAXfLLZiAhvDNMLjeySOLID5JyH6k=.90081b8f-fc73-4f1d-9d27-a0c5200da8d9@github.com> References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> <13Toexb0NM-WYizAXfLLZiAhvDNMLjeySOLID5JyH6k=.90081b8f-fc73-4f1d-9d27-a0c5200da8d9@github.com> Message-ID: On Sat, 19 Aug 2023 04:45:13 GMT, Jatin Bhateja wrote: >>> Are you inclined to re-instantiate the old behavior ? >> >> May I ask what the "old behavior" do you mean? Thanks! > >> > Are you inclined to re-instantiate the old behavior ? >> >> May I ask what the "old behavior" do you mean? Thanks! > > Hi @XiaohongGong , > Updated my previous comment, I am able reproduce an intermittent crash in another test "compiler/valhalla/inlinetypes/TestGetfieldChains.java" with additional JVM options : "-Xbatch -XX:TieredStopAtLevel=1 -XX:CompileThresholdScaling=0.1 -XX:InlineFieldMaxFlatSize=0" > [hs_err_pid1570004.txt](https://github.com/openjdk/valhalla/files/12385038/hs_err_pid1570004.txt) > [replay_pid1570004.txt](https://github.com/openjdk/valhalla/files/12385039/replay_pid1570004.txt) > [TestGetfieldChains.txt](https://github.com/openjdk/valhalla/files/12385040/TestGetfieldChains.txt) > > Please find attached relevant logs and replay file. > > Best Regards, > Jatin Thanks for the review @jatin-bhateja @TobiHartmann! 
------------- PR Comment: https://git.openjdk.org/valhalla/pull/888#issuecomment-1711336215 From xgong at openjdk.org Fri Sep 8 09:10:15 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 8 Sep 2023 09:10:15 GMT Subject: [lworld] Integrated: 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well In-Reply-To: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> References: <5QNYV-nUYCE-Uzrsz81TwXroTK6x-aDxA4b8B1Svfto=.8e61ef68-a35e-4a92-9338-39ac4c083b2b@github.com> Message-ID: On Thu, 20 Jul 2023 09:29:59 GMT, Xiaohong Gong wrote: > Currently all the non-static final fields with inline type can be flattened, even if the layout size of the inline type is beyond > the max flat size specified by `InlineFieldMaxFlatSize`. Please refer to the condition check [1] which decides whether a field > can be flattened or not. > > Field flattening has two major side effects: atomicity and size. Fields with atomic access limitation or large size that exceeds > the specified threshold value cannot be flattened. And final fields are special that they are immutable after initialized. So the atomic check for them can be ignored. Hence, 1) for the atomicity free type like the primitive class, the final and non-final fields with such type can be flattened. And 2) for the normal value class that has atomic feature, only the final fields with such type can be flattened. And all kinds of the flattened fields should not exceed the specified max flat size. Please see more details from [1] [2]. > > The original condition [1] matches the atomicity check but not the flat size limitation. Promoting the flat size check before all other checks matches the flattening policy and can make the VM option `InlineFieldMaxFlatSize` work for final fields as well. > > This patch also fixed the jtreg crashes involved after the field flattening condition is changed. Those tests fail with setting > `-XX:+InlineFieldMaxFlatSize=0` by default. 
The main issue is that the non-flattened inline type field is not buffered, although it is expected to be. The root cause is that, when parsing `withfield`, the compiler checks whether the field has a primitive class type rather than checking its flattening status. Checking the flattening status instead fixes the crash. > > [1] https://github.com/openjdk/valhalla/blob/lworld/src/hotspot/share/classfile/fieldLayoutBuilder.cpp#L759 > [2] https://mail.openjdk.org/pipermail/valhalla-dev/2023-June/011262.html > [3] https://mail.openjdk.org/pipermail/valhalla-dev/2023-July/011265.html This pull request has now been integrated. Changeset: 72cfc5ff Author: Xiaohong Gong URL: https://git.openjdk.org/valhalla/commit/72cfc5ff6b13f48df937d5a6df7e839077a5237c Stats: 127 lines in 7 files changed: 110 ins; 6 del; 11 mod 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well Co-authored-by: Tobias Hartmann Reviewed-by: jbhateja, thartmann ------------- PR: https://git.openjdk.org/valhalla/pull/888 From fparain at openjdk.org Fri Sep 8 14:03:26 2023 From: fparain at openjdk.org (Frederic Parain) Date: Fri, 8 Sep 2023 14:03:26 GMT Subject: [lworld] RFR: 8315935: [lworld] Apply flat renaming to C2 code Message-ID: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> Same renaming as in JDK-8315412 applied to C2 code.
Tested with Mach5, tier1 Fred ------------- Commit messages: - Fixing typos - Merge remote-tracking branch 'upstream/lworld' into c2_renaming - Flat renaming in C2 Changes: https://git.openjdk.org/valhalla/pull/925/files Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=925&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8315935 Stats: 270 lines in 38 files changed: 1 ins; 0 del; 269 mod Patch: https://git.openjdk.org/valhalla/pull/925.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/925/head:pull/925 PR: https://git.openjdk.org/valhalla/pull/925 From jbhateja at openjdk.org Mon Sep 11 08:43:07 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 11 Sep 2023 08:43:07 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: References: Message-ID: > Starting with 4th Generation Xeon, Intel has made extensive extensions to existing ISA to support 16 bit scalar and vector floating point operations based on IEEE 754 binary16 format. > > We plan to support this in multiple stages spanning across Java side definition of Float16 type, scalar operation and finally SLP vectorization support. > > This patch adds minimal Java and Compiler side support for one API Float16.add. > > **Summary of changes :-** > - Minimal implementation of Float16 primitive class supporting one operation (Float16.add) > - X86 AVX512-FP16 feature detection at VM startup. > - C2 IR and Inline expander changes for Float16.add API. > - FP16 constant folding handling. > - Backend support : Instruction selection patterns and assembler support. > - New IR framework and functional tests. > > **Implementation details:-** > > 1/ Newly defined Float16 class encapsulate a short value holding IEEE 754 binary16 encoded value. 
> > 2/ Float16 is a primitive class which in future will be aligned with other enhanced primitive wrapper classes proposed by [JEP-402](https://openjdk.org/jeps/402). > > 3/ Float16 will support all the operations supported by the corresponding Float class. > > 4/ Java implementation of each API will internally perform floating point operation at FP32 granularity. > > 5/ API which can be directly mapped to an Intel AVX512FP16 instruction will be a candidate for intrinsification by C2 compiler. > > 6/ With Valhalla, C2 compiler always creates an InlineType IR node for a value class instance. > Total number of inputs of an InlineType node match the number of non-static fields. In this case node will have one input of short type TypeInt::SHORT. > > 7/ Since all the scalar AVX512FP16 instructions operate on floating point registers and Float16 backing storage is held in a general-purpose register, we need to introduce appropriate conversion IR which moves a 16-bit value from GPR to a XMM register and vice versa. > ![image](https://github.com/openjdk/valhalla/assets/59989778/192fca7e-6b7e-4e62-9b09-677e33eca48d) > > 8/ Current plan is to introduce a new IR node for each operation which is a subclass of its corresponding single precision IR node. This will allow leveraging idealization routines (Ideal/Identity/Value) of its parent operation. > > 9/ All the single/double precision IR nodes carry a Type::FLOAT/DOUBLE ideal type. This represents entire FP32/64 value range and is different from integral types which explicitly record lower and upper bounds of value ranges. Value resolution ... Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Auto-vectorizer support for Float16.sum operation.
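As a rough illustration of point 4/ in the description (each operation computed at FP32 granularity on the Java side), the sketch below adds two binary16 values by widening them to `float`, adding, and narrowing the result back. It relies on the existing `Float.float16ToFloat` and `Float.floatToFloat16` methods (available since JDK 20). The class `Fp16Sketch` and its `add` helper are hypothetical names used only for this example; they are not the `Float16` class introduced by the patch.

```java
public class Fp16Sketch {
    // Hypothetical helper: adds two IEEE 754 binary16 values held in shorts
    // by doing the arithmetic in single precision.
    static short add(short a, short b) {
        float fa = Float.float16ToFloat(a); // widen binary16 -> binary32
        float fb = Float.float16ToFloat(b);
        return Float.floatToFloat16(fa + fb); // round result back to binary16
    }

    public static void main(String[] args) {
        short one = Float.floatToFloat16(1.0f);
        short half = Float.floatToFloat16(0.5f);
        short sum = add(one, half);
        System.out.println(Float.float16ToFloat(sum)); // prints 1.5
    }
}
```

A fallback of this shape is what a C2 intrinsic would presumably replace with a single AVX512-FP16 add when the hardware supports it.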
------------- Changes: - all: https://git.openjdk.org/valhalla/pull/848/files - new: https://git.openjdk.org/valhalla/pull/848/files/9e2e330e..11f2b001 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=848&range=06 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=848&range=05-06 Stats: 159 lines in 13 files changed: 157 ins; 0 del; 2 mod Patch: https://git.openjdk.org/valhalla/pull/848.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/848/head:pull/848 PR: https://git.openjdk.org/valhalla/pull/848 From jbhateja at openjdk.org Mon Sep 11 08:45:06 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 11 Sep 2023 08:45:06 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v6] In-Reply-To: References: Message-ID: On Wed, 6 Sep 2023 12:17:33 GMT, Jatin Bhateja wrote: >> Starting with 4th Generation Xeon, Intel has made extensive extensions to existing ISA to support 16 bit scalar and vector floating point operations based on IEEE 754 binary16 format. >> >> We plan to support this in multiple stages spanning across Java side definition of Float16 type, scalar operation and finally SLP vectorization support. >> >> This patch adds minimal Java and Compiler side support for one API Float16.add. >> >> **Summary of changes :-** >> - Minimal implementation of Float16 primitive class supporting one operation (Float16.add) >> - X86 AVX512-FP16 feature detection at VM startup. >> - C2 IR and Inline expander changes for Float16.add API. >> - FP16 constant folding handling. >> - Backend support : Instruction selection patterns and assembler support. >> - New IR framework and functional tests. >> >> **Implementation details:-** >> >> 1/ Newly defined Float16 class encapsulate a short value holding IEEE 754 binary16 encoded value. 
>> >> 2/ Float16 is a primitive class which in future will be aligned with other enhanced primitive wrapper classes proposed by [JEP-402.](https://openjdk.org/jeps/402) >> >> 3/ Float16 to support all the operations supported by corresponding Float class. >> >> 4/ Java implementation of each API will internally perform floating point operation at FP32 granularity. >> >> 5/ API which can be directly mapped to an Intel AVX512FP16 instruction will be a candidate for intensification by C2 compiler. >> >> 6/ With Valhalla, C2 compiler always creates an InlineType IR node for a value class instance. >> Total number of inputs of an InlineType node match the number of non-static fields. In this case node will have one input of short type TypeInt::SHORT. >> >> 7/ Since all the scalar AVX512FP16 instructions operate on floating point registers and Float16 backing storage is held in a general-purpose register hence we need to introduce appropriate conversion IR which moves a 16-bit value from GPR to a XMM register and vice versa. >> ![image](https://github.com/openjdk/valhalla/assets/59989778/192fca7e-6b7e-4e62-9b09-677e33eca48d) >> >> 8/ Current plan is to introduce a new IR node for each operation which is a subclass of its corresponding single precision IR node. This will allow leveraging idealization routines (Ideal/Identity/Value) of its parent operation. >> >> 9/ All the single/double precision IR nodes carry a Type::FLOAT/DOUBLE ideal type. This represents entire FP32/64 value range and is different from integral types which expli... > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Tuning backend to use new 16 bit moves b/w GPR and XMMs. Performance data for Float16 vector ADD operation on Sapphire Rapids with different configuration. 
![image](https://github.com/openjdk/valhalla/assets/59989778/d86a9a82-297e-440d-9aa0-d4dedec86e95) ------------- PR Comment: https://git.openjdk.org/valhalla/pull/848#issuecomment-1713435287 From thartmann at openjdk.org Tue Sep 12 07:42:13 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 12 Sep 2023 07:42:13 GMT Subject: [lworld] RFR: 8315935: [lworld] Apply flat renaming to C2 code In-Reply-To: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> References: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> Message-ID: <7jjmikRsd_YUZKhRcDs5Rv8k_iFk1dicD47KX4ZDCpo=.5f8f8ad2-6ee7-4019-9f9c-15b13c761aba@github.com> On Fri, 8 Sep 2023 13:56:25 GMT, Frederic Parain wrote: > Same renaming as in JDK-8315412 applied to C2 code. > > Tested with Mach5, tier1 > > Fred Thanks a lot for taking care of this renaming, Fred! I found some remaining occurrences of "flattened", maybe there are more that need to be updated. src/hotspot/share/oops/flatArrayOop.hpp line 32: > 30: #include "runtime/handles.hpp" > 31: > 32: // A flatArrayOop is an array containing flattened inline types (no indirection). Shouldn't this be "flat inline types" instead? src/hotspot/share/opto/loopnode.hpp line 103: > 101: bool is_loop_nest_inner_loop() const { return _loop_flags & LoopNestInnerLoop; } > 102: bool is_loop_nest_outer_loop() const { return _loop_flags & LoopNestLongOuterLoop; } > 103: bool is_flat_arrays() const { return _loop_flags & FlattenedArrays; } Should we also rename `FlattenedArrays`? 
src/hotspot/share/opto/parse2.cpp line 280: > 278: Node* casted_ary = ary; > 279: if (vk != nullptr && !stopped()) { > 280: // Element type is known, cast and store to flattened representation Suggestion: // Element type is known, cast and store to flat representation src/hotspot/share/opto/subnode.cpp line 1050: > 1048: if ((r0->flat_array() && r1->not_flat_array()) || > 1049: (r1->flat_array() && r0->not_flat_array())) { > 1050: // One type is flattened in arrays but the other type is not. Must be unrelated. Suggestion: // One type is flat in arrays but the other type is not. Must be unrelated. src/hotspot/share/opto/subtypenode.cpp line 55: > 53: // Handle inline type arrays > 54: if (subk->flat_array() && superk->not_flat_array()) { > 55: // The subtype is flattened in arrays and the supertype is not flattened in arrays. Must be unrelated. Suggestion: // The subtype is flat in arrays and the supertype is not flat in arrays. Must be unrelated. src/hotspot/share/opto/type.cpp line 5203: > 5201: // Meeting flat inline type array with non-flat array. Adjust (field) offset accordingly. > 5202: if (tary->_flat) { > 5203: // Result is flattened Suggestion: // Result is flat src/hotspot/share/opto/type.cpp line 5207: > 5205: field_off = is_flat() ? field_offset() : tap->field_offset(); > 5206: } else if (below_centerline(ptr)) { > 5207: // Result is non-flattened Suggestion: // Result is non-flat ------------- Marked as reviewed by thartmann (Committer). 
PR Review: https://git.openjdk.org/valhalla/pull/925#pullrequestreview-1621519616 PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1322576078 PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1322578698 PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1322580115 PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1322579665 PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1322580705 PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1322581323 PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1322581494 From fparain at openjdk.org Tue Sep 12 13:27:09 2023 From: fparain at openjdk.org (Frederic Parain) Date: Tue, 12 Sep 2023 13:27:09 GMT Subject: [lworld] RFR: 8315935: [lworld] Apply flat renaming to C2 code In-Reply-To: <7jjmikRsd_YUZKhRcDs5Rv8k_iFk1dicD47KX4ZDCpo=.5f8f8ad2-6ee7-4019-9f9c-15b13c761aba@github.com> References: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> <7jjmikRsd_YUZKhRcDs5Rv8k_iFk1dicD47KX4ZDCpo=.5f8f8ad2-6ee7-4019-9f9c-15b13c761aba@github.com> Message-ID: On Tue, 12 Sep 2023 07:35:38 GMT, Tobias Hartmann wrote: >> Same renaming as in JDK-8315412 applied to C2 code. >> >> Tested with Mach5, tier1 >> >> Fred > > src/hotspot/share/opto/loopnode.hpp line 103: > >> 101: bool is_loop_nest_inner_loop() const { return _loop_flags & LoopNestInnerLoop; } >> 102: bool is_loop_nest_outer_loop() const { return _loop_flags & LoopNestLongOuterLoop; } >> 103: bool is_flat_arrays() const { return _loop_flags & FlattenedArrays; } > > Should we also rename `FlattenedArrays`? Yes, definitively. 
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1323040677 From fparain at openjdk.org Tue Sep 12 19:10:08 2023 From: fparain at openjdk.org (Frederic Parain) Date: Tue, 12 Sep 2023 19:10:08 GMT Subject: [lworld] RFR: 8315935: [lworld] Apply flat renaming to C2 code In-Reply-To: <7jjmikRsd_YUZKhRcDs5Rv8k_iFk1dicD47KX4ZDCpo=.5f8f8ad2-6ee7-4019-9f9c-15b13c761aba@github.com> References: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> <7jjmikRsd_YUZKhRcDs5Rv8k_iFk1dicD47KX4ZDCpo=.5f8f8ad2-6ee7-4019-9f9c-15b13c761aba@github.com> Message-ID: On Tue, 12 Sep 2023 07:33:17 GMT, Tobias Hartmann wrote: >> Same renaming as in JDK-8315412 applied to C2 code. >> >> Tested with Mach5, tier1 >> >> Fred > > src/hotspot/share/oops/flatArrayOop.hpp line 32: > >> 30: #include "runtime/handles.hpp" >> 31: >> 32: // A flatArrayOop is an array containing flattened inline types (no indirection). > > Shouldn't this be "flat inline types" instead? This is one aspect of the renaming that requires some bike-shedding. So far, I've applied the "flat" qualifier to the container, the field or the array, because there's a direct impact on the shape of the container and the way it is accessed. But should we apply the "flat" qualifier to the value itself? The value is the value, a set of constants with a shape that doesn't change whether it is stored in a flat field, a flat array or a standalone instance. Both "flat inline type" and "flattened inline type" sound weird to me, so I don't really have a preference. Does one make more sense or is more informative than the other?
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1323449971 From vromero at openjdk.org Tue Sep 12 20:09:13 2023 From: vromero at openjdk.org (Vicente Romero) Date: Tue, 12 Sep 2023 20:09:13 GMT Subject: RFR: Merge lworld Message-ID: <9TVPzUpLAG69L7v6ZroXr9ZPMEu74EAnqvZmrLeDHeU=.08dbc644-a999-45e5-83b3-88b72502e1ad@github.com> Merge branch 'lworld' into lw5 ------------- Commit messages: - Merge branch 'lworld' into lw5_merge_lworld - 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well - 8313667: [lworld] XBarrierSetC2::clone_at_expansion() uses wrong array copy stub for cloning flat primitive type arrays - 8315412: [lworld] Preparing code for lw5 The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. Changes: https://git.openjdk.org/valhalla/pull/926/files Stats: 941 lines in 85 files changed: 311 ins; 100 del; 530 mod Patch: https://git.openjdk.org/valhalla/pull/926.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/926/head:pull/926 PR: https://git.openjdk.org/valhalla/pull/926 From vromero at openjdk.org Tue Sep 12 20:33:54 2023 From: vromero at openjdk.org (Vicente Romero) Date: Tue, 12 Sep 2023 20:33:54 GMT Subject: RFR: Merge lworld [v2] In-Reply-To: <9TVPzUpLAG69L7v6ZroXr9ZPMEu74EAnqvZmrLeDHeU=.08dbc644-a999-45e5-83b3-88b72502e1ad@github.com> References: <9TVPzUpLAG69L7v6ZroXr9ZPMEu74EAnqvZmrLeDHeU=.08dbc644-a999-45e5-83b3-88b72502e1ad@github.com> Message-ID: > Merge branch 'lworld' into lw5 Vicente Romero has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 28 additional commits since the last revision: - Merge branch 'lworld' into lw5_merge_lworld - Merge lworld - Merge lworld - 8314913: [lw5] null restrictions can only be applied to value classes - 8314899: [lw5] rename j.l.NonAtomic to j.l.LooselyConsistentValue - Merge lworld - 8314181: [lw5] the check for illegal circularity should only be done if value classes are available - 8314165: [lw5] check for illegal circularity at class loading time - Merge lworld - [lw5] regression test cleanup, relocation - ... and 18 more: https://git.openjdk.org/valhalla/compare/d13f662b...de327f30 ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/926/files - new: https://git.openjdk.org/valhalla/pull/926/files/de327f30..de327f30 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=926&range=01 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=926&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/valhalla/pull/926.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/926/head:pull/926 PR: https://git.openjdk.org/valhalla/pull/926 From vromero at openjdk.org Tue Sep 12 20:33:57 2023 From: vromero at openjdk.org (Vicente Romero) Date: Tue, 12 Sep 2023 20:33:57 GMT Subject: Integrated: Merge lworld In-Reply-To: <9TVPzUpLAG69L7v6ZroXr9ZPMEu74EAnqvZmrLeDHeU=.08dbc644-a999-45e5-83b3-88b72502e1ad@github.com> References: <9TVPzUpLAG69L7v6ZroXr9ZPMEu74EAnqvZmrLeDHeU=.08dbc644-a999-45e5-83b3-88b72502e1ad@github.com> Message-ID: On Tue, 12 Sep 2023 20:03:39 GMT, Vicente Romero wrote: > Merge branch 'lworld' into lw5 This pull request has now been integrated. 
Changeset: d0c4b3ea Author: Vicente Romero URL: https://git.openjdk.org/valhalla/commit/d0c4b3ea542f68a5482a1972bce452a8ade11dda Stats: 941 lines in 85 files changed: 311 ins; 100 del; 530 mod Merge lworld ------------- PR: https://git.openjdk.org/valhalla/pull/926 From sviswanathan at openjdk.org Tue Sep 12 23:03:08 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 12 Sep 2023 23:03:08 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: References: Message-ID: On Mon, 11 Sep 2023 08:43:07 GMT, Jatin Bhateja wrote: >> Starting with 4th Generation Xeon, Intel has made extensive extensions to existing ISA to support 16 bit scalar and vector floating point operations based on IEEE 754 binary16 format. >> >> We plan to support this in multiple stages spanning across Java side definition of Float16 type, scalar operation and finally SLP vectorization support. >> >> This patch adds minimal Java and Compiler side support for one API Float16.add. >> >> **Summary of changes :-** >> - Minimal implementation of Float16 primitive class supporting one operation (Float16.add) >> - X86 AVX512-FP16 feature detection at VM startup. >> - C2 IR and Inline expander changes for Float16.add API. >> - FP16 constant folding handling. >> - Backend support : Instruction selection patterns and assembler support. >> - New IR framework and functional tests. >> >> **Implementation details:-** >> >> 1/ Newly defined Float16 class encapsulate a short value holding IEEE 754 binary16 encoded value. >> >> 2/ Float16 is a primitive class which in future will be aligned with other enhanced primitive wrapper classes proposed by [JEP-402.](https://openjdk.org/jeps/402) >> >> 3/ Float16 to support all the operations supported by corresponding Float class. >> >> 4/ Java implementation of each API will internally perform floating point operation at FP32 granularity. 
>> >> 5/ API which can be directly mapped to an Intel AVX512FP16 instruction will be a candidate for intensification by C2 compiler. >> >> 6/ With Valhalla, C2 compiler always creates an InlineType IR node for a value class instance. >> Total number of inputs of an InlineType node match the number of non-static fields. In this case node will have one input of short type TypeInt::SHORT. >> >> 7/ Since all the scalar AVX512FP16 instructions operate on floating point registers and Float16 backing storage is held in a general-purpose register hence we need to introduce appropriate conversion IR which moves a 16-bit value from GPR to a XMM register and vice versa. >> ![image](https://github.com/openjdk/valhalla/assets/59989778/192fca7e-6b7e-4e62-9b09-677e33eca48d) >> >> 8/ Current plan is to introduce a new IR node for each operation which is a subclass of its corresponding single precision IR node. This will allow leveraging idealization routines (Ideal/Identity/Value) of its parent operation. >> >> 9/ All the single/double precision IR nodes carry a Type::FLOAT/DOUBLE ideal type. This represents entire FP32/64 value range and is different from integral types which expli... > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Auto-vectorizer support for Float16.sum operation. Very good work in general. I only have couple of comments in the code, please take a look. Also there might be some further optimization opportunities for the path that comes from Op_ConvF2HF -> ReinterpretS2HF. ConvF2HF is doing the conversion from xmm to xmm register and then moves the xmm to gpr. ReinterpretS2HF then moves from gpr back to xmm. This unnecessary movement from xmm->gpr and from gpr->xmm could be optimized out. 
make/common/JavaCompilation.gmk line 277: > 275: > 276: $1_FLAGS += -g -Xlint:all $$($1_TARGET_RELEASE) $$(PARANOIA_FLAGS) $$(JAVA_WARNINGS_ARE_ERRORS) > 277: $1_FLAGS += $$($1_JAVAC_FLAGS) -XDenablePrimitiveClasses Do we need this change now that we have special handling in the VM for Float16? src/hotspot/cpu/x86/assembler_x86.cpp line 7332: > 7330: void Assembler::evaddph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) { > 7331: assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16"); > 7332: InstructionAttr attributes(vector_len, false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false); uses_vl should be true here. Also wondering if we could create a generic method like fp16varithop() with most of the boiler plate common code and call it from individual instructions like add, sub etc. This generic method we could either create now or when we include additional fp16 instructions. src/hotspot/cpu/x86/x86.ad line 10180: > 10178: ins_encode %{ > 10179: __ vmovw($dst$$Register, $src$$XMMRegister); > 10180: __ movswl($dst$$Register, $dst$$Register); Could we do without movswl here? src/hotspot/share/classfile/vmIntrinsics.hpp line 201: > 199: /* Float16 intrinsics, similar to what we have in Math. */ \ > 200: do_intrinsic(_sum_float16, java_lang_Float16, sum_name, floa16_float16_signature, F_S) \ > 201: do_name(sum_name, "sum") \ could this be called add instead of sum? 
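For context on the `movswl` question above: at the Java level a `short` is widened to `int` by sign extension, so a binary16 bit pattern with its sign bit set produces different 32-bit values depending on whether it is sign-extended or zero-extended. The sketch below shows only that language-level difference; it does not settle whether the extra instruction is required in the `x86.ad` pattern. `SignExtendSketch` and its methods are hypothetical names for this example.

```java
public class SignExtendSketch {
    // Widen 16 raw bits the way Java widens a short: sign extension.
    static int signExtended(short bits) {
        return bits;
    }

    // Widen the same 16 bits with the upper half cleared: zero extension.
    static int zeroExtended(short bits) {
        return Short.toUnsignedInt(bits);
    }

    public static void main(String[] args) {
        short minusOne = (short) 0xBC00; // binary16 encoding of -1.0
        System.out.println(Integer.toHexString(signExtended(minusOne))); // prints ffffbc00
        System.out.println(Integer.toHexString(zeroExtended(minusOne))); // prints bc00
    }
}
```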
------------- PR Review: https://git.openjdk.org/valhalla/pull/848#pullrequestreview-1622919981 PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1323594103 PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1323436092 PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1323681084 PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1323589401 From thartmann at openjdk.org Thu Sep 14 05:13:57 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 14 Sep 2023 05:13:57 GMT Subject: [lworld] RFR: 8315935: [lworld] Apply flat renaming to C2 code In-Reply-To: References: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> <7jjmikRsd_YUZKhRcDs5Rv8k_iFk1dicD47KX4ZDCpo=.5f8f8ad2-6ee7-4019-9f9c-15b13c761aba@github.com> Message-ID: On Tue, 12 Sep 2023 19:07:40 GMT, Frederic Parain wrote: >> src/hotspot/share/oops/flatArrayOop.hpp line 32: >> >>> 30: #include "runtime/handles.hpp" >>> 31: >>> 32: // A flatArrayOop is an array containing flattened inline types (no indirection). >> >> Shouldn't this be "flat inline types" instead? > > This is one aspect of the renaming that requires some bike-shading. > So far, I've applied the "flat" qualifier to the container, the field or the array, because there's a direct impact on the shape of the container and the way it is accessed. > But should we apply the "flat" qualifier to the value itself? The value is the value, a set of constants with a shape that doesn't change if it is stored in a flat field, a flat array or an standalone instance. > Both "flat inline type" and "flattened inline type" sound weird to me, so I don't really have a preference. Does one make more sense or is more informative than the other? 
I don't have a strong opinion but maybe we should simply rephrase this to: `A flatArrayOop is a flat array containing inline types (no indirection).` Or `A flatArrayOop points to a flat array containing inline types (no indirection).` ------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/925#discussion_r1325362330 From fparain at openjdk.org Thu Sep 14 19:49:32 2023 From: fparain at openjdk.org (Frederic Parain) Date: Thu, 14 Sep 2023 19:49:32 GMT Subject: [lworld] RFR: 8315935: [lworld] Apply flat renaming to C2 code [v2] In-Reply-To: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> References: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> Message-ID: > Same renaming as in JDK-8315412 applied to C2 code. > > Tested with Mach5, tier1 > > Fred Frederic Parain has updated the pull request incrementally with one additional commit since the last revision: More renaming ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/925/files - new: https://git.openjdk.org/valhalla/pull/925/files/2c9592ad..c841406c Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=925&range=01 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=925&range=00-01 Stats: 9 lines in 6 files changed: 0 ins; 0 del; 9 mod Patch: https://git.openjdk.org/valhalla/pull/925.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/925/head:pull/925 PR: https://git.openjdk.org/valhalla/pull/925 From thartmann at openjdk.org Fri Sep 15 06:57:05 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 15 Sep 2023 06:57:05 GMT Subject: [lworld] RFR: 8315935: [lworld] Apply flat renaming to C2 code [v2] In-Reply-To: References: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> Message-ID: On Thu, 14 Sep 2023 19:49:32 GMT, Frederic Parain wrote: >> Same renaming as in JDK-8315412 applied to C2 code. 
>> >> Tested with Mach5, tier1 >> >> Fred > > Frederic Parain has updated the pull request incrementally with one additional commit since the last revision: > > More renaming Looks good to me. Thanks! ------------- Marked as reviewed by thartmann (Committer). PR Review: https://git.openjdk.org/valhalla/pull/925#pullrequestreview-1628315444 From fparain at openjdk.org Fri Sep 15 11:12:09 2023 From: fparain at openjdk.org (Frederic Parain) Date: Fri, 15 Sep 2023 11:12:09 GMT Subject: [lworld] RFR: 8315935: [lworld] Apply flat renaming to C2 code [v2] In-Reply-To: References: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> Message-ID: <_wQoYoMKT3DvWs14pudevQrBSPN_WOuj2s4AgWkF-GE=.bc9bd69e-1608-4617-8813-a8835aa12d89@github.com> On Thu, 14 Sep 2023 19:49:32 GMT, Frederic Parain wrote: >> Same renaming as in JDK-8315412 applied to C2 code. >> >> Tested with Mach5, tier1 >> >> Fred > > Frederic Parain has updated the pull request incrementally with one additional commit since the last revision: > > More renaming Thank you for your review and feedback. ------------- PR Comment: https://git.openjdk.org/valhalla/pull/925#issuecomment-1721091581 From fparain at openjdk.org Fri Sep 15 11:12:10 2023 From: fparain at openjdk.org (Frederic Parain) Date: Fri, 15 Sep 2023 11:12:10 GMT Subject: [lworld] Integrated: 8315935: [lworld] Apply flat renaming to C2 code In-Reply-To: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> References: <_kJCZUuoQ6FGzebaluDhw7zK_GfIHWXZRMJYA8uyQuE=.96ce3f1d-f36f-4785-bc3c-67ccc468b65b@github.com> Message-ID: On Fri, 8 Sep 2023 13:56:25 GMT, Frederic Parain wrote: > Same renaming as in JDK-8315412 applied to C2 code. > > Tested with Mach5, tier1 > > Fred This pull request has now been integrated. 
Changeset: 0263bd93 Author: Frederic Parain URL: https://git.openjdk.org/valhalla/commit/0263bd9385f64119e9d41428d3604f5a21430fc9 Stats: 276 lines in 38 files changed: 1 ins; 0 del; 275 mod 8315935: [lworld] Apply flat renaming to C2 code Reviewed-by: thartmann ------------- PR: https://git.openjdk.org/valhalla/pull/925 From jbhateja at openjdk.org Mon Sep 18 17:35:08 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 18 Sep 2023 17:35:08 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: References: Message-ID: On Tue, 12 Sep 2023 21:23:02 GMT, Sandhya Viswanathan wrote: > Do we need this change now that we have special handling in the VM for Float16? Yes, Float16 being a primitive class needs this flag during build time compilation. > Could we do without movswl here? This was done keeping in mind JVM semantics where byte/short are internally promoted to int type since JVM operands and local variables are 32 bit values. > could this be called add instead of sum? I made this change after Paul offline comments, to align it with Float.sum API. ------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1329071277 PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1329071225 PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1329071334 From jbhateja at openjdk.org Mon Sep 18 17:42:07 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 18 Sep 2023 17:42:07 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: References: Message-ID: On Tue, 12 Sep 2023 23:00:24 GMT, Sandhya Viswanathan wrote: > Very good work in general. I only have couple of comments in the code, please take a look. Also there might be some further optimization opportunities for the path that comes from Op_ConvF2HF -> ReinterpretS2HF. 
ConvF2HF is doing the conversion from xmm to xmm register and then moves the xmm to gpr. ReinterpretS2HF then moves from gpr back to xmm. This unnecessary movement from xmm->gpr and from gpr->xmm could be optimized out. Yes, but this can only be done if we factor the xmm -> gpr movement out of ConvF2HF. Its input is an IEEE 754 binary32 floating point value and its output is a binary16 value held in a GPR. It would not be proper to remove ConvF2HF + ReinterpretS2HF by a direct ideal transformation currently. ------------- PR Comment: https://git.openjdk.org/valhalla/pull/848#issuecomment-1724072435 From jbhateja at openjdk.org Mon Sep 18 17:52:38 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 18 Sep 2023 17:52:38 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v8] In-Reply-To: References: Message-ID: > Starting with 4th Generation Xeon, Intel has made extensive extensions to existing ISA to support 16 bit scalar and vector floating point operations based on IEEE 754 binary16 format. > > We plan to support this in multiple stages spanning across Java side definition of Float16 type, scalar operation and finally SLP vectorization support. > > This patch adds minimal Java and Compiler side support for one API Float16.add. > > **Summary of changes :-** > - Minimal implementation of Float16 primitive class supporting one operation (Float16.add) > - X86 AVX512-FP16 feature detection at VM startup. > - C2 IR and Inline expander changes for Float16.add API. > - FP16 constant folding handling. > - Backend support : Instruction selection patterns and assembler support. > - New IR framework and functional tests. > > **Implementation details:-** > > 1/ Newly defined Float16 class encapsulate a short value holding IEEE 754 binary16 encoded value.
> > 2/ Float16 is a primitive class which in future will be aligned with other enhanced primitive wrapper classes proposed by [JEP-402.](https://openjdk.org/jeps/402) > > 3/ Float16 to support all the operations supported by corresponding Float class. > > 4/ Java implementation of each API will internally perform floating point operation at FP32 granularity. > > 5/ API which can be directly mapped to an Intel AVX512FP16 instruction will be a candidate for intrinsification by C2 compiler. > > 6/ With Valhalla, C2 compiler always creates an InlineType IR node for a value class instance. > Total number of inputs of an InlineType node match the number of non-static fields. In this case node will have one input of short type TypeInt::SHORT. > > 7/ Since all the scalar AVX512FP16 instructions operate on floating point registers and Float16 backing storage is held in a general-purpose register, we need to introduce appropriate conversion IR which moves a 16-bit value from GPR to an XMM register and vice versa. > ![image](https://github.com/openjdk/valhalla/assets/59989778/192fca7e-6b7e-4e62-9b09-677e33eca48d) > > 8/ Current plan is to introduce a new IR node for each operation which is a subclass of its corresponding single precision IR node. This will allow leveraging idealization routines (Ideal/Identity/Value) of its parent operation. > > 9/ All the single/double precision IR nodes carry a Type::FLOAT/DOUBLE ideal type. This represents entire FP32/64 value range and is different from integral types which explicitly record lower and upper bounds of value ranges. Value resolution ... Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolution.
------------- Changes: - all: https://git.openjdk.org/valhalla/pull/848/files - new: https://git.openjdk.org/valhalla/pull/848/files/11f2b001..ed10ca90 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=848&range=07 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=848&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/valhalla/pull/848.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/848/head:pull/848 PR: https://git.openjdk.org/valhalla/pull/848 From vromero at openjdk.org Mon Sep 18 20:24:12 2023 From: vromero at openjdk.org (Vicente Romero) Date: Mon, 18 Sep 2023 20:24:12 GMT Subject: RFR: 8316325: [lw5] sync javac with the current JVMS, particularly assertions on new class attributes Message-ID: This PR is implementing assertions on new classfile attributes, see [1] [1] https://cr.openjdk.org/~dlsmith/jep401/jep401-20230519/specs/flattened-heap-jvms.html ------------- Commit messages: - 8316325: [lw5] sync javac with the current JVMS, particularly assertions on new class attributes Changes: https://git.openjdk.org/valhalla/pull/927/files Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=927&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8316325 Stats: 887 lines in 28 files changed: 869 ins; 4 del; 14 mod Patch: https://git.openjdk.org/valhalla/pull/927.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/927/head:pull/927 PR: https://git.openjdk.org/valhalla/pull/927 From vromero at openjdk.org Mon Sep 18 20:36:06 2023 From: vromero at openjdk.org (Vicente Romero) Date: Mon, 18 Sep 2023 20:36:06 GMT Subject: RFR: 8316325: [lw5] sync javac with the current JVMS, particularly assertions on new class attributes [v2] In-Reply-To: References: Message-ID: > This PR is implementing assertions on new classfile attributes, see [1] > > [1] https://cr.openjdk.org/~dlsmith/jep401/jep401-20230519/specs/flattened-heap-jvms.html Vicente Romero has updated the pull request 
incrementally with one additional commit since the last revision: adding new line at the end of file ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/927/files - new: https://git.openjdk.org/valhalla/pull/927/files/f58c3c3e..a29389be Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=927&range=01 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=927&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/valhalla/pull/927.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/927/head:pull/927 PR: https://git.openjdk.org/valhalla/pull/927 From sviswanathan at openjdk.org Mon Sep 18 22:21:07 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 18 Sep 2023 22:21:07 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: References: Message-ID: <6e2WJ1ZcanhBoxEU81xU_-jk8-7-jbXIi5ZumrkWmv8=.6d17a4fa-9457-47a6-a074-b71fd92dd7d1@github.com> On Mon, 18 Sep 2023 17:32:22 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/x86.ad line 10180: >> >>> 10178: ins_encode %{ >>> 10179: __ vmovw($dst$$Register, $src$$XMMRegister); >>> 10180: __ movswl($dst$$Register, $dst$$Register); >> >> Could we do without movswl here? > >> Could we do without movswl here? > > This was done keeping in mind JVM semantics where byte/short are internally promoted to int type since JVM operands and local variables are 32 bit values. But this is actually a float16 here (short is only indicating the 16bit storage) and it is not expected to have any integral operation directly on this. The movswl would unnecessarily add to the pathlength. 
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1329335751 From sviswanathan at openjdk.org Mon Sep 18 22:32:09 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 18 Sep 2023 22:32:09 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v8] In-Reply-To: References: Message-ID: On Mon, 18 Sep 2023 17:52:38 GMT, Jatin Bhateja wrote: >> Starting with 4th Generation Xeon, Intel has made extensive extensions to existing ISA to support 16 bit scalar and vector floating point operations based on IEEE 754 binary16 format. >> >> We plan to support this in multiple stages spanning across Java side definition of Float16 type, scalar operation and finally SLP vectorization support. >> >> This patch adds minimal Java and Compiler side support for one API Float16.add. >> >> **Summary of changes :-** >> - Minimal implementation of Float16 primitive class supporting one operation (Float16.add) >> - X86 AVX512-FP16 feature detection at VM startup. >> - C2 IR and Inline expander changes for Float16.add API. >> - FP16 constant folding handling. >> - Backend support : Instruction selection patterns and assembler support. >> - New IR framework and functional tests. >> >> **Implementation details:-** >> >> 1/ Newly defined Float16 class encapsulates a short value holding IEEE 754 binary16 encoded value. >> >> 2/ Float16 is a primitive class which in future will be aligned with other enhanced primitive wrapper classes proposed by [JEP-402.](https://openjdk.org/jeps/402) >> >> 3/ Float16 to support all the operations supported by corresponding Float class. >> >> 4/ Java implementation of each API will internally perform floating point operation at FP32 granularity. >> >> 5/ API which can be directly mapped to an Intel AVX512FP16 instruction will be a candidate for intrinsification by C2 compiler.
>> >> 6/ With Valhalla, C2 compiler always creates an InlineType IR node for a value class instance. >> Total number of inputs of an InlineType node match the number of non-static fields. In this case node will have one input of short type TypeInt::SHORT. >> >> 7/ Since all the scalar AVX512FP16 instructions operate on floating point registers and Float16 backing storage is held in a general-purpose register hence we need to introduce appropriate conversion IR which moves a 16-bit value from GPR to a XMM register and vice versa. >> ![image](https://github.com/openjdk/valhalla/assets/59989778/192fca7e-6b7e-4e62-9b09-677e33eca48d) >> >> 8/ Current plan is to introduce a new IR node for each operation which is a subclass of its corresponding single precision IR node. This will allow leveraging idealization routines (Ideal/Identity/Value) of its parent operation. >> >> 9/ All the single/double precision IR nodes carry a Type::FLOAT/DOUBLE ideal type. This represents entire FP32/64 value range and is different from integral types which expli... > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comments resolution. Marked as reviewed by sviswanathan (no project role). ------------- PR Review: https://git.openjdk.org/valhalla/pull/848#pullrequestreview-1632073594 From sviswanathan at openjdk.org Mon Sep 18 22:32:09 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 18 Sep 2023 22:32:09 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: References: Message-ID: On Mon, 18 Sep 2023 17:39:08 GMT, Jatin Bhateja wrote: > > Very good work in general. I only have couple of comments in the code, please take a look. Also there might be some further optimization opportunities for the path that comes from Op_ConvF2HF -> ReinterpretS2HF. ConvF2HF is doing the conversion from xmm to xmm register and then moves the xmm to gpr. 
ReinterpretS2HF then moves from gpr back to xmm. This unnecessary movement from xmm->gpr and from gpr->xmm could be optimized out. > > Yes, but this can only be done if we factor the xmm -> gpr movement out of ConvF2HF: its input is an IEEE 754 binary32 floating-point value and its output is a binary16 value held in a GPR. Removing ConvF2HF + ReinterpretS2HF through a direct ideal transformation would not be appropriate at present. One option is an instruct with a match rule for this, something like the following, I think: match(Set dst (ReinterpretS2HF(ConvF2HF src))); ------------- PR Comment: https://git.openjdk.org/valhalla/pull/848#issuecomment-1724547301 From jbhateja at openjdk.org Tue Sep 19 16:16:12 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 19 Sep 2023 16:16:12 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: <6e2WJ1ZcanhBoxEU81xU_-jk8-7-jbXIi5ZumrkWmv8=.6d17a4fa-9457-47a6-a074-b71fd92dd7d1@github.com> References: <6e2WJ1ZcanhBoxEU81xU_-jk8-7-jbXIi5ZumrkWmv8=.6d17a4fa-9457-47a6-a074-b71fd92dd7d1@github.com> Message-ID: <9D7Xi-38uMtmv5uvlG_rdu6OpqPcwRkBE09XgchZQBM=.2edd7a77-7c8c-4eb5-b21a-aa5ecd65c766@github.com> On Mon, 18 Sep 2023 22:18:44 GMT, Sandhya Viswanathan wrote: > But this is actually a float16 here (short is only indicating the 16bit storage) and it is not expected to have any integral operation directly on this. The movswl would unnecessarily add to the pathlength. We do have an API that returns the raw value, and a direct comparison between raw values is done at JVM word (32 bit) granularity. I was planning to remove the additional sign-extending move when we add Float16.compare(short, short), but I think it's better to fix the tests for now and remove this extra instruction.
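The point about raw values being compared at 32-bit granularity can be illustrated with a minimal sketch in plain Java. The class and method names below are hypothetical and are not part of the patch; the sketch also assumes that a move without sign extension leaves the upper 16 bits of the 32-bit register as zero. It shows that the two conventions disagree only when the binary16 sign bit is set:

```java
public class RawBitsPromotion {
    // IEEE 754 binary16 bit patterns (1 sign bit, 5 exponent bits, 10 mantissa bits).
    static final short NEG_ONE = (short) 0xBC00; // -1.0
    static final short POS_ONE = (short) 0x3C00; // +1.0

    /** Value seen in a 32-bit slot after a sign-extending move (movswl-style). */
    static int signExtended(short raw) {
        return raw; // Java widens short to int by sign extension
    }

    /** Value seen in a 32-bit slot if the upper 16 bits are simply left as zero (assumption). */
    static int zeroExtended(short raw) {
        return raw & 0xFFFF;
    }

    public static void main(String[] args) {
        // Sign bit set: the two 32-bit views differ, so an int-level comparison fails.
        System.out.println(Integer.toHexString(signExtended(NEG_ONE))); // ffffbc00
        System.out.println(Integer.toHexString(zeroExtended(NEG_ONE))); // bc00
        // Sign bit clear: both views agree.
        System.out.println(signExtended(POS_ONE) == zeroExtended(POS_ONE)); // true
    }
}
```

Under these assumptions, a test that compares the returned raw bits against a Java short constant would only notice the missing sign extension for negative Float16 values, which is consistent with adjusting the tests rather than keeping the extra instruction.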
------------- PR Review Comment: https://git.openjdk.org/valhalla/pull/848#discussion_r1330384137 From jbhateja at openjdk.org Tue Sep 19 16:41:09 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 19 Sep 2023 16:41:09 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v9] In-Reply-To: References: Message-ID: > Starting with 4th Generation Xeon, Intel has made extensive extensions to existing ISA to support 16 bit scalar and vector floating point operations based on IEEE 754 binary16 format. > > We plan to support this in multiple stages spanning across Java side definition of Float16 type, scalar operation and finally SLP vectorization support. > > This patch adds minimal Java and Compiler side support for one API Float16.add. > > **Summary of changes :-** > - Minimal implementation of Float16 primitive class supporting one operation (Float16.add) > - X86 AVX512-FP16 feature detection at VM startup. > - C2 IR and Inline expander changes for Float16.add API. > - FP16 constant folding handling. > - Backend support : Instruction selection patterns and assembler support. > - New IR framework and functional tests. > > **Implementation details:-** > > 1/ Newly defined Float16 class encapsulates a short value holding IEEE 754 binary16 encoded value. > > 2/ Float16 is a primitive class which in future will be aligned with other enhanced primitive wrapper classes proposed by [JEP-402.](https://openjdk.org/jeps/402) > > 3/ Float16 to support all the operations supported by corresponding Float class. > > 4/ Java implementation of each API will internally perform floating point operation at FP32 granularity. > > 5/ API which can be directly mapped to an Intel AVX512FP16 instruction will be a candidate for intrinsification by C2 compiler. > > 6/ With Valhalla, C2 compiler always creates an InlineType IR node for a value class instance. > Total number of inputs of an InlineType node match the number of non-static fields.
In this case node will have one input of short type TypeInt::SHORT. > > 7/ Since all the scalar AVX512FP16 instructions operate on floating point registers and Float16 backing storage is held in a general-purpose register hence we need to introduce appropriate conversion IR which moves a 16-bit value from GPR to a XMM register and vice versa. > ![image](https://github.com/openjdk/valhalla/assets/59989778/192fca7e-6b7e-4e62-9b09-677e33eca48d) > > 8/ Current plan is to introduce a new IR node for each operation which is a subclass of its corresponding single precision IR node. This will allow leveraging idealization routines (Ideal/Identity/Value) of its parent operation. > > 9/ All the single/double precision IR nodes carry a Type::FLOAT/DOUBLE ideal type. This represents entire FP32/64 value range and is different from integral types which explicitly record lower and upper bounds of value ranges. Value resolution ... Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolutions. ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/848/files - new: https://git.openjdk.org/valhalla/pull/848/files/ed10ca90..b72bb366 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=848&range=08 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=848&range=07-08 Stats: 24 lines in 3 files changed: 19 ins; 2 del; 3 mod Patch: https://git.openjdk.org/valhalla/pull/848.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/848/head:pull/848 PR: https://git.openjdk.org/valhalla/pull/848 From jbhateja at openjdk.org Tue Sep 19 16:41:09 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 19 Sep 2023 16:41:09 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: References: Message-ID: On Mon, 18 Sep 2023 22:27:48 GMT, Sandhya Viswanathan wrote: > > > Very good work in general. 
I only have couple of comments in the code, please take a look. Also there might be some further optimization opportunities for the path that comes from Op_ConvF2HF -> ReinterpretS2HF. ConvF2HF is doing the conversion from xmm to xmm register and then moves the xmm to gpr. ReinterpretS2HF then moves from gpr back to xmm. This unnecessary movement from xmm->gpr and from gpr->xmm could be optimized out. > > > > > > Yes, but this can only be done if we factor the xmm -> gpr movement out of ConvF2HF: its input is an IEEE 754 binary32 floating-point value and its output is a binary16 value held in a GPR. Removing ConvF2HF + ReinterpretS2HF through a direct ideal transformation would not be appropriate at present. > > One option is an instruct with a match rule for this, something like the following, I think: match(Set dst (ReinterpretS2HF(ConvF2HF src))); Agreed, the pattern match will break if ConvF2HF is shared across multiple nodes, but it is still better to have it than no optimization. ------------- PR Comment: https://git.openjdk.org/valhalla/pull/848#issuecomment-1726047994 From jbhateja at openjdk.org Tue Sep 19 16:41:34 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 19 Sep 2023 16:41:34 GMT Subject: [lworld+fp16] RFR: 8308363: Initial compiler support for FP16 scalar operations. [v7] In-Reply-To: References: Message-ID: <0XgJgoSy2XdgQh_A1pCCYNwCqJZe6imQkunEJnzqk-8=.1b15ec45-8d30-4b19-b294-9ed57ddf1d9d@github.com> On Mon, 18 Sep 2023 22:27:48 GMT, Sandhya Viswanathan wrote: >>> Very good work in general. I only have couple of comments in the code, please take a look. Also there might be some further optimization opportunities for the path that comes from Op_ConvF2HF -> ReinterpretS2HF. ConvF2HF is doing the conversion from xmm to xmm register and then moves the xmm to gpr. ReinterpretS2HF then moves from gpr back to xmm. This unnecessary movement from xmm->gpr and from gpr->xmm could be optimized out.
>> >> Yes, but this can only be done if we factor the xmm -> gpr movement out of ConvF2HF: its input is an IEEE 754 binary32 floating-point value and its output is a binary16 value held in a GPR. Removing ConvF2HF + ReinterpretS2HF through a direct ideal transformation would not be appropriate at present. > >> > Very good work in general. I only have couple of comments in the code, please take a look. Also there might be some further optimization opportunities for the path that comes from Op_ConvF2HF -> ReinterpretS2HF. ConvF2HF is doing the conversion from xmm to xmm register and then moves the xmm to gpr. ReinterpretS2HF then moves from gpr back to xmm. This unnecessary movement from xmm->gpr and from gpr->xmm could be optimized out. >> >> Yes, but this can only be done if we factor the xmm -> gpr movement out of ConvF2HF: its input is an IEEE 754 binary32 floating-point value and its output is a binary16 value held in a GPR. Removing ConvF2HF + ReinterpretS2HF through a direct ideal transformation would not be appropriate at present. > > One option is an instruct with a match rule for this, something like the following, I think: > match(Set dst (ReinterpretS2HF(ConvF2HF src))); Hi @sviswa7 , @XiaohongGong , All your comments have been addressed. I am committing this patch. Best Regards ------------- PR Comment: https://git.openjdk.org/valhalla/pull/848#issuecomment-1726057093 From jbhateja at openjdk.org Tue Sep 19 16:45:17 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 19 Sep 2023 16:45:17 GMT Subject: [lworld+fp16] Integrated: 8308363: Initial compiler support for FP16 scalar operations. In-Reply-To: References: Message-ID: On Mon, 22 May 2023 17:07:42 GMT, Jatin Bhateja wrote: > Starting with 4th Generation Xeon, Intel has made extensive extensions to existing ISA to support 16 bit scalar and vector floating point operations based on IEEE 754 binary16 format.
> > We plan to support this in multiple stages spanning across Java side definition of Float16 type, scalar operation and finally SLP vectorization support. > > This patch adds minimal Java and Compiler side support for one API Float16.add. > > **Summary of changes :-** > - Minimal implementation of Float16 primitive class supporting one operation (Float16.add) > - X86 AVX512-FP16 feature detection at VM startup. > - C2 IR and Inline expander changes for Float16.add API. > - FP16 constant folding handling. > - Backend support : Instruction selection patterns and assembler support. > - New IR framework and functional tests. > > **Implementation details:-** > > 1/ Newly defined Float16 class encapsulates a short value holding IEEE 754 binary16 encoded value. > > 2/ Float16 is a primitive class which in future will be aligned with other enhanced primitive wrapper classes proposed by [JEP-402.](https://openjdk.org/jeps/402) > > 3/ Float16 to support all the operations supported by corresponding Float class. > > 4/ Java implementation of each API will internally perform floating point operation at FP32 granularity. > > 5/ API which can be directly mapped to an Intel AVX512FP16 instruction will be a candidate for intrinsification by C2 compiler. > > 6/ With Valhalla, C2 compiler always creates an InlineType IR node for a value class instance. > Total number of inputs of an InlineType node match the number of non-static fields. In this case node will have one input of short type TypeInt::SHORT. > > 7/ Since all the scalar AVX512FP16 instructions operate on floating point registers and Float16 backing storage is held in a general-purpose register, we need to introduce appropriate conversion IR which moves a 16-bit value from GPR to an XMM register and vice versa.
> ![image](https://github.com/openjdk/valhalla/assets/59989778/192fca7e-6b7e-4e62-9b09-677e33eca48d) > > 8/ Current plan is to introduce a new IR node for each operation which is a subclass of its corresponding single precision IR node. This will allow leveraging idealization routines (Ideal/Identity/Value) of its parent operation. > > 9/ All the single/double precision IR nodes carry a Type::FLOAT/DOUBLE ideal type. This represents entire FP32/64 value range and is different from integral types which explicitly record lower and upper bounds of value ranges. Value resolution ... This pull request has now been integrated. Changeset: f03fb4e4 Author: Jatin Bhateja URL: https://git.openjdk.org/valhalla/commit/f03fb4e4ee4d59ed692d0c26ddce260511f544e7 Stats: 883 lines in 33 files changed: 870 ins; 3 del; 10 mod 8308363: Initial compiler support for FP16 scalar operations. Reviewed-by: sviswanathan ------------- PR: https://git.openjdk.org/valhalla/pull/848 From vromero at openjdk.org Tue Sep 19 21:31:04 2023 From: vromero at openjdk.org (Vicente Romero) Date: Tue, 19 Sep 2023 21:31:04 GMT Subject: RFR: 8316325: [lw5] sync javac with the current JVMS, particularly assertions on new class attributes [v3] In-Reply-To: References: Message-ID: > This PR is implementing assertions on new classfile attributes, see [1] > > [1] https://cr.openjdk.org/~dlsmith/jep401/jep401-20230519/specs/flattened-heap-jvms.html Vicente Romero has updated the pull request incrementally with one additional commit since the last revision: minor refactoring ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/927/files - new: https://git.openjdk.org/valhalla/pull/927/files/a29389be..cdf0e82e Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=927&range=02 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=927&range=01-02 Stats: 12 lines in 8 files changed: 3 ins; 3 del; 6 mod Patch: https://git.openjdk.org/valhalla/pull/927.diff Fetch: git fetch 
https://git.openjdk.org/valhalla.git pull/927/head:pull/927 PR: https://git.openjdk.org/valhalla/pull/927 From vromero at openjdk.org Tue Sep 19 21:36:11 2023 From: vromero at openjdk.org (Vicente Romero) Date: Tue, 19 Sep 2023 21:36:11 GMT Subject: Integrated: 8316325: [lw5] sync javac with the current JVMS, particularly assertions on new class attributes In-Reply-To: References: Message-ID: On Mon, 18 Sep 2023 20:17:01 GMT, Vicente Romero wrote: > This PR is implementing assertions on new classfile attributes, see [1] > > [1] https://cr.openjdk.org/~dlsmith/jep401/jep401-20230519/specs/flattened-heap-jvms.html This pull request has now been integrated. Changeset: 7d4965a8 Author: Vicente Romero URL: https://git.openjdk.org/valhalla/commit/7d4965a8f35fc9a5fdf9514933afffd7e0763125 Stats: 899 lines in 35 files changed: 872 ins; 7 del; 20 mod 8316325: [lw5] sync javac with the current JVMS, particularly assertions on new class attributes ------------- PR: https://git.openjdk.org/valhalla/pull/927 From vromero at openjdk.org Tue Sep 19 23:24:22 2023 From: vromero at openjdk.org (Vicente Romero) Date: Tue, 19 Sep 2023 23:24:22 GMT Subject: Integrated: 8316561: [lw5] class file attribute NullRestricted shouldn't be generated for arrays Message-ID: This PR is fixing a bug, basically for code like: value class V { V[]! va; public implicit V(); } the compiler shouldn't generate a NullRestricted attribute for field `va` in the future we probably issue a warning or a compiler error for this code pattern. 
------------- Commit messages: - 8316561: [lw5] class file attribute NullRestricted shouldn't be generated for arrays Changes: https://git.openjdk.org/valhalla/pull/928/files Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=928&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8316561 Stats: 12 lines in 2 files changed: 11 ins; 0 del; 1 mod Patch: https://git.openjdk.org/valhalla/pull/928.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/928/head:pull/928 PR: https://git.openjdk.org/valhalla/pull/928 From vromero at openjdk.org Tue Sep 19 23:24:23 2023 From: vromero at openjdk.org (Vicente Romero) Date: Tue, 19 Sep 2023 23:24:23 GMT Subject: Integrated: 8316561: [lw5] class file attribute NullRestricted shouldn't be generated for arrays In-Reply-To: References: Message-ID: On Tue, 19 Sep 2023 23:16:22 GMT, Vicente Romero wrote: > This PR is fixing a bug, basically for code like: > > value class V { > V[]! va; > public implicit V(); > } > > the compiler shouldn't generate a NullRestricted attribute for field `va` in the future we probably issue a warning or a compiler error for this code pattern. This pull request has now been integrated. Changeset: 152b0aaa Author: Vicente Romero URL: https://git.openjdk.org/valhalla/commit/152b0aaa6e85f514da5e5890a45efc020f825174 Stats: 12 lines in 2 files changed: 11 ins; 0 del; 1 mod 8316561: [lw5] class file attribute NullRestricted shouldn't be generated for arrays ------------- PR: https://git.openjdk.org/valhalla/pull/928 From jbhateja at openjdk.org Fri Sep 22 06:28:35 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 22 Sep 2023 06:28:35 GMT Subject: [lworld+vector] RFR: Merge lworld Message-ID: <9RG7Q0EswtuWXC194n7gtEuXkgIWaG0EUzEGczmNQ1U=.642ea77f-3c31-4d41-920c-84ba27c6ac87@github.com> Merge latest lworld changes into lworld+vector branch. Validation Status with this patch:- - All the tests under test/hotspot/jtreg/compiler/valhalla/inlinetypes are now passing at various AVX levels. 
- No new Vector API jtreg test failure seen. Best Regards, Jatin ------------- Commit messages: - jcheck failure resolution, whitespace removal - Remove multifield related special handling from C1 - Merge branch 'lworld' of http://github.com/openjdk/valhalla into merge_lworld - 8315935: [lworld] Apply flat renaming to C2 code - 8311219: [lworld] VM option "InlineFieldMaxFlatSize" cannot work well - 8313667: [lworld] XBarrierSetC2::clone_at_expansion() uses wrong array copy stub for cloning flat primitive type arrays - 8315412: [lworld] Preparing code for lw5 - 8315272: [lworld] Replacing NULL with nullptr in aarch64 code The webrevs contain the adjustments done while merging with regards to each parent branch: - lworld+vector: https://webrevs.openjdk.org/?repo=valhalla&pr=929&range=00.0 - lworld: https://webrevs.openjdk.org/?repo=valhalla&pr=929&range=00.1 Changes: https://git.openjdk.org/valhalla/pull/929/files Stats: 1312 lines in 110 files changed: 320 ins; 119 del; 873 mod Patch: https://git.openjdk.org/valhalla/pull/929.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/929/head:pull/929 PR: https://git.openjdk.org/valhalla/pull/929 From liangchenblue at gmail.com Sat Sep 23 02:27:24 2023 From: liangchenblue at gmail.com (-) Date: Sat, 23 Sep 2023 10:27:24 +0800 Subject: Regarding the latest JEP 401 update Message-ID: Hello, First thanks to the expert group and Dan for an update to JEP 401! The updated version looks very straightforward and actionable. I have a few comments in mind: 1. Why are method descriptor's types preloaded? They are not part of the object layout. 2. API support should mention java.lang.reflect.AccessFlag. 3. More value classes aside from Optional can be migrated: OptionalInt, OptionalDouble, OptionalLong. Best, Chen Liang -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From forax at univ-mlv.fr Sat Sep 23 06:24:29 2023 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 23 Sep 2023 08:24:29 +0200 (CEST) Subject: Regarding the latest JEP 401 update In-Reply-To: References: Message-ID: <2019741640.1633091.1695450269542.JavaMail.zimbra@univ-eiffel.fr> > From: "-" > To: "valhalla-dev" > Sent: Saturday, September 23, 2023 4:27:24 AM > Subject: Regarding the latest JEP 401 update > Hello, Hello, > First thanks to the expert group and Dan for an update to JEP 401! The updated > version looks very straightforward and actionable. > I have a few comments in mind: > 1. Why are method descriptor's types preloaded? They are not part of the object > layout. They are part of the method calling convention. We want to try to avoid boxing when calling methods that take a value type as a parameter. For that we need to know whether a parameter type/return type is a value type or not. This is especially important for virtual methods (methods that can be called through a vtable) because the vtable/itable are populated early, at least in the case of HotSpot; other VMs may differ. > 2. API support should mention java.lang.reflect.AccessFlag. > 3. More value classes aside from Optional can be migrated: OptionalInt, > OptionalDouble, OptionalLong. > Best, > Chen Liang regards, Rémi -------------- next part -------------- An HTML attachment was scrubbed... URL: From liangchenblue at gmail.com Mon Sep 25 01:17:07 2023 From: liangchenblue at gmail.com (-) Date: Mon, 25 Sep 2023 09:17:07 +0800 Subject: Synchronized blocks in value constructors In-Reply-To: References: Message-ID: Removed expert group comments and added valhalla-dev to the recipient list. This topic should go to valhalla-dev. On Mon, Sep 25, 2023 at 9:07 AM - wrote: > Hello Joao, > I believe the synchronized block proposal is fragile and isn't a good hint > to the VM. Each constructor can declare its own synchronization groups, which makes behaviorally-correct optimization harder.
> For your purpose, where only some of the properties of > the value class need to be written atomically together, you can group those properties in an > atomic (regular) value class. Then, you can include these regular value > classes in a non-atomic value class. > > Say you have this: > value class Container { > int a, b, c; > Container(int a, int b, int c) { > synchronized { this.a = a; this.b = b; } > this.c = c; > } > } > > Why not this instead: > nonatomic value class Container { > value class Constraint { > int a, b; > Constraint(int a, int b) { this.a = a; this.b = b; } > } > Constraint ab; int c; > Container(int a, int b, int c) { > this.ab = new Constraint(a, b); > this.c = c; > } > } > > Chen Liang > > On Mon, Sep 25, 2023 at 8:58 AM João Mendonça wrote: > >> I would like to propose a new syntax to specify that, in a value class, >> some fields must be assigned atomically together: >> >> *** synchronized blocks in value class constructors. *** >> >> >> Advantages: >> >> - granularity - only the specific fields involved in an invariant >> need to be assigned in the same synchronized block. Multiple blocks can be >> used for independent invariants. Tearing is allowed between blocks and >> between fields assigned outside blocks. VMs could take advantage of this to >> perform optimizations. >> - economy - no new keyword/annotation/interface needed. >> - compatibility - synchronized blocks are currently illegal in >> constructors. >> - safety - just like with identity classes, the absence of a >> synchronized block in a constructor means that the whole constructor is >> synchronized, i.e. all fields are written atomically. >> - safety - an empty synchronized block is required to indicate that >> tearing is allowed between any fields. >> - clarity - "A synchronized block of field assignments" is a very >> intuitive description of the semantics involved, given the meaning of the >> word "synchronized" in English.
>> >> >> Disadvantages: >> >> - aesthetics - an empty synchronized block is required to indicate >> that tearing is allowed between any fields. >> - safety - a user of a value class has no automatic way to be >> informed of if/where tearing may occur (can be fixed with an update to >> the generation of Javadoc). >> - clarity - synchronized blocks (§14.18) have two meanings: >> - The old meaning in regular methods: only one thread may be >> running inside of a block for the object given in the expression clause >> - The new meaning in constructors: all assignments in a block >> are written atomically (no expression clause) >> >> Example: >> >> value class Range { >> long start, end; >> public Range(long start, long end) { >> if (start > end) throw new IllegalArgumentException(); >> synchronized { >> this.start = start; >> this.end = end; >> } >> } } >> >> Having all fields assigned in the same synchronized block, as above, is equivalent to declaring no synchronized blocks: >> >> value class Range { >> long start, end; >> >> public Range(long start, long end) { >> if (start > end) throw new IllegalArgumentException(); >> this.start = start; >> this.end = end; >> } } >> >> >> Another example: >> >> value class Point { >> double x, y; >> public Point(double x, double y) { >> synchronized {} >> this.x = x; this.y = y; >> } } >> >> >> >> João Mendonça >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From liangchenblue at gmail.com Mon Sep 25 09:37:57 2023 From: liangchenblue at gmail.com (-) Date: Mon, 25 Sep 2023 17:37:57 +0800 Subject: Synchronized blocks in value constructors In-Reply-To: References: Message-ID: Honestly, I don't understand why you propose this, for synchronized is ALREADY the default behavior of Value Classes per JEP 401. By default, all fields in a value class are synchronized (atomic), so you won't see a Range where start > end.
The current proposed syntax asks users to opt in to non-atomicity (or non-synchronized, in your words), because being non-atomic is bug-prone.

Chen Liang

On Mon, Sep 25, 2023 at 5:01 PM João Mendonça wrote:

> Hello Chen,
>
>
> Thank you very much for your response.
> To address the fragility issue you raised, I would modify the new syntax
> proposal to:
>
> *** synchronized blocks in value class field declarations. ***
>
> To your very good point on granularity, I think that synchronized blocks
> could still be considered more readable than having to segregate fields in
> separate value classes and using some other class modifier/keyword to
> indicate non-atomicity.
> All the other advantages/disadvantages would remain the same.
> Your example would become:
>
> value class Container {
>     synchronized {int a, b;}
>     int c;
>     Container(int a, int b, int c) {
>         this.a = a;
>         this.b = b;
>         this.c = c;
>     }
> }
>
>
> João Mendonça
>
>
> On Mon, Sep 25, 2023 at 2:08 AM - wrote:
>
>> Hello Joao,
>> I believe the synchronized block proposal is fragile and isn't a good
>> hint to the VM. Each constructor can declare its own synchronization
>> groups, which makes behaviorally-correct optimization harder.
>> For your purpose where you only need to declare some of the properties of
>> the value class atomically together, you can group those properties in an
>> atomic (regular) value class. Then, you can include these regular value
>> classes in a non-atomic value class.
>>
>> Say you have this:
>> value class Container {
>>     int a, b, c;
>>     Container(int a, int b, int c) {
>>         synchronized { this.a = a; this.b = b; }
>>         this.c = c;
>>     }
>> }
>>
>> Why not this instead:
>> nonatomic value class Container {
>>     value class Constraint {
>>         int a, b;
>>         Constraint(int a, int b) { this.a = a; this.b = b; }
>>     }
>>     Constraint ab; int c;
>>     Container(int a, int b, int c) {
>>         this.ab = new Constraint(a, b);
>>         this.c = c;
>>     }
>> }
>>
>> Chen Liang
>>
>> On Mon, Sep 25, 2023 at 8:58 AM João Mendonça wrote:
>>
>>> I would like to propose a new syntax to specify that, in a value class,
>>> some fields must be assigned atomically together:
>>>
>>> *** synchronized blocks in value class constructors. ***
>>>
>>>
>>> Advantages:
>>>
>>>    - granularity - only the specific fields involved in an invariant
>>>    need to be assigned in the same synchronized block. Multiple blocks can be
>>>    used for independent invariants. Tearing is allowed between blocks and
>>>    between fields assigned outside blocks. VMs could take advantage of this to
>>>    perform optimizations.
>>>    - economy - no new keyword/annotation/interface needed.
>>>    - compatibility - synchronized blocks are currently illegal in
>>>    constructors.
>>>    - safety - just like with identity classes, the absence of a
>>>    synchronized block in a constructor means that the whole constructor is
>>>    synchronized, i.e. all fields are written atomically.
>>>    - safety - an empty synchronized block is required to indicate that
>>>    tearing is allowed between any fields.
>>>    - clarity - "A synchronized block of field assignments" is a very
>>>    intuitive description of the semantics involved, given the meaning of the
>>>    word "synchronized" in English.
>>>
>>>
>>> Disadvantages:
>>>
>>>    - aesthetics - an empty synchronized block is required to indicate
>>>    that tearing is allowed between any fields.
>>>    - safety - a user of a value class has no automatic way to be
>>>    informed of if/where tearing may occur (can be fixed with an update to
>>>    the generation of java docs).
>>>    - clarity - synchronized blocks (§14.18.) have two meanings:
>>>       - The old meaning in regular methods: only one thread may be
>>>       running inside of a block for the object given in the expression clause
>>>       - The new meaning in constructors: all assignments in a block
>>>       are written atomically (no expression clause)
>>>
>>> Example:
>>>
>>> value class Range {
>>>     long start, end;
>>>     public Range(long start, long end) {
>>>         if (start > end) throw new IllegalArgumentException();
>>>         synchronized {
>>>             this.start = start;
>>>             this.end = end;
>>>         }
>>>     }
>>> }
>>>
>>> Having all fields assigned in the same synchronized block, as above, is equivalent to declaring no synchronized blocks:
>>>
>>> value class Range {
>>>     long start, end;
>>>
>>>     public Range(long start, long end) {
>>>         if (start > end) throw new IllegalArgumentException();
>>>         this.start = start;
>>>         this.end = end;
>>>     }
>>> }
>>>
>>>
>>> Another example:
>>>
>>> value class Point {
>>>     double x, y;
>>>     public Point(double x, double y) {
>>>         synchronized {}
>>>         this.x = x; this.y = y;
>>>     }
>>> }
>>>
>>>
>>>
>>> João Mendonça
>>>
>>>

From jbhateja at openjdk.org Tue Sep 26 02:24:25 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 26 Sep 2023 02:24:25 GMT
Subject: [lworld+vector] RFR: Merge lworld [v2]
In-Reply-To: <9RG7Q0EswtuWXC194n7gtEuXkgIWaG0EUzEGczmNQ1U=.642ea77f-3c31-4d41-920c-84ba27c6ac87@github.com>
References: <9RG7Q0EswtuWXC194n7gtEuXkgIWaG0EUzEGczmNQ1U=.642ea77f-3c31-4d41-920c-84ba27c6ac87@github.com>
Message-ID:

> Merge latest lworld changes into lworld+vector branch.
>
> Validation Status with this patch:-
>
> - All the tests under test/hotspot/jtreg/compiler/valhalla/inlinetypes are now passing at various AVX levels.
> - No new Vector API jtreg test failure seen.
>
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits:

 - jcheck failure resolution, whitespace removal
 - Remove multifield related special handling from C1
 - Merge branch 'lworld' of http://github.com/openjdk/valhalla into merge_lworld
 - 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation.

   Reviewed-by: xgong
 - Merge lworld

   Reviewed-by: jbhateja
 - 8314628: [lworld+vector] validation regression fixes and cleanups.

   Reviewed-by: xgong
 - 8311610: [lworld+vector] Clean-up of vector allocation in class VectorSupport

   Reviewed-by: jbhateja
 - 8311080: [lworld+vector] Fix jdk build failures with different options

   Reviewed-by: jbhateja
 - Merge lworld

   Co-authored-by: Xiaohong Gong
   Reviewed-by: xgong
 - 8307715: Integrate VectorMask/Shuffle with value/primitive classes

   Reviewed-by: jbhateja
 - ... and 4 more: https://git.openjdk.org/valhalla/compare/0263bd93...fa27d069

-------------

Changes: https://git.openjdk.org/valhalla/pull/929/files
 Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=929&range=01
  Stats: 15656 lines in 125 files changed: 3638 ins; 7733 del; 4285 mod
  Patch: https://git.openjdk.org/valhalla/pull/929.diff
  Fetch: git fetch https://git.openjdk.org/valhalla.git pull/929/head:pull/929

PR: https://git.openjdk.org/valhalla/pull/929

From jbhateja at openjdk.org Tue Sep 26 02:24:28 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 26 Sep 2023 02:24:28 GMT
Subject: [lworld+vector] Integrated: Merge lworld
In-Reply-To: <9RG7Q0EswtuWXC194n7gtEuXkgIWaG0EUzEGczmNQ1U=.642ea77f-3c31-4d41-920c-84ba27c6ac87@github.com>
References: <9RG7Q0EswtuWXC194n7gtEuXkgIWaG0EUzEGczmNQ1U=.642ea77f-3c31-4d41-920c-84ba27c6ac87@github.com>
Message-ID:

On Fri, 22 Sep 2023 06:12:32 GMT, Jatin Bhateja wrote:

> Merge latest lworld changes into lworld+vector branch.
>
> Validation Status with this patch:-
>
> - All the tests under test/hotspot/jtreg/compiler/valhalla/inlinetypes are now passing at various AVX levels.
> - No new Vector API jtreg test failure seen.
>
> Best Regards,
> Jatin

This pull request has now been integrated.

Changeset: 2f05ce61
Author:    Jatin Bhateja
URL:       https://git.openjdk.org/valhalla/commit/2f05ce61f21588033b1b3511566fe86385e31b36
Stats:     1312 lines in 110 files changed: 320 ins; 119 del; 873 mod

Merge lworld

-------------

PR: https://git.openjdk.org/valhalla/pull/929

From xgong at openjdk.org Tue Sep 26 03:40:42 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Tue, 26 Sep 2023 03:40:42 GMT
Subject: [lworld+vector] RFR: Merge lworld [v2]
In-Reply-To:
References: <9RG7Q0EswtuWXC194n7gtEuXkgIWaG0EUzEGczmNQ1U=.642ea77f-3c31-4d41-920c-84ba27c6ac87@github.com>
Message-ID:

On Tue, 26 Sep 2023 02:24:25 GMT, Jatin Bhateja wrote:

>> Merge latest lworld changes into lworld+vector branch.
>>
>> Validation Status with this patch:-
>>
>> - All the tests under test/hotspot/jtreg/compiler/valhalla/inlinetypes are now passing at various AVX levels.
>> - No new Vector API jtreg test failure seen.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits:
>
>  - jcheck failure resolution, whitespace removal
>  - Remove multifield related special handling from C1
>  - Merge branch 'lworld' of http://github.com/openjdk/valhalla into merge_lworld
>  - 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation.
>
>    Reviewed-by: xgong
>  - Merge lworld
>
>    Reviewed-by: jbhateja
>  - 8314628: [lworld+vector] validation regression fixes and cleanups.
>
>    Reviewed-by: xgong
>  - 8311610: [lworld+vector] Clean-up of vector allocation in class VectorSupport
>
>    Reviewed-by: jbhateja
>  - 8311080: [lworld+vector] Fix jdk build failures with different options
>
>    Reviewed-by: jbhateja
>  - Merge lworld
>
>    Co-authored-by: Xiaohong Gong
>    Reviewed-by: xgong
>  - 8307715: Integrate VectorMask/Shuffle with value/primitive classes
>
>    Reviewed-by: jbhateja
>  - ... and 4 more: https://git.openjdk.org/valhalla/compare/0263bd93...fa27d069

Hi @jatin-bhateja ,

Sorry for my late reply to this PR (just noticed it today)!

Regarding the removal of the special handling for multifields in C1, it may cause a regression in the tests `jdk/incubator/vector/VectorRuns.java` and `jdk/incubator/vector/VectorHash.java`.

Here is the main log:

    WARNING: Using incubator modules: jdk.incubator.vector
    java.lang.AssertionError: 1024 72
    	at VectorRuns.assertEquals(VectorRuns.java:51)
    	at VectorRuns.main(VectorRuns.java:46)
    	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    	at java.base/java.lang.reflect.Method.invoke(Method.java:582)
    	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
    	at java.base/java.lang.Thread.run(Thread.java:1570)

    JavaTest Message: Test threw exception: java.lang.AssertionError: 1024 72
    JavaTest Message: shutting down test

    STATUS:Failed.`main' threw exception: java.lang.AssertionError: 1024 72

I guess the `nof_nonstatic_fields` in `ciInstanceKlass.cpp` does not contain all the multifields, although https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/ci/ciEnv.cpp#L1774 returns `false` for C1. That is because the ci instance is a singleton shared between C1 and C2. So if the instance klass is created and its fields are initialized in a C2 compiler thread, its nonstatic fields look the same in C1 as in C2, where they are expected to be vectorized.

Could you please look at this issue?
Running with `"-XX:TieredStopAtLevel=3"` or adding back the original handling in C1 makes these two tests pass. Thanks!

Best Regards,
Xiaohong

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/929#issuecomment-1734775040

From jbhateja at openjdk.org Tue Sep 26 06:28:39 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Tue, 26 Sep 2023 06:28:39 GMT
Subject: [lworld+vector] RFR: Merge lworld [v2]
In-Reply-To:
References: <9RG7Q0EswtuWXC194n7gtEuXkgIWaG0EUzEGczmNQ1U=.642ea77f-3c31-4d41-920c-84ba27c6ac87@github.com>
Message-ID: <-Y0uZ0UcUfj2U8BObBVaXgNeMg6A7-RzB4xiyaYnqkA=.12619789-adf9-4cc5-934c-c21b835a7788@github.com>

On Tue, 26 Sep 2023 03:37:33 GMT, Xiaohong Gong wrote:

> Hi @jatin-bhateja ,
>
> Sorry for my late reply to this PR (just noticed it today)!
>
> Regarding the removal of the special handling for multifields in C1, it may cause a regression in the tests `jdk/incubator/vector/VectorRuns.java` and `jdk/incubator/vector/VectorHash.java`.
>
> Here is the main log:
>
> ```
> WARNING: Using incubator modules: jdk.incubator.vector
> java.lang.AssertionError: 1024 72
> 	at VectorRuns.assertEquals(VectorRuns.java:51)
> 	at VectorRuns.main(VectorRuns.java:46)
> 	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:582)
> 	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
> 	at java.base/java.lang.Thread.run(Thread.java:1570)
>
> JavaTest Message: Test threw exception: java.lang.AssertionError: 1024 72
> JavaTest Message: shutting down test
>
> STATUS:Failed.`main' threw exception: java.lang.AssertionError: 1024 72
> ```
>
> I guess the `nof_nonstatic_fields` in `ciInstanceKlass.cpp` does not contain all the multifields, although https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/ci/ciEnv.cpp#L1774 returns `false` for C1.
That is because the ci instance is a singleton shared between C1 and C2. So if the instance klass is created and its fields are initialized in a C2 compiler thread, its nonstatic fields look the same in C1 as in C2, where they are expected to be vectorized.
>
> Could you please look at this issue? Running with `"-XX:TieredStopAtLevel=3"` or adding back the original handling in C1 makes these two tests pass. Thanks!
>
> Best Regards, Xiaohong

Thanks @XiaohongGong , I wanted to give it a second thought. The ci model is shared between compilers and should ideally be free from specializations, but as you know we have purposefully moved the scalarization check upfront to reduce the complexity of the field query API. As discussed earlier, I am working on max species support and will soon be creating a PR. It will enable us to run the entire JTREG suite and uncover regressions.

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/929#issuecomment-1734903321

From xgong at openjdk.org Tue Sep 26 06:36:52 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Tue, 26 Sep 2023 06:36:52 GMT
Subject: [lworld+vector] RFR: Merge lworld [v2]
In-Reply-To:
References: <9RG7Q0EswtuWXC194n7gtEuXkgIWaG0EUzEGczmNQ1U=.642ea77f-3c31-4d41-920c-84ba27c6ac87@github.com>
Message-ID:

On Tue, 26 Sep 2023 03:37:33 GMT, Xiaohong Gong wrote:

>> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits:
>>
>>  - jcheck failure resolution, whitespace removal
>>  - Remove multifield related special handling from C1
>>  - Merge branch 'lworld' of http://github.com/openjdk/valhalla into merge_lworld
>>  - 8314980: [lworld+vector] consider scalarization conditions during ciMultiField creation.
>>
>>    Reviewed-by: xgong
>>  - Merge lworld
>>
>>    Reviewed-by: jbhateja
>>  - 8314628: [lworld+vector] validation regression fixes and cleanups.
>>
>>    Reviewed-by: xgong
>>  - 8311610: [lworld+vector] Clean-up of vector allocation in class VectorSupport
>>
>>    Reviewed-by: jbhateja
>>  - 8311080: [lworld+vector] Fix jdk build failures with different options
>>
>>    Reviewed-by: jbhateja
>>  - Merge lworld
>>
>>    Co-authored-by: Xiaohong Gong
>>    Reviewed-by: xgong
>>  - 8307715: Integrate VectorMask/Shuffle with value/primitive classes
>>
>>    Reviewed-by: jbhateja
>>  - ... and 4 more: https://git.openjdk.org/valhalla/compare/0263bd93...fa27d069
>
> Hi @jatin-bhateja ,
>
> Sorry for my late reply to this PR (just noticed it today)!
>
> Regarding the removal of the special handling for multifields in C1, it may cause a regression in the tests `jdk/incubator/vector/VectorRuns.java` and `jdk/incubator/vector/VectorHash.java`.
>
> Here is the main log:
>
>     WARNING: Using incubator modules: jdk.incubator.vector
>     java.lang.AssertionError: 1024 72
>     	at VectorRuns.assertEquals(VectorRuns.java:51)
>     	at VectorRuns.main(VectorRuns.java:46)
>     	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
>     	at java.base/java.lang.reflect.Method.invoke(Method.java:582)
>     	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
>     	at java.base/java.lang.Thread.run(Thread.java:1570)
>
>     JavaTest Message: Test threw exception: java.lang.AssertionError: 1024 72
>     JavaTest Message: shutting down test
>
>     STATUS:Failed.`main' threw exception: java.lang.AssertionError: 1024 72
>
>
> I guess the `nof_nonstatic_fields` in `ciInstanceKlass.cpp` does not contain all the multifields, although https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/ci/ciEnv.cpp#L1774 returns `false` for C1. That is because the ci instance is a singleton shared between C1 and C2. So if the instance klass is created and its fields are initialized in a C2 compiler thread, its nonstatic fields look the same in C1 as in C2, where they are expected to be vectorized.
>
> Could you please look at this issue? Running with `"-XX:TieredStopAtLevel=3"` or adding back the original handling in C1 makes these two tests pass. Thanks!
>
> Best Regards,
> Xiaohong

> Thanks @XiaohongGong , I wanted to give it a second thought. The ci model is shared between compilers and should ideally be free from specializations, but as you know we have purposefully moved the scalarization check upfront to reduce the complexity of the field query API. As discussed earlier, I am working on max species support and will soon be creating a PR. It will enable us to run the entire JTREG suite and uncover regressions.

Yes, I'm also revisiting this part to check whether the current ci design has other side effects, and I found this tricky issue. Maybe it's better not to consider multifields at the ci stage, except for some info passed to the compiler; the compiler can then choose to do the special handling itself. This applies not only to the C1 compiler but also to identity classes. We may also need to consider the scenario where a `MultiField` is used in an identity class, although there is no such usage now.
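
The suspected mechanism above (a shared ci object whose field view is fixed by whichever compiler touches it first) can be sketched in plain Java. This is only an illustration under stated assumptions, not HotSpot code: `Jit`, `KlassView`, `fieldCountFor` and `lookup` are hypothetical names, and the "one bundled field for C2, one field per lane for C1" split is an assumption made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedCiCacheSketch {
    // Hypothetical stand-ins for the two JIT compilers; not HotSpot types.
    enum Jit { C1, C2 }

    // Hypothetical, simplified view of a class as handed to a compiler.
    record KlassView(String name, int nonstaticFieldCount, Jit initializedBy) {}

    // One cache shared by all compilers, mirroring the "singleton ci instance" guess.
    static final Map<String, KlassView> CACHE = new ConcurrentHashMap<>();

    // Assumption for illustration: C2 treats a multifield bundle as one field,
    // while C1 would want every lane listed separately.
    static int fieldCountFor(Jit jit, int lanes) {
        return jit == Jit.C2 ? 1 : lanes;
    }

    // The first requester fixes the view; later requesters get the cached one.
    static KlassView lookup(String name, Jit requester, int lanes) {
        return CACHE.computeIfAbsent(name,
                n -> new KlassView(n, fieldCountFor(requester, lanes), requester));
    }

    public static void main(String[] args) {
        KlassView forC2 = lookup("Payload", Jit.C2, 4);
        KlassView forC1 = lookup("Payload", Jit.C1, 4);
        // C1 receives the view that C2 initialized, not the per-lane view it asked for.
        System.out.println("C2 sees " + forC2.nonstaticFieldCount() + " field(s)");
        System.out.println("C1 sees " + forC1.nonstaticFieldCount()
                + " field(s), initialized by " + forC1.initializedBy());
    }
}
```

In this sketch the result depends on which requester populates the cache first, which is the kind of order dependence the comment describes; keeping compiler-specific decisions out of the shared object, as proposed above, would remove it.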

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/929#issuecomment-1734910607

From dsimms at openjdk.org Wed Sep 27 12:55:55 2023
From: dsimms at openjdk.org (David Simms)
Date: Wed, 27 Sep 2023 12:55:55 GMT
Subject: [lworld] RFR: Merge jdk
Message-ID: <6z-L5jByauAFtGD7PS0OXhpV0VakAnR87UfObCRKubk=.6b06494c-c488-4bcf-94c0-1ea3597955f7@github.com>

Merge jdk-22+9

-------------

Commit messages:
 - Merge fixes
 - Merge tag 'jdk-22+9' into lworld_merge_jdk_22_9_incr
 - 8306582: Remove MetaspaceShared::exit_after_static_dump()
 - 8313368: (fc) FileChannel.size returns 0 on block special files
 - 8312078: [PPC] JcmdScale.java Failing on AIX
 - 8312617: SIGSEGV in ConnectionGraph::verify_ram_nodes
 - 8313322: RISC-V: implement MD5 intrinsic
 - 8313593: Generational ZGC: NMT assert when the heap fails to expand
 - 8313402: C1: Incorrect LoadIndexed value numbering
 - 8311989: Test java/lang/Thread/virtual/Reflection.java timed out
 - ... and 65 more: https://git.openjdk.org/valhalla/compare/72cfc5ff...f6181b0b

The webrevs contain the adjustments done while merging with regards to each parent branch:
 - lworld: https://webrevs.openjdk.org/?repo=valhalla&pr=930&range=00.0
 - jdk: https://webrevs.openjdk.org/?repo=valhalla&pr=930&range=00.1

Changes: https://git.openjdk.org/valhalla/pull/930/files
  Stats: 9916 lines in 357 files changed: 5412 ins; 2140 del; 2364 mod
  Patch: https://git.openjdk.org/valhalla/pull/930.diff
  Fetch: git fetch https://git.openjdk.org/valhalla.git pull/930/head:pull/930

PR: https://git.openjdk.org/valhalla/pull/930

From brian.goetz at oracle.com Wed Sep 27 18:08:30 2023
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 27 Sep 2023 14:08:30 -0400
Subject: Regarding the latest JEP 401 update
In-Reply-To:
References:
Message-ID:

> 1. Why are method descriptor's types preloaded? They are not part of
> the object layout.

Like layout, method calling convention is set early, and we can scalarize value objects in calling convention.

> 3.
More value classes aside from Optional can be migrated:
> OptionalInt, OptionalDouble, OptionalLong.

I think any value-based class in the JDK is a candidate. Other possible candidates include dynamically generated classes such as lambda proxies. There are also probably some candidates that are not marked as value-based. Recent developments allow us to eliminate one of the restrictions of value-based classes (no accessible constructors).

From dsimms at openjdk.org Thu Sep 28 08:10:07 2023
From: dsimms at openjdk.org (David Simms)
Date: Thu, 28 Sep 2023 08:10:07 GMT
Subject: [lworld] RFR: Merge jdk [v2]
In-Reply-To: <6z-L5jByauAFtGD7PS0OXhpV0VakAnR87UfObCRKubk=.6b06494c-c488-4bcf-94c0-1ea3597955f7@github.com>
References: <6z-L5jByauAFtGD7PS0OXhpV0VakAnR87UfObCRKubk=.6b06494c-c488-4bcf-94c0-1ea3597955f7@github.com>
Message-ID:

> Merge jdk-22+9

David Simms has updated the pull request incrementally with one additional commit since the last revision:

  Deferred test issues

-------------

Changes:
  - all: https://git.openjdk.org/valhalla/pull/930/files
  - new: https://git.openjdk.org/valhalla/pull/930/files/f6181b0b..44ed6b15

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=valhalla&pr=930&range=01
 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=930&range=00-01

  Stats: 14 lines in 2 files changed: 13 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/valhalla/pull/930.diff
  Fetch: git fetch https://git.openjdk.org/valhalla.git pull/930/head:pull/930

PR: https://git.openjdk.org/valhalla/pull/930

From jbhateja at openjdk.org Thu Sep 28 18:14:09 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Thu, 28 Sep 2023 18:14:09 GMT
Subject: [lworld+vector] RFR: 8311675: [lworld+vector] Max Species support.
Message-ID: <4eU1cg9HwjcdJnOQJMV0qhVhZLh0vvHzsO4zJj-CcUU=.cf7c8b21-4a63-4b9e-ba30-8bb10dba2fa9@github.com>

- Patch adds MaxSpecies support for all types of vectors.
- New factory methods and VectorPayload classes for various kinds of vector, shuffle and mask payloads.
- Summary of high level flow :-
   1/ Max species payloads encapsulate a @multifield-annotated field that accepts -1 as its bundle size parameter.
   2/ For Vector payloads, the bundle size is determined by the maximum vector size supported by the target.
   3/ For Shuffle and Mask payloads, the multifield bundle size is a function of the maximum vector size and the vector lane size.
   4/ Based on the dynamic bundle size, the parser creates a separate FieldInfo structure for each base and synthetic multifield; the rest of the flow remains the same.

Kindly review and share your feedback.

Best Regards,
Jatin

-------------

Commit messages:
 - [lworld+vector] Max Species support.

Changes: https://git.openjdk.org/valhalla/pull/931/files
 Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=931&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8311675
  Stats: 6621 lines in 55 files changed: 5906 ins; 282 del; 433 mod
  Patch: https://git.openjdk.org/valhalla/pull/931.diff
  Fetch: git fetch https://git.openjdk.org/valhalla.git pull/931/head:pull/931

PR: https://git.openjdk.org/valhalla/pull/931

From fparain at openjdk.org Thu Sep 28 18:20:55 2023
From: fparain at openjdk.org (Frederic Parain)
Date: Thu, 28 Sep 2023 18:20:55 GMT
Subject: [lworld] RFR: Merge jdk [v2]
In-Reply-To:
References: <6z-L5jByauAFtGD7PS0OXhpV0VakAnR87UfObCRKubk=.6b06494c-c488-4bcf-94c0-1ea3597955f7@github.com>
Message-ID:

On Thu, 28 Sep 2023 08:10:07 GMT, David Simms wrote:

>> Merge jdk-22+9
>
> David Simms has updated the pull request incrementally with one additional commit since the last revision:
>
>   Deferred test issues

Changes requested by fparain (Committer).

src/hotspot/cpu/x86/templateTable_x86.cpp line 3713:

> 3711:   load_resolved_field_entry(noreg, cache, rax, rbx, rdx);
> 3712:   // RBX: field offset, RCX: RAX: TOS, RDX: flags
> 3713:   __ movl(rscratch2, rdx);  // saving flags for is_flat test

It is dangerous to save the value in a scratch register and not use it immediately.
In the current state of the code, there's no corruption of this register. But pop_and_check_object() below can produce a lot of code when verifying the oop, and if this code is modified and starts to use rscratch2, corruption will happen.
Suggested change:

      load_resolved_field_entry(noreg, cache, rax, rbx, rdx);
      __ pop(rax);

      // Get object from stack
      pop_and_check_object(rcx);

      const Address field(rcx, rbx, Address::times_1);

      // Check for volatile store
      __ movl(rscratch2, rdx);  // preserving flags for is_flat test
      __ testl(rscratch2, rscratch2);
      __ jcc(Assembler::zero, notVolatile);

-------------

PR Review: https://git.openjdk.org/valhalla/pull/930#pullrequestreview-1649424730
PR Review Comment: https://git.openjdk.org/valhalla/pull/930#discussion_r1340495242

From dsimms at openjdk.org Fri Sep 29 07:13:29 2023
From: dsimms at openjdk.org (David Simms)
Date: Fri, 29 Sep 2023 07:13:29 GMT
Subject: [lworld] RFR: Merge jdk [v2]
In-Reply-To:
References: <6z-L5jByauAFtGD7PS0OXhpV0VakAnR87UfObCRKubk=.6b06494c-c488-4bcf-94c0-1ea3597955f7@github.com>
Message-ID:

On Thu, 28 Sep 2023 17:52:58 GMT, Frederic Parain wrote:

>> David Simms has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Deferred test issues
>
> src/hotspot/cpu/x86/templateTable_x86.cpp line 3713:
>
>> 3711:   load_resolved_field_entry(noreg, cache, rax, rbx, rdx);
>> 3712:   // RBX: field offset, RCX: RAX: TOS, RDX: flags
>> 3713:   __ movl(rscratch2, rdx);  // saving flags for is_flat test
>
> It is dangerous to save the value in a scratch register and not use it immediately. In the current state of the code, there's no corruption of this register. But pop_and_check_object() below can produce a lot of code when verifying the oop, and if this code is modified and starts to use rscratch2, corruption will happen.
> Suggested change:
>
>       load_resolved_field_entry(noreg, cache, rax, rbx, rdx);
>       __ pop(rax);
>
>       // Get object from stack
>       pop_and_check_object(rcx);
>
>       const Address field(rcx, rbx, Address::times_1);
>
>       // Check for volatile store
>       __ movl(rscratch2, rdx);  // preserving flags for is_flat test
>       __ testl(rscratch2, rscratch2);
>       __ jcc(Assembler::zero, notVolatile);

Agreed, thanks for the suggested fix.

-------------

PR Review Comment: https://git.openjdk.org/valhalla/pull/930#discussion_r1340982141

From dsimms at openjdk.org Fri Sep 29 07:24:46 2023
From: dsimms at openjdk.org (David Simms)
Date: Fri, 29 Sep 2023 07:24:46 GMT
Subject: [lworld] RFR: Merge jdk [v3]
In-Reply-To: <6z-L5jByauAFtGD7PS0OXhpV0VakAnR87UfObCRKubk=.6b06494c-c488-4bcf-94c0-1ea3597955f7@github.com>
References: <6z-L5jByauAFtGD7PS0OXhpV0VakAnR87UfObCRKubk=.6b06494c-c488-4bcf-94c0-1ea3597955f7@github.com>
Message-ID:

> Merge jdk-22+9

David Simms has updated the pull request incrementally with one additional commit since the last revision:

  Suggested usage of scratch register

-------------

Changes:
  - all: https://git.openjdk.org/valhalla/pull/930/files
  - new: https://git.openjdk.org/valhalla/pull/930/files/44ed6b15..ed157596

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=valhalla&pr=930&range=02
 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=930&range=01-02

  Stats: 6 lines in 1 file changed: 3 ins; 3 del; 0 mod
  Patch: https://git.openjdk.org/valhalla/pull/930.diff
  Fetch: git fetch https://git.openjdk.org/valhalla.git pull/930/head:pull/930

PR: https://git.openjdk.org/valhalla/pull/930