From rkennke at openjdk.java.net Fri Oct 1 08:27:01 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 1 Oct 2021 08:27:01 GMT Subject: [master] RFR: Load Klass* from header in interpreter (x86) [v3] In-Reply-To: References: Message-ID: > This implements loading the compressed Klass* from the object header, instead of the Klass* field in the x86 interpreter. It does the fast-path (unlocked object) in assembly, and calls into the runtime to deal with locked objects. > > Testing: > - [x] tier1 > - [x] tier2 > - [x] hotspot_gc Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Use constant instead of literal in xor-test-sequence ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/15/files - new: https://git.openjdk.java.net/lilliput/pull/15/files/99f9a36b..a7ac46c0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=15&range=02 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=15&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/lilliput/pull/15.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/15/head:pull/15 PR: https://git.openjdk.java.net/lilliput/pull/15 From rkennke at openjdk.java.net Fri Oct 1 12:30:35 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 1 Oct 2021 12:30:35 GMT Subject: [master] RFR: Load Klass* from header in interpreter (x86) [v4] In-Reply-To: References: Message-ID: > This implements loading the compressed Klass* from the object header, instead of the Klass* field in the x86 interpreter. It does the fast-path (unlocked object) in assembly, and calls into the runtime to deal with locked objects. 
> > Testing: > - [x] tier1 > - [x] tier2 > - [x] hotspot_gc Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Emit null-checks before load_klass with correct offset ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/15/files - new: https://git.openjdk.java.net/lilliput/pull/15/files/a7ac46c0..dc73c694 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=15&range=03 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=15&range=02-03 Stats: 62 lines in 14 files changed: 6 ins; 4 del; 52 mod Patch: https://git.openjdk.java.net/lilliput/pull/15.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/15/head:pull/15 PR: https://git.openjdk.java.net/lilliput/pull/15 From rkennke at openjdk.java.net Wed Oct 6 16:48:11 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 6 Oct 2021 16:48:11 GMT Subject: [master] RFR: Load Klass* from header in interpreter (x86) [v5] In-Reply-To: References: Message-ID: <4GigBipemEsHVFPPGFBcQ6GvV5sxbM6x0Fq2xyJKB0M=.57caf821-038f-4632-bc9a-027b8493fa3c@github.com> > This implements loading the compressed Klass* from the object header, instead of the Klass* field in the x86 interpreter. It does the fast-path (unlocked object) in assembly, and calls into the runtime to deal with locked objects. > > This is a proof-of-concept. It is not entirely clear if we can even call into runtime from a couple of places where load_klass() is called, especially in C2, C1 and some stubs. OTOH, we may end up not having to do any of this if we come up with a way to avoid displaced headers altogether. > > Testing: > - [x] tier1 (x86_32,x86_64) > - [x] tier2 (x86_32,x86_64) > - [x] hotspot_gc (x86_32,x86_64) Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: - Merge branch 'master' into klass-from-header-interpreter - Emit null-checks before load_klass with correct offset - Use constant instead of literal in xor-test-sequence - Update comment about xorb/xorq encoding - Load Klass* from header in interpreter (x86) ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/15/files - new: https://git.openjdk.java.net/lilliput/pull/15/files/dc73c694..cbe93827 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=15&range=04 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=15&range=03-04 Stats: 31915 lines in 917 files changed: 22686 ins; 4497 del; 4732 mod Patch: https://git.openjdk.java.net/lilliput/pull/15.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/15/head:pull/15 PR: https://git.openjdk.java.net/lilliput/pull/15 From rkennke at openjdk.java.net Thu Oct 7 12:39:44 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 7 Oct 2021 12:39:44 GMT Subject: [master] RFR: Resolve displaced header in CDS dumping Message-ID: <2Qyb2U6Rq2r-DSKnrhpbBbRrfIUymIbewaC-QjMcJSs=.f7d4784a-1cff-4678-a310-877f5b6a1042@github.com> Tests showed a bug in CDS dumping. When an object is locked, we need to resolve the displaced header in order to fetch the correct Klass*. The failing test was: runtime/cds/appcds/javaldr/LockDuringDump.java and it's fixed by this change. 
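As a rough illustration of what "resolving the displaced header" means (this is not the actual patch): the sketch below is a toy model that assumes the low two header bits hold the lock state, that a stack-locked object's header word is a pointer to the saved original header, and that the compressed Klass* sits in the upper 32 bits. All names (ToyObject, toy_stack_lock, toy_resolve_header, toy_narrow_klass) are hypothetical and not HotSpot API.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: constants and layout are assumptions for illustration.
constexpr uint64_t kLockMask    = 0x3;  // low two bits encode the lock state
constexpr uint64_t kUnlocked    = 0x1;  // header is stored in place
constexpr uint64_t kStackLocked = 0x0;  // header word points at the displaced header
constexpr int      kKlassShift  = 32;   // compressed Klass* in the upper 32 bits

struct ToyObject {
  uint64_t header;
};

// Build an unlocked header carrying the given compressed class id.
inline uint64_t toy_make_header(uint32_t narrow_klass) {
  return (static_cast<uint64_t>(narrow_klass) << kKlassShift) | kUnlocked;
}

// "Lock" the object: save the original header in 'slot' and overwrite the
// header word with the slot's address (whose low bits are zero).
inline void toy_stack_lock(ToyObject& obj, uint64_t* slot) {
  *slot = obj.header;
  const auto addr = reinterpret_cast<uintptr_t>(slot);
  assert((addr & kLockMask) == 0);  // relies on 8-byte alignment of the slot
  obj.header = static_cast<uint64_t>(addr);
}

// Return the real header: in place if unlocked, otherwise follow the pointer.
inline uint64_t toy_resolve_header(const ToyObject& obj) {
  if ((obj.header & kLockMask) == kUnlocked) {
    return obj.header;
  }
  assert((obj.header & kLockMask) == kStackLocked);
  return *reinterpret_cast<const uint64_t*>(static_cast<uintptr_t>(obj.header));
}

// Reading the class must go through the resolved header, never the raw word.
inline uint32_t toy_narrow_klass(const ToyObject& obj) {
  return static_cast<uint32_t>(toy_resolve_header(obj) >> kKlassShift);
}
```

In this model, reading the upper 32 bits of the raw header word of a locked object would yield part of a stack address rather than the class, which is the kind of mistake the fix guards against.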
Testing: - [x] runtime/cds/appcds/javaldr/LockDuringDump.java - [x] tier1 - [x] tier2 ------------- Commit messages: - Resolve displaced header in CDS dumping Changes: https://git.openjdk.java.net/lilliput/pull/16/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=16&range=00 Stats: 8 lines in 1 file changed: 7 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/lilliput/pull/16.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/16/head:pull/16 PR: https://git.openjdk.java.net/lilliput/pull/16 From rkennke at redhat.com Fri Oct 8 15:43:50 2021 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 8 Oct 2021 17:43:50 +0200 Subject: RFC: (How to?) Replace use of mark-word in leak-profiler (Lilliput) Message-ID: Hi there, I'm currently thinking about an issue that came up when I tested JFR/Leakprofiler with Lilliput. First some context: In Lilliput I am storing a (compressed) Klass* in the object header, and want to phase out all uses of the dedicated Klass*-word and then remove that. This means that we need to be a little more careful when accessing the object header and/or the object's klass. In the leak-profiler, we (temporarily) store an Edge* into the object header, and preserve the actual mark in a table. However, that means that the heap traversal would not work anymore because the object doesn't have a valid Klass* anymore. Therefore I'm thinking about alternative ways to associate objects with an Edge*: 1. Use a (hash-)table for it (oop->Edge*), in EdgeStore. 2. Compress the Edge* to 32 bits and store that only in the lower 32 bits of the header (the Klass* is in the upper 32 bits, currently). Not quite sure how to do this, though, it means we have to control allocation of the Edge instances in a contiguous space. 3. Store the object Klass* somewhere as long as we have it (before overriding it in EdgeStore::associate_leak_content_with_candidate()), and use that for iterating the object (in BFSClosure::iterate()). But where? 
Maybe in the Edge to the object? I.o.w. each Edge would also store the Klass* of its pointee? I wanted to check with you if any of those approaches make sense to you, or maybe you have even better ideas? Thanks for helping, Roman From rkennke at openjdk.java.net Tue Oct 12 11:43:25 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 11:43:25 GMT Subject: [master] RFR: Fix use of Klass* in GC JFR reporting Message-ID: Some JFR related code paths in GC access the Klass* of old objects that has already been overridden by the forwarding pointer. We can easily use the Klass* that we fetched earlier instead. Testing: - [x] tier1 - [x] tier2 - [x] jdk/jfr (some of which still fail, see subsequent PRs) ------------- Commit messages: - Fix use of Klass* in GC JFR reporting Changes: https://git.openjdk.java.net/lilliput/pull/17/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=17&range=00 Stats: 16 lines in 4 files changed: 0 ins; 0 del; 16 mod Patch: https://git.openjdk.java.net/lilliput/pull/17.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/17/head:pull/17 PR: https://git.openjdk.java.net/lilliput/pull/17 From rkennke at openjdk.java.net Tue Oct 12 11:55:31 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 11:55:31 GMT Subject: [master] RFR: Use hashtable for obj->edge mapping in JFR, instead of mark-word Message-ID: <9z1RIw_WH0kQ5_muA1RQ9wWmHs0EEZHrn14Qo1Ah5SA=.7c4c3904-ab48-4b10-8b21-a40734d3f696@github.com> JFR overrides the mark-word to (temporarily) store a mapping from object to Edge*. This disturbs the Klass* that we need while tracing for leaks and for later emitting object information. Let's use a hashtable for this, instead, and leave the upper half of the mark-word alone. (The lower half is still used for marking.) 
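To make the idea concrete, here is a minimal sketch of a side table keyed by object address (not the code in the PR): it assumes objects do not move while the table is in use, and it uses std::unordered_map rather than any HotSpot hashtable class. ToyEdge and ToyObjEdgeTable are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for the leak profiler's Edge type.
struct ToyEdge {
  int id;
};

// Side table mapping object address -> edge, so the object header
// (and the class information stored in it) is left untouched.
class ToyObjEdgeTable {
 public:
  // Associate an edge with an object; a later put() for the same object wins.
  void put(const void* obj, const ToyEdge* edge) { map_[obj] = edge; }

  // Look up the edge for an object; returns nullptr if none was recorded.
  const ToyEdge* get(const void* obj) const {
    auto it = map_.find(obj);
    return it == map_.end() ? nullptr : it->second;
  }

  std::size_t size() const { return map_.size(); }

 private:
  // Keyed by raw address: only valid while objects stay where they are.
  std::unordered_map<const void*, const ToyEdge*> map_;
};
```

The trade-off in this sketch is one extra hash lookup per query in exchange for never touching the header bits that hold the class.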
Testing: - [x] tier1 - [x] tier2 - [x] jdk/jfr (together with #17 most tests pass, needs one more follow-up test fix) ------------- Commit messages: - Use hashtable for obj->edge mapping in JFR, instead of mark-word Changes: https://git.openjdk.java.net/lilliput/pull/18/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=18&range=00 Stats: 61 lines in 4 files changed: 54 ins; 2 del; 5 mod Patch: https://git.openjdk.java.net/lilliput/pull/18.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/18/head:pull/18 PR: https://git.openjdk.java.net/lilliput/pull/18 From rkennke at openjdk.java.net Tue Oct 12 12:07:21 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 12:07:21 GMT Subject: [master] RFR: Fix some tests to work with +UseCompressedClassPointers Message-ID: Some tests assume -UseCompressedClassPointers, but we're forcing +UseCompressedClassPointers, so let's fix those tests. Testing: - [x] tier1 - [x] tier2 - [x] jdk/jfr - [x] testlibrary_tests/ir_framework/tests ------------- Commit messages: - Fix some tests to work with +UseCompressedClassPointers Changes: https://git.openjdk.java.net/lilliput/pull/19/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=19&range=00 Stats: 4 lines in 3 files changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.java.net/lilliput/pull/19.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/19/head:pull/19 PR: https://git.openjdk.java.net/lilliput/pull/19 From rkennke at redhat.com Tue Oct 12 12:11:43 2021 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 12 Oct 2021 14:11:43 +0200 Subject: Reviewers needed Message-ID: <918e6329-f366-5ffb-8221-e87441d54344@redhat.com> Hi all, I have a couple of open PRs in Lilliput land, and they are starting to pile up and become difficult to manage. Since this is such a deep and tricky area, I'd love to get some reviews on the stuff that I do before I push it. 
If you feel even remotely comfortable with any of the stuff here: https://github.com/openjdk/lilliput/pulls?q=is%3Apr+is%3Aopen+label%3Arfr ... I would appreciate it if you could look over it. :-D However, at some point, if a particular PR is lingering for too long, and starts blocking progress, I might just go ahead and push it after I have convinced myself that it is sane, passes tests, and has reasonable formatting. Nobody has to be a 'R'eviewer in order to do Lilliput reviews. Thanks, Roman From shade at openjdk.java.net Tue Oct 12 12:23:06 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 12 Oct 2021 12:23:06 GMT Subject: [master] RFR: Resolve displaced header in CDS dumping In-Reply-To: <2Qyb2U6Rq2r-DSKnrhpbBbRrfIUymIbewaC-QjMcJSs=.f7d4784a-1cff-4678-a310-877f5b6a1042@github.com> References: <2Qyb2U6Rq2r-DSKnrhpbBbRrfIUymIbewaC-QjMcJSs=.f7d4784a-1cff-4678-a310-877f5b6a1042@github.com> Message-ID: On Thu, 7 Oct 2021 10:56:05 GMT, Roman Kennke wrote: > Tests showed a bug in CDS dumping. When an object is locked, we need to resolve the displaced header in order to fetch the correct Klass*. > > The failing test was: > runtime/cds/appcds/javaldr/LockDuringDump.java > and it's fixed by this change. > > Testing: > - [x] runtime/cds/appcds/javaldr/LockDuringDump.java > - [x] tier1 > - [x] tier2 > - [x] tier3 > - [ ] tier4 Looks reasonable. ------------- Marked as reviewed by shade (Committer). PR: https://git.openjdk.java.net/lilliput/pull/16 From shade at openjdk.java.net Tue Oct 12 12:26:18 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 12 Oct 2021 12:26:18 GMT Subject: [master] RFR: Fix use of Klass* in GC JFR reporting In-Reply-To: References: Message-ID: On Tue, 12 Oct 2021 11:37:04 GMT, Roman Kennke wrote: > Some JFR related code paths in GC access the Klass* of old objects that has already been overridden by the forwarding pointer. We can easily use the Klass* that we fetched earlier instead. 
> > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr (some of which still fails, see subsequent PRs) Looks fine. src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 387: > 385: NOINLINE > 386: HeapWord* G1ParScanThreadState::allocate_copy_slow(G1HeapRegionAttr* dest_attr, > 387: oop old, Klass* klass, Suggestion: oop old, Klass* klass, Looks like the style is one argument per line. src/hotspot/share/gc/g1/g1ParScanThreadState.hpp line 161: > 159: > 160: HeapWord* allocate_copy_slow(G1HeapRegionAttr* dest_attr, > 161: oop old, Klass* klass, Suggestion: oop old, Klass* klass, ------------- Marked as reviewed by shade (Committer). PR: https://git.openjdk.java.net/lilliput/pull/17 From shade at openjdk.java.net Tue Oct 12 12:40:16 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 12 Oct 2021 12:40:16 GMT Subject: [master] RFR: Load Klass* from header in interpreter (x86) [v5] In-Reply-To: <4GigBipemEsHVFPPGFBcQ6GvV5sxbM6x0Fq2xyJKB0M=.57caf821-038f-4632-bc9a-027b8493fa3c@github.com> References: <4GigBipemEsHVFPPGFBcQ6GvV5sxbM6x0Fq2xyJKB0M=.57caf821-038f-4632-bc9a-027b8493fa3c@github.com> Message-ID: <9tB3pKHcU0oUxaBujqq0rAWb-vqmNiapl_RthIzxEz4=.e0eb3c29-a704-4be3-9300-57877d57194f@github.com> On Wed, 6 Oct 2021 16:48:11 GMT, Roman Kennke wrote: >> This implements loading the compressed Klass* from the object header, instead of the Klass* field in the x86 interpreter. It does the fast-path (unlocked object) in assembly, and calls into the runtime to deal with locked objects. >> >> This is a proof-of-concept. It is not entirely clear if we can even call into runtime from a couple of places where load_klass() is called, especially in C2, C1 and some stubs. OTOH, we may end up not having to do any of this if we come up with a way to avoid displaced headers altogether. 
>> >> Testing: >> - [x] tier1 (x86_32,x86_64) >> - [x] tier2 (x86_32,x86_64) >> - [x] hotspot_gc (x86_32,x86_64) > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into klass-from-header-interpreter > - Emit null-checks before load_klass with correct offset > - Use constant instead of literal in xor-test-sequence > - Update comment about xorb/xorq encoding > - Load Klass* from header in interpreter (x86) Looks okay for the experimental code. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4554: > 4552: // NOTE: While it would seem nice to use xorb instead (for which we don't have an encoding in our assembler), > 4553: // the encoding for xorq uses the signed version (0x81/6) of xor, which encodes as compact as xorb would, > 4554: // and does't make a difference performance-wise. I can give you `xorb`, if you want, in a subsequent PR. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4597: > 4595: null_check(src, oopDesc::klass_offset_in_bytes()); > 4596: } > 4597: movptr(dst, Address(src, oopDesc::klass_offset_in_bytes())); Do we not need to do any fixups for x86_32? src/hotspot/cpu/x86/macroAssembler_x86.hpp line 342: > 340: > 341: // oop manipulations > 342: void load_klass(Register dst, Register src, Register tmp, bool null_check_src); I think if you define `bool null_check_src = false`, then the significant part of the changes would go away. It might not be a good idea when upstreaming this, but it would probably simplify the merges quite a bit. src/hotspot/cpu/x86/templateTable_x86.cpp line 4185: > 4183: Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); > 4184: __ load_klass(rdx, rdx, tmp_load_klass, false); > 4185: __ jmp(resolved); Have you run into troubles with `jmpb` short branch here? 
------------- Marked as reviewed by shade (Committer). PR: https://git.openjdk.java.net/lilliput/pull/15 From shade at openjdk.java.net Tue Oct 12 12:53:24 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 12 Oct 2021 12:53:24 GMT Subject: [master] RFR: Use hashtable for obj->edge mapping in JFR, instead of mark-word In-Reply-To: <9z1RIw_WH0kQ5_muA1RQ9wWmHs0EEZHrn14Qo1Ah5SA=.7c4c3904-ab48-4b10-8b21-a40734d3f696@github.com> References: <9z1RIw_WH0kQ5_muA1RQ9wWmHs0EEZHrn14Qo1Ah5SA=.7c4c3904-ab48-4b10-8b21-a40734d3f696@github.com> Message-ID: On Tue, 12 Oct 2021 11:49:27 GMT, Roman Kennke wrote: > JFR overrides the mark-word to (temporarily) store a mapping from object to Edge*. This disturbs the Klass* that we need while tracing for leaks and for later emitting object information. Let's use a hashtable for this, instead, and leave the upper half of the mark-word alone. (The lower half is still used for marking.) > > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr (together with #17 most tests pass, needs one more follow-up test fix) Looks fine, I have only a few suggestions. src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp line 41: > 39: EdgeStore::EdgeStore() : _edges(NULL) { > 40: _edges = new EdgeHashTable(this); > 41: _objEdgeHashTable = new ObjEdgeHashTable(); Not sure why upstream code has `_edges(NULL)` in the initialization list, it feels that `_objEdgeHashTable(NULL)` should be there as well. src/hotspot/share/jfr/leakprofiler/chains/edgeStore.hpp line 63: > 61: private: > 62: oop _obj; > 63: Edge* _edge; Should be `const`, I think. ------------- Marked as reviewed by shade (Committer). 
PR: https://git.openjdk.java.net/lilliput/pull/18 From shade at redhat.com Tue Oct 12 12:54:30 2021 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 12 Oct 2021 14:54:30 +0200 Subject: Reviewers needed In-Reply-To: <918e6329-f366-5ffb-8221-e87441d54344@redhat.com> References: <918e6329-f366-5ffb-8221-e87441d54344@redhat.com> Message-ID: <9f6c81fa-d010-7777-2255-067f9ac1ca16@redhat.com> On 10/12/21 2:11 PM, Roman Kennke wrote: > Hi all, > > I have a couple of open PRs in Lilliput land, and they are starting to > pile up and become difficult to manage. > > Since this is such a deep and tricky area, I'd love to get some reviews > on the stuff that I do before I push it. If you feel even remotely > comfortable with any of the stuff here: > > https://github.com/openjdk/lilliput/pulls?q=is%3Apr+is%3Aopen+label%3Arfr > > ... I would appreciate if you could look over it. :-D Ask and you shall receive! Feel free to use "Request Review" on GitHub next time. -- Thanks, -Aleksey From shade at openjdk.java.net Tue Oct 12 12:56:22 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 12 Oct 2021 12:56:22 GMT Subject: [master] RFR: Fix some tests to work with +UseCompressedClassPointers In-Reply-To: References: Message-ID: On Tue, 12 Oct 2021 12:02:16 GMT, Roman Kennke wrote: > Some tests assume -UseCompressedClassPointers, but we're forcing +UseCompressedClassPointers, so let's fix those tests. > > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr > - [x] testlibrary_tests/ir_framework/tests Looks fine, modulo a question. test/jdk/jdk/jfr/event/gc/objectcount/ObjectCountEventVerifier.java line 73: > 71: boolean runsOn32Bit = System.getProperty("sun.arch.data.model").equals("32"); > 72: int bytesPerWord = runsOn32Bit ? 4 : 8; > 73: int objectHeaderSize = bytesPerWord * 2; // length will be in klass-gap Would it be in klass-gap on x86_32? Run this test on x86_32 to confirm? ------------- Marked as reviewed by shade (Committer). 
PR: https://git.openjdk.java.net/lilliput/pull/19 From markus.gronlund at oracle.com Tue Oct 12 12:37:28 2021 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Tue, 12 Oct 2021 12:37:28 +0000 Subject: RFC: (How to?) Replace use of mark-word in leak-profiler (Lilliput) In-Reply-To: References: Message-ID: Hi Roman, Thank you for bringing this up. I am exploring a few ideas, and I will get back to you soon. Cheers Markus -----Original Message----- From: hotspot-jfr-dev On Behalf Of Roman Kennke Sent: 8 October 2021 17:44 To: hotspot-jfr-dev ; lilliput-dev at openjdk.java.net Subject: RFC: (How to?) Replace use of mark-word in leak-profiler (Lilliput) Hi there, I'm currently thinking about an issue that came up when I tested JFR/Leakprofiler with Lilliput. First some context: In Lilliput I am storing a (compressed) Klass* in the object header, and want to phase out all uses of the dedicated Klass*-word and then remove that. This means that we need to be a little more careful when accessing the object header and/or the object's klass. In the leak-profiler, we (temporarily) store an Edge* into the object header, and preserve the actual mark in a table. However, that means that the heap traversal would not work anymore because the object doesn't have a valid Klass* anymore. Therefore I'm thinking about alternative ways to associate objects with an Edge*: 1. Use a (hash-)table for it (oop->Edge*), in EdgeStore. 2. Compress the Edge* to 32 bits and store that only in the lower 32 bits of the header (the Klass* is in the upper 32 bits, currently). Not quite sure how to do this, though, it means we have to control allocation of the Edge instances in a contiguous space. 3. Store the object Klass* somewhere as long as we have it (before overriding it in EdgeStore::associate_leak_content_with_candidate()), and use that for iterating the object (in BFSClosure::iterate()). But where? Maybe in the Edge to the object? I.o.w. 
each Edge would also store the Klass* of its pointee? I wanted to check with you if any of those approaches make sense to you, or maybe you have even better ideas? Thanks for helping, Roman From rkennke at openjdk.java.net Tue Oct 12 13:07:06 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 13:07:06 GMT Subject: [master] Integrated: Resolve displaced header in CDS dumping In-Reply-To: <2Qyb2U6Rq2r-DSKnrhpbBbRrfIUymIbewaC-QjMcJSs=.f7d4784a-1cff-4678-a310-877f5b6a1042@github.com> References: <2Qyb2U6Rq2r-DSKnrhpbBbRrfIUymIbewaC-QjMcJSs=.f7d4784a-1cff-4678-a310-877f5b6a1042@github.com> Message-ID: <3tEVLJTphipfuhPeWViE54TMOqe3QRU5RaR7x5izD1g=.a1d8c5ed-1237-4635-80e7-de443a5fc339@github.com> On Thu, 7 Oct 2021 10:56:05 GMT, Roman Kennke wrote: > Tests showed a bug in CDS dumping. When an object is locked, we need to resolve the displaced header in order to fetch the correct Klass*. > > The failing test was: > runtime/cds/appcds/javaldr/LockDuringDump.java > and it's fixed by this change. > > Testing: > - [x] runtime/cds/appcds/javaldr/LockDuringDump.java > - [x] tier1 > - [x] tier2 > - [x] tier3 > - [ ] tier4 This pull request has now been integrated. Changeset: 0e4f781c Author: Roman Kennke URL: https://git.openjdk.java.net/lilliput/commit/0e4f781c23b4950ab06a97f97ae0a32f9546a9d0 Stats: 8 lines in 1 file changed: 7 ins; 0 del; 1 mod Resolve displaced header in CDS dumping Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/lilliput/pull/16 From rkennke at openjdk.java.net Tue Oct 12 13:09:47 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 13:09:47 GMT Subject: [master] RFR: Fix use of Klass* in GC JFR reporting [v2] In-Reply-To: References: Message-ID: > Some JFR related code paths in GC access the Klass* of old objects that has already been overridden by the forwarding pointer. We can easily use the Klass* that we fetched earlier instead. 
> > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr (some of which still fail, see subsequent PRs) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Update src/hotspot/share/gc/g1/g1ParScanThreadState.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/gc/g1/g1ParScanThreadState.cpp Co-authored-by: Aleksey Shipilëv ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/17/files - new: https://git.openjdk.java.net/lilliput/pull/17/files/1bf5407f..ce81fbf9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=17&range=01 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=17&range=00-01 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/lilliput/pull/17.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/17/head:pull/17 PR: https://git.openjdk.java.net/lilliput/pull/17 From rkennke at openjdk.java.net Tue Oct 12 13:14:31 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 13:14:31 GMT Subject: [master] RFR: Fix use of Klass* in GC JFR reporting [v3] In-Reply-To: References: Message-ID: <_4ZtBZdrIGwU1ajlbOhl-7UG-ZGJx8xQTYaZQ8kz2ZQ=.12b2383b-af33-415c-b251-c6253dcbb6ba@github.com> > Some JFR related code paths in GC access the Klass* of old objects that has already been overridden by the forwarding pointer. We can easily use the Klass* that we fetched earlier instead. 
> > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr (some of which still fail, see subsequent PRs) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Whitespace fixes ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/17/files - new: https://git.openjdk.java.net/lilliput/pull/17/files/ce81fbf9..5286943e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=17&range=02 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=17&range=01-02 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/lilliput/pull/17.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/17/head:pull/17 PR: https://git.openjdk.java.net/lilliput/pull/17 From rkennke at openjdk.java.net Tue Oct 12 13:19:15 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 13:19:15 GMT Subject: [master] Integrated: Fix use of Klass* in GC JFR reporting In-Reply-To: References: Message-ID: On Tue, 12 Oct 2021 11:37:04 GMT, Roman Kennke wrote: > Some JFR related code paths in GC access the Klass* of old objects that has already been overridden by the forwarding pointer. We can easily use the Klass* that we fetched earlier instead. > > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr (some of which still fail, see subsequent PRs) This pull request has now been integrated. Changeset: 8b9dc84d Author: Roman Kennke URL: https://git.openjdk.java.net/lilliput/commit/8b9dc84de52aae8e0c48f644ee4a03ddd88d9544 Stats: 16 lines in 4 files changed: 2 ins; 0 del; 14 mod Fix use of Klass* in GC JFR reporting Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/lilliput/pull/17 From rkennke at redhat.com Tue Oct 12 13:19:56 2021 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 12 Oct 2021 15:19:56 +0200 Subject: RFC: (How to?) Replace use of mark-word in leak-profiler (Lilliput) In-Reply-To: References: Message-ID: Thank you, Markus! 
In the meantime I implemented proposal #1, map oop->Edge* using a hashtable: https://github.com/openjdk/lilliput/pull/18 If you find a better way to do that, I would appreciate it! Meanwhile, I will leave that PR#18 open. Thanks, Roman > Hi Roman, > > Thank you for bringing this up. > > I am exploring a few ideas, and I will get back to you soon. > > Cheers > Markus > > -----Original Message----- > From: hotspot-jfr-dev On Behalf Of Roman Kennke > Sent: 8 October 2021 17:44 > To: hotspot-jfr-dev ; lilliput-dev at openjdk.java.net > Subject: RFC: (How to?) Replace use of mark-word in leak-profiler (Lilliput) > > Hi there, > > I'm currently thinking about an issue that came up when I tested JFR/Leakprofiler with Lilliput. > > First some context: In Lilliput I am storing a (compressed) Klass* in the object header, and want to phase out all uses of the dedicated Klass*-word and then remove that. This means that we need to be a little more careful when accessing the object header and/or the object's klass. > > In the leak-profiler, we (temporarily) store an Edge* into the object header, and preserve the actual mark in a table. However, that means that the heap traversal would not work anymore because the object doesn't have a valid Klass* anymore. > > Therefore I'm thinking about alternative ways to associate objects with an Edge*: > > 1. Use a (hash-)table for it (oop->Edge*), in EdgeStore. > 2. Compress the Edge* to 32 bits and store that only in the lower 32 bits of the header (the Klass* is in the upper 32 bits, currently). Not quite sure how to do this, though, it means we have to control allocation of the Edge instances in a contiguous space. > 3. Store the object Klass* somewhere as long as we have it (before overriding it in EdgeStore::associate_leak_content_with_candidate()), > and use that for iterating the object (in BFSClosure::iterate()). But where? Maybe in the Edge to the object? I.o.w. each Edge would also store the Klass* of its pointee? 
> > I wanted to check with you if any of those approaches make sense to you, or maybe you have even better ideas? > > Thanks for helping, > Roman > From rkennke at openjdk.java.net Tue Oct 12 13:30:25 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 13:30:25 GMT Subject: [master] RFR: Load Klass* from header in interpreter (x86) [v5] In-Reply-To: <9tB3pKHcU0oUxaBujqq0rAWb-vqmNiapl_RthIzxEz4=.e0eb3c29-a704-4be3-9300-57877d57194f@github.com> References: <4GigBipemEsHVFPPGFBcQ6GvV5sxbM6x0Fq2xyJKB0M=.57caf821-038f-4632-bc9a-027b8493fa3c@github.com> <9tB3pKHcU0oUxaBujqq0rAWb-vqmNiapl_RthIzxEz4=.e0eb3c29-a704-4be3-9300-57877d57194f@github.com> Message-ID: On Tue, 12 Oct 2021 12:34:56 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'master' into klass-from-header-interpreter >> - Emit null-checks before load_klass with correct offset >> - Use constant instead of literal in xor-test-sequence >> - Update comment about xorb/xorq encoding >> - Load Klass* from header in interpreter (x86) > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4554: > >> 4552: // NOTE: While it would seem nice to use xorb instead (for which we don't have an encoding in our assembler), >> 4553: // the encoding for xorq uses the signed version (0x81/6) of xor, which encodes as compact as xorb would, >> 4554: // and does't make a difference performance-wise. > > I can give you `xorb`, if you want, in a subsequent PR. I actually have it, locally, but as comment says, it doesn't look any better in terms of code size or performance, so why bother? 
> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4597: > >> 4595: null_check(src, oopDesc::klass_offset_in_bytes()); >> 4596: } >> 4597: movptr(dst, Address(src, oopDesc::klass_offset_in_bytes())); > > Do we not need to do any fixups for x86_32? No because layout-wise, the Klass* remains 4 bytes from object start, and would never get disturbed by anything that happens in the 32-bit-header. Hopefully, when we find a way to avoid displaced headers altogether, we can unify the code with x86_64 and avoid the mess that we make here. > src/hotspot/cpu/x86/templateTable_x86.cpp line 4185: > >> 4183: Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >> 4184: __ load_klass(rdx, rdx, tmp_load_klass, false); >> 4185: __ jmp(resolved); > > Have you run into troubles with `jmpb` short branch here? Yes. The (subsequent) load_klass() increases the distance to resolved. ------------- PR: https://git.openjdk.java.net/lilliput/pull/15 From shade at openjdk.java.net Tue Oct 12 13:30:29 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 12 Oct 2021 13:30:29 GMT Subject: [master] RFR: Load Klass* from header in interpreter (x86) [v5] In-Reply-To: References: <4GigBipemEsHVFPPGFBcQ6GvV5sxbM6x0Fq2xyJKB0M=.57caf821-038f-4632-bc9a-027b8493fa3c@github.com> <9tB3pKHcU0oUxaBujqq0rAWb-vqmNiapl_RthIzxEz4=.e0eb3c29-a704-4be3-9300-57877d57194f@github.com> Message-ID: <0KmN7zxfdhJ6ygznPsGHGpS7FaXhpCeZPvfBhfrxd6E=.0a95951d-a19a-45d1-a86b-3890f822bc61@github.com> On Tue, 12 Oct 2021 13:23:18 GMT, Roman Kennke wrote: >> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4554: >> >>> 4552: // NOTE: While it would seem nice to use xorb instead (for which we don't have an encoding in our assembler), >>> 4553: // the encoding for xorq uses the signed version (0x81/6) of xor, which encodes as compact as xorb would, >>> 4554: // and does't make a difference performance-wise. >> >> I can give you `xorb`, if you want, in a subsequent PR. 
> > I actually have it, locally, but as comment says, it doesn't look any better in terms of code size or performance, so why bother? OK, fine. >> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4597: >> >>> 4595: null_check(src, oopDesc::klass_offset_in_bytes()); >>> 4596: } >>> 4597: movptr(dst, Address(src, oopDesc::klass_offset_in_bytes())); >> >> Do we not need to do any fixups for x86_32? > > No because layout-wise, the Klass* remains 4 bytes from object start, and would never get disturbed by anything that happens in the 32-bit-header. Hopefully, when we find a way to avoid displaced headers altogether, we can unify the code with x86_64 and avoid the mess that we make here. Understood. This bit looks fine then. >> src/hotspot/cpu/x86/templateTable_x86.cpp line 4185: >> >>> 4183: Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >>> 4184: __ load_klass(rdx, rdx, tmp_load_klass, false); >>> 4185: __ jmp(resolved); >> >> Have you run into troubles with `jmpb` short branch here? > > Yes. The (subsequent) load_klass() increases the distance to resolved. OK ------------- PR: https://git.openjdk.java.net/lilliput/pull/15 From rkennke at openjdk.java.net Tue Oct 12 13:42:26 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 13:42:26 GMT Subject: [master] RFR: Load Klass* from header in interpreter (x86) [v6] In-Reply-To: References: Message-ID: > This implements loading the compressed Klass* from the object header, instead of the Klass* field in the x86 interpreter. It does the fast-path (unlocked object) in assembly, and calls into the runtime to deal with locked objects. > > This is a proof-of-concept. It is not entirely clear if we can even call into runtime from a couple of places where load_klass() is called, especially in C2, C1 and some stubs. OTOH, we may end up not having to do any of this if we come up with a way to avoid displaced headers altogether. 
> > Testing: > - [x] tier1 (x86_32,x86_64) > - [x] tier2 (x86_32,x86_64) > - [x] hotspot_gc (x86_32,x86_64) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Use default param in load_klass ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/15/files - new: https://git.openjdk.java.net/lilliput/pull/15/files/cbe93827..525ec01a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=15&range=05 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=15&range=04-05 Stats: 47 lines in 13 files changed: 0 ins; 0 del; 47 mod Patch: https://git.openjdk.java.net/lilliput/pull/15.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/15/head:pull/15 PR: https://git.openjdk.java.net/lilliput/pull/15 From rkennke at openjdk.java.net Tue Oct 12 14:18:12 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 14:18:12 GMT Subject: [master] Integrated: Load Klass* from header in interpreter (x86) In-Reply-To: References: Message-ID: <1tPQ8499f6oMOXoafE0hBjLejWvzMYCVzo_LI82jLvo=.9f139a0e-0be2-4e77-862d-0bb82b731247@github.com> On Mon, 27 Sep 2021 10:55:56 GMT, Roman Kennke wrote: > This implements loading the compressed Klass* from the object header, instead of the Klass* field in the x86 interpreter. It does the fast-path (unlocked object) in assembly, and calls into the runtime to deal with locked objects. > > This is a proof-of-concept. It is not entirely clear if we can even call into runtime from a couple of places where load_klass() is called, especially in C2, C1 and some stubs. OTOH, we may end up not having to do any of this if we come up with a way to avoid displaced headers altogether. > > Testing: > - [x] tier1 (x86_32,x86_64) > - [x] tier2 (x86_32,x86_64) > - [x] hotspot_gc (x86_32,x86_64) This pull request has now been integrated. 
Changeset: dc8fd0f1 Author: Roman Kennke URL: https://git.openjdk.java.net/lilliput/commit/dc8fd0f1ea7f1b652ecf93b789ed03b559619c2b Stats: 76 lines in 8 files changed: 58 ins; 5 del; 13 mod Load Klass* from header in interpreter (x86) Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/lilliput/pull/15 From rkennke at openjdk.java.net Tue Oct 12 14:35:34 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 14:35:34 GMT Subject: [master] RFR: Fix some tests to work with +UseCompressedClassPointers [v2] In-Reply-To: References: Message-ID: > Some tests assume -UseCompressedClassPointers, but we're forcing +UseCompressedClassPointers, so let's fix those tests. > > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr > - [x] testlibrary_tests/ir_framework/tests Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Special case in ObjectCountEventVerifier for 32 bits ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/19/files - new: https://git.openjdk.java.net/lilliput/pull/19/files/6ae58819..a4c410a6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=19&range=01 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=19&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/lilliput/pull/19.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/19/head:pull/19 PR: https://git.openjdk.java.net/lilliput/pull/19 From rkennke at openjdk.java.net Tue Oct 12 14:35:36 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 14:35:36 GMT Subject: [master] RFR: Fix some tests to work with +UseCompressedClassPointers [v2] In-Reply-To: References: Message-ID: On Tue, 12 Oct 2021 12:53:24 GMT, Aleksey Shipilev wrote: > Would it be in klass-gap on x86_32? Run this test on x86_32 to confirm? Good catch! 
Indeed, on 32 bits, the length is in its own field, and thus we need a multiplier of 3. Fixed in a change that I just pushed, and re-ran the affected tests both on x86_32 and x86_64 for verification. ------------- PR: https://git.openjdk.java.net/lilliput/pull/19 From rkennke at openjdk.java.net Tue Oct 12 15:38:12 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 12 Oct 2021 15:38:12 GMT Subject: [master] RFR: Fix some tests to work with +UseCompressedClassPointers [v2] In-Reply-To: References: <7Ae4uSbMVFZs-oXGSWntuDxTkM1IhxmiR8Y5CPRyd1o=.634bd635-8128-410b-9ecd-67fc5ff4e838@github.com> Message-ID: On Tue, 12 Oct 2021 14:39:48 GMT, Thomas Stuefe wrote: > Oh, the comment is wrong then. Can you check the latest changes? It should be ok now, or isn't it? ------------- PR: https://git.openjdk.java.net/lilliput/pull/19 From markus.gronlund at oracle.com Tue Oct 12 22:52:27 2021 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Tue, 12 Oct 2021 22:52:27 +0000 Subject: RFC: (How to?) Replace use of mark-word in leak-profiler (Lilliput) In-Reply-To: References: Message-ID: Hi Roman, Providing another hashtable works fine to solve this problem, but I am a little bit concerned that the overhead might not fully warrant it. The default hashtable has 1009 buckets IIRC, but the number of leak candidates (samples) found during heap traversal might only be a relatively small number (default queue size is 256). I was trying to come up with something more lightweight, albeit the hashtable solution might be more straightforward to understand. This solution still uses space available in the markword, since we have already provisioned to use this as a "scratch area" as part of setup - but it is now limited to accommodate Lilliput (i.e. restricted to use only the lower 32-bits only). I hope this will enable you and the Lilliput team to make progress. 
https://github.com/openjdk/jdk/pull/5918 Thanks Markus -----Original Message----- From: Roman Kennke Sent: 12 October 2021 15:20 To: Markus Gronlund ; hotspot-jfr-dev ; lilliput-dev at openjdk.java.net Subject: Re: RE: RFC: (How to?) Replace use of mark-word in leak-profiler (Lilliput) Thank you, Markus! In the meantime I implemented proposal #1, map oop->Edge* using a hashtable: https://github.com/openjdk/lilliput/pull/18 If you find a better way to do that, I would appreciate it! Meanwhile, I will leave that PR#18 open. Thanks, Roman > Hi Roman, > > Thank you for bringing this up. > > I am exploring a few ideas, and I will get back to you soon. > > Cheers > Markus > > -----Original Message----- > From: hotspot-jfr-dev On > Behalf Of Roman Kennke > Sent: 8 October 2021 17:44 > To: hotspot-jfr-dev ; > lilliput-dev at openjdk.java.net > Subject: RFC: (How to?) Replace use of mark-word in leak-profiler > (Lilliput) > > Hi there, > > I'm currently thinking about an issue that came up when I tested JFR/Leakprofiler with Lilliput. > > First some context: In Lilliput I am storing a (compressed) Klass* in the object header, and want to phase out all uses of the dedicated Klass*-word and then remove that. This means that we need to be a little more careful when accessing the object header and/or the object's klass. > > In the leak-profiler, we (temporarily) store an Edge* into the object header, and preserve the actual mark in a table. However, that means that the heap traversal would not work anymore because the object doesn't have a valid Klass* anymore. > > Therefore I'm thinking about alternative ways to associate objects with an Edge*: > > 1. Use a (hash-)table for it (oop->Edge*), in EdgeStore. > 2. Compress the Edge* to 32bits and store that only in the lower 32bits of the header (the Klass* is in the upper 32bits, currently).
Not quite sure how to do this, though, it means we have to control allocation of the Edge instances in a contiguous space. > 3. Store the object Klass* somewhere as long as we have it (before > overriding it in EdgeStore::associate_leak_content_with_candidate()), > and use that for iterating the object (in BFSClosure::iterate()). But where? Maybe in the Edge to the object? I.o.w. each Edge would also store the Klass* of its pointee? > > I wanted to check with you if any of those approaches make sense to you, or maybe you have even better ideas? > > Thanks for helping, > Roman > From rkennke at openjdk.java.net Wed Oct 13 09:29:22 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 13 Oct 2021 09:29:22 GMT Subject: [master] Integrated: Fix some tests to work with +UseCompressedClassPointers In-Reply-To: References: Message-ID: On Tue, 12 Oct 2021 12:02:16 GMT, Roman Kennke wrote: > Some tests assume -UseCompressedClassPointers, but we're forcing +UseCompressedClassPointers, so let's fix those tests. > > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr > - [x] testlibrary_tests/ir_framework/tests This pull request has now been integrated. 
Changeset: 3dd59f89 Author: Roman Kennke URL: https://git.openjdk.java.net/lilliput/commit/3dd59f89d5547c80acf64cbc441b0c8174a7afa6 Stats: 5 lines in 3 files changed: 1 ins; 1 del; 3 mod Fix some tests to work with +UseCompressedClassPointers Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/lilliput/pull/19 From shade at openjdk.java.net Thu Oct 14 13:04:26 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 14 Oct 2021 13:04:26 GMT Subject: [master] RFR: Jump/stub length failures in x86_64 release builds Message-ID: <5YlrW81Ry8EGhpxpVl_aS53kmp70xbrqUwjkLVbpkiE=.dca4b4ad-1003-440a-8022-47762ec2b421@github.com> My CI reports many failures on `make bootcycle-images` on x86_64, but only in release modes: # Internal Error (macroAssembler_x86.hpp:120), pid=2457838, tid=2457856 # guarantee(this->is8bit(imm8)) failed: Short forward jump exceeds 8-bit offset at :0 # Internal Error (vtableStubs.cpp:196), pid=1871037, tid=1871042 # guarantee(masm->pc() <= s->code_end()) failed: itable #2: overflowed buffer, estimated len: 256, actual len: 295, overrun: 39 # Internal Error (vtableStubs.cpp:196), pid=1871037, tid=1871042 # guarantee(masm->pc() <= s->code_end()) failed: itable #2: overflowed buffer, estimated len: 256, actual len: 295, overrun: 39 Additional testing: - [x] Linux x86_64 release `make bootcycle-images` now works ------------- Commit messages: - Fix Changes: https://git.openjdk.java.net/lilliput/pull/22/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=22&range=00 Stats: 7 lines in 2 files changed: 1 ins; 4 del; 2 mod Patch: https://git.openjdk.java.net/lilliput/pull/22.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/22/head:pull/22 PR: https://git.openjdk.java.net/lilliput/pull/22 From rkennke at openjdk.java.net Thu Oct 14 13:13:17 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 14 Oct 2021 13:13:17 GMT Subject: [master] RFR: Jump/stub length failures in x86_64 release builds 
In-Reply-To: <5YlrW81Ry8EGhpxpVl_aS53kmp70xbrqUwjkLVbpkiE=.dca4b4ad-1003-440a-8022-47762ec2b421@github.com> References: <5YlrW81Ry8EGhpxpVl_aS53kmp70xbrqUwjkLVbpkiE=.dca4b4ad-1003-440a-8022-47762ec2b421@github.com> Message-ID: On Thu, 14 Oct 2021 12:58:37 GMT, Aleksey Shipilev wrote: > My CI reports many failures on `make bootcycle-images` on x86_64, but only in release modes: > > > # Internal Error (macroAssembler_x86.hpp:120), pid=2457838, tid=2457856 > # guarantee(this->is8bit(imm8)) failed: Short forward jump exceeds 8-bit offset at :0 > > > > # Internal Error (vtableStubs.cpp:196), pid=1871037, tid=1871042 > # guarantee(masm->pc() <= s->code_end()) failed: itable #2: overflowed buffer, estimated len: 256, actual len: 295, overrun: 39 > > > > # Internal Error (vtableStubs.cpp:196), pid=1871037, tid=1871042 > # guarantee(masm->pc() <= s->code_end()) failed: itable #2: overflowed buffer, estimated len: 256, actual len: 295, overrun: 39 > > > Additional testing: > - [x] Linux x86_64 release `make bootcycle-images` now works Looks good! Thanks for catching them! ------------- Marked as reviewed by rkennke (Lead). 
PR: https://git.openjdk.java.net/lilliput/pull/22 From shade at openjdk.java.net Thu Oct 14 13:13:18 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 14 Oct 2021 13:13:18 GMT Subject: [master] Integrated: Jump/stub length failures in x86_64 release builds In-Reply-To: <5YlrW81Ry8EGhpxpVl_aS53kmp70xbrqUwjkLVbpkiE=.dca4b4ad-1003-440a-8022-47762ec2b421@github.com> References: <5YlrW81Ry8EGhpxpVl_aS53kmp70xbrqUwjkLVbpkiE=.dca4b4ad-1003-440a-8022-47762ec2b421@github.com> Message-ID: On Thu, 14 Oct 2021 12:58:37 GMT, Aleksey Shipilev wrote: > My CI reports many failures on `make bootcycle-images` on x86_64, but only in release modes: > > > # Internal Error (macroAssembler_x86.hpp:120), pid=2457838, tid=2457856 > # guarantee(this->is8bit(imm8)) failed: Short forward jump exceeds 8-bit offset at :0 > > > > # Internal Error (vtableStubs.cpp:196), pid=1871037, tid=1871042 > # guarantee(masm->pc() <= s->code_end()) failed: itable #2: overflowed buffer, estimated len: 256, actual len: 295, overrun: 39 > > > > # Internal Error (vtableStubs.cpp:196), pid=1871037, tid=1871042 > # guarantee(masm->pc() <= s->code_end()) failed: itable #2: overflowed buffer, estimated len: 256, actual len: 295, overrun: 39 > > > Additional testing: > - [x] Linux x86_64 release `make bootcycle-images` now works This pull request has now been integrated. 
Changeset: 3a4a811f Author: Aleksey Shipilev URL: https://git.openjdk.java.net/lilliput/commit/3a4a811f0691234cf1f26b14c57c927656baabbf Stats: 7 lines in 2 files changed: 1 ins; 4 del; 2 mod Jump/stub length failures in x86_64 release builds Reviewed-by: rkennke ------------- PR: https://git.openjdk.java.net/lilliput/pull/22 From rkennke at openjdk.java.net Thu Oct 14 16:46:24 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 14 Oct 2021 16:46:24 GMT Subject: [master] RFR: Preserve Klass* in leak profiler Message-ID: The JFR leak profiler uses the mark word for marking objects during traversal, and for storing pointers to Edges. We need to preserve the compressed Klass* in the upper part of the mark-word during the traversal. The change is based on: https://github.com/openjdk/jdk/pull/5918 Testing: - [x] jdk/jfr (x86_64) - [x] jdk/jfr (x86_32) - [x] tier1 - [ ] tier2 ------------- Commit messages: - Merge branch 'master' into fix-jfr-2 - Merge branch 'pull/5918' into fix-jfr-2 - remove -1 - constants - Fix 32bit build - Merge branch 'master' into fix-jfr-2 - In leakprofiler, don't override Klass when marking objects - Merge branch 'pull/5918' into fix-jfr-2 - spelling - prepare leakprofiler for lilliput Changes: https://git.openjdk.java.net/lilliput/pull/21/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=21&range=00 Stats: 113 lines in 4 files changed: 81 ins; 23 del; 9 mod Patch: https://git.openjdk.java.net/lilliput/pull/21.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/21/head:pull/21 PR: https://git.openjdk.java.net/lilliput/pull/21 From rkennke at openjdk.java.net Fri Oct 15 10:54:27 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 15 Oct 2021 10:54:27 GMT Subject: [master] RFR: Fix self-forwarding on 32 bit platforms Message-ID: Self-forwarding as implemented by #10 uses the 3rd header bit to indicate self-forwarded object.
However, on 32bit platforms, oops are only 4-byte-aligned, and a regular forwarding may set the 3rd bit, and thus make it look like self-forwarded. This breaks one of the gtests, and potentially causes severe heap corruption. OTOH, on 32 bit platforms we don't need to preserve the upper header bits, and can therefore use regular forwarding mechanism to do self-forwarding (as it was before #10). The change also changes the test_preservedMarks.cpp gtest to install 0b1 as original mark, not 0b11 which would also look like a forwarded object. It has not caused failure because it was not tested before the forwarding gets installed, but is wrong nonetheless. Testing: - [x] gtest (which was failing before) - [x] tier1 - [x] tier2 - [x] hotspot_gc ------------- Commit messages: - Fix self-forwarding on 32 bit platforms Changes: https://git.openjdk.java.net/lilliput/pull/23/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=23&range=00 Stats: 9 lines in 2 files changed: 7 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/lilliput/pull/23.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/23/head:pull/23 PR: https://git.openjdk.java.net/lilliput/pull/23 From shade at openjdk.java.net Fri Oct 15 11:09:10 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 15 Oct 2021 11:09:10 GMT Subject: [master] RFR: Fix self-forwarding on 32 bit platforms In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 10:49:16 GMT, Roman Kennke wrote: > Self-forwarding as implemented by #10 uses the 3rd header bit to indicate self-forwarded object. However, on 32bit platforms, oops are only 4-byte-aligned, and a regular forwarding may set the 3rd bit, and thus make it look like self-forwarded. This breaks one of the gtests, and potentially causes severe heap corruption. OTOH, on 32 bit platforms we don't need to preserve the upper header bits, and can therefore use regular forwarding mechanism to do self-forwarding (as it was before #10). 
> > The change also changes the test_preservedMarks.cpp gtest to install 0b1 as original mark, not 0b11 which would also look like a forwarded object. It has not caused failure because it was not tested before the forwarding gets installed, but is wrong nonetheless. > > Testing: > - [x] gtest (which was failing before) > - [x] tier1 > - [x] tier2 > - [x] hotspot_gc Looks fine. ------------- Marked as reviewed by shade (Committer). PR: https://git.openjdk.java.net/lilliput/pull/23 From shade at openjdk.java.net Fri Oct 15 11:13:02 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 15 Oct 2021 11:13:02 GMT Subject: [master] RFR: Preserve Klass* in leak profiler In-Reply-To: References: Message-ID: On Wed, 13 Oct 2021 09:32:58 GMT, Roman Kennke wrote: > The JFR leak profiler uses the mark word for marking objects during traversal, and for storing pointers to Edges. We need to preserve the compressed Klass* in the upper part of the mark-word during the traversal. > > The change is based on: https://github.com/openjdk/jdk/pull/5918 > > Testing: > - [x] jdk/jfr (x86_64) > - [x] jdk/jfr (x86_32) > - [x] tier1 > - [ ] tier2 All right. This is cherry-pick of upstream change, right? Looks good then. ------------- Marked as reviewed by shade (Committer). PR: https://git.openjdk.java.net/lilliput/pull/21 From rkennke at openjdk.java.net Fri Oct 15 12:00:37 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 15 Oct 2021 12:00:37 GMT Subject: [master] RFR: Fix self-forwarding on 32 bit platforms [v2] In-Reply-To: References: Message-ID: > Self-forwarding as implemented by #10 uses the 3rd header bit to indicate self-forwarded object. However, on 32bit platforms, oops are only 4-byte-aligned, and a regular forwarding may set the 3rd bit, and thus make it look like self-forwarded. This breaks one of the gtests, and potentially causes severe heap corruption.
OTOH, on 32 bit platforms we don't need to preserve the upper header bits, and can therefore use regular forwarding mechanism to do self-forwarding (as it was before #10). > > The change also changes the test_preservedMarks.cpp gtest to install 0b1 as original mark, not 0b11 which would also look like a forwarded object. It has not caused failure because it was not tested before the forwarding gets installed, but is wrong nonetheless. > > Testing: > - [x] gtest (which was failing before) > - [x] tier1 > - [x] tier2 > - [x] hotspot_gc Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Special-case forward_to_self_atomic() for 32bits ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/23/files - new: https://git.openjdk.java.net/lilliput/pull/23/files/5b39c962..f3c49060 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=23&range=01 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=23&range=00-01 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/lilliput/pull/23.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/23/head:pull/23 PR: https://git.openjdk.java.net/lilliput/pull/23 From rkennke at openjdk.java.net Fri Oct 15 12:54:17 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 15 Oct 2021 12:54:17 GMT Subject: [master] RFR: Preserve Klass* in leak profiler In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 11:10:27 GMT, Aleksey Shipilev wrote: > All right. This is cherry-pick of upstream change, right? Looks good then. Well, not quite. Markus made the change in upstream PR, but hasn't been integrated into upstream JDK, yet. And I am not sure if he has any intention to do so. The change doesn't even have a bug-id, yet: https://github.com/openjdk/jdk/pull/5918 The only difference to Markus' change is the additional change in src/hotspot/share/jfr/leakprofiler/chains/objectSampleMarker.hpp. 
The intention is to not entirely reset the mark-word when marking objects for leak-profiler-traversal, but to preserve the Klass* part. ------------- PR: https://git.openjdk.java.net/lilliput/pull/21 From rkennke at openjdk.java.net Fri Oct 15 14:12:19 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 15 Oct 2021 14:12:19 GMT Subject: [master] Integrated: Fix self-forwarding on 32 bit platforms In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 10:49:16 GMT, Roman Kennke wrote: > Self-forwarding as implemented by #10 uses the 3rd header bit to indicate self-forwarded object. However, on 32bit platforms, oops are only 4-byte-aligned, and a regular forwarding may set the 3rd bit, and thus make it look like self-forwarded. This breaks one of the gtests, and potentially causes severe heap corruption. OTOH, on 32 bit platforms we don't need to preserve the upper header bits, and can therefore use regular forwarding mechanism to do self-forwarding (as it was before #10). > > The change also changes the test_preservedMarks.cpp gtest to install 0b1 as original mark, not 0b11 which would also look like a forwarded object. It has not caused failure because it was not tested before the forwarding gets installed, but is wrong nonetheless. > > Testing: > - [x] gtest (which was failing before) > - [x] tier1 > - [x] tier2 > - [x] hotspot_gc This pull request has now been integrated. 
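[Editor's illustration, not part of the original message.] The aliasing problem described in the self-forwarding change can be reproduced with a little bit arithmetic. The sketch below is a plain C++ model, not the code from the patch: the constant and function names are hypothetical, and it assumes that a regular forwarding is encoded as the forwardee address with the two low bits set, with the 3rd bit reserved as the self-forwarded flag.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names, chosen for this illustration only.
constexpr uintptr_t kForwardedBits  = 0x3; // both low bits set: "forwarded"
constexpr uintptr_t kSelfForwardBit = 0x4; // 3rd bit: "self-forwarded"

// Regular forwarding: store the forwardee address with the low bits set.
constexpr uintptr_t encode_forwarding(uintptr_t forwardee) {
  return forwardee | kForwardedBits;
}

// A header counts as forwarded when both low bits are set.
constexpr bool is_forwarded(uintptr_t mark) {
  return (mark & kForwardedBits) == kForwardedBits;
}

// What a decoder sees when it checks the 3rd bit of a forwarded header.
constexpr bool looks_self_forwarded(uintptr_t mark) {
  return is_forwarded(mark) && (mark & kSelfForwardBit) != 0;
}

// With 8-byte alignment, bit 2 of any object address is zero, so a regular
// forwarding can never be mistaken for a self-forwarding.
static_assert(!looks_self_forwarded(encode_forwarding(0x1008)), "8-byte aligned");

// With 4-byte alignment, an address such as 0x1004 has bit 2 set, so the
// regular forwarding aliases with the self-forwarded pattern.
static_assert(looks_self_forwarded(encode_forwarding(0x1004)), "4-byte aligned");

// The gtest fix mentioned above: 0x3 (0b11) already reads as "forwarded",
// while 0x1 (0b1) does not.
static_assert(is_forwarded(0x3) && !is_forwarded(0x1), "test mark values");
```

Under these assumptions the checks hold at compile time, which matches the reasoning in the description: the flag bit is only safe where object alignment guarantees that bit is zero in every address.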
Changeset: 6373b6fe Author: Roman Kennke URL: https://git.openjdk.java.net/lilliput/commit/6373b6fecd361e4976141fd75640d6741531a401 Stats: 13 lines in 2 files changed: 11 ins; 0 del; 2 mod Fix self-forwarding on 32 bit platforms Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/lilliput/pull/23 From rkennke at openjdk.java.net Fri Oct 15 14:40:10 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 15 Oct 2021 14:40:10 GMT Subject: [master] RFR: Use hashtable for obj->edge mapping in JFR, instead of mark-word In-Reply-To: <9z1RIw_WH0kQ5_muA1RQ9wWmHs0EEZHrn14Qo1Ah5SA=.7c4c3904-ab48-4b10-8b21-a40734d3f696@github.com> References: <9z1RIw_WH0kQ5_muA1RQ9wWmHs0EEZHrn14Qo1Ah5SA=.7c4c3904-ab48-4b10-8b21-a40734d3f696@github.com> Message-ID: On Tue, 12 Oct 2021 11:49:27 GMT, Roman Kennke wrote: > JFR overrides the mark-word to (temporarily) store a mapping from object to Edge*. This disturbs the Klass* that we need while tracing for leaks and for later emitting object information. Let's use a hashtable for this, instead, and leave the upper half of the mark-word alone. (The lower half is still used for marking.) 
> > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr (together with #17 most tests pass, needs one more follow-up test fix) I'm withdrawing this in favor of #21 which is mostly openjdk/jdk/pull/5918 as implemented by @mgronlun ------------- PR: https://git.openjdk.java.net/lilliput/pull/18 From rkennke at openjdk.java.net Fri Oct 15 14:40:11 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 15 Oct 2021 14:40:11 GMT Subject: [master] Withdrawn: Use hashtable for obj->edge mapping in JFR, instead of mark-word In-Reply-To: <9z1RIw_WH0kQ5_muA1RQ9wWmHs0EEZHrn14Qo1Ah5SA=.7c4c3904-ab48-4b10-8b21-a40734d3f696@github.com> References: <9z1RIw_WH0kQ5_muA1RQ9wWmHs0EEZHrn14Qo1Ah5SA=.7c4c3904-ab48-4b10-8b21-a40734d3f696@github.com> Message-ID: On Tue, 12 Oct 2021 11:49:27 GMT, Roman Kennke wrote: > JFR overrides the mark-word to (temporarily) store a mapping from object to Edge*. This disturbs the Klass* that we need while tracing for leaks and for later emitting object information. Let's use a hashtable for this, instead, and leave the upper half of the mark-word alone. (The lower half is still used for marking.) > > Testing: > - [x] tier1 > - [x] tier2 > - [x] jdk/jfr (together with #17 most tests pass, needs one more follow-up test fix) This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/lilliput/pull/18 From rkennke at openjdk.java.net Fri Oct 15 18:03:30 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 15 Oct 2021 18:03:30 GMT Subject: [master] RFR: Load Klass from header, C1/x86 implementation Message-ID: This implements loading the Klass* from object header (instead of dedicated Klass* field) in C1 generated code. It introduces a new C1 LIR opcode (LoadKlass) and related CodeStub for the slow-path runtime-call. The original implementation was brittle in that it emits the decode_klass() whenever it encounters a load that is T_ADDRESS and from klass_offset_in_bytes() (e.g. 4 or 8). 
Lucky that we did not seem to emit any unrelated such loads. This new implementation uses a dedicated C1 op instead, and expands this to the corresponding load of the header word, and call into slow-path upon fast-path-failure, in the LIR assembler. Testing: - [x] tier1 (x86_64,x86_32) - [x] tier2 (x86_64,x86_32) - [ ] tier3 (x86_64,x86_32) - [ ] tier4 (x86_64,x86_32) ------------- Commit messages: - Merge branch 'master' into klass-from-header-c1 - Revert unnecessary changes - Merge branch 'master' into klass-from-header-c1 - Merge branch 'master' into klass-from-header-c1 - Swap xchg and store_register() to avoid trashing the argument - Merge remote-tracking branch 'origin/klass-from-header-c1' into klass-from-header-c1 - Merge branch 'master' into klass-from-header-c1 - Use xchg instead of push/pop to preserve rax in load-klass runtime call stub - Make mark-load opaque in the LIR of load-klass - Use any register for getting the Klass* result, and shuffle slow-path regs accordingly - ... and 12 more: https://git.openjdk.java.net/lilliput/compare/6373b6fe...a04787d8 Changes: https://git.openjdk.java.net/lilliput/pull/20/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=20&range=00 Stats: 161 lines in 13 files changed: 140 ins; 11 del; 10 mod Patch: https://git.openjdk.java.net/lilliput/pull/20.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/20/head:pull/20 PR: https://git.openjdk.java.net/lilliput/pull/20 From shade at openjdk.java.net Mon Oct 18 07:25:14 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 18 Oct 2021 07:25:14 GMT Subject: [master] RFR: Load Klass from header, C1/x86 implementation In-Reply-To: References: Message-ID: On Tue, 12 Oct 2021 15:55:17 GMT, Roman Kennke wrote: > This implements loading the Klass* from object header (instead of dedicated Klass* field) in C1 generated code. It introduces a new C1 LIR opcode (LoadKlass) and related CodeStub for the slow-path runtime-call. 
> > The original implementation was brittle in that it emits the decode_klass() whenever it encounters a load that is T_ADDRESS and from klass_offset_in_bytes() (e.g. 4 or 8). Lucky that we did not seem to emit any unrelated such loads. This new implementation uses a dedicated C1 op instead, and expands this to the corresponding load of the header word, and call into slow-path upon fast-path-failure, in the LIR assembler. > > Testing: > - [x] tier1 (x86_64,x86_32) > - [x] tier2 (x86_64,x86_32) > - [ ] tier3 (x86_64,x86_32) > - [ ] tier4 (x86_64,x86_32) I have questions :) src/hotspot/cpu/x86/c1_CodeStubs_x86.cpp line 311: > 309: // without messing with the stack. > 310: __ xchgptr(rax, res); > 311: } I am trying to wrap my head around this in early Monday :) Is `res` guaranteed to be callee-save? I.e. does this stashing work if runtime call clobbers `res`? src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp line 1260: > 1258: > 1259: case T_ADDRESS: > 1260: __ movptr(dest->as_register(), from_addr); Can we / should we `assert(addr->disp() != oopDesc::klass_offset_in_bytes(), "sanity")` here and later? So that the non-`LoadKlass` loads would break early? src/hotspot/cpu/x86/c1_Runtime1_x86.cpp line 1124: > 1122: } > 1123: #else > 1124: __ should_not_reach_here(); So the intent here to generate x86_32 stub, but check it is not called ever, right? I wonder if we still need to do `StubFrame f(sasm, "load_klass", dont_gc_arguments);` even for a simple `__ should_not_reach_here();`... ------------- PR: https://git.openjdk.java.net/lilliput/pull/20 From rkennke at openjdk.java.net Mon Oct 18 10:27:18 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 18 Oct 2021 10:27:18 GMT Subject: [master] RFR: Load Klass from header, C1/x86 implementation In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 07:12:45 GMT, Aleksey Shipilev wrote: >> This implements loading the Klass* from object header (instead of dedicated Klass* field) in C1 generated code. 
It introduces a new C1 LIR opcode (LoadKlass) and related CodeStub for the slow-path runtime-call. >> >> The original implementation was brittle in that it emits the decode_klass() whenever it encounters a load that is T_ADDRESS and from klass_offset_in_bytes() (e.g. 4 or 8). Lucky that we did not seem to emit any unrelated such loads. This new implementation uses a dedicated C1 op instead, and expands this to the corresponding load of the header word, and call into slow-path upon fast-path-failure, in the LIR assembler. >> >> Testing: >> - [x] tier1 (x86_64,x86_32) >> - [x] tier2 (x86_64,x86_32) >> - [ ] tier3 (x86_64,x86_32) >> - [ ] tier4 (x86_64,x86_32) > > src/hotspot/cpu/x86/c1_CodeStubs_x86.cpp line 311: > >> 309: // without messing with the stack. >> 310: __ xchgptr(rax, res); >> 311: } > > I am trying to wrap my head around this in early Monday :) Is `res` guaranteed to be callee-save? I.e. does this stashing work if runtime call clobbers `res`? The actual save/restore of registers and the runtime call happens in the stub in c1_Runtime1_x86.cpp > src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp line 1260: > >> 1258: >> 1259: case T_ADDRESS: >> 1260: __ movptr(dest->as_register(), from_addr); > > Can we / should we `assert(addr->disp() != oopDesc::klass_offset_in_bytes(), "sanity")` here and later? So that the non-`LoadKlass` loads would break early? Not sure, tbh. The current implementation is quite brittle, and we are lucky that we have apparently never hit (emit) a load with offset 8 (or 4 on x86_32) which happens to be klass_offset_in_bytes(). > src/hotspot/cpu/x86/c1_Runtime1_x86.cpp line 1124: > >> 1122: } >> 1123: #else >> 1124: __ should_not_reach_here(); > > So the intent here to generate x86_32 stub, but check it is not called ever, right? I wonder if we still need to do `StubFrame f(sasm, "load_klass", dont_gc_arguments);` even for a simple `__ should_not_reach_here();`... Perhaps not. Let me see if I can avoid this altogether. 
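[Editor's illustration, not part of the original message.] The fast-path/slow-path split that these load-klass changes describe can be modeled in plain C++. This is only a sketch under stated assumptions, not the assembly or LIR from the patch: it assumes the compressed Klass* sits in the upper 32 bits of a 64-bit header (as stated earlier in the thread), that the two low bits carry the lock state with 0x1 meaning unlocked, and it uses a hypothetical `SlowPath` callback to stand in for the runtime call.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for this sketch; the names are hypothetical.
constexpr uint64_t kLockMask      = 0x3; // two low bits: lock state
constexpr uint64_t kUnlockedValue = 0x1; // low bits of an unlocked header
constexpr int      kKlassShift    = 32;  // compressed Klass* in the upper half

// Hypothetical stand-in for the runtime call that handles locked objects,
// whose real header may have been displaced.
using SlowPath = uint32_t (*)(uint64_t header);

inline uint32_t load_narrow_klass(uint64_t header, SlowPath slow_path) {
  // xor-then-test: xor-ing with the unlocked pattern clears the lock bits
  // exactly when the header is unlocked.
  uint64_t tmp = header ^ kUnlockedValue;
  if ((tmp & kLockMask) == 0) {
    // Fast path: the header is in place, read the Klass* bits directly.
    return static_cast<uint32_t>(header >> kKlassShift);
  }
  // Slow path: the header cannot be trusted, defer to the runtime.
  return slow_path(header);
}
```

The point of the model is the control flow: the common unlocked case never leaves the inline code, and only locked objects pay for the call, which is why the review discussion focuses on register handling around that call.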
------------- PR: https://git.openjdk.java.net/lilliput/pull/20 From rkennke at openjdk.java.net Mon Oct 18 10:43:43 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 18 Oct 2021 10:43:43 GMT Subject: [master] RFR: Load Klass from header, C1/x86 implementation [v2] In-Reply-To: References: Message-ID: > This implements loading the Klass* from object header (instead of dedicated Klass* field) in C1 generated code. It introduces a new C1 LIR opcode (LoadKlass) and related CodeStub for the slow-path runtime-call. > > The original implementation was brittle in that it emits the decode_klass() whenever it encounters a load that is T_ADDRESS and from klass_offset_in_bytes() (e.g. 4 or 8). Lucky that we did not seem to emit any unrelated such loads. This new implementation uses a dedicated C1 op instead, and expands this to the corresponding load of the header word, and call into slow-path upon fast-path-failure, in the LIR assembler. > > Testing: > - [x] tier1 (x86_64,x86_32) > - [x] tier2 (x86_64,x86_32) > - [ ] tier3 (x86_64,x86_32) > - [ ] tier4 (x86_64,x86_32) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Don't create load_klass C1 runtime call stub in 32bit builds ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/20/files - new: https://git.openjdk.java.net/lilliput/pull/20/files/a04787d8..7f89b314 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=20&range=01 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=20&range=00-01 Stats: 6 lines in 1 file changed: 2 ins; 4 del; 0 mod Patch: https://git.openjdk.java.net/lilliput/pull/20.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/20/head:pull/20 PR: https://git.openjdk.java.net/lilliput/pull/20 From rkennke at openjdk.java.net Mon Oct 18 10:56:43 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 18 Oct 2021 10:56:43 GMT Subject: [master] RFR: Preserve Klass* in 
leak profiler [v2] In-Reply-To: References: Message-ID: <6RU3SMWR-uLofjP84micdCLQ5DrDbMqPzN_69KSH2II=.b72c835d-aa64-4aa0-aa49-512f9533405c@github.com> > The JFR leak profiler uses the mark word for marking objects during traversal, and for storing pointers to Edges. We need to preserve the compressed Klass* in the upper part of the mark-word during the traversal. > > The change is based on: openjdk/jdk/pull/5918 > > Testing: > - [x] jdk/jfr (x86_64) > - [x] jdk/jfr (x86_32) > - [x] tier1 > - [x] tier2 Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Cherry-pick 8275277: assert(dest_attr.is_in_cset() == (obj->forwardee() == obj)) failed ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/21/files - new: https://git.openjdk.java.net/lilliput/pull/21/files/c1368efb..a49c72e1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=21&range=01 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=21&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/lilliput/pull/21.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/21/head:pull/21 PR: https://git.openjdk.java.net/lilliput/pull/21 From rkennke at openjdk.java.net Mon Oct 18 12:40:39 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 18 Oct 2021 12:40:39 GMT Subject: [master] RFR: Fix windows build Message-ID: As title says. We need to use one more jmp instead of jmpb in the asm routine for load_klass(). 
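
As background on jmp versus jmpb: on x86, the short jump form carries a signed 8-bit displacement, measured from the end of the jump instruction, while the near form carries a 32-bit one. The sketch below only models that reachability rule; the helper names are made up for this illustration and it says nothing about why the Windows build in particular produced a longer distance.

```cpp
#include <cstdint>

// Instruction sizes of the two x86 encodings: EB rel8 and E9 rel32.
constexpr int kShortJmpSize = 2;
constexpr int kNearJmpSize  = 5;

// A short jump can only encode displacements in [-128, 127].
constexpr bool fits_rel8(int64_t disp) {
  return disp >= INT8_MIN && disp <= INT8_MAX;
}

// jump_pc: address of the first byte of the jump; target: address of the label.
// The displacement is relative to the address of the next instruction.
constexpr bool short_jump_reaches(int64_t jump_pc, int64_t target) {
  return fits_rel8(target - (jump_pc + kShortJmpSize));
}

static_assert(short_jump_reaches(0, kShortJmpSize + 127), "farthest forward target");
static_assert(!short_jump_reaches(0, kShortJmpSize + 128), "one byte too far");
```

If the code between the jump and its label grows past that range, the short form no longer fits and the 5-byte near form is needed, which in turn makes the emitted code slightly larger.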
------------- Commit messages: - Use jmp instead of jmpb in load_klass asm routine Changes: https://git.openjdk.java.net/lilliput/pull/24/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=00 Stats: 15 lines in 4 files changed: 0 ins; 11 del; 4 mod Patch: https://git.openjdk.java.net/lilliput/pull/24.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/24/head:pull/24 PR: https://git.openjdk.java.net/lilliput/pull/24 From rkennke at openjdk.java.net Mon Oct 18 12:46:28 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 18 Oct 2021 12:46:28 GMT Subject: [master] RFR: Fix windows build [v2] In-Reply-To: References: Message-ID: > As title says. We need to use one more jmp instead of jmpb in the asm routine for load_klass(). Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Revert unrelated changes ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/24/files - new: https://git.openjdk.java.net/lilliput/pull/24/files/818132d0..30625a4e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=01 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=00-01 Stats: 13 lines in 2 files changed: 11 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/lilliput/pull/24.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/24/head:pull/24 PR: https://git.openjdk.java.net/lilliput/pull/24 From rkennke at openjdk.java.net Mon Oct 18 13:54:30 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 18 Oct 2021 13:54:30 GMT Subject: [master] RFR: Preserve Klass* in leak profiler [v3] In-Reply-To: References: Message-ID: > The JFR leak profiler uses the mark word for marking objects during traversal, and for storing pointers to Edges. We need to preserve the compressed Klass* in the upper part of the mark-word during the traversal. 
> > The change is based on: openjdk/jdk/pull/5918 > > Testing: > - [x] jdk/jfr (x86_64) > - [x] jdk/jfr (x86_32) > - [x] tier1 > - [x] tier2 Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: - Merge branch 'master' into fix-jfr-2 - Cherry-pick 8275277: assert(dest_attr.is_in_cset() == (obj->forwardee() == obj)) failed - Merge branch 'master' into fix-jfr-2 - Merge branch 'pull/5918' into fix-jfr-2 - remove -1 - constants - Fix 32bit build - Merge branch 'master' into fix-jfr-2 - In leakprofiler, don't override Klass when marking objects - Merge branch 'pull/5918' into fix-jfr-2 - ... and 2 more: https://git.openjdk.java.net/lilliput/compare/9b17ce55...029c37a1 ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/21/files - new: https://git.openjdk.java.net/lilliput/pull/21/files/a49c72e1..029c37a1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=21&range=02 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=21&range=01-02 Stats: 13 lines in 2 files changed: 11 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/lilliput/pull/21.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/21/head:pull/21 PR: https://git.openjdk.java.net/lilliput/pull/21 From stuefe at openjdk.java.net Tue Oct 12 16:13:16 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 12 Oct 2021 16:13:16 GMT Subject: [master] RFR: Fix some tests to work with +UseCompressedClassPointers [v2] In-Reply-To: References: <7Ae4uSbMVFZs-oXGSWntuDxTkM1IhxmiR8Y5CPRyd1o=.634bd635-8128-410b-9ecd-67fc5ff4e838@github.com> Message-ID: <_WkzXgOfU88J9I8X-Xifk0-7or5CriExJf7cQRP57r8=.de17cbe1-58a3-43c4-b927-c80cde659c0e@github.com> On Tue, 12 Oct 2021 15:34:57 GMT, Roman Kennke wrote: >> Oh, the comment is wrong then. 
>
> Can you check the latest changes? It should be ok now, or isn't it?

Yes it looks good, my comment was off. Sorry for the noise :(

-------------

PR: https://git.openjdk.java.net/lilliput/pull/19

From mgronlun at openjdk.java.net  Mon Oct 18 14:16:11 2021
From: mgronlun at openjdk.java.net (Markus Grönlund)
Date: Mon, 18 Oct 2021 14:16:11 GMT
Subject: [master] RFR: Preserve Klass* in leak profiler [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 18 Oct 2021 13:54:30 GMT, Roman Kennke wrote:

>> The JFR leak profiler uses the mark word for marking objects during traversal, and for storing pointers to Edges. We need to preserve the compressed Klass* in the upper part of the mark-word during the traversal.
>>
>> The change is based on: openjdk/jdk/pull/5918
>>
>> Testing:
>> - [x] jdk/jfr (x86_64)
>> - [x] jdk/jfr (x86_32)
>> - [x] tier1
>> - [x] tier2
>
> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision:
>
> - Merge branch 'master' into fix-jfr-2
> - Cherry-pick 8275277: assert(dest_attr.is_in_cset() == (obj->forwardee() == obj)) failed
> - Merge branch 'master' into fix-jfr-2
> - Merge branch 'pull/5918' into fix-jfr-2
> - remove -1
> - constants
> - Fix 32bit build
> - Merge branch 'master' into fix-jfr-2
> - In leakprofiler, don't override Klass when marking objects
> - Merge branch 'pull/5918' into fix-jfr-2
> - ... and 2 more: https://git.openjdk.java.net/lilliput/compare/c61d7c66...029c37a1

JFR changes look good. The only thing I am unsure of is the #ifdef _LP64 "has_displaced_mark_helper()", but I assume this is needed for Lilliput.
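
The idea of keeping the compressed Klass* in the upper part of the mark word while the profiler reuses the rest can be sketched as plain bit manipulation. This is an illustration only: it assumes a 64-bit mark word with the narrowKlass in the upper 32 bits, and the constant and function names are invented here rather than taken from the PR.

```cpp
#include <cstdint>

// Assumed layout for illustration: [ narrowKlass : 32 | everything else : 32 ].
constexpr int      kKlassShift = 32;
constexpr uint64_t kKlassMask  = 0xFFFFFFFFull << kKlassShift;
constexpr uint64_t kLowMask    = ~kKlassMask;

// Extract the compressed class pointer from a mark word.
constexpr uint32_t narrow_klass(uint64_t mark) {
  return static_cast<uint32_t>(mark >> kKlassShift);
}

// Overwrite the low bits (e.g. with profiler marking data) and keep the klass bits.
constexpr uint64_t with_low_bits(uint64_t mark, uint64_t low) {
  return (mark & kKlassMask) | (low & kLowMask);
}

// Compile-time check: marking must not disturb the klass bits.
static_assert(narrow_klass(with_low_bits((0xCAFEBABEull << 32) | 0x1, 0x3)) == 0xCAFEBABEu,
              "klass bits survive marking");
```

A displaced header complicates this only in where the original mark word is read from; the masking itself stays the same.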
-------------

PR: https://git.openjdk.java.net/lilliput/pull/21

From rkennke at openjdk.java.net  Mon Oct 18 14:19:13 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Mon, 18 Oct 2021 14:19:13 GMT
Subject: [master] RFR: Preserve Klass* in leak profiler [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 18 Oct 2021 14:13:15 GMT, Markus Grönlund wrote:

> JFR changes look good. The only thing I am unsure of is the #ifdef _LP64 "has_displaced_mark_helper()", but I assume this is needed for Lilliput.

Yes, it fetches the actual mark-word, even if it has been displaced, extracts the narrowKlass from it (upper 32 bits), and puts it back into the marked header. Otherwise we'd lose the Klass information of the object during the leak profiler's traversal of the object graph.

-------------

PR: https://git.openjdk.java.net/lilliput/pull/21

From rkennke at openjdk.java.net  Mon Oct 18 14:29:31 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Mon, 18 Oct 2021 14:29:31 GMT
Subject: [master] RFR: Fix windows build [v3]
In-Reply-To:
References:
Message-ID:

> As title says. We need to use one more jmp instead of jmpb in the asm routine for load_klass().
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Increase stub size estimate ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/24/files - new: https://git.openjdk.java.net/lilliput/pull/24/files/30625a4e..c1dcf68f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=02 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/lilliput/pull/24.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/24/head:pull/24 PR: https://git.openjdk.java.net/lilliput/pull/24 From shade at openjdk.java.net Mon Oct 18 14:50:07 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 18 Oct 2021 14:50:07 GMT Subject: [master] RFR: Fix windows build [v3] In-Reply-To: References: Message-ID: <8E3Gn51yp8XVEaF-v5jbwa_NT64V4hjnSNI8YpIdbRM=.44320c89-25eb-40a2-b0c9-d4985977626c@github.com> On Mon, 18 Oct 2021 14:29:31 GMT, Roman Kennke wrote: >> As title says. We need to use one more jmp instead of jmpb in the asm routine for load_klass(). > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Increase stub size estimate Looks fine. ------------- Marked as reviewed by shade (Committer). PR: https://git.openjdk.java.net/lilliput/pull/24 From rkennke at openjdk.java.net Mon Oct 18 18:35:29 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 18 Oct 2021 18:35:29 GMT Subject: [master] RFR: Fix windows build [v4] In-Reply-To: References: Message-ID: > As title says. We need to use one more jmp instead of jmpb in the asm routine for load_klass(). 
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Bump the vtable-stubs limit even more, according to tests ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/24/files - new: https://git.openjdk.java.net/lilliput/pull/24/files/c1dcf68f..7f57e3d4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=03 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/lilliput/pull/24.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/24/head:pull/24 PR: https://git.openjdk.java.net/lilliput/pull/24 From shade at openjdk.java.net Tue Oct 19 07:49:05 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 19 Oct 2021 07:49:05 GMT Subject: [master] RFR: Fix windows build [v4] In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 18:35:29 GMT, Roman Kennke wrote: >> As title says. We need to use one more jmp instead of jmpb in the asm routine for load_klass(). > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Bump the vtable-stubs limit even more, according to tests Marked as reviewed by shade (Committer). ------------- PR: https://git.openjdk.java.net/lilliput/pull/24 From rkennke at openjdk.java.net Tue Oct 19 08:54:37 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 19 Oct 2021 08:54:37 GMT Subject: [master] RFR: Fix windows build [v5] In-Reply-To: References: Message-ID: > As title says. We need to use one more jmp instead of jmpb in the asm routine for load_klass(). 
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Increase vtable stub limit for release builds ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/24/files - new: https://git.openjdk.java.net/lilliput/pull/24/files/7f57e3d4..184d8585 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=04 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=24&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/lilliput/pull/24.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/24/head:pull/24 PR: https://git.openjdk.java.net/lilliput/pull/24 From rkennke at openjdk.java.net Tue Oct 19 12:05:20 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 19 Oct 2021 12:05:20 GMT Subject: [master] Integrated: Fix windows build In-Reply-To: References: Message-ID: <1MEAbrKnaZv8Y2i4DCCgWAEIkW-4YdbC1us24evbSpw=.30d32ca0-b63c-4ea8-98da-7aafb240eb2b@github.com> On Mon, 18 Oct 2021 12:35:23 GMT, Roman Kennke wrote: > As title says. We need to use one more jmp instead of jmpb in the asm routine for load_klass(). This pull request has now been integrated. Changeset: cfb582b1 Author: Roman Kennke URL: https://git.openjdk.java.net/lilliput/commit/cfb582b141ba925c649a914bc55c4f69b8e6b2ba Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Fix windows build Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/lilliput/pull/24 From rkennke at openjdk.java.net Tue Oct 19 12:52:47 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 19 Oct 2021 12:52:47 GMT Subject: [master] RFR: Preserve Klass* in leak profiler [v4] In-Reply-To: References: Message-ID: > The JFR leak profiler uses the mark word for marking objects during traversal, and for storing pointers to Edges. We need to preserve the compressed Klass* in the upper part of the mark-word during the traversal. 
> > The change is based on: openjdk/jdk/pull/5918 > > Testing: > - [x] jdk/jfr (x86_64) > - [x] jdk/jfr (x86_32) > - [x] tier1 > - [x] tier2 Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: - Merge branch 'master' into fix-jfr-2 - Merge branch 'master' into fix-jfr-2 - Cherry-pick 8275277: assert(dest_attr.is_in_cset() == (obj->forwardee() == obj)) failed - Merge branch 'master' into fix-jfr-2 - Merge branch 'pull/5918' into fix-jfr-2 - remove -1 - constants - Fix 32bit build - Merge branch 'master' into fix-jfr-2 - In leakprofiler, don't override Klass when marking objects - ... and 3 more: https://git.openjdk.java.net/lilliput/compare/eeef004d...bf8aa74a ------------- Changes: - all: https://git.openjdk.java.net/lilliput/pull/21/files - new: https://git.openjdk.java.net/lilliput/pull/21/files/029c37a1..bf8aa74a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=lilliput&pr=21&range=03 - incr: https://webrevs.openjdk.java.net/?repo=lilliput&pr=21&range=02-03 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/lilliput/pull/21.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/21/head:pull/21 PR: https://git.openjdk.java.net/lilliput/pull/21 From rkennke at openjdk.java.net Tue Oct 19 15:18:22 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 19 Oct 2021 15:18:22 GMT Subject: [master] Integrated: Preserve Klass* in leak profiler In-Reply-To: References: Message-ID: On Wed, 13 Oct 2021 09:32:58 GMT, Roman Kennke wrote: > The JFR leak profiler uses the mark word for marking objects during traversal, and for storing pointers to Edges. We need to preserve the compressed Klass* in the upper part of the mark-word during the traversal. 
> > The change is based on: openjdk/jdk/pull/5918 > > Testing: > - [x] jdk/jfr (x86_64) > - [x] jdk/jfr (x86_32) > - [x] tier1 > - [x] tier2 This pull request has now been integrated. Changeset: 46504d85 Author: Roman Kennke URL: https://git.openjdk.java.net/lilliput/commit/46504d85c7f0740828ca93289782c0a642fc9b34 Stats: 113 lines in 4 files changed: 81 ins; 23 del; 9 mod Preserve Klass* in leak profiler Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/lilliput/pull/21 From rkennke at openjdk.java.net Tue Oct 19 16:55:30 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 19 Oct 2021 16:55:30 GMT Subject: [master] RFR: Load Klass* without causing monitor inflation Message-ID: Until now, loading the Klass* in Lilliput may cause monitor inflation, because we needed a way to safely load the mark-word in the face of concurrent stack-locking or inflation happening. However, this caused troubles with concurrent GCs, because they may attempt to inflate monitors on from-space objects while traversing the heap for relocation/evacuation. Also, this may be a performance nuisance. It turns out that we don't have to fully inflate the monitor: we only need a partial inflation, up to where we install the transient INFLATING word to prevent concurrent threads from messing with the mark-word, and then read and return the mark-word safely, and swing back the original real mark word to let other threads continue. 
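
The partial-inflation idea described above can be sketched with a toy model. This is not the code from the PR and not HotSpot's real mark word layout: the lock-bit encoding, the `Object` struct and `safe_load_mark_sketch` are assumptions made for illustration, and inflated monitors are deliberately left out. It shows the three steps: install the transient INFLATING value (0) with a CAS, read the displaced header while other threads are blocked out, then swing the original word back.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Toy encoding (assumed): low bits 01 = unlocked, 00 = stack-locked with the
// rest of the word pointing at the displaced header, whole word 0 = INFLATING.
constexpr uintptr_t kLockMask  = 0x3;
constexpr uintptr_t kUnlocked  = 0x1;
constexpr uintptr_t kInflating = 0;

struct Object {
  std::atomic<uintptr_t> mark;
  explicit Object(uintptr_t m) : mark(m) {}
};

// Returns the "real" (non-displaced) mark word without inflating a monitor.
inline uintptr_t safe_load_mark_sketch(Object& obj) {
  for (;;) {
    uintptr_t m = obj.mark.load();
    if (m == kInflating) continue;               // someone else holds the word; retry
    if ((m & kLockMask) == kUnlocked) return m;  // fast path: header is in place
    if ((m & kLockMask) != 0) std::abort();      // monitor states are not modelled here
    // Stack-locked: block out competing threads by installing INFLATING.
    uintptr_t expected = m;
    if (!obj.mark.compare_exchange_strong(expected, kInflating)) continue;
    const uintptr_t* slot = reinterpret_cast<const uintptr_t*>(m);
    assert(slot != nullptr);
    uintptr_t displaced = *slot;                 // safe to read while INFLATING is installed
    obj.mark.store(m);                           // swing back the original word
    return displaced;
  }
}
```

In this model the caller only ever reads the header, which is why the word can be restored unchanged instead of being replaced by a monitor pointer.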
Testing: - [x] tier1 - [x] tier2 - [ ] tier3 - [ ] tier4 - [ ] hotspot_gc ------------- Commit messages: - Load Klass* without causing monitor inflation Changes: https://git.openjdk.java.net/lilliput/pull/25/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=25&range=00 Stats: 77 lines in 2 files changed: 75 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/lilliput/pull/25.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/25/head:pull/25 PR: https://git.openjdk.java.net/lilliput/pull/25 From shade at openjdk.java.net Wed Oct 20 09:39:28 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 20 Oct 2021 09:39:28 GMT Subject: [master] RFR: Load Klass* without causing monitor inflation In-Reply-To: References: Message-ID: <3J_7oj_7S2dlPsJ9Zv1uw-gKvvuMBWcFF5EHeiKD5w0=.f92cb75f-6008-4d92-be6d-c065388cf7b1@github.com> On Tue, 19 Oct 2021 14:44:12 GMT, Roman Kennke wrote: > Until now, loading the Klass* in Lilliput may cause monitor inflation, because we needed a way to safely load the mark-word in the face of concurrent stack-locking or inflation happening. However, this caused troubles with concurrent GCs, because they may attempt to inflate monitors on from-space objects while traversing the heap for relocation/evacuation. Also, this may be a performance nuisance. > It turns out that we don't have to fully inflate the monitor: we only need a partial inflation, up to where we install the transient INFLATING word to prevent concurrent threads from messing with the mark-word, and then read and return the mark-word safely, and swing back the original real mark word to let other threads continue. > > Testing: > - [x] tier1 > - [x] tier2 > - [x] tier3 > - [ ] tier4 > - [ ] hotspot_gc OK, I am approving this because `ObjectSynchronizer::stable_mark` is Lilliput-specific, and so changes there do not affect the upstream locking code (apart from Lilliput-specific paths). ------------- Marked as reviewed by shade (Committer). 
PR: https://git.openjdk.java.net/lilliput/pull/25 From shade at openjdk.java.net Wed Oct 20 09:52:35 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 20 Oct 2021 09:52:35 GMT Subject: [master] RFR: Load Klass* without causing monitor inflation In-Reply-To: References: Message-ID: <5D3r4HRcnQoWmOe1lNqT-cLyx1r9HoxVFGxNROUqLX0=.74f4f75c-2cb9-4fed-85c0-dc58fd187938@github.com> On Tue, 19 Oct 2021 14:44:12 GMT, Roman Kennke wrote: > Until now, loading the Klass* in Lilliput may cause monitor inflation, because we needed a way to safely load the mark-word in the face of concurrent stack-locking or inflation happening. However, this caused troubles with concurrent GCs, because they may attempt to inflate monitors on from-space objects while traversing the heap for relocation/evacuation. Also, this may be a performance nuisance. > It turns out that we don't have to fully inflate the monitor: we only need a partial inflation, up to where we install the transient INFLATING word to prevent concurrent threads from messing with the mark-word, and then read and return the mark-word safely, and swing back the original real mark word to let other threads continue. > > Testing: > - [x] tier1 > - [x] tier2 > - [x] tier3 > - [ ] tier4 > - [ ] hotspot_gc By the way, if this resolves the concurrent GC problems, should GHA workaround be removed too? 
https://builds.shipilev.net/patch-openjdk-lilliput/.github/workflows/submit.yml.udiff.html ------------- PR: https://git.openjdk.java.net/lilliput/pull/25 From rkennke at openjdk.java.net Wed Oct 20 10:37:30 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 20 Oct 2021 10:37:30 GMT Subject: [master] RFR: Load Klass* without causing monitor inflation In-Reply-To: <3J_7oj_7S2dlPsJ9Zv1uw-gKvvuMBWcFF5EHeiKD5w0=.f92cb75f-6008-4d92-be6d-c065388cf7b1@github.com> References: <3J_7oj_7S2dlPsJ9Zv1uw-gKvvuMBWcFF5EHeiKD5w0=.f92cb75f-6008-4d92-be6d-c065388cf7b1@github.com> Message-ID: On Wed, 20 Oct 2021 09:36:25 GMT, Aleksey Shipilev wrote: > OK, I am approving this because `ObjectSynchronizer::stable_mark` is Lilliput-specific, and so changes there do not affect the upstream locking code (apart from Lilliput-specific paths). Thank you! Also, notice how safe_load_mark() is a simplified version of inflate(), much like stable_mark() is a simplified version of FastHashCode(). I literally copied both original methods, and stripped out the parts that are not needed. And they are not needed because in one case I need to establish a mark-word for writing (e.g. hash-code or lock), and in the case of loading the mark-word for getting a Klass* I only need to read it. The core protocol (installing 0 to block out competing threads) remains the same though. ------------- PR: https://git.openjdk.java.net/lilliput/pull/25 From stuefe at openjdk.java.net Fri Oct 22 15:28:39 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 22 Oct 2021 15:28:39 GMT Subject: [master] RFR: aarch64: init Klass* in markword in C1 & template interpreter Message-ID: This is a minimal patch to make Lilliput on aarch64 work. It fixes the initialization of the object header for C1 and template interpreter. I'd like to get this in as preparation for my classpointer-shrinking-work (https://github.com/openjdk/lilliput/pull/13). 
I would like to test that on aarch64 too since it has a noticeably different way of encoding Klass*. Note that I did not touch MacroAssembler::load_klass(), we still pull Klass* pointer from the old place there. I did some very basic tests (gtests, some of my metaspace tests) but nothing more so far. All I have is an underpowered Raspi and testing needs a lot of patience... ------------- Commit messages: - aarch64 init Klass* in markword in C1 & template interpreter Changes: https://git.openjdk.java.net/lilliput/pull/26/files Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=26&range=00 Stats: 3 lines in 2 files changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.java.net/lilliput/pull/26.diff Fetch: git fetch https://git.openjdk.java.net/lilliput pull/26/head:pull/26 PR: https://git.openjdk.java.net/lilliput/pull/26 From rkennke at openjdk.java.net Sat Oct 23 16:18:29 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Sat, 23 Oct 2021 16:18:29 GMT Subject: [master] RFR: aarch64: init Klass* in markword in C1 & template interpreter In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 15:23:17 GMT, Thomas Stuefe wrote: > This is a minimal patch to make Lilliput on aarch64 work. It fixes the initialization of the object header for C1 and template interpreter. I'd like to get this in as preparation for my classpointer-shrinking-work (https://github.com/openjdk/lilliput/pull/13). I would like to test that on aarch64 too since it has a noticeably different way of encoding Klass*. > > Note that I did not touch MacroAssembler::load_klass(), we still pull Klass* pointer from the old place there. > > I did some very basic tests (gtests, some of my metaspace tests) but nothing more so far. All I have is an underpowered Raspi and testing needs a lot of patience... Looks correct. Thank you! ------------- Marked as reviewed by rkennke (Lead). 
PR: https://git.openjdk.java.net/lilliput/pull/26 From stuefe at openjdk.java.net Sat Oct 23 16:49:25 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 23 Oct 2021 16:49:25 GMT Subject: [master] RFR: aarch64: init Klass* in markword in C1 & template interpreter In-Reply-To: References: Message-ID: On Sat, 23 Oct 2021 16:15:40 GMT, Roman Kennke wrote: > Looks correct. Thank you! Great, thanks! What's your protocol with Lilliput, should I wait for a second reviewer? ------------- PR: https://git.openjdk.java.net/lilliput/pull/26 From rkennke at openjdk.java.net Sat Oct 23 16:57:23 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Sat, 23 Oct 2021 16:57:23 GMT Subject: [master] RFR: aarch64: init Klass* in markword in C1 & template interpreter In-Reply-To: References: Message-ID: On Sat, 23 Oct 2021 16:46:20 GMT, Thomas Stuefe wrote: > > Looks correct. Thank you! > > Great, thanks! What's your protocol with Lilliput, should I wait for a second reviewer? No, one reviewer is enough. Please integrate the change! ------------- PR: https://git.openjdk.java.net/lilliput/pull/26 From stuefe at openjdk.java.net Sat Oct 23 18:15:26 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 23 Oct 2021 18:15:26 GMT Subject: [master] Integrated: aarch64: init Klass* in markword in C1 & template interpreter In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 15:23:17 GMT, Thomas Stuefe wrote: > This is a minimal patch to make Lilliput on aarch64 work. It fixes the initialization of the object header for C1 and template interpreter. I'd like to get this in as preparation for my classpointer-shrinking-work (https://github.com/openjdk/lilliput/pull/13). I would like to test that on aarch64 too since it has a noticeably different way of encoding Klass*. > > Note that I did not touch MacroAssembler::load_klass(), we still pull Klass* pointer from the old place there. 
> > I did some very basic tests (gtests, some of my metaspace tests) but nothing more so far. All I have is an underpowered Raspi and testing needs a lot of patience... This pull request has now been integrated. Changeset: 77f70ce4 Author: Thomas Stuefe URL: https://git.openjdk.java.net/lilliput/commit/77f70ce4678f2da86c980efd63f6e7145a024035 Stats: 3 lines in 2 files changed: 0 ins; 1 del; 2 mod aarch64: init Klass* in markword in C1 & template interpreter Reviewed-by: rkennke ------------- PR: https://git.openjdk.java.net/lilliput/pull/26 From aph at openjdk.java.net Wed Oct 27 09:07:33 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 27 Oct 2021 09:07:33 GMT Subject: [master] RFR: Load Klass from header, C1/x86 implementation [v2] In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 10:23:36 GMT, Roman Kennke wrote: >> src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp line 1260: >> >>> 1258: >>> 1259: case T_ADDRESS: >>> 1260: __ movptr(dest->as_register(), from_addr); >> >> Can we / should we `assert(addr->disp() != oopDesc::klass_offset_in_bytes(), "sanity")` here and later? So that the non-`LoadKlass` loads would break early? > > Not sure, tbh. The current implementation is quite brittle, and we are lucky that we have apparently never hit (emit) a load with offset 8 (or 4 on x86_32) which happens to be klass_offset_in_bytes(). Well, yeah. I was appalled when I saw that when translating into AArch64 code. I thought "Are you serious?" ------------- PR: https://git.openjdk.java.net/lilliput/pull/20 From rkennke at openjdk.java.net Wed Oct 27 09:37:24 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 27 Oct 2021 09:37:24 GMT Subject: [master] RFR: Load Klass from header, C1/x86 implementation [v2] In-Reply-To: References: Message-ID: On Wed, 27 Oct 2021 09:04:57 GMT, Andrew Haley wrote: >> Not sure, tbh. 
>> The current implementation is quite brittle, and we are lucky that we have apparently never hit (emit) a load with offset 8 (or 4 on x86_32) which happens to be klass_offset_in_bytes().
>
> Well, yeah. I was appalled when I saw that when translating into AArch64 code. I thought "Are you serious?"

Yeah. I think I should probably bring much of this PR upstream.

-------------

PR: https://git.openjdk.java.net/lilliput/pull/20

From rkennke at openjdk.java.net  Thu Oct 28 13:27:49 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Thu, 28 Oct 2021 13:27:49 GMT
Subject: [master] RFR: Rendezvous GC threads under STS for monitor deflation
Message-ID:

Object monitors are deflated concurrently by the MonitorDeflationThread. It first unlinks monitors from objects (i.e. restores the original object header), then handshakes (with a no-op) all Java threads, and only then destroys the monitors. This way, Java threads can safely (and racily) access monitors before the handshake, because the monitors are guaranteed to still exist when a Java thread racily reads a mark-word that is being unlinked, and the monitor can safely be destroyed after the handshake, because all Java threads would then read the correct unlinked mark-word.

However, GC threads are not rendezvous'ed like that, and can read potentially dead monitors.

In order to safely access monitors via object headers concurrently from GC threads, we need to rendezvous them after unlinking and before destroying the monitors, just like Java threads do, via handshake. This is important so that concurrent GCs (ZGC, Shenandoah, G1) can safely access an object's Klass* (and thus object size, layout, etc) during concurrent GC phases.

This only implements the parts that do the rendezvous; it still requires that affected concurrent GC threads are under SuspendibleThreadSet. This will be implemented in a later PR.
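
The unlink, rendezvous, destroy ordering described above can be sketched in a few lines. This is a deliberately simplified stand-in, not the mechanism in the PR: a plain counter of active readers plays the role that handshakes and STS synchronization play in HotSpot, "destroying" is modelled by setting a flag so the effect can be observed, and all names (`Monitor`, `Obj`, `read_once`, `deflate`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Monitor {
  std::atomic<bool> destroyed{false};  // stands in for actually freeing the monitor
};

struct Obj {
  std::atomic<Monitor*> header{nullptr};  // stands in for a header pointing at a monitor
  std::atomic<int>      active_readers{0};
};

// Reader side (think: a GC thread peeking at the header).
// Returns false if it ever observed a monitor that was already destroyed.
inline bool read_once(Obj& o) {
  o.active_readers.fetch_add(1);                     // announce before reading the header
  Monitor* m = o.header.load();                      // may still see the old link
  bool ok = (m == nullptr) || !m->destroyed.load();
  o.active_readers.fetch_sub(1);                     // done with whatever was read
  return ok;
}

// Deflater side: unlink first, rendezvous second, destroy last.
inline void deflate(Obj& o) {
  Monitor* m = o.header.exchange(nullptr);           // 1. unlink
  if (m == nullptr) return;
  while (o.active_readers.load() != 0) {             // 2. wait out readers that may hold m
    std::this_thread::yield();
  }
  assert(!m->destroyed.load());
  m->destroyed.store(true);                          // 3. only now destroy
}
```

With the default sequentially consistent atomics, a reader that saw the old link must have announced itself before the unlink, so the deflater cannot pass step 2 until that reader is done. Skipping step 2 is exactly the situation in which a reader could observe a dead monitor.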
Testing:
- [x] tier1
- [x] tier2
- [x] tier3
- [x] tier4

-------------

Commit messages:
- Add missing override
- Rendezvous GC threads under STS for monitor deflation

Changes: https://git.openjdk.java.net/lilliput/pull/27/files
Webrev: https://webrevs.openjdk.java.net/?repo=lilliput&pr=27&range=00
Stats: 17 lines in 2 files changed: 17 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/lilliput/pull/27.diff
Fetch: git fetch https://git.openjdk.java.net/lilliput pull/27/head:pull/27

PR: https://git.openjdk.java.net/lilliput/pull/27

From eosterlund at openjdk.java.net Thu Oct 28 13:27:49 2021
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Thu, 28 Oct 2021 13:27:49 GMT
Subject: [master] RFR: Rendezvous GC threads under STS for monitor deflation
In-Reply-To:
References:
Message-ID:

On Tue, 26 Oct 2021 16:08:56 GMT, Roman Kennke wrote:

> Object monitors are deflated concurrently by the MonitorDeflationThread. It first unlinks monitors from objects (i.e. restores the original object header), then handshakes (with a no-op) all Java threads, and only then destroys the monitors. This way, Java threads can safely (and racily) access monitors before the handshake, because the monitors are guaranteed to still exist when a Java thread racily reads a mark-word that is being unlinked, and the monitor can safely be destroyed after the handshake, because all Java threads would then read the correct unlinked mark-word.
>
> However, GC threads are not rendezvous'ed like that, and can read potentially dead monitors.
>
> In order to safely access monitors via object headers concurrently from GC threads, we need to rendezvous them after unlinking and before destroying the monitors, just like Java threads do, via handshake. This is important so that concurrent GCs (ZGC, Shenandoah, G1) can safely access an object's Klass* (and thus object size, layout, etc) during concurrent GC phases.
>
> This only implements the parts that do the rendezvous; it still requires that affected concurrent GC threads are under the SuspendibleThreadSet. This will be implemented in a later PR.
>
> Testing:
> - [x] tier1
> - [x] tier2
> - [x] tier3
> - [x] tier4

Looks good.

-------------

Marked as reviewed by eosterlund (Committer).

PR: https://git.openjdk.java.net/lilliput/pull/27

From shade at openjdk.java.net Thu Oct 28 17:36:39 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 28 Oct 2021 17:36:39 GMT
Subject: [master] RFR: Rendezvous GC threads under STS for monitor deflation
In-Reply-To:
References:
Message-ID:

On Tue, 26 Oct 2021 16:08:56 GMT, Roman Kennke wrote:

> Object monitors are deflated concurrently by the MonitorDeflationThread. It first unlinks monitors from objects (i.e. restores the original object header), then handshakes (with a no-op) all Java threads, and only then destroys the monitors. This way, Java threads can safely (and racily) access monitors before the handshake, because the monitors are guaranteed to still exist when a Java thread racily reads a mark-word that is being unlinked, and the monitor can safely be destroyed after the handshake, because all Java threads would then read the correct unlinked mark-word.
>
> However, GC threads are not rendezvous'ed like that, and can read potentially dead monitors.
>
> In order to safely access monitors via object headers concurrently from GC threads, we need to rendezvous them after unlinking and before destroying the monitors, just like Java threads do, via handshake. This is important so that concurrent GCs (ZGC, Shenandoah, G1) can safely access an object's Klass* (and thus object size, layout, etc) during concurrent GC phases.
>
> This only implements the parts that do the rendezvous; it still requires that affected concurrent GC threads are under the SuspendibleThreadSet. This will be implemented in a later PR.
>
> Testing:
> - [x] tier1
> - [x] tier2
> - [x] tier3
> - [x] tier4

Marked as reviewed by shade (Committer).

-------------

PR: https://git.openjdk.java.net/lilliput/pull/27

From rkennke at openjdk.java.net Thu Oct 28 18:19:46 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Thu, 28 Oct 2021 18:19:46 GMT
Subject: [master] Integrated: Rendezvous GC threads under STS for monitor deflation
In-Reply-To:
References:
Message-ID:

On Tue, 26 Oct 2021 16:08:56 GMT, Roman Kennke wrote:

> Object monitors are deflated concurrently by the MonitorDeflationThread. It first unlinks monitors from objects (i.e. restores the original object header), then handshakes (with a no-op) all Java threads, and only then destroys the monitors. This way, Java threads can safely (and racily) access monitors before the handshake, because the monitors are guaranteed to still exist when a Java thread racily reads a mark-word that is being unlinked, and the monitor can safely be destroyed after the handshake, because all Java threads would then read the correct unlinked mark-word.
>
> However, GC threads are not rendezvous'ed like that, and can read potentially dead monitors.
>
> In order to safely access monitors via object headers concurrently from GC threads, we need to rendezvous them after unlinking and before destroying the monitors, just like Java threads do, via handshake. This is important so that concurrent GCs (ZGC, Shenandoah, G1) can safely access an object's Klass* (and thus object size, layout, etc) during concurrent GC phases.
>
> This only implements the parts that do the rendezvous; it still requires that affected concurrent GC threads are under the SuspendibleThreadSet. This will be implemented in a later PR.
>
> Testing:
> - [x] tier1
> - [x] tier2
> - [x] tier3
> - [x] tier4

This pull request has now been integrated.
Changeset: 644138d5
Author: Roman Kennke
URL: https://git.openjdk.java.net/lilliput/commit/644138d5c81e6423826ac920aca108425e955bf5
Stats: 17 lines in 2 files changed: 17 ins; 0 del; 0 mod

Rendezvous GC threads under STS for monitor deflation

Reviewed-by: eosterlund, shade

-------------

PR: https://git.openjdk.java.net/lilliput/pull/27

From rkennke at openjdk.java.net Thu Oct 28 21:26:31 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Thu, 28 Oct 2021 21:26:31 GMT
Subject: [master] RFR: Rendezvous GC threads under STS for monitor deflation
In-Reply-To:
References:
Message-ID:

On Thu, 28 Oct 2021 21:21:22 GMT, David Holmes wrote:

> Why does a GC thread need to access a monitor?

It needs to access the Klass*, mainly for size & layout (heap parsing). And the Klass* is stored in the header. When an object is locked via a monitor, the GC thread needs to be able to reach through to the displaced header in the monitor.

-------------

PR: https://git.openjdk.java.net/lilliput/pull/27

From dholmes at openjdk.java.net Thu Oct 28 21:26:31 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 28 Oct 2021 21:26:31 GMT
Subject: [master] RFR: Rendezvous GC threads under STS for monitor deflation
In-Reply-To:
References:
Message-ID:

On Tue, 26 Oct 2021 16:08:56 GMT, Roman Kennke wrote:

> Object monitors are deflated concurrently by the MonitorDeflationThread. It first unlinks monitors from objects (i.e. restores the original object header), then handshakes (with a no-op) all Java threads, and only then destroys the monitors. This way, Java threads can safely (and racily) access monitors before the handshake, because the monitors are guaranteed to still exist when a Java thread racily reads a mark-word that is being unlinked, and the monitor can safely be destroyed after the handshake, because all Java threads would then read the correct unlinked mark-word.
>
> However, GC threads are not rendezvous'ed like that, and can read potentially dead monitors.
>
> In order to safely access monitors via object headers concurrently from GC threads, we need to rendezvous them after unlinking and before destroying the monitors, just like Java threads do, via handshake. This is important so that concurrent GCs (ZGC, Shenandoah, G1) can safely access an object's Klass* (and thus object size, layout, etc) during concurrent GC phases.
>
> This only implements the parts that do the rendezvous; it still requires that affected concurrent GC threads are under the SuspendibleThreadSet. This will be implemented in a later PR.
>
> Testing:
> - [x] tier1
> - [x] tier2
> - [x] tier3
> - [x] tier4

Why does a GC thread need to access a monitor?

-------------

PR: https://git.openjdk.java.net/lilliput/pull/27
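
A rough sketch of what "reach through to the displaced header" means for a reader that wants the Klass*. Everything here is a toy model under stated assumptions: the two low lock bits use the values 1 (unlocked) and 2 (monitor) as in HotSpot's mark word, but `kKlassShift`, `ToyMonitor` and the helper functions are hypothetical, and the stack-locked case is left out entirely.

```cpp
#include <cassert>
#include <cstdint>

// Toy header layout; the shift is a hypothetical choice for this sketch.
constexpr uint64_t kLockMask   = 0x3;  // low two bits hold the lock state
constexpr uint64_t kUnlocked   = 0x1;  // header carries the class bits itself
constexpr uint64_t kMonitor    = 0x2;  // header carries a monitor pointer
constexpr int      kKlassShift = 32;   // assumed position of the narrow Klass

// Hypothetical monitor that parks the original header while it is linked.
struct ToyMonitor {
  uint64_t displaced_header;
};

// Build an unlocked header that carries the narrow Klass in its upper bits.
uint64_t make_header(uint32_t narrow_klass) {
  return (static_cast<uint64_t>(narrow_klass) << kKlassShift) | kUnlocked;
}

// Build a header that points at a monitor (the "locked via monitor" state).
uint64_t make_inflated_header(const ToyMonitor* m) {
  return static_cast<uint64_t>(reinterpret_cast<uintptr_t>(m)) | kMonitor;
}

// Load the narrow Klass: take it straight from an unlocked header, or follow
// the monitor pointer and take it from the displaced header instead.
uint32_t load_narrow_klass(uint64_t header) {
  if ((header & kLockMask) == kMonitor) {
    const ToyMonitor* m = reinterpret_cast<const ToyMonitor*>(
        static_cast<uintptr_t>(header & ~kLockMask));
    header = m->displaced_header;  // this dereference needs a live monitor
  }
  return static_cast<uint32_t>(header >> kKlassShift);
}
```

The dereference in `load_narrow_klass` is the access that is only safe while the monitor has not yet been destroyed, which is why a GC thread doing heap parsing would have to take part in the rendezvous.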