RFR: 8305895: Implementation: JEP 450: Compact Object Headers (Experimental) [v3]

Quan Anh Mai qamai at openjdk.org
Tue May 9 03:02:24 UTC 2023


On Mon, 8 May 2023 19:00:40 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> This is the main body of the JEP 450: Compact Object Headers (Experimental).
>> 
>> Main changes:
>>  - Introduction of the (experimental) flag UseCompactObjectHeaders. All changes in this PR are protected by this flag.
>>  - The compressed Klass* can now be stored in the mark-word of objects. In order to be able to do this, we are building on #10907, #13582 and #13779 to protect the relevant (upper 32) bits of the mark-word. Significant parts of this PR deal with loading the compressed Klass* from the mark-word, and dealing with (monitor-)locked objects. When the object is monitor-locked, we load the displaced mark-word from the monitor, and load the compressed Klass* from there. This PR also changes some code paths (mostly in GCs) to be more careful when accessing Klass* (or mark-word or size) to be able to fetch it from the forwardee in case the object is forwarded, and/or reach through to the monitor when the object is locked by a monitor.
>>  - The identity hash-code is narrowed to 25 bits.
>>  - Instances can now have their base-offset (the offset where the field layouter starts to place fields) at offset 8 (instead of 12 or 16).
>>  - Arrays will can now store their length at offset 8. Due to alignment restrictions, array elements will still start at offset 16. #11044 will resolve that restriction and allow array elements to start at offset 12 (except for long, double and uncompressed oops, which are still required to start at an element-aligned offset).
>>  - CDS can now write and read archives with the compressed header. However, it is not possible to read an archive that has been written with an opposite setting of UseCompactObjectHeaders.
>> 
>> Testing:
>>  (+UseCompactObjectHeaders tests are run with the flag hard-patched into the build, to also catch @flagless tests, and to avoid mismatches with CDS - see above.)
>>  - [x] tier1 (x86_64)
>>  - [x] tier2 (x86_64)
>>  - [x] tier3 (x86_64)
>>  - [ ] tier4 (x86_64)
>>  - [x] tier1 (aarch64)
>>  - [x] tier2 (aarch64)
>>  - [x] tier3 (aarch64)
>>  - [ ] tier4 (aarch64)
>>  - [ ] tier1 (x86_64) +UseCompactObjectHeaders
>>  - [ ] tier2 (x86_64) +UseCompactObjectHeaders
>>  - [ ] tier3 (x86_64) +UseCompactObjectHeaders
>>  - [ ] tier4 (x86_64) +UseCompactObjectHeaders
>>  - [ ] tier1 (aarch64) +UseCompactObjectHeaders
>>  - [ ] tier2 (aarch64) +UseCompactObjectHeaders
>>  - [ ] tier3 (aarch64) +UseCompactObjectHeaders
>>  - [ ] tier4 (aarch64) +UseCompactObjectHeaders
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Allow to resolve mark with LW locking

I'm not sure if this is trivial or significant, but if you limit the class pointer to 30 bit, and use the upper 2 bits for locking, then you can obtain the class pointer in less instructions:

    movl dst, [obj + 4]
    andl dst, 0xBFFFFFFF
    jl slow_path

This exploits the fact that the most significant bit represents a negative number, so it clears the unrelated bit and checks for valid header at the same time, the sequence is only 2 instructions long after macro fusion, compared to the current value of 3.

This also allows quick class comparisons against constants, assuming that most instance is in unlock state, the comparison when equality is likely can be done:

    cmpl [obj + 4], con | 0x40000000
    jne slow_path

This can be matched on an `If` so that the `slow_path` can branch to the `IfTrue` label directly, and the fast path has only 1 comparison and 1 conditional jump.

Thanks.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13844#issuecomment-1539318097


More information about the hotspot-dev mailing list