RFR: 8315884: New Object to ObjectMonitor mapping [v6]

Roman Kennke rkennke at openjdk.org
Fri Jul 12 09:55:52 UTC 2024


On Fri, 12 Jul 2024 05:57:30 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:

>> When inflating a monitor the `ObjectMonitor*` is written directly over the `markWord` and any overwritten data is displaced into a displaced `markWord`. This is problematic for concurrent GCs which needs extra care or looser semantics to use this displaced data. In Lilliput this data also contains the klass forcing this to be something that the GC has to take into account everywhere.
>> 
>> This patch introduces an alternative solution where locking only uses the lock bits of the `markWord` and inflation does not override and displace the `markWord`. This is done by keeping associations between objects and `ObjectMonitor*` in an external hash table. Different caching techniques are used to speedup lookups from compiled code.
>> 
>> A diagnostic VM option is introduced called `UseObjectMonitorTable`. It is only supported in combination with the LM_LIGHTWEIGHT locking mode (the default). 
>> 
>> This patch has been evaluated to be performance neutral when `UseObjectMonitorTable` is turned off (the default). 
>> 
>> Below is a more detailed explanation of this change and how `LM_LIGHTWEIGHT` and `UseObjectMonitorTable` works.
>> 
>> # Cleanups
>> 
>> Cleaned up displaced header usage for:
>>   * BasicLock
>>     * Contains some Zero changes
>>     * Renames one exported JVMCI field
>>   * ObjectMonitor
>>     * Updates comments and tests consistencies
>> 
>> # Refactoring
>> 
>> `ObjectMonitor::enter` has been refactored an a `ObjectMonitorContentionMark` witness object has been introduced to the signatures. Which signals that the contentions reference counter is being held. More details are given below in the section about deflation.
>> 
>> The initial purpose of this was to allow `UseObjectMonitorTable` to interact more seamlessly with the `ObjectMonitor::enter` code. 
>> 
>> _There is even more `ObjectMonitor` refactoring which can be done here to create a more understandable and enforceable API. There are a handful of invariants / assumptions which are not always explicitly asserted which could be trivially abstracted and verified by the type system by using similar witness objects._
>> 
>> # LightweightSynchronizer
>> 
>> Working on adapting and incorporating the following section as a comment in the source code
>> 
>> ## Fast Locking
>> 
>>   CAS on locking bits in markWord. 
>>   0b00 (Fast Locked) <--> 0b01 (Unlocked)
>> 
>>   When locking and 0b00 (Fast Locked) is observed, it may be beneficial to avoid inflating by spinning a bit.
>> 
>>   If 0b10 (Inflated) is observed or there is to...
>
> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update arguments.cpp

I've reviewed and tested some earlier versions of this change in the context of Lilliput, and haven't encountered any showstopping problems. Very nice work!

When you say 'This patch has been evaluated to be performance neutral when UseObjectMonitorTable is turned off (the default).' - what does the performance look like with +UOMT? How does it compare to -UOMT?

I've only reviewed the platform-specific changes so far. Mostly looks good, I only have some relatively minor remarks.

Will review the shared code changes separately.

src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 318:

> 316: 
> 317:       // Loop after unrolling, advance iterator.
> 318:       increment(t3_t, in_bytes(OMCache::oop_to_oop_difference()));

Maybe I am misreading this but... in the unroll loop you avoid emitting the increment on the last iteration, but then you emit it explicitely here? Wouldn't it be cleaner to do it in the unroll loop always and elide the explicit increment after loop?

src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 343:

> 341:     const Register t3_owner = t3;
> 342:     const ByteSize monitor_tag = in_ByteSize(UseObjectMonitorTable ? 0 : checked_cast<int>(markWord::monitor_value));
> 343:     const Address owner_address{t1_monitor, ObjectMonitor::owner_offset() - monitor_tag};

That may be just me, but I found that syntax weird. I first needed to look-up what the {}-initializer actually means. Hiccups like this reduce readability, IMO. I'd prefer the normal ()-init for the Address like we seem to do everywhere else.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 674:

> 672: 
> 673:       // Loop after unrolling, advance iterator.
> 674:       increment(t, in_bytes(OMCache::oop_to_oop_difference()));

Same issue as in aarch64 code.

-------------

Changes requested by rkennke (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/20067#pullrequestreview-2174300266
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675587650
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675597362
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675605009


More information about the core-libs-dev mailing list