RFR: 8373595: A new ObjectMonitorTable implementation

Axel Boldt-Christmas aboldtch at openjdk.org
Tue Jan 27 08:12:42 UTC 2026


On Fri, 23 Jan 2026 10:36:26 GMT, Fredrik Bredberg <fbredberg at openjdk.org> wrote:

> Me and Anton Artemov (@toxaart) investigated a quite large regression that occurred in Pet Clinic that happened if you turned on Compact Object Headers. It was found that a large portion of that regression could be attributed to not finding the monitor in the Object Monitor Cache, and because we couldn't access the Object Monitor Table from C2 generated code, we often had to take the slow path.
> 
> By making the object monitor cache larger and make it use the object's hash value as a key, we managed to mitigate the regression.
> 
> Erik Österlund (@fisk) took that idea and elevated it to the next level, which means that he rewrote the object monitor table code so that we can now search for an object in the global object monitor table from C2 generated code. I.e. from `C2_MacroAssembler::fast_lock()`. As in my and Anton's version, the key is the hash value from the object.
> 
> Erik also provided new barrier code needed for ZGC for x86 and aarch64. Roman Kennke (@rkennke) provided the same for Shenandoah, also for x86 and aarch64.
> 
> We decided to keep the Object Monitor Cache, since we found that for most programs (but not Pet Clinic) the monitor you are looking for is likely found in the first positions of the cache (it's sorted in most recently used order). However we decresed the size from 8 to 2 elements.
> 
> After running extensive performance tests we can say that this has improved the performance in many of them, not only mitigated the regression in Pet Clinic.
> 
> Tests are running okay tier1-7 on supported platforms.
> 
> The rest of the platforms (`ppc`, `riscv` and `s390`) have been smoke tested using QEMU.
> I mainly used this test for smoke testing with QEMU: `-XX:+UnlockDiagnosticVMOptions -XX:+UseObjectMonitorTable ./test/hotspot/jtreg/runtime/Monitor/UseObjectMonitorTableTest.java `

Thanks to everyone involved for implementing this. Very nice with a simpler, specialised hash table we can use from the emitted code. 

Also moving the resizing work to the deflation thread and keeping rebuilding in sync with deflation is something I always thought we should have done. Nice improvement.
Only wonder if we should do something when deflation is turned off. As we will then never `rebuild` and thus have all previous tables forever. (They are diagnostic flags) So maybe we should just not turn off the deflation thread, and instead have it only respond to `ObjectSynchronizer::request_deflate_idle_monitors();` requests, but not do anything timebased.

I have a few comments most related to type safety. 

The new table implementation looks sound.

src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 229:

> 227: 
> 228:       // Save the mark, we might need it to extract the hash.
> 229:       mov(rscratch2, t1_mark);

Using `rscratch2` here is a little bit scary as we call other MacroAssembler functions with the expectation of it being live. 

Guessing the `ldr`, `lea` and `increment` implementations are not likely to change anytime soon, but maybe it is worth a comment saying that we do not expect these to modify the state of `rscratch2`

src/hotspot/share/runtime/lockStack.hpp line 142:

> 140:     ObjectMonitor* _monitor = nullptr;
> 141:   } _entries[CAPACITY];
> 142:   const oop _null_sentinel = nullptr;

This should no longer be needed. As we no longer have any loops in C2 that terminates on a null read.

src/hotspot/share/runtime/objectMonitorTable.cpp line 47:

> 45:   const size_t _capacity_mask;       // One less than its power-of-two capacity
> 46:   Table* volatile _prev;             // Set while rehashing
> 47:   ObjectMonitor* volatile* _buckets; // The payload

I think this should have some opaque `enum class` type rather than `ObjectMonitor*` because we do not always store an `ObjectMonitor*` inside. We could repurpose `SpecialPointerValues` with another name.

So we avoid these reinterpret_casts from our special values to `ObjectMonitor*`. And only reinterpret_cast values we know to be `ObjectMonitor*` to `ObjectMonitor*`.

src/hotspot/share/runtime/objectMonitorTable.cpp line 51:

> 49:   char _padding[DEFAULT_CACHE_LINE_SIZE];
> 50: 
> 51:   volatile size_t _items_count;

I wonder if we want something along the lines of this here. We do not allocate many of these tables, so using some extra padding to make sure we never share cache lines with anything outside the objects either (even though it will probably always just be our NMT headers).
Suggestion:

  DEFINE_PAD_MINUS_SIZE(0, DEFAULT_CACHE_LINE_SIZE, 0);
  const size_t _capacity_mask;       // One less than its power-of-two capacity
  Table* volatile _prev;             // Set while rehashing
  ObjectMonitor* volatile* _buckets; // The payload

  DEFINE_PAD_MINUS_SIZE(1, DEFAULT_CACHE_LINE_SIZE, sizeof(_capacity_mask) + sizeof(_prev) + sizeof(_buckets));
  volatile size_t _items_count;

  DEFINE_PAD_MINUS_SIZE(2, DEFAULT_CACHE_LINE_SIZE, sizeof(_items_count));


Once we add alignment support for our CHeapObj operator new we can replace this with:
```c++
  const size_t _capacity_mask;       // One less than its power-of-two capacity
  Table* volatile _prev;             // Set while rehashing
  ObjectMonitor* volatile* _buckets; // The payload

  alignas(DEFAULT_CACHE_LINE_SIZE) volatile size_t _items_count;

src/hotspot/share/runtime/objectMonitorTable.cpp line 59:

> 57:   static ObjectMonitor* removed_entry() {
> 58:     return (ObjectMonitor*)ObjectMonitorTable::SpecialPointerValues::removed;
> 59:   }

Same comment as above w.r.t. not using reinterpret_cast on non-`ObjectMonitor*` values

src/hotspot/share/runtime/objectMonitorTable.cpp line 118:

> 116:   }
> 117: 
> 118:   ObjectMonitor* get(oop obj, int hash) {

While it might be wrong that we type the hash() interface on the markWord as `intptr_t`. I think we should stick with that type until we decide to change this in the markWord inteface. When I see a `intptr_t` -> `int` -> `size_t` roundtrip I try and figure out what the purpose is.

src/hotspot/share/runtime/objectMonitorTable.cpp line 153:

> 151:     }
> 152: 
> 153:     // Rehashing could have stareted by now, but if a monitor has been inserted in a

Suggestion:

    // Rehashing could have started by now, but if a monitor has been inserted in a

src/hotspot/share/runtime/objectMonitorTable.cpp line 158:

> 156:   }
> 157: 
> 158:   ObjectMonitor* get_set(oop obj, ObjectMonitor* new_monitor, int hash) {

Same type comment. (`int` -> `intptr_t`)

src/hotspot/share/runtime/objectMonitorTable.cpp line 218:

> 216:   }
> 217: 
> 218:   void remove(oop obj, ObjectMonitor* old_monitor, int hash) {

Same type comment. (`int` -> `intptr_t`)

src/hotspot/share/runtime/objectMonitorTable.cpp line 252:

> 250: 
> 251:   void reinsert(oop obj, ObjectMonitor* new_monitor) {
> 252:     int hash = obj->mark().hash();

Same type comment. (`int` -> `intptr_t`)

src/hotspot/share/runtime/objectMonitorTable.cpp line 334:

> 332:         // A monitor
> 333:         oop obj = monitor->object_peek();
> 334:         if (!monitor->is_being_async_deflated() && obj != nullptr) {

While there is nothing inherently wrong with the check. In the current implementation will we ever encounter a `monitor->is_being_async_deflated() == true`.

As it is the deflation thread which drives the rebuilding, and it will already have removed any entry it has deflated.

src/hotspot/share/runtime/objectMonitorTable.cpp line 351:

> 349: 
> 350: ObjectMonitor* ObjectMonitorTable::monitor_get(Thread* current, oop obj) {
> 351:   const int hash = obj->mark().hash();

Same type comment. (`int` -> `intptr_t`)

src/hotspot/share/runtime/objectMonitorTable.cpp line 382:

> 380: 
> 381: ObjectMonitor* ObjectMonitorTable::monitor_put_get(Thread* current, ObjectMonitor* monitor, oop obj) {
> 382:   const int hash = obj->mark().hash();

Same type comment. (`int` -> `intptr_t`)

src/hotspot/share/runtime/objectMonitorTable.cpp line 402:

> 400:     return;
> 401:   }
> 402:   const int hash = obj->mark().hash();

Same type comment. (`int` -> `intptr_t`)

src/hotspot/share/runtime/objectMonitorTable.hpp line 56:

> 54:     removed = 2,
> 55:     below_is_special = (removed + 1)
> 56:   } SpecialPointerValues;

Could this be a C++ enum instead?

Suggestion:

  enum SpecialPointerValues {
    empty = 0,
    tombstone = 1,
    removed = 2,
    below_is_special = (removed + 1)
  };


I think I preferred it to be an strongly typed `enum class` as well. (But we do use it as an integral in the MacroAssembler so that would introduce some casts.

-------------

PR Review: https://git.openjdk.org/jdk/pull/29383#pullrequestreview-3709345634
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730469114
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730510424
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730606148
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730553209
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730608119
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730627153
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730672999
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730627382
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730627673
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730621824
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730732236
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730624277
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730625000
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730626783
PR Review Comment: https://git.openjdk.org/jdk/pull/29383#discussion_r2730561044


More information about the hotspot-runtime-dev mailing list