RFR: JDK-8283674: Pad ObjectMonitor allocation size to cache line size
Thomas Stuefe
stuefe at openjdk.java.net
Mon Mar 28 05:38:52 UTC 2022
On Mon, 28 Mar 2022 01:56:58 GMT, David Holmes <dholmes at openjdk.org> wrote:
> Hi Thomas,
>
> It would be nice to back this up with any kind of performance data showing that this is actually an issue in practice. How much memory do we waste by reinstating padding at this granularity?
>
> Thanks, David
Hi David,
we lose 48 bytes per OM: sizeof(ObjectMonitor) grows from 216 to 256 bytes, and with glibc's per-chunk overhead and 16-byte rounding the real footprint grows from 224 to 272 bytes per OM.
A simple test on x64 Ubuntu with glibc shows that consecutive mallocs of 216 bytes within the same thread are packed tightly, with a pointer delta of 224:
0x5618b69d3950 224
0x5618b69d3a30 224
0x5618b69d3b10 224
0x5618b69d3bf0 224
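(A minimal sketch of such a test - plain C++, not HotSpot code; the 216 just stands in for sizeof(ObjectMonitor) on this build:)

  #include <cstdio>
  #include <cstdint>
  #include <cstdlib>

  int main() {
    const size_t sz = 216;   // stand-in for sizeof(ObjectMonitor)
    void* prev = nullptr;
    for (int i = 0; i < 8; i++) {
      void* p = malloc(sz);
      if (prev != nullptr) {
        // print address and distance to the previous allocation
        printf("%p %zu\n", p, (size_t)((uintptr_t)p - (uintptr_t)prev));
      }
      prev = p;
    }
    return 0;
  }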
So the theoretical possibility exists that OMs share the same cache line, and it is not even that improbable: with a 224-byte stride, the end of one 216-byte allocation and the start of the next are only 8 bytes apart, so fields at the tail of one OM and the head of its neighbour can easily land on the same 64-byte line.
Prior to https://bugs.openjdk.java.net/browse/JDK-8253064, the old TSM-based (type-stable memory) OM storage padded OMs to align their start addresses to cache lines:
https://github.com/openjdk/jdk/blob/421a7c3b41343e60d52b6d9ee263dfd0a4d53ce1/src/hotspot/share/runtime/synchronizer.hpp#L37-L43
It did so because it stored OMs tightly packed. But as shown above, the libc allocator may pack them just as tightly. If padding was necessary then, would it not be necessary now too?
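To make concrete what "pad the ObjectMonitor allocation size to the cache line size" boils down to, a minimal sketch (not the actual patch; a 64-byte line size is assumed):

  // Sketch only: round the requested allocation size up to the next
  // multiple of the cache line size, so that hot fields of neighbouring,
  // tightly packed OMs no longer sit just a few bytes apart.
  static const size_t cache_line_size = 64;   // assumed line size

  static size_t cache_line_padded_size(size_t raw_size) {
    return (raw_size + cache_line_size - 1) & ~(cache_line_size - 1);
  }

  // cache_line_padded_size(216) == 256, hence the 216->256 growth above.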
The problem as I see it is a performance problem that may hit in some random scenarios, depending on the whims of the libc allocator. It is not predictable, since a different libc version at a customer site may behave differently from whatever runs on our test systems. We have to assume the worst, which is that clusters of OMs will be packed tightly.
Cheers, Thomas
-------------
PR: https://git.openjdk.java.net/jdk/pull/7955