Sharing the markword (aka Valhalla's markword use)

John Rose john.r.rose at oracle.com
Mon Mar 11 20:01:36 UTC 2024


On 11 Mar 2024, at 5:58, Kennke, Roman wrote:

>>>> However, it’s not clear to me whether the cost of a modal narrow-klass field is going to be bearable.
>
> I don’t think it is a problem. Current Lilliput already tests the header bits, and branches when the object is monitor-locked. That cost is not measurable (need to make sure that the test-and-branch is laid out in a way that does not mess up static branch prediction, but that is easy, using stubs). The monitor-test will go away, but we can have an all-zeroes check in that same place. Handling the all-zeroes case would be slightly more costly, but not much, if we can use a fixed offset.

The key principle here is probably that there should be a slow path which is rare and cheap to test for.  That slow path (actually two distinct slow paths) would handle the locked state we have today and could handle the case of an “inflated klass” tomorrow.  (I just thought of the term “inflated klass”; maybe that has legs.)

Optimizations based on rare slow paths are a foundational concept in HotSpot, so they are something we know how to manage.

There is a second principle related to slow paths.  If you have two rare conditions, you should at least try to detect them in a common test.  Why?  Because for every different slow-path conditional test, your fast path gets a little slower, since it has to work through all of the slow-path tests before it can get on with its real work.

 first_fast_and_slow_accessor() {
   if (fast_path_works())
     return inline_first_fast_path_accessor();
   // slow path
   return first_slow_path_accessor();
 }

 second_fast_and_slow_accessor() {
   if (fast_path_works())  // same test as in the first accessor
     return inline_second_fast_path_accessor();
   // slow path
   return second_slow_path_accessor();
 }

In the above sketch, one fast-path condition bit protects two slow paths.  If both accessors are used together in C++ or JIT code, the compiler can hoist the single fast-path test and share it between them.  The same goes for hand-written assembly.  The point also applies if there is a common “get real header” operation that factors out of both accessors, which might be the case here.

So…  If the inflated klass condition and the locked condition are both slow paths, it suggests that some LOCK BIT might play a central part in the combined optimization story.
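
To make that concrete, here is a minimal C++ sketch of one bit guarding two slow paths.  Everything in it is an assumption made up for illustration: the header layout, SLOW_BIT, side_table, and the accessor names are hypothetical and are not HotSpot's actual mark word or API.  It only shows the shape of the idea: both accessors test the same bit, and a combined accessor can do that test once.

```cpp
#include <cstdint>

// Hypothetical 64-bit header layout (illustration only, NOT the real mark word):
//   bit 0       : SLOW_BIT, set when the header is displaced
//                 (e.g. locked and/or "inflated klass")
//   bits 1..31  : hash, valid only in the fast state
//   bits 32..63 : narrow klass id, valid only in the fast state
// In the slow state, bits 1.. hold an index into a side table instead.
constexpr uint64_t SLOW_BIT = 1;

struct SideEntry { uint32_t klass_id; uint32_t hash; };
static SideEntry side_table[16];  // stand-in for wherever the real header lives

// The single test shared by every accessor.
inline bool fast_path_works(uint64_t header) {
  return (header & SLOW_BIT) == 0;
}

// Two distinct slow paths, both reached through the same test.
uint32_t slow_klass(uint64_t header) { return side_table[header >> 1].klass_id; }
uint32_t slow_hash(uint64_t header)  { return side_table[header >> 1].hash; }

inline uint32_t klass_of(uint64_t header) {
  if (fast_path_works(header))
    return uint32_t(header >> 32);
  return slow_klass(header);
}

inline uint32_t hash_of(uint64_t header) {
  if (fast_path_works(header))
    return uint32_t(header >> 1) & 0x7fffffffu;
  return slow_hash(header);
}

// What a compiler (or a human) gets after hoisting: one test, two results.
struct KlassAndHash { uint32_t klass; uint32_t hash; };

inline KlassAndHash klass_and_hash(uint64_t header) {
  if (fast_path_works(header))
    return { uint32_t(header >> 32), uint32_t(header >> 1) & 0x7fffffffu };
  const SideEntry& e = side_table[header >> 1];
  return { e.klass_id, e.hash };
}

// Helpers to build headers in either state.
uint64_t make_fast(uint32_t klass, uint32_t hash) {
  return (uint64_t(klass) << 32) | (uint64_t(hash & 0x7fffffffu) << 1);
}
uint64_t make_slow(uint32_t index) {
  return (uint64_t(index) << 1) | SLOW_BIT;
}
```

If the two rare states each had their own bit pattern and their own test, klass_and_hash would need two branches on its fast path instead of one; that is the cost the common test avoids.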

>> Thanks for all your suggestions and clarifications, that’s all very useful.

You are welcome; I enjoy this sort of thing and am very glad it is sometimes useful.


More information about the lilliput-dev mailing list