Need help with ZGC failure in Lilliput

Roman Kennke rkennke at redhat.com
Tue Jul 20 11:47:08 UTC 2021


Hi ZGC devs,

I am struggling with a ZGC problem in Lilliput, and would like to ask 
for your opinion.

I'm currently working on changing oopDesc::klass() in the runtime to 
load the Klass* from the object header instead of from the dedicated 
Klass* field:

https://github.com/openjdk/lilliput/pull/12

This required some coordination in other GCs, because it's not always 
safe to access the object header. In particular, objects may be locked, 
at which point we need to find the displaced header, or worst case, 
inflate the header. I believe I've solved that in all GCs.
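
To make the header cases above concrete, here is a minimal, 
self-contained sketch. It is a toy model and not HotSpot or Lilliput 
code: the names (toy::Object, toy::Monitor, toy::try_klass_id), the 
64-bit layout and the position of the klass bits are all assumptions 
made for illustration. The low two lock bits follow the usual mark word 
encoding (00 stack-locked, 01 unlocked, 10 inflated monitor, 11 
marked/forwarded). It is single-threaded and simply reads the displaced 
header, so it ignores the races that make the real code hard.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; every name and the bit layout are hypothetical.
namespace toy {

constexpr std::uint64_t kLockMask = 0x3;
constexpr std::uint64_t kLocked   = 0x0; // header is a pointer to a displaced header
constexpr std::uint64_t kUnlocked = 0x1; // header holds the real mark word
constexpr std::uint64_t kMonitor  = 0x2; // header is a tagged pointer to a monitor
constexpr int kKlassShift = 32;          // assumed position of the klass bits

struct Monitor { std::uint64_t displaced_header; };
struct Object  { std::uint64_t header; };

// Build an unlocked header carrying the given klass id.
inline std::uint64_t make_header(std::uint32_t klass_id) {
  return (static_cast<std::uint64_t>(klass_id) << kKlassShift) | kUnlocked;
}

// Try to read the klass id from the header. Returns false when the
// header is in a state this sketch does not handle (marked/forwarded).
inline bool try_klass_id(const Object& o, std::uint32_t& out) {
  std::uint64_t h = o.header;
  switch (h & kLockMask) {
    case kUnlocked:
      break; // the header itself is the mark word
    case kLocked:
      // Stack-locked: follow the pointer to the displaced header.
      h = *reinterpret_cast<const std::uint64_t*>(static_cast<std::uintptr_t>(h));
      break;
    case kMonitor:
      // Inflated: strip the tag bits and read the header kept in the monitor.
      h = reinterpret_cast<const Monitor*>(
              static_cast<std::uintptr_t>(h & ~kLockMask))->displaced_header;
      break;
    default:
      return false; // marked/forwarded: no klass bits to read here
  }
  out = static_cast<std::uint32_t>(h >> kKlassShift);
  return true;
}

} // namespace toy
```

The point of the sketch is only that a klass lookup through the header 
has to branch on the lock state, and that one state (marked/forwarded) 
has no klass bits available at all in this model.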

However, I am still getting a failure with ZGC, which is kinda 
unexpected, because it's the only GC that is *not* messing with object 
headers (as far as I know). If you check out the above PR, the failure 
can easily be reproduced with:

make run-test TEST=gc/z/TestGarbageCollectorMXBean.java

(and only that test is failing for me).

The crash is in ZHeap::is_object_live() because the ZPage there turns 
out to be NULL. I've added a bunch of debug output in that location, and 
it looks like the offending object is always inflated *and* forwarded 
when it happens, but I fail to see how these two are related to each 
other, and to the page being NULL. I strongly suspect that inflation of 
the object header by calling klass() on it causes the trouble. Changing 
back to the original implementation of oopDesc::klass() (swap the 
commented-out code there) makes the bug disappear.

Also, the bug always seems to happen when calling through a weak 
barrier. Not sure if that is relevant.

Any ideas? Opinions?

Thanks,
Roman



More information about the zgc-dev mailing list