Need help with ZGC failure in Lilliput
rkennke at redhat.com
Tue Jul 20 11:47:08 UTC 2021
Hi ZGC devs,
I am struggling with a ZGC problem in Lilliput, and would like to ask
for your opinion.
I'm currently working on changing runtime oopDesc::klass() to load the
Klass* from the object header instead of the dedicated Klass* field:
This required some coordination in other GCs, because it's not always
safe to access the object header. In particular, objects may be locked,
at which point we need to find the displaced header, or worst case,
inflate the header. I believe I've solved that in all GCs.
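
For illustration, the header-based lookup described above can be modeled
roughly as follows. This is a standalone toy sketch, not HotSpot or
Lilliput code: the bit layout, the lock-state tags, and the names
ToyKlass, ToyObject, ToyLockRecord, ToyMonitor, make_header and toy_klass
are all made up for this example. It only models reading through a
displaced header. It does not model inflating a lock, which is the
worst case mentioned above.

```cpp
#include <cstdint>

// Toy model only: layout and names are hypothetical, not HotSpot's.
struct alignas(8) ToyKlass { int id; };

// Assumed encoding: the low two bits of the header word hold the lock state.
enum LockState : std::uintptr_t { kStackLocked = 0, kUnlocked = 1, kInflated = 2 };
constexpr std::uintptr_t kLockMask = 3;

// Where the original header lives while the object is locked.
struct alignas(8) ToyLockRecord { std::uintptr_t displaced_header; }; // stack lock
struct alignas(8) ToyMonitor    { std::uintptr_t displaced_header; }; // inflated lock

struct ToyObject { std::uintptr_t header; };

// Header of an unlocked object: class pointer in the upper bits.
inline std::uintptr_t make_header(const ToyKlass* k) {
    return reinterpret_cast<std::uintptr_t>(k) | kUnlocked;
}

// Load the class pointer from the header, following the displaced
// header first if the object is currently locked.
inline const ToyKlass* toy_klass(const ToyObject& obj) {
    std::uintptr_t h = obj.header;
    switch (h & kLockMask) {
    case kStackLocked:
        h = reinterpret_cast<const ToyLockRecord*>(h & ~kLockMask)->displaced_header;
        break;
    case kInflated:
        h = reinterpret_cast<const ToyMonitor*>(h & ~kLockMask)->displaced_header;
        break;
    default:
        break; // unlocked: the header itself carries the class pointer
    }
    return reinterpret_cast<const ToyKlass*>(h & ~kLockMask);
}
```

In this model an unlocked object and a locked one both resolve to the
same ToyKlass*, as long as the lock record or monitor still holds the
original header word.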
However, I am still getting a failure with ZGC, which is kinda
unexpected, because it's the only GC that is *not* messing with object
headers (as far as I know). If you check out the above PR, the failure
can easily be reproduced with:
make run-test TEST=gc/z/TestGarbageCollectorMXBean.java
(and only that test is failing for me).
The crash is in ZHeap::is_object_live() because the ZPage there turns
out to be NULL. I've added a bunch of debug output in that location, and
it looks like the offending object is always inflated *and* forwarded
when it happens, but I fail to see how these two things are related to
each other, and to the page being NULL. I strongly suspect that inflation
of the object header by calling klass() on it causes the trouble. Changing
back to the original implementation of oopDesc::klass() (swap in the
commented-out code there) makes the bug disappear.
Also, the bug always seems to happen when calling through a weak
barrier. Not sure if that is relevant.
Any ideas? Opinions?