RFR: 8209971: TestOptionsWithRanges.java crashes in CDS mode with G1UpdateBufferSize=1
Ioi Lam
ioi.lam at oracle.com
Thu Sep 6 03:46:25 UTC 2018
Hi Jiangli,
Thanks for fixing this! I like the simple approach that minimizes code
change.
The following comment is misleading. It suggest that mirror objects
cannot be restored at this stage, but doesn't explain why.
Also, I don't think we should single out constant pools (the crash call
stack had nothing to do with constant pool.)
2039 // No mirror object is access/restored during
2040 // resolve_wk_klasses_through(WK_KLASS_ENUM_NAME(Object_klass)...).
2041 // Mirrors are restored after java.lang.Class is loaded.
2042 //
2043 // Object_klass()->constants()->restore_unshareable_info()
restores the
2044 // constant pool resolved_references array for Object klass. It
must be
2045 // done after MetaspaceShared::fixup_mapped_heap_regions().
The real reason we have to do this is:
+ Before MetaspaceShared::fixup_mapped_heap_regions() is called, the
heap is in an inconsistent state. At some G1 regions, the bottom part
contains all zeros. However, some GC operations (debug-build
verifications only?) would need to walk G1 regions starting from the
bottom, and will run into asserts when we get an uninitialized block.
+ One of such operation is the barrier operations, as shown in the crash
call stack in the bug report. (Maybe there are others??)
For safety, we should put in asserts such that -- before
MetaspaceShared::fixup_mapped_heap_regions() is done, we must not do
anything that would trigger barrier operations. For example, we must not
call oopDesc::obj_field_put, which triggered the barrier operations in
the crash stack.
The crash is timing related and happened only when G1UpdateBufferSize
was set to one. I think we need asserts so we can reliably catch any
future problems. For example, if someone adds new code to modify obj
fields in the archived heap before fixup_mapped_heap_regions() is
called, we should get an assert immediately, instead of waiting for some
future point that may or may not happen.
I would suggest working with the GC folks for a suitable assert.
Thanks
- Ioi
On 9/5/18 7:58 PM, Jiangli Zhou wrote:
> Please review the following fix for JDK-8209971,
> TestOptionsWithRanges.java crashes in CDS mode with G1UpdateBufferSize=1.
>
> webrev: http://cr.openjdk.java.net/~jiangli/8209971/webrev.00/
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8209971
>
> This is a pre-existing issue that's exposed when archived heap object
> relocation is enabled. At CDS runtime, the archived heap regions are
> mapped back to the runtime java heap. As the mapped heap data may not
> occupy the entire runtime GC region(s), the empty spaces within the
> regions are filled with dummy objects, which is done by a fixup
> process after mapping the data. The fixup was done after
> SystemDictionary::initialize(), because fixup needs to access
> SystemDictionary::Object_klass() (in some cases). However that is too
> late as archived java objects are accessed when
> SystemDictionary::initialize() resolves well-known classes.
>
> The solution is to call MetaspaceShared::fixup_mapped_heap_regions()
> at an earlier time. In the webre, the call is moved into
> SystemDictionary::resolve_preloaded_classes(). It's called after
> resolve_wk_klasses_through(WK_KLASS_ENUM_NAME(Object_klass)...), but
> before any of the archived java objects is accessed. No mirror object
> is access/restored during
> resolve_wk_klasses_through(WK_KLASS_ENUM_NAME(Object_klass)...).
> Archived mirrors are restored after java.lang.Class is loaded.
>
> Tested TestOptionsWithRanges.java locally in CDS mode. Tier1 - tier5
> normal runs and tier1 - tier4 default CDS runs are in progress.
>
> Thanks,
>
> Jiangli
>
>
More information about the hotspot-runtime-dev
mailing list