RFC/RFR: Get rid of second bitmap
Roman Kennke
rkennke at redhat.com
Tue Oct 10 09:20:49 UTC 2017
On 10.10.2017 at 00:44, Roman Kennke wrote:
> On 06.10.2017 at 18:41, Aleksey Shipilev wrote:
>> On 10/06/2017 02:27 PM, Roman Kennke wrote:
>>> AFAICT, the whole problem boils down to
>>> ShenandoahHeap::object_iterate() and related *public*
>>> methods being problematic when called at random times, in particular
>>> when the marking bitmap is not
>>> valid (e.g. marking aborted, bitmap just clearing/cleared, marking
>>> in progress).
>> Yes, exactly.
>>
>>> We could help it by squeezing in a marking pass before doing the
>>> iteration. However, if we do this,
>>> we can just as well report the visited objects to the ObjectClosure
>>> while traversing. It shouldn't
>>> matter for consumers of object_iterate() in which order the objects
>>> arrive, right?
>> The object order should not matter.
>>
>>> E.g. we can make all those methods do a safe 'iteration' by doing a
>>> single-threaded marking pass,
>>> reporting objects while we go, using a single work stack, and using
>>> 2nd marking bitmap (to avoid
>>> double-visiting objects) that we can allocate just for this purpose
>>> and deallocate when done (after
>>> all, this should be a rare situation which is not
>>> performance-critical). Right?
>> Yes, that makes sense. So this just makes another traversal through
>> the heap, returning all
>> reachable objects. Yes, Verifier does that already, and it does not
>> take much of the code. The
>> trouble with this approach is that we would need to test it
>> separately, because it will exercise the
>> non-usual code path.
>>
>> Heap dump on OOME can also fail, because we would try to commit some
>> native memory for bitmap at
>> that point.
>>
>>> I am assuming that all consumers would call object_iterate() during
>>> a safepoint (need to check
>>> this, but I'm pretty sure this is the case). We'd also need to
>>> ensure that we don't call those
>>> iterations ourselves from inside Shenandoah, unless we really want
>>> to (e.g. verification?). And
>>> provide the fast iteration - marked_object_iterate() - to use
>>> ourselves when we know that it is
>>> safe.
>> Verification uses neither object_iterate(), nor
>> marked_object_iterate(), because it takes things
>> slowly, carefully, and on its own :)
>>
>> -Aleksey
>>
> I added a test that exercises JVMTI heap iteration excessively, and
> lo and behold, it does crash spectacularly (even with the 2nd bitmap).
> We currently cannot do this with concurrent GC going on:
>
> - it would call ensure_parsability() which will plaster over TLABs
> while we're evacuating (note that our GC threads don't participate in
> safepointing, and we don't want to).
> - dealing with dead objects is difficult: they may have broken Klass*
> (from previous concurrent class unloading) and broken oop refs.
>
> It is fixed in this proposal by implementing object_iterate() using a
> marking traversal. It only commits an auxiliary bitmap when needed,
> and uncommits it when done. It implements a very simple and dumb heap
> traversal using 1 thread, 1 oop stack and 1 marking bitmap. It reports
> only reachable objects, and that should be ok. It is only used for
> non-GC use, mostly from JVMTI. SH::ensure_parsability() (the public
> API) becomes a no-op. All linear heap scans are only done under our
> own control (using marked_object_iterate()), and need to use the new
> SH::make_tlabs_parsable().
>
> With this, we pass this new JVMTI heapdump test and all the other ones.
>
> http://cr.openjdk.java.net/~rkennke/onebitmap/webrev.03/
>
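For readers who want the shape of that traversal without opening the webrev, here is a minimal sketch of "1 thread, 1 oop stack, 1 marking bitmap". It is not the code from the patch: ToyObject, ToyHeap and reachable_object_iterate() are hypothetical stand-ins, objects are plain indices rather than oops, and the bitmap is a std::vector<bool> that lives only for the duration of the call (standing in for committing and uncommitting the auxiliary bitmap).

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a heap object: only its outgoing references,
// expressed as indices into the toy heap. Not a HotSpot type.
struct ToyObject {
    std::vector<std::size_t> refs;
};

using ToyHeap = std::vector<ToyObject>;

// Report every object reachable from `roots` to `closure`, exactly once.
// Single-threaded, one work stack, one mark bitmap owned by this call.
inline void reachable_object_iterate(
        const ToyHeap& heap,
        const std::vector<std::size_t>& roots,
        const std::function<void(std::size_t)>& closure) {
    // Auxiliary bitmap: allocated here, released when the function returns.
    std::vector<bool> marked(heap.size(), false);
    std::vector<std::size_t> stack;

    // Seed the stack with the roots, marking each one on first sight.
    for (std::size_t r : roots) {
        if (!marked[r]) {
            marked[r] = true;
            stack.push_back(r);
        }
    }

    while (!stack.empty()) {
        std::size_t obj = stack.back();
        stack.pop_back();
        closure(obj);  // report the object while traversing
        for (std::size_t ref : heap[obj].refs) {
            if (!marked[ref]) {  // the bitmap prevents double visits
                marked[ref] = true;
                stack.push_back(ref);
            }
        }
    }
}
```

Unreachable (dead) objects are never touched, which is the point: the traversal never has to parse an object whose Klass* or references might be stale. The visiting order is whatever the stack produces, which is fine as long as consumers do not depend on order.
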
And here comes another small update that makes the test actually verify
that it has seen *some* objects. The JVMTI call might otherwise return
with an error code and we wouldn't notice.
http://cr.openjdk.java.net/~rkennke/onebitmap/webrev.04/
Ok to push?
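
The kind of check meant by "has seen *some* objects" can be sketched roughly as below. This is an illustration only, not the test from the webrev: IterationResult and iteration_passed() are hypothetical names, and the assumption that an error code of 0 means success belongs to this sketch, not to JVMTI.

```cpp
#include <cstddef>

// Hypothetical summary of one heap-iteration call. Neither field name
// comes from JVMTI; they only model "status" and "callback count".
struct IterationResult {
    int error_code;            // assumption of this sketch: 0 means success
    std::size_t objects_seen;  // how many times the callback fired
};

// Pass only if the call succeeded AND actually reported objects.
// Checking the count catches a call that failed without anyone noticing.
inline bool iteration_passed(const IterationResult& r) {
    return r.error_code == 0 && r.objects_seen > 0;
}
```

A test that only checks "did not crash" would accept a call that returned early with an error; requiring a non-zero object count closes that gap.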