write barrier and card marking

Tony Guan guanxiaohua at gmail.com
Sun Aug 15 22:55:14 UTC 2010


Hi David,

Thanks for the redirection!


Hi Ramki,

Thanks for the advice. I enabled heap verification before and after GC.
At the very first full GC (which is also the first GC of the run), the
verification before the GC passed, but the verification after the GC failed.

Here is the error:
VerifyAfterGC:[Verifying threads permgen tenured generation
TransNewGeneration def new generation remset

Internal Error (/home/tony/software/OpenJDK/jdk7/hotspot/src/share/vm/memory/cardTableRS.cpp:319),
pid=16388, tid=1085323600
#  Error: guarantee(obj == __null || (HeapWord*)p < _boundary ||
(HeapWord*)obj >= _boundary,"pointer on clean card crosses boundary")


And the trace is:
V  [libjvm.so+0x8cc0b4];;  _ZN7VMError6reportEP12outputStream+0xb72
V  [libjvm.so+0x8cd1d4];;  _ZN7VMError14report_and_dieEv+0x5f6
V  [libjvm.so+0x403a85];;  _Z12report_fatalPKciS0_+0x6b
V  [libjvm.so+0x315636];;  _ZN22VerifyCleanCardClosure11do_oop_workIP7oopDescEEvPT_+0x82
V  [libjvm.so+0x31565b];;  _ZN22VerifyCleanCardClosure6do_oopEPP7oopDesc+0x1d
V  [libjvm.so+0x51ac5b];;  _ZN13instanceKlass21iterate_static_fieldsEP10OopClosure+0xbd
V  [libjvm.so+0x52e7ac];;  _ZN18instanceKlassKlass15oop_oop_iterateEP7oopDescP10OopClosure+0xa6
V  [libjvm.so+0x2a8590];;  _ZN5Klass17oop_oop_iterate_vEP7oopDescP10OopClosure+0x32
V  [libjvm.so+0x3108b9];;  _ZN7oopDesc11oop_iterateEP10OopClosure+0x37
V  [libjvm.so+0x314289];;  _ZN11CardTableRS12verify_spaceEP5SpaceP8HeapWord+0x34f
V  [libjvm.so+0x315507];;  _ZN20VerifyCTSpaceClosure8do_spaceEP5Space+0x29
V  [libjvm.so+0x4e5913];;  _ZN28OneContigSpaceCardGeneration13space_iterateEP12SpaceClosureb+0x35
V  [libjvm.so+0x39885d];;  _ZN20CompactingPermGenGen13space_iterateEP12SpaceClosureb+0x25
V  [libjvm.so+0x31450a];;  _ZN11CardTableRS6verifyEv+0x14e
V  [libjvm.so+0x4d5216];;  _ZN16GenCollectedHeap6verifyEbb+0x13a
V  [libjvm.so+0x89cfaf];;  _ZN8Universe6verifyEbb+0xfd
V  [libjvm.so+0x4d6f17];;  _ZN16GenCollectedHeap24do_type_based_collectionEbbmbib+0xb05
V  [libjvm.so+0x4d72a3];;  _ZN16GenCollectedHeap13do_collectionEbbmbi+0x93
V  [libjvm.so+0x4d7fda];;  _ZN16GenCollectedHeap18do_full_collectionEbi+0x78
V  [libjvm.so+0x8cde05];;  _ZN17VM_GenCollectFull4doitEv+0x5b
V  [libjvm.so+0x8e4f73];;  _ZN12VM_Operation8evaluateEv+0x5b
V  [libjvm.so+0x8e29a7];;  _ZN8VMThread18evaluate_operationEP12VM_Operation+0x33
V  [libjvm.so+0x8e2ef4];;  _ZN8VMThread4loopEv+0x4c0
V  [libjvm.so+0x8e33a6];;  _ZN8VMThread3runEv+0xf4
V  [libjvm.so+0x750f51];;  _Z10java_startP6Thread+0x16f

Now to answer your first question:
For minor collections, I collect the two generations below Tenured
separately. But a collection of the old generation collects the
whole heap. (This is just like the original generational GC; the only
difference is that we have two young gens indexed as 0 and 1, and the
old gen indexed as 2.)
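
To make the collection scope described above concrete, here is a minimal sketch. The function name and the constant are hypothetical and are not taken from the modified HotSpot sources; the only thing it encodes is the rule stated above (levels 0 and 1 are collected on their own, level 2 collects everything).

```cpp
#include <vector>

// Hypothetical level numbering from the description above:
// 0 and 1 are the two young generations, 2 is the old generation.
constexpr int kOldestLevel = 2;

// Returns the generation levels that get collected when a collection
// is requested at `requested_level`, under the stated assumption.
std::vector<int> levels_collected(int requested_level) {
    if (requested_level == kOldestLevel) {
        // A collection of the old generation collects the whole heap.
        return {0, 1, 2};
    }
    // A minor collection touches only the requested young generation.
    return {requested_level};
}
```

One consequence of this layout is that a collection at level 0 or 1 leaves two other generations uncollected, so references into the collected generation from both of them have to be found through the remembered set.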

This failure means that I failed to dirty some card that contains
inter-generational pointers. And because the pre-GC verification
passed, the only possible cause of this failure is the full
collection. Can you point out where this error could be introduced
during the full GC (mark-sweep)?
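
For reference, the condition in the failing guarantee can be reproduced on a toy model. This is only a sketch under stated assumptions: `ToyHeap`, `kCardSize`, and `store_without_barrier` are hypothetical names, slots stand in for reference fields, and none of this is HotSpot code. Only the shape of the predicate (`obj == null || p < boundary || obj >= boundary`) is copied from the error message above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy heap: each slot holds the index of the slot it references, or kNull.
// Slots below `boundary` model the younger generations; slots at or above
// it model the generation whose cards are being verified.
constexpr std::ptrdiff_t kNull = -1;
constexpr std::size_t kCardSize = 4;  // slots per card (hypothetical value)

enum class Card : std::uint8_t { Clean, Dirty };

struct ToyHeap {
    std::vector<std::ptrdiff_t> slots;
    std::vector<Card> cards;
    std::size_t boundary;

    ToyHeap(std::size_t n, std::size_t b)
        : slots(n, kNull),
          cards((n + kCardSize - 1) / kCardSize, Card::Clean),
          boundary(b) {}

    // Write barrier: store the reference, then dirty the card that
    // covers the written slot, regardless of generation.
    void store(std::size_t p, std::ptrdiff_t obj) {
        slots[p] = obj;
        cards[p / kCardSize] = Card::Dirty;
    }

    // Models a collector that updates a reference but forgets the card.
    void store_without_barrier(std::size_t p, std::ptrdiff_t obj) {
        slots[p] = obj;
    }

    // Same shape as the guarantee in the error message.
    static bool slot_ok(std::size_t p, std::ptrdiff_t obj, std::size_t b) {
        return obj == kNull || p < b || static_cast<std::size_t>(obj) >= b;
    }

    // Returns the first slot on a clean card that violates the
    // condition, or -1 if every clean card passes.
    std::ptrdiff_t verify_clean_cards() const {
        for (std::size_t p = 0; p < slots.size(); ++p) {
            bool clean = cards[p / kCardSize] == Card::Clean;
            if (clean && !slot_ok(p, slots[p], boundary)) {
                return static_cast<std::ptrdiff_t>(p);
            }
        }
        return -1;
    }
};
```

In this model, with `ToyHeap h(16, 8)`, a call to `h.store(10, 2)` creates an old-to-young reference on a dirty card and verification passes, whereas `h.store_without_barrier(13, 3)` leaves the same kind of reference on a clean card and `verify_clean_cards()` reports slot 13. That is the situation the post-GC verification is complaining about.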

Thanks!


Tony (Xiaohua Guan)



On Sat, Aug 14, 2010 at 1:25 PM, Y. Srinivas Ramakrishna
<y.s.ramakrishna at oracle.com> wrote:
> Tony,
>
> Do you collect each generation independently, or
> does a generation at level n (with youngest == 0)
> collect all younger generations (i.e. level m < n)?
>
> The usual failure pattern for missed remembered set entries
> is that there is a "stale" pointer in a generation not collected
> but with a pointer into a generation that was collected.
> You can gainfully use the "mangling" feature in debug builds to
> identify such cases sometimes. I have not generally seen null pointer
> exceptions in such cases though (because that would mean that you are
> getting hold of a null reference, which requires more work, i.e. clearing,
> than forgetting to update an old reference to point to a new location),
> just "broken" references to stale objects (these usually end up pointing
> into the middle of unallocated space or into the middle of another object --
> i.e. a location that was previously occupied by the object they were
> previously pointing to).
>
> Also use VerifyBeforeGC/VerifyAfterGC to get more visibility into the
> problem.
>
> best.
> -- ramki
>
> David Holmes wrote:
>>
>> Tony,
>>
>> This is a GC question not a runtime question so I've cc'ed the GC list and
>> bcc'ed the runtime list.
>>
>> David
>>
>> Tony Guan said the following on 08/14/10 02:28:
>>>
>>> Hi there,
>>>
>>> I wrote one collector of my own in hotspot, but now I have a problem.
>>> In the new collector, I inserted another generation between the
>>> defNewGeneration and TenuredGeneration. To collect this new
>>> generation, I modified the defNewGeneration collector, which collects by
>>> copying survivors.
>>>
>>> After a collection of this generation, I now get a null pointer
>>> exception in the Java program. My guess is that the collector
>>> failed to identify the live objects in the generation. I need
>>> someone to tell me whether I missed a modification to the write
>>> barrier or card marking.
>>>
>>> As far as I know, the write barrier runs on every field write,
>>> regardless of which generation is affected, so the card of the
>>> written field will always be marked. So my problem
>>> should be in the closures that check whether there are any live
>>> objects in the generation. Am I right?
>>>
>>> Or other than the card scanning, is there anything special that I
>>> should process?
>>>
>>> Thanks a lot!
>>>
>>>
>>> Tony (Xiaohua Guan)
>
>


