Question about reference processing in JDK6/7

Mikael Gerdin mikael.gerdin at oracle.com
Tue Feb 25 19:09:38 UTC 2014


Hi Andrew,

On 2014-02-25 19:39, Andrew Dinn wrote:
> On 25/02/14 16:10, Mikael Gerdin wrote:
>>> The Culprit Change Set
>>> . . .
>>> In particular, the change set includes the following changes to method
>>> PSMarkSweep::mark_sweep_phase1() in file
>>> hotspot/src/share/vm/gc_implementation/parallelScavenge/psMarkSweep.cpp
>>>
>>> @@ -516,7 +516,6 @@
>>>     {
>>>       ParallelScavengeHeap::ParStrongRootsScope psrs;
>>>       Universe::oops_do(mark_and_push_closure());
>>> -    ReferenceProcessor::oops_do(mark_and_push_closure());
>>>       JNIHandles::oops_do(mark_and_push_closure());   // Global (strong) JNI handles
>>>       CodeBlobToOopClosure each_active_code_blob(mark_and_push_closure(), /*do_marking=*/ true);
>>>       Threads::oops_do(mark_and_push_closure(), &each_active_code_blob);
>>> @@ -623,7 +622,6 @@
>>>
>>>     // General strong roots.
>>>     Universe::oops_do(adjust_root_pointer_closure());
>>> -  ReferenceProcessor::oops_do(adjust_root_pointer_closure());
>>>     JNIHandles::oops_do(adjust_root_pointer_closure());   // Global (strong) JNI handles
>>>     Threads::oops_do(adjust_root_pointer_closure(), NULL);
>>>     ObjectSynchronizer::oops_do(adjust_root_pointer_closure());
>>> . . .
>> ReferenceProcessor::oops_do was only needed to keep the sentinel object alive
>> and adjust the static pointer if a perm gen compaction was performed.
>>
>> AFAIK this change was not intended to change the rate at which references were
>> processed, only to find a new solution for how to keep track of the linked
>> lists without using a "sentinel" object.
>
> Ok, I see now that this is indeed not the actual source of the problem.
> I thought that deleted call above ensured that all discovered references
> were processed but clearly the only thing the closure did was push and
> mark the sentinel.
>
> Having got rid of that red herring I have now found the source of the
> problem. It appears to be related to the changeover from using the
> 'next' field to link discovered Reference instances to using the
> 'discovered' field. This change went in shortly after the one I
> mentioned in my previous post: i.e.
>
> Revision: 3155
> Branch: default
> Author: ysr  2011-09-07 21:55:42
> Committer: ysr  2011-09-07 21:55:42
> Parent: 3154:05550041d664 (Merge)
> Child:  3156:a6128a8ed624 (7086226: UseNUMA fails on old versions of
> windows)
>
>      4965777: GC changes to support use of discovered field for pending
> references
>      Summary: If and when the reference handler thread is able to use the
> discovered field to link reference objects in its pending list, so will
> GC. In that case, GC will scan through this field once a reference
> object has been placed on the pending list, but not scan that field
> before that stage, as the field is used by the concurrent GC thread to
> link discovered objects. When ReferenceHandleR thread does not use the
> discovered field for the purpose of linking the elements in the pending
> list, as would be the case in older JDKs, the JVM will fall back to the
> old behaviour of using the next field for that purpose.
>      Reviewed-by: jcoomes, mchung, stefank
>
> The JDK is supposed to communicate whether it expects to use 'next' or
> 'discovered' by setting a flag bit in the jdk_version_info struct named
> 'pending_list_uses_discovered_field'. The problem is that the jdk6 code
> in JDK_GetVersionInfo0 does not clear this struct correctly before
> setting the fields it knows about.
>
> jdk_util.c:80
>
>      memset(info, 0, sizeof(info_size));
>
> info_size contains the size of info (i.e. 24), but sizeof(info_size) is
> the size of the size variable itself, so this call always zeros only the
> first 8 bytes. In the latest jdk7u this has been corrected to
>
>      memset(info, 0, info_size);
>
> So, if the stack location where info is allocated happens to contain a 1
> in the wrong place then the GC starts linking discovered references
> using the wrong field and no reference processing occurs.
>
> Thanks for your help in correcting my misunderstanding and hence
> enabling me to find this :-)

I had a feeling that it was related to the JDK-4965777 change, but since
it went into the JDK7u2 JVM without any obvious problems in JDK7u, I
decided it had to be something else.

Great that you found the bug!
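
For anyone reading this thread later, the sizeof mix-up can be reproduced
in isolation. What follows is only a minimal sketch, not the jdk_util.c
code: the struct fake_version_info and the helpers clear_buggy,
clear_fixed and count_nonzero are hypothetical names made up for
illustration, and the struct layout is an assumption chosen so that it is
24 bytes (with 4-byte unsigned int) like the size Andrew mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for jdk_version_info; NOT the real layout.
 * Six 4-byte words give 24 bytes on common platforms. */
typedef struct {
    unsigned int words[6];
} fake_version_info;

/* Buggy variant: sizeof(info_size) is the size of the size_t variable
 * (typically 8 on a 64-bit build), not the size of the struct. */
static void clear_buggy(void *info, size_t info_size) {
    assert(info != NULL);
    memset(info, 0, sizeof(info_size));
}

/* Fixed variant: clears exactly info_size bytes. */
static void clear_fixed(void *info, size_t info_size) {
    assert(info != NULL);
    memset(info, 0, info_size);
}

/* Counts the bytes in [p, p + n) that are still non-zero. */
static size_t count_nonzero(const void *p, size_t n) {
    const unsigned char *bytes = (const unsigned char *)p;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (bytes[i] != 0) {
            count++;
        }
    }
    return count;
}
```

Filling a fake_version_info with 0xFF (to imitate stale stack contents)
and then calling clear_buggy leaves sizeof(fake_version_info) minus
sizeof(size_t) bytes untouched, while clear_fixed leaves none. Any flag
bit that lives in the untouched tail keeps whatever value happened to be
on the stack, which matches the intermittent behaviour described above.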

/Mikael

>
> regards,
>
>
> Andrew Dinn
> -----------
>