Question about reference processing in JDK6/7

Tue Feb 25 18:39:19 UTC 2014

On 25/02/14 16:10, Mikael Gerdin wrote:
>> The Culprit Change Set
>> . . .
>> In particular, the change set includes the following changes to method
>> PSMarkSweep::mark_sweep_phase1() in file
>> hotspot/src/share/vm/gc_implementation/parallelScavenge/psMarkSweep.cpp
>>
>> @@ -516,7 +516,6 @@
>>    {
>>      ParallelScavengeHeap::ParStrongRootsScope psrs;
>>      Universe::oops_do(mark_and_push_closure());
>> -    ReferenceProcessor::oops_do(mark_and_push_closure());
>>      JNIHandles::oops_do(mark_and_push_closure());   // Global (strong) JNI handles
>>      CodeBlobToOopClosure each_active_code_blob(mark_and_push_closure(), /*do_marking=*/ true);
>>      Threads::oops_do(mark_and_push_closure(), &each_active_code_blob);
>> @@ -623,7 +622,6 @@
>>
>>    // General strong roots.
>>    Universe::oops_do(adjust_root_pointer_closure());
>> -  ReferenceProcessor::oops_do(adjust_root_pointer_closure());
>>    JNIHandles::oops_do(adjust_root_pointer_closure());   // Global (strong) JNI handles
>>    Threads::oops_do(adjust_root_pointer_closure(), NULL);
>>    ObjectSynchronizer::oops_do(adjust_root_pointer_closure());
>> . . .
> ReferenceProcessor::oops_do was only needed to keep the sentinel object alive 
> and adjust the static pointer if a perm gen compaction was performed.
> 
> AFAIK this change was not intended to change the rate at which references were 
> processed, only to find a new solution for how to keep track of the linked 
> lists without using a "sentinel" object.

Ok, I see now that this is indeed not the actual source of the problem.
I thought that the deleted call above ensured that all discovered
references were processed, but clearly the only thing the closure did
was mark and push the sentinel.

Having got rid of that red herring I have now found the source of the
problem. It appears to be related to the changeover from using the
'next' field to link discovered Reference instances to using the
'discovered' field. This change went in shortly after the one I
mentioned in my previous post: i.e.

Revision: 3155
Branch: default
Author: ysr  2011-09-07 21:55:42
Committer: ysr  2011-09-07 21:55:42
Parent: 3154:05550041d664 (Merge)
Child:  3156:a6128a8ed624 (7086226: UseNUMA fails on old versions of windows)

    4965777: GC changes to support use of discovered field for pending
    references
    Summary: If and when the reference handler thread is able to use the
    discovered field to link reference objects in its pending list, so will
    GC. In that case, GC will scan through this field once a reference
    object has been placed on the pending list, but not scan that field
    before that stage, as the field is used by the concurrent GC thread to
    link discovered objects. When ReferenceHandleR thread does not use the
    discovered field for the purpose of linking the elements in the pending
    list, as would be the case in older JDKs, the JVM will fall back to the
    old behaviour of using the next field for that purpose.
    Reviewed-by: jcoomes, mchung, stefank

The JDK is supposed to communicate whether it expects to use 'next' or
'discovered' by setting a flag bit in the jdk_version_info struct named
'pending_list_uses_discovered_field'. The problem is that the jdk6 code
in JDK_GetVersionInfo0 does not clear this struct correctly before
setting the fields it knows about.

jdk_util.c:80

    memset(info, 0, sizeof(info_size));

info_size contains the size of info (i.e. 24), but sizeof(info_size) is
the size of the info_size variable itself, so this call always zeros
only the first 8 bytes. In the latest jdk7u this has been corrected to

    memset(info, 0, info_size);

So, if the stack location where info is allocated happens to contain a 1
in the bit that is read as pending_list_uses_discovered_field, then the
GC starts linking discovered references using the wrong field and no
reference processing occurs.
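
As a minimal sketch of the difference between the two memset calls (the
struct and function names here are hypothetical stand-ins, not the JDK's
own definitions, and the 24-byte size assumes a 4-byte unsigned int):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for jdk_version_info; 24 bytes if int is 4 bytes. */
typedef struct {
    unsigned int words[6];
} demo_version_info;

/* Same shape as the jdk6 call: sizeof(info_size) is the size of the
 * size_t variable, not the value it holds. */
static void clear_buggy(demo_version_info *info, size_t info_size) {
    memset(info, 0, sizeof(info_size));
}

/* Same shape as the jdk7u call: clears info_size bytes. */
static void clear_fixed(demo_version_info *info, size_t info_size) {
    memset(info, 0, info_size);
}

/* Fill a struct with 0xFF to simulate stack garbage, clear it with the
 * given function, and count the bytes that are still nonzero. */
static size_t stale_bytes_after(void (*clear)(demo_version_info *, size_t)) {
    demo_version_info info;
    const unsigned char *p = (const unsigned char *)&info;
    size_t stale = 0;
    size_t i;

    assert(clear != NULL);
    memset(&info, 0xFF, sizeof(info));
    clear(&info, sizeof(info));
    for (i = 0; i < sizeof(info); i++) {
        if (p[i] != 0) {
            stale++;
        }
    }
    return stale;
}
```

Calling stale_bytes_after(clear_fixed) leaves no stale bytes, whereas
stale_bytes_after(clear_buggy) leaves everything past the first
sizeof(size_t) bytes untouched, so any flag bit stored there keeps
whatever value the stack happened to hold.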

Thanks for your help in correcting my misunderstanding and hence
enabling me to find this :-)

regards,

Andrew Dinn
-----------