sample_eden()

Fri Aug 9 20:48:29 UTC 2013

Jon,

Sorry for the late response.

Your explanation of what's happening sounds plausible to me.

I took a look at

  http://cr.openjdk.java.net/~jmasa/8021809

which I assume is related to this issue. Here's a thought: perhaps
since work_on_young_gen_roots() also reads an (incorrect)
_eden_chunk_index, it should also have a similar check to the one in
initialize_sequential_subtasks_for_young_gen_rescan()?
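
To make that concrete, here is a minimal standalone sketch, not HotSpot
code. It assumes the check is along the lines Jon proposed in the quoted
mail below (ignore the recorded chunks when eden is empty); I have not
copied it from the webrev. ToySpace, effective_chunk_count(),
rescan_task_count() and young_gen_roots_regions() are hypothetical
stand-ins for the real types and functions, and addresses are plain
integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not HotSpot code: a space is just [bottom, top).
struct ToySpace {
  std::size_t bottom;
  std::size_t top;  // top == bottom means the space is empty
  bool is_empty() const { return top == bottom; }
  bool contains(std::size_t start, std::size_t end) const {
    return bottom <= start && end <= top;
  }
};

// Hypothetical shared guard: a stale chunk index is ignored when eden is empty.
std::size_t effective_chunk_count(const ToySpace& eden,
                                  std::size_t eden_chunk_index) {
  return eden.is_empty() ? 0 : eden_chunk_index;
}

// Stand-in for the rescan setup: one task per chunk plus one for the tail.
std::size_t rescan_task_count(const ToySpace& eden,
                              std::size_t eden_chunk_index) {
  return effective_chunk_count(eden, eden_chunk_index) + 1;
}

// Stand-in for the initial-mark young gen roots work, using the same guard.
// Returns the number of non-empty regions it would scan.
std::size_t young_gen_roots_regions(const ToySpace& eden,
                                    const std::vector<std::size_t>& chunks,
                                    std::size_t eden_chunk_index) {
  const std::size_t n = effective_chunk_count(eden, eden_chunk_index);
  std::size_t start = eden.bottom;
  std::size_t regions = 0;
  for (std::size_t i = 0; i <= n; ++i) {
    const std::size_t end = (i < n) ? chunks[i] : eden.top;
    // Same shape as the failing assertion: empty region, or inside the space.
    assert(start == end || eden.contains(start, end));
    if (start != end) ++regions;
    start = end;
  }
  return regions;
}
```

With an empty eden and a stale index of 2, both stand-ins fall back to
zero chunks, so the assertion is never reached with an out-of-space
region. The point is only that both readers go through one guard.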

Here's another thought: how about resetting _eden_chunk_index to zero
at the end of Def/ParNewGeneration::collect()? This way, it'd be the
young generation that initiates both updating (via
sample_eden_chunk()) and resetting of _eden_chunk_index. This might
make sense.
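
A rough sketch of that second idea, again as a toy model rather than
HotSpot code. ToyYoungGen, its members, the reset_at_end switch and
chunk_state_consistent() are all hypothetical names, and collect() here
assumes a successful scavenge that leaves eden empty.

```cpp
#include <cassert>
#include <cstddef>

// Toy model, not HotSpot code: tracks only eden usage and the chunk index.
class ToyYoungGen {
 public:
  // Stand-in for allocation in eden.
  void allocate(std::size_t words) { _eden_used += words; }

  // Stand-in for sample_eden_chunk(): record one more chunk boundary.
  void sample_eden_chunk() {
    if (_eden_chunk_index < kMaxChunks) ++_eden_chunk_index;
  }

  // Stand-in for Def/ParNewGeneration::collect(). Assumes the scavenge
  // succeeds and empties eden. reset_at_end toggles the proposed reset.
  void collect(bool reset_at_end) {
    _eden_used = 0;
    if (reset_at_end) _eden_chunk_index = 0;
    assert(!reset_at_end || _eden_chunk_index == 0);
  }

  std::size_t eden_used() const { return _eden_used; }
  std::size_t eden_chunk_index() const { return _eden_chunk_index; }

 private:
  static const std::size_t kMaxChunks = 8;  // arbitrary toy capacity
  std::size_t _eden_used = 0;
  std::size_t _eden_chunk_index = 0;
};

// What a later CMS phase would want to hold: no chunks recorded for an
// empty eden.
bool chunk_state_consistent(const ToyYoungGen& young) {
  return young.eden_used() > 0 || young.eden_chunk_index() == 0;
}
```

Without the reset, a collect() followed directly by a CMS phase sees an
empty eden with a non-zero index, which is the state Jon describes. With
the reset, the young generation owns both ends of the index's lifetime.
The failed-scavenge case discussed in the quoted mails is not modelled
here and would need separate thought.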

Hiroshi

On Mon, Jul 29, 2013 at 4:06 PM, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
>
> On 7/29/13 2:29 PM, Srinivas Ramakrishna wrote:
>>
>> ah, makes sense! :-)
>>
>> So, if the scavenge failed, leaving stuff in Eden and the survivor
>> spaces, the chunks should still be valid if a CMS collection could
>> happen.
>
> I expect that would be the case although I'd have to look at the promotion
> failure handling code again.
>
>> (I wonder after asking that question, though, how CMS would deal with
>> such a situation -- two active survivor spaces, I think it deals OK
>> with it, but not sure if both scans are parallelized today or not, or if
>> the question is moot because the failed scavenge causes a bail out to
>> a stop-world gc... perhaps the latter?) Probably an academic question,
>> but still... :-)
>
>
> Worth looking into.
>
> Jon
>
>
>>
>> - ramki
>>
>> On Mon, Jul 29, 2013 at 1:46 PM, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
>>>
>>> Ramki,
>>>
>>> This has gotten interesting.  When
>>>
>>> 1) a System.gc() is called
>>>
>>> and
>>>
>>> 2) UseCMSCompactAtFullCollection is set to false
>>>
>>> the CMS generation tells GenCollectedHeap that CMS
>>> does not collect the young gen.  That's right but I hadn't
>>> appreciated that in that circumstance a young gen GC
>>> was done before the CMS gen is collected.  Makes sense
>>> but the GC epilogue code is called after all the generations
>>> have been collected by GenCollectedHeap so the
>>> _eden_chunk_index is not reset between the young GC
>>> and the CMS GC.
>>>
>>> I think the right fix is to use the occupancy of eden to
>>> decide what work we need to do instead of _eden_chunk_index.
>>>
>>> Jon
>>>
>>>
>>> On 7/26/13 6:41 PM, Srinivas Ramakrishna wrote:
>>>>
>>>> Don't have the code in front of me to check, but yes that would seem to be
>>>> the thing to do. I thought it was reset in the young gen gc epilogue ...
>>>>
>>>> ysr1729
>>>>
>>>> On Jul 26, 2013, at 14:46, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
>>>>
>>>>> Hiroshi,
>>>>>
>>>>> I'm looking at an assertion failure with CMSParallelInitialMarkEnabled
>>>>> and CMSEdenChunksRecordAlways both enabled.  The assertion
>>>>> failure is in do_young_space_rescan()
>>>>>
>>>>>    assert(mr.is_empty() || space->used_region().contains(mr),
>>>>>           "Should be in space");
>>>>>
>>>>> and the failure occurs because _eden_chunk_index is > 0 and
>>>>> eden is empty.
>>>>>
>>>>> A young GC has just occurred and a System.gc() is in progress where the
>>>>> System.gc() is executing the usual phases of CMS in a stop-the-world
>>>>> fashion.  A rarely seen scenario I think.  That is, the initial mark is
>>>>> being executed.
>>>>>
>>>>> I was looking at the places where _eden_chunk_index is reinitialized to
>>>>> 0.  I don't think you added any in your changes, right?
>>>>>
>>>>> I was thinking that _eden_chunk_index should be reset to 0 after
>>>>> a young GC where we know that eden is empty.  What do you think?
>>>>>
>>>>> Jon
>>>>>
>>>>>
>>>>>
>