Fixing the OOM-during-evac
Roman Kennke
rkennke at redhat.com
Wed Feb 28 13:50:16 UTC 2018
On Wed, Feb 28, 2018 at 2:39 PM, Zhengyu Gu <zgu at redhat.com> wrote:
>
>> Upon entry to the evac path, every thread attempts to increment the
>> counter using a CAS. Depending on the result of the CAS:
>> - success: carry on with evac
>> - failure:
>>   - if the offending value is a valid counter, try again
>>   - if the offending value is the OOM-during-evac special value: loop
>>     until the counter drops to 0, then exit with read-barrier
>
>
> It is not clear to me: what if the initial tests succeed, but two Java
> threads hit OOM simultaneously? Can you elaborate?
Then both threads attempt to CAS their current (counter |
OOM_MARKER_MASK) into the counter field. Only one can succeed, but
that doesn't really matter: both will then go into a loop and wait
until the counter drops to zero, and then return with RB. Do you see a
problem with that?
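
To make the simultaneous-OOM case concrete, here is a rough sketch of the
protocol in standalone C++. It is not the prototype, and all names
(EvacOomProtocol, handle_oom, simulate_simultaneous_oom, the choice of the
top bit as OOM_MARKER_MASK) are made up for illustration. Two assumptions
that the description does not spell out: a thread that hits OOM drops its
own count before it starts waiting, and "counter drops to zero" refers to
the count bits only, with the marker bit left set.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Assumption: the top bit flags OOM-during-evac; the low bits count the
// threads that are currently inside the evac path.
constexpr std::uint32_t OOM_MARKER_MASK = 0x80000000u;

// Hypothetical illustration only, not the Shenandoah code.
class EvacOomProtocol {
public:
  // Entry of the evac path. Returns true if the caller may evacuate, or
  // false if OOM is flagged and the caller must use the read barrier.
  bool enter() {
    std::uint32_t cur = counter_.load();
    for (;;) {
      if (cur & OOM_MARKER_MASK) {
        wait_for_no_evac_threads();
        return false;
      }
      if (counter_.compare_exchange_weak(cur, cur + 1)) {
        return true;
      }
      // CAS failed: cur now holds the offending value, inspect it again.
    }
  }

  // Exit of the evac path: plain atomic decrement.
  void exit() {
    std::uint32_t prev = counter_.fetch_sub(1);
    assert((prev & ~OOM_MARKER_MASK) > 0);
    (void)prev;
  }

  // Called by a thread inside the evac path that failed to allocate.
  void handle_oom() {
    std::uint32_t cur = counter_.load();
    // Try to CAS (counter | OOM_MARKER_MASK) in. Stop as soon as the marker
    // is set, whether by this thread or by a racing one.
    while (!(cur & OOM_MARKER_MASK) &&
           !counter_.compare_exchange_weak(cur, cur | OOM_MARKER_MASK)) {
    }
    exit();                      // assumption: drop our own count first
    wait_for_no_evac_threads();  // then the caller returns with RB
  }

  std::uint32_t raw() const { return counter_.load(); }

private:
  void wait_for_no_evac_threads() {
    while ((counter_.load() & ~OOM_MARKER_MASK) != 0) {
      std::this_thread::yield();
    }
  }

  std::atomic<std::uint32_t> counter_{0};
};

// Two threads enter successfully, then both hit OOM at the same time.
// Returns the final raw counter value.
std::uint32_t simulate_simultaneous_oom() {
  EvacOomProtocol p;
  std::atomic<int> entered{0};
  auto worker = [&] {
    p.enter();  // succeeds: nobody has flagged OOM yet
    ++entered;
    while (entered.load() < 2) std::this_thread::yield();
    p.handle_oom();  // both race to set the marker, then both wait
  };
  std::thread a(worker), b(worker);
  a.join();
  b.join();
  return p.raw();
}
```

In this sketch it does not matter which thread wins the CAS: the loser
sees the marker in the offending value and stops retrying. Both threads
then leave once the count bits reach zero, so the counter ends up holding
only the marker.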
>
> Thanks,
>
> -Zhengyu
>
>
>>
>> Upon exit, every thread decrements the counter using an atomic dec.
>>
>> Upon OOM-during-evac, a thread attempts to CAS the OOM-during-evac
>> special value into the counter. Depending on the result:
>> - success: busy-loop until the counter drops to zero, then exit with RB
>> - failure:
>>   - offender is a valid counter update: try again
>>   - offender is OOM-during-evac: busy-loop until the counter drops to
>>     zero, then exit with RB
>>
>> For Java threads, this protocol needs to be done in the fast
>> (assembly) path too, because they can cause evacs. Or else, we could
>> decide to disable the fast-path altogether (I was never really sure if
>> the extra assembly did us much good).
>>
>> GC threads don't have to protect every single evacuation, but can
>> instead do the protocol wholesale: i.e. enter on worker start, and
>> exit on worker done.
>>
>> Please think hard about this possible solution and try to poke holes
>> in it. Meanwhile, I'll come up with a prototype.
>>
>> Cheers, Roman
>>
>