RFR/RFC: Make OOM-during-evacuation race-free
Aleksey Shipilev
shade at redhat.com
Mon Oct 23 16:22:20 UTC 2017
On 10/23/2017 05:59 PM, Roman Kennke wrote:
> Yes I see all that. But we have found out that this is a correctness issue, and that trumps
> performance, even if it's just a very minuscule case.
This is the correctness issue on the cancellation path again. And we have lots of band-aids there
already, and this is yet another band-aid. What makes it different from other band-aids is that it
touches the code we *know* is performance critical. It is a nice exercise, but a band-aid
nevertheless. Noisy performance data may lull us into believing the performance impact is okay.
> If we can come up with another solution that makes running OOM-during-evac 100% I'm all for it. I'm
> not fixed on my proposal, I just wanted to throw it out for discussion and bring something to the
> table that we can do some performance tests with.
This fwdptr mangling stuff is maybe our fallback plan if, say, the reservation scheme does not work
out. If the reservation scheme does work, the whole cancellation issue goes away.
It makes little sense in my mind to allocate time for fallback plans that have bad theoreticals
before we work out and try the fix that has good theoreticals. We are still at the stage in the
project where we don't have to rush intrusive band-aids out. We can actually take the time to
reimplement parts of the collector and solve the issue "properly".
I do wonder if, instead of mangling the bits, we could reserve a "shadow" uncommitted, memory-protected
heap, and set the fwdptr to point into that? Then we can intercept the SEGVs coming from accesses to
that shadow heap, and redirect them to the proper objects. This leaves the usual codepath the same,
without ANDs, and the failure path would experience read storms, but why would that matter if we are
on the failure path?
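To make the shadow-heap idea a bit more concrete, here is a minimal sketch in C, assuming a
Linux/POSIX system. It is not Shenandoah code: the names real_heap, shadow_heap, to_shadow and
load_field are all hypothetical. For portability the sketch recovers from the fault with
sigsetjmp/siglongjmp, which adds cost to every guarded load. A real collector would presumably
fix up the faulting context inside the handler instead, so that the normal path stays a plain load.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE              /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#define HEAP_SIZE ((size_t)1 << 16)  /* hypothetical heap size for the sketch */

static char *real_heap;              /* committed, readable and writable */
static char *shadow_heap;            /* reserved only, PROT_NONE: any access faults */
static sigjmp_buf recover_point;
static volatile sig_atomic_t in_guarded_load;
static void *volatile redirected_addr;

/* SIGSEGV handler: translate a fault inside the shadow heap to the real heap. */
static void segv_handler(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    char *fault = (char *)info->si_addr;
    if (in_guarded_load && fault >= shadow_heap && fault < shadow_heap + HEAP_SIZE) {
        redirected_addr = real_heap + (fault - shadow_heap);
        siglongjmp(recover_point, 1);
    }
    /* Not a shadow-heap fault: restore the default action and re-fault on return. */
    signal(SIGSEGV, SIG_DFL);
}

/* Map both heaps and install the handler. Returns 0 on success, -1 on failure. */
int shadow_heap_init(void) {
    real_heap = mmap(NULL, HEAP_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    shadow_heap = mmap(NULL, HEAP_SIZE, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (real_heap == MAP_FAILED || shadow_heap == MAP_FAILED) return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Compute the shadow-heap alias of an address in the real heap. */
void *to_shadow(void *obj) {
    char *p = (char *)obj;
    assert(p >= real_heap && p < real_heap + HEAP_SIZE);
    return shadow_heap + (p - real_heap);
}

/* Load a field through a pointer that may point into the shadow heap. */
long load_field(long *p) {
    long value;
    in_guarded_load = 1;
    if (sigsetjmp(recover_point, 1) == 0) {
        value = *(volatile long *)p;       /* usual path: plain load, no masking */
    } else {
        value = *(long *)redirected_addr;  /* failure path: redirected by the handler */
    }
    in_guarded_load = 0;
    return value;
}
```

After shadow_heap_init(), a load through an ordinary heap pointer never enters the handler, while a
load through to_shadow(obj) faults once and is served from the real object. That is the "read
storm" on the failure path: every such access pays for a signal delivery.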
-Aleksey