RFR (M): 8159422: Very high Concurrent Mark mark stack contention
Erik Österlund
erik.osterlund at oracle.com
Wed Aug 3 15:37:03 UTC 2016
Hi Thomas,
I just had a look at your code and am wondering how your lock-free stack
handles the classic ABA problem. It's not obvious to me.
In detail:
When you pop something you:
1) Load the head
2) Load the next pointer of that head
3) CAS the head, expecting the value loaded in 1) and installing the
next pointer loaded in 2) as the new head, on the assumption that this
next pointer is still the node's current one
This is where things can go wrong. Between 2) and 3), another thread can
win the race and pop the node first, then logically free it by putting
it back on the freelist. After arbitrary further pushes and pops on the
original stack, the same node can be grabbed from the freelist and
pushed back onto the lock-free stack, this time with a completely
different next pointer than the one loaded in 2). The CAS in 3) then
succeeds on the invalid assumption that it is the same node as before
and hence still has the same next pointer.
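To make the interleaving concrete, here is a minimal sketch that replays
it on a single thread. It is not taken from the webrev: NaiveStack, Node
and aba_corrupts_naive_stack are hypothetical names, and the stack is a
deliberately naive Treiber-style one, assumed only for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type, not the one used in the webrev.
struct Node {
  int   value;
  Node* next;
};

// Deliberately naive lock-free stack with no ABA protection.
struct NaiveStack {
  std::atomic<Node*> head{nullptr};

  void push(Node* n) {
    Node* old = head.load();
    do {
      n->next = old;
    } while (!head.compare_exchange_weak(old, n));
  }

  Node* pop() {
    Node* old = head.load();                              // step 1
    while (old != nullptr &&
           !head.compare_exchange_weak(old, old->next)) { // steps 2 + 3
      // 'old' was refreshed by the failed CAS; retry.
    }
    return old;
  }
};

// Replays the problematic interleaving sequentially. Returns true if the
// stale CAS succeeded and left head pointing at a node that was popped.
bool aba_corrupts_naive_stack() {
  Node a{1, nullptr}, b{2, nullptr}, c{3, nullptr};
  NaiveStack s;
  s.push(&c); s.push(&b); s.push(&a);  // head -> a -> b -> c

  // "Thread 1" performs steps 1 and 2, then is preempted.
  Node* seen_head = s.head.load();     // a
  Node* seen_next = seen_head->next;   // b

  // "Thread 2" runs in between.
  Node* pa = s.pop();                  // a, logically freed
  Node* pb = s.pop();                  // b, logically freed
  assert(pa == &a && pb == &b);
  s.push(pa);                          // a is reused: head -> a -> c

  // "Thread 1" resumes with step 3, still believing a->next == b.
  bool cas_ok = s.head.compare_exchange_strong(seen_head, seen_next);
  return cas_ok && s.head.load() == pb;
}
```

After the final CAS the head points at b, which is supposed to be on the
freelist, and a has silently dropped out of the stack.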
Is that what those counters are there for, or am I missing something?
Perhaps versioned pointers, hazard pointers or epoch based safe memory
reclamation would be good tools here.
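As a rough illustration of the versioned pointer option, the sketch
below packs a node index and a version counter into one 64-bit word and
replays the same interleaving. All names (VersionedStack, pack,
versioned_cas_rejects_stale_head) and the 32/32-bit layout are my
assumptions, not what the webrev does.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout: low 32 bits = node index (0 means "empty"),
// high 32 bits = version, bumped on every successful push or pop.
struct VersionedStack {
  static constexpr std::uint32_t kNil = 0;
  std::atomic<std::uint32_t> next[8];   // next[i]: index below node i
  std::atomic<std::uint64_t> head{0};

  VersionedStack() { for (auto& n : next) n.store(kNil); }

  static std::uint64_t pack(std::uint32_t idx, std::uint32_t ver) {
    return (static_cast<std::uint64_t>(ver) << 32) | idx;
  }
  static std::uint32_t index_of(std::uint64_t h) {
    return static_cast<std::uint32_t>(h);
  }
  static std::uint32_t version_of(std::uint64_t h) {
    return static_cast<std::uint32_t>(h >> 32);
  }

  void push(std::uint32_t idx) {
    std::uint64_t old = head.load();
    do {
      next[idx].store(index_of(old));
    } while (!head.compare_exchange_weak(
                 old, pack(idx, version_of(old) + 1)));
  }

  std::uint32_t pop() {
    std::uint64_t old = head.load();
    while (index_of(old) != kNil &&
           !head.compare_exchange_weak(
               old, pack(next[index_of(old)].load(),
                         version_of(old) + 1))) {
      // 'old' was refreshed by the failed CAS; retry.
    }
    return index_of(old);
  }
};

// Same interleaving as in the ABA scenario; returns true if the stale
// CAS is rejected and the stack is left intact.
bool versioned_cas_rejects_stale_head() {
  using S = VersionedStack;
  S s;
  s.push(3); s.push(2); s.push(1);      // head -> 1 -> 2 -> 3

  // "Thread 1": load head and its next, then get preempted.
  std::uint64_t seen_head = s.head.load();
  std::uint32_t seen_next = s.next[S::index_of(seen_head)].load(); // 2

  // "Thread 2": pop 1, pop 2, push 1 again.
  std::uint32_t a = s.pop();
  s.pop();
  assert(a == 1);
  s.push(a);                            // head -> 1 -> 3, new version

  // "Thread 1" resumes: same index on top, but the version differs.
  std::uint64_t desired = S::pack(seen_next, S::version_of(seen_head) + 1);
  bool cas_ok = s.head.compare_exchange_strong(seen_head, desired);
  return !cas_ok && S::index_of(s.head.load()) == 1 &&
         s.next[1].load() == 3;
}
```

A 32-bit counter can wrap around, so this makes ABA very unlikely rather
than strictly impossible; hazard pointers or epochs avoid that caveat at
the cost of more machinery.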
Thanks,
/Erik
On 2016-08-02 11:24, Thomas Schatzl wrote:
> Hi everyone,
>
> could someone take a look at this change?
>
> Its FC extension request has already been approved too...
>
> Thanks,
> Thomas
>
> On Tue, 2016-07-19 at 17:38 +0200, Thomas Schatzl wrote:
>> Hi all,
>>
>> can I have reviews for this change that removes the global (heavy-
>> weight) lock when accessing the global mark stack?
>>
>>
>> The change converts the lock and high-water mark based management of
>> the global mark stack into a lock-free high-water mark and free list
>> based one.
>>
>> In the previous review for JDK-8160897 I already mentioned that the
>> global lock when pushing/popping elements from the global mark stack
>> is very problematic particularly when there are many marking threads
>> in the system.
>>
>> Overall, particularly at the end of marking (both in the concurrent
>> phases as well as during remark) this behavior represents a
>> significant bottleneck.
>>
>> Particularly if there is a lot of traffic from and to the mark stack
>> (to be addressed by JDK-8057003), this results in marking not
>> completing quickly enough.
>>
>> There is a customer application on a 1 TB heap (up to 80% full at
>> times) where this results in full GC pauses of lunch-break length
>> when concurrent marking does not complete in time.
>>
>> Overall, together with JDK-8057003, this change reduces marking times
>> from >500 seconds to a manageable 10-30s. :) (at 100 concurrent
>> marking threads, more could be used) Microbenchmarks like the one
>> from JDK-8057003 also basically scale linearly with the number of
>> threads then.
>>
>> This change will also help improve the time to safepoint
>> significantly; because if there is a safepoint request while draining
>> the mark stacks, it will now yield much earlier.
>>
>> There is one drawback: internal management reduces the usable mark
>> stack by around 0.1 percent. Since the follow-up, JDK-8057003, reduces
>> mark stack usage by quite a bit, this has been considered acceptable.
>>
>> This is an enhancement, which is waiting for final approval.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8159422
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8159422/webrev/
>> Testing:
>> jprt, nightly run, several vm.gc runs, internal benchmarks
>>
>> Thanks,
>> Thomas
>>