RFR (S): Buffered TQ buffer breaks LIFO
Aleksey Shipilev
shade at redhat.com
Tue Jan 24 21:33:01 UTC 2017
Hi,
When doing the single-entry buffer in TQ, I missed an obvious thing: if we
bypass buffer on queue push, then the queue stops being LIFO. In other words,
current code does:
template <class E, MEMFLAGS F, unsigned int N>
inline bool BufferedOverflowTaskQueue<E, F, N>::push(E t)
{
if (_buf_empty) {
_elem = t;
_buf_empty = false;
return true;
} else {
return taskqueue_t::push(t); // oops, jumping over the buf
}
}
...which means that if we push and pop (1, 2, 3); we will get (1, 3, 2), not (3,
2, 1), as expected. Among other things, this has implications for work stealing.
In case of divide-and-conquer array handling, we keep the largest task in the
buffer, while we should have pushed it out into the tail.
Fix:
http://cr.openjdk.java.net/~shade/shenandoah/taskqueue-fix-lifo/webrev.01/
This, of course, makes array marking much faster:
100M array:
before: 121.96 s (a = 916964 us) (n = 133)
(lvls, us = 707031, 853516, 923828, 982422, 1154833)
after: 79.54 s (a = 580586 us) (n = 137)
(lvls, us = 544922, 566406, 574219, 582031, 648559)
...while keeping Tree and HashMap traversals roughly the same:
20M Tree:
before: 59.56 s (a = 437911 us) (n = 136)
(lvls, us = 408203, 425781, 433594, 435547, 488014)
after: 62.18 s (a = 457238 us) (n = 136)
(lvls, us = 425781, 445312, 451172, 460938, 509219)
20M HashMap:
before: 139.84 s (a = 1282913 us) (n = 109)
(lvls, us = 1152344, 1250000, 1289062, 1308594, 1363561)
after: 137.04 s (a = 1268875 us) (n = 108)
(lvls, us = 1171875, 1230469, 1269531, 1289062, 1327203)
Testing: hotspot_gc_shenandoah, targeted benchmarks
Thanks,
-Aleksey
More information about the shenandoah-dev
mailing list