RFR: jsr166 jdk9 integration wave 12
Paul Sandoz
Paul.Sandoz at oracle.com
Thu Nov 17 20:03:00 UTC 2016
> On 16 Nov 2016, at 16:03, Martin Buchholz <martinrb at google.com> wrote:
>
> Apologies for having this wave turn into a tsunami.
>
> CopyOnWriteArrayList was rewritten due to finding https://bugs.openjdk.java.net/browse/JDK-8169738
> ArrayDeque and ArrayBlockingQueue now both use efficient nested loop idiom for all traversals.
> Most bulk remove methods in all collection classes were rewritten for efficiency, using similar code.
> Lots of tests and benchmarks added.
Ok, i will just go through all of it.
Collection-optimization
—
Perhaps consolidate the repeated mini bit set methods as package private static methods on AbstractCollection? Unclear where the j.u.concurrent ones should go though...
It’s purely a subjective thing but i would be inclined to change the variable name “deathRow" to something more neutral such as “removedBitSet”.
The 2 layer nested loop is quite clever, certainly twists the mind when first encountered. It’s arguably more readable if there are two separate, (not-nested) loops, but that also requires two explicit invocations of the action, which may perturb the performance characteristics.
ArrayDeque
—
188 public ArrayDeque(int numElements) {
189 elements = new Object[Math.max(1, numElements + 1)];
190 }
What if Integer.MAX_VALUE is passed?
202 public ArrayDeque(Collection<? extends E> c) {
203 elements = new Object[c.size() + 1];
204 addAll(c);
205 }
What if c.size() returns Integer.MAX_VALUE?
226 * Adds i and j, mod modulus.
227 * Precondition and postcondition: 0 <= i < modulus, 0 <= j <= modulus.
228 */
229 static final int add(int i, int j, int modulus) {
Is that upper bound correct for j? 0 <= j < modulus ?
317 c.forEach(e -> addLast(e));
this::addLast, up to you which you prefer
843 public boolean tryAdvance(Consumer<? super E> action) {
844 if (action == null)
845 throw new NullPointerException();
846 int t, i;
847 if ((t = fence) < 0) t = getFence();
Is that for optimisation purposes, since the same check is also performed in getFence? If so that seems like overkill
848 if (t == (i = cursor))
849 return false;
850 final Object[] es;
851 action.accept(nonNullElementAt(es = elements, i));
852 cursor = inc(i, es.length);
Recommend updating cursor before the action
853 return true;
854 }
Collection8Test
—
429 public void testRemoveAfterForEachRemaining() {
Are tests run by default with testImplementationDetails == true? I am guessing so.
Semaphore
—
633 /**
634 * Acquires and returns all permits that are immediately available.
635 * Upon return, zero permits are available.
636 *
637 * @return the number of permits acquired
638 */
639 public int drainPermits() {
640 return sync.drainPermits();
641 }
Arguably, if positive acquires all permits, otherwise releases all permits. Perhaps:
Acquires and returns all permits that are immediately available, otherwise releases all permits (if negative).
Upton return, zero permits are available.
@return the number of permits acquired, or the number of permits release (a negative value)
Probably requires a CCC which i can manage.
SplittableRandom
—
533 /**
534 * Generates a pseudorandom number with the indicated number of
535 * low-order bits. Because this class has no subclasses, this
536 * method cannot be invoked or overridden.
537 *
538 * @param bits random bits
539 * @return the next pseudorandom value from this random number
540 * generator's sequence
541 */
542 protected int next(int bits) {
543 return nextInt() >>> (32 - bits);
544 }
545
Unless i am missing something really subtle, I don’t think this is required since SplittableRandom does not extend from Random (unlike in ThreadLocalRandom), and no associated reflective test is required.
ForkJoin
—
69 * tasks that are never joined. All worker threads are initialized
70 * with {@link Thread#isDaemon} set {@code true}.
CCC?
Thanks,
Paul.
>
>
> On Wed, Nov 16, 2016 at 3:34 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> Hi Martin,
>
> Glad you looked more closely at the perf aspects of ArrayDeque.
>
> I am struggling to determine what has changed since the last revision. Apart from ArrayDeque what files should i be looking at?
>
> Thanks,
> Paul.
>
> > On 16 Nov 2016, at 12:14, Martin Buchholz <martinrb at google.com> wrote:
> >
> > Thanks to Vitaly and Doug for being more concerned about offer/poll
> > performance than I was initially. ArrayDeque is now back to using the head
> > + tail + spare slot strategy, which keeps optimal performance for non-bulk
> > operations, while everything else gets faster. We have new tests and new
> > benchmarks, as a result of which new bugs were found and fixed, and this
> > integration wave has grown rather larger than originally intended.
> >
> > The JIT-friendly nested loop strategy was successful for making bulk
> > operations on circular arrays faster, and this is used for both ArrayDeque
> > and ArrayBlockingQueue.
> >
> > + /*
> > + * VMs excel at optimizing simple array loops where indices are
> > + * incrementing or decrementing over a valid slice, e.g.
> > + *
> > + * for (int i = start; i < end; i++) ... elements[i]
> > + *
> > + * Because in a circular array, elements are in general stored in
> > + * two disjoint such slices, we help the VM by writing unusual
> > + * nested loops for all traversals over the elements.
> > + */
> >
> >
> > Bulk removal operations on array-based collections retreat to a two-pass
> > algorithm, which is slightly slower (but still O(n)), but continues to
> > support (questionable) predicates that reentrantly access the collection in
> > read mode.
> >
> > http://cr.openjdk.java.net/~martin/webrevs/openjdk9/jsr166-jdk9-integration/
> >
> > On Fri, Oct 21, 2016 at 12:32 PM, Martin Buchholz <martinrb at google.com>
> > wrote:
> >
> >>
> >>
> >> On Tue, Oct 18, 2016 at 4:18 PM, Martin Buchholz <martinrb at google.com>
> >> wrote:
> >>
> >>>
> >>>
> >>> On Tue, Oct 18, 2016 at 4:00 PM, Vitaly Davidovich <vitalyd at gmail.com>
> >>> wrote:
> >>>
> >>>>
> >>>>>> * Change in allocation/capacity policy.
> >>>>>>
> >>>>>> The removal of the power-of-two restriction, and applying a 1.5x
> >>>>> growth
> >>>>>> factor (same as ArrayList) seems sensible. Does this mean that the
> >>>>> ability
> >>>>>> to compute the proper array index by using x & (length-1) wasn't
> >>>>> worth it?
> >>>>>> Or at least not worth the extra tail wastage?
> >>>>>>
> >>>>>
> >>>>> There's no integer division to optimize, as with hash tables.
> >>>>
> >>>> But it does add some new branches, doesn't it? Potentially branches that
> >>>> aren't taken prior to JIT compilation, but taken later = deopt.
> >>>>
> >>>
> >>> If it's a smidgeon slower, I don't care that much; improvement in
> >>> flexibility and scalability are more important.
> >>>
> >>> Of course, I do care about algorithmic improvements to e.g. removeIf
> >>>
> >>>
> >>>> Curious if you ran some benchmarks on ArrayDeque.
> >>>>
> >>>
> >>> Not yet. Who would like to show how slow the code is?
> >>>
> >>
> >> I revived my ancient IteratorMicroBenchmark for the latest webrev, made it
> >> a proper jtreg test, added ArrayDeque Jobs, and verified that the new code
> >> is measurably faster than the old code, at least for the mundane job of
> >> iterating.
> >> http://cr.openjdk.java.net/~martin/webrevs/openjdk9/
> >> jsr166-jdk9-integration/ArrayList/
> >>
> >>
>
>
More information about the core-libs-dev
mailing list