RFR: 8294693: Add Collections.shuffle overload that accepts RandomGenerator interface [v3]
Tagir F. Valeev
tvaleev at openjdk.org
Sat Oct 15 07:32:59 UTC 2022
On Fri, 14 Oct 2022 02:20:43 GMT, jmehrens <duke at openjdk.org> wrote:
>> Tagir F. Valeev has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fixes according to review
>>
>> 1. Reduce duplication in tests
>> 2. Use JumpableGenerator#copy() instead of create(1) in tests, as according to the spec, seed can be ignored
>> 3. Simplify documentation for shuffle(List, Random) to avoid duplication.
>
> Should this PR be held back by JEP: 431: Sequenced Collections?
>
> Depending on the final API for SequencedCollection I would think it would be possible to shuffle the contents of a sequenced collection (accessed order set) if replaceAll(java.util.function.UnaryOperator) was added from List. Shuffle could be implemented like the current non-random access branch today but replace the ListIterator logic with replaceAll.
>
> Waiting for that API to settle might save us from having to add another shuffle overload to Collections. Currently there is a workaround to shuffle with a RandomGenerator by using the wrapper which should remove some of the pressure to get this into a release quickly.
@jmehrens I don't think that `replaceAll` would work nicely with `SequencedCollection`. It will be definitely unsupported for sorted sets. For unsorted sets like `LinkedHashSet`, it's unclear how to behave if `replaceAll` returns identical elements. Throw an exception? Shrink the set size via deduplication? Also, there's no way to optimize the implementation better than clearing the set and inserting the results back, so you need to compute hash codes, distribute items to buckets, probably create tree nodes in case of collisions, etc. It would be simpler if `replaceAll` function was forced to return a permutation (like shuffle does), but it would also limit the usefulness severely.
I don't think that shuffling anything but List has a big value. Usually it's ok to copy into a fresh `ArrayList` and shuffle it instead. You won't get significant performance improvement if LinkedHashSet would be able to shuffle itself in-place. Some improvement for `ArrayDeque` would be possible but again, dumping it into a new `ArrayList` would solve the problem. Note also the proposal to provide a list view from ArrayDeque: https://bugs.openjdk.org/browse/JDK-8143850
To summarize, I have big doubts that generalizing shuffling to `SequencedCollection` interface would be useful.
-------------
PR: https://git.openjdk.org/jdk/pull/10520
More information about the core-libs-dev
mailing list