Collections.shuffle to accept RandomGenerator
Stuart Marks
stuart.marks at oracle.com
Mon Oct 3 21:32:34 UTC 2022
I'm ok with not adding a default shuffle() method to List. Shuffling does seem
fairly rare, and it potentially has a some semantic issues regarding updates to the
RNG state. Those would make it difficult to specify or to override without possibly
breaking the contract.
Having a List::shuffle default method might provide a path toward solving the issues
with shuffing a CopyOnWriteArrayList, as described recently [1]. However, that seems
like an extremely rare case, and there's a workaround (albeit fairly painful). And
there are some things that COWAL simply doesn't support (like modifications from a
ListIterator) or are unspecified (modifications to a subList).
The Arrays class does need some attention and probably should be considered
separately. It's lacking some other things too, like reverse(). One issue with
modifying the Arrays class is providing overloads for some or all of the primitives.
We've kind of held off because adding primitive overloads adds to the clutter of an
already-cluttered class. There are some functions that support only "common"
primitives int/long/double (see Arrays::setAll) which reduces the clutter; but I've
missed overloads for certain arrays like byte[].
(I'm not saying not to do this; I'm saying this is more than dropping in a single
Arrays::shuffle method.)
I'm not sure we want to change the behavior of the existing Collections::shuffle
with respect to ThreadLocalRandom. It seems like there are some weird tradeoffs
here; see JDK-8218282 [2].
s'marks
[1]: https://mail.openjdk.org/pipermail/core-libs-dev/2022-August/093777.html
[2]: https://bugs.openjdk.org/browse/JDK-8218282
On 10/1/22 6:48 PM, Jason Mehrens wrote:
> Tagir,
>
> Yes I have mixed feeling about promoting shuffle to List. I did notice that java.util.Arrays doesn't have a shuffle method. Perhaps adding RandomGenerator shuffle and slice variant of shuffle to that class would be something to consider. Array backed lists could then use that method to perform a shuffle without having to use asList.
>
> Jason
>
> ________________________________________
> From: Tagir Valeev <amaembo at gmail.com>
> Sent: Saturday, October 1, 2022 2:24 AM
> To: Jason Mehrens
> Cc: Paul Sandoz; core-libs-dev at openjdk.org
> Subject: Re: Collections.shuffle to accept RandomGenerator
>
> Thanks, Paul, Jason. I've filed an issue:
> https://bugs.openjdk.org/browse/JDK-8294693
>
> I'm not sure about the lifting shuffle() and making it an instance
> method of List interface. The shuffle() is used much less often,
> compared to sort(). Also, it produces an unspecified side-effect.
> Namely, it updates the state of the supplied RandomGenerator.
> Currently, it does this in a very stable way, calling nextInt() size-1
> times for any list implementation. So if I initialize a RNG with a
> specific seed, then shuffle several lists, they will be shuffled in
> predictable manner, regardless of a particular list implementation.
> This might be desired, e.g., to write tests or reproduce problems. If
> we move it to an instance method, implementations might vary, and
> passing identical RNG may produce different shuffle order for lists
> with identical content but having different implementations. Moreover,
> it may even leave RNG in a different state afterwards which may affect
> subsequent operations performed with the same RNG. With sorting, any
> implementation must produce the same result, so such questions don't
> rise.
>
> With best regards,
> Tagir Valeev
>
> On Thu, Sep 29, 2022 at 4:27 AM Jason Mehrens <jason_mehrens at hotmail.com> wrote:
>>
>> JDK20 has Random.from(RandomGenerator) which could be used to do something like Collections.shuffle(List, Random.from(rg)).
>>
>> List.shuffle(RandomGenerator ) would allow for more efficient CopyOnWriteArrayList shuffle.
>>
>> Jason
>>
>> ________________________________________
>> From: core-libs-dev <core-libs-dev-retn at openjdk.org> on behalf of Paul Sandoz <paul.sandoz at oracle.com>
>> Sent: Wednesday, September 28, 2022 10:36 AM
>> To: Tagir Valeev
>> Cc: core-libs-dev at openjdk.org
>> Subject: Re: Collections.shuffle to accept RandomGenerator
>>
>> That looks reasonable.
>>
>> Thinking a little more broadly.
>>
>> We could also change Collections.shuffle(List) to use ThreadLocalRandom so it does not have to cache the Random instance, plus it has better concurrent and PRNG properties. The spec does not describe the precise details of randomness.
>>
>> It’s tempting to lift Collections.shuffle(List, RandomGenerator) to List.shuffle(RandomGenerator ), similar to what we did for Collections.sort, to align with that pattern and which also aligns with reverse of sequenced collections. There is likely not much performance advantage to do doing so as there was with sort, but that location is much easier to find and use IMHO. Since the method accepts RandomGenerator the compatibility risk would likely be low.
>>
>> Paul.
>>
>>> On Sep 27, 2022, at 3:11 AM, Tagir Valeev <amaembo at gmail.com> wrote:
>>>
>>> Hello!
>>>
>>> Currently, Collections.shuffle(List, Random) accepts an outdated
>>> Random class instead of RandomGenerator interface. It looks like,
>>> RandomGenerator would suffice. The code change looks trivial (aside
>>> from documentation and testing), as shuffle() implementation does not
>>> need anything except nextInt:
>>>
>>> diff --git a/src/java.base/share/classes/java/util/Collections.java
>>> b/src/java.base/share/classes/java/util/Collections.java
>>> --- a/src/java.base/share/classes/java/util/Collections.java (revision
>>> cab590517bf705418c7376edd5d7066b13b6dde8)
>>> +++ b/src/java.base/share/classes/java/util/Collections.java (date
>>> 1664273240190)
>>> @@ -37,6 +37,7 @@
>>> import java.util.function.IntFunction;
>>> import java.util.function.Predicate;
>>> import java.util.function.UnaryOperator;
>>> +import java.util.random.RandomGenerator;
>>> import java.util.stream.IntStream;
>>> import java.util.stream.Stream;
>>> import java.util.stream.StreamSupport;
>>> @@ -454,8 +455,12 @@
>>> * @throws UnsupportedOperationException if the specified list or its
>>> * list-iterator does not support the {@code set} operation.
>>> */
>>> - @SuppressWarnings({"rawtypes", "unchecked"})
>>> public static void shuffle(List<?> list, Random rnd) {
>>> + shuffle(list, (RandomGenerator) rnd);
>>> + }
>>> +
>>> + @SuppressWarnings({"rawtypes", "unchecked"})
>>> + public static void shuffle(List<?> list, RandomGenerator rnd) {
>>> int size = list.size();
>>> if (size < SHUFFLE_THRESHOLD || list instanceof RandomAccess) {
>>> for (int i=size; i>1; i--)
>>>
>>> What do you think? Should we implement this improvement? I think I can
>>> contribute if there's a general agreement that such an enhancement
>>> would be useful.
>>>
>>> With best regards,
>>> Tagir Valeev
>>
More information about the core-libs-dev
mailing list