RFR: 8180352: Add Stream.toList() method

Remi Forax forax at univ-mlv.fr
Wed Nov 4 10:45:35 UTC 2020


----- Mail original -----
> De: "Stuart Marks" <smarks at openjdk.java.net>
> À: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Envoyé: Mercredi 4 Novembre 2020 01:14:54
> Objet: Re: RFR: 8180352: Add Stream.toList() method

> On Tue, 3 Nov 2020 18:53:23 GMT, Stuart Marks <smarks at openjdk.org> wrote:
> 
>>> Should there be a test that tests the default implementation in `j.u.s.Stream`?
>>> Or maybe there is and I missed that?
>>
>> @dfuch wrote:
>>> Should there be a test that tests the default implementation in `j.u.s.Stream`?
>>> Or maybe there is and I missed that?
>> 
>> The test method `testDefaultOps` does that. The stream test framework has a
>> thing called `DefaultMethodStreams` that delegates everything except default
>> methods to another Stream instance, so this should test the default
>> implementation.
> 
> OK, there are rather too many forking threads here for me to try to reply to any
> particular message, so I'll try to summarize things and say what I intend to
> do.
> 
> Null tolerance. While part of me wants to prohibit nulls, the fact is that
> Streams pass through nulls, and toList() would be much less useful if it were
> to reject nulls. The affinity here is closer to Stream.toArray(), which allows
> nulls, rather than Collectors.to[Unmodifiable]List.
> 
> Unmodifiability. Unlike with nulls, this is where we've been taking a strong
> stand recently, where new constructs are unmodifiable ("shallowly immutable").
> Consider the Java 9 unmodifiable collections, records, primitive classes, etc.
> -- they're all unmodifiable. They're data-race-free and are resistant to a
> whole class of bugs that arise from mutability.

again, the strong stance about null in collections was taken earlier that the one about unmodifiable collection,
all collections introduced since java 1.6 are null hostile.

I think we are bound to rediscover why :(

<teacher>
And BTW, an unmodifiable list is not data-race free, it depends if the content is immutable or not.
</teacher>

> 
> Unfortunately, the combination of the above means that the returned List is
> neither like an ArrayList nor like an unmodifiable list produced by List.of()
> or by Collectors.toUnmodifiableList(). [Heavy sigh.] While I've been somewhat
> reluctant to introduce this new thing, the alternatives of trying to reuse one
> of the existing things are worse.

This is the smell test here.
You want to introduce a 4th semantics, we already have ArrayList, Arrays.asList() and List.of().
Add a new semantics has not a linear cost, the cost is bigger than linear because of the all the interactions with iterators, streams, etc.
(BTW this remember me that List.of().stream() is not optimize well compared to ArrayList.stream())

This is a real danger, Java is already complex, if it becomes more and more, it will be irrelevant, because it becomes harder and harder to internalize all the semantics inside of your head.
There is already a C++, let us not become another one.

So i really like the fact that you have decided to go bold here and choose to return an unmodifiable list but we should go to full monty and be null hostile too, like toUnmodifiableList().

> 
> John and Rémi pointed out that the way I implemented this, using a subclass,
> reintroduces the possibility of problems with megamorphic dispatch which we had
> so carefully avoided by reducing the number of implementation classes in this
> area. I agree this is a problem.
> 
> I also had an off-line discussion with John where we discussed the serialization
> format, which unfortunately is somewhat coupled with this issue. (Fortunately I
> think it can mostly be decoupled.) Given that we're introducing a new thing,
> which is an unmodifiable-list-that-allows-nulls, this needs to be manifested in
> the serialized form of these new objects. (This corresponds to the new tag
> value of IMM_LIST_NULLS in the CollSer class.) The alternative is to reuse the
> existing serialized form, IMM_LIST, for both of these cases, relaxing it to
> allow nulls. This would be a serialization immutability problem. Suppose I had
> an application that created a data structure that used lists from List.of(),
> and I had a global assertion over that structure that it contained no nulls.
> Further suppose that I serialized and deserizalized this structure. I'd want
> that assertion to be preserved after deserialization. If another app (or a
> future version of this app) created the structure using Stream.to
> List(), this would allow nulls to leak into that structure and violate that
> assertion. Therefore, the result of Stream.toList() should not be
> serialization-compatible with List.of() et. al. That's why there's the new
> IMM_LIST_NULLS tag in the serial format.
> 
> However, the new representation in the serial format does NOT necessarily
> require the implementation to be a different class. I'm going to investigate
> collapsing ListN and ListNNullsAllowed back into a single class, while
> preserving the separate serialized forms. This should mitigate the concerns
> about megamorphic dispatch.

There are 3 reasons to not do that,
- merging ListN and ListNNullsAllowed means we will have trouble when we will want to generify that List as part of Valhalla,
  a list of primitive object (ex inline type) doesn't allow null.

- the current implementation uses @Stable and the transient state for @Stable is null, so you are creating a performance pot hole,
    private static final List<Integer> CONST = Stream.of(2, null).toList();
    ...
    CONST.get(0);  // it's a constant
    CONST.get(1);  // it's not a constant

- you have to sprinkle some if(element == null) at some places, the VM will likely to optimize them away (or at least to consider the follow up code as a dead branch) until the List really store null
  at that point, a deopt storm will likely follow creating a weird latency peak.

Rémi


More information about the core-libs-dev mailing list