RFR: 8180352: Add Stream.toList() method [v2]

Peter Levart plevart at openjdk.java.net
Tue Nov 10 09:37:58 UTC 2020


On Sun, 8 Nov 2020 15:55:58 GMT, Peter Levart <plevart at openjdk.org> wrote:

>> Hi Stuart,
>> 
>> I would like to discuss the serialization. You introduce new CollSer.IMM_LIST_NULLS type of immutable collection. This means that if this change goes into JDK16 for example, JDK15 and before will not be able to deserialize such list as they know nothing about IMM_LIST_NULLS even if such lists don't contain nulls. The reason you say to chose new type of serialization format is the following:
>> 
>>> "Suppose I had an application that created a data structure that used lists from List.of(), and I had a global assertion over that structure that it contained no nulls. Further suppose that I serialized and deserizalized this structure. I'd want that assertion to be preserved after deserialization. If another app (or a future version of this app) created the structure using Stream.to
>>>  List(), this would allow nulls to leak into that structure and violate that assertion. Therefore, the result of Stream.toList() should not be serialization-compatible with List.of() et. al. That's why there's the new IMM_LIST_NULLS tag in the serial format"
>> 
>> I don't quite get this reasoning. Let's try to decompose the reasoning giving an example. Suppose we had the following data structure:
>> 
>> public class Names implements Serializable {
>>   private final List<String> names;
>>   Names(List<String> names) {
>>     this.names = names;
>>   }
>>   public List<String> names() { return names; }
>> }
>> 
>> App v1 creates such structures using new Names(List.of(...)) and serializes/deserializes them. They keep the invariant that no nulls are present. Now comes App v2 that starts using new Names(stream.toList()) which allows nulls to be present. When such Names instance from app v2 is serialized and then deserialized in app v1, nulls "leak" into data structure of app v1 that does not expect them.
>> 
>> But the question is how does having a separate CollSer.IMM_LIST_NULLS type prevent that from happening?
>
> I can see that having a separate IMM_LIST_NULLS type might be necessary to preserve the allows-null/disallows-null behaviour of indexOf and lastIndexOf methods...
> 
> NOTE ALSO that ListN.equals(o) and ListN.hashCode() are inherited from AbstractImmutableList:
> 
>         @Override
>         public boolean equals(Object o) {
>             if (o == this) {
>                 return true;
>             }
> 
>             if (!(o instanceof List)) {
>                 return false;
>             }
> 
>             Iterator<?> oit = ((List<?>) o).iterator();
>             for (int i = 0, s = size(); i < s; i++) {
>                 if (!oit.hasNext() || !get(i).equals(oit.next())) {
>                     return false;
>                 }
>             }
>             return !oit.hasNext();
>         }
> and
>         public int hashCode() {
>             int hash = 1;
>             for (int i = 0, s = size(); i < s; i++) {
>                 hash = 31 * hash + get(i).hashCode();
>             }
>             return hash;
>         }
> 
> ...which means they will throw NPE when the list contains null. The same goes for SubList.

> @plevart wrote:
> 
> > But the question is how does having a separate CollSer.IMM_LIST_NULLS type prevent that from happening?
> 
> When a serialized list with IMM_LIST_NULLS is deserialized on an older JDK, it'll throw InvalidObjectException since that tag isn't valid on older JDKs. Obviously this is still an error, but it's a fail-fast approach that avoids letting nulls leak into a data structure where they might cause a problem some arbitrary time later.
> 

Yes, but that is JDK16+ vs. JDK15- and not App V1 vs. App V2 thing. If both apps run on JDK16+, there will be no exception.

What I'm trying to say is that the same problem of injecting unexpected nulls via serialization/deserialization can happen also if App V2 starts using ArrayList to construct the data structure and serialize it while App V1 deserializes it and expects non-null values only. App V1 would already have to guard against null values during deserialization in that case, because possibility of null values in deserialized data structure is nothing new for App V1.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1026


More information about the core-libs-dev mailing list