Gatherer: spliterator characteristics are not propagated

forax at univ-mlv.fr forax at univ-mlv.fr
Mon Jan 29 09:42:37 UTC 2024


> From: "Viktor Klang" <viktor.klang at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>, "Paul Sandoz"
> <psandoz at openjdk.java.net>
> Sent: Monday, January 29, 2024 8:41:21 AM
> Subject: Re: Gatherer: spliterator characteristics are not propagated

> For monomorphic benches it is likely not going to make any difference as the JIT
> will likely do the same thing, also, the bench might be dominated by other
> factors.

For me, it's more that lambdas may be able to use late-inlining (inlining after the return value of an invokedynamic is known as constant) so results may be better than it should. 
Here is an example, 

@Benchmark 
public boolean map_allMatch_repeat_3 () { 
return integers .stream() 
.map( v -> v + 1 ) 
.map( v -> v + 2 ) 
.map( v -> v + 3 ) 
.allMatch( v -> v < 1_000_000 ); 
} 
@Benchmark 
public boolean mapToInt_allMatch_repeat_3 () { 
return integers .stream() 
.mapToInt( v -> v + 1 ) 
.map( v -> v + 2 ) 
.map( v -> v + 3 ) 
.allMatch( v -> v < 1_000_000 ); 
} 
@Benchmark 
public boolean gatherer_map_allMatch_repeat_3 () { 
// Gatherer with lambdas 
return integers .stream() 
.gather( map ( v -> v + 1 )) 
.gather( map ( v -> v + 2 )) 
.gather( map ( v -> v + 3 )) 
.allMatch( v -> v < 1_000_000 ); 
} 
@Benchmark 
public boolean gatherer_mapsublcass_allMatch_repeat_3 () { 
// Gatherer as subclass 
return integers .stream() 
.gather( mapSubclass ( v -> v + 1 )) 
.gather( mapSubclass ( v -> v + 2 )) 
.gather( mapSubclass ( v -> v + 3 )) 
.allMatch( v -> v < 1_000_000 ); 
} 

With 10 second warmups, 
Benchmark                                                    Mode  Cnt     Score     Error  Units 
MapGathererBenchmark.gatherer_map_allMatch_repeat_3          avgt    5  3359.336 ± 137.314  us/op 
MapGathererBenchmark.gatherer_mapsublcass_allMatch_repeat_3  avgt    5  3755.388 ±  50.858  us/op 
MapGathererBenchmark.mapToInt_allMatch_repeat_3              avgt    5   559.891 ±   1.542  us/op 
MapGathererBenchmark.map_allMatch_repeat_3                   avgt    5  1642.167 ±   5.804  us/op 

Also, I wonder if fusing is not an anti-pattern for gatherers/streams, c2 take a loooong time to come with as code wich is not very fast. 

> Cheers,
>
Rémi 

> Viktor Klang
> Software Architect, Java Platform Group
> Oracle

> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
> Sent: Thursday, 25 January 2024 14:57
> To: Viktor Klang <viktor.klang at oracle.com>
> Cc: core-libs-dev <core-libs-dev at openjdk.java.net>; Paul Sandoz
> <psandoz at openjdk.java.net>
> Subject: [External] : Re: Gatherer: spliterator characteristics are not
> propagated

>> From: "Viktor Klang" <viktor.klang at oracle.com>
>> To: "Remi Forax" <forax at univ-mlv.fr>
>> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>, "Paul Sandoz"
>> <psandoz at openjdk.java.net>
>> Sent: Wednesday, January 24, 2024 2:45:15 PM
>> Subject: Re: Gatherer: spliterator characteristics are not propagated

>> As a (related) side-note, the ability to implement the interface directly has a
>> significant benefit to being able to have tighter control on
>> efficiency/performance as well as behavior under composition, here are some
>> examples: [
>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/stream/ops/ref/BenchmarkGathererImpls.java*L113__;Iw!!ACWV5N9M2RV99hQ!MFknMPf08p9oLTHtcHIIljRuma7ct6sarJRqTWKzmBiiGJBjtDEZsgGrkDbwxLl1NhEwVwz-RhHv3h3t54rw$
>> |
>> https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/stream/ops/ref/BenchmarkGathererImpls.java#L113
>> ]

> I've not seen any differences in term of performance:
> MapGathererBenchmark.gatherer_map_sum                avgt    5  546.058 ± 4.224
>   us/op
> MapGathererBenchmark.gatherer_mapsublcass_sum  avgt    5  546.391 ± 1.508  us/op
> but yes, being able to override andThen() is important.

>> Cheers,
>>
> Rémi

>> Viktor Klang
>> Software Architect, Java Platform Group
>> Oracle

>> From: core-libs-dev <core-libs-dev-retn at openjdk.org> on behalf of Viktor Klang
>> <viktor.klang at oracle.com>
>> Sent: Wednesday, 24 January 2024 14:34
>> To: forax at univ-mlv.fr <forax at univ-mlv.fr>
>> Cc: core-libs-dev <core-libs-dev at openjdk.java.net>; Paul Sandoz
>> <psandoz at openjdk.java.net>
>> Subject: Re: Gatherer: spliterator characteristics are not propagated
>> Presuming that you mean mutating the Gatherer such that its behavior isn't
>> stable, the difference (at least to me) is that creating such a mutable
>> Gatherer would violate the specification of Gatherer: [
>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherer.java*L63__;Iw!!ACWV5N9M2RV99hQ!MFknMPf08p9oLTHtcHIIljRuma7ct6sarJRqTWKzmBiiGJBjtDEZsgGrkDbwxLl1NhEwVwz-RhHv3uXOerfv$
>> |
>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherer.java#L63
>> ]

>> Cheers,
>>
>> Viktor Klang
>> Software Architect, Java Platform Group
>> Oracle

>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>> Sent: Tuesday, 23 January 2024 21:04
>> To: Viktor Klang <viktor.klang at oracle.com>
>> Cc: core-libs-dev <core-libs-dev at openjdk.java.net>; Paul Sandoz
>> <psandoz at openjdk.java.net>
>> Subject: [External] : Re: Gatherer: spliterator characteristics are not
>> propagated

>>> From: "Viktor Klang" <viktor.klang at oracle.com>
>>> To: "Remi Forax" <forax at univ-mlv.fr>
>>> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>, "Paul Sandoz"
>>> <psandoz at openjdk.java.net>
>>> Sent: Monday, January 22, 2024 10:06:27 PM
>>> Subject: Re: Gatherer: spliterator characteristics are not propagated

>>>> The flags are in sync with the implementation because the only way to create a
>>>> Gatherer if through the factory methods and those factory methods (and only
>>>> them) compute the proper combination of SEQUENTIAL | STATELESS | GREEDY. A user
>>>> should not be able to set those flags. Only the flags KEEP_* are settable by a
>>>> user.

>>> But I presume this also requires to have a `int characteristics()`-method on the
>>> Gatherer interfacewhich means that users who are not using the factory methods
>>> will have full possibility of not only returning the flags, but returning any
>>> int.

>> The current implementation suffers the same kind of issue, it's easy to write a
>> mutable Gatherer and change the functions after creation, worst, right in the
>> middle of a call to stream.gather(...).

>> Perhaps the Gatherer interface should be sealed ? We did not have that option
>> during the 1.8 timeframe, when the Collector API was created.

>>>> The stream implementation has a whole mechanism in place to propagate/preverse
>>>> flags like SIZED, DISTINCT or SORTED. For me, discussing about the merit of
>>>> this mechanism seems a little off topic. I would prefer the Gatherer to be a
>>>> good citizen and works seemlessly with the other intermediary operations.

>>> I can see where you're coming from here, but to me, adding API surface needs to
>>> pull its weight.
>>> In this case I wasn't convinced that it did, hence we're having this
>>> conversation. \uD83D\uDE42

>>> Cheers,
>>>
>> regards,
>> Rémi

>>> Viktor Klang
>>> Software Architect, Java Platform Group
>>> Oracle

>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>> Sent: Monday, 22 January 2024 19:56
>>> To: Viktor Klang <viktor.klang at oracle.com>
>>> Cc: core-libs-dev <core-libs-dev at openjdk.java.net>; Paul Sandoz
>>> <psandoz at openjdk.java.net>
>>> Subject: [External] : Re: Gatherer: spliterator characteristics are not
>>> propagated

>>>> From: "Viktor Klang" <viktor.klang at oracle.com>
>>>> To: "Remi Forax" <forax at univ-mlv.fr>
>>>> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>, "Paul Sandoz"
>>>> <psandoz at openjdk.java.net>
>>>> Sent: Monday, January 22, 2024 4:22:11 PM
>>>> Subject: Re: Gatherer: spliterator characteristics are not propagated

>>>> Hi Rémi,

>>> Hello,

>>>> For instance, stateless is neither recessive nor dominant, since the composition
>>>> of two stateless operations is only ever stateless if they both are greedy as
>>>> well: [
>>>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherers.java*L588__;Iw!!ACWV5N9M2RV99hQ!Lm52jd6kovd5t-cmrqqSLiRcIajBGXLxh85LO3eeiL6UxbKZuNPcUnO6z2i0FzMEoNr7U-cOBuWPCjo57FVW$
>>>> |
>>>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherers.java#L588
>>>> ]

>>> Okay, so choosing SEQUENTIAL vs PARALELLIZABLE is not that important given that
>>> the combination is ad-hoc, reflecting the characterristics is enough.

>>>> So even if making it represented as ints (more like Spliterator, rather than
>>>> Collector) makes things faster, it's still both work to track, propagate, and
>>>> also becomes a side-channel that needs to remain in sync with the actual
>>>> implementation of the logic.

>>> The flags are in sync with the implementation because the only way to create a
>>> Gatherer if through the factory methods and those factory methods (and only
>>> them) compute the proper combination of SEQUENTIAL | STATELESS | GREEDY. A user
>>> should not be able to set those flags. Only the flags KEEP_* are settable by a
>>> user.

>>>> One could argue that logic such as: someCollection.stream().map(…).count() is a
>>>> performance bug/inefficiency in an of itself as it would be faster to do
>>>> someCollection.size().

>>> The stream implementation has a whole mechanism in place to propagate/preverse
>>> flags like SIZED, DISTINCT or SORTED. For me, discussing about the merit of
>>> this mechanism seems a little off topic. I would prefer the Gatherer to be a
>>> good citizen and works seemlessly with the other intermediary operations.

>>> Cheers,
>>>
>>> Viktor Klang
>>> Software Architect, Java Platform Group
>>> Oracle

>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>> Sent: Monday, 22 January 2024 19:56
>>> To: Viktor Klang <viktor.klang at oracle.com>
>>> Cc: core-libs-dev <core-libs-dev at openjdk.java.net>; Paul Sandoz
>>> <psandoz at openjdk.java.net>
>>> Subject: [External] : Re: Gatherer: spliterator characteristics are not
>>> propagated

>>>> From: "Viktor Klang" <viktor.klang at oracle.com>
>>>> To: "Remi Forax" <forax at univ-mlv.fr>
>>>> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>, "Paul Sandoz"
>>>> <psandoz at openjdk.java.net>
>>>> Sent: Monday, January 22, 2024 4:22:11 PM
>>>> Subject: Re: Gatherer: spliterator characteristics are not propagated

>>>> Hi Rémi,

>>> Hello,

>>>> For instance, stateless is neither recessive nor dominant, since the composition
>>>> of two stateless operations is only ever stateless if they both are greedy as
>>>> well: [
>>>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherers.java*L588__;Iw!!ACWV5N9M2RV99hQ!Lm52jd6kovd5t-cmrqqSLiRcIajBGXLxh85LO3eeiL6UxbKZuNPcUnO6z2i0FzMEoNr7U-cOBuWPCjo57FVW$
>>>> |
>>>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherers.java#L588
>>>> ]

>>> Okay, so choosing SEQUENTIAL vs PARALELLIZABLE is not that important given that
>>> the combination is ad-hoc, reflecting the characterristics is enough.

>>>> So even if making it represented as ints (more like Spliterator, rather than
>>>> Collector) makes things faster, it's still both work to track, propagate, and
>>>> also becomes a side-channel that needs to remain in sync with the actual
>>>> implementation of the logic.

>>> The flags are in sync with the implementation because the only way to create a
>>> Gatherer if through the factory methods and those factory methods (and only
>>> them) compute the proper combination of SEQUENTIAL | STATELESS | GREEDY. A user
>>> should not be able to set those flags. Only the flags KEEP_* are settable by a
>>> user.

>>>> One could argue that logic such as: someCollection.stream().map(…).count() is a
>>>> performance bug/inefficiency in an of itself as it would be faster to do
>>>> someCollection.size().

>>> The stream implementation has a whole mechanism in place to propagate/preverse
>>> flags like SIZED, DISTINCT or SORTED. For me, discussing about the merit of
>>> this mechanism seems a little off topic. I would prefer the Gatherer to be a
>>> good citizen and works seemlessly with the other intermediary operations.

>>>> Cheers,
>>>>
>>> regards,
>>> Rémi

>>>> Viktor Klang
>>>> Software Architect, Java Platform Group
>>>> Oracle

>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>> Sent: Saturday, 20 January 2024 17:40
>>>> To: Viktor Klang <viktor.klang at oracle.com>
>>>> Cc: core-libs-dev <core-libs-dev at openjdk.java.net>; Paul Sandoz
>>>> <psandoz at openjdk.java.net>
>>>> Subject: [External] : Re: Gatherer: spliterator characteristics are not
>>>> propagated

>>>>> From: "Viktor Klang" <viktor.klang at oracle.com>
>>>>> To: "Remi Forax" <forax at univ-mlv.fr>
>>>>> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>, "Paul Sandoz"
>>>>> <psandoz at openjdk.java.net>
>>>>> Sent: Thursday, January 18, 2024 5:14:38 PM
>>>>> Subject: Re: Gatherer: spliterator characteristics are not propagated

>>>>>> And for A.andThen(B), A.flags & B.flags should work, the stream is sorted if
>>>>>> both gatherers keep it sorted.

>>>>> That is unfortunately not the case. That would presume that you can implement
>>>>> the composition such that it can preserve all the common flags. Some flags
>>>>> could be "dominant" and some "recessive" to use genetics nomenclature.

>>>> Some flags of the stream pipeline are "recessive", mainly SHORT_CIRCUIT, but not
>>>> the characteristics of the Gatherer which can have the corresponding "dominant"
>>>> flag, GREEDY, in this case.
>>>> And the same for sequential, the flag should be PARALELIZABLE and not
>>>> SEQUENTIAL.

>>>> The idea is that the Gatherer characteristics can have the same bit set at the
>>>> same position as the stream op flags (as defined by StreamOpFlag).
>>>> So KEEP_DISTINCT is in position 0, KEEP_SORTED in in position 2 and KEEP_SIZED
>>>> is in position 3.
>>>> For GREEDY, we use the same position as SHORT_CIRCUIT and we will flip the bit
>>>> (using XOR) when we want to extract the stream op flags from the
>>>> characteristics

>>>> All the factory methods call the generic of() with a combination of
>>>> PARALELIZABLE and STATELESS and the user can adds the characteristics GREEDY,
>>>> KEEP_DISTINCT, KEEP_SORTED and KEEP_SIZED (otherwise an exception should be
>>>> raised).

>>>> In StreamOpFlag, there are two unused positions (14 and 15), that's perfect for
>>>> our two new states PARALELIZABLE and STATELESS, so no problem here (technically
>>>> we can also reuse positions of the Spliterator characteristic given that those
>>>> flags are masked before being sent to the GathererOp super constructor).

>>>> The way to transform a Gatherer characteristics op to a stream flags op is to
>>>> flip the bits corresponding to SHORT_CIRCUIT, add the highter bit of all flags
>>>> but SHORT-CIRCUIT (because stream op flags are encoded using 2 bits) and mask
>>>> to only retain SHORT_CIRCUIT, KEEP_DISTINCT, KEEP_SORTED and KEEP_SIZED.

>>>> public static int toOpFlags ( int characteristics ) {
>>>> return (( characteristics ^ SHORT_CIRCUIT_MASK ) | HIGHER_BITS ) &
>>>> STREAM_OP_MASK ;
>>>> }

>>>> see below for a full script.

>>>>>> I suppose that if those flags already exist, it's because they have a purpose
>>>>>> and i do not understand how it can make the other operations slower.

>>>>> Extra invocations, extra storage, and extra composition overhead is not free.
>>>>> Since Stream is one-shot you need to include the construction cost with the
>>>>> execution cost. For something like an empty Stream construction cost scan be
>>>>> 90+% of the total costs.

>>>> If you create a Gatherer, the characteristics is a constant (so the validity
>>>> check is removed, it's just a mask and a test) so the result of calling
>>>> toOpFlags() is a constant too.

>>>> If the factory method is not inlined, the cost is 3 bitwise operations which is
>>>> I believe faster than the "instanceof Greedy" used in the current code.

>>>>> Cheers,
>>>>>
>>>> regards,
>>>> Rémi

>>>> ---
>>>> public class GathererFlag {
>>>> // cut and paste from StreamOpFlag
>>>> /**
>>>> * The bit pattern for setting/injecting a flag.
>>>> */
>>>> private static final int SET_BITS = 0b01 ;

>>>> /**
>>>> * The bit pattern for clearing a flag.
>>>> */
>>>> private static final int CLEAR_BITS = 0b10 ;

>>>> /**
>>>> * The bit pattern for preserving a flag.
>>>> */
>>>> private static final int PRESERVE_BITS = 0b11 ;

>>>> private static int position ( int opFlagSet ) {
>>>> return Integer . numberOfTrailingZeros ( opFlagSet ) >> 1 ;
>>>> }

>>>> private static final int DISTINCT_POSITION = position ( StreamOpFlag .
>>>> IS_DISTINCT );
>>>> private static final int SORTED_POSITION = position ( StreamOpFlag . IS_SORTED
>>>> );
>>>> private static final int SIZED_POSITION = position ( StreamOpFlag . IS_SIZED );

>>>> private static final int SHORT_CIRCUIT_POSITION = position ( StreamOpFlag .
>>>> IS_SHORT_CIRCUIT );
>>>> private static final int STATELESS_POSITION = 14 ;
>>>> private static final int PARELLIZABLE_POSITION = 15 ;

>>>> public static final int PARELLIZABLE = SET_BITS << ( PARELLIZABLE_POSITION << 1
>>>> );
>>>> public static final int STATELESS = SET_BITS << ( STATELESS_POSITION << 1 );
>>>> public static final int GREEDY = SET_BITS << ( SHORT_CIRCUIT_POSITION << 1 );

>>>> public static final int KEEP_DISTINCT = SET_BITS << ( DISTINCT_POSITION << 1 );
>>>> public static final int KEEP_SORTED = SET_BITS << ( SORTED_POSITION << 1 );
>>>> public static final int KEEP_SIZED = SET_BITS << ( SIZED_POSITION << 1 );

>>>> private static final int SHORT_CIRCUIT_MASK = SET_BITS << (
>>>> SHORT_CIRCUIT_POSITION << 1 );

>>>> // no GREEDY here
>>>> private static final int HIGHER_BITS = ( PARELLIZABLE | STATELESS |
>>>> KEEP_DISTINCT | KEEP_SORTED | KEEP_SIZED ) << 1 ;

>>>> private static final int STREAM_OP_MASK =
>>>> (( GREEDY | KEEP_DISTINCT | KEEP_SORTED | KEEP_SIZED ) << 1 ) |
>>>> GREEDY | KEEP_DISTINCT | KEEP_SORTED | KEEP_SIZED ;

>>>> public static String toString ( int characteristics ) {
>>>> return Stream . of ( characteristics )
>>>> .< String >mapMulti(( f , consumer ) -> {
>>>> if (( f & PARELLIZABLE ) == PARELLIZABLE ) {
>>>> consumer .accept( "PARELLIZABLE" );
>>>> }
>>>> if (( f & STATELESS ) == STATELESS ) {
>>>> consumer .accept( "STATELESS" );
>>>> }
>>>> if (( f & GREEDY ) == GREEDY ) {
>>>> consumer .accept( "GREEDY" );
>>>> }
>>>> if (( f & KEEP_DISTINCT ) == KEEP_DISTINCT ) {
>>>> consumer .accept( "KEEP_DISTINCT" );
>>>> }
>>>> if (( f & KEEP_SORTED ) == KEEP_SORTED ) {
>>>> consumer .accept( "KEEP_SORTED" );
>>>> }
>>>> if (( f & KEEP_SIZED ) == KEEP_SIZED ) {
>>>> consumer .accept( "KEEP_SIZED" );
>>>> }
>>>> })
>>>> .collect( Collectors . joining ( ", " ));
>>>> }

>>>> public static int toOpFlags ( int characteristics ) {
>>>> return (( characteristics ^ SHORT_CIRCUIT_MASK ) | HIGHER_BITS ) &
>>>> STREAM_OP_MASK ;
>>>> }

>>>> public static String toOpFlagsString ( int opFlags ) {
>>>> return Arrays . stream ( StreamOpFlag . values ())
>>>> .map( op -> {
>>>> if ( op .isPreserved( opFlags )) {
>>>> return "preserved " + op ;
>>>> }
>>>> if ( op .isCleared( opFlags )) {
>>>> return "cleared " + op ;
>>>> }
>>>> if ( op .isKnown( opFlags )) {
>>>> return "set " + op ;
>>>> }
>>>> return "?? " + op ;
>>>> })
>>>> .collect( Collectors . joining ( ", " ));
>>>> }

>>>> void main () {
>>>> var characteristics = PARELLIZABLE | STATELESS | GREEDY | KEEP_DISTINCT |
>>>> KEEP_SORTED | KEEP_SIZED ;
>>>> System . out .println( toOpFlagsString ( toOpFlags ( characteristics )));

>>>> var characteristics2 = STATELESS | KEEP_DISTINCT | KEEP_SIZED ;
>>>> System . out .println( toOpFlagsString ( toOpFlags ( characteristics2 )));
>>>> }
>>>> }

>>>>> Viktor Klang
>>>>> Software Architect, Java Platform Group
>>>>> Oracle

>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>> Sent: Thursday, 18 January 2024 16:17
>>>>> To: Viktor Klang <viktor.klang at oracle.com>
>>>>> Cc: core-libs-dev <core-libs-dev at openjdk.java.net>; Paul Sandoz
>>>>> <psandoz at openjdk.java.net>
>>>>> Subject: [External] : Re: Gatherer: spliterator characteristics are not
>>>>> propagated

>>>>>> From: "Viktor Klang" <viktor.klang at oracle.com>
>>>>>> To: "Remi Forax" <forax at univ-mlv.fr>
>>>>>> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>, "Paul Sandoz"
>>>>>> <psandoz at openjdk.java.net>
>>>>>> Sent: Thursday, January 18, 2024 3:36:07 PM
>>>>>> Subject: Re: Gatherer: spliterator characteristics are not propagated

>>>>>> I suspect that it is a rather slippery slope, once KEEP-flags are added, then
>>>>>> others will want to be able to have INJECT-flags, and then people might have
>>>>>> different opinions w.r.t. the default should be to clear all flags etc.

>>>>>> And that's even before one looks at the composition-part of it, what are the
>>>>>> flags for A.andThen(B)? (then extrapolate to N compositions and the available
>>>>>> set of flags always approaches 0)

>>>>>> I spent quite a bit of time on this and in the end tracking all this info, and
>>>>>> making sure that the flags of implementations correspond to the actual
>>>>>> behavior, just ended up costing performance for most streams and introduced an
>>>>>> extra dimension to creation and maintenance which I had a hard time justifying.

>>>>> It can be a slippery slope if we were designing from the ground up but the
>>>>> stream implementation already exists and SORTED, DISTINCT and SIZED are the
>>>>> flags that are already tracked by the current implementation.

>>>>> Currently only SHORT_CIRCUIT is set (if not greedy),
>>>>> see [
>>>>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/GathererOp.java*L209__;Iw!!ACWV5N9M2RV99hQ!PhMxqlDzLWPRuwYc7ECRKNPVs0BtnoE-RdT-Jdkng7S-iFuERAHYcWvJ-OMKGLrkPdSrUl3xj1R9ypyeqeWI$
>>>>> |
>>>>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/GathererOp.java#L209
>>>>> ]

>>>>> And for A.andThen(B), A.flags & B.flags should work, the stream is sorted if
>>>>> both gatherers keep it sorted.

>>>>>> Making specific, rare, combinations of operations faster at the expense of
>>>>>> making 99% of all others slower is a hard pill for most to swallow.

>>>>> I suppose that if those flags already exist, it's because they have a purpose
>>>>> and i do not understand how it can make the other operations slower.

>>>>>> Cheers,
>>>>>>
>>>>> regards,
>>>>> Rémi

>>>>>> Viktor Klang
>>>>>> Software Architect, Java Platform Group
>>>>>> Oracle

>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>> Sent: Thursday, 18 January 2024 10:28
>>>>>> To: Viktor Klang <viktor.klang at oracle.com>
>>>>>> Cc: core-libs-dev <core-libs-dev at openjdk.java.net>; Paul Sandoz
>>>>>> <psandoz at openjdk.java.net>
>>>>>> Subject: [External] : Re: Gatherer: spliterator characteristics are not
>>>>>> propagated

>>>>>>> From: "Viktor Klang" <viktor.klang at oracle.com>
>>>>>>> To: "Remi Forax" <forax at univ-mlv.fr>, "core-libs-dev"
>>>>>>> <core-libs-dev at openjdk.java.net>
>>>>>>> Sent: Wednesday, January 17, 2024 8:48:07 PM
>>>>>>> Subject: Re: Gatherer: spliterator characteristics are not propagated

>>>>>>> Hi Rémi,

>>>>>>> You can find some of my benches here: [
>>>>>>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/tree/master/test/micro/org/openjdk/bench/java/util/stream/ops/ref__;!!ACWV5N9M2RV99hQ!JJy6F9NoL6wKZQK5158up_fTRvH8X7F6JK8T7Euuf8vzbSQbr23eWa9S_yb61ksONVrLrdesCF_au5zyje2l$
>>>>>>> |
>>>>>>> https://github.com/openjdk/jdk/tree/master/test/micro/org/openjdk/bench/java/util/stream/ops/ref
>>>>>>> ]

>>>>>>> Initially I had Characteristics such as ORDERED etc on Gatherer but it just
>>>>>>> didn't end up worth it when looking at the bench results over a wide array of
>>>>>>> stream sizes and number of operations.

>>>>>> I think there are 3 gatherer characteristics that make sense: KEEP_SORTED,
>>>>>> KEEP_DISTINCT and KEEP_SIZED,
>>>>>> all of them say that if the stream was sorted/distinct/sized then the stream
>>>>>> returned by a call to gather() is still sorted (with the same comparator),
>>>>>> distinct or sized.

>>>>>> As examples, map() is KEEP_SIZED, filter() is KEEP_SORTED | KEEP_DISTINCT and
>>>>>> windowFixed is KEEP_DISTINCT.

>>>>>> [CC Paul, so he can correct me if i'm saying something stupid]

>>>>>> Now for the benchmarks, it depends what you want to measure, benchmarking
>>>>>> streams is tricky. This is what i know about benchmarking streams.
>>>>>> First, the JIT has two ways to profile types at runtime,
>>>>>> Either a method takes a function as parameter
>>>>>> void map(Function function) {
>>>>>> function.apply(...)
>>>>>> }
>>>>>> and when map is called with a subtype of Function, the JIT will propagate the
>>>>>> exact type when map is inlined,
>>>>>> Or a method use a field
>>>>>> class Op {
>>>>>> Function function;

>>>>>> void map() {
>>>>>> function.apply(...)
>>>>>> }
>>>>>> }
>>>>>> in that case, the VM records the profile of function.apply() and if there are
>>>>>> more than two different profiles, the VM declare profile poluttion and do not
>>>>>> try to optimize.

>>>>>> The Stream implementation tries very hard to use only parameters instead of
>>>>>> fields, that's why it does not use classical Iterator that are pull iterator (a
>>>>>> filter iterator requires a field) but a Spliterator which is a push iterator,
>>>>>> the element is sent as parameter of the consumer.That's also why collect does
>>>>>> not use the builder pattern (that accumulate values in fields) but a Collector
>>>>>> that publish the functions to be called as parameter.

>>>>>> Obvisously, this is more complex than that, a Collector stores the functions in
>>>>>> fields so it should not work well but the implementation uses a record that
>>>>>> plays well with escape analysis. Escape analysis see the fields of an instance
>>>>>> as parameters so the functions of a Collector are correctly propagated (if the
>>>>>> escape analysis works). And lambdas are using invokedynamic, and the VM tries
>>>>>> really hard to inline invokedynamic, so lambdas (that captures value or not)
>>>>>> are routinely fully inlined with the intermediate operation of a stream.

>>>>>> In your tests, i've not seen comparaisons between an existing method like map()
>>>>>> or filter() followed by a sorted()/distinct()/count()/toList(), i.e. operations
>>>>>> where the characteristics KEEP_* have an impact and their equivalent using a
>>>>>> Gatherer.

>>>>>>> Cheers,
>>>>>>>
>>>>>> regards,
>>>>>> Rémi

>>>>>>> Viktor Klang
>>>>>>> Software Architect, Java Platform Group
>>>>>>> Oracle

>>>>>>> From: core-libs-dev <core-libs-dev-retn at openjdk.org> on behalf of Remi Forax
>>>>>>> <forax at univ-mlv.fr>
>>>>>>> Sent: Wednesday, 17 January 2024 16:48
>>>>>>> To: core-libs-dev <core-libs-dev at openjdk.java.net>
>>>>>>> Subject: Gatherer: spliterator characteristics are not propagated
>>>>>>> While doing some benchmarking of the Gatherer API, i've found that the
>>>>>>> characteristics of the spliterator was not propagated by the method
>>>>>>> Stream.gather() making the stream slower than it should.

>>>>>>> As an example, there is no way when reimplementing map() using a Gatherer to say
>>>>>>> that this intermediate operation keep the size, which is important if the
>>>>>>> terminal operation is toList() because if the size is known, toList() will
>>>>>>> presize the List and avoid the creation of an intermediary ArrayList.

>>>>>>> See [
>>>>>>> https://urldefense.com/v3/__https://github.com/forax/we_are_all_to_gather/blob/master/src/main/java/com/gihtub/forax/wearealltogather/bench/MapGathererBenchmark.java__;!!ACWV5N9M2RV99hQ!JJy6F9NoL6wKZQK5158up_fTRvH8X7F6JK8T7Euuf8vzbSQbr23eWa9S_yb61ksONVrLrdesCF_auzwTY8aB$
>>>>>>> |
>>>>>>> https://github.com/forax/we_are_all_to_gather/blob/master/src/main/java/com/gihtub/forax/wearealltogather/bench/MapGathererBenchmark.java
>>>>>>> ]

>>>>>>> I think that adding a way to propagate the spliterator characteristics through a
>>>>>>> Gatherer would greatly improve the performance of commons streams (at least all
>>>>>>> the ones that end with a call to toList).

>>>>>>> I have some idea of how to do that, but I prefer first to hear if i've overlook
>>>>>>> something and if improving the performance is something worth changing the API.

>>>>>>> regards,
>>>>>>> Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/core-libs-dev/attachments/20240129/ba4453fd/attachment-0001.htm>


More information about the core-libs-dev mailing list