EnumeratedStream

Sun Apr 21 01:03:32 UTC 2024

On Sat, Apr 20, 2024 at 7:44 PM ІП-24 Олександр Ротань <
rotan.olexandr at gmail.com> wrote:

> Also, an enumerated stream should support index-aware terminal
> operations, which gatherers are incapable of, so it would also require
> creating index-aware collectors. I am not aware whether this is even
> possible, but it looks like another separate piece of functionality in a
> different place, and some developers might simply not be aware of its
> existence. I think that we, as language devs, should think not only about
> what is possible in the language, but also about whether it is comfortable
> and obvious for the user.
>
Gatherers can be many-to-one. A gatherer has a state A, and it can choose
to emit only a single element R in its finisher (that is, its integrator
only touches the state and the element and ignores the downstream). You can
then use findAny().orElseThrow() to access that single, collector-like
result. That said, the factory I proposed can try to wrap Collectors the
same way it wraps Gatherers. See the conclusion below.
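
To make the many-to-one idea concrete, here is a minimal sketch, assuming
JDK 24 or later (where java.util.stream.Gatherer is a final API). The class
ManyToOneGatherer and the helper indexOfLongest are hypothetical names made
up for this illustration; they are not part of the JDK or of the proposal.

```java
import java.util.stream.Gatherer;
import java.util.stream.Stream;

final class ManyToOneGatherer {
    private ManyToOneGatherer() {}

    // Mutable state: the running index plus the best candidate seen so far.
    private static final class State {
        long nextIndex;
        long bestIndex = -1; // -1 means "no element seen"
        int bestLength = -1;
    }

    // Hypothetical helper: emits exactly one element, the index of the
    // longest string (the first one wins on ties).
    static Gatherer<String, ?, Long> indexOfLongest() {
        Gatherer<String, State, Long> gatherer = Gatherer.ofSequential(
            State::new,
            // Integrator: only updates the state, never pushes downstream.
            (state, element, downstream) -> {
                if (element.length() > state.bestLength) {
                    state.bestLength = element.length();
                    state.bestIndex = state.nextIndex;
                }
                state.nextIndex++;
                return true; // keep consuming input
            },
            // Finisher: pushes the single result once the input is exhausted.
            (state, downstream) -> downstream.push(state.bestIndex));
        return gatherer;
    }

    // Usage: unwrap the single emitted element with findAny().orElseThrow().
    static long demo() {
        return Stream.of("a", "bbb", "cc")
            .gather(indexOfLongest())
            .findAny()
            .orElseThrow();
    }
}
```

Here demo() yields 1, the position of "bbb".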

> On Sun, Apr 21, 2024, 03:36 ІП-24 Олександр Ротань <
> rotan.olexandr at gmail.com> wrote:
>
>> Yes, I think every possible intermediate operation could be made index
>> aware using gatherers. The point is: should it be turned?
>>
>> As developers of the JDK itself, we are not limited in the ways we can
>> provide tools for Java users, especially when it comes to adding completely
>> new features and not modifying existing APIs.
>>
Creating a new type of pipeline is only going to blow up the complexity of
Streams; having IntStream, LongStream, DoubleStream and Stream, together
with the 4-way spliterators (Spliterator.OfInt etc.) and iterators
(PrimitiveIterator.OfInt etc.), is already an API nightmare. Indexing would
double every intermediate and terminal operation, further blowing up the
API, which I don't believe is the right direction to go.

>>
>> The gatherer-based approach looks as if we were developers of a third-party
>> library who have to look for workarounds instead of directly adding the
>> features we need. Its syntax is wordier without any payoff in
>> flexibility, and it would obviously be slower and more memory-hungry.
>>
Gatherer is not a "third-party hook" but an essential API that represents
all possible stream operations, including Collector. Gatherer would not be
slow; it already supports short-circuiting and should not add extra
overhead.

>>
>> To me it seems that implementing this using a gatherer would only
>> introduce unnecessary intermediate steps in the operation's internal
>> pipeline without any visible payoff.
>>
Implementing with Gatherer would reduce useless API exposure, as indexed
operations aren't that frequently used and are useless in parallel
scenarios. Moreover, these indexed operations can't be optimized by
stream subclasses, much as findIndex is not helpful in Lists: Predicates
can't be easily inspected, unlike Object equality/hashCode, which some
Lists can use to speed up indexOf.

>
>> Also, indexes created this way will be inconsistent between operations,
>> and I am not sure if that is what we are looking for.
>>
We declare an index-aware gatherer, and the factory I described converts it
to an index-unaware, sequential gatherer; the factory gatherer prepares the
indices before calling our index-aware gatherer.
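
One possible shape for such a factory, as an illustrative sketch rather than
the actual proposal code: it assumes JDK 24 or later, and the record Indexed,
the class IndexedGatherers and the demo method are hypothetical names chosen
here. A sequential numbering stage attaches the indices and is then composed
with the index-aware gatherer via andThen.

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

// Hypothetical carrier pairing an element with its position in the stream.
record Indexed<T>(long index, T value) {}

final class IndexedGatherers {
    private IndexedGatherers() {}

    // Wraps an index-aware gatherer so it can be used on a plain Stream<T>.
    static <T, A, R> Gatherer<T, ?, R> indexed(Gatherer<Indexed<T>, A, R> gatherer) {
        // Sequential numbering stage: a one-slot counter is the only state.
        Gatherer<T, long[], Indexed<T>> numbering = Gatherer.ofSequential(
            () -> new long[1],
            (counter, element, downstream) ->
                downstream.push(new Indexed<T>(counter[0]++, element)));
        // The index-aware gatherer sees elements already paired with indices.
        return numbering.andThen(gatherer);
    }

    // Usage: keep only the elements at even positions.
    static List<String> demo() {
        Gatherer<Indexed<String>, ?, String> evenPositions = Gatherer.ofSequential(
            (unused, item, downstream) ->
                item.index() % 2 != 0 || downstream.push(item.value()));
        return Stream.of("a", "b", "c", "d")
            .gather(indexed(evenPositions))
            .toList();
    }
}
```

In this sketch demo() returns [a, c]. The indices are assigned once, by the
numbering stage, so the wrapped gatherer only ever sees that one numbering.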

For my factory, if you think my 4-line syntax above is too verbose, we can
encapsulate it in convenience methods such as
public static <T> Gatherer<T, ?, T> filter(Predicate<Indexed<T>> predicate)
etc.

And the primary methods will be:
public static <T, A, R> Gatherer<T, ?, R> indexed(Gatherer<Indexed<T>, A, R> gatherer)
public static <T, A, R> Collector<T, ?, R> indexed(Collector<Indexed<T>, A, R> collector)
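
For the Collector overload, a minimal sketch might look like the following.
It uses only the long-standing Collector API plus records (Java 16+). The
record Indexed, the class IndexedCollectors, the State holder and the demo
method are hypothetical names for illustration, and the sketch assumes
sequential streams only, so its combiner simply refuses to merge.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical carrier pairing an element with its position in the stream.
record Indexed<T>(long index, T value) {}

final class IndexedCollectors {
    private IndexedCollectors() {}

    // Pairs the wrapped collector's container with the next index to assign.
    private static final class State<A> {
        final A container;
        long nextIndex;
        State(A container) { this.container = container; }
    }

    // Wraps an index-aware collector so it can be used on a plain Stream<T>.
    static <T, A, R> Collector<T, ?, R> indexed(Collector<Indexed<T>, A, R> collector) {
        Collector<T, State<A>, R> wrapped = Collector.of(
            () -> new State<A>(collector.supplier().get()),
            // Attach the index, then delegate to the wrapped accumulator.
            (state, element) -> collector.accumulator()
                .accept(state.container, new Indexed<T>(state.nextIndex++, element)),
            // Indices are meaningless across parallel chunks in this sketch.
            (left, right) -> {
                throw new UnsupportedOperationException("sequential streams only");
            },
            state -> collector.finisher().apply(state.container));
        return wrapped;
    }

    // Usage: label every element with its position.
    static List<String> demo() {
        Collector<Indexed<String>, ?, List<String>> labelled = Collectors.mapping(
            (Indexed<String> item) -> item.index() + ":" + item.value(),
            Collectors.toList());
        return Stream.of("a", "b", "c").collect(indexed(labelled));
    }
}
```

In this sketch demo() returns [0:a, 1:b, 2:c].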