Addition of Predicate-based findIndex and findLastIndex methods to java.util.List

Fri Apr 19 22:56:08 UTC 2024

> I think that, while finding the index of a matching
> element is not a very common task, especially among
> commercial developers, when discussing indexed streams,
> this would be the most common use case. To be honest, I
> don't even see any other use cases right now, besides
> maybe some processing based on divisibility of index
> (odd/even as the most common one). I have a pretty strong
> background in competitive programming, more than 5 years,
> and can remember anything more complex than that that
> could not be replaced with Map usage.

Here are 3 off the top of my head.

1. Fetching data from same-sized lists -- This is a trick from
Data-Oriented Design (not to be confused with Data-Oriented Programming)
where you filter down to the indexes that you want, and use the indexes to
fetch related values from the respective lists. So, I would filter down to
Player1, Player4, and Player6, then I would use their indexes to do lookups
in PlayerItemList, PlayerSkinList, etc. This tactic is especially useful
when working with structs (or value classes when Java gets them).

2. Building jumps/skips -- Fetching the indexes at hotspots, and then using
them to create a skip-list-esque structure between the hotspots. Very
useful for realigning search strategies.

3. Calculating distance using index -- This is a sister concept to bullet
2. Let's say I built my skip list above, and have determined that my
desired target is not in the hotspots. I can make a better decision to go
for parallel or sequential by seeing the distance between hotspots. That
distance would be taken by subtracting hotspot index by hotspot index.

And also, there is an entire world of index math. It is not just
divisibility.

> 1. Although indexed streams are flexible, the flexibility
> might come at the cost of readability. Syntax like
> list.FindIndex is pretty straightforward and speaks for
> itself, while indexed-stream-based approach requires
> developer to explicitly evaluate what that chain of
> method does, which would distract them from understanding
> of general program logic flow, while usually being no
> more then utility.

By this logic, many of the methods in the Stream library should instead be
in the Collection library. The are several emails on the lambda mailing
list that address why this wasn't done. I imagine that Rémi himself could
point them out to you.

> The reason for existence of findIndex is the same as for
> indexOf, so saying one is redundant means saying other
> one is redundant too.

Well hold on. I am not at all saying that it is redundant. I am saying that
the functionality that you desire can be easily recreated by using a more
far-reaching feature, the .withIndex() idea that Rémi suggested. I am just
trying to say that your idea probably should not be done on the List
interface.

> 2. Streams are performance-costy, thats the point that is
> hard to argue with. While list implementations could
> provide specified implementations of findIndex based on
> their internal structure, streams work only with generic
> interfaces. Moreover, even default implementation that
> uses ListIterator would be times faster and less resource
> consuming then generic stream-based approach. While I
> understand that performance is not as valuable as
> flexibility and clarity, it is still crucial, at least
> because of it being the main factor server maintenance
> cost after all.

Rémi already addressed a couple of these performance points in his response
to me. Long story short, Valhalla is going to make a lot of what you said
irrelevant.

Regardless, Valhalla is not here yet. So at least for now, I will concede
your point about performance.

On Fri, Apr 19, 2024 at 5:52 PM ІП-24 Олександр Ротань <
rotan.olexandr at gmail.com> wrote:

> I think that, while finding the index of a matching element is not a very
> common task, especially among commercial developers, when discussing
> indexed streams, this would be the most common use case. To be honest, I
> don't even see any other use cases right now, besides maybe some processing
> based on divisibility of index (odd/even as the most common one). I have a
> pretty strong background in competitive programming, more than 5 years, and
> can remember anything more complex than that that could not be replaced
> with Map usage.
>
> Regarding clarity, I fully agree with you. I am not a fan of those
> unreadable abbreviations in C/C++ or even in python. However, while it's
> hard to argue that your approach is more flexible, it has two downsides:
>
> 1. Although indexed streams are flexible, the flexibility might come at
> the cost of readability. Syntax like list.FindIndex is pretty
> straightforward and speaks for itself, while indexed-stream-based approach
> requires developer to explicitly evaluate what that chain of method does,
> which would distract them from understanding of general program logic flow,
> while usually being no more then utility. The reason for existence of
> findIndex is the same as for indexOf, so saying one is redundant means
> saying other one is redundant too.
>
> 2. Streams are performance-costy, thats the point that is hard to argue
> with. While list implementations could provide specified implementations of
> findIndex based on their internal structure, streams work only with generic
> interfaces. Moreover, even default implementation that uses ListIterator
> would be times faster and less resource consuming then generic stream-based
> approach. While I understand that performance is not as valuable as
> flexibility and clarity, it is still crucial, at least because of it being
> the main factor server maintenance cost after all.
>
> сб, 20 апр. 2024 г. в 00:32, David Alayachew <davidalayachew at gmail.com>:
>
>> It is not the goal of Java to be verbose, but it is also not the goal of
>> Java to be concise either. It is the goal of Java to be clear, correct, and
>> readable, in order to maximize maintainability, correctness, and
>> flexibility. When you focus on accomplishing that goal first, you will find
>> that conciseness takes care of itself.
>>
>> I understand that my approach is wordier than yours. Please also consider
>> that my suggestion does exactly one task -- stream an element with its
>> respective index. And yet, the number of ways that that can be used covers
>> hundreds of use cases while still being incredibly easy to maintain and
>> optimize. Far more than what you suggest. My feature easily composes with
>> others, with anything that can take a Stream<T> in fact.
>>
>> The bar to adding a feature to Java, even to its libraries, is high. In
>> fact, there is even a breaking point where a language/library can just be
>> too dense to work with. Too many methods, too much complexity.
>>
>> Java makes it a priority to limit adding features or functionality
>> wherever possible, only relenting when the benefit significantly outweighs
>> the cost. Adding a method like yours that only serves a tiny slice of use
>> cases just does not meet the goal of what Java is aiming for -- to get the
>> biggest bang for buck when adding a feature to language/library.
>>
>> I think that your pain point is significant and worth addressing. But I
>> think it should be done in a way that services a much larger group of users
>> and use cases. I think my idea does that.
>>
>> On Fri, Apr 19, 2024 at 5:14 PM ІП-24 Олександр Ротань <
>> rotan.olexandr at gmail.com> wrote:
>>
>>> streamWithIndex() is indeed an interesting approach, however, working
>>> with pairs of values is not always comfortable as this may make code too
>>> wordy, which may be fixed using things like properties and tuples (which I
>>> am currently working on right now btw), but still code like
>>> list.streamWithIndex().filter(Predicate).findFirst().orElse(-1) is still
>>> long, thats why i think there is room for both in Java.
>>>
>>> Also maybe instead of returning traditional -1 it would be more
>>> suitable to use OptionalInt instead, but I would like to hear community
>>> opinion about that.
>>>
>>> сб, 20 апр. 2024 г. в 00:02, David Alayachew <davidalayachew at gmail.com>:
>>>
>>>>
>>>> No Rémi, I don't think your idea is the right approach. You are working
>>>> on the wrong level of abstraction.
>>>>
>>>> Many users ask requests like this all the time, and what you are
>>>> suggesting would be even more error-prone than the equivalent for loop or the
>>>> IntStream suggestion that the user above requested. Not to mention that
>>>> getting it to parallelize would be a task many users are likely to mess up
>>>> -- either in correctness or in performance.
>>>>
>>>> I think you would get far more mileage from adding 2 methods on the
>>>> list interface streamWithIndex() and parallelStreamWithIndex() that would
>>>> return a Stream<WithIndex<T>>, as opposed to just Stream<T>.
>>>>
>>>> That way, users are not writing a custom Gatherer each time they want
>>>> to work with the index. They just have the index be a field in the object.
>>>> They can work with it the same way they would any other object field.
>>>>
>>>> Furthermore, doing it this way makes the correct answer obvious. If I
>>>> need to do something with an index, stream with the index.
>>>>
>>>> On top of that, it significantly enhances readability by making it
>>>> clear to the reader that, whatever this stream is doing will require use of
>>>> the index, so watch out for that.
>>>>
>>>>
>>>>
>>>> On Fri, Apr 19, 2024, 1:47 PM Remi Forax <forax at univ-mlv.fr> wrote:
>>>>
>>>>> Hello,
>>>>> for me, it seems what you want is  Collector on Stream which is able
>>>>> to short-circuit,
>>>>> so you can write
>>>>>   list.stream().collect(Collectors.findFirst(s -> s.contains("o")))
>>>>> and in reverse
>>>>>   list.reversed().stream().collect(Collectors.findFirst(s ->
>>>>> s.contains("o")))
>>>>>
>>>>> Using a Stream here is more general and will work with other
>>>>> collections like a LinkedHashSet for example.
>>>>> Sadly, there is no way to define a short-circuiting collector :(
>>>>>
>>>>> You can have a short-circuiting Gatherer like this
>>>>>   <T> Gatherer<T, ?, Integer> findIndex(Predicate<? super T>
>>>>> predicate) {
>>>>>     return Gatherer.ofSequential(
>>>>>       () -> new Object() { int index; },
>>>>>       Integrtor.ofGreedy((state, element, downstream) -> {
>>>>>         var index = state.index++;
>>>>>         if (predicate.test(element)) {
>>>>>           return downstream.push(index);
>>>>>         }
>>>>>         return true;
>>>>>       }));
>>>>>   }
>>>>>
>>>>> and use it like this:
>>>>>   list.stream().gather(findIndex(s ->
>>>>> s.contains("o"))).findFirst().orElse(-1);
>>>>>
>>>>> But it's more verbose.
>>>>>
>>>>> I wonder if at the same time that the Gatherer API is introduced, the
>>>>> Collector API should be enhanced to support short-circuiting collectors ?
>>>>>
>>>>> regards,
>>>>> Rémi
>>>>>
>>>>>
>>>>> ------------------------------
>>>>>
>>>>> *From: *"ІП-24 Олександр Ротань" <rotan.olexandr at gmail.com>
>>>>> *To: *"core-libs-dev" <core-libs-dev at openjdk.org>
>>>>> *Sent: *Friday, April 19, 2024 5:59:39 PM
>>>>> *Subject: *Addition of Predicate-based findIndex and findLastIndex
>>>>> methods to java.util.List
>>>>>
>>>>> Subject
>>>>> Addition of Predicate-based findIndex and findLastIndex methods to
>>>>> java.util.List
>>>>>
>>>>> Motivation
>>>>> The motivation behind this proposal is to enhance the functionality of
>>>>> the List interface by providing a more flexible way to find the index of an
>>>>> element. Currently, the indexOf and lastIndexOf methods only accept an
>>>>> object as a parameter. This limits the flexibility of these methods as they
>>>>> can only find the index of exact object matches.
>>>>>
>>>>> Here I want to propose methods that would accept a Predicate as a
>>>>> parameter, allowing users to define a condition that the desired element
>>>>> must meet. This would provide a more flexible and powerful way to find the
>>>>> index of an element in a list.
>>>>>
>>>>> The changes I am proposing are implemented in this PR:
>>>>> https://github.com/openjdk/jdk/pull/18639. Here is a brief overview
>>>>> of the changes made in this pull request:
>>>>>
>>>>> Added the findIndex  (Predicate<? super E> filter) method to the List
>>>>> interface.
>>>>> Added the findLastIndex  (Predicate<? super E> filter) method to the
>>>>> List interface.
>>>>> Implemented these methods in all non-abstract classes that implement
>>>>> the List interface, as well as List itself (default impl).
>>>>> The changes have been thoroughly tested to ensure they work as
>>>>> expected and do not introduce any regressions. The test cases cover a
>>>>> variety of scenarios to ensure the robustness of the implementation.
>>>>>
>>>>> For example, consider the following test case:
>>>>>
>>>>> List<String> list = new ArrayList<>();
>>>>> list.add("Object one");
>>>>> list.add("NotObject two");
>>>>> list.add("NotObject three");
>>>>>
>>>>> int index1 = list.findIndex(s -> s.contains("ct t"));
>>>>> System.out.println(index1); // Expected output: 1
>>>>> int index2 = list. findLastIndex(s -> s.startsWith("NotObject"));
>>>>> System.out.println(index2); // Expected output: 2
>>>>> Currently, to achieve the same result, we would have to use a more
>>>>> verbose approach:
>>>>>
>>>>> int index1 = IntStream.range(0, list.size())
>>>>>                      .filter(i -> list.get(i).contains("ct t"))
>>>>>                      .findFirst()
>>>>>                      .orElse(-1);
>>>>> System.out.println(index1); // Output: 1
>>>>> int index2 = IntStream.range(0, list.size())
>>>>>                          .filter(i ->
>>>>> list.get(i).startsWith("NotObject"))
>>>>>                          .reduce((first, second) -> second)
>>>>>                          .orElse(-1);
>>>>> System.out.println(index2); // Output: 2
>>>>> Or other approaches that require additional instructions and,
>>>>> therefore, can`t be used in all scopes (like passing argument to method).
>>>>>
>>>>> I believe these additions would greatly enhance the functionality and
>>>>> flexibility of the List interface, making it more powerful and
>>>>> user-friendly. I look forward to your feedback and am open to making any
>>>>> necessary changes based on your suggestions.
>>>>>
>>>>> The main reason I am publishing this proposal in the mailing system is
>>>>> to gather feedback from the Java developers community, especially about
>>>>> possible caveats related to backward compatibility of your projects. Would
>>>>> appreciate every opinion!
>>>>>
>>>>> Best regards
>>>>>
>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/core-libs-dev/attachments/20240419/bd1111bb/attachment-0001.htm>