Addition of Predicate-based findIndex and findLastIndex methods to java.util.List
-
liangchenblue at gmail.com
Sat Apr 20 07:34:19 UTC 2024
Hi Rotan',
Interface methods are both exposed to callers and to implementations. Is
there any scenario where the implementation can be more efficient than the
stream/gatherer scanning approach?
indexOf or lastIndexOf may be faster if a list has some tricks like a
hashmap to quickly look up the index of an item. In contrast, the predicate
looks too generic and barely optimizable by list implementations; even if
we can stream in parallel, this is already covered by the stream API, so we
can simply declare a Gatherer to perform this findIndex operation on a
stream of elements (also note that a list can be reversed with
List::reversed() since JDK 21, so the same Gatherer can be reused for
findLastIndex)
I think if there is no scenario where implementation can do better with
findIndex, the findIndex should become a Gatherer, which has better
compatibility.
Best,
Chen Liang
On Fri, Apr 19, 2024 at 8:51 PM ІП-24 Олександр Ротань <
rotan.olexandr at gmail.com> wrote:
> Turned out this whole time I was only replying to David. I have to thank
> him for his patience explaining me how this whole mailing thing work.
>
> To summarize what has been missed out, besides what have you seen in
> quotes. I have pointed out that, while flexibility and simplicity is
> important, the main reason why I have decided to propose this feature is to
> enhance developer experience.
>
> I know that conciseness is not a main goal of Java, and therefore my
> proposal is kind of contradicts the general direction of development of
> Java as language, but my desire to make Java more user friendly is based on
> personal experience of introducing fresh devs or switchers to java and
> their feedback. Therefore, while it is not a priority for community, I
> really hope you guys don't mind me working at a backrank of Java making it
> a bit prettier.
>
> On Sat, Apr 20, 2024, 02:56 David Alayachew <davidalayachew at gmail.com>
> wrote:
>
>> > That was what i referred to as "Map-solvable" problems,
>> > but if this way of doing it is a thing, then I agree with
>> > the point.
>>
>> Doing it the Map way would not only be slower, but it would force you to
>> include more concepts than you need. Your identifier is your index.
>> Lists/arrays have index. Bringing a map in only complicates things.
>>
>> As for the rest of your response, I think I understand the trouble here.
>> You put it best when you said that you are introducing this feature with
>> the intent of improving the developer experience, specifically through
>> making code more concise.
>>
>> I can understand that goal, and it is a good goal to strive for, but
>> conciseness is one of the last things Java aims to provide as a language or
>> through its standard library. I won't tell you not to do it, but I will
>> tell you that there will be significant push back from folks here, and for
>> good reason -- conciseness is not the priority here at Java. At Java, the
>> lack of conciseness is not a weakness. Sure, unneeded verbosity can be a
>> weakness, but the solution I suggested above is basically the tempo that
>> Java strives for -- compose simple concepts so that you can create what you
>> need.
>>
>> All of that is to say, you have made your point clear. You goal is
>> conciseness and ease of access. Understanding that, it's clear that my
>> solution is not concise enough to meet that goal, so I won't push it
>> further.
>>
>> On Fri, Apr 19, 2024 at 7:26 PM ІП-24 Олександр Ротань <
>> rotan.olexandr at gmail.com> wrote:
>>
>>> " 1. Fetching data from same-sized lists -- This is a trick from
>>> Data-Oriented Design (not to be confused with Data-Oriented Programming)
>>> where you filter down to the indexes that you want, and use the indexes to
>>> fetch related values from the respective lists. So, I would filter down to
>>> Player1, Player4, and Player6, then I would use their indexes to do lookups
>>> in PlayerItemList, PlayerSkinList, etc. This tactic is especially useful
>>> when working with structs (or value classes when Java gets them)."
>>>
>>> That was what i referred to as "Map-solvable" problems, but if this way
>>> of doing it is a thing, then I agree with the point.
>>>
>>> "By this logic, many of the methods in the Stream library should
>>> instead be in the Collection library. There are several emails on the
>>> lambda mailing list that address why this wasn't done. I imagine that Rémi
>>> himself could point them out to you."
>>>
>>> It may be just my opinion, but I don't see anything harmful in some
>>> shortcuts that could be optimized, especially if they still converge to a
>>> single point under the hood. I would love to read mails about maintainers'
>>> opinions on why some features of Streams couldnt be also a part of
>>> Collection library if it would enhance developer experience, so, if it
>>> wouldn't be hard for you, could you attach some links? Also I guess a good
>>> compromise would be extension methods, I am currently exploring
>>> possibilities of their implementation.
>>>
>>> "Well hold on. I am not at all saying that it is redundant. I am saying
>>> that the functionality that you desire can be easily recreated by using a
>>> more far-reaching feature, the .withIndex() idea that Rémi suggested. I am
>>> just trying to say that your idea probably should not be done on the List
>>> interface."
>>>
>>> I am not saying that .withIndex() is a bad idea in any way. I can
>>> remember several times when I was desperately looking for something like
>>> python enumerate(), but all I found were some IntStream-based approaches.
>>> What I am saying is, while there is definitely room for both, withIndex
>>> does not solve the initial problem that motivated me to propose such
>>> changes: redundant (emphasis on this) verbosity. The main reason I decided
>>> to become an openjdk contributor is because I love to create tools that
>>> make development easier and more pleasant, and I feel more sophisticated,
>>> although a more specific API would deal much better with this task, and as
>>> this is my personal priority, I would stand my ground here.
>>>
>>> "Regardless, Valhalla is not here yet. So at least for now, I will
>>> concede your point about performance."
>>>
>>> Valhalla, at least for now, is nowhere near to a release even as a
>>> preview, would be great if it will get to Java releases until the next LTS.
>>> I think that even, even for a 2-4 years, this point is valid, then it at
>>> least should be taken into consideration.
>>>
>>> PS: As you might notice, I am a developer experience guy rather than a
>>> flexible guy. My opinion is a result of some personal beliefs, but also is
>>> an aggregation of opinions of numbers of other programmers that I
>>> introduced to Java, especially switchers. Many of then feel those
>>> "wordiness" of Java and it repulses potential developers, especially when
>>> there is C# just by the corner. I think that Java has really strong sides,
>>> but I would love to cover its "weaknesses" as well, especially if it
>>> doesn't come with any maintenance burdens.
>>>
>>> сб, 20 апр. 2024 г. в 02:00, David Alayachew <davidalayachew at gmail.com>:
>>>
>>>> No confusion. You spoke clearly enough to be understood the first time.
>>>>
>>>> On Fri, Apr 19, 2024 at 6:28 PM ІП-24 Олександр Ротань <
>>>> rotan.olexandr at gmail.com> wrote:
>>>>
>>>>> Regarding the first point in my last message. "Them" is referring to
>>>>> the developer, and "evaluate" means to be processed by the developer. Sorry
>>>>> for the potential confusion
>>>>>
>>>>> сб, 20 апр. 2024 г. в 00:52, ІП-24 Олександр Ротань <
>>>>> rotan.olexandr at gmail.com>:
>>>>>
>>>>>> I think that, while finding the index of a matching element is not a
>>>>>> very common task, especially among commercial developers, when discussing
>>>>>> indexed streams, this would be the most common use case. To be honest, I
>>>>>> don't even see any other use cases right now, besides maybe some processing
>>>>>> based on divisibility of index (odd/even as the most common one). I have a
>>>>>> pretty strong background in competitive programming, more than 5 years, and
>>>>>> can remember anything more complex than that that could not be replaced
>>>>>> with Map usage.
>>>>>>
>>>>>> Regarding clarity, I fully agree with you. I am not a fan of those
>>>>>> unreadable abbreviations in C/C++ or even in python. However, while it's
>>>>>> hard to argue that your approach is more flexible, it has two downsides:
>>>>>>
>>>>>> 1. Although indexed streams are flexible, the flexibility might come
>>>>>> at the cost of readability. Syntax like list.FindIndex is pretty
>>>>>> straightforward and speaks for itself, while indexed-stream-based approach
>>>>>> requires developer to explicitly evaluate what that chain of method does,
>>>>>> which would distract them from understanding of general program logic flow,
>>>>>> while usually being no more then utility. The reason for existence of
>>>>>> findIndex is the same as for indexOf, so saying one is redundant means
>>>>>> saying other one is redundant too.
>>>>>>
>>>>>> 2. Streams are performance-costy, thats the point that is hard to
>>>>>> argue with. While list implementations could provide specified
>>>>>> implementations of findIndex based on their internal structure, streams
>>>>>> work only with generic interfaces. Moreover, even default implementation
>>>>>> that uses ListIterator would be times faster and less resource consuming
>>>>>> then generic stream-based approach. While I understand that performance is
>>>>>> not as valuable as flexibility and clarity, it is still crucial, at least
>>>>>> because of it being the main factor server maintenance cost after all.
>>>>>>
>>>>>> сб, 20 апр. 2024 г. в 00:32, David Alayachew <
>>>>>> davidalayachew at gmail.com>:
>>>>>>
>>>>>>> It is not the goal of Java to be verbose, but it is also not the
>>>>>>> goal of Java to be concise either. It is the goal of Java to be clear,
>>>>>>> correct, and readable, in order to maximize maintainability, correctness,
>>>>>>> and flexibility. When you focus on accomplishing that goal first, you will
>>>>>>> find that conciseness takes care of itself.
>>>>>>>
>>>>>>> I understand that my approach is wordier than yours. Please also
>>>>>>> consider that my suggestion does exactly one task -- stream an element with
>>>>>>> its respective index. And yet, the number of ways that that can be used
>>>>>>> covers hundreds of use cases while still being incredibly easy to maintain
>>>>>>> and optimize. Far more than what you suggest. My feature easily composes
>>>>>>> with others, with anything that can take a Stream<T> in fact.
>>>>>>>
>>>>>>> The bar to adding a feature to Java, even to its libraries, is high.
>>>>>>> In fact, there is even a breaking point where a language/library can just
>>>>>>> be too dense to work with. Too many methods, too much complexity.
>>>>>>>
>>>>>>> Java makes it a priority to limit adding features or functionality
>>>>>>> wherever possible, only relenting when the benefit significantly outweighs
>>>>>>> the cost. Adding a method like yours that only serves a tiny slice of use
>>>>>>> cases just does not meet the goal of what Java is aiming for -- to get the
>>>>>>> biggest bang for buck when adding a feature to language/library.
>>>>>>>
>>>>>>> I think that your pain point is significant and worth addressing.
>>>>>>> But I think it should be done in a way that services a much larger group of
>>>>>>> users and use cases. I think my idea does that.
>>>>>>>
>>>>>>> On Fri, Apr 19, 2024 at 5:14 PM ІП-24 Олександр Ротань <
>>>>>>> rotan.olexandr at gmail.com> wrote:
>>>>>>>
>>>>>>>> streamWithIndex() is indeed an interesting approach, however,
>>>>>>>> working with pairs of values is not always comfortable as this may make
>>>>>>>> code too wordy, which may be fixed using things like properties and tuples
>>>>>>>> (which I am currently working on right now btw), but still code like
>>>>>>>> list.streamWithIndex().filter(Predicate).findFirst().orElse(-1) is still
>>>>>>>> long, thats why i think there is room for both in Java.
>>>>>>>>
>>>>>>>> Also maybe instead of returning traditional -1 it would be more
>>>>>>>> suitable to use OptionalInt instead, but I would like to hear community
>>>>>>>> opinion about that.
>>>>>>>>
>>>>>>>> сб, 20 апр. 2024 г. в 00:02, David Alayachew <
>>>>>>>> davidalayachew at gmail.com>:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> No Rémi, I don't think your idea is the right approach. You are
>>>>>>>>> working on the wrong level of abstraction.
>>>>>>>>>
>>>>>>>>> Many users ask requests like this all the time, and what you are
>>>>>>>>> suggesting would be even more error-prone than the equivalent for loop or the
>>>>>>>>> IntStream suggestion that the user above requested. Not to mention
>>>>>>>>> that getting it to parallelize would be a task many users are likely to
>>>>>>>>> mess up -- either in correctness or in performance.
>>>>>>>>>
>>>>>>>>> I think you would get far more mileage from adding 2 methods on
>>>>>>>>> the list interface streamWithIndex() and parallelStreamWithIndex() that
>>>>>>>>> would return a Stream<WithIndex<T>>, as opposed to just Stream<T>.
>>>>>>>>>
>>>>>>>>> That way, users are not writing a custom Gatherer each time they
>>>>>>>>> want to work with the index. They just have the index be a field in the
>>>>>>>>> object. They can work with it the same way they would any other object
>>>>>>>>> field.
>>>>>>>>>
>>>>>>>>> Furthermore, doing it this way makes the correct answer obvious.
>>>>>>>>> If I need to do something with an index, stream with the index.
>>>>>>>>>
>>>>>>>>> On top of that, it significantly enhances readability by making it
>>>>>>>>> clear to the reader that, whatever this stream is doing will require use of
>>>>>>>>> the index, so watch out for that.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Apr 19, 2024, 1:47 PM Remi Forax <forax at univ-mlv.fr>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>> for me, it seems what you want is Collector on Stream which is
>>>>>>>>>> able to short-circuit,
>>>>>>>>>> so you can write
>>>>>>>>>> list.stream().collect(Collectors.findFirst(s ->
>>>>>>>>>> s.contains("o")))
>>>>>>>>>> and in reverse
>>>>>>>>>> list.reversed().stream().collect(Collectors.findFirst(s ->
>>>>>>>>>> s.contains("o")))
>>>>>>>>>>
>>>>>>>>>> Using a Stream here is more general and will work with other
>>>>>>>>>> collections like a LinkedHashSet for example.
>>>>>>>>>> Sadly, there is no way to define a short-circuiting collector :(
>>>>>>>>>>
>>>>>>>>>> You can have a short-circuiting Gatherer like this
>>>>>>>>>> <T> Gatherer<T, ?, Integer> findIndex(Predicate<? super T>
>>>>>>>>>> predicate) {
>>>>>>>>>> return Gatherer.ofSequential(
>>>>>>>>>> () -> new Object() { int index; },
>>>>>>>>>> Integrtor.ofGreedy((state, element, downstream) -> {
>>>>>>>>>> var index = state.index++;
>>>>>>>>>> if (predicate.test(element)) {
>>>>>>>>>> return downstream.push(index);
>>>>>>>>>> }
>>>>>>>>>> return true;
>>>>>>>>>> }));
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> and use it like this:
>>>>>>>>>> list.stream().gather(findIndex(s ->
>>>>>>>>>> s.contains("o"))).findFirst().orElse(-1);
>>>>>>>>>>
>>>>>>>>>> But it's more verbose.
>>>>>>>>>>
>>>>>>>>>> I wonder if at the same time that the Gatherer API is introduced,
>>>>>>>>>> the Collector API should be enhanced to support short-circuiting collectors
>>>>>>>>>> ?
>>>>>>>>>>
>>>>>>>>>> regards,
>>>>>>>>>> Rémi
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ------------------------------
>>>>>>>>>>
>>>>>>>>>> *From: *"ІП-24 Олександр Ротань" <rotan.olexandr at gmail.com>
>>>>>>>>>> *To: *"core-libs-dev" <core-libs-dev at openjdk.org>
>>>>>>>>>> *Sent: *Friday, April 19, 2024 5:59:39 PM
>>>>>>>>>> *Subject: *Addition of Predicate-based findIndex and
>>>>>>>>>> findLastIndex methods to java.util.List
>>>>>>>>>>
>>>>>>>>>> Subject
>>>>>>>>>> Addition of Predicate-based findIndex and findLastIndex methods
>>>>>>>>>> to java.util.List
>>>>>>>>>>
>>>>>>>>>> Motivation
>>>>>>>>>> The motivation behind this proposal is to enhance the
>>>>>>>>>> functionality of the List interface by providing a more flexible way to
>>>>>>>>>> find the index of an element. Currently, the indexOf and lastIndexOf
>>>>>>>>>> methods only accept an object as a parameter. This limits the flexibility
>>>>>>>>>> of these methods as they can only find the index of exact object matches.
>>>>>>>>>>
>>>>>>>>>> Here I want to propose methods that would accept a Predicate as a
>>>>>>>>>> parameter, allowing users to define a condition that the desired element
>>>>>>>>>> must meet. This would provide a more flexible and powerful way to find the
>>>>>>>>>> index of an element in a list.
>>>>>>>>>>
>>>>>>>>>> The changes I am proposing are implemented in this PR:
>>>>>>>>>> https://github.com/openjdk/jdk/pull/18639. Here is a brief
>>>>>>>>>> overview of the changes made in this pull request:
>>>>>>>>>>
>>>>>>>>>> Added the findIndex (Predicate<? super E> filter) method to the
>>>>>>>>>> List interface.
>>>>>>>>>> Added the findLastIndex (Predicate<? super E> filter) method to
>>>>>>>>>> the List interface.
>>>>>>>>>> Implemented these methods in all non-abstract classes that
>>>>>>>>>> implement the List interface, as well as List itself (default impl).
>>>>>>>>>> The changes have been thoroughly tested to ensure they work as
>>>>>>>>>> expected and do not introduce any regressions. The test cases cover a
>>>>>>>>>> variety of scenarios to ensure the robustness of the implementation.
>>>>>>>>>>
>>>>>>>>>> For example, consider the following test case:
>>>>>>>>>>
>>>>>>>>>> List<String> list = new ArrayList<>();
>>>>>>>>>> list.add("Object one");
>>>>>>>>>> list.add("NotObject two");
>>>>>>>>>> list.add("NotObject three");
>>>>>>>>>>
>>>>>>>>>> int index1 = list.findIndex(s -> s.contains("ct t"));
>>>>>>>>>> System.out.println(index1); // Expected output: 1
>>>>>>>>>> int index2 = list. findLastIndex(s -> s.startsWith("NotObject"));
>>>>>>>>>> System.out.println(index2); // Expected output: 2
>>>>>>>>>> Currently, to achieve the same result, we would have to use a
>>>>>>>>>> more verbose approach:
>>>>>>>>>>
>>>>>>>>>> int index1 = IntStream.range(0, list.size())
>>>>>>>>>> .filter(i -> list.get(i).contains("ct t"))
>>>>>>>>>> .findFirst()
>>>>>>>>>> .orElse(-1);
>>>>>>>>>> System.out.println(index1); // Output: 1
>>>>>>>>>> int index2 = IntStream.range(0, list.size())
>>>>>>>>>> .filter(i ->
>>>>>>>>>> list.get(i).startsWith("NotObject"))
>>>>>>>>>> .reduce((first, second) -> second)
>>>>>>>>>> .orElse(-1);
>>>>>>>>>> System.out.println(index2); // Output: 2
>>>>>>>>>> Or other approaches that require additional instructions and,
>>>>>>>>>> therefore, can`t be used in all scopes (like passing argument to method).
>>>>>>>>>>
>>>>>>>>>> I believe these additions would greatly enhance the functionality
>>>>>>>>>> and flexibility of the List interface, making it more powerful and
>>>>>>>>>> user-friendly. I look forward to your feedback and am open to making any
>>>>>>>>>> necessary changes based on your suggestions.
>>>>>>>>>>
>>>>>>>>>> The main reason I am publishing this proposal in the mailing
>>>>>>>>>> system is to gather feedback from the Java developers community, especially
>>>>>>>>>> about possible caveats related to backward compatibility of your projects.
>>>>>>>>>> Would appreciate every opinion!
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/core-libs-dev/attachments/20240420/e23fb8b6/attachment-0001.htm>
More information about the core-libs-dev
mailing list