RFR: 8302590: Add String.indexOf(int ch, int fromIndex, int toIndex) [v3]

Stuart Marks smarks at openjdk.org
Tue Feb 28 19:48:10 UTC 2023


On Tue, 28 Feb 2023 13:31:56 GMT, Raffaello Giulietti <rgiulietti at openjdk.org> wrote:

>> Add an `indexOf()` variant allowing to specify both a lower and an upper bound on the search.
>
> Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8302590: Add String.indexOf(int ch, int fromIndex, int toIndex)

In concept, having APIs that search a string subrange is a fine idea. Unfortunately the exception handling policy is an issue. We need to think through this carefully.

Currently, almost everything that takes some kind of String index or subrange will throw IndexOutOfBoundsException if the arguments are ill-specified in some fashion. There are a few notable outliers though:

    indexOf(int ch, int fromIndex)
    indexOf(String str, int fromIndex)
    lastIndexOf(int ch, int fromIndex)
    lastIndexOf(String str, int fromIndex)
    regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len)
    regionMatches(int toffset, String other, int ooffset, int len)

They don't throw any exceptions for ill-defined values; instead, they return -1 or false which is indistinguishable from "not found". These APIs date all the way back to 1.0. Note that the JDK 1.0 String wasn't internally consistent. For example, other 1.0-era methods like `substring` throw StringIndexOutOfBoundsException.

The prevailing API style since that time has been to check arguments and throw the appropriate exceptions, instead of masking ill-defined values by returning "not found". This tends to reveal errors earlier instead of covering them up. Compared to the existing `indexOf` methods (which specify a single starting index), the new `indexOf` method specifies a subrange; this allows a new class of "end < start" errors. It seems strange not to throw any exception for this additional class of errors.

In understand the desire to be consistent with the existing methods. Adding a new non-throwing method seems like a mistake though. It's perpetuating an old API design mistake for the sake of consistency, while also being inconsistent with current API design style.

I also don't think it's necessary to have both throwing and non-throwing methods.

I'd suggest returning to the original `indexOf(ch, from, to)` proposal, but instead having it check its subrange arguments and throwing IndexOutOfBoundsException. I don't think the exception handling inconsistency with the existing methods is that bad. If people really, really object to it, then maybe it would be necessary to choose a new name for the new method (and not introduce two versions). But choosing a good name is hard, and it introduces issues such as discoverability and questions about how the new thing differs from the existing methods, so I'm skeptical that choosing a different name would be any better.

Another possible mitigation is to add API notes to highlight the unusual behavior of the old non-throwing methods. Some of these old methods don't mention their handling of illegal index values at all. (This could be done as part of a separate PR.)

-------------

PR: https://git.openjdk.org/jdk/pull/12600


More information about the core-libs-dev mailing list