RFR: 8302590: Add String.indexOf(int ch, int fromIndex, int toIndex) [v3]

Raffaello Giulietti rgiulietti at openjdk.org
Tue Feb 28 22:16:16 UTC 2023


On Tue, 28 Feb 2023 19:45:02 GMT, Stuart Marks <smarks at openjdk.org> wrote:

>> Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   8302590: Add String.indexOf(int ch, int fromIndex, int toIndex)
>
> In concept, having APIs that search a string subrange is a fine idea. Unfortunately the exception handling policy is an issue. We need to think through this carefully.
> 
> Currently, almost everything that takes some kind of String index or subrange will throw IndexOutOfBoundsException if the arguments are ill-specified in some fashion. There are a few notable outliers though:
> 
>     indexOf(int ch, int fromIndex)
>     indexOf(String str, int fromIndex)
>     lastIndexOf(int ch, int fromIndex)
>     lastIndexOf(String str, int fromIndex)
>     regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len)
>     regionMatches(int toffset, String other, int ooffset, int len)
> 
> They don't throw any exceptions for ill-defined values; instead, they return -1 or false which is indistinguishable from "not found". These APIs date all the way back to 1.0. Note that the JDK 1.0 String wasn't internally consistent. For example, other 1.0-era methods like `substring` throw StringIndexOutOfBoundsException.
> 
> The prevailing API style since that time has been to check arguments and throw the appropriate exceptions, instead of masking ill-defined values by returning "not found". This tends to reveal errors earlier instead of covering them up. Compared to the existing `indexOf` methods (which specify a single starting index), the new `indexOf` method specifies a subrange; this allows a new class of "end < start" errors. It seems strange not to throw any exception for this additional class of errors.
> 
> In understand the desire to be consistent with the existing methods. Adding a new non-throwing method seems like a mistake though. It's perpetuating an old API design mistake for the sake of consistency, while also being inconsistent with current API design style.
> 
> I also don't think it's necessary to have both throwing and non-throwing methods.
> 
> I'd suggest returning to the original `indexOf(ch, from, to)` proposal, but instead having it check its subrange arguments and throwing IndexOutOfBoundsException. I don't think the exception handling inconsistency with the existing methods is that bad. If people really, really object to it, then maybe it would be necessary to choose a new name for the new method (and not introduce two versions). But choosing a good name is hard, and it introduces issues such as discoverability and questions about how the new thing differs from the existing methods, so I'm skeptical that choosing a different name would be any better.
> 
> Another possible mitigation is to add API notes to highlight the unusual behavior of the old non-throwing methods. Some of these old methods don't mention their handling of illegal index values at all. (This could be done as part of a separate PR.)

@stuart-marks I agree.

My insistence on preserving old (but bad) behavior was well intended, but I'm now convinced that the new 3 params `indexOf()` better throws on illegal indices.

I'll thus add a check in its implementation, adapt the spec, remove the additional `checkedIndexOf()` method, and add API notes to the existing methods to highlight their unusual behavior w.r.t. illegal indices (assuming that adding these notes can be done in the same CSR issue).

-------------

PR: https://git.openjdk.org/jdk/pull/12600


More information about the core-libs-dev mailing list