RFR: 8254807: Optimize startsWith() for String.substring() [v5]

Xin Liu xliu at openjdk.java.net
Mon Dec 14 23:05:56 UTC 2020


On Mon, 14 Dec 2020 21:25:55 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> This is a lot of churn and a lot of code surface area to gain a small benefit.  I don't think this sort of change takes our code base in a healthy direction.  Is it really worth it?
>> 
>> Somebody deeply familiar with the existing string-builder optimizations needs to look at this carefully, and make sure that it is maintainable going forward into the future.  Once this integrates, we pay for the maintenance forever.
>> 
>> Even more basically, I think it may be an erroneous transformation.  Given `base = "abcd"` and `s = base.substring(base, 1, 2)` (i.e., `base = "b"`), the boolean `z1 = s.startsWith("bc")` is false even though the boolean `z2 = base.substring("bc", 1)` is true.  This means it is invalid to replace `z1` with `z2`.  If the optimization were to make this transform, deeply unpredictable things could happen in an application.
>> 
>> Now, I looked for the conditional that excludes this particular error case.  I didn't find it.  This tells me that one of two things is true:  (a) The proposed change is incorrect, or (b) the proposed change is so subtle it will be impossible to maintain in a state of correctness.
>> 
>> For these reasons, I am skeptical that (a) the change should be integrated, and (b) any similar change should be integrated.
>> 
>> Somebody who is deeply familiar with the existing string optimizations may have a different opinion, and I would be happy to accept their reassurances.
>
> I agree with John that these changes are moving in wrong direction.
> I would like to know why Escape Analysis does not eliminate substring object if it is only used for StartsWith():
>  public static boolean foo(String s) { 
>      return s.substring(1).startsWith("a"); 
>  }
> Improving EA will get you more benefits.

Thanks for the feedbacks. They are extremely valuable for me. I am working on the issues that Vladimir Ivanov's comment about 'late inlining skips the overloaded methods" and the bug of "too short substring" from John's comment.

I just want to say that it's an opener PR. I try sell a technique I called it "API simplification".  From my observation, many java code can be more efficient if Java programmers just know "one more API".  Even they’ve known, we can't blame them for the hindsight because sometimes certain patterns such as s.substring(1).startsWith("a") only emerge after deep inlining.

Api simplification is here for help them out. I guess the real gain will be seen in this split case, even though it follows the same recipe here.
https://github.com/navyxliu/StringFunc/blob/master/src/com/amazon/jdkteam/brownbag/SplitAndPick.java#L57

For the startsWith case, I did spend some time to research it.  C2 can eliminate AllocateNode of newstring indeed, but I found c2 failed to eliminate two expensive nodes:  AllocateArray and ArrayCopy for the underlying byte buffer.  I don’t think this problem can be perfectly solved by EA and Scalar Replacement. Essentially, it’s redundancy and no new object should be allocated all all.

-------------

PR: https://git.openjdk.java.net/jdk/pull/974


More information about the hotspot-compiler-dev mailing list