RFR: 8254807: Optimize startsWith() for String.substring()

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Mon Nov 16 12:19:27 UTC 2020


Hi Xin,

> the optimization transforms code from s=substring(base, beg, end); s.startsWith(prefix)
> to substring(base, beg, end) | base.startsWith(prefix, beg).
> 
> it reduces an use of substring. hopefully c2 optimizer can remove the used substring()

It would be very helpful to see a more detailed description of the 
intended behavior, to better understand the goal of the enhancement.
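
As I read it, the transformation can be modeled at the Java level 
roughly as follows. This is only a sketch: the class and method names 
are made up for illustration, the patch itself operates on C2's IR 
rather than on source code, and the length guard is my assumption about 
what equivalence requires when the prefix is longer than the substring.

```java
// Hypothetical Java-level model of the rewrite; not code from the patch.
final class SubstringPrefixSketch {

    // Original shape: materialize the substring, then test the prefix.
    static boolean viaSubstring(String base, int beg, int end, String prefix) {
        String s = base.substring(beg, end);
        return s.startsWith(prefix);
    }

    // Rewritten shape: test the prefix against base at offset beg.
    // Assumption: the bounds check of substring() has to be preserved, and
    // the prefix must fit inside [beg, end), otherwise it could match
    // characters of base that lie past the end of the substring.
    static boolean viaOffset(String base, int beg, int end, String prefix) {
        if (beg < 0 || end > base.length() || beg > end) {
            throw new StringIndexOutOfBoundsException(
                "begin " + beg + ", end " + end + ", length " + base.length());
        }
        return prefix.length() <= end - beg && base.startsWith(prefix, beg);
    }

    public static void main(String[] args) {
        String base = "hello world";
        // The prefix "world" is longer than the slice "wor".
        System.out.println(viaSubstring(base, 6, 9, "world"));
        System.out.println(viaOffset(base, 6, 9, "world"));
        // The unguarded form looks past the end of the slice.
        System.out.println(base.startsWith("world", 6));
    }
}
```

The third call in main shows why base.startsWith(prefix, beg) alone is 
not a drop-in replacement: it can see characters beyond "end".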

Though it looks attractive to address the problem in the JIT-compiler, 
it poses some new challenges which make the proposed approach 
questionable. I understand your desire to rely on existing 
String-related optimizations, but coalescing multiple concatenations 
differs significantly from your case.

Some of the concerns/questions I had while briefly looking through the 
patch:

- You introduce a call to a method (String::startsWith(String, int)) 
which is not present in the bytecode. It means (unless the method is 
called from different places) there won't be any profiling info present, 
and even if there is, it is unrelated to the call site being compiled. 
If the code is recompiled at some point (e.g., it hits an uncommon trap 
in the startsWith(String,int) overload), there won't be any re-profiling 
happening. So, it can hit the very same condition on subsequent 
recompilations.


- The "s.startsWith(prefix)" call node is reused and rewired to call an 
unrelated method, "base.startsWith(prefix, beg)". Will the new target 
method be taken into account during inlining? I would be much more 
comfortable seeing a fresh call generated using the standard process 
(involving CallGenerators et al) which then substitutes the original 
node. That way you make it less fragile longer term.


- "hopefully c2 optimizer can remove the used substring()"
   If everything is inlined in the end, it can happen, but it's fragile. 
Instead, you could teach C2 that the method is "pure" (no interesting 
side effects to care about) and cut the call early. It already happens 
for boxing methods (see LateInlineCallGenerator::_is_pure_call for 
details).


Overall, if you want to keep the enhancement C2-specific, I'd suggest 
looking into intrinsifying String::startsWith(String, int) and not 
relying on its bytecodes at all. That way you would avoid fighting 
against the rest of the JVM in some situations.
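
For reference, the contract such an intrinsic would have to preserve can 
be written down at the Java level. The sketch below is a hypothetical 
reference model based on the documented behavior of 
String.startsWith(String, int); it is not the JDK implementation and not 
intrinsic code, and the class and method names are invented for 
illustration.

```java
// Hypothetical reference model of String.startsWith(String, int).
final class StartsWithAtOffsetSketch {

    static boolean startsWithAt(String base, String prefix, int toffset) {
        // Documented behavior: the result is false (no exception) when the
        // offset is negative or the prefix does not fit at that offset.
        if (toffset < 0 || toffset > base.length() - prefix.length()) {
            return false;
        }
        // Plain character-by-character comparison starting at toffset.
        for (int i = 0; i < prefix.length(); i++) {
            if (base.charAt(toffset + i) != prefix.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Compare the model against the library method on a few inputs.
        String base = "hello world";
        int[] offsets = {-1, 0, 6, 9, 11, 12};
        for (int off : offsets) {
            boolean model = startsWithAt(base, "wor", off);
            boolean lib = base.startsWith("wor", off);
            System.out.println(off + ": model=" + model + " library=" + lib);
        }
    }
}
```

Out-of-range offsets yield false rather than an exception, so an 
intrinsic would not need an uncommon trap for that case.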

Best regards,
Vladimir Ivanov

> Commit messages:
>   - fix a regression test on x86_32
>   - 8254807: Optimize startsWith() for String.substring()
>   - 8254807: Optimize startsWith() for String.substring()
>   - 8254807: Optimize startsWith() for String.substring()
>   - 8254807: Optimize startsWith() for String.substring()
>   - 8254807: Optimize startsWith() for String.substring()
>   - 8254807: Optimize startsWith() for String.substring()
> 
> Changes: https://git.openjdk.java.net/jdk/pull/974/files
>   Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=974&range=00
>    Issue: https://bugs.openjdk.java.net/browse/JDK-8254807
>    Stats: 538 lines in 15 files changed: 472 ins; 56 del; 10 mod
>    Patch: https://git.openjdk.java.net/jdk/pull/974.diff
>    Fetch: git fetch https://git.openjdk.java.net/jdk pull/974/head:pull/974
> 
> PR: https://git.openjdk.java.net/jdk/pull/974
> 

