RFR: 8254807: Optimize startsWith() for String.substring() [v5]

Thu Dec 17 04:39:57 UTC 2020

On Wed, 16 Dec 2020 03:54:09 GMT, Xin Liu <xliu at openjdk.org> wrote:

>>> _Mailing list message from [Vladimir Ivanov](mailto:vladimir.x.ivanov at oracle.com) on [hotspot-compiler-dev](mailto:hotspot-compiler-dev at openjdk.java.net):_
>>> 
>>> > > - You introduce a call to a method (String::startsWith(String, int))
>>> > > which is not present in bytecode. It means (unless the method is called
>>> > > from different places) there won't be any profiling info present, and
>>> > > even if there is, it is unrelated to the call site being compiled. If
>>> > > the code is recompiled at some point (e.g., it hits an uncommon trap in
>>> > > startsWith(String,int) overload), there won't be any re-profiling
>>> > > happening. So, it can hit the very same condition on the consequent
>>> > > recompilations.
>>> > 
>>> > 
>>> > Thank you to raise this! I didn't consider the profiling data before.
>>> > it is an issue. Maybe it can explain why I observe only 80% performance of hand-craft code.
>>> > TBH, I haven't had a clear and solid answer for it yet.
>>> > The transformation actually won't alter control-flow. Perhaps, I can find a way to amend methoddata from the old method.
>>> > Or, is that possible I request to reprofile it after I modify some bytecodes?
>>> 
>>> Sorry, I overlooked your question.
>>> 
>>> To be clear: the transformations during C2 compilation don't affect
>>> bytecodes in any way.
>>> 
>>> If a method is inlined, a deoptimization may lead to a surprising
>>> resumption of execution in the middle of unexpected method, but it's a
>>> one time occurrence and consequent executions will follow original path
>>> in the bytecode.
>>> 
>> Hi, Vladmir, 
>> 
>> I manage to inline the rewired method in c5a25bf.  I attempt to address your question "Will the new target
>> method be taken into account during inlining?".  From my benchmark, it is 10% faster than the last revision. 
>> 
>> Sorry, I still don't understand your concern about the inlined methods.  This is my first non-bugfix patch.  please forgive me if I look dumb.
>> If startsWith(String, int) is deoptimized, the control will give back to the interpreter. IMHO, I can resume execution back to the interpreter as long as jvms are all correct. I don't think I mess them up.  
>> 
>>> If inlining doesn't happen, then the JVM should be able to finish
>>> successully finish profiling in the callee if needed (reprofile request
>>> invalidates the nmethod and it forces the new call in C2 to go through
>>> either interpreter or C1 until it is compiled with C2).
>>> 
>>> So, I'm concerned specifically about inlined case. And I consider it a
>>> blocking issue for adapting the proposed approach of rewriting rules in
>>> C2. I don't see a way to reliably fix it without intrinsifying the
>>> operation.
>>>
>> I do mark substring and startsWith as intrinsics here. 
>> The reason I didn't mark "substring" pure because all nodes will be eliminated after late inlining if they are dead.
>> yes, if I mark it pure, it will drop the call node immediately in do_late_inline().
>>   
>>> Best regards,
>>> Vladimir Ivanov
>> 
>> Hi, John, 
>> In my defense, this patch is just 300 lines of change if you exclude test.  You can see that I try to reuse the existing codebase to minimize foosteps. As a beginner, I just deliberately make it reviewable-size. I don't think it's non-maintainable or a scramble patch. 
>> 
>> Yes, I admit that the gain is not so attractive so far. it just opens a possibility. A possibility of interprocedural optimization of java APIs.  I can also apply the same recipe from "StringBuilder.append(substring)" to "StringBuilder.append(base, begin, end)" etc. 
>> 
>> The bottom line is I have to learn in action.  All your feedbacks here are helping me to grow. I really appreciate that and I think I can return it in other patches.  I plan to explore in Vladimir kolzov's direction next.  We can rewire the underlying bytebuffer of a string in substring pattern. Of course, it's only safe to do so if EA tells us that the object never escape.
>
> hi, Vladimir, 
> 
> Now I understand what you mean.  vframeArray::unpack_to_stack() resumes vframe which points to the new overloaded method. It doesn't match the original bytecode and the entry of cpcache!  It might require appalling hack to make it work. I guess that's why you guys all say it's not the right direction.

> @navyxliu during your investigation did you look on other applications in addition to `SpecJVM2008/xml.validation`?
> 
> I would like to know what patterns you see. For example, we would not benefit much if an extracted substring is used after startsWith() call. Also I don't think we should optimize Java code which can be replaced with existing String API as in RFE's example.
> 
> Note, in RFE's example you added length check which is not needed because startsWith(prefix, offset) does it:
> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/String.java#L1455
> 
> And if you see a lot of cases like `return substring(base, beg, end).startsWith(prefix)` or code where substring is used based on startsWith() check:
> 
> ```
>     String sub=substring(base, beg, end);
>     if (sub.startsWith(prefix)) {
>         // Do something with sub
> ```
> 
> we should consider extending String API with `startsWith(prefix, beg, end)`.
> 
> As always fixing Java code gives a lot more performance improvement than JIT compiler can.

Hi, Vladimir, 
Thank you for taking time to review it.  Besides xml.validation, I searched this pattern in our company-scope codebase.  I did find some cases, but it's not common indeed. Even xml.validation is unlikely to benefit from this PR.  To be honest, substring.startsWith is a special case in the middle of special cases.

My original aim is to eliminate object allocation and array copying of substrings if they are never escaped. I learned some unpopular APIs such as `String.startsWith(prefix, toOffset)`,  `StringBuilder.append(substr, toOffset)` and `String.split(sep, limit)` etc. I just feel I can add an API-level transformation to avoid creation of substrings in those case. It seems a cool trick. Given the fact that we need significant efforts to consolidate correctness, now I admit I am steering in wrong direction.  I closed JDK-8254807. 

I will go back to my original direction. I have extent current EA to support substring just like you did for boxing method. I think I can idealize() the Ideal pattern of String.substring().

-------------

PR: https://git.openjdk.java.net/jdk/pull/974