RFR: 8261731: shallow copy the internal buffer of a scalar-replaced java.lang.String object
Xin Liu
xliu at openjdk.java.net
Fri Mar 19 06:27:39 UTC 2021
On Tue, 16 Mar 2021 09:11:58 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:
> Sure, I'll re-run testing.
>
> After looking through the changes, I'm concerned about all the complexity that is added to handle this special case. Do you have any performance numbers other than microbenchmark results that would justify this?
hi, @TobiHartmann ,
It's not my intention to make a big patch. I would like to get help from reviewers. Am I making thing over complex? Is there an alternative to make thing simpler?
The reason that the patch is big because I meet some constraints in current optimizing compiler.
1. It seems all oops must be allocated on heap. There's no pseudo oop. or I can’t find a way to create a temp of fake oop.
2. Scalar_replacement and ArrayCopy idealization don't support variable-length array(VLA). String::value is a VLA.
3. Lack supports of frozen array in C2's typing system. JDK-8261007
4. re-allocation doesn't support nested objects. j.l.String has 4 fields. String::value is an array of byte, which I need to reallocate too.
To overcome the constraints, I make the following new features in this patch.
1. `process_users_of_string_allocation` intercepts all uses of AllocateArray.
2. reorder the trinity Allocate-AllocateArray-ArrayCopy(or alloc-aa-ac) and eliminate them in MacroExpansion phase.
3. `ArrayCopyNode::is_string_copy` conservatively recognizes string initialization from another string.
4. invent a scheme to make deoptimization support a nested object, so deopt can reallocate an entire j.l.String object.
TBH, I haven't seen obvious performance improvement in SpecJBB. Only some microbenches can prove my idea. When I profile a string intensive application, I found that the major blocker is JDK-8261531. Current implementation requires the String object must be scalar-replaceable first. A String object is NOT scalar replaceable until its length is a compiler-time constant. It's pretty rare in the real world. Therefore, I would like to put the evaluation of performance impact on hold before I resolve JDK-8261531.
Could you help me verify that the current code can handle all code shape first? Further, I really appreciate that you can give me some advice in terms of high-level design.
thanks,
--lx
-------------
PR: https://git.openjdk.java.net/jdk/pull/2570
More information about the hotspot-compiler-dev
mailing list