RFR: 8261731: shallow copy the internal buffer of a scalar-replaced java.lang.String object

Volker Simonis simonis at openjdk.java.net
Wed Feb 17 17:49:40 UTC 2021


On Wed, 17 Feb 2021 09:00:25 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

>>> Hi,
>>> 
>>> this is a smart optimization.
>>> 
>>> > There are 3 nodes involving in the construction of a java.lang.String object.
>>> > ```
>>> > 1. Allocate of itself, aka. alloc
>>> > 
>>> > 2. AllocateArray of a byte array, which is value:byte[], aka. aa
>>> > 
>>> > 3. ArrayCopyNode which copys in the contents of value, aka. ac
>>> > ```
>>> > 
>>> > 
>>> > Lemma
>>> > When a String object `alloc` is scalar replaced, C2 can eliminate `aa` and `ac`.
>>> > Because `alloc` is scalar replaced, it must be non-escaped. The field value:byte[] of j.l.String cannot be seen by external world, therefore it must not be global escaped. Because the buffer is marked as stable, it is safe to assume its contents are whatever ac copies in. Because all public java.lang.String constructors clone the incoming array, the source of `ac` is stable as well.
>>> 
>>> Are you saying that the source of `ac` cannot be accessed by another thread because of the cloning in the constructor? But the resulting string instance which is used to construct the non-escape instance can be GlobalEscape and then the source of `ac` is accessible to other threads, isn't it?
>> 
>> Hi, @reinrich 
>> You might not know, but I learn how to reallocate an object in deoptimization from your previous patches. Thank you!
>> 
>> You are right. The source of ac (`src`) might be escaped.  I didn't say other threads can't access it. I said we need to guarantee the `src` is stable,  or  it's a "frozen" array in JDK-8261007. 
>> 
>> To be honest, I didn't check the src is frozen. In practice, I only see ArrayCopy in construction [here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/StringLatin1.java#L769).This method is creating a new substring from an established array, so its value is `stable`. 
>> 
>> After I read your comment, I went through String.java. I do find one open-ended constructor. yes, it's a problem!
>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/String.java#L395
>> 
>> If the frozen attribute is not present, what I can come up with. an array is "stable". What's do you think?
>> 1) has annotation "stable" or
>> 2) it's non-escaped and
>> 3) can't find any store nodes along its mem stream.
>
> Hi @navyxliu,
> 
>> You might not know, but I learn how to reallocate an object in deoptimization from your previous patches. Thank you!
> 
> Oh, great! :)
> 
>> You are right. The source of ac (`src`) might be escaped. I didn't say other threads can't access it. I said we need to guarantee the `src` is stable, or it's a "frozen" array in JDK-8261007.
>> 
>> To be honest, I didn't check the src is frozen. In practice, I only see ArrayCopy in construction [here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/StringLatin1.java#L769).This method is creating a new substring from an established array, so its value is `stable`.
> 
> Sorry, I wasn't aware of the @Stable annotation on the value array.
> 
> https://github.com/openjdk/jdk/blob/d19503353e5c347ce393544a3a30d5caec53d133/src/java.base/share/classes/java/lang/String.java#L154
> 
> So my concern was that another thread could for example use reflection to modify
> the value array but I reckon this is illegal then (I wonder if it is checked in
> reflection...).
> 
> Also there is already UseStringDeduplication that relies on the value array being stable.
> 
>> After I read your comment, I went through String.java. I do find one open-ended constructor. yes, it's a problem!
>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/String.java#L395
> 
> I see this one is deprecated. Why do you think there's a problem?
> 
>> If the frozen attribute is not present, what I can come up with. an array is "stable". What's do you think?
>> 
>>     1. has annotation "stable" or
>> 
>>     2. it's non-escaped and
>> 
>>     3. can't find any store nodes along its mem stream.
> 
> Currently I'd think this is not necessary. `@Stable` is strong enough.
> 
> The optimization reminds me a bit of earlier Java versions where String
> instances had an offset field and the value array could be shared among String
> instances.
> 
> I guess the new String needs to be scalarized mostly because you cannot take
> care of the offset otherwise.
> 
> Coincidentally I'm currently looking at
> [LoadNode::can_see_arraycopy_value()](https://github.com/openjdk/jdk/blob/b955f85e03bafe8ce39677d0af06bf1ceb7e2cbb/src/hotspot/share/opto/memnode.cpp#L951)
> which does something similar you want to do. There I don't see any concerns
> about the src being stable. In fact I don't think this is correct. Just a side
> mark...
> 
> These are my (somewhat random) thoughts so far. I'd think it is a legal and useful optimization (though haven't yet looked at the code yet).
> 
> Thanks, Richard.

What about GC? What happens if the original String isn't reachable any more? Do you put a reference to the byte array into the corresponding oop maps to make sure it can't be garbage collected?

-------------

PR: https://git.openjdk.java.net/jdk/pull/2570


More information about the hotspot-compiler-dev mailing list