[9] RFR(S): 8155608: String intrinsic range checks are not strict enough

Fri Apr 29 20:00:31 UTC 2016

> On Apr 28, 2016, at 9:23 PM, Tobias Hartmann <tobias.hartmann at oracle.com> wrote:
> 
> Hi Chris,
> 
> On 28.04.2016 22:11, Christian Thalinger wrote:
>> 
>>> On Apr 28, 2016, at 12:45 AM, Tobias Hartmann <tobias.hartmann at oracle.com <mailto:tobias.hartmann at oracle.com>> wrote:
>>> 
>>> Hi
>>> 
>>> please review the following patch:
>>> 
>>> https://bugs.openjdk.java.net/browse/JDK-8155608
>>> http://cr.openjdk.java.net/~thartmann/8155608/jdk/webrev.00/
>> + checkBoundsOffCount(dstOff << 1, len << 1, dst.length);
>> 
>> It’s funny that we still do << 1 instead of * 2 when every compiler on this planet can optimize that.   Yeah, yeah, I know, it’s because of the interpreter but does it really matter?
> 
> I used it more for consistency because we use "<< 1" in all the other places in StringLatin1, StringUTF16 and String as well. I think this originated from the "value.length >> String.coder()" use case to get the length depending on the String encoding. Besides that, I'm not sure if interpreter speed really matters here but the String methods are executed a lot (especially during startup).
> 
>> Actually, I would prefer:
>> 
>> + checkBoundsOffCount(dstOff * Character.BYTES, len * Character.BYTES, dst.length);
> 
> I agree that this is more readable but for consistency I would like to go with the "<< 1" approach.

Again, a loss for maintainability versus consistency (a.k.a. “we’ve always done it this way”).

> 
> Thanks,
> Tobias
> 
>> 
>>> http://cr.openjdk.java.net/~thartmann/8155608/hotspot/webrev.00/
>>> 
>>> Some String API methods use StringUTF16.putChar/getChar to read a char value from a byte array. For performance reasons, putChar/getChar is intrinsified by C1/C2 without range checks (like the Unsafe counterparts). The Java callers are responsible for adding the corresponding explicit range checks if necessary.
>>> 
>>> I noticed that the Java level range checks in StringUTF16::compress(), StringUTF16::getChars() and StringLatin1::inflate() are not strong enough. Offset and length need to be multiplied by two because they index a char value in a byte array. I added a regression test that triggers the problem and also checks the other relevant intrinsics by invoking the methods with different arguments.
>>> 
>>> Tested with regression test (-Xint/-Xcomp) and RBT (running).
>>> 
>>> Thanks,
>>> Tobias