RFR: 8071571: Move substring of same string to slow path
Ivan Gerasimov
ivan.gerasimov at oracle.com
Wed May 13 21:25:19 UTC 2015
Hi Martin!
On 13.05.2015 0:22, Martin Buchholz wrote:
> All your changes look good. But:
>
> ---
> Not your bug, but it looks like the below should instead be:
> throw new StringIndexOutOfBoundsException(beginIndex);
> Perhaps fix in a follow-on change.
>
> 1935 if (subLen < 0) {
> 1936 throw new StringIndexOutOfBoundsException(subLen);
>
I guess, it was done so in order to make the error messages consistent
for str.substring(tooBigIndex) and str.substring(tooBigIndex, str.length).
There are also other places, where StringIndexOutOfBoundsException() is
passed a (wrong) distance, not an index.
It might be a good mini-project to make all the IndexOutOfBounds error
messages uniform and fully informative, I'll file a RFE for this.
> ---
> We seem to have clean progress, but for substring fast-path we can do
> a little better, I think, thus:
>
> public String substring(int beginIndex, int endIndex) {
> int subLen;
> return ((beginIndex | (value.length - endIndex)) > 0
> && (subLen = endIndex - beginIndex) > 0)
> ? new String(value, beginIndex, subLen)
> : substringSlowPath(beginIndex, endIndex);
> }
>
> private String substringSlowPath(int beginIndex, int endIndex) {
> if (beginIndex <= 0) {
> if (beginIndex < 0) {
> throw new StringIndexOutOfBoundsException(beginIndex);
> }
> if (endIndex == value.length) {
> return this;
> }
> }
> if (endIndex > value.length) {
> throw new StringIndexOutOfBoundsException(endIndex);
> }
> if (endIndex == beginIndex) {
> return "";
> }
> throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
> }
>
> additional white-box tests would be required if we adopt that.
>
I think this is great, and it's lot of fun to rewrite the checks this
way, but I have concerns.
First, I think this style would reduce readability of the code.
Second, it's not obvious to me that such combined check will always be
faster.
I did some experimenting with jmh, and the results are unclear.
For example, please take a look at the following three tests:
@State(Scope.Benchmark)
public class MyBenchmark {
static char[] buff = new char[12];
@Benchmark public void testMethod_1() { substring_1(1, 10, buff); }
void substring_1(int from, int to, char[] buff) {
if ((from < 0) || (to > buff.length) || (from > to))
throw new Error();
}
@Benchmark public void testMethod_2() { substring_2(1, 10, buff); }
void substring_2(int from, int to, char[] buff) {
if ((from | (buff.length - to) | (to - from)) < 0)
throw new Error();
}
@Benchmark public void testMethod_3() { substring_3(1, 10, buff); }
void substring_3(int from, int to, char[] buff) {
int len;
if ((from | (buff.length - to)) < 0 || (len = (to - from)) < 0)
throw new Error();
}
}
Benchmark Mode Cnt Score Error Units
MyBenchmark.testMethod_1 thrpt 60 1132911599.680 ± 42375177.640 ops/s
MyBenchmark.testMethod_2 thrpt 60 813737659.576 ± 14226427.823 ops/s
MyBenchmark.testMethod_3 thrpt 60 810406621.145 ± 12316864.045 ops/s
The plain old ||-combined check was faster in this round.
Some other tests showed different results.
The speed seems to depend on the scope of the checked variables and
complexity of the expressions to calculate.
However, I still don't have a clear understanding of all the aspects we
need to pay attention to when doing such optimizations.
My third concern is this: Wouldn't it be possible to implement this
type of optimization at jvm level?
At least some conditions can automatically be combined into one.
Given the information about which execution path is expected to be fast,
hotspot should be able to quickly perform that one combined condition
check and move to the fast path in most situations.
Sincerely yours,
Ivan
More information about the core-libs-dev
mailing list