Please review: surrogate fiddle

Martin Buchholz martinrb at google.com
Wed Mar 20 23:22:32 UTC 2013


On Wed, Mar 20, 2013 at 7:09 AM, Ulf Zibis <Ulf.Zibis at cosoco.de> wrote:

> Hi Martin,
>
> nice to see you again on board.
>
>
Hi Ulf!


> Am 19.03.2013 20:18, schrieb Martin Buchholz:
>
>> Thanks!  Webrev updated.
>>
>
> Character:
> Maybe I'm blind, is there any semantical difference between
>         char c1 = seq.charAt(index++);
>         if (isHighSurrogate(c1)) {
>             if (index < seq.length()) {
> and
>         char c1 = seq.charAt(index);
>         if (isHighSurrogate(c1) && ++index < seq.length()) {
> , or is it just for beautifying the code?
>
>
No behavior difference, only the principle that one should "obviously" do
no more work than necessary, even if hotspot+CPU will very likely make that
extra work disappear.



> AbstractStringBuilder:
> Instead
>  270     public int codePointBefore(int index) {
>  271         int i = index - 1;
>  272         if ((i < 0) || (i >= count)) {
>  273             throw new StringIndexOutOfBoundsExceptio**n(index);
>  274         }
> I suggest
>  270     public int codePointBefore(int index) {
>  271         if ((--index < 0) || (index >= count)) {
>  272             throw new StringIndexOutOfBoundsExceptio**n(index);
>  273         }
> , because if e.g. the initial value of index is 0, then -1 reflects the
> out-of-bound condition, but not the initial 0 to report in the
> StringIndexOutOfBoundsExceptio**n.
> (Hopefully the following redundant i < 0 bounds check from value[i]
> becomes elimited by JIT.)
>
>
I was impressed that hotspot could indeed remove the redundant bounds
checks.
Only with -Xint was I able to demonstrate a performance win.



> If there is some register pressure, you could avoid potential register
> swapping for temp, temp2, hasSurrogate, j and n - j if you would reorder
> following lines to:
> 1390             char temp = value[j];
> 1391             char temp2 = value[j] = value[n - j];
> 1397             value[n - j] = temp;
> 1392             if (!hasSurrogate) {
> 1393                 hasSurrogate = (Character.isSurrogate(temp) ||
> 1394                                 Character.isSurrogate(temp2));
> 1395             }
> (Nomination for "performance expert II" ?)
>
>
Thanks for making me beat my head against the hotspot optimization wall!
The latest version of reverse in my webrev, after way too much fiddling, is
20% faster,
even though I'm not quite sure why.



>
>  I, ummmm... am a "performance expert".
>>
>> How about, if I can ever measure any performance win in a micro-benchmark,
>> we're allowed to keep the lower-level code?
>>
>
> Yes I remember ;-)
> As IMHO better alternative you could proof the JIT result by hsdis.
>
>
> I think, you independently should test
>     static int codePointAtImpl(char[] a, int index, int limit) {...}
> to don't read out-of-bounds trailing surrogate.
> ... and you additionally should provide a test for
>     static int codePointBeforeImpl(char[] a, int index, int start) {...}
> to don't read out-of-bounds precursory surrogate.



If you write tests, I will incorporate into this changeset!



More information about the core-libs-dev mailing list