Please review: surrogate fiddle

Ulf Zibis Ulf.Zibis at CoSoCo.de
Wed Mar 20 14:09:18 UTC 2013


Hi Martin,

nice to see you again on board.

Am 19.03.2013 20:18, schrieb Martin Buchholz:
> Thanks!  Webrev updated.

Character:
Maybe I'm blind, is there any semantical difference between
         char c1 = seq.charAt(index++);
         if (isHighSurrogate(c1)) {
             if (index < seq.length()) {
and
         char c1 = seq.charAt(index);
         if (isHighSurrogate(c1) && ++index < seq.length()) {
, or is it just for beautifying the code?

AbstractStringBuilder:
Instead
  270     public int codePointBefore(int index) {
  271         int i = index - 1;
  272         if ((i < 0) || (i >= count)) {
  273             throw new StringIndexOutOfBoundsException(index);
  274         }
I suggest
  270     public int codePointBefore(int index) {
  271         if ((--index < 0) || (index >= count)) {
  272             throw new StringIndexOutOfBoundsException(index);
  273         }
, because if e.g. the initial value of index is 0, then -1 reflects the out-of-bound condition, but 
not the initial 0 to report in the StringIndexOutOfBoundsException.
(Hopefully the following redundant i < 0 bounds check from value[i] becomes elimited by JIT.)

If there is some register pressure, you could avoid potential register swapping for temp, temp2, 
hasSurrogate, j and n - j if you would reorder following lines to:
1390             char temp = value[j];
1391             char temp2 = value[j] = value[n - j];
1397             value[n - j] = temp;
1392             if (!hasSurrogate) {
1393                 hasSurrogate = (Character.isSurrogate(temp) ||
1394                                 Character.isSurrogate(temp2));
1395             }
(Nomination for "performance expert II" ?)

> I, ummmm... am a "performance expert".
>
> How about, if I can ever measure any performance win in a micro-benchmark,
> we're allowed to keep the lower-level code?

Yes I remember ;-)
As IMHO better alternative you could proof the JIT result by hsdis.


I think, you independently should test
     static int codePointAtImpl(char[] a, int index, int limit) {...}
to don't read out-of-bounds trailing surrogate.
... and you additionally should provide a test for
     static int codePointBeforeImpl(char[] a, int index, int start) {...}
to don't read out-of-bounds precursory surrogate.


-Ulf




More information about the core-libs-dev mailing list