review request for 6798511/6860431: Include functionality of Surrogate in Character
Martin Buchholz
martinrb at google.com
Tue Mar 16 23:41:08 UTC 2010
On Tue, Mar 16, 2010 at 16:14, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 16.03.2010 22:36, Martin Buchholz wrote:
>
> On Tue, Mar 16, 2010 at 13:58, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>
>
> Additionally, toUpperCaseCharArray(), codePointCountImpl(), String(int[],
> int, int) would benefit from consecutive use of isBMPCodePoint() +
> isSupplementaryCodePoint() or isHighSurrogate() + isLowSurrogate().
>
>
> For codePointCountImpl(), I do not agree.
>
>
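For readers following the codePointCountImpl() point, here is a minimal sketch of counting code points with consecutive isHighSurrogate() + isLowSurrogate() checks. It is an illustration written against the public java.lang.Character API, not the JDK source; the class and method names are hypothetical.

```java
final class CodePointCountSketch {
    // Counts code points in a[offset .. offset+count), treating each valid
    // high+low surrogate pair as one code point and everything else
    // (including unpaired surrogates) as one code point per char.
    static int codePointCount(char[] a, int offset, int count) {
        int end = offset + count;
        int n = count;                       // start by assuming one code point per char
        for (int i = offset; i < end; ) {
            if (Character.isHighSurrogate(a[i++])
                    && i < end
                    && Character.isLowSurrogate(a[i])) {
                n--;                         // a valid pair is one code point, not two
                i++;                         // skip the low surrogate
            }
        }
        return n;
    }
}
```

A pair that is cut by the end of the range is counted as a lone high surrogate, which matches the behaviour of Character.codePointCount(char[], int, int).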
> 1-byte comparisons have a smaller footprint, probably load faster from memory,
> need less L1 CPU cache, and would be faster on small/RISC/etc. CPUs, so they
> should improve overall performance.
> In addition, the shift could be omitted on CPUs that can benefit from
> 6933327.
I am not convinced. Using byte for local variables is unlikely to
give any performance benefit. The only way use of byte can be
a win is if you read/write a bunch of them at once from memory.
I think of byte as a compression scheme for int.
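A small sketch of why a byte local does not by itself give a "1-byte comparison" at the Java language level: binary numeric promotion widens byte operands to int before any arithmetic or comparison. The byte-based code under discussion is not shown in this message, so isSurrogateByHighByte() below is a hypothetical reconstruction for illustration only, as are the class and method names.

```java
final class ByteCompareSketch {
    // Hypothetical byte-based surrogate test: looks only at the high byte of the char.
    static boolean isSurrogateByHighByte(char ch) {
        byte hi = (byte) (ch >> 8);          // 0xD8..0xDF becomes -40..-33 as a signed byte
        // Both operands of >= and <= are promoted to int before comparing.
        return hi >= (byte) 0xD8 && hi <= (byte) 0xDF;
    }

    // Boxing the result exposes the static type of the expression a + b.
    static Object sum(byte a, byte b) {
        return a + b;                        // type int, so this autoboxes to Integer
    }
}
```

The byte variant gives the same answers as Character.isSurrogate(char); whether it is any faster depends on the JIT and the CPU, and nothing in this sketch measures that.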
> For String(int[], int, int), I do agree.
>
> Here is my latest more readable and more performant implementation:
>
> int end = offset + count;
>
> // Pass 1: Compute precise size of char[]
> int n = 0;
> for (int i = offset; i < end; i++) {
>     int c = codePoints[i];
>     if (Character.isBMPCodePoint(c))
>         n += 1;
>     else if (Character.isSupplementaryCodePoint(c))
>         n += 2;
>     else throw new IllegalArgumentException(Integer.toString(c));
> }
>
> // Pass 2: Allocate and fill in char[]
> char[] v = new char[n];
> for (int i = offset, j = 0; i < end; i++) {
>     int c = codePoints[i];
>     if (Character.isBMPCodePoint(c)) {
>         v[j++] = (char) c;
>     } else {
>         Character.toSurrogates(c, v, j);
>         j += 2;
>     }
> }
>
>
> I suggest:
>
> // Pass 2: Allocate and fill in char[]
> char[] v = new char[n];
> for (int i = end; n > 0; ) {
>     int c = codePoints[--i];
>     if (Character.isBMPCodePoint(c))
>         v[--n] = (char)c;
>     else
>         Character.toSurrogates(c, v, n -= 2);
> }
>
> - saves 1 variable (= reduces register pressure)
> - testing the loop condition against 0 is faster than against "end", see
> 6932855
Perhaps, but this exceeds my micro-optimization threshold.
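To compare the two Pass 2 variants outside java.lang, here is a self-contained sketch. It makes two substitutions that are assumptions of this sketch, not part of the proposal: Character.toSurrogates() is package-private, so the public Character.toChars(int, char[], int) stands in for it, and the thread's isBMPCodePoint() is spelled Character.isBmpCodePoint(), the name under which it is public since Java 7. The class and method names are hypothetical.

```java
final class CodePointFillSketch {
    // Pass 1: compute the exact char count, rejecting invalid code points.
    static int charCount(int[] codePoints, int offset, int count) {
        int end = offset + count;
        int n = 0;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                n += 1;
            else if (Character.isSupplementaryCodePoint(c))
                n += 2;
            else throw new IllegalArgumentException(Integer.toString(c));
        }
        return n;
    }

    // Pass 2, forward variant: fills v from index 0 upward using a second index j.
    static char[] fillForward(int[] codePoints, int offset, int count) {
        int end = offset + count;
        char[] v = new char[charCount(codePoints, offset, count)];
        for (int i = offset, j = 0; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c)) {
                v[j++] = (char) c;
            } else {
                Character.toChars(c, v, j);  // writes high and low surrogate at j, j+1
                j += 2;
            }
        }
        return v;
    }

    // Pass 2, backward variant: reuses n as the write index and stops at 0.
    static char[] fillBackward(int[] codePoints, int offset, int count) {
        int n = charCount(codePoints, offset, count);
        char[] v = new char[n];
        for (int i = offset + count; n > 0; ) {
            int c = codePoints[--i];
            if (Character.isBmpCodePoint(c))
                v[--n] = (char) c;
            else
                Character.toChars(c, v, n -= 2);
        }
        return v;
    }
}
```

Both variants should produce the same char[] as new String(codePoints, offset, count).toCharArray(). The backward loop is only correct because Pass 1 has already validated every element, so n reaches 0 exactly when i reaches offset.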
> BTW, the variable
> int end = offset + count;
> could be omitted, as the VM would do that itself, certainly the HotSpot C2
> compiler.
>
> -Ulf
>
>
Martin