RFR (S) CR 8006627/8007398: Improve performance of Long.toUnsignedString0, Integer.toUnsignedString0, UUID(String) and UUID.toString
Mike Duigou
mike.duigou at oracle.com
Tue Apr 30 18:57:36 UTC 2013
Finally some follow up on this!
On Feb 13 2013, at 23:43 , Martin Buchholz wrote:
> I like the optimizations in this change, especially the avoidance of copies. Was there some good reason the jdk never before used JavaLangAccess to avoid String creation copies?
Not that I am aware of. I did work on String and thereafter the idea occurred to me. Copying...we hates it.
> I am tempted to micro-optimize this some more, e.g. specialize the hex printing code. I might get rid of the digits table and compute ASCII characters simply:
>
> ((i < 10) ? '0' : ('a' - 10)) + i
I wanted to play with JMH so used this as an excuse.
It turns out that using a calculation is a performance loser:
Benchmark Thr Cnt Sec Mean Mean error Var Units
o.o.j.s.g.t.DigitsTableVsCalc.digitsCalc 1 8 1 8463376.310 2516.169 4136965.190 ops/sec
o.o.j.s.g.t.DigitsTableVsCalc.digitsTable 1 8 1 11838194.938 4936.538 15923812.479 ops/sec
The source of the benchmarks:
@OutputTimeUnit(TimeUnit.SECONDS)
@GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit)
public void digitsTable() throws InterruptedException {
int end = testData.length;
for (int each = 0; each < end; each++) {
int i = testData[each];
output[each] = digits[i];
}
}
@OutputTimeUnit(TimeUnit.SECONDS)
@GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit)
public void digitsCalc() throws InterruptedException {
int end = testData.length;
for (int each = 0; each < end; each++) {
int i = testData[each];
int base = i < 10 ? 48 : 87;
output[each] = (char) (base + i);
}
}
Where testData is a 100 element static int array of random digit values (0-35).
> Why not formatUnsignedInt?
Now added and it seems to help.
> Why not make the new format methods public?
They seemed too specialized to me, particularly with the constraints on radix.
>
> Is there a better name than createStringSharedChars? We hope those chars will *not* be shared! How about newStringUnsafe(char[] chars)
>
> The spec for
> + int formatUnsignedLong(long val, int shift, char[] buf, int offset, int len);
> should make it clearer whose responsibility getting the size right is. It looks like the caller has to ensure there will be enough space, so perhaps the caller should just pass one offset instead of offset plus len.
Seems reasonable.
> + * @return the lowest character location used
> stray space
Corrected.
>
>
> On Wed, Feb 13, 2013 at 2:45 PM, Mike Duigou <mike.duigou at oracle.com> wrote:
> I have updated the patch with some of Ulf's feedback and corrected one cut-and-paste error that I made.
>
> The updated webrev is at:
>
> http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/
>
> Mike
>
> On Feb 12 2013, at 18:25 , Ulf Zibis wrote:
>
> > Am 13.02.2013 02:34, schrieb Mike Duigou:
> >> Thank you for the comments Ulf.
> >>
> >> On Feb 12 2013, at 17:24 , Ulf Zibis wrote:
> >>
> >>> Am 13.02.2013 00:30, schrieb Mike Duigou:
> >>>> Hi Steven;
> >>>>
> >>>> I have updated the patch for Java 8. There's somewhat less code sharing and a bit of refactoring than your last version but the performance should be about the same or a little better.
> >>>>
> >>>> http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
> >>> Couldn't you use String(buf, true) for all to(Unsigned)String(...) methods ?
> >> Possibly. I didn't go looking too far with looking for additional improvements.
> >>
> >>> Instead of calculating the mask each time, you could use:
> >>> 309 private static String toUnsignedString(int i, int shift, int mask) {
> >> I think that would actually be inefficient. I haven't looked at the JITed code but the mask calculation is pretty cheap relative to parameter passing.
> >
> > I believe, JIT will inline the code, so there would be no parameter passing.
> >
> > Additionally the calculation of char count could be faster, if you would have short cuts like:
> > numberOfLeading2Zeros(i)
> > numberOfLeading4Zeros(i)
> > numberOfLeading8Zeros(i)
> > ...
> > So the optimum would be with separate toString methods:
> > String toBase2String(int i);
> > String toBase4String(int i);
> > String toBase8String(int i);
> > ...
> >
> > In any case I would save 2 lines:
> >
> > 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
> > 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
> > 313 char[] buf = new char[charCount];
> > 316 int mask = (1 << shift) - 1;
> >
> > -Ulf
> >
>
>
More information about the core-libs-dev
mailing list