ClassValue perf?

Thu May 26 08:59:37 UTC 2016

Hi Michael,

On 05/23/2016 03:56 PM, Michael Haupt wrote:
> I've ran the unpatched version and Peter's two patches once more. The 
> results are attached (results.txt). They confirm Aleksey's observation.
>
> Regarding the 03 patch (plevart3 column in the results), perfasm 
> output (see http://cr.openjdk.java.net/~mhaupt/8031043/perfasm.zip 
> <http://cr.openjdk.java.net/%7Emhaupt/8031043/perfasm.zip>) suggests 
> the cost is mainly accrued in ConcurrentHashMap. The same is the case 
> for the 02 patch (plevart2 column).
>
> As things stand, I think we can even focus on Peter's 02 patch, as 
> this is the faster of his two proposals (plevart2 column in the 
> results), reduces the footprint, and reduces the implementation 
> complexity. Can anything be done to improve on its performance? (It 
> has slight performance slowdowns for the single-value case as well.)

I can't think of anything else short of improving performance of CHM itself.

Or replacing CHM with a "better" implementation:

http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04/

This webrev is similar to webrev.02. It's only difference is in 
ClassValueMap which extends LinearProbeHashtable instead of 
ConcurrentHashMap. LinearProbeHashtable is a simple implementation of a 
linear-probe hash table. It's not a full Map implementation. It only 
implements methods needed in ClassValue. With this implementation I get 
a slight boost compared to JDK 9 ClassValue implementation for all sizes 
and counts:

Benchmark                         (classCount)  (classValueCount) 
(impl)  Mode  Cnt   Score   Error  Units
ClassValueBench.randomAccess               128                  1 jdk9  
avgt   10   9.079 ± 0.092  ns/op
ClassValueBench.randomAccess               128                  4 jdk9  
avgt   10  10.615 ± 0.102  ns/op
ClassValueBench.randomAccess               128                 16 jdk9  
avgt   10  11.665 ± 0.012  ns/op
ClassValueBench.randomAccess               128                256 jdk9  
avgt   10  19.151 ± 0.219  ns/op
ClassValueBench.randomAccess              1024                  1 jdk9  
avgt   10  14.642 ± 0.425  ns/op
ClassValueBench.randomAccess              1024                  4 jdk9  
avgt   10  22.577 ± 0.093  ns/op
ClassValueBench.randomAccess              1024                 16 jdk9  
avgt   10  19.864 ± 0.736  ns/op
ClassValueBench.randomAccess              1024                256 jdk9  
avgt   10  60.470 ± 0.285  ns/op
ClassValueBench.sequentialAccess           128                  1 jdk9  
avgt   10   9.741 ± 0.033  ns/op
ClassValueBench.sequentialAccess           128                  4 jdk9  
avgt   10   8.252 ± 0.029  ns/op
ClassValueBench.sequentialAccess           128                 16 jdk9  
avgt   10   7.888 ± 1.249  ns/op
ClassValueBench.sequentialAccess           128                256 jdk9  
avgt   10  16.493 ± 0.415  ns/op
ClassValueBench.sequentialAccess          1024                  1 jdk9  
avgt   10  13.376 ± 0.452  ns/op
ClassValueBench.sequentialAccess          1024                  4 jdk9  
avgt   10  10.023 ± 0.020  ns/op
ClassValueBench.sequentialAccess          1024                 16 jdk9  
avgt   10   8.029 ± 0.178  ns/op
ClassValueBench.sequentialAccess          1024                256 jdk9  
avgt   10  33.472 ± 0.058  ns/op

Benchmark                         (classCount)  (classValueCount) 
(impl)  Mode  Cnt   Score   Error  Units
ClassValueBench.randomAccess               128                  1 pl04  
avgt   10   8.955 ± 0.055  ns/op
ClassValueBench.randomAccess               128                  4 pl04  
avgt   10   9.999 ± 0.017  ns/op
ClassValueBench.randomAccess               128                 16 pl04  
avgt   10  11.615 ± 1.928  ns/op
ClassValueBench.randomAccess               128                256 pl04  
avgt   10  17.063 ± 0.460  ns/op
ClassValueBench.randomAccess              1024                  1 pl04  
avgt   10  12.553 ± 0.086  ns/op
ClassValueBench.randomAccess              1024                  4 pl04  
avgt   10  16.766 ± 0.221  ns/op
ClassValueBench.randomAccess              1024                 16 pl04  
avgt   10  18.496 ± 0.051  ns/op
ClassValueBench.randomAccess              1024                256 pl04  
avgt   10  41.390 ± 0.321  ns/op
ClassValueBench.sequentialAccess           128                  1 pl04  
avgt   10   7.854 ± 0.381  ns/op
ClassValueBench.sequentialAccess           128                  4 pl04  
avgt   10   7.498 ± 0.055  ns/op
ClassValueBench.sequentialAccess           128                 16 pl04  
avgt   10   9.218 ± 1.000  ns/op
ClassValueBench.sequentialAccess           128                256 pl04  
avgt   10  13.593 ± 0.275  ns/op
ClassValueBench.sequentialAccess          1024                  1 pl04  
avgt   10   8.774 ± 0.037  ns/op
ClassValueBench.sequentialAccess          1024                  4 pl04  
avgt   10   8.562 ± 0.014  ns/op
ClassValueBench.sequentialAccess          1024                 16 pl04  
avgt   10   7.596 ± 0.027  ns/op
ClassValueBench.sequentialAccess          1024                256 pl04  
avgt   10  24.605 ± 0.143  ns/op

The footprint is even better than with CHM as LPHT does not maintain 
internal Map.Entry(s).

How does this implementation compare on your hardware, Michael?

Regards, Peter

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/mlvm-dev/attachments/20160526/5bc83428/attachment-0001.html>