Reduce allocation in sun.net.www.protocol.jar.Handler.parseURL
Hello, sun.net.www.protocol.jar.Handler.parseURL unconditionally calls String.substring twice on the spec string, even when ParseUtil.canonizeString later determines that no canonization was required. By letting canonizeString do the substring calls, but only when it determines that it is needed, we can remove some unnecessary String and byte[] allocations. Patch: https://github.com/eirbjo/jdk/commit/87da9032d7fb622f5d466f43faded4de672ebdd... JMH micro: @Benchmark public void initURL(Blackhole blackhole) throws MalformedURLException { blackhole.consume(new URL(new URL("jar:file:/spring-aop.jar!/"), "org/springframework/aop/framework/AbstractSingletonProxyFactoryBean.class")); } It shows a performance win for the patch in terms of throughput and bytes / operation: Baseline: Benchmark Mode Cnt Score Error Units ZipBenchmark.initURL thrpt 25 1673457.129 ± 22857.945 ops/s ZipBenchmark.initURL:·gc.alloc.rate thrpt 25 1410.227 ± 19.283 MB/sec ZipBenchmark.initURL:·gc.alloc.rate.norm thrpt 25 928.080 ± 0.005 B/op ZipBenchmark.initURL:·gc.churn.G1_Eden_Space thrpt 25 1412.557 ± 22.575 MB/sec ZipBenchmark.initURL:·gc.churn.G1_Eden_Space.norm thrpt 25 929.589 ± 5.756 B/op ZipBenchmark.initURL:·gc.churn.G1_Survivor_Space thrpt 25 0.006 ± 0.002 MB/sec ZipBenchmark.initURL:·gc.churn.G1_Survivor_Space.norm thrpt 25 0.004 ± 0.001 B/op ZipBenchmark.initURL:·gc.count thrpt 25 1441.000 counts ZipBenchmark.initURL:·gc.time thrpt 25 961.000 ms After: Benchmark Mode Cnt Score Error Units ZipBenchmark.initURL thrpt 25 1831971.983 ± 35705.091 ops/s ZipBenchmark.initURL:·gc.alloc.rate thrpt 25 1011.389 ± 19.689 MB/sec ZipBenchmark.initURL:·gc.alloc.rate.norm thrpt 25 608.061 ± 0.001 B/op ZipBenchmark.initURL:·gc.churn.G1_Eden_Space thrpt 25 1011.746 ± 20.583 MB/sec ZipBenchmark.initURL:·gc.churn.G1_Eden_Space.norm thrpt 25 608.275 ± 3.609 B/op ZipBenchmark.initURL:·gc.churn.G1_Survivor_Space thrpt 25 0.007 ± 0.001 MB/sec ZipBenchmark.initURL:·gc.churn.G1_Survivor_Space.norm thrpt 25 0.004 ± 0.001 B/op ZipBenchmark.initURL:·gc.count thrpt 25 1197.000 counts ZipBenchmark.initURL:·gc.time thrpt 25 760.000 ms Thanks, Eirik.
By letting canonizeString do the substring calls, but only when it determines that it is needed, we can remove some unnecessary String and byte[] allocations.
While English is not my mother tongue, I'm thinking we could maybe let the Vatican deal with canonizations and rename this to the more appropriate "canonicalizeString"? Eirik.
Hi, looks reasonable. I'll file an RFE. - canonize -> canonicalize seem like an appropriate change of terms :-) - the bangSlash parameter could perhaps be better named from, since bangSlash is not a concept known to ParseUtil. Or move the util methods to the Handler, since it's the only caller anyway - calling file.length() twice: maybe let len = file.length() and replace len == 0 with from >= len? Thanks! /Claes On 2021-01-20 02:52, Eirik Bjørsnøs wrote:
Hello,
sun.net.www.protocol.jar.Handler.parseURL unconditionally calls String.substring twice on the spec string, even when ParseUtil.canonizeString later determines that no canonization was required.
By letting canonizeString do the substring calls, but only when it determines that it is needed, we can remove some unnecessary String and byte[] allocations.
Patch:
https://github.com/eirbjo/jdk/commit/87da9032d7fb622f5d466f43faded4de672ebdd...
JMH micro:
@Benchmark public void initURL(Blackhole blackhole) throws MalformedURLException { blackhole.consume(new URL(new URL("jar:file:/spring-aop.jar!/"), "org/springframework/aop/framework/AbstractSingletonProxyFactoryBean.class")); }
It shows a performance win for the patch in terms of throughput and bytes / operation:
Baseline:
Benchmark Mode Cnt Score Error Units ZipBenchmark.initURL thrpt 25 1673457.129 ± 22857.945 ops/s ZipBenchmark.initURL:·gc.alloc.rate thrpt 25 1410.227 ± 19.283 MB/sec ZipBenchmark.initURL:·gc.alloc.rate.norm thrpt 25 928.080 ± 0.005 B/op ZipBenchmark.initURL:·gc.churn.G1_Eden_Space thrpt 25 1412.557 ± 22.575 MB/sec ZipBenchmark.initURL:·gc.churn.G1_Eden_Space.norm thrpt 25 929.589 ± 5.756 B/op ZipBenchmark.initURL:·gc.churn.G1_Survivor_Space thrpt 25 0.006 ± 0.002 MB/sec ZipBenchmark.initURL:·gc.churn.G1_Survivor_Space.norm thrpt 25 0.004 ± 0.001 B/op ZipBenchmark.initURL:·gc.count thrpt 25 1441.000 counts ZipBenchmark.initURL:·gc.time thrpt 25 961.000 ms
After:
Benchmark Mode Cnt Score Error Units ZipBenchmark.initURL thrpt 25 1831971.983 ± 35705.091 ops/s ZipBenchmark.initURL:·gc.alloc.rate thrpt 25 1011.389 ± 19.689 MB/sec ZipBenchmark.initURL:·gc.alloc.rate.norm thrpt 25 608.061 ± 0.001 B/op ZipBenchmark.initURL:·gc.churn.G1_Eden_Space thrpt 25 1011.746 ± 20.583 MB/sec ZipBenchmark.initURL:·gc.churn.G1_Eden_Space.norm thrpt 25 608.275 ± 3.609 B/op ZipBenchmark.initURL:·gc.churn.G1_Survivor_Space thrpt 25 0.007 ± 0.001 MB/sec ZipBenchmark.initURL:·gc.churn.G1_Survivor_Space.norm thrpt 25 0.004 ± 0.001 B/op ZipBenchmark.initURL:·gc.count thrpt 25 1197.000 counts ZipBenchmark.initURL:·gc.time thrpt 25 760.000 ms
Thanks, Eirik.
On 20/01/2021 11:57, Claes Redestad wrote:
Hi,
looks reasonable. I'll file an RFE.
- canonize -> canonicalize seem like an appropriate change of terms :-)
- the bangSlash parameter could perhaps be better named from, since bangSlash is not a concept known to ParseUtil. Or move the util methods to the Handler, since it's the only caller anyway
- calling file.length() twice: maybe let len = file.length() and replace len == 0 with from >= len?
I suspect some refactoring is needed as some of this code in ParseUtil is not general purpose and should probably move to the jar URL handler. A discussion for net-dev when there is a patch to discuss. -Alan
On 2021-01-20 13:02, Alan Bateman wrote:
On 20/01/2021 11:57, Claes Redestad wrote:
Hi,
looks reasonable. I'll file an RFE.
- canonize -> canonicalize seem like an appropriate change of terms :-)
- the bangSlash parameter could perhaps be better named from, since bangSlash is not a concept known to ParseUtil. Or move the util methods to the Handler, since it's the only caller anyway
- calling file.length() twice: maybe let len = file.length() and replace len == 0 with from >= len?
I suspect some refactoring is needed as some of this code in ParseUtil is not general purpose and should probably move to the jar URL handler. A discussion for net-dev when there is a patch to discuss.
Filed https://bugs.openjdk.java.net/browse/JDK-8260043 for this. (Eirik: once you submit a PR the review should happen automatically on the right mailing list.) /Claes
I suspect some refactoring is needed as some of this code in ParseUtil is not general purpose and should probably move to the jar URL handler. A discussion for net-dev when there is a patch to discuss.
https://github.com/openjdk/jdk/pull/2167 This moves the utility methods into Handler (does that cover "some refactoring is needed", Alan?) and also includes cleanups suggested by Claes. Eirik.
participants (3)
-
Alan Bateman
-
Claes Redestad
-
Eirik Bjørsnøs