JarFile.getVersionedEntry scalability with new release cadence
Eirik Bjørsnøs
eirbjo at gmail.com
Sat Apr 11 19:10:14 UTC 2020
Lance,
I made a small performance test. Pretty sloppy, so please don't tell
Aleksey S :-)
Results indicate there may be some performance wins to be had.
The test uses the Maven artifact com.h2database:h2:1.4.200:jar. This jar
which has 950 entries, of which the following three are versioned:
META-INF/versions/10/org/h2/util/NetUtils2.class
META-INF/versions/9/org/h2/util/Bits.class
META-INF/versions/9/org/h2/util/CurrentTimestamp.class
The performance test calls JarFile.getEntry for each of the base names
found in the jar. It does so 2000 times for 50 iterations and calculates
the average run time.
This is done once on a JarFile opened with runtime version 15, once on a
JarFile opened with runtime version 8 (which effectively disables versioned
lookup so works as a baseline). Warmup runs are run first to get stable
results.
The test is run with OpenJDK 15 built from master.
Results:
Average time to get 950 entries 2000 times:
Runtime version 15: 2903 ms
Runtime version 8: 336 ms:
This is shows the difference between testing seven versions (9, 10, 11, 12,
13, 14, 15) and not testing versions.
I then made a change to JarFile which scans the versions up front and
stores them in an int[] which is then looped over in getVersionedEntry.
Results:
Runtime version 15: 1048 ms
Runtime version 8: 315 ms:
My benchmark is of course synthetic and does not represent reality. I have
not done any analysis on the shape of typical multi-versioned jars nor
their access patterns.
However, an improvement of 2.5 - 3x is maybe worth taking a closer look?
Here's the patch for my change in JarFile.java:
Index: src/java.base/share/classes/java/util/jar/JarFile.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- src/java.base/share/classes/java/util/jar/JarFile.java (revision
86722cb038d3030c51f3268799a2c3dc0c508638)
+++ src/java.base/share/classes/java/util/jar/JarFile.java (date
1586631626679)
@@ -161,7 +161,7 @@
private final Runtime.Version version; // current version
private final int versionFeature; // version.feature()
private boolean isMultiRelease; // is jar multi-release?
-
+ private int[] versions; // which versions does the jar
contain
// indicates if Class-Path attribute present
private boolean hasClassPathAttribute;
// true if manifest checked for special attributes
@@ -599,12 +599,13 @@
}
private JarEntry getVersionedEntry(String name, JarEntry je) {
- if (BASE_VERSION_FEATURE < versionFeature) {
+ int[] versions = this.versions;
+ if (BASE_VERSION_FEATURE < versionFeature && versions != null &&
versions.length > 0) {
if (!name.startsWith(META_INF)) {
// search for versioned entry
- int v = versionFeature;
- while (v > BASE_VERSION_FEATURE) {
- JarFileEntry vje = getEntry0(META_INF_VERSIONS + v +
"/" + name);
+ int v = versions.length - 1;
+ while (v >= 0) {
+ JarFileEntry vje = getEntry0(META_INF_VERSIONS +
versions[v] + "/" + name);
if (vje != null) {
return vje.withBasename(name);
}
@@ -1016,9 +1017,20 @@
byte[] lbuf = new byte[512];
Attributes attr = new Attributes();
attr.read(new Manifest.FastInputStream(
- new ByteArrayInputStream(b)), lbuf);
- isMultiRelease = Boolean.parseBoolean(
- attr.getValue(Attributes.Name.MULTI_RELEASE));
+ new ByteArrayInputStream(b)), lbuf);
+ if(Boolean.parseBoolean(
+
attr.getValue(Attributes.Name.MULTI_RELEASE))) {
+ isMultiRelease = true;
+ versions = this.stream()
+ .map(ZipEntry::getName)
+ .mapToInt(this::parseVersion)
+ .filter(v -> v != -1 && v >=
BASE_VERSION_FEATURE && v <= versionFeature)
+ .distinct()
+ .sorted()
+ .toArray();
+
+ }
+
}
}
}
@@ -1026,6 +1038,27 @@
}
}
+ /**
+ * If {@code entryName} is a a versioned entry, parse and return the
version as an integer, otherwise return -1
+ */
+ private int parseVersion(String entryName) {
+ if(!entryName.startsWith(META_INF_VERSIONS)) {
+ return -1;
+ }
+
+ int separator = entryName.indexOf("/", META_INF_VERSIONS.length());
+
+ if(separator == -1) {
+ return -1;
+ }
+
+ try {
+ return Integer.parseInt(entryName, META_INF_VERSIONS.length(),
separator, 10);
+ } catch (NumberFormatException e) {
+ return -1;
+ }
+ }
+
synchronized void ensureInitialization() {
try {
maybeInstantiateVerifier();
On Fri, Apr 10, 2020 at 10:58 PM Lance Andersen <lance.andersen at oracle.com>
wrote:
> Hi Eric
>
> Feel free to enter a feature request and better yet propose a fix :-)
>
> Have a good weekend!
>
> Best
> Lance
>
> On Apr 10, 2020, at 2:59 PM, Eirik Bjørsnøs <eirbjo at gmail.com> wrote:
>
> I recently needed to re-implement multi-release lookup logic for a
> ModuleReader capable of reading modules from unpacked (exploded) jar files
> [1]
>
> It occurred to me that JarFile.getVersionedEntry checks _every_ version
> between 8 and the runtime version when looking up paths.
>
> Since META-INF/versions will probably be sparsely populated, I'm wondering
> if something could be done to avoid checking 20 different paths in OpenJDK
> 28.
>
> Perhaps scanning META-INF/versions once when opening the file could work,
> then only check existing versions in getVersionedEntry?
>
> Maybe a premature optimization today, but with the new release cadence,
> this problem is going to surface at some point in the future, right?
>
> [1]
> https://mail.openjdk.java.net/pipermail/jigsaw-dev/2020-April/014414.html
>
> Eirik.
>
>
> <http://oracle.com/us/design/oracle-email-sig-198324.gif>
> <http://oracle.com/us/design/oracle-email-sig-198324.gif>
> <http://oracle.com/us/design/oracle-email-sig-198324.gif>
> <http://oracle.com/us/design/oracle-email-sig-198324.gif>Lance Andersen|
> Principal Member of Technical Staff | +1.781.442.2037
> Oracle Java Engineering
> 1 Network Drive
> Burlington, MA 01803
> Lance.Andersen at oracle.com
>
>
>
>
More information about the core-libs-dev
mailing list